Author Topic: SpaceX Falcon 9 v1.1 CRS-3 Splashdown Video Repair Task Thread  (Read 1201584 times)

Offline princess

  • Member
  • Posts: 65
  • Liked: 106
  • Likes Given: 25
Excellent, I've been trying to clean them up some. But I'm totally in the dark. I've done what I can for parts _11 and _12, but I'm only really playing with a hex editor, so I don't even know if what I'm doing is helpful, beyond the fact that it's cleaned up a bunch of the continuity errors. It did return a few additional frames, but I can't see them (no computer fu; I'm capable of following exact directions but that's it). I've attached my attempts. Someone let me know if they're any improvement or if I'm just wasting my time. Not that that will necessarily keep me from goofing around with them.

That's great, thanks for posting those. I've had a quick look at part 11 and what you've done to it - I've attached the output of my TS fix tool when run against shanuson's part 11.

The good part is that it's managed to fix a few more TS-level issues! It deliberately doesn't change the contents of the data stream, so it doesn't include the MPEG4 header fixes that you'd made. I'll see if I can update the tool so it can incorporate your MPEG4 changes.

One thing I did notice is that sometimes you removed the adaptation field from a 0x03e8 packet when it looked too long, but as far as I can see the field is sometimes legitimately long. For example, packet 87 at file offset 0x00003fe4 is marked as having a 69-byte adaptation field. It's human nature to think "whoa, that can't be right" and zero the AF, but if you do this and then look at the data, it has a huge run of 0xff bytes before the data starts (from deruch's edit):

Packet 87 at 0x00003fe4: PID 0x03e8 Pay3:4500ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

I believe the correct thing to do in this case is to leave the AF at 69 bytes (from my edit):

Packet 87 at 0x00003fe4: PID 0x03e8 AF[69] Pay3:ff01c0d80f09017b09418cde36def8f3fb90982fa7bf8dbee58fbdef1bcf7c6d

However, in other cases you can see that a packet header has become corrupted so that it gains a freaky AF, sometimes one that's longer than the 188-byte TS packet itself! So this kind of checking is hard to automate; it has to be done manually. Just be super-careful when removing the AF from a packet, as you might introduce a huge run of padding into the MPEG4 datastream.
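
If it helps, here's a rough shell sketch of the kind of structural check I mean (the script name and details are illustrative, not my actual tool; it only flags AF lengths that physically cannot fit, so a legitimate 69-byte AF passes):

#!/bin/bash
# inspect_af.sh <file> <packet-number>
# Read one 188-byte TS packet and flag a structurally impossible AF length.
file=$1
pkt=$2
off=$((pkt * 188))
# grab the 4 header bytes plus the AF length byte as decimal values
set -- $(dd if="$file" bs=1 skip=$off count=5 2>/dev/null | od -An -tu1)
sync=$1
pid=$(( (($2 & 0x1f) << 8) | $3 ))
afc=$(( ($4 >> 4) & 0x3 ))
aflen=$5
printf 'packet %d: sync=0x%02x pid=0x%04x afc=%d\n' "$pkt" "$sync" "$pid" "$afc"
if [ $((afc & 2)) -ne 0 ] && [ "$aflen" -gt 183 ]; then
  echo "  suspicious: AF length $aflen cannot fit in a 188-byte packet"
fi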

If we sort out the lower-level TS problems, your MPEG4 header fixes will be very useful, so we'll probably need to work together on this one!

Offline Req

  • Full Member
  • ****
  • Posts: 405
  • Liked: 434
  • Likes Given: 2580
I've been attempting to tune this bash script, with an eye towards performance.

This script is searching for two flipped bits between offsets 8800 and 9065 on frame 52. It's a lot of combinations, and I am getting roughly 600 tries per minute.

Quote
for i in {8800..9064} ; do for j in {8801..9065} ; do ffmpeg.exe -debug mb_pos_size -s:0 704:480 -mmb X:76768:80,X:22120:80,X:45038:80,X:42196:80,X:66234:80,X:50298:80,X:$i:80,X:$j:80, 0:7:-2:-10:-10:-10:-10:8:-5,2:7:15506,9:7:-1,15:7:16391,17:7:-1,28:7:16704, 35:7:-2:-10:-10:-10:-10:8:-5,29:9:21074,34:9:-1,39:9:21626,5:28:-1::63,10:28:80196 -i iframe52.mpg4-img 2> output-$i-$j.txt -f image2 /dev/null -y ; done ; done

The output images have been entirely disabled, and the directive to ignore errors has been removed.  I am not 100% sure that the latter is "safe" but it does significantly increase the speed of the process.

Note that there are spaces in the mmb string that would need to be removed to run this.

After the run is done, I first do a grep for 42:03:9061, then a grep for 00:29:81738. The second one is just in case the first "known good" position ends up changing due to correctly flipped bits. Edit: This second test is probably worthless.

If anybody has suggestions for how to speed this up, I am all ears!  The search space can likely be reduced but I'm really unsure of what would be advisable.

Just sticking to two bits right now... going for three in a search space this size would take something like 21 days at a rate of 600 tries per minute.
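
For what it's worth, a rough sanity check on those figures (my arithmetic, assuming the full 265-value grid/cube is searched with no deduplication):

265 x 265 = 70,225 two-bit tries; at 600/min that's ~117 minutes, or about 2 hours.
265 x 265 x 265 = 18,609,625 three-bit tries; at 600/min that's ~31,016 minutes, or about 21.5 days.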

Farm it out to multiple cores...

A quick paste (edited a bit for your case) from something I made for a friend to run N rsyncs at once; in this case it's looking for "ffmpeg" in the process list to track workers.

# use command-line argument for threads if specified
if [ "$1" == "" ]
then
  threads=3
else
  threads=$1
fi

..<snip>..

for i in {8800..9064}
do
  # check to see if enough threads are running and wait if so
  while [ `ps aux | grep ffmpeg | grep -v grep | wc -l` -ge $threads ]
  do
    sleep .1
  done

  # spawn a worker on the current offset
  /root/process_one $i &
  sleep .1
done

# don't exit back to the prompt until all of the workers are finished
while [ `ps aux | grep ffmpeg | grep -v grep | wc -l` -gt 0 ]
do
  sleep .1
done


Maybe farm out the "for j in" part this way. Use however many logical cores (this includes hyper-threading) your system has, plus one (so for a 4-core i7 you'd want to use 9). You'll have to eliminate your temp file per an earlier recommendation, or use $i in its filename, if you want to run several at once. Also, yes, as previously mentioned by another poster, if you're going to be using a temp file you want to make sure that it's on a ramfs mount with atime disabled (if applicable in your distro).

Even if ffmpeg is already using multiple cores, you are likely to see at least double your current performance.  If ffmpeg is not using multiple cores, you should see a fairly linear increase relative to number of cores used.  If ffmpeg is already using multiple cores, you should probably run the script with just 2 or 3 threads, not cores+1.

This should work well since you have a decent amount of work to farm out per worker if the farmed out work is the "for j in" part.  If you needed to farm out tons of really short workers you'd want to use memcached listening on a socket to track the number of workers running to eliminate the time of the "ps" call and invoke them from compiled code like C++ or hiphop'd php or something instead of a shell script.

Also, I'm not sure what the output of this looks like, but you should probably append whatever useful output this stuff generates to a logfile and run the main script in a screen, since it may take a while and you don't want it aborting just because you got disconnected from your SSH session. You want the logfile because it's a pain in the ass to scroll up in a screen; it also allows you to detach the screen and just tail -f the logfile. Note that >> appends stdout and 2>> appends stderr, so you'll need to use both if you also want error output in the log.
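
For example (filenames made up; adjust to taste):

# start the whole run detached in a screen, logging stdout and stderr
screen -dmS bruteforce bash -c './farm_workers.sh 8 >> run.log 2>> run.log'
# watch progress without attaching to the screen
tail -f run.log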
« Last Edit: 05/24/2014 02:15 am by Req »

Offline deruch

  • Senior Member
  • *****
  • Posts: 2422
  • California
  • Liked: 2006
  • Likes Given: 5634
Good catch, Princess. There are probably a couple of other places where I made the same mistake.   :'( 

Have you been to http://spacexlanding.wikispaces.com yet? There's a section there, towards the bottom of the active work, that deals with the headers and .ts files.
Shouldn't reality posts be in "Advanced concepts"?  --Nomadd

Offline Req

  • Full Member
  • ****
  • Posts: 405
  • Liked: 434
  • Likes Given: 2580
Actually, it occurs to me that the straight up "ps" approach to tracking workers may have problems with the above script, because the workers are constantly starting and stopping ffmpeg, which ps is looking for.

Another easy trick to track spawned workers is just to use PHP.

To check workers you can use:
while [ `ps aux | grep process_one\.php | grep -v grep | wc -l` -ge $threads ]

And to spawn workers you can use:
# spawn a worker(via php passthrough for easy thread tracking) on the current offset
php /root/process_one.php $i &

process_one.php looks like this:

<?php

// turn the command-line arguments into a simple zero-indexed array
function parseArgs($argv)
{
        array_shift($argv);
        return array_values($argv);
}

// parse arguments from the command line
$args = parseArgs($argv);
if (isset($args[0]))
{
        $offset = $args[0];
        // hand the offset to the actual worker
        exec('/root/process_one ' . escapeshellarg($offset));
}

?>

Again, the overhead php adds to spawning a process in this case is negligible because the workers will be running for a good while each and you're spawning thousands, not millions or billions.
« Last Edit: 05/24/2014 02:31 am by Req »

Offline mvpel

  • Full Member
  • ****
  • Posts: 1125
  • New Hampshire
  • Liked: 1303
  • Likes Given: 1685
Try using HTCondor - when you install it on your Linux box after downloading from http://research.cs.wisc.edu/htcondor/ as an RPM, it will set itself up to manage jobs on the local machine. You submit the jobs and it will feed them into however many cores you have as needed.

Edit: Sorry, typo in the URL.
« Last Edit: 05/24/2014 03:03 am by mvpel »
"Ugly programs are like ugly suspension bridges: they're much more liable to collapse than pretty ones, because the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity. A language that makes it hard to write elegant code makes it hard to write good code." - Eric S. Raymond

Offline Req

  • Full Member
  • ****
  • Posts: 405
  • Liked: 434
  • Likes Given: 2580
Try using HTCondor - when you install it on your Linux box after downloading from http://research.cs.uwisc.edu/htcondor/ as an RPM, it will set itself up to manage jobs on the local machine. You submit the jobs and it will feed them into however many cores you have as needed.

Link is broken.

I can tell you right now, though, that it'll be quite a bit easier to make three 10-20 line files in this case.

My background is hosting extremely large-scale sites and applications (100,000+ concurrent users), and I also had a sole-source contract with the State Department for several years to aggregate and analyze "sources of interest" to the tune of >150GB/day (of text), including Twitter, Facebook, forums, etc., for sentiment analysis, event prediction and notification, outlier detection and characterization, personal network visualization and characterization, etc., to provide intelligence for select embassies in the Middle East. To complicate matters, it had to "understand" all of the various dialects of Arabic/Farsi/Urdu/Kurdish/Turkish/etc. along with the standard English/German/Spanish/etc., obviously. I had to develop a lexicon system that had native speakers characterizing words in order of their significance in the dataset. I do have some sense of what it looks like when you want to spread a task across a few cores, or 14 cabinets full of servers.

Back on topic... Implementation isn't even really the crux of the matter; most of the thought and design goes into HOW you will stage the data and scale the task to operate efficiently given your dataset. This particular task can be readily scaled just using the two loops he already has. Implementation in the shell won't have a noticeable impact on performance for this dataset, and 30-60 lines of code, mostly copy-and-paste at this point, is a pretty ridiculously easy implementation.
« Last Edit: 05/24/2014 04:12 pm by Req »

Offline Untribium

  • Member
  • Posts: 32
  • Switzerland
  • Liked: 32
  • Likes Given: 82
Actually, it occurs to me that the straight up "ps" approach to tracking workers may have problems with the above script, because the workers are constantly starting and stopping ffmpeg, which ps is looking for.

Another easy trick to track spawned workers is just to use PHP.

To check workers you can use:
while [ `ps aux | grep process_one\.php | grep -v grep | wc -l` -ge $threads ]

And to spawn workers you can use:
# spawn a worker(via php passthrough for easy thread tracking) on the current offset
php /root/process_one.php $i &

process_one.php looks like this:

-snip-

Again, the overhead php adds to spawning a process in this case is negligible because the workers will be running for a good while each and you're spawning thousands, not millions or billions.

Looks like this: http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts/7 might be an option as well. Generate the offsets using seq and then pipe them to xargs, which then runs a fixed number of ffmpeg processes at a time. I'll check it out tomorrow; should get some sleep first :)
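
Something like this, maybe (untested; the worker count for -P is a guess):

# one offset per invocation, at most 8 workers running at any moment
seq 8800 9064 | xargs -n 1 -P 8 /root/process_one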

Offline Req

  • Full Member
  • ****
  • Posts: 405
  • Liked: 434
  • Likes Given: 2580
Actually, it occurs to me that the straight up "ps" approach to tracking workers may have problems with the above script, because the workers are constantly starting and stopping ffmpeg, which ps is looking for.

Another easy trick to track spawned workers is just to use PHP.

To check workers you can use:
while [ `ps aux | grep process_one\.php | grep -v grep | wc -l` -ge $threads ]

And to spawn workers you can use:
# spawn a worker(via php passthrough for easy thread tracking) on the current offset
php /root/process_one.php $i &

process_one.php looks like this:

-snip-

Again, the overhead php adds to spawning a process in this case is negligible because the workers will be running for a good while each and you're spawning thousands, not millions or billions.

Looks like this: http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts/7 might be an option as well. Generate the offsets using seq and then pipe them to xargs, which then runs a fixed number of ffmpeg processes at a time. I'll check it out tomorrow; should get some sleep first :)

I am interested to see the results.  I'm just so busy that my "recreation" basically involves this thread and a few other news sites at the moment.  Maybe you have found something that will be very useful to my endeavors in the future, at least on one level of scale!  Although I must admit, I do enjoy coding this type of thing.
« Last Edit: 05/24/2014 04:42 am by Req »

Offline mvpel

  • Full Member
  • ****
  • Posts: 1125
  • New Hampshire
  • Liked: 1303
  • Likes Given: 1685
I've used my bvi/wireshark approach to fix the transport stream headers on fixed_edit8_part_229.ts, and the "clean47" version is attached below. There were good-sized chunks of bad headers at the outset of the file, and I'm hoping that it can reveal something more of the top half of that frame, though I'm not particularly hopeful.
« Last Edit: 05/24/2014 06:10 am by mvpel »
"Ugly programs are like ugly suspension bridges: they're much more liable to collapse than pretty ones, because the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity. A language that makes it hard to write elegant code makes it hard to write good code." - Eric S. Raymond

Offline princess

  • Member
  • Posts: 65
  • Liked: 106
  • Likes Given: 25
I've used my bvi/wireshark approach to fix the transport stream headers on fixed_edit8_part_229.ts, and the "clean47" version is attached below. There were good-sized chunks of bad headers at the outset of the file, and I'm hoping that it can reveal something more of the top half of that frame, though I'm not particularly hopeful.

You did really great! Those headers were totally mangled, but after your fixes your result file is pretty clean from a TS point of view.

I hope you don't mind but I've processed it a little more and fixed a couple of CC (continuity counter) discontinuities that were remaining. I've also removed the AF from any packets where either the AF is longer than the packet, or when the AF shows a wildly invalid PTS time. This is hopefully the correct thing to do, but please take a look and let me know what you think.

Offline Shanuson

  • Full Member
  • ***
  • Posts: 395
  • Liked: 327
  • Likes Given: 2542
I've used my bvi/wireshark approach to fix the transport stream headers on fixed_edit8_part_229.ts, and the "clean47" version is attached below. There were good-sized chunks of bad headers at the outset of the file, and I'm hoping that it can reveal something more of the top half of that frame, though I'm not particularly hopeful.

You did really great! Those headers were totally mangled, but after your fixes your result file is pretty clean from a TS point of view.

I hope you don't mind but I've processed it a little more and fixed a couple of CC (continuity counter) discontinuities that were remaining. I've also removed the AF from any packets where either the AF is longer than the packet, or when the AF shows a wildly invalid PTS time. This is hopefully the correct thing to do, but please take a look and let me know what you think.

What do you mean by removed the AF? Not every AF has a PTS.


Yet the file looks really, really clean; only at 2 points was there a 47 23 e8 1x instead of 47 03 e8 1x.
I did not check whether the AF lengths are correct, but they looked OK. In the end, if an AF length is too small and some FFs get into the img file, that will be OK; it will only force a reassignment of one MB or so, and will be handled by the nice group that is fixing P-frames.

Offline Shanuson

  • Full Member
  • ***
  • Posts: 395
  • Liked: 327
  • Likes Given: 2542
Here is cleaned-up part 7.

I will redo parts 1 and 2 to also fix all the TS headers.

The end result should be 15 fixed parts that only have to be put together to get a final .ts file to which no tsfix etc. has to be applied.

Cheers
Shanuson

Offline princess

  • Member
  • Posts: 65
  • Liked: 106
  • Likes Given: 25
I've also removed the AF from any packets where either the AF is longer than the packet, or when the AF shows a wildly invalid PTS time. This is hopefully the correct thing to do, but please take a look and let me know what you think.

What do you mean by removed the AF? Not every AF has a PTS.

I'm working on the theory that if a packet has a "long" AF, and it indicates it has a PTS in the AF header, and the PTS is obviously completely wrong, then probably what's happened is that the "AF present" bit has been flipped on a normal data packet that shouldn't contain an AF.

The aim is to try to recover more of the MPEG4 data. If an error flips the "AF present" bit, then the normal MPEG4 data gets interpreted as an AF length, and a number of bytes get removed from the MPEG4 data stream. If we can detect when this has happened, we can recover more of the MPEG4 data, which might result in more valid blocks popping into place in I-frames and P-frames.
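
In case anyone wants to experiment, this is the kind of one-packet patch I mean, sketched in shell (file name and packet number are illustrative; work on a copy!):

# clear the "AF present" bit (0x20) in byte 3 of the packet, keeping
# the "payload present" bit (0x10), so the packet becomes payload-only
f=part11_copy.ts; pkt=1234
off=$((pkt * 188 + 3))
byte=$(dd if="$f" bs=1 skip=$off count=1 2>/dev/null | od -An -tu1 | tr -d ' ')
new=$(( (byte & ~0x20) | 0x10 ))
printf "$(printf '\\x%02x' "$new")" | dd of="$f" bs=1 seek=$off count=1 conv=notrunc 2>/dev/null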

Offline princess

  • Member
  • Posts: 65
  • Liked: 106
  • Likes Given: 25
Here is cleaned-up part 7.

The data packets all look good to me - great job! I've cleaned up the big runs of 0x1fff padding packets so that there are no more invalid PIDs or invalid packets in the stream. It won't make any difference to the MPEG4 data (as your edits have already done that), but it means that the file is clean at the TS level.

Offline Oersted

  • Member
  • Senior Member
  • *****
  • Posts: 2897
  • Liked: 4098
  • Likes Given: 2773
My background is hosting extremely large-scale sites and applications (100,000+ concurrent users), and I also had a sole-source contract with the State Department for several years to aggregate and analyze "sources of interest" to the tune of >150GB/day (of text), including Twitter, Facebook, forums, etc., for sentiment analysis, event prediction and notification, outlier detection and characterization, personal network visualization and characterization, etc., to provide intelligence for select embassies in the Middle East. To complicate matters, it had to "understand" all of the various dialects of Arabic/Farsi/Urdu/Kurdish/Turkish/etc. along with the standard English/German/Spanish/etc., obviously.

Cool to have the NSA contributing to this video clean-up now...  ;-)

Offline Shanuson

  • Full Member
  • ***
  • Posts: 395
  • Liked: 327
  • Likes Given: 2542
Here is cleaned-up part 7.

The data packets all look good to me - great job! I've cleaned up the big runs of 0x1fff padding packets so that there are no more invalid PIDs or invalid packets in the stream. It won't make any difference to the MPEG4 data (as your edits have already done that), but it means that the file is clean at the TS level.


Yes, I clean it by hand, so I don't worry much about the 0x1fff parts; I only look them up for the CC when there is some ambiguity about whether a TS packet is data or something else.
There are 2 cases where the AF bit should be set for a packet with PID 0x3e8: the first packet of a frame, and the last packet of a frame. The first contains a PTS and such; the last contains only stuffing FFs in front of the last data bytes. The first one is always there, but some frames don't need stuffing at the end.

I'm halfway through part 1, fixing the TS packets. :D More tonight.
Cheers,
Shanuson

Offline princess

  • Member
  • Posts: 65
  • Liked: 106
  • Likes Given: 25
Here is my cleaned version of the part14 TS file. Comments gratefully received!

Offline Lourens

  • Full Member
  • *
  • Posts: 156
  • The Netherlands
  • Liked: 206
  • Likes Given: 304
Question: when we use -mmb X:offset:pattern, are we flipping bits before or after the entropy decoding stage? Is this simply equivalent to flipping bits in the input file? Or is it essentially just a low-level way of editing the MB directly?

Offline saliva_sweet

  • Full Member
  • ****
  • Posts: 614
  • Liked: 476
  • Likes Given: 1826
Out of sheer desperation I decided to go deeper into the macroblock data to see if there's any way to find macroblock start positions, which would be a tremendous help for flipping bits. No amount of flipping can save a block if its start position is wrong. Unfortunately, but also unsurprisingly, I did not find much signal in the bitstream to do this. But here are my results.

I took the first 623 macroblocks from frame 169. They are known to be good and haven't been tampered with. The sample is a bit small and mostly represents ocean data, though. The MB size range is 23-725 bits, mean 89, median 75. The data isn't as random as expected; there is a slight tendency for ones to follow ones and zeros to follow zeros, but it's only about 2% over 50/50, so it's no use for flipping.

All macroblocks are type 3; I'm not sure if this is due to the small sample and the rarity of type 4 macroblocks, or if they are not used in live encoding. This is about the only concrete result, but all it tells us is that a macroblock does not start with more than 2 zeros.

69% of macroblocks start with a one. 71% of macroblocks end with a one. The first part of a macroblock is mcbpc; the codes seen are (mcbpc : count):
001 : 37
011 : 32
010 : 123
1 : 431
This is followed by ac_pred_flag (fixed length, 1 bit), which is set to zero in 89% of macroblocks.
Then comes cbpy. Here are the ones I saw with counts.
cbpy: 1010 : 43
cbpy: 000011 : 14
cbpy: 0011 : 4
cbpy: 0110 : 65
cbpy: 1000 : 45
cbpy: 1001 : 24
cbpy: 0111 : 20
cbpy: 1011 : 51
cbpy: 0100 : 20
cbpy: 00101 : 13
cbpy: 00011 : 11
cbpy: 0101 : 20
cbpy: 00100 : 6
cbpy: 000010 : 12
cbpy: 11 : 269
cbpy: 00010 : 6
It's set to 11 in 43% of macroblocks. The first bit of block data (the 1st bit of the dct_dc_size_luminance code) is set to 1 in 65% of macroblocks. I didn't go further at this time. On the whole, the best signature for a macroblock start based on these data was thus 1|10111, where | is the macroblock border. 11% of macroblocks start with this pattern (about 1 in 9), so it could in principle provide a 64X reduction in search space at the cost of missing 89% of true starts. Not sure if that's worthwhile.
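
If anyone wants to play with this, here's a crude way to scan for that signature without writing a real bitstream parser (filename assumed; since the file is expanded to one ASCII character per bit, the offsets grep reports are bit offsets):

# expand the file to one long ASCII bit string, then list the offsets of
# every 110111 occurrence (end-of-MB '1' followed by the '10111' start)
xxd -b -c1 iframe169.mpg4-img | awk '{printf "%s", $2}' \
  | grep -bo 110111 | cut -d: -f1 | head -20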

Offline arnezami

  • Full Member
  • **
  • Posts: 285
  • Liked: 267
  • Likes Given: 378
Hi guys,

Now that we have come so far already and developed new techniques I wanted to express my thoughts about why I think we are succeeding and how we can best proceed.

Why are we succeeding?

I think there are two main reasons why we are succeeding in what was deemed to be impossible. The first is that we really went outside-the-box with our approach. The second is that our techniques are such that we can collaborate in our efforts. Let me elaborate on both.

Outside-the-box thinking:

The "conventional" way of thinking when repairing a video is (1) to try to repair all predictable header information and (2) try to repair each decoding error one at the time. The thinking is: since you cannot decode something if you haven't decoded everything before it (from the point of a key frame) you can only fix the error where the decoder halts.

Keep in mind that everything in MPEG-4 compressed video is highly interdependent: if decoding of a block fails, you lose all data in three dimensions! You lose every block to the right, every row of blocks below, and every block of every frame after this frame in this lower-right region. This is horrible, especially when the error occurs near the top of the key frame. And when you can't repair an error, you're stuck. This is the situation we started in. On his blog, Benoît Joossen describes what he did to extract parts of the two key frames; he could not do much more than that, not without enormous effort.

This situation looks very similar to having an error in encrypted data where the encryption algorithm used "propagating cipher-block chaining" mode, meaning that a single error in the data makes everything after it look like seemingly random bits when you decrypt. However, compression is very different from encryption: while the end results of encrypted and compressed data may look similar (they both look like random bits), the goal of encryption is to prevent decoding without a key, while the goal of compression is to lower the number of bits needed.

We can learn something very useful from the area of encryption, though: if you want to break a cipher, you ultimately want to recover its "internal state". And if a cipher is weak, this is actually possible. Since a compression algorithm can always be considered a weak cipher (it's not designed to protect its "internal state"), there might be a way to recover its "internal state" after an error has occurred.

The MPEG-4 decoder algorithm has something you could call an "internal state". It contains the current x and y of the macroblock, the bitposition of that block, the DC values from either the top or left block, and some other values (like MV values). If you have these values at a certain x,y, then you can decode the video from there.

After some time struggling with the problem and realizing the bad situation we were in, we had two major breakthroughs in this regard. One was that we told the decoder to ignore errors. This allowed us to see what the decoder would do after the error, and whether it could somehow self-recover somewhat. And this appeared to be the case: parts of its "internal state" were apparently restored, because it was clearly producing recognizable parts of the picture (albeit in the wrong color, brightness and x,y position). The clock was first found this way. :)

The other breakthrough was that we managed to change parts of the "internal state": by setting the bitposition of a macroblock, the best effort by Benoît Joossen was essentially recreated. But in our case there was no brute forcing needed (which is what Benoît Joossen used). We could simply set the bitposition in the "internal state" at the moment a certain macroblock was supposed to be decoded. Later the tool was released so that everybody could start trying mmb commands.

Which brings us to the second reason why we are succeeding. ;)

Collaboration becomes possible:

There is no substitute for the human eye! :) And brain!

It is simply incredible how much time and effort people are putting into this major task! After the mmb command became usable, people started to play around with it and were quickly producing nice I-frame results. This led to more features being added and more use of those features, leading to more progress. Within a few days we had a wiki, a repository, command-line repair tools and a really cool online repair tool! The constant stream of updates on the wiki is also impressive. Some people seem absolutely dedicated to making this video work. Almost every day I start reading this thread, and the forum simply won't let me "like" fast enough (120-second delay). That's how good it is.

All this has personally inspired me to do even more. It was really nice to see the first P-frames coming out. I'm so proud of everyone here :). The latest video was also awesome. And the tweet from Elon made it all worth it for me.

Anyway, it turns out this problem was very well suited to crowd-sourcing. NSF rocks!

How should we proceed?

Here are the three phases I mentioned earlier for reference:

Quote
What this whole problem boils down to is three phases (per frame):

1) Extracting: Correcting the predictable streamheaders and I/P-frame headers. This way we (or actually the decoder) can extract all the video data without missing packets.

2) Positioning: Finding all the macroblock data (that is: finding all bitpositions where each MB's data starts) and then assigning the right macroblocks to these starting bitpositions.

3) Repairing: Correcting left-over DC-value errors either by bit flipping or by changing the DC values after decoding. This should implicitly also fix "bad inheritance": where a macroblock inherits its DC base value from the wrong neighbour (that is: left or top).


I think we are doing a very good job at 1 and 2, and we are also doing a bit of bitflipping (3). As I mentioned last time, I've got some ideas how to make things even better. Here is a description of the tools I am thinking about:

Macroblock data finder:

In a lot of cases I see that there are still quite large swaths of blocks that are "turned off" (as in: a -1 somewhere before them) because of some error that occurred earlier. It's not always easy to know where actual non-error data begins again, especially in the last row of the I-frames. And I think I have found a way to locate most of the missing macroblock data.

The algorithm goes like this. You iterate through every possible starting position (begin-bit to end-bit of the frame data) and, for each one, record at which bitposition decoding begins and at which bitposition it ends (either at an error or at the end of the frame data). You count the number of bits and macroblocks decoded from each starting position until it hits an error (or the end of the frame data). So a really good part of I-frame 169 would start at bitposition 550: it decodes many (thousands!) of bits and many blocks from bitposition 550 before it hits an error.

Now you sort all these bitsequences by their length (the total number of bits they decode from their bitposition). You take the first in the list, because that is obviously the best sequence. Then you remove it from the list, and also remove all sequences that either start or end within the bitrange of the "winning" sequence; they are all considered to be of less value. And you keep doing this (taking the first sequence, removing all sequences that lie in its coverage area) until you have no sequences left.

I believe at the end you will have all the longest bitsequences available in the frame data. You "only" have to assign the right macroblocks to the starting bitpositions of the sequences in this set. And as said before, I believe it would be helpful if we auto-generated all the likely starting positions for each frame and made them available as a static file to the online editor. This would allow the user to choose from good starting positions.
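
To make the selection step concrete, here's a sketch of that greedy pass in shell (the input format is my assumption: one "startbit length" pair per line in runs.txt, as produced by some decoder sweep):

#!/bin/bash
# greedy_runs.sh: process candidates longest-first; keep a run only if it
# doesn't start or end inside an already-kept run.
declare -a S E
while read -r start len; do
  s=$start; e=$((start + len)); ok=1
  for i in "${!S[@]}"; do
    if [ "$s" -ge "${S[$i]}" ] && [ "$s" -lt "${E[$i]}" ]; then ok=0; break; fi
    if [ "$e" -gt "${S[$i]}" ] && [ "$e" -le "${E[$i]}" ]; then ok=0; break; fi
  done
  if [ "$ok" -eq 1 ]; then
    S+=("$s"); E+=("$e")
    echo "keep: start=$s len=$len"
  fi
done < <(sort -k2,2nr runs.txt)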

Brute force bit flipper inside the decoder:

Another idea is to do the brute-force bitflipping inside the decoder. The main reason to do the brute forcing inside the code of the decoder is that it is likely to be much faster. Let's say you have four blocks of -1 (for example 15:12, 16:12, 17:12 and 18:12 in I-frame 150). To the left and right of them are good macroblocks (14:12 at 45301 and 19:12 at 46628). You remove the four -1s so it will give an error again. Now you tell the decoder (in bruteforce mode) to try flipping 1-3 bits between the starting position of 15:12 (45472) and the starting position of 19:12 (46628), and each time you do this, try to decode from 15:12 and see if you find 4 macroblocks and end exactly at 19:12 (46628).

The advantage of this would be that there is a very clear end goal, one which is not likely to be reached by chance. The decoder doesn't have to decode the entire frame over and over again (only 4 macroblocks per try), and you could even reduce that to close to just one macroblock decode per try. And if it found a candidate, it would simply print it out.

I think that would be the fastest way of bitflipping. There may be smarter variations on it. It is however not trivial (at least not for me ;)) to "reset" an already-decoded macroblock. @michaelni: do you know how to do this? Because that would be required.

DC value fixer:

Changing the DC values turns out to be very tricky. Not only can you get unpredictable behaviour to the right of and below a block, it is also possible that everything gets messed up if a block position is changed above or to the left. And it's also very tedious, I think. Hard to get it perfect.

There may be a way to (semi-)automatically fix the DC values (here I am less certain, though). We assume all macroblocks have been given a position or are marked as -1, so there are no error blocks, and that we have done all the bitflipping we could imagine. So this is basically something you do at the very end: "post-processing", if you will.

This is roughly how it would go. You disable color output, so only luminance comes out. You set 0:0 to -1 and you set 42:0 to its designated bitposition. To the right of 42:0 is the last block of the row; it is affected by the DC values of 42:0. If you see a horizontal "line" at the 8th pixel from the top (as in: consistently lighter or darker pixels between 8 and 9 pixels from the top), then you know that the DC value of your block is off, and you change it. Then you do the same with 41:0, but now you can check a longer (more reliable) horizontal line.

There are still some problems with that idea though. I will have to experiment and think about it longer.

Have fun! :)

Regards,

arnezami
« Last Edit: 05/25/2014 09:15 pm by arnezami »
