Bit by Bit: Crime Scene Investigation

In my Advanced Java Concepts class we were given a project to recover pictures from a .raw file. In this file it contained pictures that had been deleted. So it was our job to recover the pictures and then after recovering them go to the spot around school and take a picture at least one of the spots. To complete the assignment we had to submit our code, as well as, the picture that we took at one of the spots. The specifics of the assignment are included in the text below:

The Task

A digital camera has been found, but there do not appear to be any pictures on it. Perhaps they have been deleted. Of course, the camera's flash card has data on it, even if the data is not readily accessible. The contents of the camera's flash card is given to you to see if you can recover any of the pictures.
Your task is to write a Java program to recover as many pictures as possible from the flash card data 80MB. Take a picture of yourself at any one of the scenes found in the pictures.

You need to know that all pictures taken by this camera are jpeg images. Also, jpeg viewing software does not care if a jpeg file has extraneous bytes at the end of the file because the size of the image data is encoded in the jpeg format. You are asked to recover each jpeg picture in a file of its own, but extraneous bytes at the end of the file are allowed. As a consequence it is not necessary to actually write a program that fully parses the jpeg format (this is quite difficult).

This camera formats the flash card in blocks of 512 bytes. Each picture the camera takes starts at the begining of a block and is stored in contiguous blocks. If the picture does not fit exactly into an even number of blocks (and it hardly ever would), the rest of the last block is not used. Deleting a picture marks the picture's blocks as available to be reused. Hopefully, no pictures have been deleted from the flash card before another picture was taken. If this is the case, none of the blocks have been used to store more than one picture since the flash card was new.

Note

My camera broke and now I have a different camera. Try to make you program detect any jpg image (assuming the camera manufactor follows the jpg standard).

Input/Output

The program should expect one command-line argument --- the URL of the (binary) flash card image. So, the program might be invoked as follows:

java Recover http://www.cs.fit.edu/~ryan/cse4051/projects/csi/card.raw

or:

java Recover file://test.bin

The size of input will be a multiple of the block size. For each jpg image identified in the input file, write a separate file: image01.jpg, image02.jpg, image03.jpg, etc. Please use names of this form; start numbering with 1; use 2 decimal digits. Do not output thumbnail images (or any other suparts) separately.

This was a challenging project as we were given information that was very broad and were told that the jpeg images link above contained some information we could use. It turns out that the only information that was useful was the beginning indicator of the jpeg.

The easiest way to solve this problem is by looking for the beginning indicator and then searching for the next beginning indicator. Now you are probably asking, "Well what the heck Zach, what about the ending indicator? You have to know when the picture ends." That is a valid question, but because the way an operating system interprets the bytes of a jpeg image, you do not have to worry about he extraneous bytes after the end of image indicator. The picture viewer will ignore these extraneous bytes and display the image.

Another useful piece of information is that the flash card is formatted in 512 bytes, and by only reading 512 blocks at a time it makes it easier to interpret the data.

Originally when doing this assignment I did not have the information that I provided above and was iterating through the whole 512 byte block that I had taken in from the file. Later, I found out that if the begin of image indicator is not at the beginning of the block then there is no need to search the rest of the block. This effectively took my searching algorithm from O(N) to O(1), where N is the number of bytes per file because even though you check only a block at a time you have to search the whole card to make sure that there are no more pictures, which in the algorithm performance world is fantastic. Also, because we only have to check the first two bytes it went from O(N + 512) to O(N + 2) where N is the number of bytes and the constant following is the number of bytes searched. Even though this does not save very much time in the grand scheme of things it is still helpful to optimize as much as possible.

I did not include the method that saves the picture to the file but that can be found on my GitHub. The pictures found throughout this post are some of the pictures that I recovered and I will include the one that I submitted. Overall it was an interesting assignment and can be done in very little code.

Bit by Bit

Monday, September 9, 2013

Crime Scene Investigation

No comments :

Post a Comment

About Me