Channel: VBForums - Visual Basic 6 and Earlier

Trouble parsing JPEG tags

I'm working on a project to automatically detect JPEG files embedded in other files (such as game resources). Even without knowing the resource file format, as long as I know it is uncompressed, I can theoretically locate the start and end of each data file inside the resource file. That is fairly simple with BMP, since "BM" starts the file and the very next data field gives the length of the file, but it's more complicated with JPEG.
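To make the BMP case concrete, here is a minimal sketch in Python rather than VB6. It assumes the field right after "BM" is the standard 4-byte little-endian file size. The function name `find_bmp_candidates` and the plausibility check are illustrative choices of mine, not part of any library.

```python
import struct

def find_bmp_candidates(blob: bytes):
    """Yield (offset, size) for each position that looks like a BMP file header."""
    pos = blob.find(b"BM")
    while pos != -1:
        # The 4 bytes after "BM" hold the total file size (little-endian uint32).
        if pos + 6 <= len(blob):
            (size,) = struct.unpack_from("<I", blob, pos + 2)
            # Plausibility check (an assumption, not a spec rule): the size must
            # cover at least the 14-byte file header and fit inside the blob.
            if 14 <= size <= len(blob) - pos:
                yield pos, size
        pos = blob.find(b"BM", pos + 1)
```

Any two bytes that happen to spell "BM" will also be tested, so the size check only filters out the obvious false hits; a real scanner would probably validate more of the header.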

OK, I know I can find the start of a JPEG file embedded in another file by looking for the first tag that opens a JPEG file: the specific two-byte integer 0xFFD8. Following that is a series of tags and offsets.
Each tag is of the form 0xFFXX (where XX means any valid 1-byte value). After each tag is a two-byte offset in big-endian format. This is SUPPOSED TO point to the next tag (it's a relative offset from the beginning of the current offset field). But it does NOT always point to the next tag. In theory I could follow all the jumps from these offsets until I reach the EOF tag, which is 0xFFD9. But it doesn't work in practice. I found that at least once, following this offset lands me somewhere in the middle of the compressed JPEG image data.
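The jump-following procedure described above can be sketched roughly as follows, in Python rather than VB6. The function name `walk_tags` and the choice to stop at the first value that is not of the form 0xFFXX are my own assumptions for illustration. It implements only the walk as described and does nothing special for any particular tag.

```python
import struct

SOI = 0xFFD8  # tag that opens a JPEG file
EOI = 0xFFD9  # end-of-image tag

def walk_tags(blob: bytes, start: int):
    """Follow the tag/offset chain beginning at the 0xFFD8 found at `start`.

    Returns a list of (position, tag, offset_field). The last entry has
    offset_field None: it is either 0xFFD9 or the first value that is not
    a tag of the form 0xFFXX.
    """
    (first,) = struct.unpack_from(">H", blob, start)
    if first != SOI:
        raise ValueError("no 0xFFD8 at start")
    chain = []
    pos = start + 2
    while pos + 2 <= len(blob):
        (tag,) = struct.unpack_from(">H", blob, pos)
        if tag == EOI or tag >> 8 != 0xFF or pos + 4 > len(blob):
            chain.append((pos, tag, None))
            break
        (offset,) = struct.unpack_from(">H", blob, pos + 2)
        chain.append((pos, tag, offset))
        # The offset is relative to the start of the offset field itself,
        # so skip the 2-byte tag and then jump `offset` bytes.
        pos += 2 + offset
    return chain
```

If the last entry of the returned list is not 0xFFD9, the walk has jumped to something that is not a tag, which is the failure I'm describing.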

My test file is the Windows XP sample image file "Blue hills.jpg". I parsed it manually in a hex editor, and here are my results.

Code:

Tag      Relative Offset to Next Tag
0xFFE0  0x0010
0xFFED  0x094C
0xFFEE  0x000E
0xFFDB  0x0084
0xFFC0  0x0011
0xFFDD  0x0004
0xFFC4  0x013F
0xFFDA  0x000C
0xF4D9 THIS IS NOT A VALID JPEG TAG!!!!!
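As a cross-check on the hand parsing, the table above can be turned into absolute file positions with a little arithmetic. This is only a sketch. It assumes the 0xFFD8 sits at file offset 0, so the first tag is at offset 2, and that each jump skips the 2-byte tag plus the offset value. The name `tag_positions` is made up for the example.

```python
# (tag, offset field) pairs copied from the table above
TABLE = [
    (0xFFE0, 0x0010),
    (0xFFED, 0x094C),
    (0xFFEE, 0x000E),
    (0xFFDB, 0x0084),
    (0xFFC0, 0x0011),
    (0xFFDD, 0x0004),
    (0xFFC4, 0x013F),
    (0xFFDA, 0x000C),
]

def tag_positions(table, first_tag_pos=2):
    """Return ([(position, tag), ...], position reached after the last jump)."""
    positions = []
    pos = first_tag_pos
    for tag, offset in table:
        positions.append((pos, tag))
        pos += 2 + offset  # 2-byte tag, then the jump measured from the offset field
    return positions, pos

positions, landing = tag_positions(TABLE)
for pos, tag in positions:
    print(f"0x{tag:04X} at file offset 0x{pos:04X}")
print(f"last jump lands at 0x{landing:04X}")
```

Under those assumptions the jump after 0xFFDA lands at file offset 0x0B60 (2912), which would be where the 0xF4D9 value was read.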

The LAST TAG that gets processed in a JPEG image file should be the end-of-image tag, which is 0xFFD9, but no such tag is EVER encountered in the jumps in my test. Instead it runs into a region of the file containing apparently invalid data! My technique follows the official JPEG/JFIF specs exactly, but it doesn't entirely work. I personally think that the TRUE SPECS are some kind of trade secret known only to the JPEG organization: a spec that is not publicly available and is only licensed out to official software development corporations.

If someone here can shed some light on what I'm doing wrong, please let me know. Thanks in advance.
