How to read binary file headers: a hex reference for developers
- Step 1: Open a known-good sample of the format. Drop a clean file of the format you're parsing into the inspector. Reading a correct file first gives you the ground truth to test your parser against. The file must fit your plan's size limit (10 MB on Free up to 2 GB on Developer), which is rarely a constraint for header reading.
- Step 2: Use the offset column to find the field. The left column is the hex offset, advancing by 16 (0x10) per row. To read a field at offset 0x0C, go to row 00000000 and count to the 13th byte. An offset that is a multiple of 16, such as 0x10, is the first byte of the row carrying that label; for anything in between, count in from the nearest row label at or below the offset.
- Step 3: Read the bytes in the correct endianness. PNG and JPEG length/size fields are big-endian: read the bytes left to right. PE, ELF-on-x86, and ZIP fields are little-endian: reverse the byte order. The inspector shows raw bytes; the conversion is yours.
- Step 4: Cross-check the ASCII sidebar for tags. Four-character chunk/box tags (IHDR, ftyp, JFIF) and signatures (%PDF, PK) read straight out of the sidebar. This is the fastest way to confirm you're at the right structural boundary.
- Step 5: Compare against your parser's expectations. If your parser reads a field one way and the bytes say otherwise, you've found the bug, usually an endianness flip or an off-by-one offset. The inspector is the cheapest way to ground-truth those assumptions.
- Step 6: Note anomalies for malformed-input handling. Fields that deviate from spec (wrong magic at a sub-format offset, non-null bytes where padding belongs) are exactly the malformed inputs your parser must reject gracefully. Capture the offending bytes from the dump for a regression test.
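The offset arithmetic in Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the inspector's own code: `locate` and `dump_row` are hypothetical helper names, and the row format simply mimics the offset, hex, ASCII layout described in this guide.

```python
def locate(offset, width=16):
    """Return (row label, 1-based byte position in that row) for a file offset."""
    row_start = offset - (offset % width)
    return f"{row_start:08X}", (offset % width) + 1


def dump_row(data, row_start, width=16):
    """Render one row as: offset, hex pairs, ASCII sidebar (non-printables as '.')."""
    chunk = data[row_start:row_start + width]
    hex_part = " ".join(f"{b:02X}" for b in chunk)
    ascii_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    return f"{row_start:08X} {hex_part} {ascii_part}"


# First 16 bytes of a PNG file (signature, IHDR length, IHDR tag).
png_start = bytes.fromhex("89504E470D0A1A0A0000000D49484452")

print(locate(0x0C))            # ('00000000', 13)
print(locate(0x10))            # ('00000010', 1)
print(dump_row(png_start, 0))  # 00000000 89 50 ... 44 52 .PNG........IHDR
```

Running a field offset through `locate` before reading the dump is a cheap guard against the off-by-one errors mentioned in Step 5.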
Header magic numbers and endianness at a glance
The signature and the dominant endianness for size/offset fields in each format. 'Hex' is what the inspector shows at offset 0 (unless noted); 'ASCII' is the sidebar rendering.
| Format | Magic (hex) | ASCII | Field endianness |
|---|---|---|---|
| PE / EXE (DOS stub) | 4D 5A | MZ | Little-endian |
| ELF | 7F 45 4C 46 | .ELF | Per the EI_DATA byte at offset 0x05 (1=LE, 2=BE); x86/x64 = LE |
| PNG | 89 50 4E 47 0D 0A 1A 0A | .PNG.... | Big-endian (chunk lengths) |
| JPEG | FF D8 FF | ... | Big-endian (marker segment lengths) |
| ZIP / OOXML | 50 4B 03 04 | PK.. | Little-endian |
| PDF | 25 50 44 46 2D | %PDF- | ASCII text header (version digits) |
| GIF | 47 49 46 38 39 61 | GIF89a | Little-endian (logical screen) |
| MP4 / MOV | 66 74 79 70 at offset 4 | ftyp | Big-endian (box sizes) |
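As a rough sketch, the signature table can be turned into a lookup that checks each magic value at its stated offset. The names `SIGNATURES`, `sniff`, and `elf_byteorder` are illustrative assumptions, and the GIF87a entry is added alongside the GIF89a value from the table. This is a learning aid for understanding the bytes, not a substitute for a detection library.

```python
# (format name, offset of the magic, magic bytes) following the table above.
SIGNATURES = [
    ("PE / EXE (DOS stub)", 0, bytes.fromhex("4D5A")),
    ("ELF", 0, bytes.fromhex("7F454C46")),
    ("PNG", 0, bytes.fromhex("89504E470D0A1A0A")),
    ("JPEG", 0, bytes.fromhex("FFD8FF")),
    ("ZIP / OOXML", 0, bytes.fromhex("504B0304")),
    ("PDF", 0, b"%PDF-"),
    ("GIF", 0, b"GIF89a"),
    ("GIF", 0, b"GIF87a"),
    ("MP4 / MOV", 4, b"ftyp"),  # the tag sits after the 4-byte box size
]


def sniff(header):
    """Return the first format whose magic matches, or None."""
    for name, offset, magic in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return name
    return None


def elf_byteorder(header):
    """Read the ELF data-encoding byte at offset 5: 1 = little, 2 = big."""
    if header[:4] != bytes.fromhex("7F454C46"):
        raise ValueError("not an ELF header")
    encoding = header[5]
    if encoding == 1:
        return "little"
    if encoding == 2:
        return "big"
    raise ValueError(f"unexpected data-encoding byte: {encoding:#04x}")


print(sniff(bytes.fromhex("0000001866747970")))       # MP4 / MOV
print(elf_byteorder(bytes.fromhex("7F454C460201")))   # little
```

The returned byte order string can be passed straight to `int.from_bytes(field, order)` when reading later ELF fields.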
PE and PNG field maps
The fields you parse first in two very different formats — a little-endian executable and a big-endian image — with how each appears in the dump.
| Format | Field | Offset | Read it as |
|---|---|---|---|
| PE | MZ signature | 0x00 | 4D 5A (literal bytes) |
| PE | e_lfanew (PE header ptr) | 0x3C | 4 bytes, little-endian → PE offset |
| PE | PE signature | value of e_lfanew | 50 45 00 00 (PE..) |
| PE | Machine type | e_lfanew + 4 | WORD, LE: 014C=x86, 8664=x64 |
| PNG | IHDR length | 0x08 | 4 bytes, big-endian (always 13 = 00 00 00 0D) |
| PNG | IHDR type tag | 0x0C | 49 48 44 52 (IHDR) |
| PNG | Width | 0x10 | 4 bytes, big-endian |
| PNG | Height | 0x14 | 4 bytes, big-endian |
| PNG | Bit depth / colour type | 0x18 / 0x19 | 1 byte each |
ZIP local file header and MP4 ftyp box
A little-endian archive header and a big-endian box-based container — two layouts that catch parser authors out in opposite ways.
| Structure | Field | Offset | Read it as |
|---|---|---|---|
| ZIP local header | Signature | 0x00 | 50 4B 03 04 (PK..) |
| ZIP local header | Version needed | 0x04 | WORD, little-endian |
| ZIP local header | General-purpose flags | 0x06 | WORD, LE (bit 0 = encrypted) |
| ZIP local header | Compression method | 0x08 | WORD, LE (0=store, 8=deflate) |
| MP4 ftyp | Box size | 0x00 | 4 bytes, big-endian |
| MP4 ftyp | Box type | 0x04 | 66 74 79 70 (ftyp) |
| MP4 ftyp | Major brand | 0x08 | 4 ASCII bytes (isom, mp42, M4A ) |
Cookbook
Offset-by-offset reads of clean files. Dumps show the inspector's on-screen layout (offset · hex · ASCII). Use them as ground truth for your parser's assumptions.
PNG IHDR: dimensions are big-endian
The #1 PNG parser bug is reading the big-endian width/height as little-endian. The inspector shows the raw bytes; read them left-to-right. IHDR length is always 13 (00 00 00 0D).
    00000000 89 50 4E 47 0D 0A 1A 0A 00 00 00 0D 49 48 44 52 .PNG........IHDR
    00000010 00 00 04 00 00 00 03 00 08 06 00 00 00 ... ...............

    Parse:
    0x08: 00 00 00 0D → IHDR length = 13 (big-endian)
    0x0C: 49 48 44 52 → 'IHDR'
    0x10: 00 00 04 00 → width = 1024 (big-endian!)
    0x14: 00 00 03 00 → height = 768 (big-endian!)
    0x18: 08 → bit depth 8
    0x19: 06 → colour type 6 (RGBA)
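The same read can be expressed with Python's `struct` module, where the `>` prefix selects big-endian. This is a minimal sketch using the bytes from the dump above; `parse_ihdr` is a hypothetical helper name, and it only covers the fields discussed here.

```python
import struct

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def parse_ihdr(data):
    """Read the IHDR fields at their fixed offsets in a PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG signature")
    # Offset 0x08: 4-byte big-endian length, then the 4-byte chunk type.
    length, chunk_type = struct.unpack_from(">I4s", data, 0x08)
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a 13-byte IHDR")
    # Offset 0x10: width, height (big-endian), bit depth, colour type.
    width, height, bit_depth, colour_type = struct.unpack_from(">IIBB", data, 0x10)
    return {"width": width, "height": height,
            "bit_depth": bit_depth, "colour_type": colour_type}


sample = bytes.fromhex(
    "89504E470D0A1A0A0000000D49484452"  # row 00000000
    "00000400000003000806000000"        # row 00000010 (as far as the dump shows)
)

print(parse_ihdr(sample))
# The bug described above: the same four bytes read little-endian.
print(struct.unpack_from("<I", sample, 0x10)[0])  # 262144, not 1024
```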
PE e_lfanew: a little-endian pointer
The PE header isn't at a fixed offset — you follow the little-endian DWORD at 0x3C. Read the four bytes reversed to get the target offset, then confirm the PE signature there.
    00000030 ... 80 00 00 00 ............
    00000080 50 45 00 00 4C 01 03 00 ... PE..L...........

    Parse:
    0x3C: 80 00 00 00 → e_lfanew = 0x00000080 (little-endian)
    @0x80: 50 45 00 00 → 'PE..' signature
    +4: 4C 01 → machine 0x014C = x86 (little-endian)
    Bug to avoid: reading 80 00 00 00 as 0x80000000.
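A short sketch of following the pointer in code, assuming the whole header region is already in memory. `parse_pe_machine` and `MACHINE_NAMES` are illustrative names, and the sample buffer is built by hand to match the dump above, with zero filler wherever the dump elides bytes.

```python
import struct

MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def parse_pe_machine(data):
    """Follow e_lfanew to the PE signature and return (e_lfanew, machine)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: little-endian DWORD at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError(f"no PE signature at {e_lfanew:#x}")
    # Machine type: little-endian WORD right after the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
    return e_lfanew, machine


# Hand-built buffer: only the bytes shown in the dump are meaningful.
sample = bytearray(0x88)
sample[0:2] = b"MZ"
sample[0x3C:0x40] = bytes.fromhex("80000000")
sample[0x80:0x88] = bytes.fromhex("504500004C010300")

pe_offset, machine = parse_pe_machine(bytes(sample))
print(hex(pe_offset), MACHINE_NAMES.get(machine, hex(machine)))  # 0x80 x86
```

Because the offset is read from the file each time, the same function keeps working when a different toolchain places the PE header somewhere other than 0x80.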
JPEG markers: big-endian segment lengths
After the SOI (FF D8), JPEG is a stream of FF xx markers. APP0 (FF E0) is JFIF; APP1 (FF E1) is EXIF. The 2-byte length after each marker is big-endian and includes the length field itself.
    00000000 FF D8 FF E0 00 10 4A 46 49 46 00 01 ... ......JFIF......

    Parse:
    FF D8 → SOI (start of image)
    FF E0 → APP0 marker
    00 10 → segment length 16 (big-endian, includes these 2 bytes)
    4A 46 49 46 → 'JFIF' identifier (readable in the sidebar)
    An EXIF JPEG would show FF E1 then 'Exif' instead.
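One way to walk the marker segments is sketched below. It is deliberately simplified: `iter_segments` is a hypothetical helper that stops at the start-of-scan marker (FF DA), after which entropy-coded image data follows rather than length-prefixed segments. The sample reuses the twelve bytes from the dump and pads the rest of the APP0 segment with zero filler, which is an assumption and not real JFIF content.

```python
import struct

SOS = 0xDA  # start of scan: compressed image data follows, so the walk ends


def iter_segments(data):
    """Yield (marker byte, marker offset, segment length) for each FF xx segment."""
    if data[:2] != b"\xFF\xD8":
        raise ValueError("missing SOI (FF D8)")
    offset = 2
    while offset + 4 <= len(data) and data[offset] == 0xFF:
        marker = data[offset + 1]
        # 2-byte big-endian length; it counts itself but not the FF xx marker.
        (length,) = struct.unpack_from(">H", data, offset + 2)
        yield marker, offset, length
        if marker == SOS:
            break
        offset += 2 + length  # skip the marker plus the whole segment


# Bytes from the dump, then placeholder zeros to fill out the 16-byte segment.
sample = bytes.fromhex("FFD8FFE000104A4649460001") + bytes(10)

segments = list(iter_segments(sample))
marker, offset, length = segments[0]
identifier = sample[offset + 4:offset + 8]
print(hex(marker), length, identifier)  # 0xe0 16 b'JFIF'
```

The `offset += 2 + length` step is where an off-by-two bug usually hides, since the length includes its own two bytes but excludes the marker.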
ZIP local header vs OOXML — same first 4 bytes
A .docx, .xlsx, .pptx, .jar, and a plain .zip all start with 50 4B 03 04 (PK..). The inspector can't tell them apart from offset 0 alone; the distinguishing content (the [Content_Types].xml entry name for OOXML) appears just after the fixed 30-byte part of the local header, starting at offset 0x1E.
    00000000 50 4B 03 04 14 00 06 00 08 00 ... PK..............
    ... (a few bytes later, the first entry name appears) ...
    5B 43 6F 6E 74 65 6E 74 5F 54 79 70 65 73 5D ... [Content_Types]

    Parse:
    50 4B 03 04 → ZIP local file header (also docx/xlsx/jar)
    0x08: 08 00 → compression method 8 = deflate
    '[Content_Types]' run → confirms this ZIP is OOXML
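A sketch of reading the local file header in code, including the entry name that disambiguates OOXML. `parse_local_header` is an illustrative name. The sample is constructed with `struct.pack` so that its first ten bytes match the dump; the timestamp, CRC, and size fields are zero placeholders rather than values from a real archive.

```python
import struct

# Fixed 30-byte part of a ZIP local file header, all little-endian:
# signature, version needed, flags, method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def parse_local_header(data):
    """Decode the first local file header and the entry name that follows it."""
    (sig, version, flags, method, _time, _date, _crc,
     _csize, _usize, name_len, _extra_len) = LOCAL_HEADER.unpack_from(data, 0)
    if sig != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    # The entry name starts right after the fixed part, at offset 0x1E.
    name = data[LOCAL_HEADER.size:LOCAL_HEADER.size + name_len]
    return {
        "version_needed": version,
        "encrypted": bool(flags & 0x0001),  # general-purpose flag bit 0
        "method": method,                   # 0 = store, 8 = deflate
        "first_entry": name.decode("utf-8", errors="replace"),
    }


entry = b"[Content_Types].xml"
sample = LOCAL_HEADER.pack(b"PK\x03\x04", 20, 0x0006, 8,
                           0, 0, 0, 0, 0, len(entry), 0) + entry

info = parse_local_header(sample)
print(info["method"], info["first_entry"])  # 8 [Content_Types].xml
```

A first entry of `[Content_Types].xml` is a strong OOXML hint, but a careful parser would check all entries rather than rely on ordering.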
MP4 starts with a size, not a magic byte
Box-based containers (MP4/MOV) don't put a magic number at offset 0 — they put the first box's size, then its 4-char type. The ftyp tag sits at offset 4, and the major brand at offset 8.
    00000000 00 00 00 18 66 74 79 70 69 73 6F 6D 00 00 02 00 ....ftypisom....

    Parse:
    0x00: 00 00 00 18 → ftyp box size = 24 (big-endian)
    0x04: 66 74 79 70 → 'ftyp'
    0x08: 69 73 6F 6D → major brand 'isom'
    Don't look for a magic byte at 0; read the box size first.
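The size-then-type layout maps directly onto a single `struct` read. This is a minimal sketch over the sixteen bytes shown above; `parse_ftyp` is a hypothetical helper, and it assumes the ftyp box is the first box in the file with an ordinary 32-bit size.

```python
import struct


def parse_ftyp(data):
    """Return (box size, major brand) from a leading ftyp box."""
    # Offset 0: 4-byte big-endian size, then the 4-character box type.
    size, box_type = struct.unpack_from(">I4s", data, 0)
    if box_type != b"ftyp":
        raise ValueError(f"first box is {box_type!r}, not ftyp")
    major_brand = data[8:12].decode("ascii")
    return size, major_brand


sample = bytes.fromhex("000000186674797069736F6D00000200")

print(parse_ftyp(sample))  # (24, 'isom')
```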
Edge cases and what actually happens
Reading a big-endian field as little-endian
Parser bug: PNG chunk lengths and JPEG segment lengths are big-endian; reading them reversed yields nonsense (e.g. a width of 1024 read as 0x00040000). The inspector shows raw bytes only and never converts endianness, so the conversion (and the bug) is on your code. Cross-check against a known-good file.
Assuming the PE header is at a fixed offset
Parser bug: There is no fixed PE-header offset; you must follow e_lfanew (little-endian DWORD at 0x3C). Hardcoding 0x80 works for many compilers and breaks on others. The inspector shows the real pointer value; always dereference it dynamically.
Field of interest is past your tier window
Out of window: If e_lfanew (or any field) points beyond 256 B (Free) / 1 KB (Pro) / 8 KB (Developer), the inspector can't display the target. Upgrade for a larger window, or read the full header in a desktop tool. The header maps in this guide all fit within 8 KB for normal files.
OOXML / JAR / ZIP indistinguishable at offset 0
Expected: 50 4B 03 04 is shared by ZIP, docx, xlsx, pptx, and jar. The first bytes alone can't disambiguate. Look a little further for the first entry name ([Content_Types].xml for OOXML, META-INF/ for JAR), which is visible in the ASCII sidebar within the window.
Non-null bytes where padding is expected
Anomaly: Reserved/padding fields that should be zero but aren't can indicate corruption, a non-conformant writer, or deliberate hiding. Capture the bytes from the dump as a regression test and make your parser reject or warn on out-of-spec values.
Multi-byte UTF-8 in a text-ish header
Expected: In headers with embedded text (PDF metadata, some tags), UTF-8 multi-byte sequences have bytes above 0x7E and render as . in the sidebar. Read the hex pairs to reconstruct codepoints; the sidebar only resolves single-byte ASCII (0x20–0x7E).
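To see why the sidebar shows dots, the sketch below renders a short UTF-8 string the way the sidebar is described as doing (printable ASCII only) and then decodes the same bytes properly. `sidebar` is a hypothetical stand-in for the inspector's rendering rule, not its actual code.

```python
def sidebar(data):
    """Render bytes as the ASCII column would: 0x20-0x7E as-is, the rest as '.'."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


raw = "café".encode("utf-8")

print(" ".join(f"{b:02X}" for b in raw))  # 63 61 66 C3 A9
print(sidebar(raw))                       # caf..
print(raw.decode("utf-8"))                # café
```

The two dots correspond to C3 A9, the two-byte UTF-8 encoding of é, which is why reading the hex pairs recovers the character when the sidebar cannot.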
Expecting field labels or a struct template
Not available: The inspector shows raw offset/hex/ASCII with no field names, no types, and no template engine. It's a reference for mapping bytes by hand. For declarative struct parsing (named fields, repeats), use 010 Editor binary templates on the desktop.
GIF87a vs GIF89a version bytes
Expected: Both GIF variants start 47 49 46 38; the next byte distinguishes them (37 = '7' for 87a, 39 = '9' for 89a), followed by 61 ('a'). The sidebar shows GIF87a or GIF89a directly, so read the version byte rather than assuming 89a.
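A small sketch of checking the version bytes instead of assuming 89a, and of reading the little-endian logical screen size that follows the six-byte header. `parse_gif_header` is an illustrative name, and the sample dimensions are made up for the example.

```python
import struct


def parse_gif_header(data):
    """Return (version, width, height) from the first 10 bytes of a GIF."""
    if data[:3] != b"GIF":
        raise ValueError("missing GIF signature")
    version = data[3:6].decode("ascii", errors="replace")
    if version not in ("87a", "89a"):
        raise ValueError(f"unknown GIF version: {version!r}")
    # Logical screen width and height: two little-endian WORDs at offset 6.
    width, height = struct.unpack_from("<HH", data, 6)
    return version, width, height


# Hypothetical sample: a GIF87a header declaring a 1024 x 768 logical screen.
sample = b"GIF87a" + bytes.fromhex("00040003")

print(parse_gif_header(sample))  # ('87a', 1024, 768)
```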
Frequently asked questions
Does the inspector convert endianness for me?
No. It shows raw bytes in offset/hex/ASCII form. You apply endianness yourself: read PNG and JPEG length fields big-endian (left to right) and PE/ELF-x86/ZIP fields little-endian (reverse the bytes). Getting this conversion wrong is the most common parser bug, so the inspector is a good ground-truth check.
What is the PE e_lfanew field?
A little-endian DWORD at offset 0x3C in the DOS/MZ header that stores the file offset of the PE signature (50 45 00 00). It varies per file, so always follow it dynamically — never hardcode the PE-header offset. The inspector shows the pointer value at 0x3C.
Why do some ZIP files start at a non-zero offset?
ZIP readers locate an archive through the end-of-central-directory record at the end of the file, not through its first bytes, so other data can sit in front of the archive. Self-extracting archives rely on this by prepending an executable stub, which pushes the PK 03 04 local header mid-file. The inspector always reads from offset 0, so it shows the stub, not the mid-file signature; use a desktop tool to seek.
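Outside the inspector, the amount of prepended data can be estimated from the end-of-central-directory record. The sketch below makes several simplifying assumptions: no ZIP64 records, the central directory sits immediately before the record, the archive's stored offsets were not rewritten after the stub was added, and the signature does not also occur inside an archive comment. `prepended_bytes` is a hypothetical helper, and the "stub" is just an MZ marker plus zero filler.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
LOCAL_SIGNATURE = b"PK\x03\x04"


def prepended_bytes(data):
    """Estimate how many bytes sit in front of the ZIP archive proper."""
    eocd = data.rfind(EOCD_SIGNATURE)
    if eocd < 0 or eocd + 22 > len(data):
        raise ValueError("no end-of-central-directory record found")
    # Record layout (little-endian): signature, 4 WORDs, central directory
    # size, central directory offset, comment length.
    fields = struct.unpack_from("<4sHHHHIIH", data, eocd)
    cd_size, cd_offset = fields[5], fields[6]
    # Where the directory really is, minus where the record says it is.
    return eocd - cd_size - cd_offset


# Build a tiny real archive in memory, then put a fake 64-byte stub in front.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"hi")
plain = buf.getvalue()
sfx_like = b"MZ" + bytes(62) + plain

shift = prepended_bytes(sfx_like)
print(shift, sfx_like[shift:shift + 4] == LOCAL_SIGNATURE)  # 64 True
```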
How do I read a JPEG marker segment length?
After each FF xx marker (e.g. FF E0 APP0/JFIF, FF E1 APP1/EXIF), the next two bytes are the segment length, big-endian, and they include those two length bytes themselves. Read them left to right from the dump.
Can I tell a .docx from a .zip in the inspector?
Not from offset 0 — both start 50 4B 03 04. Look a little further: OOXML files have [Content_Types].xml as an early entry, readable in the ASCII sidebar within the window. For an automatic verdict, magic-byte-validator reports the detected type.
What's the maximum header depth I can read?
256 bytes on Free, 1 KB on Pro / Pro+Media, and 8 KB on Developer. Every header map in this guide fits within 8 KB for normal files. For headers (or fields a pointer leads to) deeper than that, use a desktop hex editor that can seek.
Does it label fields or run templates like 010 Editor?
No. The inspector renders raw bytes with no field names, no tooltips, and no template engine. You map bytes to fields manually — which is exactly the skill this reference builds. For declarative struct parsing, use 010 Editor templates.
How do I confirm my parser reads the right bytes?
Drop a known-good file into the inspector, read the field at its documented offset, and compare to what your parser produces. A mismatch is almost always an endianness flip or an off-by-one offset. Capture the bytes for a regression test.
Why does an accented character in a header show as a dot?
The ASCII sidebar only resolves single-byte ASCII in 0x20–0x7E. UTF-8 multi-byte characters (e.g. é = C3 A9) have bytes above 0x7E, so each shows as .. Read the hex to reconstruct the codepoint.
Is the data sent anywhere when I inspect a sample?
No. The on-page tool reads the file locally via FileReader and renders in your browser. Nothing is uploaded — fine for inspecting proprietary or pre-release file samples while developing a parser.
Can I copy the bytes into a unit test?
Yes — the Copy button gives a continuous lowercase hex string of the sliced bytes. Decode it (e.g. bytes.fromhex(...) in Python) to build a fixture. Note it's not spaced or offset-annotated, so re-format if your test framework expects grouped bytes.
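For instance, a copied hex string can become a test fixture as sketched below. The string here is typed by hand to match the PNG example in this guide and stands in for real Copy output; `grouped` is a hypothetical helper for re-spacing the bytes.

```python
import struct

# Stand-in for a string pasted from the Copy button (continuous lowercase hex).
copied = "89504e470d0a1a0a0000000d4948445200000400000003000806000000"
fixture = bytes.fromhex(copied)


def grouped(data, per_line=16):
    """Re-format bytes as spaced hex pairs, per_line bytes per row."""
    rows = (data[i:i + per_line] for i in range(0, len(data), per_line))
    return "\n".join(" ".join(f"{b:02x}" for b in row) for row in rows)


def test_png_dimensions():
    # Width and height are big-endian DWORDs at 0x10 and 0x14.
    width, height = struct.unpack_from(">II", fixture, 0x10)
    assert (width, height) == (1024, 768)


test_png_dimensions()
print(grouped(fixture))
```

Keeping the fixture as a hex string in the test file makes the regression case readable in code review without needing a binary attachment.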
Should I use this or a detection library in my code?
Different jobs. The inspector is for you to read bytes by hand while developing. In production code, use a detection library — the JAD magic-byte-validator uses the file-type library (300+ formats) for that exact purpose. The inspector helps you understand what such a library sees.