How to use the Archive Metadata Extractor online for free
- Step 1: Open the tool. Visit /archive-tools/archive-metadata-extractor. No account is required for the free tier. The tool loads its WASM/JS chunks once, then runs offline.
- Step 2: Confirm your input is a ZIP. This tool reads the ZIP central directory only. If you are not sure the file is really a ZIP (a renamed .rar or .7z is common), run /archive-tools/auto-format-detector first; it reads the magic bytes and tells you the true format.
- Step 3: Drop the ZIP into the dropzone. Drag one .zip onto the upload area, or click to pick it. The dropzone shows "max 50 MB per file" on the free tier. This tool takes a single file; it is not a batch or folder tool.
- Step 4: Click Process. The button reads Process (not Generate, because this tool reads an archive and does not build one). Processing happens in-tab; there is no progress bar because central-directory parsing is near-instant even for thousands of entries.
- Step 5: Read the result panel. The panel shows the Entries metric (the count parsed from the End Of Central Directory record) and renders the JSON. You can copy it to the clipboard or download it.
- Step 6: Download the JSON report. Click Download to save <name>-metadata.json; for release.zip you get release-metadata.json. The top-level shape is { archive, totalEntries, entries: [...] }.
Every field in the JSON report
The exact keys emitted per entry by extractMetadata(), in order. The report root is { archive, totalEntries, entries }.
| JSON key | Source in the central-directory record | Example value |
|---|---|---|
| name | Filename bytes, decoded as UTF-8 (non-fatal; invalid bytes become U+FFFD) | src/index.ts |
| versionMadeBy | 16-bit field at offset +4; low byte = ZIP spec version, high byte = host OS | 798 (0x031E = Unix, spec 3.0) |
| hostOS | High byte of versionMadeBy (versionMadeBy >> 8) | 3 (Unix), 0 (DOS/FAT), 11 (NTFS) |
| versionNeeded | 16-bit field at offset +6: minimum reader version (e.g. 20 = 2.0, 45 = 4.5/ZIP64) | 20 |
| flags.encrypted | General-purpose bit 0 | false |
| flags.utf8 | General-purpose bit 11 (0x800): filename is UTF-8 | true |
| flags.dataDescriptor | General-purpose bit 3 (0x8): sizes/CRC follow the data | false |
| compressionMethod | Method code mapped to a name by compressionMethodName() | Deflate, Stored, AES |
| lastModified | DOS date+time pair decoded, then .toISOString() | 2026-05-04T13:22:14.000Z |
| crc32 | 32-bit CRC as lowercase hex, zero-padded to 8 chars | a1b2c3d4 |
| compressedSize / uncompressedSize | 32-bit size fields at offsets +20 / +24 | 1024 / 4096 |
| hasExtraField / hasComment | Booleans: true when the extra-field / comment length is non-zero | true / false |
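To make the offsets in the table concrete, here is a minimal sketch that reads the fixed-width fields of one central-directory record with a DataView. The names parseCentralRecord and EntrySketch are hypothetical, and this is not the tool's extractMetadata(); it only illustrates the same byte layout and leaves out compressionMethod and lastModified.

```typescript
// Hypothetical sketch, not the tool's code: read the fixed-width fields of
// ONE central-directory record. Offsets are relative to the record start.
interface EntrySketch {
  name: string;
  versionMadeBy: number;
  hostOS: number;
  versionNeeded: number;
  flags: { encrypted: boolean; utf8: boolean; dataDescriptor: boolean };
  crc32: string;
  compressedSize: number;
  uncompressedSize: number;
  hasExtraField: boolean;
  hasComment: boolean;
}

const CENTRAL_SIGNATURE = 0x02014b50;
const FIXED_RECORD_BYTES = 46; // the filename starts right after these

function parseCentralRecord(bytes: Uint8Array, offset = 0): EntrySketch {
  const view = new DataView(bytes.buffer, bytes.byteOffset + offset);
  if (view.getUint32(0, true) !== CENTRAL_SIGNATURE) {
    throw new Error("Not a central-directory record");
  }
  const versionMadeBy = view.getUint16(4, true);
  const gpFlags = view.getUint16(8, true);
  const nameLength = view.getUint16(28, true);
  const nameStart = offset + FIXED_RECORD_BYTES;
  const nameBytes = bytes.subarray(nameStart, nameStart + nameLength);
  const crcHex = view.getUint32(16, true).toString(16);
  return {
    // Lenient decode: invalid UTF-8 becomes U+FFFD instead of throwing.
    name: new TextDecoder("utf-8", { fatal: false }).decode(nameBytes),
    versionMadeBy,
    hostOS: versionMadeBy >> 8,
    versionNeeded: view.getUint16(6, true),
    flags: {
      encrypted: (gpFlags & 0x1) !== 0,
      utf8: (gpFlags & 0x800) !== 0,
      dataDescriptor: (gpFlags & 0x8) !== 0,
    },
    crc32: ("00000000" + crcHex).slice(-8), // lowercase hex, 8 chars
    compressedSize: view.getUint32(20, true),
    uncompressedSize: view.getUint32(24, true),
    hasExtraField: view.getUint16(30, true) !== 0,
    hasComment: view.getUint16(32, true) !== 0,
  };
}
```

All multi-byte ZIP fields are little-endian, which is why every DataView read passes true as the second argument.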
Compression-method codes you may see
The method name comes from compressionMethodName(). Unknown codes render as 'Method N' verbatim.
| ZIP method code | Reported name | What it means |
|---|---|---|
| 0 | Stored | No compression — bytes stored as-is |
| 1 | Shrunk | Legacy PKWARE Shrink (rare) |
| 6 | Imploded | Legacy PKWARE Implode (rare) |
| 8 | Deflate | The standard ZIP method — almost everything you will see |
| 9 | Deflate64 | Enhanced Deflate (large windows) |
| 12 | BZIP2 | BZIP2 inside a ZIP container |
| 14 | LZMA | LZMA inside a ZIP container |
| 95 | XZ | XZ inside a ZIP container |
| 98 | PPMd | PPMd inside a ZIP container |
| 99 | AES | AES-encrypted entry; the real method is recorded in the AES extra field, and flags.encrypted will be true |
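The lookup-with-fallback behaviour described above can be sketched as follows. methodNameSketch is a hypothetical stand-in, since the tool's own compressionMethodName() is not shown on this page; XZ and PPMd are included because the page lists them among the known names, using their PKWARE APPNOTE codes 95 and 98.

```typescript
// Hypothetical mirror of the code-to-name mapping; the real
// compressionMethodName() may be implemented differently.
const METHOD_NAMES = new Map<number, string>([
  [0, "Stored"],
  [1, "Shrunk"],
  [6, "Imploded"],
  [8, "Deflate"],
  [9, "Deflate64"],
  [12, "BZIP2"],
  [14, "LZMA"],
  [95, "XZ"],   // APPNOTE code for XZ
  [98, "PPMd"], // APPNOTE code for PPMd
  [99, "AES"],
]);

function methodNameSketch(code: number): string {
  // Unknown codes keep their raw number so nothing is lost in the report.
  return METHOD_NAMES.get(code) ?? `Method ${code}`;
}
```

With this mapping, code 8 yields Deflate and an unlisted code such as 7 yields Method 7.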
Tier limits for this tool
Per-job limits from the archive family in tier-limits.ts. This tool reads one file at a time, so batch counts are not used here.
| Tier | Max file size | Max entries per archive |
|---|---|---|
| Free | 50 MB | 500 |
| Pro | 500 MB | 50,000 |
| Pro-media | 2 GB | 500,000 |
| Developer | 2 GB | 500,000 |
Cookbook
Real ZIPs produce reports that answer real questions. These examples show the JSON you actually get back and what each field tells you.
Tell who built the ZIP from hostOS
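The high byte of versionMadeBy identifies the creating system. A Unix-built ZIP carries hostOS 3; a Windows/NTFS one carries 11; a DOS/FAT one carries 0. Unix permission bits are stored in each entry's external-attributes field and are only meaningful when hostOS is 3 (Unix writers also commonly add extra fields for timestamps and owner IDs). This is the field that explains why some ZIPs preserve chmod and others lose it.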
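If you want to split versionMadeBy yourself, the arithmetic is a shift and a mask. The helper name describeVersionMadeBy is hypothetical; it is a small sketch of the byte split, not part of the tool.

```typescript
// Hypothetical helper: split versionMadeBy into host OS and spec version.
function describeVersionMadeBy(versionMadeBy: number): { hostOS: number; specVersion: string } {
  const hostOS = versionMadeBy >> 8; // high byte: creating system
  const spec = versionMadeBy & 0xff; // low byte: 30 means spec 3.0
  return { hostOS, specVersion: `${Math.floor(spec / 10)}.${spec % 10}` };
}
```

For the value 798 (0x031E) this gives hostOS 3 and specVersion "3.0", matching the example in the field table.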
Report excerpt for a ZIP built on Linux:
    {
      "archive": "build.zip",
      "totalEntries": 2,
      "entries": [
        {
          "name": "run.sh",
          "versionMadeBy": 798,
          "hostOS": 3,
          "versionNeeded": 20,
          "flags": { "encrypted": false, "utf8": true, "dataDescriptor": false },
          "compressionMethod": "Deflate",
          ...
        }
      ]
    }
hostOS 3 = Unix. A Windows-built ZIP of the same files would show hostOS 11 here.
Spot encrypted entries without the password
flags.encrypted comes from general-purpose bit 0, which lives in plaintext in the central directory — so you can see which entries are encrypted without ever decrypting anything. AES entries also report compressionMethod 'AES'.
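Once you have the downloaded report, listing the encrypted entries is a one-line filter. The sketch below assumes the report shape documented on this page; the type names and the function encryptedEntryNames are hypothetical.

```typescript
// Hypothetical types covering only the fields this filter needs.
interface FlaggedEntry {
  name: string;
  flags: { encrypted: boolean };
  compressionMethod: string;
}
interface ReportSketch {
  archive: string;
  totalEntries: number;
  entries: FlaggedEntry[];
}

// Return the names of all entries whose general-purpose bit 0 was set.
function encryptedEntryNames(report: ReportSketch): string[] {
  return report.entries.filter((entry) => entry.flags.encrypted).map((entry) => entry.name);
}
```

In practice you would JSON.parse the contents of <name>-metadata.json and pass the result in. An encrypted entry appears in the report like this: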
    {
      "name": "secret.docx",
      "flags": { "encrypted": true, "utf8": false, "dataDescriptor": false },
      "compressionMethod": "AES",
      "crc32": "00000000"
    }
→ Encrypted. To test a candidate password use /archive-tools/archive-password-tester; to classify ZipCrypto-vs-AES across the whole archive use /archive-tools/encrypted-archive-detector.
Find non-UTF-8 filenames before they mojibake
When flags.utf8 is false, the filename was stored in a legacy OEM/ANSI code page. The tool still decodes the bytes as UTF-8 (non-fatal), so a name with accents may come back with U+FFFD replacement characters. The flag is your signal that the name needs code-page handling on extraction.
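The effect is easy to reproduce. The sketch below decodes the CP-1252 bytes of "résumé.pdf" leniently as UTF-8, then shows one possible recovery. The decodeLatin1 helper is hypothetical and only an approximation: Latin-1 agrees with CP-1252 everywhere except the 0x80 to 0x9F range.

```typescript
// Bytes of "résumé.pdf" as a CP-1252 writer would store them (é = 0xE9).
const legacyNameBytes = new Uint8Array([
  0x72, 0xe9, 0x73, 0x75, 0x6d, 0xe9, 0x2e, 0x70, 0x64, 0x66,
]);

// Lenient UTF-8 decode: each lone 0xE9 is invalid and becomes U+FFFD.
const lenientName = new TextDecoder("utf-8", { fatal: false }).decode(legacyNameBytes);

// Hypothetical fallback: map every byte to the code point of the same value
// (Latin-1). Good enough for accented Western European letters.
function decodeLatin1(bytes: Uint8Array): string {
  let out = "";
  bytes.forEach((byte) => {
    out += String.fromCharCode(byte);
  });
  return out;
}

const recoveredName = decodeLatin1(legacyNameBytes);
```

Here lenientName is "r\uFFFDsum\uFFFD.pdf" and recoveredName is "résumé.pdf". A report entry for such a name looks like this: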
    {
      "name": "r\uFFFDsum\uFFFD.pdf",
      "flags": { "encrypted": false, "utf8": false, "dataDescriptor": false }
    }
utf8: false → the original was likely 'résumé.pdf' stored as CP-1252. The \uFFFD marks bytes that are not valid UTF-8. To clean such names on extraction, see /archive-tools/filename-sanitizer.
Read the stored timestamp of every entry
lastModified is the MS-DOS date/time pair decoded to a local Date, then serialised with toISOString() (UTC). DOS timestamps have 2-second granularity and no timezone, so seconds are always even and the wall-clock interpretation is the builder's local time.
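The decoding described above comes down to a handful of bit fields. This is a minimal sketch assuming the standard MS-DOS layout (7 bits of year since 1980, 4 of month, 5 of day; 5 bits of hour, 6 of minute, 5 of two-second units). The name decodeDosDateTime is hypothetical, and out-of-range fields are not validated.

```typescript
// Hypothetical sketch: decode an MS-DOS date/time pair into a local Date.
function decodeDosDateTime(dosDate: number, dosTime: number): Date {
  const year = ((dosDate >> 9) & 0x7f) + 1980;
  const month = (dosDate >> 5) & 0x0f; // 1 to 12
  const day = dosDate & 0x1f;
  const hours = (dosTime >> 11) & 0x1f;
  const minutes = (dosTime >> 5) & 0x3f;
  const seconds = (dosTime & 0x1f) * 2; // stored in 2-second units, so always even
  // Local-time constructor: the DOS fields carry no timezone information.
  return new Date(year, month - 1, day, hours, minutes, seconds);
}
```

Calling toISOString() on the result converts that local wall-clock time to UTC, so the exact string depends on the timezone of the machine doing the decoding. A report excerpt with two entries: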
{ "name": "app.js", "lastModified": "2026-05-04T13:22:14.000Z" }
{ "name": "app.css", "lastModified": "2026-05-04T13:22:14.000Z" }
All-identical timestamps usually mean a reproducible build. To force a fixed timestamp on output, use /archive-tools/timestamp-normalizer (default 1980-01-01).
Audit compression ratio from the raw sizes
Each entry reports compressedSize and uncompressedSize straight from the central directory. Divide to get the per-entry ratio. The report does not compute the ratio for you — that is what the ratio calculator is for — but the inputs are right here.
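If you are post-processing the report in a script, the division is trivial to wrap. The helper compressionRatio is hypothetical, and returning 1 for zero-byte entries is an assumed convention to avoid dividing by zero.

```typescript
// Hypothetical helper: compressed bytes as a fraction of original bytes.
function compressionRatio(compressedSize: number, uncompressedSize: number): number {
  if (uncompressedSize === 0) return 1; // assumed convention for empty entries
  return compressedSize / uncompressedSize;
}
```

Lower values mean better compression; a value near 1 means the entry barely shrank. Two report entries and their ratios: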
{ "name": "data.csv", "compressedSize": 18234, "uncompressedSize": 142998 }
→ 18234 / 142998 ≈ 0.128 (compressed to ~13%)
{ "name": "photo.jpg", "compressedSize": 482100, "uncompressedSize": 484992 }
→ already entropy-dense; barely shrinks.
For a full per-file ratio breakdown without doing the math by hand: /archive-tools/compression-ratio-calculator.
Edge cases and what actually happens
Input is not a ZIP (a renamed 7z, RAR, or TAR)
Unsupported format: This tool parses the ZIP central directory only; it does not use the libarchive WASM bridge. If findEocd cannot locate the End Of Central Directory signature, parsing returns zero entries and the tool throws 'Not a valid ZIP archive (or unsupported format for metadata extraction).' A renamed .rar/.7z/.tar.gz hits this. Run /archive-tools/auto-format-detector to confirm the real format first.
Truly empty ZIP (valid EOCD, zero entries)
Rejected: An archive whose central directory contains no entries also produces an empty entries array, so the same 'Not a valid ZIP archive…' error is thrown. The check is entries.length === 0, which cannot distinguish 'not a ZIP' from 'an empty ZIP'. There is nothing to report for an entry-less archive.
File larger than your tier cap
Tier limit exceeded: Before processing, the client checks file.size against the tier ceiling and throws 'File "<name>" exceeds the <tier> tier per-job limit (<size>). Upgrade for larger files.' Free is 50 MB, Pro 500 MB, Pro-media and Developer 2 GB. The cap is on file bytes; the separate 500-entry (free) / 50,000 (pro) entry ceiling is enforced by the tier schema.
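A pre-flight check of this kind might look like the sketch below. Everything here is illustrative: the names Tier, MAX_FILE_BYTES and assertWithinTier are hypothetical, binary megabytes are an assumption (the page does not say whether 50 MB means 50 × 1024 × 1024 bytes), and the real limits live in tier-limits.ts, which is not shown.

```typescript
// Hypothetical sketch of the size gate; not the contents of tier-limits.ts.
type Tier = "free" | "pro" | "pro-media" | "developer";

const MB = 1024 * 1024; // assumption: binary megabytes

const MAX_FILE_BYTES: Record<Tier, number> = {
  free: 50 * MB,
  pro: 500 * MB,
  "pro-media": 2048 * MB,
  developer: 2048 * MB,
};

function assertWithinTier(fileName: string, fileSize: number, tier: Tier): void {
  const limit = MAX_FILE_BYTES[tier];
  if (fileSize > limit) {
    // The "<size>" formatting below is a guess at how the limit is rendered.
    const size = `${Math.round(limit / MB)} MB`;
    throw new Error(
      `File "${fileName}" exceeds the ${tier} tier per-job limit (${size}). Upgrade for larger files.`
    );
  }
}
```

Running the check before reading the file keeps an oversized archive from ever being loaded into memory.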
Encrypted ZIP
Supported: Encryption does not block metadata extraction. With ZipCrypto and WinZip AES the central directory is stored in plaintext, so names, sizes, timestamps, methods, and flags.encrypted all read normally. Only the entry payloads are encrypted, and this tool never touches payloads. AES entries report compressionMethod: 'AES' and commonly a crc32 of 00000000. Archives that use PKWARE strong encryption with an encrypted central directory are a rare exception.
Entry uses a non-standard compression method code
Preserved: If a method code is not in the known list (Stored, Shrunk, Imploded, Deflate, Deflate64, BZIP2, LZMA, XZ, PPMd, AES), compressionMethodName() returns the literal string Method N with the raw code. The entry is still reported in full; nothing is dropped.
Self-extracting ZIP (.exe with a ZIP payload appended)
Supported: Because findEocd scans backwards from the end of the file for the EOCD signature, a self-extracting archive (an executable stub followed by a ZIP) is read correctly; the central directory still sits near the end. The stub bytes are ignored.
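The backward scan can be sketched in a few lines. The function name findEocdSketch is hypothetical and this is not the tool's findEocd; it relies only on two facts from the ZIP format: the EOCD record is at least 22 bytes long, and its trailing comment is at most 65,535 bytes.

```typescript
// Hypothetical sketch of a backward EOCD search; not the tool's findEocd.
const EOCD_SIGNATURE = 0x06054b50;
const EOCD_MIN_SIZE = 22; // fixed part of the record, without a comment

function findEocdSketch(bytes: Uint8Array): number {
  if (bytes.length < EOCD_MIN_SIZE) return -1;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // The archive comment is at most 0xFFFF bytes, so the signature cannot
  // start earlier than this position.
  const lowest = Math.max(0, bytes.length - EOCD_MIN_SIZE - 0xffff);
  for (let i = bytes.length - EOCD_MIN_SIZE; i >= lowest; i--) {
    if (view.getUint32(i, true) === EOCD_SIGNATURE) return i;
  }
  return -1; // no signature: not a ZIP, or too damaged to parse
}
```

Because the search starts at the end, any executable stub at the front of the file is never examined. A return value of -1 corresponds to the "not a ZIP" case.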
ZIP comment present but you want the comment text
By design: The report tells you hasComment: true / hasExtraField: true per entry but does NOT include the comment text or extra-field contents. To read the global archive comment use /archive-tools/comment-extractor; per-entry comments are surfaced there too. This tool reports presence, not payload, for those two fields.
Filename with invalid UTF-8 bytes
Decoded leniently: Names are decoded with TextDecoder('utf-8', { fatal: false }), so byte sequences that are not valid UTF-8 are replaced with U+FFFD (the replacement character) rather than throwing. Combined with flags.utf8: false, a U+FFFD in a name is your signal that the original used a legacy code page.
More than 65,535 entries (16-bit count overflow)
Truncated: The entry count is read from the 16-bit EOCD field, which maxes out at 65,535. A ZIP64 archive with more entries than that stores the placeholder 0xFFFF in this field (some non-conforming writers let the count wrap instead), so totalEntries is wrong and the loop stops early. Most real archives are far below this; if you suspect a giant ZIP64 archive, verify the entry count with another tool. The free/pro entry caps (500 / 50,000) keep typical inputs well within range.
Corrupt or truncated central directory
Partial / rejected: The parser advances entry-by-entry and stops the moment a record does not start with the central-directory signature 0x02014b50. A truncated directory therefore yields a partial report (whatever parsed before the break); a directory damaged before the first entry yields zero entries and the 'Not a valid ZIP archive…' error. Run /archive-tools/corrupted-zip-repair to attempt recovery.
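The stop-on-bad-signature behaviour can be illustrated with a short walk over the directory. This sketch is hypothetical (listEntryNames is not a function of the tool) and assumes the central-directory offset stored in the EOCD is relative to byte 0 of the buffer, which may not hold for a self-extracting archive.

```typescript
// Hypothetical sketch: walk central-directory records and collect names,
// stopping at the first record that lacks the 0x02014b50 signature.
function listEntryNames(bytes: Uint8Array, eocdOffset: number): string[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const count = view.getUint16(eocdOffset + 10, true); // 16-bit total entries
  let pos = view.getUint32(eocdOffset + 16, true); // start of central directory
  const decoder = new TextDecoder("utf-8", { fatal: false });
  const names: string[] = [];
  for (let i = 0; i < count; i++) {
    // A missing signature or a record running past the buffer ends the walk,
    // leaving whatever was parsed so far as a partial result.
    if (pos + 46 > bytes.length || view.getUint32(pos, true) !== 0x02014b50) break;
    const nameLength = view.getUint16(pos + 28, true);
    const extraLength = view.getUint16(pos + 30, true);
    const commentLength = view.getUint16(pos + 32, true);
    names.push(decoder.decode(bytes.subarray(pos + 46, pos + 46 + nameLength)));
    // Each record is 46 fixed bytes plus three variable-length tails.
    pos += 46 + nameLength + extraLength + commentLength;
  }
  return names;
}
```

An empty result from a walk like this is what triggers the 'Not a valid ZIP archive…' error described above; a shorter-than-expected result is the partial-report case.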
Frequently asked questions
Does the Metadata Extractor upload my files?
No. The ZIP is read with file.arrayBuffer() through the browser File API and parsed in-tab — there is no server-side path for archive tools (browserOnly: true). Open DevTools → Network during a run and you will see no outbound request carrying the file bytes.
Which formats can it read?
ZIP only. This tool walks the ZIP central directory; it does not invoke the libarchive WASM bridge used elsewhere in the suite for 7z/RAR/TAR/XZ/BZ2. If you drop a non-ZIP file it throws 'Not a valid ZIP archive (or unsupported format for metadata extraction).' Identify the real format with /archive-tools/auto-format-detector.
What's the maximum file size and entry count?
Free: 50 MB and 500 entries. Pro: 500 MB and 50,000 entries. Pro-media: 2 GB and 500,000 entries. Developer: 2 GB and 500,000 entries. The size cap is checked against the file before processing; the entry cap comes from the archive tier schema.
What exact fields does the report contain?
Per entry: name, versionMadeBy, hostOS, versionNeeded, a flags object with encrypted/utf8/dataDescriptor, compressionMethod, lastModified (ISO 8601), crc32 (8-hex), compressedSize, uncompressedSize, hasExtraField, and hasComment. The root object is { archive, totalEntries, entries }.
Can I get CSV instead of JSON?
Not from this tool — it outputs JSON only (<name>-metadata.json). If you want a CSV/TXT listing of names, sizes, ratios, methods, CRCs and the encryption flag, use /archive-tools/file-listing-generator, which offers csv/json/txt output.
Does it include the comment text or extra-field bytes?
No. It reports hasComment and hasExtraField as booleans — presence, not contents. To read the actual comment text use /archive-tools/comment-extractor; to inspect signing-related extra fields use /archive-tools/archive-signing-info.
Why are all my lastModified seconds even?
MS-DOS timestamps store seconds in 2-second units, so the decoded second is always even. DOS timestamps also carry no timezone — the tool decodes to a local Date and serialises with toISOString(), so the UTC string you see reflects the builder's local clock shifted to UTC.
What does versionNeeded tell me?
It is the minimum ZIP reader version required to extract the entry: 20 means 2.0 (standard Deflate), 45 means 4.5 and signals ZIP64. A surprisingly high versionNeeded is a hint the archive uses ZIP64 or an advanced method your downstream tooling may not support.
Can I tell if a ZIP was made on Windows vs Linux?
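Yes: read hostOS. 0 is DOS/FAT, 3 is Unix (Linux/macOS), 11 is NTFS/Windows. Unix-built ZIPs store permission bits in each entry's external attributes, which are only meaningful for Unix-made entries; that is why chmod survives across some ZIPs and not others. Unix writers also commonly add extra fields for timestamps and owner IDs, so hasExtraField is often true.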
Why does an AES entry show crc32 of 00000000?
AES-encrypted ZIP entries (method code 99, reported as AES) commonly store a zero CRC in the central directory: the AE-2 variant of WinZip AES omits the CRC and relies on the authentication code stored with the encrypted data instead, while AE-1 entries keep the real CRC. A 00000000 CRC alongside flags.encrypted: true and compressionMethod: 'AES' is expected, not corruption.
Can I integrate this with CI/CD?
Archive tools have no public REST API (apiAvailable: false) — they run in the browser, or on Pro+ tiers through the @jadapps/runner headless-browser path. For a scripted pipeline today, run the same central-directory parse with a Node ZIP library; the JSON shape this tool emits is a good target schema to mirror.
Which browsers are supported?
Any browser with the File API and WebAssembly: Chrome, Edge, Firefox, Safari, Brave, Opera. Mobile works too, though very large archives may exceed device memory since the whole file is read into an ArrayBuffer.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.