How to use the ZIP metadata extractor in developer workflows
- Step 1: Grab the artifact. Pull the release .zip (or .jar/.aar/.apk, all ZIP containers) from your CI output, a PR's build attachment, or a registry. No branch checkout needed.
- Step 2: Read its metadata. Open /archive-tools/archive-metadata-extractor, drop the file, click Process. The JSON shows every entry's headers instantly.
- Step 3: Verify the reproducible-build invariant. Check that all lastModified values equal your fixed epoch and that crc32 values match a known-good build. A drift in either means the build is not bit-reproducible.
- Step 4: Diff two builds. Download the JSON for the old and new artifact, then run diff <(jq -S . old.json) <(jq -S . new.json). Changed method, size, timestamp, or flags surface immediately.
- Step 5: Fix what you find with the right sibling. Non-fixed timestamps → /archive-tools/timestamp-normalizer (rewrites to 1980-01-01 by default). Need a hash to pin in CI → /archive-tools/checksum-generator. Stray top-level folder → /archive-tools/path-prefix-remover.
- Step 6: Pin a checksum in CI. After normalising, capture a SHA-256 with /archive-tools/checksum-generator and assert it in your pipeline so future builds that drift fail loudly.
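The check in Step 3 can also be scripted against two downloaded reports. The following is a minimal sketch, not the tool's own logic: it assumes each report has a top-level entries array whose items carry name, lastModified and crc32 (as the jq recipes on this page suggest), and it treats crc32 as an opaque value compared for equality. The function name and the FIXED_EPOCH constant are hypothetical.

```python
# Assumed to match the epoch your timestamp normaliser writes.
FIXED_EPOCH = "1980-01-01T00:00:00.000Z"


def reproducibility_drift(report, known_good, epoch=FIXED_EPOCH):
    """Return a list of drift findings; an empty list means no drift was seen."""
    good_crc = {e["name"]: e["crc32"] for e in known_good["entries"]}
    seen = set()
    findings = []
    for entry in report["entries"]:
        name = entry["name"]
        seen.add(name)
        # Invariant 1: every entry carries the fixed timestamp.
        if entry["lastModified"] != epoch:
            findings.append(f"{name}: lastModified is {entry['lastModified']}")
        # Invariant 2: every entry's stored CRC matches the known-good build.
        if name not in good_crc:
            findings.append(f"{name}: not in known-good build")
        elif entry["crc32"] != good_crc[name]:
            findings.append(f"{name}: crc32 differs from known-good build")
    # Entries that disappeared are drift too.
    for name in sorted(set(good_crc) - seen):
        findings.append(f"{name}: missing from this build")
    return findings
```

In a pipeline, load both JSON files with json.load, call the function, and fail the job when the returned list is non-empty.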
Developer questions answered by each field
What to read in the JSON for common developer checks. All fields come straight from the ZIP central directory.
| Field | Developer question it answers | Act on it with |
|---|---|---|
| lastModified | Are timestamps fixed for a reproducible build? | /archive-tools/timestamp-normalizer |
| crc32 | Did the entry's content change between builds? | /archive-tools/archive-integrity-tester |
| compressionMethod | Stored vs Deflate: is anything accidentally uncompressed? | /archive-tools/smart-archive-compressor |
| compressedSize / uncompressedSize | How well does each entry compress? | /archive-tools/compression-ratio-calculator |
| hostOS | Was it built on the OS I expect (Unix vs NTFS)? | (provenance check) |
| name | Is there an unwanted top-level folder prefix? | /archive-tools/path-prefix-remover |
| flags.utf8 | Are filenames UTF-8 or legacy code page? | /archive-tools/filename-sanitizer |
| flags.encrypted | Did an encrypted entry sneak into a release? | /archive-tools/encrypted-archive-detector |
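Two of the rows above lend themselves to an automated release gate. This sketch assumes flags is a nested object with a boolean encrypted key and that compressionMethod is the string "Stored" for uncompressed entries, as the recipes on this page indicate. The function name and the stored_threshold parameter are hypothetical and only there to skip directory entries and tiny files.

```python
def release_findings(report, stored_threshold=64 * 1024):
    """Return (kind, entry name) pairs worth a look before shipping."""
    findings = []
    for entry in report["entries"]:
        name = entry["name"]
        # An encrypted entry in a public release is almost always a mistake.
        if entry["flags"]["encrypted"]:
            findings.append(("encrypted", name))
        # Large Stored entries bloat the artifact; small ones are ignored.
        if (entry["compressionMethod"] == "Stored"
                and entry["uncompressedSize"] >= stored_threshold):
            findings.append(("stored", name))
    return findings
```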
Batch reality for this tool
What this tool does and does not do — so you route batch jobs correctly. This tool is single-file and read-only.
| Capability | This tool | Where to go instead |
|---|---|---|
| Multiple files at once | No — single file per run | /archive-tools/batch-compression-report (analysis across many archives) |
| Writes / modifies an archive | No — read-only JSON report | /archive-tools/timestamp-normalizer, /archive-tools/path-prefix-remover |
| Non-ZIP formats | No — ZIP central directory only | /archive-tools/archive-previewer |
| Public REST API | No (apiAvailable: false) | @jadapps/runner on Pro+ (headless browser) |
| Output format | JSON only | /archive-tools/file-listing-generator (csv/json/txt) |
Cookbook
Developer-shaped recipes against real artifacts. The JSON shapes shown are exactly what the tool returns.
Assert a reproducible build's fixed timestamps
A reproducible build pins every entry's mtime to a constant (1980-01-01 is the ZIP epoch and the de facto standard). One entry with a wall-clock timestamp means the build is not reproducible. The report makes the outlier obvious.
$ jq '[.entries[].lastModified] | unique' release-metadata.json
[
  "1980-01-01T00:00:00.000Z",
  "2026-05-04T13:22:14.000Z"   <-- not normalised!
]
→ One entry escaped normalisation. Re-run the build through
  /archive-tools/timestamp-normalizer (default 1980-01-01) so every
  entry collapses to one timestamp.
Diff two release artifacts' headers
Because the schema is stable, diffing two reports highlights exactly which entries changed method, size, timestamp or CRC — without unzipping either build.
$ diff <(jq -S . v1.2.0-metadata.json) <(jq -S . v1.2.1-metadata.json)
< "compressionMethod": "Deflate",
> "compressionMethod": "Stored",
< "compressedSize": 18234,
> "compressedSize": 142998,
→ data.csv flipped from Deflate to Stored and ballooned. Re-zip with
  /archive-tools/smart-archive-compressor.
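A plain text diff does not say which entry a changed line belongs to. As a rough alternative, the sketch below keys both reports by entry name and lists the fields that differ. It assumes the same entries array as the jq examples; the function name and the default field list are illustrative choices, not part of the tool.

```python
DEFAULT_FIELDS = ("compressionMethod", "compressedSize",
                  "uncompressedSize", "lastModified", "crc32")


def diff_reports(old, new, fields=DEFAULT_FIELDS):
    """Summarise added, removed and changed entries between two reports."""
    old_by_name = {e["name"]: e for e in old["entries"]}
    new_by_name = {e["name"]: e for e in new["entries"]}
    changed = {}
    for name in sorted(old_by_name.keys() & new_by_name.keys()):
        before, after = old_by_name[name], new_by_name[name]
        # Keep only the fields whose value actually moved.
        delta = {f: (before.get(f), after.get(f))
                 for f in fields if before.get(f) != after.get(f)}
        if delta:
            changed[name] = delta
    return {
        "added": sorted(new_by_name.keys() - old_by_name.keys()),
        "removed": sorted(old_by_name.keys() - new_by_name.keys()),
        "changed": changed,
    }
```

For the example above, the changed map would hold a single data.csv key with the old and new compressionMethod and compressedSize values.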
Catch an accidentally-Stored entry
compressionMethod is per-entry. A single 'Stored' file in an otherwise-Deflate archive usually means a build step added it raw. The report flags it before the bloated artifact ships.
$ jq '.entries[] | select(.compressionMethod == "Stored")
| {name, uncompressedSize}' app-metadata.json
{ "name": "vendor/big.json", "uncompressedSize": 2400000 }
→ 2.4 MB stored uncompressed. Recompress the artifact via
/archive-tools/smart-archive-compressor (level 6 default).
Explain lost permissions via hostOS
Executable bits are recorded as Unix mode bits (in the entry's external attributes, sometimes mirrored in Unix extra fields) and are only written when the ZIP was built on a Unix host (hostOS 3). If your CI artifact reports hostOS 11 (NTFS), the bits never made it in, which is why scripts come out non-executable after extraction.
$ jq '[.entries[] | {host: .hostOS, extra: .hasExtraField}]
| unique' ci-artifact-metadata.json
[ { "host": 11, "extra": false } ]
→ Built on NTFS, no extra fields = no Unix mode bits.
Build the ZIP on a Unix runner (hostOS 3) if you need
+x to survive extraction.
Strip a stray top-level folder before shipping
If every entry name starts with a build-dir prefix, consumers extract into an extra nested folder. The report's name list reveals the common prefix; the path-prefix-remover strips it.
$ jq '[.entries[].name] | .[0:3]' dist-metadata.json
[
  "my-app-1.2.0/index.js",
  "my-app-1.2.0/styles.css",
  "my-app-1.2.0/README.md"
]
→ Shared 'my-app-1.2.0/' prefix. Remove it with
  /archive-tools/path-prefix-remover (auto-detects a single common
  top-level folder when prefix is left empty).
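Looking at the first three names is a spot check; confirming the prefix across every entry is easy to script. The helper below is a sketch of one way to detect a single shared top-level folder from the report's name list. It is not the path-prefix-remover's detection logic, and the function name is hypothetical.

```python
def common_top_level(names):
    """Return 'folder/' if every name sits under one top-level folder, else None."""
    tops = set()
    for name in names:
        head, sep, _rest = name.partition("/")
        if not sep:
            # A file at the archive root means there is no shared wrapper folder.
            return None
        tops.add(head)
    if len(tops) == 1:
        return tops.pop() + "/"
    return None
```

Feed it the list produced by [e["name"] for e in report["entries"]]; a non-None result is the prefix to strip.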
Edge cases and what actually happens
Artifact is a JAR / AAR / APK
Supported. These are all ZIP containers, so the tool reads them fine; you will see META-INF/MANIFEST.MF and friends as entries. To inspect the signing extra fields and signature blocks specifically, use /archive-tools/archive-signing-info; this tool only reports hasExtraField, not the signature contents.
You want to process a whole batch of artifacts
Single-file only. This tool reads one file per run; it is not in the multi-file set and shows no batch input. For metrics across many archives use /archive-tools/batch-compression-report; for many extractions use /archive-tools/batch-extraction-manager.
You expected the tool to fix the timestamps
Read-only. The Metadata Extractor reports; it does not modify. It will show you a non-reproducible timestamp but will not rewrite it. Use /archive-tools/timestamp-normalizer (default 1980-01-01) to actually fix the archive, then re-run this tool to confirm.
CRC matches but you want true verification
Stored value only. The reported crc32 is the value stored in the directory, not a fresh recomputation. Two builds with matching stored CRCs are a strong reproducibility signal, but to verify the payload actually matches, run /archive-tools/archive-integrity-tester (recomputes) or pin a whole-file SHA-256 via /archive-tools/checksum-generator.
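To make the difference between a stored and a recomputed CRC concrete, here is a small sketch using Python's standard zipfile and zlib modules. It is an illustration of recomputation in general, not how /archive-tools/archive-integrity-tester is implemented, and the function name is hypothetical.

```python
import io
import zipfile
import zlib


def recompute_crcs(zip_bytes):
    """Map entry name -> (stored CRC-32, CRC-32 recomputed from the payload)."""
    result = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            # zf.read() decompresses the entry. zipfile itself raises
            # BadZipFile here if the payload disagrees with the stored CRC.
            payload = zf.read(info)
            result[info.filename] = (info.CRC, zlib.crc32(payload))
    return result
```

The first element of each pair is what a metadata report can show; the second requires reading and decompressing every byte of the entry.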
tar.gz or 7z release artifact
Unsupported format. ZIP only. A .tar.gz or .7z throws 'Not a valid ZIP archive (or unsupported format for metadata extraction).' Convert a tar.gz to ZIP first with /archive-tools/tar-gz-to-zip, or list it with /archive-tools/archive-previewer.
Entry uses Deflate64 or LZMA inside the ZIP
Reported. compressionMethod reports Deflate64, LZMA, BZIP2, etc. by name. The metadata reads fine, but downstream consumers and some sibling tools may not extract these methods; a high versionNeeded alongside the method name is your warning to test extraction before shipping.
File exceeds the tier cap
Tier limit exceeded. The client throws 'File "<name>" exceeds the <tier> tier per-job limit (<size>). Upgrade for larger files.' before processing. Free is 50 MB, Pro 500 MB, Developer 2 GB; pick a tier that holds your release artifact without splitting.
Self-extracting installer (.exe with appended ZIP)
Supported. Because the EOCD (end of central directory record) is located by scanning backward from the end of the file, an SFX installer (executable stub + ZIP) reads correctly; the stub bytes are ignored. Handy for inspecting what a self-extracting installer actually bundles.
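The backward scan can be sketched in a few lines. This is a simplified illustration, not the tool's parser: it ignores ZIP64 and assumes the whole file fits in memory. The record layout (4-byte signature PK\x05\x06, a 22-byte fixed part, and a comment of at most 65535 bytes) follows the ZIP format; the function name is hypothetical.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22      # bytes before the optional archive comment
MAX_COMMENT_SIZE = 0xFFFF


def find_eocd(data):
    """Return (offset, total_entries) of the EOCD record, or None if absent."""
    # The record can only start within the last 22 + 65535 bytes of the file.
    start = max(0, len(data) - (EOCD_FIXED_SIZE + MAX_COMMENT_SIZE))
    pos = data.rfind(EOCD_SIGNATURE, start)
    while pos != -1:
        if pos + EOCD_FIXED_SIZE <= len(data):
            fields = struct.unpack("<4sHHHHIIH", data[pos:pos + EOCD_FIXED_SIZE])
            total_entries, comment_len = fields[4], fields[7]
            # Accept the candidate only if its comment fits inside the file.
            if pos + EOCD_FIXED_SIZE + comment_len <= len(data):
                return pos, total_entries
        # Otherwise keep scanning toward the start of the search window.
        pos = data.rfind(EOCD_SIGNATURE, start, pos)
    return None
```

Because the search starts at the end, any executable stub prepended to the archive is never examined, which is why an SFX file parses the same way as a plain ZIP.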
Frequently asked questions
Does it work on JAR, AAR, and APK files?
Yes — those are ZIP containers, so the tool walks their central directories and lists every entry (manifests, classes, resources). For the signing blocks specifically, chain /archive-tools/archive-signing-info; this tool reports hasExtraField but not signature contents.
Can I use it to verify a reproducible build?
Yes. Check that every entry's lastModified equals your fixed epoch (1980-01-01 is standard) and that the per-entry crc32 values match a known-good build. Any drift means the build is not bit-reproducible. Fix timestamps with /archive-tools/timestamp-normalizer.
Is there an API I can call from CI?
No public REST API — archive tools are browser-only (apiAvailable: false). Pro+ tiers can drive them through the @jadapps/runner headless-browser path. For unattended CI, a Node ZIP library reading the central directory mirrors this tool's JSON schema and runs without a browser.
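The answer above mentions a Node library; the same idea works with any ZIP reader that exposes the central directory. Below is a minimal sketch using Python's standard zipfile module instead. The field names follow the ones used on this page, but the exact schema is an assumption: the crc32 formatting, the method-name table and the choice to render the DOS timestamp as UTC are guesses for illustration, and the function name is hypothetical.

```python
import io
import zipfile
from datetime import datetime, timezone

# Assumed display names for common method IDs; unknown IDs fall back to the number.
METHOD_NAMES = {0: "Stored", 8: "Deflate", 9: "Deflate64", 12: "BZIP2", 14: "LZMA"}


def central_directory_report(zip_bytes):
    """Build a report dict from the central directory without extracting payloads."""
    entries = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            # DOS timestamps carry no zone; treating them as UTC is an assumption.
            stamp = datetime(*info.date_time, tzinfo=timezone.utc)
            entries.append({
                "name": info.filename,
                "lastModified": stamp.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
                "crc32": f"{info.CRC:08x}",
                "compressionMethod": METHOD_NAMES.get(info.compress_type, str(info.compress_type)),
                "compressedSize": info.compress_size,
                "uncompressedSize": info.file_size,
                "hostOS": info.create_system,
                "versionNeeded": info.extract_version,
                "hasExtraField": bool(info.extra),
                "flags": {
                    "utf8": bool(info.flag_bits & 0x800),      # general purpose bit 11
                    "encrypted": bool(info.flag_bits & 0x1),   # general purpose bit 0
                },
            })
    return {"entries": entries}
```

Compare one artifact's output against the browser tool's JSON before relying on it, since the two may format values differently.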
How do I diff two builds' metadata?
Download the JSON for each, then diff <(jq -S . a.json) <(jq -S . b.json). Sorting keys with jq -S makes the diff stable. Changed method, size, timestamp or CRC fields stand out. For a content-level archive diff, use /archive-tools/archive-diff.
Can it process several artifacts at once?
No — it is single-file per run and shows no batch input. For analysis across many archives use /archive-tools/batch-compression-report; for many extractions use /archive-tools/batch-extraction-manager.
Does it modify or normalise the archive?
No — it is strictly read-only and outputs a JSON report. To normalise timestamps use /archive-tools/timestamp-normalizer; to strip a path prefix use /archive-tools/path-prefix-remover; to recompress use /archive-tools/smart-archive-compressor.
Why does my Stored entry show compressedSize == uncompressedSize?
Stored means no compression, so the two sizes are equal by definition. A Stored entry in an otherwise-Deflate archive is usually an accident (a raw file added by a build step) and worth recompressing.
What does versionNeeded tell a developer?
It is the minimum ZIP reader version: 20 = standard Deflate, 45 = ZIP64. A high value warns that downstream tools (or some sibling tools) may not extract the archive — test extraction before shipping if you see 45 or an exotic method name.
Can it tell me if filenames will break on another OS?
Partly — flags.utf8: false flags entries stored in a legacy code page, which can mojibake on a different OS. The tool decodes such names leniently (U+FFFD for bad bytes). To normalise names for cross-platform safety, use /archive-tools/filename-sanitizer.
Does it output CSV for a spreadsheet?
No — JSON only (<name>-metadata.json). For a CSV/TXT listing of names, sizes, ratios, methods, CRCs and the encryption flag, use /archive-tools/file-listing-generator, which supports csv/json/txt.
Is it safe for proprietary build artifacts?
Yes — archive tools are browser-only (browserOnly: true); the file is read in-tab via the File API and never uploaded. Nothing about your closed-source artifact leaves the machine.
How do I pin a checksum so future builds fail on drift?
After confirming the metadata, capture a whole-file SHA-256 with /archive-tools/checksum-generator and assert it in CI. Combined with timestamp normalisation, that gives you a bit-reproducible artifact whose hash a pipeline can verify on every build.
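The assertion step in CI can be as small as the sketch below, which hashes the artifact in chunks with Python's hashlib and compares it to a pinned value. The function names are hypothetical, and where the pinned hash is stored (a repo file, a CI variable) is left to your pipeline.

```python
import hashlib


def sha256_of(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large artifacts need little memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def assert_pinned(stream, expected_hex):
    """Raise ValueError when the artifact's SHA-256 differs from the pinned one."""
    actual = sha256_of(stream)
    if actual != expected_hex.strip().lower():
        raise ValueError(f"artifact hash drifted: expected {expected_hex}, got {actual}")
    return actual
```

Open the artifact with open(path, "rb") and pass the file object in; an uncaught ValueError makes the job exit non-zero, so drift fails the build.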
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.