How to use Archive Diff in developer workflows
- Step 1: Grab the baseline artifact. Download the artifact from the target branch: the last release ZIP, or the CI artifact from main. This is archive A. It must be a real ZIP, because the engine reads ZIP central directories.
- Step 2: Grab the PR's artifact. Download the artifact the PR produced (from its CI run or attached to the PR). This is archive B. No checkout or local build is required.
- Step 3: Diff at /archive-tools/archive-diff. Drop A and B and click Process. There are no options; the comparison is fixed to name + CRC32 over each ZIP's central directory.
- Step 4: Review the Changed bucket first. Changed entries (same path, different content) are the meaningful review surface. Added and Removed tell you about new and deleted files. Unchanged is just the count of files whose content did not change.
- Step 5: Make the build reproducible. If two machines produce different ZIPs, the Changed bucket names the non-deterministic files. Normalise embedded timestamps with Timestamp Normalizer and re-diff until only intended changes remain.
- Step 6: Attach the report to the PR. Download diff-A-vs-B.html and link or attach it in the review so the artifact delta is documented alongside the source diff.
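The name + CRC32 comparison from Step 3 can be approximated locally. The following is a minimal sketch in Python, not the tool's actual implementation: the names `ArchiveDiff`, `read_index`, and `diff_archives` are hypothetical, and skipping directory entries is an assumption about how the four buckets are counted.

```python
from __future__ import annotations

import zipfile
from dataclasses import dataclass, field


@dataclass
class ArchiveDiff:
    """Four buckets; Unchanged is kept as a count only (hypothetical type)."""
    added: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)
    changed: list[str] = field(default_factory=list)
    unchanged: int = 0


def read_index(archive) -> dict[str, int]:
    """Map entry name -> CRC32 as recorded in the ZIP central directory."""
    with zipfile.ZipFile(archive) as zf:
        # infolist() comes from the central directory; nothing is decompressed.
        # Assumption: directory entries are ignored.
        return {i.filename: i.CRC for i in zf.infolist() if not i.is_dir()}


def diff_archives(archive_a, archive_b) -> ArchiveDiff:
    """Classify entries by exact name, then by CRC32 for names in both."""
    a, b = read_index(archive_a), read_index(archive_b)
    common = a.keys() & b.keys()
    return ArchiveDiff(
        added=sorted(b.keys() - a.keys()),
        removed=sorted(a.keys() - b.keys()),
        changed=sorted(n for n in common if a[n] != b[n]),
        unchanged=sum(1 for n in common if a[n] == b[n]),
    )
```

`diff_archives` accepts file paths or binary file objects, because `zipfile.ZipFile` accepts both.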
Reading the diff during code review
How each bucket maps to a review concern for a build artifact.
| Bucket | Review meaning | What to do |
|---|---|---|
| Added | New files in the bundle | Confirm they're intended (no stray debug/source maps) |
| Removed | Files dropped from the bundle | Confirm nothing required was pruned |
| Changed | Same path, new content | Extract and text-diff — the core of the review |
| Unchanged | Byte-identical (same CRC32) | Ignore; reported as a count only |
Reproducible-build interpretation
Diffing the same source tree built twice. CRC32 is content-only, so timestamp differences don't register.
| Diff result | Interpretation | Fix |
|---|---|---|
| All Unchanged | Fully reproducible build | None — ship it |
| Changed: files with embedded build dates | Non-determinism from timestamps/build host | Timestamp Normalizer; strip embedded dates |
| Added/Removed differ run-to-run | Non-deterministic file inclusion (e.g. temp/cache) | Exclude transient files before zipping |
| Changed: minified bundle hash | Non-deterministic minifier ordering | Pin tool versions and sort inputs |
Cookbook
Developer-shaped comparisons: PR artifacts, plugin bundles, and reproducible-build checks.
Reviewing a PR's dist bundle without checking out
Download main's dist.zip and the PR's dist.zip, diff them, and the output delta is right there — one new chunk, one changed entry, one removed legacy file.
    Archive A: dist-main.zip
    Archive B: dist-pr-482.zip
    Diff summary
    Added: 1, Removed: 1, Changed: 1, Unchanged: 88
    + Added (1)
    + assets/chunk-9f2.js
    - Removed (1)
    - assets/chunk-old.js
    ~ Changed (1)
    ~ index.html
Catching an accidentally-shipped source map
The Added bucket immediately surfaces files that shouldn't be in a production bundle, like a stray .map, before they reach release.
    Archive A: build-prev.zip
    Archive B: build-candidate.zip
    Diff summary
    Added: 1, Removed: 0, Changed: 0, Unchanged: 140
    + Added (1)
    + assets/app.js.map ← should not ship to prod
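The same check can be scripted as a rough pre-release guard. This sketch assumes a hypothetical deny-list (`FORBIDDEN_PATTERNS`) and hypothetical helper names; it looks only at entries that are new in the candidate and matches patterns against the file name, not the full path.

```python
from __future__ import annotations

import fnmatch
import zipfile

# Hypothetical deny-list; adjust to whatever must never ship in your bundle.
FORBIDDEN_PATTERNS = ["*.map", "*.pdb", ".env"]


def entry_names(archive) -> set[str]:
    """Return the set of entry names listed in the archive."""
    with zipfile.ZipFile(archive) as zf:
        return set(zf.namelist())


def forbidden_additions(baseline, candidate, patterns=FORBIDDEN_PATTERNS) -> list[str]:
    """List entries added in `candidate` whose file name matches a pattern."""
    added = entry_names(candidate) - entry_names(baseline)
    return sorted(
        name
        for name in added
        # fnmatchcase keeps matching case-sensitive on every platform.
        if any(fnmatch.fnmatchcase(name.rsplit("/", 1)[-1], p) for p in patterns)
    )
```

An empty list means no newly added entry matched the deny-list; entries already present in the baseline are not re-checked.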
Reproducible-build check across two machines
Build the same commit on CI and on a laptop, diff the two ZIPs. CRC32 ignores timestamps, so only genuinely non-deterministic files appear in Changed.
    Archive A: build-ci-abc123.zip
    Archive B: build-laptop-abc123.zip
    Diff summary
    Added: 0, Removed: 0, Changed: 1, Unchanged: 256
    ~ Changed (1)
    ~ META-INF/BUILDINFO ← embeds build host → not reproducible
Verifying a published package matches the tagged source build
Diff the package you built from the git tag against the one downloaded from the registry to confirm the registry artifact wasn't altered.
    Archive A: pkg-from-tag.zip
    Archive B: pkg-from-registry.zip
    Diff summary
    Added: 0, Removed: 0, Changed: 0, Unchanged: 73
    (registry artifact matches the tagged build, file-for-file)
Narrowing a regression to the changed entries
A bug appeared between two builds. The Changed bucket is the precise list of output files that differ — extract just those and bisect the source that produced them.
    Archive A: good-build.zip
    Archive B: broken-build.zip
    Diff summary
    Added: 0, Removed: 0, Changed: 3, Unchanged: 411
    ~ Changed (3)
    ~ dist/router.js
    ~ dist/store.js
    ~ dist/app.js ← three files to inspect, not 411
Edge cases and what actually happens
No public API for CI
Note: Archive Diff has no programmatic endpoint today (apiAvailable is false). For automated CI checks, script fflate or unzip -Z1 + checksums; use the browser tool for the manual investigation when a check fails.
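As one possible shape for such a scripted check, here is a small gate written with Python's standard `zipfile` module rather than fflate or `unzip`. The function name `gate` and the output format are hypothetical; the idea is simply to return a non-zero status when any entry differs by name or CRC32.

```python
from __future__ import annotations

import zipfile


def crc_index(archive) -> dict[str, int]:
    """Map entry name -> CRC32 from the ZIP central directory."""
    with zipfile.ZipFile(archive) as zf:
        return {i.filename: i.CRC for i in zf.infolist() if not i.is_dir()}


def gate(archive_a, archive_b, out=None) -> int:
    """Print every differing entry and return 1 if any exist, else 0."""
    a, b = crc_index(archive_a), crc_index(archive_b)
    differing = sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
    for name in differing:
        if name not in a:
            status = "added"
        elif name not in b:
            status = "removed"
        else:
            status = "changed"
        print(f"{status}: {name}", file=out)  # out=None prints to stdout
    return 1 if differing else 0
```

In a CI step, the return value can be passed to `sys.exit`, for example `sys.exit(gate("build-ci-abc123.zip", "build-laptop-abc123.zip"))`, so that the job fails whenever the two artifacts differ.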
Artifact is a .tar.gz or .tgz
Zero entries: Many build pipelines emit TAR.GZ. The diff engine reads ZIP central directories only, so a TAR.GZ parses to zero entries. Convert with tar.gz to zip first, then diff the ZIPs.
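If you would rather convert locally, a conversion can be sketched with Python's standard `tarfile` and `zipfile` modules. This is an illustrative sketch, not the site's converter: `targz_to_zip` is a hypothetical name, only regular files are copied, and each file is read fully into memory, which may not suit very large artifacts.

```python
import tarfile
import zipfile


def targz_to_zip(src_path: str, dst_path: str) -> int:
    """Copy regular files from a .tar.gz into a new ZIP; return the entry count."""
    count = 0
    with tarfile.open(src_path, mode="r:gz") as tar, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, and special files
            data = tar.extractfile(member).read()
            # Entry names are kept as stored in the tarball, including any
            # leading "./" prefix.
            zf.writestr(member.name, data)
            count += 1
    return count
```

The ZIP entries get fresh timestamps, which does not matter for a CRC32-based comparison since CRC32 covers content only.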
Timestamp-only differences between builds
Unchanged: Because comparison is by CRC32 (content), files that differ only in their archive timestamp stay in Unchanged, which is exactly what you want for a reproducible-build signal.
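This behaviour is easy to confirm with a small experiment. The sketch below builds two in-memory ZIPs holding the same content under different entry timestamps; `build_zip` and `entry_crc` are hypothetical helper names, and the file name and dates are made up for illustration.

```python
import io
import zipfile


def build_zip(name: str, data: bytes, date_time: tuple) -> bytes:
    """Build a one-entry ZIP in memory with an explicit entry timestamp."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(zipfile.ZipInfo(name, date_time=date_time), data)
    return buf.getvalue()


def entry_crc(zip_bytes: bytes, name: str) -> int:
    """Read the CRC32 recorded for one entry."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.getinfo(name).CRC


first = build_zip("app.js", b"console.log('hi')", (2024, 1, 1, 0, 0, 0))
second = build_zip("app.js", b"console.log('hi')", (2025, 6, 30, 12, 0, 0))

# The archives differ as byte streams because the timestamps are stored in them.
print("archives byte-identical:", first == second)
# The per-entry CRC32 is computed over the content only, so it matches.
print("entry CRC32 equal:", entry_crc(first, "app.js") == entry_crc(second, "app.js"))
```

A whole-file checksum of the two ZIPs would therefore report a difference, while a per-entry CRC32 comparison would not.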
Same file moved to a new directory
Expected: A moved file shows as Removed (old path) + Added (new path). The diff matches by exact entry name and doesn't pair by content across paths, so refactors that relocate files read as deletes plus adds.
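If you want to recognise relocations yourself, one approach is to pair Removed and Added entries whose CRC32 and uncompressed size match. The sketch below is an assumption-laden heuristic, not something the tool does: `find_moves` is a hypothetical name, and a match is a hint rather than proof, since unrelated files (all empty files, for instance) can share a checksum.

```python
from __future__ import annotations

import zipfile
from collections import defaultdict


def content_index(archive) -> dict[str, tuple[int, int]]:
    """Map entry name -> (CRC32, uncompressed size) from the central directory."""
    with zipfile.ZipFile(archive) as zf:
        return {i.filename: (i.CRC, i.file_size)
                for i in zf.infolist() if not i.is_dir()}


def find_moves(archive_a, archive_b) -> list[tuple[str, str]]:
    """Pair removed and added paths that carry the same (CRC32, size)."""
    a, b = content_index(archive_a), content_index(archive_b)
    added_by_content = defaultdict(list)
    for name in sorted(b.keys() - a.keys()):
        added_by_content[b[name]].append(name)
    moves = []
    for old_name in sorted(a.keys() - b.keys()):
        candidates = added_by_content.get(a[old_name])
        if candidates:
            # Take the first unused candidate so each new path pairs only once.
            moves.append((old_name, candidates.pop(0)))
    return moves
```

Each returned pair is `(old_path, new_path)`; anything left unpaired is a genuine delete or add.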
Minified bundle changed but source didn't
Changed: Non-deterministic minifier output (chunk ordering, hashes) changes the CRC32 even when source is identical. Pin tool versions and sort inputs if you need bit-stable bundles.
Bundle exceeds your tier cap
Tier limit: Each archive is checked against your tier (Pro 500 MB / 50,000 entries; Pro+Media and Developer 2 GB / 500,000 entries). Split large monorepo artifacts with Archive Splitter or upgrade.
Path prefix differs (./dist vs dist)
Expected: Two pipelines that zip with different root prefixes produce all add+remove noise. Normalise with Path Prefix Remover so only real content changes show.
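For scripted comparisons, the same normalisation can be applied while building the name index. This is a sketch under assumptions, not how Path Prefix Remover works internally: `normalised_index` and its `strip_prefix` parameter are hypothetical, and the normalisation is limited to collapsing `./` segments and optionally dropping one leading directory.

```python
from __future__ import annotations

import posixpath
import zipfile


def normalised_index(archive, strip_prefix: str = "") -> dict[str, int]:
    """Map normalised entry name -> CRC32, dropping './' and an optional root."""
    prefix = strip_prefix.strip("/")
    index = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # "./dist/app.js" and "dist/app.js" both become "dist/app.js".
            name = posixpath.normpath(info.filename).lstrip("/")
            if prefix and name.startswith(prefix + "/"):
                name = name[len(prefix) + 1:]
            index[name] = info.CRC
    return index
```

Comparing two indexes built this way keeps root-prefix differences out of the Added and Removed buckets, so only real content changes remain.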
Both archives identical
Supported: Two byte-identical bundles report Added 0 / Removed 0 / Changed 0 and Unchanged equal to the entry count, with no lists. This is a clean 'nothing changed' result.
Frequently asked questions
Can I use this in CI?
Not via an API — Archive Diff is a browser tool with no public endpoint yet. For automated gates, script fflate (Node) or unzip -Z1 + checksums; use the browser tool when a gate fails and you need to see the exact delta.
How is this faster than rebuilding to compare?
It reads each ZIP's central directory and compares CRC32 — it never decompresses or rebuilds. A few-hundred-MB artifact diffs in under a second, versus minutes to check out and build.
Does it work for reproducible-build checks?
Yes. CRC32 reflects content, not timestamps, so identical-content files stay Unchanged. Any genuinely non-deterministic file appears in Changed — exactly the signal you want.
My artifact is a tar.gz — can I diff it?
Not directly; the engine is ZIP-only. Convert with tar.gz to zip, then diff the resulting ZIPs.
Does it show me the lines that changed inside a file?
No — it's an archive-level diff. The Changed bucket names the entries that differ; extract those and run a text diff on them. It narrows 400 files down to the 3 that matter.
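The follow-up text diff can also be scripted once you know which entry changed. A minimal sketch using Python's standard `zipfile` and `difflib` modules follows; `entry_text_diff` is a hypothetical name, and the entry is assumed to be UTF-8 text (undecodable bytes are replaced rather than raising).

```python
import difflib
import zipfile


def entry_text_diff(archive_a, archive_b, name: str) -> str:
    """Return a unified diff of one entry as stored in two ZIP archives."""
    with zipfile.ZipFile(archive_a) as za, zipfile.ZipFile(archive_b) as zb:
        old = za.read(name).decode("utf-8", errors="replace")
        new = zb.read(name).decode("utf-8", errors="replace")
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"A/{name}",
        tofile=f"B/{name}",
    )
    return "".join(lines)
```

For minified bundles a line-based diff is rarely readable, so it may help to run the extracted files through a formatter first.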
Will a moved file show as a rename?
No. A move shows as Removed (old path) + Added (new path) because matching is by exact name. Compare CRC32 across the two names to confirm it's the same content relocated.
Is it safe for closed-source artifacts?
Yes. The diff runs in your browser tab and uploads nothing. Pre-release and proprietary bundles never leave your machine.
How do I make a build bit-identical across machines?
Diff the two ZIPs to find the non-deterministic files, normalise embedded dates with Timestamp Normalizer, pin tool versions, sort inputs, and re-diff until only intended changes remain.
What output do I attach to the PR?
Download diff-A-vs-B.html — a self-contained report with the four bucket counts and colour-coded lists. Link or attach it in the review next to the source diff.
What's the size limit for artifacts?
Per archive: Pro 500 MB / 50,000 entries, Pro+Media and Developer 2 GB / 500,000 entries. Split larger artifacts with Archive Splitter or upgrade.
Can it batch-diff many artifact pairs?
No — it takes exactly two archives per run (it's a dual-file tool). For many archives, script the comparison in CI; the browser tool is for the focused, manual pair.
Does it verify a registry package matches my source build?
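Yes, with one caveat. Diff the ZIP you built from the tag against the one pulled from the registry. All-Unchanged shows the published artifact matches your build file-for-file by name and CRC32. CRC32 is a 32-bit checksum, not a cryptographic hash, so it reliably catches accidental differences but is not proof against deliberate tampering.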
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.