How to run an archive diff for security & compliance
- Step 1: Establish the baseline. Obtain your known-good archive: the signed release, the clean image, or the vendor's original deliverable. This becomes archive A. Keep its provenance documented.
- Step 2: Obtain the copy to verify. Pull the deployed, restored, or received archive to local disk from the server, backup vault, or partner channel. This becomes archive B. Both stay on disk; nothing is uploaded.
- Step 3: Run the diff. At /archive-tools/archive-diff, drop in A and B and click Process. Both must be real ZIPs, because the engine reads ZIP central directories. There are no options to configure.
- Step 4: Triage the four buckets. Added = files present only in the copy (possible injected content). Removed = files missing from the copy. Changed = same path, different CRC32 (possible tampering). Unchanged = same path, same CRC32 (content presumed identical).
- Step 5: Strengthen evidence with checksums. For chain-of-custody, generate SHA-256 manifests of both archives with Per-File Checksum Generator. CRC32 finds the suspects; SHA-256 makes the finding defensible against an adversary.
- Step 6: Record the finding. Download the HTML report and attach it to your ticket or evidence record. Note the tooling, tier, date, and SHA-256 of both archives for a complete trail.
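The bucket logic in Steps 3 and 4 can be approximated offline with Python's standard `zipfile` module. This is a minimal sketch under stated assumptions, not the tool's own implementation: `crc_index`, `diff_archives`, and `make_zip` are hypothetical helper names, and the two sample archives are built in memory purely for illustration.

```python
import io
import zipfile


def crc_index(zip_source):
    """Map each file path to the CRC32 recorded in the ZIP central directory."""
    with zipfile.ZipFile(zip_source) as zf:
        # infolist() is built from the central directory; nothing is decompressed.
        # Folder entries are skipped so only files drive the result.
        return {i.filename: i.CRC for i in zf.infolist() if not i.is_dir()}


def diff_archives(baseline, copy):
    """Sort paths into the four buckets: added, removed, changed, unchanged."""
    a, b = crc_index(baseline), crc_index(copy)
    shared = a.keys() & b.keys()
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(p for p in shared if a[p] != b[p]),
        "unchanged": sorted(p for p in shared if a[p] == b[p]),
    }


def make_zip(files):
    """Build an in-memory ZIP from a {path: bytes} dict (demo helper only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, data in files.items():
            zf.writestr(path, data)
    return buf


# Illustrative archives: one changed file, one injected file.
baseline = make_zip({"config/app.php": b"debug=0", "index.php": b"<?php"})
copy = make_zip({
    "config/app.php": b"debug=1",
    "index.php": b"<?php",
    "public/uploads/.x.php": b"payload",
})

result = diff_archives(baseline, copy)
for bucket, paths in result.items():
    print(bucket, paths)
```

Keying the comparison on the full entry path mirrors the "same path, different CRC32" definition of Changed; any path normalisation would have to happen before the two indexes are compared.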
Mapping diff buckets to security findings
What each bucket typically indicates during a tamper or integrity review. Always extract Changed/Added entries to confirm.
| Bucket | Security interpretation | Typical next action |
|---|---|---|
| Added | Files in the copy that aren't in the baseline — possible injected payload or webshell | Extract and inspect each; check against an allowlist |
| Removed | Baseline files missing from the copy — possible deletion or incomplete restore | Confirm whether removal was authorised |
| Changed | Same path, different content — the highest-priority tamper signal | Extract both versions; diff contents; verify with SHA-256 |
| Unchanged | Same path and same CRC32; content presumed identical, subject to the CRC32 collision caveat | Note as verified at triage level; no extraction required unless an adversary is in scope |
CRC32 vs SHA-256 for this use case
The diff uses CRC32 from the central directory. For adversarial tamper-evidence, layer SHA-256 on top.
| Property | CRC32 (Archive Diff) | SHA-256 (Checksum Generator) |
|---|---|---|
| Detects accidental change | Yes, reliably | Yes |
| Collision-resistant vs an attacker | No | Yes |
| Cost to compute | Free — already stored in the ZIP | Reads/decompresses each entry |
| Best role | Fast triage of which files differ | Defensible content proof of those files |
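The cost difference in the table can be seen in a short Python sketch using the standard `zipfile` and `zlib` modules. The stored CRC32 is read straight from the entry's directory record, while recomputing any checksum means reading and decompressing the entry. The sample archive and its content are illustrative assumptions.

```python
import io
import zipfile
import zlib

# Build a one-entry ZIP in memory (illustrative content).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("config/app.php", b"debug=0\n" * 100)

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("config/app.php")
    stored_crc = info.CRC                    # already recorded; no decompression
    recomputed = zlib.crc32(zf.read(info))   # requires decompressing the entry

print(f"stored={stored_crc:08x} recomputed={recomputed:08x}")
```

The two values agree for an intact entry, which is why a directory-only comparison is enough for fast triage of accidental change.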
Cookbook
Verification scenarios from real security and compliance work. Paths anonymised.
Verifying a deployed release against the signed artifact
Diff the signed release ZIP (A) against the ZIP pulled back off the production host (B). A clean deploy shows zero Added/Removed/Changed. Anything in those buckets is a deviation to explain.
    Archive A: release-3.4.0-signed.zip
    Archive B: prod-host-snapshot.zip

    Diff summary
      Added: 1, Removed: 0, Changed: 1, Unchanged: 980

    + Added (1)
      + public/uploads/.x.php   ← not in signed release
    ~ Changed (1)
      ~ config/app.php          ← extract & inspect
Confirming a restored backup matches the original
After a restore, diff the original snapshot (A) against the restored one (B). An all-Unchanged result indicates a faithful restore at the CRC32 level; any Removed entry flags silent data loss.
    Archive A: snapshot-original.zip
    Archive B: snapshot-restored.zip

    Diff summary
      Added: 0, Removed: 3, Changed: 0, Unchanged: 1521

    - Removed (3)
      - data/2026/q1/ledger-03.csv   ← restore dropped 3 files
Checking a vendor re-delivery against the original deliverable
When a vendor resends a build, diff the original against the new one to confirm only the agreed files changed and nothing else was touched.
    Archive A: vendor-build-v1.zip
    Archive B: vendor-build-v1-resend.zip

    Diff summary
      Added: 0, Removed: 0, Changed: 1, Unchanged: 644

    ~ Changed (1)
      ~ bin/patched-module.dll   ← matches the agreed change ✓
Layering SHA-256 over a Changed finding
CRC32 told you which files changed. For the report, capture SHA-256 of each version so the finding stands up against a claim of accidental collision.
    Step 1: Archive Diff
      ~ Changed (1)  config/app.php

    Step 2: Per-File Checksum Generator (both archives)
      A  config/app.php  sha256: 9f1c…a2
      B  config/app.php  sha256: 4e88…d0   ← differs: content change confirmed
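For readers who want to reproduce the SHA-256 step outside the browser, the following Python sketch hashes one entry from each archive with the standard `hashlib` and `zipfile` modules. It is an illustration only: `entry_sha256` and `make_zip` are hypothetical names, the archives are built in memory, and encrypted entries would need a password that this sketch does not handle.

```python
import hashlib
import io
import zipfile


def entry_sha256(zip_source, path, chunk_size=65536):
    """Return the SHA-256 hex digest of one entry's decompressed bytes."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(zip_source) as zf, zf.open(path) as entry:
        # Stream in chunks so large entries are not held in memory at once.
        for chunk in iter(lambda: entry.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_zip(files):
    """Build an in-memory ZIP from a {path: bytes} dict (demo helper only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, data in files.items():
            zf.writestr(path, data)
    return buf


archive_a = make_zip({"config/app.php": b"debug=0"})
archive_b = make_zip({"config/app.php": b"debug=1"})

sha_a = entry_sha256(archive_a, "config/app.php")
sha_b = entry_sha256(archive_b, "config/app.php")
print("A", sha_a)
print("B", sha_b)
print("differs" if sha_a != sha_b else "identical")
```

Hashing the decompressed bytes, rather than the compressed stream, keeps the digest independent of compression level or packaging tool, so the same file re-zipped elsewhere still produces the same value.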
Clean result on a hardened image
Diffing a freshly built golden image against a deployed instance should be all-Unchanged. A non-zero result is your drift signal for the configuration-management review.
    Archive A: golden-image.zip
    Archive B: instance-0427.zip

    Diff summary
      Added: 0, Removed: 0, Changed: 0, Unchanged: 2310

    (no deviations; instance matches the golden image)
Edge cases and what actually happens
CRC32 is not collision-resistant
Note: A motivated attacker can craft two files with the same CRC32, so a CRC32 match alone isn't proof of identical content against an adversary. For tamper-evidence, confirm matches with SHA-256 via Per-File Checksum Generator.
Evidence archive is a 7z or TAR.GZ
Zero entries: The diff engine reads ZIP central directories only. A non-ZIP evidence file parses to zero entries and skews the result. Re-package as a ZIP, or list contents with Multi-Format Extractor for non-ZIP formats first.
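When scripting a pre-check around this edge case, one option is to reject non-ZIP inputs up front instead of letting them read as empty. The Python sketch below uses the standard `zipfile.is_zipfile` call; `require_zip` is a hypothetical helper, and the fake 7z input is just the 7z file signature followed by padding.

```python
import io
import zipfile

SEVEN_ZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"  # leading signature of a 7z file


def require_zip(source, label):
    """Raise instead of letting a non-ZIP input parse to an empty listing."""
    if not zipfile.is_zipfile(source):
        raise ValueError(f"{label} is not a ZIP archive; re-package it before diffing")


# A real ZIP passes the check silently.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("evidence.txt", b"sample")
require_zip(zip_buf, "archive A")

# A 7z-style input is rejected before any diff is attempted.
fake_7z = io.BytesIO(SEVEN_ZIP_MAGIC + b"\x00" * 64)
try:
    require_zip(fake_7z, "archive B")
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)
```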
Encrypted ZIP
Metadata-only: The central directory (names and CRC32) is readable even when entries are encrypted, so the diff still works at the entry level, but it can't see inside encrypted contents. Record the encrypted status with Encrypted Archive Detector.
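As a rough offline counterpart, the per-entry encryption marker can be read from directory metadata in Python. In the ZIP format, bit 0 of an entry's general-purpose flag marks it as encrypted, and `zipfile` exposes that field as `ZipInfo.flag_bits`. The helper name `encrypted_entries` is hypothetical, and because the standard library cannot write encrypted ZIPs, the demo only exercises a plain archive.

```python
import io
import zipfile

ENCRYPTED_FLAG = 0x1  # bit 0 of the ZIP general-purpose bit flag


def encrypted_entries(zip_source):
    """List entries whose directory record has the encryption bit set."""
    with zipfile.ZipFile(zip_source) as zf:
        return [i.filename for i in zf.infolist() if i.flag_bits & ENCRYPTED_FLAG]


# Plain archive: no entry should be flagged.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("notes.txt", b"plain text")

flagged = encrypted_entries(buf)
print(flagged)
```

Only metadata is inspected here, so no password is required to produce the list.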
Directories differ but files are identical
By design: Folder entries are skipped, so an empty-directory difference never appears as a finding. Only file content drives the result, which keeps audit output focused on substantive change.
Renamed file looks like add + remove
Expected: A legitimately renamed file shows as one Removed and one Added. Confirm by comparing CRC32 values across the two names, or by SHA-256, before recording it as a deletion or injection.
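The cross-name comparison can be sketched in Python as a pairing pass over the Removed and Added buckets. This is an illustrative approach, not a feature of the tool: `pair_renames` is a hypothetical helper, the CRC32 and size values are made up, and matching on size as well as CRC32 is an added assumption to reduce accidental pairings. A pair is only a rename candidate; SHA-256 should still confirm it.

```python
def pair_renames(removed, added):
    """Pair Removed and Added paths that share the same (CRC32, size) signature.

    Both arguments map path -> (crc32, uncompressed_size).
    Returns (pairs, unmatched_removed, unmatched_added).
    """
    by_signature = {}
    for path in sorted(added):
        by_signature.setdefault(added[path], []).append(path)

    pairs, unmatched_removed = [], []
    for old_path in sorted(removed):
        candidates = by_signature.get(removed[old_path])
        if candidates:
            pairs.append((old_path, candidates.pop(0)))
        else:
            unmatched_removed.append(old_path)

    unmatched_added = sorted(p for paths in by_signature.values() for p in paths)
    return pairs, unmatched_removed, unmatched_added


# Illustrative bucket contents (values are invented for the example).
removed = {"docs/readme.txt": (0x1C291CA3, 120)}
added = {
    "docs/README.txt": (0x1C291CA3, 120),
    "public/uploads/.x.php": (0x5A0C9D1F, 64),
}

pairs, still_removed, still_added = pair_renames(removed, added)
print("rename candidates:", pairs)
print("still added:", still_added)
```

Anything left in `still_added` after pairing has no matching baseline content and deserves the injected-content treatment.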
Corrupt central directory on the suspect copy
Zero entries: An unreadable ZIP returns zero entries, so every baseline file shows as Removed, which is a false alarm. Validate the suspect archive with Archive Integrity Tester before drawing conclusions.
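A scripted equivalent of that validation step might look like the Python sketch below, which separates "the archive cannot be opened" from "an entry fails its CRC check" using the standard `zipfile` module. `check_archive` is a hypothetical helper, and the truncated sample simulates a damaged directory by dropping the tail of a valid ZIP.

```python
import io
import zipfile


def check_archive(source):
    """Return 'ok', 'unreadable', or 'bad entry: <name>' for a suspect ZIP."""
    try:
        with zipfile.ZipFile(source) as zf:
            # testzip() reads every entry and verifies its CRC; None means all passed.
            bad = zf.testzip()
    except zipfile.BadZipFile:
        return "unreadable"
    return "ok" if bad is None else f"bad entry: {bad}"


# Build a valid archive, then simulate damage by truncating it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data/ledger.csv", b"id,amount\n1,100\n")
good = buf.getvalue()
truncated = good[:-30]  # drops the final bytes, including the end-of-central-directory record

status_good = check_archive(io.BytesIO(good))
status_truncated = check_archive(io.BytesIO(truncated))
print(status_good, status_truncated)
```

Treating "unreadable" as its own outcome keeps a damaged suspect copy from being reported as a mass deletion.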
Archive over the tier cap
Tier limit: Each archive is checked against your tier (Pro: 500 MB / 50,000 entries; Pro+Media and Developer: 2 GB / 500,000 entries). Oversized evidence is rejected; split it with Archive Splitter and diff the parts, or upgrade.
Same content, different path prefix
Expected: If the baseline was zipped with a ./ prefix and the copy without it, identical files read as add+remove. Normalise with Path Prefix Remover before the verification diff.
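The effect of the prefix mismatch, and of normalising it away, can be shown with a small Python sketch. `strip_dot_slash` and `normalise_index` are hypothetical helpers operating on `{path: crc32}` mappings, and the CRC32 values are invented for the example.

```python
def strip_dot_slash(path):
    """Remove any leading './' segments so './a/b' and 'a/b' compare equal."""
    while path.startswith("./"):
        path = path[2:]
    return path


def normalise_index(index):
    """Re-key a {path: crc32} mapping on normalised paths."""
    return {strip_dot_slash(path): crc for path, crc in index.items()}


# Same content, but the baseline was zipped with a './' prefix.
baseline = {"./app/main.py": 0x0F4A2B11, "./app/util.py": 0x77C0DE01}
copy = {"app/main.py": 0x0F4A2B11, "app/util.py": 0x77C0DE01}

raw_common = baseline.keys() & copy.keys()
norm_common = normalise_index(baseline).keys() & normalise_index(copy).keys()
print(len(raw_common), len(norm_common))
```

Without normalisation no path is shared, so every file would land in Added or Removed; after normalisation both files line up and can be compared by CRC32.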
Frequently asked questions
Is this defensible for tamper detection?
For triage, yes — it surfaces exactly which entries differ by content checksum. For a defensible finding against an adversary, confirm with SHA-256 (CRC32 isn't collision-resistant). The diff finds the suspects; the checksum proves them.
Does any evidence get uploaded?
No. The diff runs entirely in your browser tab. Both archives stay on the machine, so the regulated data boundary doesn't move — it's equivalent to running a local command-line tool.
Why CRC32 instead of a hash?
CRC32 is already stored per entry in the ZIP central directory, so the diff is instant and never decompresses anything. It reliably catches accidental change; for adversarial integrity, layer SHA-256 with Per-File Checksum Generator.
Will it diff encrypted archives?
At the entry level, yes — names and CRC32 live in the central directory even for encrypted entries. It can't compare the decrypted contents without the password, but the added/removed/changed structure is still visible.
Can it handle regulated data (HIPAA, PCI, FedRAMP)?
Because nothing is uploaded, using the tool doesn't expand your data boundary — it's local processing in the browser. Confirm with your compliance team that local browser processing meets your policy; most treat it like a local CLI.
What does a 'Changed' entry mean for an audit?
Same file path, different content checksum — the highest-priority tamper signal. Extract both versions, diff the contents, and record SHA-256 of each for the finding.
How do I prove a backup restored faithfully?
Diff the original snapshot against the restored one. An all-Unchanged result indicates a faithful restore at the CRC32 level; any Removed or Changed entry points to silent data loss or corruption to investigate.
Can multiple analysts use it at once?
Yes — each browser tab is an independent instance with no shared state. Tier limits apply per session. It's a Pro-tier tool, so each analyst needs a Pro or higher plan.
What format does the output take for evidence?
A self-contained HTML report (diff-A-vs-B.html) with the four bucket counts and colour-coded lists. Attach it to a ticket or chain-of-custody record alongside the SHA-256 of both archives.
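The whole-archive SHA-256 values for that record can be produced with a few lines of Python. The sketch below is one possible shape for a companion evidence record, not a format the tool defines: `file_sha256`, `evidence_record`, and the field names are hypothetical, and the demo hashes placeholder files written to a temporary directory.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def file_sha256(path, chunk_size=1 << 20):
    """Stream a file from disk through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def evidence_record(path_a, path_b, tier):
    """Assemble tooling, tier, date, and both archive hashes in one dict."""
    return {
        "tool": "Archive Diff",
        "tier": tier,
        "date_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "archive_a": {"name": os.path.basename(path_a), "sha256": file_sha256(path_a)},
        "archive_b": {"name": os.path.basename(path_b), "sha256": file_sha256(path_b)},
    }


# Placeholder files stand in for the two archives.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "baseline.zip")
    path_b = os.path.join(tmp, "copy.zip")
    for path, payload in ((path_a, b"placeholder A"), (path_b, b"placeholder B")):
        with open(path, "wb") as fh:
            fh.write(payload)
    record = evidence_record(path_a, path_b, tier="Pro")

print(json.dumps(record, indent=2))
```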
Does it detect a renamed malicious file?
It shows a rename as add + remove. To tell a rename from an injection, compare CRC32 (or SHA-256) across the two names — matching content across a delete/add pair indicates a move, not new content.
What if the suspect ZIP is corrupt?
A corrupt central directory parses to zero entries, making every baseline file look Removed. Validate with Archive Integrity Tester before treating the result as a finding.
Is it suitable for non-ZIP evidence?
Not directly — the diff is ZIP-only. For 7z/RAR/TAR.GZ evidence, list contents with Multi-Format Extractor and compare listings, or re-package as ZIP for the diff.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.