How to use the archive previewer for DevOps & incident response
- Step 1: Receive the bundle on the responder's machine. Keep the archive local. Because the previewer is browser-only, there's no upload step and no copy leaves the workstation.
- Step 2: Drop it into the previewer. Open archive-previewer and drop the .tar.gz, .zip, or .7z. One archive at a time; no options to configure.
- Step 3: Read the tree for shape. Confirm the bundle has the directories you expect (var/log/, journal/, etc/, pod logs) and isn't a surprise nested archive or a single giant blob.
- Step 4: Sort by size to find the loud logs. The top-200 table is largest-first. The 2 GB messages or the runaway app.log jumps to the top; that's where your disk and your grep time will go.
- Step 5: Decide what to extract. Now pull only what matters with selective-extractor (glob, e.g. var/log/**/*.log) instead of unpacking the whole bundle.
- Step 6: Attach the report to the ticket. Download the HTML preview as a record of the bundle's contents at receipt time, then proceed to deeper analysis.
Common incident bundles and how the previewer handles them
What responders typically receive and what the previewer surfaces. ZIP gets full central-directory detail; the rest get tree + uncompressed sizes.
| Bundle type | Usual format | What you learn from the preview | Detail level |
|---|---|---|---|
| Linux sosreport | .tar.xz | Tree of sos_commands/, var/log/; biggest logs by size | Tree + uncompressed size (XZ via libarchive) |
| Kubernetes must-gather | .tar.gz | Per-namespace pod log layout, oversized container logs | Tree + uncompressed size (fflate) |
| Vendor support bundle | .zip or .7z | Config + logs split; ZIP shows method/CRC, 7Z tree only | Full for ZIP; tree for 7Z |
| /var/log capture | .tar.gz | Which log files exist and their relative sizes | Tree + uncompressed size |
| Memory/disk image carve | .zip | Carved-file inventory with real compressed sizes and CRC | Full ZIP central-directory detail |
| Single rotated log | .gz | Confirms one inner file and its expanded size | Inner name + uncompressed size |
Tier limits for the Archive Previewer
Per-job limits enforced before parsing. The previewer reads one archive at a time, so the batch column matters less here than the per-file size and the per-archive entry cap. Numbers are the live values from the archive tier-limit table.
| Tier | Max file size | Max entries / archive | Files per job | Notes |
|---|---|---|---|---|
| Free | 50 MB | 500 entries | 1 | Enough for a release ZIP or a Finder-zipped folder. Larger archives prompt an upgrade before any bytes are read. |
| Pro | 500 MB | 50,000 entries | 20 | Covers most CI artifacts and node_modules snapshots. Entry cap, not size, is usually the first wall on monorepo ZIPs. |
| Pro + Media | 2 GB | 500,000 entries | 100 | For dataset bundles and disk-image-sized archives. |
| Developer | 2 GB | 500,000 entries | unlimited | Same ceilings as Pro + Media with unlimited files per job for scripted runs via the runner. |
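To make the table concrete, here is a minimal sketch that picks the lowest tier able to accept a given archive. The limits are copied from the table above; the function name is hypothetical, and decimal megabytes are an assumption because the table does not say whether MB means 10^6 or 2^20 bytes.

```python
from typing import Optional

MB = 1_000_000   # assumption: decimal units; the table does not specify MB vs MiB
GB = 1_000 * MB

# (tier, max file size in bytes, max entries per archive), copied from the table.
TIERS = [
    ("Free", 50 * MB, 500),
    ("Pro", 500 * MB, 50_000),
    ("Pro + Media", 2 * GB, 500_000),
    ("Developer", 2 * GB, 500_000),
]

def smallest_tier_for(size_bytes: int, entries: int) -> Optional[str]:
    """Return the first tier whose size and entry caps both fit, or None if none does."""
    for name, max_size, max_entries in TIERS:
        if size_bytes <= max_size and entries <= max_entries:
            return name
    return None

if __name__ == "__main__":
    print(smallest_tier_for(1_400 * MB, 38_902))  # a 1.4 GB must-gather
    print(smallest_tier_for(44 * MB, 612))        # small ZIP, but over the Free entry cap
    print(smallest_tier_for(6 * GB, 10))          # over every ceiling
```

Note that a small archive can still need a higher tier: a 44 MB ZIP with 612 entries fits the Free size limit but not the Free entry cap.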
Forensic properties of the previewer
What a responder can rely on, and what they cannot, when using the previewer as a triage step.
| Property | Behaviour | Implication for IR |
|---|---|---|
| Upload | None — browser-only | Evidence stays on the workstation; safe for residency/CoC constraints |
| ZIP decompression | Never (central-dir parse only) | No native unarchiver runs against hostile ZIP bytes |
| Non-ZIP decompression | In-memory to enumerate entries | 7Z/TAR/XZ are decoded locally; nothing written to disk |
| ZIP timestamps | Real per-entry DOS date/time | Usable as a first-pass timeline hint for ZIP entries |
| Non-ZIP timestamps | Archive file date (fallback) | Not reliable for per-entry timeline — confirm after extraction |
| CRC-32 (ZIP) | Parsed from central directory | Quick integrity / dedup signal; not a cryptographic hash |
Cookbook
Triage-first recipes for responders — get the shape and the heavy hitters before you spend disk and time extracting.
Spot the runaway log in a must-gather
A 1.4 GB must-gather is mostly one chatty pod. The top-200 table makes it obvious before extraction.
Drop: must-gather.tar.gz (1.4 GB) on Pro + Media
must-gather — 38,902 entries (TAR)
Top files by size:
namespaces/payments/pods/api-7f/api.log 980 MB
namespaces/payments/pods/api-7f/prev.log 210 MB
...rest < 5 MB each
→ One pod is 85% of the bundle. Extract just that path.
Confirm a vendor ZIP has the config you need
Before escalating, verify the support ZIP actually contains the config tree, with real compressed sizes for the heavy files.
Drop: support-bundle.zip (44 MB)
support-bundle.zip — 612 entries (ZIP)
/bundle/
├ config/ (eng.yaml, routes.json …)
├ logs/ (Method: Deflate, CRC parsed)
└ metrics/
→ config/ present; safe to attach finding to the ticket.
Pull only the log subtree after triage
The previewer told you where the logs are; now extract just that glob instead of the whole bundle.
Previewer tree → logs live under var/log/
Next step (selective-extractor):
glob: var/log/**/*.log
→ extracts only the .log files, leaving 12 GB
of binaries and cores still packed.
Keep a chain-of-custody record
Download the preview as proof of what the bundle contained when received — no need to ship the bundle to anyone.
Drop: evidence-2026-06-10.zip
Process → Download → evidence-2026-06-10-preview.html
HTML records:
- entry count, full tree, sizes, CRCs
- timestamp of inspection (your note)
Attach to the IR ticket; the archive itself stays sealed.
Catch a nested archive bomb early
If the bundle is actually an archive-of-archives, the tree shows inner .zip/.gz entries — handle recursion deliberately, not by accident.
Drop: capture.zip
capture.zip — 6 entries (ZIP)
/
├ logs-a.tar.gz (1.2 GB)
├ logs-b.tar.gz (1.1 GB)
└ ...
→ Nested archives. Use nested-archive-extractor with a
bounded maxDepth rather than blindly recursing.
Edge cases and what actually happens
Bundle exceeds the tier size cap
Blocked: A 6 GB capture is over even the 2 GB Pro + Media / Developer ceiling and won't preview. Split or carve it first, or list it with a native CLI on the secured workstation. Free tops out at 50 MB, Pro at 500 MB.
Per-entry timestamps on a .tar.gz
By design: Non-ZIP previews use the archive file's date for every entry, so the tree's times aren't a per-file timeline for TAR/7Z. Treat ZIP timestamps as a hint and verify all per-entry times after extraction with your forensic tooling.
Encrypted vendor 7Z
May fail: If the 7Z encrypts its header, libarchive may not enumerate it without the password and the preview can fail. ZIP central directories stay readable even when entry data is encrypted, so ZIP bundles always list.
Nested archive inside the bundle
Expected: The previewer lists inner .tar.gz / .zip as ordinary entries; it does not auto-recurse. To go deeper, hand the inner archive to nested-archive-extractor with a bounded depth so you don't trip a zip bomb.
Only the 200 largest files are tabulated
Expected: The table caps at 200; the tree shows everything and the header reports the true count. For a complete evidence inventory, generate a CSV/JSON manifest with file-listing-generator.
Compressed size needed for a 7Z
By design: The previewer reports uncompressed size for non-ZIP entries. If a finding hinges on compression behaviour, compute it with compression-ratio-calculator after triage.
Corrupt or truncated capture
Invalid format: A partially transferred bundle whose magic bytes don't match a known format errors with "Could not detect or extract archive format." Re-transfer with a checksum, then re-run; consider archive-integrity-tester to confirm soundness.
Air-gapped / no network at all
Supported: After the page and WASM bundle have loaded, parsing is fully local. On a network-isolated forensic host, load the tool once, then it processes archives without any further network calls.
Frequently asked questions
Is it safe to triage potential evidence in a browser tool?
Yes for listing — the previewer is browser-only with no upload, so the bundle never leaves the workstation, and for ZIP it never decompresses at all. That keeps chain-of-custody and data-residency intact and avoids running a native unarchiver against possibly-hostile bytes. For the formal extraction step, follow your normal forensic procedure on a controlled host.
Can responders rely on the timestamps shown?
Only for ZIP, where each entry's DOS date/time is read from the central directory. For TAR/7Z/GZIP the previewer falls back to the archive file's own date for every entry, so those aren't a per-file timeline. Use the preview times as a first-pass hint and confirm with forensic tooling after extraction.
How do I avoid extracting a 10 GB bundle just to see the logs?
Preview first to find where the logs live and which are biggest, then extract only that subtree with selective-extractor using a glob like var/log/**/*.log. You skip unpacking gigabytes of binaries, cores, and metrics you don't need.
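To show how a glob such as var/log/**/*.log narrows a listing, here is a minimal matcher sketch. The semantics are an assumption (`**/` spans zero or more directories, while `*` and `?` stay within one path segment); selective-extractor's actual matching rules are not documented here, and both function names are hypothetical.

```python
import re

def glob_to_regex(pattern):
    """Translate a path glob into a compiled regex.

    Assumed semantics: '**/' matches zero or more directories,
    '*' and '?' never cross a '/' boundary.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")

def select(names, pattern):
    """Keep only the entry names that match the glob."""
    rx = glob_to_regex(pattern)
    return [n for n in names if rx.match(n)]

if __name__ == "__main__":
    entries = ["var/log/app.log", "var/log/nginx/error.log",
               "var/log/messages", "usr/bin/tool", "core.1234"]
    print(select(entries, "var/log/**/*.log"))
```

Under these assumed rules the pattern keeps .log files at any depth below var/log/ and skips everything else, including extensionless logs such as var/log/messages.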
What about a bundle that's an archive of archives?
The tree lists inner .tar.gz / .zip entries as files — the previewer does not recurse automatically, which is a feature for IR because it won't blindly inflate a nested zip bomb. Hand inner archives to nested-archive-extractor with a bounded maxDepth.
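What a bounded maxDepth means in practice can be sketched with Python's standard zipfile module. This is an illustration of depth-limited recursion only, not nested-archive-extractor's API: the function names, the `!/` path separator, and the `max_inner_bytes` guard are all hypothetical, and it handles inner .zip files only.

```python
import io
import zipfile

def walk_nested(data, max_depth, max_inner_bytes=64 * 1024 * 1024, prefix="", depth=0):
    """Yield entry paths of a ZIP held in bytes, descending into inner .zip
    files at most max_depth levels and skipping inner archives whose declared
    uncompressed size exceeds max_inner_bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            path = prefix + info.filename
            yield path
            is_zip = info.filename.lower().endswith(".zip")
            if is_zip and depth < max_depth and info.file_size <= max_inner_bytes:
                inner = zf.read(info)  # inflates only this one inner archive
                yield from walk_nested(inner, max_depth, max_inner_bytes,
                                       path + "!/", depth + 1)

def make_zip(members):
    """Build a ZIP in memory from {name: bytes}; used only for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
    return buf.getvalue()

if __name__ == "__main__":
    level2 = make_zip({"deep.txt": b"z"})
    level1 = make_zip({"inner2.zip": level2, "b.txt": b"y"})
    outer = make_zip({"a.txt": b"x", "inner1.zip": level1})
    for path in walk_nested(outer, max_depth=1):
        print(path)
```

With max_depth=0 the walk behaves like the previewer: inner archives are listed as ordinary files and never opened.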
Can I produce an inventory manifest for the case file?
Download the HTML preview as a snapshot, or for a structured manifest run file-listing-generator to emit CSV or JSON of every entry. Pair it with checksum-generator if you need per-file or whole-archive hashes for integrity.
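For the whole-archive hash mentioned above, a generic approach is to stream the file through SHA-256 in chunks so a multi-gigabyte bundle never has to fit in memory. The sketch below uses Python's standard hashlib; it is not checksum-generator's implementation, and the function name is hypothetical.

```python
import hashlib
import io

def sha256_stream(fileobj, chunk_size=1 << 20):
    """Hash a binary file-like object in fixed-size chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    # In real use: open("evidence-2026-06-10.zip", "rb") instead of BytesIO.
    print(sha256_stream(io.BytesIO(b"example archive bytes")))
```

Recording this digest alongside the HTML preview ties the inventory to one exact set of bytes, which the CRC-32 values in a ZIP cannot do on their own.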
Does the previewer extract or decompress my archive?
For ZIP, no. It reads only the central directory — the index ZIP keeps at the end of the file — so a 1 GB ZIP with 50,000 entries lists in well under a second because no compressed bytes are touched. For every other format (7Z, RAR, TAR, GZIP, BZIP2, XZ, ISO) it must decode the whole archive in memory to enumerate entries, because those containers have no detached index a browser can parse. Nothing is written to disk and nothing is uploaded either way.
Why is the Compressed column equal to the Size column for my 7Z or TAR?
Because the previewer only has the real per-entry compressed size for ZIP, which it reads straight from the central directory. For 7Z, RAR, TAR, GZIP and the libarchive-handled formats it decodes the entries and reports the uncompressed byte length in both columns. So treat the Compressed column as meaningful for ZIP only. If you need true compressed-vs-original numbers for a non-ZIP archive, use compression-ratio-calculator.
Are timestamps in the tree accurate?
For ZIP, yes — each entry's DOS date/time is decoded from the central directory. For non-ZIP formats the previewer does not surface per-entry timestamps; it falls back to the archive file's own last-modified date for every entry. To normalise or inspect ZIP timestamps specifically, see timestamp-normalizer.
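The DOS date/time fields in a ZIP central directory are two 16-bit values packed as bit fields. The sketch below decodes them following the standard ZIP layout; the function names are hypothetical. DOS timestamps carry no timezone and only two-second resolution, which is one more reason to treat them as a hint rather than a forensic timeline.

```python
from datetime import datetime

def decode_dos_datetime(dos_date, dos_time):
    """Decode the 16-bit DOS date and time fields stored per ZIP entry.

    date: bits 15-9 year since 1980, bits 8-5 month, bits 4-0 day
    time: bits 15-11 hour, bits 10-5 minute, bits 4-0 seconds / 2
    Raises ValueError if the fields hold an impossible date (e.g. month 0).
    """
    year = ((dos_date >> 9) & 0x7F) + 1980
    month = (dos_date >> 5) & 0x0F
    day = dos_date & 0x1F
    hour = (dos_time >> 11) & 0x1F
    minute = (dos_time >> 5) & 0x3F
    second = (dos_time & 0x1F) * 2
    return datetime(year, month, day, hour, minute, second)

def encode_dos_datetime(dt):
    """Inverse of decode_dos_datetime; odd seconds are rounded down."""
    dos_date = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    dos_time = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return dos_date, dos_time

if __name__ == "__main__":
    d, t = encode_dos_datetime(datetime(2026, 6, 10, 14, 30, 13))
    print(decode_dos_datetime(d, t))  # seconds come back even
```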
Can it preview password-protected archives?
For ZIP, yes — filenames in a ZIP central directory are stored in the clear even when the file data is AES-256 or ZipCrypto encrypted, so the tree and table render fine and the entry simply shows an encryption flag. The previewer never asks for or needs the password because it does not decompress ZIP data. Encrypted 7Z/RAR that libarchive cannot open without a password may fail to enumerate. To actually unlock and pull files, use multi-format-extractor.
Is there any options panel to configure?
No. The previewer has zero options — you drop one archive and press Process. There is no glob filter, no sort toggle, no depth slider. If you need to filter the listing by pattern, generate a CSV/JSON listing, or limit recursion, those are separate sibling tools: selective-extractor, file-listing-generator, and nested-archive-extractor.
How is the format detected — by extension?
By magic bytes, not extension. The first eight bytes are checked for PK (ZIP), 1F 8B (GZIP), BZh (BZIP2), the XZ and 7Z signatures, and Rar! (RAR); TAR has no leading signature, so it is identified by ustar at offset 257. A .zip renamed to .bin still previews correctly; a corrupt header that matches nothing falls through to a best-effort ZIP attempt and otherwise errors.
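Signature sniffing of this kind can be sketched in a few lines. The byte values below are the well-known published signatures for each format; the function name, the check order, and returning None on no match are assumptions, and the previewer's best-effort ZIP fallback is not modelled.

```python
def detect_format(head):
    """Guess an archive format from its leading bytes.

    head should hold at least the first 262 bytes of the file so the
    TAR check at offset 257 can see the 'ustar' magic.
    """
    if head[:2] == b"PK":
        return "zip"
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    if head[:3] == b"BZh":
        return "bzip2"
    if head[:6] == b"\xfd7zXZ\x00":
        return "xz"
    if head[:6] == b"7z\xbc\xaf\x27\x1c":
        return "7z"
    if head[:4] == b"Rar!":
        return "rar"
    if head[257:262] == b"ustar":
        return "tar"
    return None  # unknown; the caller decides how to fail

if __name__ == "__main__":
    # In real use: open(path, "rb").read(262)
    print(detect_format(b"PK\x03\x04" + b"\x00" * 300))
    print(detect_format(b"\x00" * 257 + b"ustar\x00" + b"\x00" * 40))
```

Since the file name is never consulted, a renamed archive is classified the same way as one with the expected extension.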
Why do only 200 files show in the table?
The flat table is capped at the 200 largest files by uncompressed size so the HTML stays responsive. The folder tree above it is not capped — it renders every entry — and the header line reports the true total entry count. If you need a complete machine-readable listing of all entries, run file-listing-generator for CSV, JSON, or a full text tree.
Do my files get uploaded?
No. Archive tools are browser-only — there is no server-side path. The bytes are read by FileReader, parsed by fflate / libarchive WASM locally, and the result is rendered in the same tab. The result panel shows a 0 bytes uploaded badge. On Pro and above, scripted runs go through the local @jadapps/runner (a short-lived headless Chromium on your own machine), still without leaving your hardware.
Does it work on an air-gapped forensic host?
Yes — load the page and its WASM once while connected, after which parsing is entirely local with no further network calls. This makes it usable on isolated hosts where installing unzip/7z/unrar is undesirable or disallowed.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.