How to use Archive Splitter for automation & SRE
- Step 1: Pull the oversized archive off the pipeline. Grab the CI artifact, logs.tar.gz, or incident capture you need to break up. Keep it as the original archive: the splitter reads it directly, so you don't need to re-collect the source files.
- Step 2: Open the splitter (Pro plan required). Go to /archive-tools/archive-splitter on a Pro, Pro-media, or Developer plan. SRE teams typically need Pro-media or Developer for the 2 GB / 500,000-entry caps that match large log bundles.
- Step 3: Drop in one archive. Drag the single archive onto the drop zone. zip, tar, and gz are read natively; 7z, rar, bz2, xz, and iso are read through libarchive WASM. Provide the password if the ZIP is encrypted.
- Step 4: Choose the mode that matches your constraint. Fitting an attachment cap? Use `size` and set `splitSizeMb` a little below the cap (it measures uncompressed bytes, so parts land slightly under). Need reproducible chunks across runs? Use `count` with a fixed `splitFileCount`.
- Step 5: Run and verify part sizes. Run the split in-browser. Read the per-part sizes in the result panel and confirm the largest part is under your upload/attachment limit before you attach anything.
- Step 6: Download and route the parts. Click Download N parts - each <stem>-part-NNN.zip saves separately. Attach the relevant part(s) to the ticket, or upload them one at a time to the artifact store.
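The size check in Step 5 can also be repeated on the downloaded files before anything is attached. The following is a minimal sketch, not part of the tool: the function name `oversized_parts` is hypothetical, and it assumes the attachment cap is expressed in decimal MB (1 MB = 1,000,000 bytes) and that the parts sit in one local folder under the `<stem>-part-NNN.zip` naming.

```python
from pathlib import Path


def oversized_parts(folder, stem, cap_mb):
    """Return (name, size_bytes) for each <stem>-part-NNN.zip above the cap.

    Assumption: cap_mb is in decimal megabytes, like splitSizeMb.
    """
    cap_bytes = cap_mb * 1_000_000
    parts = sorted(Path(folder).glob(f"{stem}-part-*.zip"))
    return [(p.name, p.stat().st_size) for p in parts if p.stat().st_size > cap_bytes]
```

An empty list means every part fits; anything returned is a part to handle separately before routing.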
The real Archive Splitter option contract
Every control the tool actually exposes, with its enum/min/max and default, read straight from the option schema. There are no presets, no drag-reorder, and no part-naming field - output parts are always named <stem>-part-001.zip, -002, and so on.
| Option | Type | Range / values | Default | What it does |
|---|---|---|---|---|
| splitMode | enum | size, count | size | Split by total uncompressed bytes per part, or by a fixed number of entries per part. |
| splitSizeMb | number | 1 - 4096 | 50 | Target part size in MB (decimal: 1 MB = 1,000,000 bytes) when splitMode is size. Measured against uncompressed entry sizes. |
| splitFileCount | number | 1 - 100000 | 100 (when unset) | Entries per part when splitMode is count. The schema has no UI default; the processor falls back to 100 if you leave it blank. |
| output format | fixed | ZIP only | ZIP | Each part is rebuilt as a fresh ZIP at compression level 6. You cannot choose 7z/tar/gz output here. |
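The ranges and defaults in the table can be written down as a small validator, which is handy when a runbook records the options used for a split. This is an illustrative sketch only: `normalise_options` is a hypothetical helper, not the tool's own code, and it simply mirrors the enum values, ranges, and the fallback of 100 listed above.

```python
def normalise_options(split_mode="size", split_size_mb=50, split_file_count=None):
    """Validate options against the documented ranges and apply defaults."""
    if split_mode not in ("size", "count"):
        raise ValueError("splitMode must be 'size' or 'count'")
    if split_mode == "size":
        if not 1 <= split_size_mb <= 4096:
            raise ValueError("splitSizeMb must be between 1 and 4096")
        return {"splitMode": "size", "splitSizeMb": split_size_mb}
    if split_file_count is None:
        split_file_count = 100  # documented fallback when the field is left blank
    if not 1 <= split_file_count <= 100000:
        raise ValueError("splitFileCount must be between 1 and 100000")
    return {"splitMode": "count", "splitFileCount": split_file_count}
```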
What you can feed in vs what comes out
The splitter detects the input format by magic bytes, then extracts every file entry and rebuilds the parts. fflate handles zip/gz/tar natively; a libarchive WASM bridge reads the rest. Output is always ZIP - the splitter does not write 7z, rar, tar, or gz.
| Input format | Read engine | Read? | Output of each part |
|---|---|---|---|
| .zip (no encryption) | fflate | Yes | ZIP (level 6) |
| .zip (AES / ZipCrypto) | @zip.js/zip.js | Yes, with the password | ZIP (level 6, no encryption re-applied) |
| .tar | fflate (tar parser) | Yes | ZIP (level 6) |
| .gz (single-member gzip) | fflate | Yes (yields one inner file) | ZIP (level 6) |
| .7z, .rar, .bz2, .xz, .iso | libarchive WASM | Yes (read-only) | ZIP (level 6) |
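Magic-byte detection can be sketched as follows. The signatures are the standard published ones for each format; the function name `detect_format` and the order of the checks are assumptions for illustration, not the splitter's actual implementation.

```python
def detect_format(header: bytes) -> str:
    """Guess an archive format from its leading bytes (illustrative sketch).

    Pass at least the first 32,774 bytes if ISO detection is needed.
    """
    if header[:4] in (b"PK\x03\x04", b"PK\x05\x06"):  # local file header / empty archive
        return "zip"
    if header[:2] == b"\x1f\x8b":
        return "gz"
    if header[:6] == b"7z\xbc\xaf\x27\x1c":
        return "7z"
    if header[:6] == b"Rar!\x1a\x07":
        return "rar"
    if header[:3] == b"BZh":
        return "bz2"
    if header[:6] == b"\xfd\x37\x7a\x58\x5a\x00":
        return "xz"
    if header[257:262] == b"ustar":  # POSIX tar magic sits at offset 257
        return "tar"
    if header[32769:32774] == b"CD001":  # ISO 9660 volume descriptor
        return "iso"
    return "unknown"
```

A renamed non-archive or a file with a truncated header falls through to "unknown", which matches the unknown-format case described in the edge cases.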
Archive-family tier limits that gate a split
Archive Splitter requires the Pro plan or higher (Free sees an upgrade overlay and cannot run it). Both the file-size cap and the per-archive entry-count cap apply to the input archive.
| Plan | Max input size | Max entries per archive | Files at once | Can run splitter? |
|---|---|---|---|---|
| Free | 50 MB | 500 | 1 | No - tool is Pro-gated |
| Pro | 500 MB | 50,000 | 20 | Yes |
| Pro-media | 2 GB | 500,000 | 100 | Yes |
| Developer | 2 GB | 500,000 | unlimited | Yes |
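A pre-flight check against these caps might look like the sketch below. `TIER_LIMITS` and `check_tier` are hypothetical names, and the byte values assume decimal units (500 MB = 500,000,000 bytes); the source does not say whether the caps are decimal or binary.

```python
# (max_bytes, max_entries) per plan; None means the plan cannot run the splitter.
# Assumption: caps are decimal (1 MB = 1,000,000 bytes).
TIER_LIMITS = {
    "free": None,
    "pro": (500_000_000, 50_000),
    "pro-media": (2_000_000_000, 500_000),
    "developer": (2_000_000_000, 500_000),
}


def check_tier(plan, size_bytes, entry_count):
    """Return 'ok' or a short reason why the input would be blocked."""
    if plan not in TIER_LIMITS:
        raise ValueError(f"unknown plan: {plan}")
    limits = TIER_LIMITS[plan]
    if limits is None:
        return "blocked: Pro required"
    max_bytes, max_entries = limits
    if size_bytes > max_bytes:
        return "blocked: over size cap"
    if entry_count > max_entries:
        return "blocked: over entry cap"
    return "ok"
```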
Cookbook
Concrete splits with the exact options used and the part layout you get back. Sizes are illustrative; the splitter measures the uncompressed size of each entry when bucketing in size mode.
Fit a 1.2 GB log bundle under a 100 MB ticket attachment cap
The ticketing system rejects anything over 100 MB. Size mode at 95 MB keeps every part safely under the cap (uncompressed budget + level-6 recompression means parts land under target).
Input: incident-4821-logs.tar (1.2 GB uncompressed)
Options: splitMode = size, splitSizeMb = 95
Result:
  Parts: 14
  Mode: by size
  incident-4821-logs-part-001.zip ~88 MB
  ...all parts < 100 MB...
Attach the relevant part to the ticket - it opens standalone.
Deterministic 2,000-entries-per-part chunks for a replay job
A reprocessing job ingests at most 2,000 files per batch. Count mode gives identical chunk boundaries every run, so the batch numbering is stable.
Input: events-export.zip (47,300 entries)
Options: splitMode = count, splitFileCount = 2000
Result:
  Parts: 24
  Mode: by file count
  events-export-part-001.zip ... part-023.zip (2,000 each)
  events-export-part-024.zip (1,300)
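The part layout above follows from fixed-size chunking of the entry list. A minimal sketch, assuming entries are taken in archive order (`chunk_by_count` is a hypothetical helper, not the tool's code):

```python
def chunk_by_count(names, per_part):
    """Group entry names into consecutive chunks of at most per_part items."""
    return [names[i:i + per_part] for i in range(0, len(names), per_part)]


entries = [f"event-{i:05d}.json" for i in range(47_300)]  # placeholder names
parts = chunk_by_count(entries, 2000)
print(len(parts), len(parts[0]), len(parts[-1]))
```

For 47,300 entries at 2,000 per part this gives 23 full parts and a final part of 1,300, matching the result shown.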
Split a 7z backup pulled from a fleet node
The node's nightly backup is a 7z. libarchive WASM reads it in-browser; the parts come out as ZIP so any teammate can open them without 7-Zip installed.
Input: node07-backup.7z (read via libarchive WASM)
Options: splitMode = size, splitSizeMb = 200
Output: node07-backup-part-001.zip, ... (ZIP parts)
No 7z tooling needed on the receiving end.
A 1.5 GB core dump inside the capture
The capture has one enormous core file. With a 200 MB target the core file can never fit inside a part, so the splitter gives it its own part rather than cutting it across two.
Input: crash-capture.zip (core.dump = 1.5 GB + logs)
Options: splitMode = size, splitSizeMb = 200
Result:
  crash-capture-part-001.zip ~200 MB (logs)
  crash-capture-part-002.zip ~1.5 GB (core.dump alone)
The core file stays intact in its own part.
Capture too big for the tier - split upstream first
A 3 GB capture exceeds even the 2 GB Developer cap. The splitter blocks it. The runbook step is to pre-split with the CLI on the node, then refine into shareable ZIP parts in-browser.
Input: mega-capture.tar.gz (3 GB) -> blocked (> 2 GB cap)
Runbook:
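1. On the node: archive subsets of capture/ as separate, independent archives under 2 GB each (rough cut). Avoid 7z a -v here: spanned volumes cannot be opened individually.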
2. Drop each <2 GB piece into the browser splitter
to produce self-contained ZIP parts for sharing.
Edge cases and what actually happens
Output parts are ZIP even when the input was 7z/rar/tar
By design: The splitter rebuilds every part with fflate's zipSync at level 6, so a .7z, .rar, .tar, or .gz input always produces .zip parts. If you specifically need the parts in the original format, the splitter is the wrong tool - it normalises everything to ZIP. Check archive-format-converter for format changes.
`size` mode measures uncompressed bytes, not download size
Expected: Each part's budget (splitSizeMb) is checked against the uncompressed size of the entries it holds. Because the part is then re-compressed at level 6, the actual downloaded .zip is usually smaller than the target. If you need parts that are exactly N MB on disk, size mode cannot guarantee that: parts will run a little under.
A single entry exceeds the part size
Preserved: We never split one file across two archives. If an entry is larger than splitSizeMb, it is placed in its own part - so that part can be bigger than the target. This keeps every entry intact and openable.
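The bucketing rule can be illustrated with a greedy pass over the entries. This is a sketch under stated assumptions, not the splitter's source: it assumes entries are visited in archive order, that a part is closed as soon as the next entry would push it over the budget, and that `bucket_by_size` is a hypothetical name.

```python
def bucket_by_size(entries, split_size_mb):
    """Greedy size bucketing over (name, uncompressed_bytes) pairs.

    An entry larger than the budget ends up alone in its own part,
    so no entry is ever cut across two parts.
    """
    budget = split_size_mb * 1_000_000  # decimal MB, as documented for splitSizeMb
    parts, current, used = [], [], 0
    for name, size in entries:
        if current and used + size > budget:
            parts.append(current)  # close the part before it overflows
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        parts.append(current)
    return parts


capture = [
    ("a.log", 120_000_000),
    ("b.log", 70_000_000),
    ("core.dump", 1_500_000_000),
    ("c.log", 10_000_000),
]
print(bucket_by_size(capture, 200))
```

With a 200 MB budget the two logs share a part, core.dump sits alone in an oversized part, and the trailing log starts a new one.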
Directory-only entries vanish from the parts
Preserved: Extraction keeps only file entries; pure directory records (names ending in /) are dropped. The files inside those directories keep their full relative paths, so the folder structure is reconstructed when you unzip, but empty directories are not preserved.
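To see in advance which directories would disappear, the same filter can be applied to an entry listing. A small sketch with hypothetical helper names, assuming directory records are exactly the names ending in `/`:

```python
def file_entries(names):
    """Keep only file entries; directory records end in '/'."""
    return [n for n in names if not n.endswith("/")]


def dropped_empty_dirs(names):
    """Directory records that contain no file, so they will not be recreated."""
    files = file_entries(names)
    dirs = [n for n in names if n.endswith("/")]
    return [d for d in dirs if not any(f.startswith(d) for f in files)]


listing = ["logs/", "logs/app.log", "spool/", "readme.txt"]
print(file_entries(listing), dropped_empty_dirs(listing))
```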
Archive has encrypted ZIP entries and no password is given
Fails - password required: Encrypted ZIP entries are read through @zip.js/zip.js, which needs the password. Without it the extraction step fails before any part is produced. Supply the password in the prompt and re-run. Note: the output parts are not re-encrypted.
Input format is unrecognised or corrupt
Error - unknown format: Format is detected from the file's magic bytes. A truncated header, a renamed non-archive, or a damaged file is treated as unknown and routed to libarchive, which will error if it cannot parse it. Test the archive with archive-integrity-tester first if you suspect corruption.
Archive contains no extractable entries
Error - no entries: If extraction yields zero file entries (for example an archive that holds only empty directories), the splitter throws "Archive contains no entries". There is nothing to group into parts.
Input exceeds the tier file-size or entry-count cap
Blocked - over tier limit: Pro caps input at 500 MB and 50,000 entries; Pro-media and Developer at 2 GB and 500,000 entries. An archive over your plan's size or entry-count limit is blocked before processing. Split a smaller archive or upgrade the plan.
Free plan cannot open the tool at all
Blocked - Pro required: Archive Splitter's minimum tier is Pro. On the Free plan the page renders an upgrade overlay instead of the drop zone, so you cannot run a split until you are on Pro or higher.
Parts download individually, not as one bundle
Expected: Clicking Download N parts triggers one browser download per part. The <stem>-N-parts.zip name shown in the result is only a label - there is no single combined ZIP-of-parts file. Your browser may ask to allow multiple downloads the first time.
Frequently asked questions
Can I run this inside a CI/CD pipeline?
No - execution is browser-only with no API for splitting. For headless pipelines use zip -s/7z -v/split. The JAD splitter is for the human-in-the-loop step: turning a too-big archive into self-contained, shareable ZIP parts.
How do I guarantee every part fits a 100 MB cap?
Use size mode and set splitSizeMb a little below the cap (e.g. 95). The budget is measured on uncompressed bytes and parts are re-compressed at level 6, so they land under target. Then confirm the largest part in the result panel.
What about an entry that's bigger than my cap?
It gets its own part, which will exceed the cap because we never split one file across parts. For a single oversized file you'll need a byte-level split instead, or to recompress that file separately.
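A byte-level split of one oversized file can be done outside the tool. The sketch below is one possible approach, with assumptions labeled: the `.001`, `.002` suffixes and the function names are hypothetical, chunk sizes are decimal MB, and each chunk is held in memory while it is written. Unlike the splitter's ZIP parts, these pieces are not usable on their own; the recipient must rejoin all of them in order.

```python
from pathlib import Path


def split_bytes(path, chunk_mb, out_dir):
    """Cut one file into numbered raw pieces of at most chunk_mb (decimal MB)."""
    chunk = chunk_mb * 1_000_000
    pieces = []
    with open(path, "rb") as src:
        index = 1
        while True:
            data = src.read(chunk)
            if not data:
                break
            piece = Path(out_dir) / f"{Path(path).name}.{index:03d}"
            piece.write_bytes(data)
            pieces.append(piece)
            index += 1
    return pieces


def join_bytes(pieces, dest):
    """Concatenate the pieces, in the given order, back into one file."""
    with open(dest, "wb") as dst:
        for piece in pieces:
            dst.write(Path(piece).read_bytes())
```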
Are the parts safe to attach to a public ticket?
The parts are plain, unencrypted ZIPs (encryption is not re-applied on output). Treat them as plaintext - don't attach sensitive captures to a public ticket without separately protecting them.
Will it read our `.tar.gz` log bundles?
A plain .tar reads natively; gzip is read by fflate too. The contents are extracted and rebuilt as ZIP parts. The parts are ZIP, not tar.gz.
Can I split a 7z backup from a node?
Yes - libarchive WASM reads 7z, rar, bz2, xz, and iso. The parts come back as ZIP so teammates without 7-Zip can still open them.
What are the hard size limits?
Pro: 500 MB / 50,000 entries. Pro-media and Developer: 2 GB / 500,000 entries. Both caps apply, so a high-file-count log bundle can hit the entry cap before the size cap.
Does it preserve directory structure?
File paths are preserved (so folders reconstruct on unzip), but empty directory records are dropped. Use empty-folder-pruner if you want that done deliberately, or file-listing-generator to audit contents first.
How do I reassemble parts later?
Each part is independent, so often you don't need to. To recombine all parts into one archive, use archive-merger, which merges complete archives (not spanned volumes).
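Because every part is a complete ZIP, recombining them locally is also straightforward with Python's standard `zipfile` module. This is a sketch of that idea, not a description of how archive-merger works: `merge_parts` is a hypothetical helper, it reads each entry into memory, keeps the first copy of any duplicate name, and does not carry over timestamps.

```python
import zipfile


def merge_parts(part_paths, dest_path):
    """Copy the file entries of several ZIP parts into one ZIP; return the entry count."""
    seen = set()
    with zipfile.ZipFile(dest_path, "w", zipfile.ZIP_DEFLATED, compresslevel=6) as out:
        for part in part_paths:
            with zipfile.ZipFile(part) as src:
                for info in src.infolist():
                    if info.is_dir() or info.filename in seen:
                        continue  # skip directory records and duplicate names
                    seen.add(info.filename)
                    out.writestr(info.filename, src.read(info.filename))
    return len(seen)
```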
Is processing reproducible across runs?
count mode is deterministic - same input, same chunk boundaries. size mode depends on per-entry sizes and ordering, so boundaries are stable for a given input too, but a content change shifts them.
Can I verify an archive before splitting it?
Yes - run it through archive-integrity-tester first to catch corruption, since a truncated header makes the splitter error on extraction.
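For ZIP inputs, a comparable local pre-check is possible with Python's standard `zipfile` module before the file ever reaches the browser. This sketch is independent of archive-integrity-tester and only covers unencrypted ZIP; `zip_precheck` is a hypothetical name.

```python
import zipfile


def zip_precheck(path):
    """Return 'ok', or a short description of the first problem found."""
    if not zipfile.is_zipfile(path):
        return "not a readable ZIP"
    with zipfile.ZipFile(path) as zf:
        bad = zf.testzip()  # name of the first entry with a bad CRC or header, else None
    return f"corrupt entry: {bad}" if bad else "ok"
```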
Why not just use spanned 7z volumes?
Spanned volumes can't be opened individually - the recipient needs every piece and a reassembly step. For ticket attachments and vendor handoffs, self-contained ZIP parts are far less error-prone.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.