Archive Splitter for Developers - Chunk Artifacts & Datasets

How to use Archive Splitter in developer workflows

Step 1
Grab the build artifact: Take the existing .zip/.tar/.tar.gz/.7z that your build or packaging step produced. The splitter reads the finished archive directly, so there is no need to re-run the build or re-gather source files.
Step 2
Open the splitter on a Pro plan: Visit /archive-tools/archive-splitter. The tool is Pro-gated; large dataset or artifact work usually calls for Pro-media or Developer, which raise the ceiling to 2 GB / 500,000 entries.
Step 3
Drop the single archive in: Drag one archive onto the drop zone. zip/tar/gz are read natively; 7z/rar/bz2/xz/iso go through libarchive WASM. If the ZIP is encrypted, enter the password when prompted.
Step 4
Choose size or count: To clear a 2 GB upload cap, use size mode with splitSizeMb just under 2000. To document a fixed contract, use count mode with splitFileCount (e.g. 5000).
Step 5
Run and check the manifest: Run the split in-browser. The result panel lists every <stem>-part-NNN.zip with its size, plus the total Parts count. Use it as the basis for a README table or a checksum manifest.
Step 6
Download and publish parts: Click Download N parts; each part downloads as its own file. Publish them as separate release assets or bucket objects, and document the reassembly (or note that each part is standalone).

The real Archive Splitter option contract

Every control the tool actually exposes, with its enum/min/max and default, read straight from the option schema. There are no presets, no drag-reorder, and no part-naming field - output parts are always named <stem>-part-001.zip, -002, and so on.

| Option | Type | Range / values | Default | What it does |
| --- | --- | --- | --- | --- |
| `splitMode` | enum | `size`, `count` | `size` | Split by total uncompressed bytes per part, or by a fixed number of entries per part. |
| `splitSizeMb` | number | 1 - 4096 | 50 | Target part size in MB (decimal: 1 MB = 1,000,000 bytes) when `splitMode` is `size`. Measured against uncompressed entry sizes. |
| `splitFileCount` | number | 1 - 100000 | 100 (when unset) | Entries per part when `splitMode` is `count`. The schema has no UI default; the processor falls back to 100 if you leave it blank. |
| output format | fixed | ZIP only | ZIP | Each part is rebuilt as a fresh ZIP at compression level 6. You cannot choose 7z/tar/gz output here. |
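
To make the ranges and the count fallback concrete, here is a minimal Python sketch of the contract in the table. It is an illustration, not the tool's source: the `SplitOptions` class and `resolve` function are hypothetical names, and rejecting out-of-range values (rather than clamping them) is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SplitOptions:
    split_mode: str = "size"                 # schema default
    split_size_mb: int = 50                  # schema default
    split_file_count: Optional[int] = None   # schema has no UI default


def resolve(opts: SplitOptions) -> SplitOptions:
    """Check the documented ranges and apply the count-mode fallback of 100."""
    if opts.split_mode not in ("size", "count"):
        raise ValueError(f"splitMode must be 'size' or 'count', got {opts.split_mode!r}")
    if opts.split_mode == "size":
        if not 1 <= opts.split_size_mb <= 4096:
            raise ValueError("splitSizeMb must be between 1 and 4096")
        return opts
    # Count mode: a blank splitFileCount falls back to 100.
    count = 100 if opts.split_file_count is None else opts.split_file_count
    if not 1 <= count <= 100_000:
        raise ValueError("splitFileCount must be between 1 and 100000")
    return replace(opts, split_file_count=count)


print(resolve(SplitOptions(split_mode="count")).split_file_count)  # 100
```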

What you can feed in vs what comes out

The splitter detects the input format by magic bytes, then extracts every file entry and rebuilds the parts. fflate handles zip/gz/tar natively; a libarchive WASM bridge reads the rest. Output is always ZIP - the splitter does not write 7z, rar, tar, or gz.

| Input format | Read engine | Read? | Output of each part |
| --- | --- | --- | --- |
| `.zip` (no encryption) | fflate | Yes | ZIP (level 6) |
| `.zip` (AES / ZipCrypto) | @zip.js/zip.js | Yes, with the password | ZIP (level 6, no encryption re-applied) |
| `.tar` | fflate (tar parser) | Yes | ZIP (level 6) |
| `.gz` (single-member gzip) | fflate | Yes (yields one inner file) | ZIP (level 6) |
| `.7z`, `.rar`, `.bz2`, `.xz`, `.iso` | libarchive WASM | Yes (read-only) | ZIP (level 6) |
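
Magic-byte detection for the formats listed above can be sketched in Python as follows. The signatures are the standard published ones for each format; the function name `sniff_format` is hypothetical, and the tool's actual detection code may check more or different bytes.

```python
def sniff_format(head: bytes) -> str:
    """Guess an archive format from the leading bytes of a file."""
    signatures = [
        (b"PK\x03\x04", "zip"),            # local file header
        (b"PK\x05\x06", "zip"),            # empty ZIP (end-of-central-directory only)
        (b"\x1f\x8b", "gz"),
        (b"7z\xbc\xaf\x27\x1c", "7z"),
        (b"Rar!\x1a\x07", "rar"),
        (b"BZh", "bz2"),
        (b"\xfd7zXZ\x00", "xz"),
    ]
    for magic, name in signatures:
        if head.startswith(magic):
            return name
    # tar and ISO 9660 keep their markers at fixed offsets, not at byte 0.
    if head[257:262] == b"ustar":
        return "tar"
    if head[32769:32774] == b"CD001":
        return "iso"
    return "unknown"  # the page says unknown input is handed to libarchive


print(sniff_format(b"PK\x03\x04..."))       # zip
print(sniff_format(b"plain text, no magic"))  # unknown
```

To cover the ISO check, a caller would need to pass at least the first 32,774 bytes of the file.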

Archive-family tier limits that gate a split

Archive Splitter requires the Pro plan or higher (Free sees an upgrade overlay and cannot run it). Both the file-size cap and the per-archive entry-count cap apply to the input archive.

| Plan | Max input size | Max entries per archive | Files at once | Can run splitter? |
| --- | --- | --- | --- | --- |
| Free | 50 MB | 500 | 1 | No - tool is Pro-gated |
| Pro | 500 MB | 50,000 | 20 | Yes |
| Pro-media | 2 GB | 500,000 | 100 | Yes |
| Developer | 2 GB | 500,000 | unlimited | Yes |
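
If you want to pre-check an artifact against these caps before opening the page, the table can be encoded as a small lookup. This is a sketch under stated assumptions: `TIERS` and `blocked_reason` are hypothetical names, and the byte values assume decimal units (1 MB = 1,000,000 bytes), which the page only states explicitly for `splitSizeMb`.

```python
from typing import Optional

# Values copied from the tier table; decimal byte units are an assumption.
TIERS = {
    "free":      {"max_bytes": 50_000_000,    "max_entries": 500,     "can_split": False},
    "pro":       {"max_bytes": 500_000_000,   "max_entries": 50_000,  "can_split": True},
    "pro-media": {"max_bytes": 2_000_000_000, "max_entries": 500_000, "can_split": True},
    "developer": {"max_bytes": 2_000_000_000, "max_entries": 500_000, "can_split": True},
}


def blocked_reason(plan: str, size_bytes: int, entry_count: int) -> Optional[str]:
    """Return why a split would be blocked, or None if the input fits the plan."""
    tier = TIERS[plan]
    if not tier["can_split"]:
        return "Pro required"
    if size_bytes > tier["max_bytes"]:
        return "over tier file-size limit"
    if entry_count > tier["max_entries"]:
        return "over tier entry-count limit"
    return None


# A 400 MB archive with 128,000 entries fits Pro by size but not by entry count.
print(blocked_reason("pro", 400_000_000, 128_000))        # over tier entry-count limit
print(blocked_reason("pro-media", 400_000_000, 128_000))  # None
```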

Cookbook

Concrete splits with the exact options used and the part layout you get back. Sizes are illustrative; the splitter measures the uncompressed size of each entry when bucketing in size mode.

Split a 2.4 GB release into <1900 MB GitHub assets

GitHub caps a single release asset at 2 GB. Size mode at 1900 MB keeps each part comfortably under, and each part is independently downloadable.

Input:  app-v3.2.0-full.zip  (2.4 GB uncompressed)
Options: splitMode = size, splitSizeMb = 1900

Result: Parts: 2   Mode: by size
  app-v3.2.0-full-part-001.zip  ~1.7 GB
  app-v3.2.0-full-part-002.zip  ~0.6 GB
Upload each as a separate release asset.

Chunk a training dataset into 5,000-file parts

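A documented, deterministic contract: every part except the last holds exactly 5,000 samples, and the last holds the remainder. Count mode keeps the boundaries reproducible across rebuilds as long as the set and order of entries do not change.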

Input:  dataset-images.zip  (128,000 .png entries)
Options: splitMode = count, splitFileCount = 5000

Result: Parts: 26   Mode: by file count
  dataset-images-part-001.zip ... part-025.zip (5,000 each)
  dataset-images-part-026.zip (3,000)
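
The part count and the size of the final part follow from simple arithmetic, sketched here in Python. `count_mode_layout` is a hypothetical helper, and deriving the stem by stripping only the last extension is an assumption about how `<stem>` is computed.

```python
import math
from pathlib import PurePath


def count_mode_layout(input_name: str, total_entries: int, split_file_count: int = 100):
    """Return [(part_name, entries_in_part), ...] for a count-mode split."""
    stem = PurePath(input_name).stem  # assumption: only the final extension is removed
    parts = math.ceil(total_entries / split_file_count)
    layout = []
    for i in range(parts):
        in_part = min(split_file_count, total_entries - i * split_file_count)
        layout.append((f"{stem}-part-{i + 1:03d}.zip", in_part))
    return layout


layout = count_mode_layout("dataset-images.zip", 128_000, 5000)
print(len(layout))  # 26
print(layout[0])    # ('dataset-images-part-001.zip', 5000)
print(layout[-1])   # ('dataset-images-part-026.zip', 3000)
```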

Break a 7z dependency cache into shareable ZIPs

A teammate shared a .7z cache; you need ZIP parts for a CI cache that only accepts ZIP. libarchive WASM reads the 7z; the parts are ZIP.

Input:  deps-cache.7z  (read via libarchive WASM)
Options: splitMode = size, splitSizeMb = 250

Output: deps-cache-part-001.zip, ... (ZIP, not 7z)
Feed each ZIP into the CI cache step.

One huge model file in the artifact

The artifact bundles a 3 GB model weight. It exceeds any reasonable part size, so it lands in its own part untouched - the rest of the files pack to the target.

Input:  bundle.zip  (model.safetensors = 3 GB + code)
Options: splitMode = size, splitSizeMb = 1000

Result:
  bundle-part-001.zip  ~1 GB (code + assets)
  bundle-part-002.zip  ~3 GB (model.safetensors alone)
The weight file stays intact.

Document parts with a checksum manifest

After splitting, generate per-part checksums for your release notes so downloaders can verify each independent part.

Downloaded parts:
  release-part-001.zip
  release-part-002.zip
  release-part-003.zip

Next: run each through checksum-generator and publish a
SHA-256 manifest next to the assets in the README.
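
If you would rather build the manifest locally than in the browser, a short Python script using only the standard library can do it. This is an alternative sketch, not part of the JAD tools: `sha256_of` and `build_manifest` are hypothetical names, and the `*-part-*.zip` pattern assumes the part naming described on this page.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large parts are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(folder: Path, pattern: str = "*-part-*.zip") -> str:
    """Return one '<sha256>  <filename>' line per part, sorted by name."""
    lines = [f"{sha256_of(p)}  {p.name}" for p in sorted(folder.glob(pattern))]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_manifest(Path(".")), end="")
```

The two-space `<hash>  <name>` layout matches what `sha256sum` writes, so downloaders can check the parts with `sha256sum -c` against the published file.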

Edge cases and what actually happens

Output parts are ZIP even when the input was 7z/rar/tar

By design

The splitter rebuilds every part with fflate's zipSync at level 6, so a .7z, .rar, .tar, or .gz input always produces .zip parts. If you specifically need the parts in the original format, the splitter is the wrong tool - it normalises everything to ZIP. Check archive-format-converter for format changes.

`size` mode measures uncompressed bytes, not download size

Expected

Each part's budget (splitSizeMb) is checked against the uncompressed size of the entries it holds. Because the part is then re-compressed at level 6, the downloaded .zip is usually smaller than the target, by an amount that depends on how compressible the entries are. Size mode therefore cannot produce parts that are exactly N MB on disk; expect them to come in under the target.
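
The gap between the budgeted (uncompressed) size and the size on disk can be seen with a small experiment. The sketch below uses Python's `zipfile` at DEFLATE level 6 as a stand-in for fflate, so the exact byte counts will differ from the tool's output; `zip_sizes` is a hypothetical helper.

```python
import io
import zipfile


def zip_sizes(entries: dict[str, bytes]) -> tuple[int, int]:
    """Return (total uncompressed bytes, size of the resulting ZIP in bytes)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=6) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    uncompressed = sum(len(data) for data in entries.values())
    return uncompressed, len(buf.getvalue())


# Highly repetitive text compresses well, so the ZIP is far below the budgeted size.
budgeted, on_disk = zip_sizes({"build.log": b"compiling module\n" * 10_000})
print(budgeted, on_disk)
```

Already-compressed content (images, model weights, nested archives) shrinks very little, so for those inputs the part lands close to the target.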

A single entry exceeds the part size

Preserved

We never split one file across two archives. If an entry is larger than splitSizeMb, it is placed in its own part - so that part can be bigger than the target. This keeps every entry intact and openable.
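
One way to get this behaviour is a greedy pass over the entries in order, sketched below in Python. The page does not publish the exact packing algorithm, so treat the greedy strategy and the `bucket_by_size` name as assumptions; only the rules it demonstrates (decimal MB budget, uncompressed sizes, oversize entries isolated) come from this page.

```python
def bucket_by_size(entries, split_size_mb):
    """Group (name, uncompressed_bytes) pairs into parts without splitting any entry."""
    budget = split_size_mb * 1_000_000  # decimal MB, as in the option table
    parts, current, used = [], [], 0
    for name, size in entries:
        # Close the current part if this entry would push it over budget.
        if current and used + size > budget:
            parts.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        parts.append(current)
    return parts


entries = [
    ("src/app.py", 400_000_000),
    ("assets/data.bin", 500_000_000),
    ("model.safetensors", 3_000_000_000),  # larger than the 1000 MB budget
    ("README.md", 1_000_000),
]
for part in bucket_by_size(entries, 1000):
    print(part)
# ['src/app.py', 'assets/data.bin']
# ['model.safetensors']
# ['README.md']
```

Because an oversize entry always closes the part before it and the part after it, it ends up alone, which is why that part can exceed the target.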

Directory-only entries vanish from the parts

Dropped

Extraction keeps only file entries; pure directory records (names ending in /) are dropped. The files inside those directories keep their full relative paths, so any folder that contains at least one file is recreated when you unzip. Empty directories are not preserved.
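
The effect is easy to reproduce on a list of entry names. In this Python sketch, `keep_file_entries` and `implied_directories` are hypothetical helpers that mimic the rule described above; they are not taken from the tool's code.

```python
from pathlib import PurePosixPath


def keep_file_entries(names):
    """Drop pure directory records (names ending in '/')."""
    return [n for n in names if not n.endswith("/")]


def implied_directories(file_names):
    """Folders an unzip tool recreates from the surviving file paths."""
    dirs = set()
    for name in file_names:
        for parent in PurePosixPath(name).parents:
            if str(parent) != ".":
                dirs.add(str(parent))
    return dirs


names = ["src/", "src/main.py", "logs/", "docs/guide/intro.md"]
files = keep_file_entries(names)
print(files)                               # ['src/main.py', 'docs/guide/intro.md']
print(sorted(implied_directories(files)))  # ['docs', 'docs/guide', 'src']
```

The empty `logs/` folder has no file to carry its path, so it does not come back after extraction.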

Archive has encrypted ZIP entries and no password is given

Fails - password required

Encrypted ZIP entries are read through @zip.js/zip.js, which needs the password. Without it the extraction step fails before any part is produced. Supply the password in the prompt and re-run. Note: the output parts are not re-encrypted.

Input format is unrecognised or corrupt

Error - unknown format

Format is detected from the file's magic bytes. A truncated header, a renamed non-archive, or a damaged file is treated as unknown and routed to libarchive, which will error if it cannot parse it. Test the archive with archive-integrity-tester first if you suspect corruption.

Archive contains no extractable entries

Error - no entries

If extraction yields zero file entries (for example an archive that holds only empty directories), the splitter throws `Archive contains no entries`. There is nothing to group into parts.

Input exceeds the tier file-size or entry-count cap

Blocked - over tier limit

Pro caps input at 500 MB and 50,000 entries; Pro-media and Developer at 2 GB and 500,000 entries. An archive over your plan's size or entry-count limit is blocked before processing. Split a smaller archive or upgrade the plan.

Free plan cannot open the tool at all

Blocked - Pro required

Archive Splitter's minimum tier is Pro. On the Free plan the page renders an upgrade overlay instead of the drop zone, so you cannot run a split until you are on Pro or higher.

Parts download individually, not as one bundle

Expected

Clicking Download N parts triggers one browser download per part. The <stem>-N-parts.zip name shown in the result is only a label - there is no single combined ZIP-of-parts file. Your browser may ask to allow multiple downloads the first time.

Frequently asked questions

Is there an API or CLI to call this from CI?

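No - the splitter is browser-only with no scripting endpoint. For CI, use a command-line tool such as `zip -s`, `7z -v`, or `split`; note that these produce volumes that must be rejoined before extraction. Use the JAD splitter when you want self-contained, individually downloadable ZIP parts for a release or a teammate.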
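
If a pipeline needs standalone ZIP parts rather than rejoinable volumes, a rough local approximation of count mode can be scripted with Python's standard `zipfile` module. This is not the JAD tool and will not produce byte-identical output; `split_zip_by_count` is a hypothetical function, it handles unencrypted ZIP input only, and it reads each entry fully into memory.

```python
import zipfile
from pathlib import Path


def split_zip_by_count(src: Path, out_dir: Path, per_part: int) -> list[Path]:
    """Write <stem>-part-NNN.zip files holding up to per_part file entries each."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(src) as zin:
        # Mirror the documented behaviour: directory records are skipped.
        files = [info for info in zin.infolist() if not info.is_dir()]
        for start in range(0, len(files), per_part):
            part = out_dir / f"{src.stem}-part-{start // per_part + 1:03d}.zip"
            with zipfile.ZipFile(part, "w", zipfile.ZIP_DEFLATED, compresslevel=6) as zout:
                for info in files[start:start + per_part]:
                    zout.writestr(info.filename, zin.read(info))
            written.append(part)
    return written


if __name__ == "__main__":
    import sys
    for path in split_zip_by_count(Path(sys.argv[1]), Path("parts"), 5000):
        print(path)
```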

Will the parts upload as separate GitHub release assets?

Yes - each part is its own .zip file. Set splitSizeMb below GitHub's 2 GB per-asset limit and upload each part as a distinct asset. Downloaders can fetch and extract any part independently.

Does it keep the original format?

No. Whatever you put in (zip, tar, 7z, rar...), the parts come out as ZIP at compression level 6. If you must preserve a specific format, the splitter isn't the right tool.

Can I document an exact files-per-part contract?

Yes - count mode is deterministic. 'Each part contains 5,000 files' holds for a given input, and only the final part is smaller (the remainder).

Are parts reproducible build-to-build?

In count mode, part boundaries are stable for identical input. Any change to the set or order of entries shifts where parts break, so regenerate the manifest after a content change.

How do I verify the parts?

Run each part through checksum-generator and publish a SHA-256 manifest. Use archive-integrity-tester to confirm a part is structurally sound.

What if one file is bigger than my part size?

It gets its own part (larger than the target). We never split a single entry across parts, so an extracted binary or model weight is never corrupted.

Can I split a private prerelease without it leaving my machine?

Yes - all extraction and re-zipping happen in the browser. The page shows 0 bytes uploaded. Nothing about the artifact is sent to a server.

Will it handle a 2 GB+ artifact?

Only up to the tier cap: 500 MB on Pro, 2 GB on Pro-media/Developer. Above 2 GB you must pre-split with a CLI, then refine each piece into ZIP parts in-browser.

Does it preserve empty folders in the artifact?

No - empty directory records are dropped; only file entries (with their paths) are kept. To manage empty folders, see empty-folder-pruner.

How do downloaders put the parts back together?

Usually they don't need to - each part is independent. If you want a single recombined archive, point them to archive-merger, which merges complete archives.

Can I build a fresh multi-part archive instead of splitting one?

Yes - use multi-part-archive-creator to build chunks straight from loose files, or folder-to-zip to zip a folder first, then split it here.

Privacy first

Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.

