Size Analyser vs unzip -l / 7z l / du

How the Size Analyser compares with unzip -l / 7z l / du

Step 1
Reproduce unzip -l aggregation in the browser: unzip -l a.zip | awk '{print $4}' | ... is the manual route to a per-type total. The Size Analyser does it directly. Drop a.zip and read byExtension; the biggest type is the first array element.
Step 2
Reproduce 7z l for 7z/RAR: 7z l a.7z needs p7zip installed. The Size Analyser routes 7z and RAR through libarchive WASM, so you get the same per-entry sizes summed by type with no local install. Note that it fully decompresses 7z/RAR entries to measure them, whereas 7z l reads only the header.
Step 3
Skip the du step entirely: du -sh extracted/* only works after you have run unzip. The analyser reports per-top-folder bytes straight from the archive, so you never write the extracted tree to disk.
Step 4
Compare on size ceiling: archives over 2 GB belong on the CLI, because browser memory becomes the bottleneck, especially for 7z/RAR where libarchive expands entries in RAM. Under the tier caps the browser gets you to results faster because there is nothing to install or pipe.
Step 5
Compare on privacy: both the browser tool and the CLI run locally, so both are private. The browser path adds no install, no admin rights, and nothing written to disk, which is useful when auditing untrusted input you do not want to extract.
Step 6
Compare on automation: for one PR or one ad-hoc triage, the browser is faster end to end. For nightly jobs over thousands of archives, the CLI with find ... -exec wins, because the analyser reads one archive per run.

Command-by-command equivalents

What each CLI command gives you versus what the Size Analyser returns. The analyser's headline advantage is that the aggregation step is already done.

Goal	CLI	Size Analyser
List entries + sizes	`unzip -l a.zip` / `7z l a.7z` / `tar -tvf a.tar.gz`	Reads them internally; not printed flat — returned as grouped totals
Total bytes per file type	`unzip -l a.zip | awk` + `sort`	`byExtension`, pre-summed and sorted descending
Total bytes per top folder	extract, then `du -sh extracted/*`	`byTopFolder`, no extraction needed
Which group is biggest	manual `sort -rn`	First element of each array (already sorted)
Machine-readable output	parse whitespace-aligned text	Structured JSON: `{ ext, count, totalSize }`

Browser tool vs CLI: trade-offs

Honest comparison. Neither is strictly better — the right choice depends on file size, install constraints, and whether you are scripting.

Dimension	JAD Size Analyser	unzip -l / 7z l / du
Install	None (browser)	Needs unzip/p7zip/tar
Upload	Never (local WASM)	Never (local)
Max archive	50 MB Free / 500 MB Pro / 2 GB Pro-Media	No ceiling (disk/RAM bound)
Entry cap	500 Free / 50,000 Pro / 500,000 Pro-Media	None
Aggregation	Built in (by ext + by folder)	Manual (awk/sort/du)
Output	JSON report	Plain text
Best for	Ad-hoc triage, locked-down machines	Huge files, scripted batch jobs

Cookbook

Side-by-side: the shell pipeline you would otherwise write, versus the analyser report. Sizes are uncompressed bytes in both.

Per-type total: unzip -l + awk vs one drop

The classic 'which extension dominates' question. The CLI needs a pipe; the analyser returns it sorted.

CLI:
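  # Count only rows that start with a numeric size, so the header,
  # the dashed rules, and the trailing totals row are not summed.
  unzip -l app.zip | awk 'NF>=4 && $1 ~ /^[0-9]+$/ {n=split($NF,a,"."); s[a[n]]+=$1}
    END {for (k in s) print s[k], k}' | sort -rn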

Analyser:
  drop app.zip -> byExtension[0]
  { "ext": "map", "count": 19, "totalSize": 73400320 }

Same answer, no pipeline to debug.
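
To see what the per-type aggregation does without needing a real archive, here is a minimal sketch that runs it over a canned listing. The function names (sum_by_ext, sample_listing) and the sample rows are hypothetical, and the listing is only shaped like Info-ZIP's unzip -l output. Files with no dot in the name are grouped under "(none)", which is a choice made for this sketch.

```shell
# Sum uncompressed bytes per extension from `unzip -l`-style text on stdin.
# Only rows that start with a numeric size and carry a name column count,
# so the header, the dashed rules, and the trailing totals row are skipped.
sum_by_ext() {
  awk 'NF >= 4 && $1 ~ /^[0-9]+$/ {
         n = split($NF, parts, ".")
         ext = (n > 1) ? parts[n] : "(none)"
         bytes[ext] += $1
       }
       END { for (e in bytes) print bytes[e], e }' | sort -rn
}

# Hypothetical listing, shaped like Info-ZIP `unzip -l` output.
sample_listing() {
  cat <<'EOF'
Archive:  app.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     2048  2024-01-01 10:00   dist/app.js
    10240  2024-01-01 10:00   dist/app.js.map
     4096  2024-01-01 10:00   dist/vendor.js
      512  2024-01-01 10:00   README
---------                     -------
    16896                     4 files
EOF
}

sample_listing | sum_by_ext
```

On a real archive the same function would be fed by unzip -l app.zip | sum_by_ext. The first output row plays the role of byExtension[0].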

Per-folder total: du vs byTopFolder

du needs the archive extracted first. The analyser reads top-level folder totals directly.

CLI:
  unzip -q bundle.zip -d /tmp/b && du -sh /tmp/b/* | sort -rh

Analyser:
  drop bundle.zip -> byTopFolder
  [ { "folder": "vendor", "totalSize": 188743680, "count": 4012 },
    { "folder": "src",    "totalSize": 8388608,   "count": 230  } ]

No /tmp extraction, no cleanup.
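
If you are on the CLI and want folder totals without extracting, the listing alone is enough. The sketch below is one possible approach, not the analyser's implementation: sum_by_top_folder is a hypothetical helper, it assumes entry names contain no spaces, and it skips directory entries (whether the analyser counts those is not stated in the source). Unlike du, it sums raw bytes rather than disk blocks.

```shell
# Sum uncompressed bytes and entry counts per top-level folder from
# `unzip -l`-style text on stdin. Assumes entry names contain no spaces.
sum_by_top_folder() {
  awk 'NF >= 4 && $1 ~ /^[0-9]+$/ && $NF !~ /\/$/ {
         name = $NF
         slash = index(name, "/")
         top = (slash > 0) ? substr(name, 1, slash - 1) : "(root)"
         bytes[top] += $1
         count[top]++
       }
       END { for (t in bytes) print bytes[t], count[t], t }' | sort -rn
}

# Hypothetical rows: size, date, time, name.
printf '%s\n' \
  '  1000  2024-01-01 10:00   vendor/lib/a.js' \
  '  3000  2024-01-01 10:00   vendor/lib/b.js' \
  '   500  2024-01-01 10:00   src/main.js' \
  '     0  2024-01-01 10:00   src/' \
  '    20  2024-01-01 10:00   LICENSE' |
  sum_by_top_folder
```

Each output row is total bytes, entry count, and folder name, largest first, which mirrors the shape of byTopFolder.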

7z without p7zip installed

On a machine with no p7zip, 7z l fails. The analyser reads 7z through libarchive WASM in the browser.

CLI (no p7zip):
  7z l backup.7z  ->  command not found: 7z

Analyser:
  drop backup.7z (engine: libarchive WASM)
  byExtension[0] = { "ext": "sql", "count": 1, "totalSize": 524288000 }

Caveat: libarchive fully decompresses 7z to measure, so this
is RAM-heavier than `7z l` reading only headers.

Where the CLI wins: a 6 GB archive

Above the 2 GB tier ceiling the browser cannot help. Stay on the CLI.

Archive: nightly-dump.tar.gz  (6 GB)

Analyser: rejected (over 2 GB Pro-Media cap)
CLI:      tar -tzvf nightly-dump.tar.gz | awk '{s[...]} ...'
          streams entry headers without loading the archive into RAM

Use the CLI for anything past the tier caps.
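
For reference, a per-extension rollup over a tar listing might look like the sketch below. It is one possible completion of the idea, not necessarily the pipeline abbreviated in the example above. The helper name tar_sum_by_ext is hypothetical, and the code assumes GNU tar's verbose layout, where the third field is the size in bytes; bsdtar prints different columns and would need a different field number.

```shell
# Aggregate bytes per extension from GNU `tar -tv` text on stdin.
# Assumes GNU tar's column layout, where field 3 is the size in bytes.
# Only regular files (mode string starting with "-") are counted.
tar_sum_by_ext() {
  awk '$1 ~ /^-/ && $3 ~ /^[0-9]+$/ {
         n = split($NF, parts, ".")
         ext = (n > 1) ? parts[n] : "(none)"
         bytes[ext] += $3
       }
       END { for (e in bytes) print bytes[e], e }' | sort -rn
}

# Hypothetical listing rows in GNU tar -tv format.
printf '%s\n' \
  '-rw-r--r-- backup/backup 900 2024-01-01 02:00 dump/users.sql' \
  '-rw-r--r-- backup/backup 100 2024-01-01 02:00 dump/schema.sql' \
  'drwxr-xr-x backup/backup   0 2024-01-01 02:00 dump/' \
  '-rw-r--r-- backup/backup  50 2024-01-01 02:00 dump/notes.txt' |
  tar_sum_by_ext
```

With a real archive the input would come from tar -tzvf nightly-dump.tar.gz | tar_sum_by_ext. Because awk keeps only one running total per extension, memory use stays small however large the archive is.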

Where the CLI wins: batch over 5,000 archives

The analyser reads one archive per run. A nightly sweep belongs in a shell loop.

CLI:
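  # Pass each path as "$1" rather than splicing {} into the script text,
  # so file names containing quotes or spaces cannot break the command.
  find /backups -name '*.zip' -print0 |
    xargs -0 -P4 -n1 sh -c 'unzip -l "$1" | tail -1' _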

Analyser:
  one archive per run; for a multi-archive size summary use
  /archive-tools/batch-compression-report on a paid tier.
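
A sweep like this prints one totals row per archive, so a natural next step is to fold those rows into a single figure. The sketch below assumes rows shaped like the last line of unzip -l (a byte count followed by a file count); grand_total is a hypothetical helper name and the sample rows are made up.

```shell
# Add up the trailing totals rows that `unzip -l archive | tail -1` prints,
# one row per archive, into a grand total of bytes, files, and archives.
grand_total() {
  awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { bytes += $1; files += $2; archives++ }
       END { printf "%.0f bytes, %d files, %d archives\n", bytes, files, archives }'
}

# Hypothetical totals rows from three archives.
printf '%s\n' \
  '    16896                     4 files' \
  '  1048576                     1 file' \
  '        0                     0 files' |
  grand_total
```

Rows that do not start with two numbers, such as error messages from unreadable archives, are ignored rather than counted. The byte total is printed with %.0f because some awk builds clip %d at 32 bits.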

Edge cases and what actually happens

Output is JSON, not text columns

By design

Unlike unzip -l, the analyser does not print a fixed-width entry table. It returns grouped JSON. If your downstream parser expects unzip -l columns, you must adapt it — but JSON is far more robust than scraping whitespace-aligned text.

7z/RAR measured by full decompression

RAM cost

7z l reads only the archive header to print sizes; the analyser routes 7z/RAR through libarchive, which decompresses each entry to measure its expanded length. For large 7z archives this uses more memory than the CLI listing — a reason to prefer the CLI on multi-GB 7z files.

Archive larger than the tier cap

Tier limit (rejected)

Above 50 MB Free / 500 MB Pro / 2 GB Pro-Media the analyser rejects the file outright, where unzip -l has no ceiling. Split with /archive-tools/archive-splitter or use the CLI for oversized archives.

Entry count over the cap

Tier limit (rejected)

The analyser also enforces an entry cap (500 Free / 50,000 Pro / 500,000 Pro-Media). unzip -l has none. A ZIP with a million tiny files is a CLI job.
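
Both caps can be checked before you open the browser. The sketch below encodes the size and entry limits quoted above; fits_tier and the tier labels are hypothetical names, and treating 1 MB as 1,048,576 bytes (and 2 GB as 2,048 of those) is an assumption, since the source does not say which definition the tool uses.

```shell
# Decide whether an archive fits a tier before opening the browser.
# Caps are the ones quoted above; 1 MB = 1,048,576 bytes is an assumption.
fits_tier() {
  size_bytes=$1 entries=$2 tier=$3
  case $tier in
    free)      max_mb=50   max_entries=500 ;;
    pro)       max_mb=500  max_entries=50000 ;;
    pro-media) max_mb=2048 max_entries=500000 ;;
    *) echo "unknown tier: $tier" >&2; return 2 ;;
  esac
  max_bytes=$((max_mb * 1024 * 1024))
  if [ "$size_bytes" -le "$max_bytes" ] && [ "$entries" -le "$max_entries" ]; then
    echo "browser"
  else
    echo "cli"
  fi
}

fits_tier 73400320 19 free          # 70 MiB exceeds the Free size cap
fits_tier 73400320 19 pro           # fits both Pro caps
fits_tier 1048576 1000000 pro-media # small file, but too many entries
```

The three calls print cli, browser, and cli. The entry count can be read from the last line of unzip -l, and the size from the file itself.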

du counts disk blocks, analyser counts bytes

Difference to expect

du reports allocated disk blocks (rounded up to the filesystem block size) of the EXTRACTED tree, so its numbers run slightly higher than the analyser's raw uncompressed byte sums. They will not match to the byte — that is expected, not a bug.
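
The rounding is easy to reproduce with integer arithmetic. This is a minimal sketch: allocated_bytes is a hypothetical helper, and the 4096-byte default block size is an assumption, since real filesystems vary and some store tiny files inline.

```shell
# Bytes a file occupies once rounded up to whole filesystem blocks.
# The 4096-byte default block size is an assumption; filesystems vary.
allocated_bytes() {
  size=$1 block=${2:-4096}
  echo $(( (size + block - 1) / block * block ))
}

allocated_bytes 1      # a 1-byte file still occupies one block
allocated_bytes 4096   # exactly one block, nothing wasted
allocated_bytes 4097   # one byte over spills into a second block
```

The calls print 4096, 4096, and 8192. An archive with many small files therefore shows a wider gap between du and the analyser than one with a few large files.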

Compressed size not reported

Use a sibling tool

unzip -l shows both compressed and uncompressed columns; the analyser groups by uncompressed bytes only. For compression ratio per file or overall, use /archive-tools/compression-ratio-calculator.

Nested archives

Counted as one entry

Both unzip -l and the analyser list an inner data.zip as a single entry — neither recurses by default. Extract first with /archive-tools/nested-archive-extractor to look inside.

Mis-named archive

Handled by magic bytes

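Both tools look at content rather than the file name, but only the analyser can act on what it finds. unzip understands ZIP alone, so a report.zip that is really a 7z fails under plain unzip. The analyser detects the 7z signature from the magic bytes and reads the file through libarchive.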
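
Magic-byte detection can be approximated on the CLI by reading the first few bytes of the file. The sketch below is an illustration of the technique, not the analyser's code: detect_format is a hypothetical helper and it checks only the well-known signatures for ZIP local file headers, 7z, RAR, and gzip. An empty ZIP, which starts with a different record, is reported as unknown.

```shell
# Identify an archive by its leading bytes rather than its file name.
detect_format() {
  # Read the first six bytes as a lowercase hex string with no spaces.
  magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
  case $magic in
    504b0304*)    echo "zip" ;;   # "PK" 03 04, ZIP local file header
    377abcaf271c) echo "7z" ;;    # "7z" BC AF 27 1C
    526172211a07) echo "rar" ;;   # "Rar!" 1A 07
    1f8b*)        echo "gzip" ;;  # gzip member header
    *)            echo "unknown" ;;
  esac
}

# A 7z signature saved under a misleading .zip name.
tmpdir=$(mktemp -d)
printf '\067\172\274\257\047\034' > "$tmpdir/report.zip"
detect_format "$tmpdir/report.zip"
rm -rf "$tmpdir"
```

The example prints 7z even though the file is named report.zip, which is the case where plain unzip gives up.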

Frequently asked questions

Does the Size Analyser just wrap unzip?

No. It uses fflate (a pure-JS ZIP/GZIP library) to read the ZIP central directory or, for TAR.GZ, the entry headers inside the gzip stream, and libarchive compiled to WebAssembly for 7z/RAR/bz2/xz. There is no shell-out and no server; everything runs in your browser tab.

When should I prefer the CLI?

Archives over 2 GB, scripted batch runs across thousands of files, CI pipelines, and any case where the archive already lives on a server you control. The CLI has no size or entry ceiling and slots into find -exec loops.

When should I prefer the browser tool?

One-off triage, machines where you cannot install p7zip or run sudo, untrusted input you do not want to extract to disk, and any time you want the by-type and by-folder rollups without writing an awk pipeline.

Is the output interchangeable with CLI tooling?

The analyser emits JSON, not unzip -l text. If a script expects the CLI's column format you will need to adapt it. The upside is that JSON parsing is far more reliable than scraping aligned columns.

Why might du give bigger numbers?

du measures allocated disk blocks of the extracted files, rounded up to the filesystem block size. The analyser sums raw uncompressed bytes from the archive. The two are close but rarely identical — block rounding accounts for the gap.

Does it read 7z and RAR like 7z l does?

Yes, via libarchive WASM. One difference: 7z l reads only the archive header to print sizes, while the analyser decompresses each 7z/RAR entry to measure it. That makes the analyser heavier on RAM for large 7z files.

Can it handle a 10 GB archive like the CLI?

No. The hard ceiling is 2 GB (Pro-Media / Developer). Past that, browser memory is the constraint and you should use the CLI, which streams the listing instead of holding the archive in RAM.

Does it show compressed sizes like unzip -l?

No. The analyser groups by UNCOMPRESSED bytes only. For compressed vs uncompressed comparison and ratios, use /archive-tools/compression-ratio-calculator.

What about counts per type, like a quick wc?

Each byExtension row includes a count alongside totalSize. If you want a count-first view of types use /archive-tools/file-type-breakdown, which is the count-oriented companion.

Can I batch many archives like a shell loop?

Not in this tool — it reads one archive per run. For a multi-archive summary use /archive-tools/batch-compression-report on a paid tier; for true scripted batches the CLI remains the right tool.

Is anything uploaded when I use the browser tool?

No. Like the CLI, it runs locally — just inside your browser instead of your shell. The only network touch is a one-time WASM module fetch for 7z/RAR/bz2/xz support.

Does the analyser find duplicates the way fdupes does?

No. It groups by name and folder, not content hashes. For byte-identical duplicate detection use /archive-tools/redundancy-analyzer, which hashes entries with SHA-256.

Privacy first

Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.
