How to troubleshoot the duplicate file detector
- Step 1: Check the error or empty result first. Note exactly what you see: an error string, zero duplicate groups, a truncated count, or a stall. Each maps to a distinct cause in the table below.
- Step 2: Rule out encryption. If you see "Archive contains encrypted entries... Provide a password to extract", the archive has encrypted entries. This tool has no password input. Decrypt with multi-format-extractor (which accepts a password) and analyze the result.
- Step 3: Check size and entry caps. If the archive is rejected before analysis, it likely exceeds your tier: Pro is 500 MB / 50,000 entries; Pro-media and Developer are 2 GB / 500,000. Free cannot run this tool at all. Split with archive-splitter or upgrade.
- Step 4: Verify the format is readable. "Could not detect or extract archive format" means the bytes match no known signature and a fallback ZIP read failed, usually because of corruption or a non-archive file. Test with archive-integrity-tester; repair a damaged ZIP with corrupted-zip-repair.
- Step 5: Interpret zero or truncated groups. Zero groups for a .gz/.bz2/.xz is correct: they hold one inner file. A capped count means more groups exist than the Top-N slider allows; raise it toward 500.
- Step 6: Handle stalls and WASM blocks. A large archive is CPU-bound and just needs time; keep the tab focused. If 7z/RAR/etc. fail to load, a browser or extension is blocking WebAssembly; allow WASM or use a ZIP/TAR instead.
Symptom to cause to fix
Every common failure mode, its real cause from the code, and the fix.
| Symptom | Cause | Fix |
|---|---|---|
| "Archive contains encrypted entries..." | Tool extracts without a password; no password input exists | Decrypt with multi-format-extractor, then analyze |
| "Could not detect or extract archive format" | Unknown signature + fallback ZIP read failed (corrupt / not an archive) | Verify with archive-integrity-tester; repair with corrupted-zip-repair |
| Archive rejected before analysis | Over tier size/entry cap | Split with archive-splitter or upgrade tier |
| Tool will not start at all | Free tier — tool requires Pro | Upgrade to Pro or higher |
| Zero duplicate groups on a .gz/.bz2/.xz | Single-stream format, one inner file | Use a multi-file container (zip/tar/7z) |
| Group count looks capped | More groups than the Top-N slider value | Raise Top-N toward 500 |
| 7z/RAR/ISO fail to load | Browser blocks WebAssembly (libarchive) | Allow WASM or use ZIP/TAR |
| Seems stuck processing | Large entry count, CPU-bound hashing | Wait; keep tab focused; do not background it |
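The table above can be read as a lookup from an observed error string to a cause and a fix. The following is a minimal TypeScript sketch of that lookup. `Triage`, `RULES`, and `triageError` are hypothetical names used for illustration only, and only the two error messages quoted in this guide are matched.

```typescript
// Hypothetical triage helper: maps an observed error string to the cause and
// fix listed in the table. The matched substrings come from the documented
// messages; the names and structure are illustrative, not the tool's code.
type Triage = { cause: string; fix: string };

const RULES: Array<{ match: string; triage: Triage }> = [
  {
    match: "Archive contains encrypted entries",
    triage: {
      cause: "Tool extracts without a password; no password input exists",
      fix: "Decrypt with multi-format-extractor, then analyze",
    },
  },
  {
    match: "Could not detect or extract archive format",
    triage: {
      cause: "Unknown signature and the fallback ZIP read failed",
      fix: "Verify with archive-integrity-tester; repair with corrupted-zip-repair",
    },
  },
];

// Returns the first matching rule, or undefined for an unrecognized message.
function triageError(message: string): Triage | undefined {
  const rule = RULES.find((r) => message.includes(r.match));
  return rule?.triage;
}
```

Symptoms that produce no error string (zero groups, a capped count, a stall) are not covered by this sketch and still need the table.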
Looks like a bug, is actually correct
Expected behaviors that surprise users.
| Observation | Why it's correct |
|---|---|
| All empty files in one group | Zero-byte files share the same SHA-256; grouped with 0 wasted bytes |
| Same name, different folders, NOT grouped | Grouping is by content; different bytes = different hash |
| Directory entries never appear | Folder markers (paths ending in /) are skipped, not hashed |
| duplicateGroups < real total | Returned count is capped at the Top-N slider value |
| Tool reports but does not delete | Report-only by design; use selective-extractor to act |
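As a rough illustration of the behaviors in the table above, here is a minimal sketch of content-based grouping. It assumes each entry already carries a hex SHA-256 digest, and it assumes wasted bytes are counted as the size of every copy beyond the first. `HashedEntry`, `DuplicateGroup`, and `groupDuplicates` are hypothetical names, not the tool's actual code.

```typescript
// Sketch only. Field names other than perFileSize and wastedBytes are assumptions.
interface HashedEntry {
  path: string;
  size: number;
  sha256: string; // hex digest, assumed to be computed elsewhere
}

interface DuplicateGroup {
  sha256: string;
  paths: string[];
  perFileSize: number;
  wastedBytes: number; // assumed: size of every copy beyond the first
}

function groupDuplicates(entries: HashedEntry[], topN: number): DuplicateGroup[] {
  const byHash = new Map<string, HashedEntry[]>();
  for (const entry of entries) {
    if (entry.path.endsWith("/")) continue; // folder markers are skipped, not hashed
    const bucket = byHash.get(entry.sha256) ?? [];
    bucket.push(entry);
    byHash.set(entry.sha256, bucket);
  }

  const groups: DuplicateGroup[] = [];
  byHash.forEach((bucket, sha256) => {
    if (bucket.length < 2) return; // a single file is not a duplicate group
    const perFileSize = bucket[0].size;
    groups.push({
      sha256,
      paths: bucket.map((e) => e.path),
      perFileSize,
      wastedBytes: perFileSize * (bucket.length - 1),
    });
  });

  // Highest-waste groups first, then cap at the Top-N value.
  groups.sort((a, b) => b.wastedBytes - a.wastedBytes);
  return groups.slice(0, topN);
}
```

Under these assumptions, all zero-byte files share one digest and form a single group with `wastedBytes` of 0, same-named files with different digests stay separate, and the returned list can be shorter than the real total when `topN` is small.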
Tier caps reference
Real limits from lib/tier-limits.ts. The tool's minimum tier is Pro.
| Tier | Max size | Max entries | Can run tool? |
|---|---|---|---|
| Free | 50 MB | 500 | No (requires Pro) |
| Pro | 500 MB | 50,000 | Yes |
| Pro-media | 2 GB | 500,000 | Yes |
| Developer | 2 GB | 500,000 | Yes |
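To make the pre-analysis rejection concrete, the sketch below encodes the table as data and checks an archive against it. The values are copied from the table; the object shape, the names (`TIER_LIMITS`, `checkArchive`), the binary interpretation of MB/GB, and the "strictly greater than" comparison are all assumptions and may not match what lib/tier-limits.ts actually contains.

```typescript
// Hypothetical encoding of the tier table. Not the real lib/tier-limits.ts.
const MB = 1024 * 1024; // assumed binary megabytes

type Tier = "free" | "pro" | "pro-media" | "developer";

interface TierLimit {
  maxBytes: number;
  maxEntries: number;
  canRun: boolean; // whether the tier meets the tool's minimum (Pro)
}

const TIER_LIMITS: Record<Tier, TierLimit> = {
  free: { maxBytes: 50 * MB, maxEntries: 500, canRun: false },
  pro: { maxBytes: 500 * MB, maxEntries: 50_000, canRun: true },
  "pro-media": { maxBytes: 2048 * MB, maxEntries: 500_000, canRun: true },
  developer: { maxBytes: 2048 * MB, maxEntries: 500_000, canRun: true },
};

// Returns a rejection reason, or null when the archive may be analyzed.
function checkArchive(tier: Tier, sizeBytes: number, entryCount: number): string | null {
  const limit = TIER_LIMITS[tier];
  if (!limit.canRun) return "Tool requires Pro or higher";
  if (sizeBytes > limit.maxBytes) return "Archive exceeds the tier size cap";
  if (entryCount > limit.maxEntries) return "Archive exceeds the tier entry cap";
  return null;
}
```

For example, `checkArchive("pro", 600 * MB, 10_000)` returns the size-cap reason, while the same archive passes on the Developer tier.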
Cookbook
Step-by-step diagnostics for the failures people actually hit.
Encrypted ZIP throws on analysis
The tool calls the extractor without a password, so any encrypted entry stops it. Decrypt first, then analyze the cleartext.
Error: Archive contains encrypted entries (e.g. "secret.docx"). Provide a password to extract.
Fix:
1) multi-format-extractor: enter the password, extract
2) re-zip the extracted files (folder-to-zip)
3) drop the plaintext zip into redundancy-analyzer
Zero groups on a .gz file
A bare gzip decompresses to one inner file, so there is nothing to compare. This is correct, not a failure.
Input: backup.gz
Report: duplicateGroups 0, totalWastedHuman "0 B"
Reason: .gz holds a single stream / one file.
To find duplicates, analyze a tar/zip/7z that has many entries.
(If it's actually a .tar.gz, it WILL show inner-file groups.)
Result count capped by the slider
The report shows exactly the Top-N value because the archive has more groups than that. Raise the slider.
pairLimit = 100 -> duplicateGroups: 100 (suspiciously round)
pairLimit = 500 -> duplicateGroups: 327 (the real total)
If duplicateGroups equals your slider value exactly, it is probably capped; raise Top-N to see the rest.
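The heuristic above (a count that equals the slider value is probably capped) can be sketched as a small helper. `suggestPairLimit` is a hypothetical name, and doubling the limit is an arbitrary choice made for this example; the only value taken from this guide is the slider maximum of 500.

```typescript
const MAX_PAIR_LIMIT = 500; // slider maximum stated in this guide

// Returns a larger Top-N value to try when the report looks capped,
// or null when the count is below the limit or the slider is already maxed.
function suggestPairLimit(duplicateGroups: number, pairLimit: number): number | null {
  const looksCapped = duplicateGroups >= pairLimit;
  if (!looksCapped || pairLimit >= MAX_PAIR_LIMIT) return null;
  return Math.min(pairLimit * 2, MAX_PAIR_LIMIT); // doubling is an assumption
}
```

With the numbers from the example, a report of 100 groups at `pairLimit = 100` suggests retrying at 200, and 327 groups at `pairLimit = 500` returns null because the count is below the limit.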
"Could not detect or extract archive format"
The file is not a recognized archive or is corrupt. Confirm integrity, then repair if it is a damaged ZIP.
Error: Could not detect or extract archive format for data.bin
Diagnose:
- Is it actually an archive? (check the real magic bytes)
- archive-integrity-tester to confirm it's readable
- corrupted-zip-repair if it's a damaged .zip
Then retry the analyzer.
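One way to "check the real magic bytes" yourself is to compare the first bytes of the file against well-known format signatures. The sketch below uses the standard published signatures for each format; it is not the tool's own detection code, and `sniffFormat` and `SIGNATURES` are hypothetical names. ISO images are omitted to keep the example short.

```typescript
type SniffedFormat = "zip" | "gzip" | "bzip2" | "xz" | "7z" | "rar" | "tar" | "unknown";

// Well-known signatures; offsets are byte positions from the start of the file.
const SIGNATURES: Array<{ format: SniffedFormat; offset: number; bytes: number[] }> = [
  { format: "zip", offset: 0, bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
  { format: "zip", offset: 0, bytes: [0x50, 0x4b, 0x05, 0x06] }, // empty ZIP
  { format: "gzip", offset: 0, bytes: [0x1f, 0x8b] },
  { format: "bzip2", offset: 0, bytes: [0x42, 0x5a, 0x68] }, // "BZh"
  { format: "xz", offset: 0, bytes: [0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00] },
  { format: "7z", offset: 0, bytes: [0x37, 0x7a, 0xbc, 0xaf, 0x27, 0x1c] },
  { format: "rar", offset: 0, bytes: [0x52, 0x61, 0x72, 0x21, 0x1a, 0x07] }, // "Rar!\x1a\x07"
  { format: "tar", offset: 257, bytes: [0x75, 0x73, 0x74, 0x61, 0x72] }, // "ustar"
];

// Returns the first format whose signature matches, or "unknown".
function sniffFormat(data: Uint8Array): SniffedFormat {
  for (const sig of SIGNATURES) {
    // Out-of-range reads yield undefined, which never equals a signature byte.
    const matches = sig.bytes.every((b, i) => data[sig.offset + i] === b);
    if (matches) return sig.format;
  }
  return "unknown";
}
```

In a browser you could feed it the start of a dropped file, for example `sniffFormat(new Uint8Array(await file.slice(0, 512).arrayBuffer()))`. An "unknown" result for a file named .zip suggests it is corrupt or not an archive at all.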
7z fails but ZIP works
7z/RAR/bz2/xz/ISO need libarchive WASM. If WebAssembly is blocked, those formats fail while fflate formats still work.
Symptom: data.7z fails to load; data.zip analyzes fine.
Cause: browser/extension blocking WebAssembly.
Fix: allow WASM for the site, or convert/extract the 7z elsewhere and analyze a zip/tar copy instead.
Edge cases and what actually happens
Encrypted archive
Rejected: No password input exists; the tool extracts without one, so encrypted entries throw "Archive contains encrypted entries... Provide a password to extract." Decrypt with multi-format-extractor first.
Unrecognized or corrupt file
Failed: If the magic bytes match nothing and the fallback ZIP read fails, you get "Could not detect or extract archive format." Verify with archive-integrity-tester; repair a broken ZIP with corrupted-zip-repair.
Over the tier cap
Rejected: Pro allows 500 MB / 50,000 entries; Pro-media and Developer allow 2 GB / 500,000. An oversized archive is rejected before analysis. Split with archive-splitter or upgrade.
Free-tier account
Blocked: The tool's minimum tier is Pro. Free accounts cannot run it regardless of archive size. Upgrade to Pro.
Single-stream gz/bz2/xz returns no groups
Expected: These hold one inner file, so there is nothing to duplicate. Zero groups is correct. Analyze a multi-file container instead.
All empty files grouped together
Expected: Every zero-byte file shares the SHA-256 of the empty string, so they group with perFileSize: 0 and wastedBytes: 0. Correct behavior, not a bug.
Same-named files not grouped
By design: Grouping is purely by content hash. Two files with the same name in different folders only group if their bytes are identical. Different content = different hash = separate.
Group count capped at the slider value
Truncated: The report returns at most the Top-N value of the highest-waste groups. If duplicateGroups equals your slider setting exactly, raise the slider (max 500) to reveal the rest.
WebAssembly blocked
Failed: 7z/RAR/bz2/xz/ISO need libarchive WASM. A browser policy or extension blocking WebAssembly breaks those formats; ZIP/GZIP/TAR still work via fflate. Allow WASM or use a fflate-supported format.
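To confirm whether WebAssembly is the problem, you can run a small feature check in the browser console. The sketch below tries to compile the smallest valid WebAssembly module (the 8-byte header only). It is a generic detection pattern, not something the tool itself is known to run, and `isWebAssemblyUsable` is a hypothetical name.

```typescript
// Smallest valid WebAssembly module: the "\0asm" magic followed by version 1.
const EMPTY_WASM_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Minimal structural type so the sketch does not depend on DOM typings.
type WasmGlobal = { Module: new (bytes: Uint8Array) => unknown };

function isWebAssemblyUsable(): boolean {
  const wasm = (globalThis as unknown as { WebAssembly?: WasmGlobal }).WebAssembly;
  // Some extensions or policies remove the global entirely.
  if (!wasm || typeof wasm.Module !== "function") return false;
  try {
    new wasm.Module(EMPTY_WASM_MODULE); // throws when compilation is blocked
    return true;
  } catch {
    return false;
  }
}
```

If this returns false, the libarchive-backed formats cannot load in that browser profile, which matches the "7z fails but ZIP works" symptom.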
Large archive seems stuck
Expected: Hashing every entry is CPU-bound and runs in the browser; a 50,000-entry archive takes time. There is no timeout, so keep the tab focused and let it finish rather than reloading.
Frequently asked questions
Why does it say the archive contains encrypted entries?
The tool extracts without a password and has no password field, so any encrypted entry stops it. Decrypt the archive first with multi-format-extractor, then analyze the result.
Why do I get "Could not detect or extract archive format"?
The file is not a recognized archive or is corrupt — its magic bytes matched nothing and a fallback ZIP read failed. Verify with archive-integrity-tester or repair with corrupted-zip-repair.
Why does my .gz show zero duplicates?
A bare .gz holds a single inner file, so there is nothing to compare. Analyze a multi-file container like zip, tar, or 7z. A .tar.gz will show inner-file groups.
Why is the duplicate count exactly 100?
That is the default Top-N slider value. If your archive has more groups, the report is capped to the highest-waste ones. Raise the slider toward 500 to see the rest.
Why are all my empty files in one group?
Zero-byte files share the same SHA-256, so they all match — with zero wasted bytes. This is expected behavior.
Why aren't same-named files grouped?
Grouping is by content, not name. Two files with the same name but different bytes have different SHA-256 digests and stay separate.
Why was my archive rejected before it ran?
It probably exceeds your tier cap: Pro allows 500 MB / 50,000 entries, and Pro-media and Developer allow 2 GB / 500,000. Split it with archive-splitter or upgrade.
Why can't I run it on the Free plan?
The tool's minimum tier is Pro. Free accounts cannot use it at all. Upgrade to Pro or higher.
Why do 7z or RAR files fail while ZIP works?
7z/RAR/bz2/xz/ISO need libarchive WebAssembly. If your browser or an extension blocks WASM, those formats fail; ZIP/GZIP/TAR still work via fflate.
It seems stuck — is it frozen?
Large archives are CPU-bound; hashing thousands of entries takes time and there is no timeout. Keep the tab focused and wait rather than reloading.
Why won't it delete the duplicates it found?
It is report-only by design. To act, keep wanted files with selective-extractor and re-pack with folder-to-zip.
How do I compare two archives instead of finding dupes in one?
Use archive-diff. The redundancy analyzer only inspects a single archive at a time.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.