Troubleshooting the Duplicate File Detector

How to troubleshoot the duplicate file detector

Step 1
Check the error or empty result first — Note exactly what you see: an error string, zero duplicate groups, a truncated count, or a stall. Each maps to a distinct cause in the table below.
Step 2
Rule out encryption — If you see "Archive contains encrypted entries... Provide a password to extract", the archive has encrypted entries. This tool has no password input. Decrypt with multi-format-extractor (which accepts a password) and analyze the result.
Step 3
Check size and entry caps — If the archive is rejected before analysis, it likely exceeds your tier: Pro is 500 MB / 50,000 entries, Pro-media and Developer 2 GB / 500,000. Free cannot run this tool at all. Split with archive-splitter or upgrade.
Step 4
Verify the format is readable — "Could not detect or extract archive format" means the bytes match no known signature and a fallback ZIP read failed — usually corruption or a non-archive file. Test with archive-integrity-tester; repair a damaged ZIP with corrupted-zip-repair.
Step 5
Interpret zero or truncated groups — Zero groups for a .gz/.bz2/.xz is correct — they hold one inner file. A capped count means more groups exist than the Top-N slider allows; raise it toward 500.
Step 6
Handle stalls and WASM blocks — A large archive is CPU-bound and just needs time — keep the tab focused. If 7z/RAR/etc. fail to load, a browser/extension is blocking WebAssembly; allow WASM or use a ZIP/TAR instead.

Symptom to cause to fix

Every common failure mode, its real cause from the code, and the fix.

Symptom	Cause	Fix
"Archive contains encrypted entries..."	Tool extracts without a password; no password input exists	Decrypt with multi-format-extractor, then analyze
"Could not detect or extract archive format"	Unknown signature + fallback ZIP read failed (corrupt / not an archive)	Verify with archive-integrity-tester; repair with corrupted-zip-repair
Archive rejected before analysis	Over tier size/entry cap	Split with archive-splitter or upgrade tier
Tool will not start at all	Free tier — tool requires Pro	Upgrade to Pro or higher
Zero duplicate groups on a .gz/.bz2/.xz	Single-stream format, one inner file	Use a multi-file container (zip/tar/7z)
Group count looks capped	More groups than the Top-N slider value	Raise Top-N toward 500
7z/RAR/ISO fail to load	Browser blocks WebAssembly (libarchive)	Allow WASM or use ZIP/TAR
Seems stuck processing	Large entry count, CPU-bound hashing	Wait; keep tab focused; do not background it
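
The two error strings in the table can be matched mechanically. The following is a minimal sketch, not the tool's own code: the `diagnose` helper and the `KNOWN_ERRORS` list are hypothetical names introduced for illustration, and only the quoted error text and the fixes come from the table above.

```typescript
// Hypothetical lookup: maps an observed error string to the cause and fix
// listed in the table. Names here are illustrative, not part of the tool.
type Diagnosis = { cause: string; fix: string };

const KNOWN_ERRORS: Array<{ pattern: RegExp; diagnosis: Diagnosis }> = [
  {
    pattern: /Archive contains encrypted entries/i,
    diagnosis: {
      cause: "Tool extracts without a password; no password input exists",
      fix: "Decrypt with multi-format-extractor, then analyze",
    },
  },
  {
    pattern: /Could not detect or extract archive format/i,
    diagnosis: {
      cause: "Unknown signature and fallback ZIP read failed",
      fix: "Verify with archive-integrity-tester; repair with corrupted-zip-repair",
    },
  },
];

// Returns undefined when the message is not one of the known error strings.
function diagnose(message: string): Diagnosis | undefined {
  const hit = KNOWN_ERRORS.find((e) => e.pattern.test(message));
  return hit ? hit.diagnosis : undefined;
}
```

Symptoms that produce no error string (zero groups, a capped count, a stall) are not covered by this lookup and still need the table.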

Looks like a bug, is actually correct

Expected behaviors that surprise users.

Observation	Why it's correct
All empty files in one group	Zero-byte files share the same SHA-256; grouped with 0 wasted bytes
Same name, different folders, NOT grouped	Grouping is by content; different bytes = different hash
Directory entries never appear	Folder markers (paths ending in /) are skipped, not hashed
duplicateGroups < real total	Returned count is capped at the Top-N slider value
Tool reports but does not delete	Report-only by design; use selective-extractor to act

Tier caps reference

Real limits from lib/tier-limits.ts. The tool's minimum tier is Pro.

Tier	Max size	Max entries	Can run tool?
Free	50 MB	500	No (requires Pro)
Pro	500 MB	50,000	Yes
Pro-media	2 GB	500,000	Yes
Developer	2 GB	500,000	Yes
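
As a rough illustration of how the caps in this table gate a run, here is a sketch of a pre-flight check. It is not the contents of lib/tier-limits.ts: the `TIER_CAPS` object, the `checkTier` function, the returned strings, and the use of binary megabytes (1 MB = 1,048,576 bytes) are all assumptions made for the example.

```typescript
type Tier = "free" | "pro" | "pro-media" | "developer";

interface TierCap {
  maxBytes: number;
  maxEntries: number;
  canRun: boolean; // whether the tier meets the tool's Pro minimum
}

const MB = 1024 * 1024; // assumption: binary units; the real file may differ
const GB = 1024 * MB;

// Values copied from the table above; the shape of this object is hypothetical.
const TIER_CAPS: Record<Tier, TierCap> = {
  free: { maxBytes: 50 * MB, maxEntries: 500, canRun: false },
  pro: { maxBytes: 500 * MB, maxEntries: 50000, canRun: true },
  "pro-media": { maxBytes: 2 * GB, maxEntries: 500000, canRun: true },
  developer: { maxBytes: 2 * GB, maxEntries: 500000, canRun: true },
};

// Returns "ok" or a short reason the archive would not be analyzed.
function checkTier(tier: Tier, sizeBytes: number, entryCount: number): string {
  const cap = TIER_CAPS[tier];
  if (!cap.canRun) return "blocked: tool requires Pro";
  if (sizeBytes > cap.maxBytes) return "rejected: over size cap";
  if (entryCount > cap.maxEntries) return "rejected: over entry cap";
  return "ok";
}
```

Note that the Free tier is blocked before its size or entry caps are consulted, which matches the "Can run tool?" column.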

Cookbook

Step-by-step diagnostics for the failures people actually hit.

Encrypted ZIP throws on analysis

The tool calls the extractor without a password, so any encrypted entry stops it. Decrypt first, then analyze the cleartext.

Error:
  Archive contains encrypted entries (e.g. "secret.docx").
  Provide a password to extract.

Fix:
  1) multi-format-extractor: enter the password, extract
  2) re-zip the extracted files (folder-to-zip)
  3) drop the plaintext zip into redundancy-analyzer
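
To confirm encryption before reaching for the extractor, you can inspect the ZIP header yourself. The sketch below reads the general-purpose bit flag of the first local file header, where bit 0 marks an encrypted entry in the ZIP format. The function name `firstZipEntryEncrypted` is hypothetical, and this is not how the tool itself detects encryption. It looks only at the first entry, so an archive that mixes plain and encrypted entries may need a full central-directory walk.

```typescript
// Returns true/false for the first entry's encryption flag, or undefined
// when the bytes do not start with a ZIP local file header ("PK\x03\x04").
function firstZipEntryEncrypted(bytes: Uint8Array): boolean | undefined {
  if (bytes.length < 8) return undefined;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Local file header signature, stored little-endian as 0x04034b50.
  if (view.getUint32(0, true) !== 0x04034b50) return undefined;
  // General-purpose bit flag sits at offset 6; bit 0 means "encrypted".
  const flags = view.getUint16(6, true);
  return (flags & 0x0001) !== 0;
}
```

In a browser, the first few bytes can be obtained with `new Uint8Array(await file.slice(0, 8).arrayBuffer())` on a `File` from an input element.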

Zero groups on a .gz file

A bare gzip decompresses to one inner file, so there is nothing to compare. This is correct, not a failure.

Input: backup.gz
Report: duplicateGroups 0, totalWastedHuman "0 B"

Reason: .gz holds a single stream / one file.
To find duplicates, analyze a tar/zip/7z that has many entries.
(If it's actually a .tar.gz, it WILL show inner-file groups.)
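
A quick way to predict the zero-group result is to check the file name. This sketch is filename-based only and uses a hypothetical helper name, `isBareSingleStream`. The tool presumably inspects the bytes rather than the name, so a misnamed file can still behave differently.

```typescript
// True for a bare .gz/.bz2/.xz, false for .tar.gz/.tar.bz2/.tar.xz and
// for anything else. Based purely on the name, not on the file contents.
function isBareSingleStream(filename: string): boolean {
  const lower = filename.toLowerCase();
  const singleStream = [".gz", ".bz2", ".xz"].some((ext) => lower.endsWith(ext));
  // A compressed tar wraps a multi-file container, so it can have duplicates.
  const wrapsTar = /\.tar\.(gz|bz2|xz)$/.test(lower);
  return singleStream && !wrapsTar;
}
```

For example, `isBareSingleStream("backup.gz")` is true (expect zero groups), while `isBareSingleStream("backup.tar.gz")` is false.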

Result count capped by the slider

The report shows exactly the Top-N value because the archive has more groups than that. Raise the slider.

pairLimit = 100  -> duplicateGroups: 100  (suspiciously round)
pairLimit = 500  -> duplicateGroups: 327  (the real total)

If duplicateGroups equals your slider value exactly,
it is probably capped — raise Top-N to see the rest.
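
The capping behavior can be pictured as a sort followed by a slice. This is a sketch under stated assumptions, not the tool's implementation: `topGroups` and `looksCapped` are hypothetical helpers, and the group shape is reduced to the two fields needed here. Only the `pairLimit` and `duplicateGroups` names come from the report above.

```typescript
interface DuplicateGroup {
  hash: string;
  wastedBytes: number;
}

// Keeps the pairLimit highest-waste groups; the input array is not mutated.
function topGroups(groups: DuplicateGroup[], pairLimit: number): DuplicateGroup[] {
  return groups
    .slice()
    .sort((a, b) => b.wastedBytes - a.wastedBytes)
    .slice(0, pairLimit);
}

// Heuristic from the text: a count equal to the slider value is probably capped.
function looksCapped(duplicateGroups: number, pairLimit: number): boolean {
  return duplicateGroups === pairLimit;
}
```

The heuristic can give a false positive when an archive happens to contain exactly `pairLimit` groups, which is why raising the slider is the only sure check.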

"Could not detect or extract archive format"

The file is not a recognized archive or is corrupt. Confirm integrity, then repair if it is a damaged ZIP.

Error: Could not detect or extract archive format for data.bin

Diagnose:
  - Is it actually an archive? (check the real magic bytes)
  - archive-integrity-tester to confirm it's readable
  - corrupted-zip-repair if it's a damaged .zip
Then retry the analyzer.
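
Checking the real magic bytes can be done with a few byte comparisons. The sketch below uses the publicly documented signatures for the common formats. The `sniffFormat` and `matchesAt` names are hypothetical, and the tool's own detection may cover more formats or use a different order.

```typescript
// True when `sig` appears in `bytes` starting at `offset`.
function matchesAt(bytes: Uint8Array, sig: number[], offset: number): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

// Returns a format name, or undefined when no known signature matches.
function sniffFormat(bytes: Uint8Array): string | undefined {
  if (matchesAt(bytes, [0x50, 0x4b, 0x03, 0x04], 0)) return "zip"; // "PK\x03\x04"
  if (matchesAt(bytes, [0x1f, 0x8b], 0)) return "gzip";
  if (matchesAt(bytes, [0x42, 0x5a, 0x68], 0)) return "bzip2"; // "BZh"
  if (matchesAt(bytes, [0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00], 0)) return "xz";
  if (matchesAt(bytes, [0x37, 0x7a, 0xbc, 0xaf, 0x27, 0x1c], 0)) return "7z";
  if (matchesAt(bytes, [0x52, 0x61, 0x72, 0x21, 0x1a, 0x07], 0)) return "rar"; // "Rar!"
  // POSIX tar has no leading signature; "ustar" sits at offset 257.
  if (matchesAt(bytes, [0x75, 0x73, 0x74, 0x61, 0x72], 257)) return "tar";
  return undefined;
}
```

An `undefined` result for a file named like an archive is a strong hint that it is corrupt or mislabeled. An empty ZIP begins with a different signature and is not recognized by this sketch.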

7z fails but ZIP works

7z/RAR/bz2/xz/ISO need libarchive WASM. If WebAssembly is blocked, those formats fail while fflate formats still work.

Symptom: data.7z fails to load; data.zip analyzes fine.
Cause: browser/extension blocking WebAssembly.
Fix: allow WASM for the site, or convert/extract the 7z
     elsewhere and analyze a zip/tar copy instead.
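
To check whether WebAssembly is usable in the current page, you can try compiling the smallest valid module from the browser console. This is a generic probe, not something the tool exposes; `canCompileWasm` is a hypothetical name. It compiles rather than only checking that `WebAssembly` exists, on the assumption that a policy may leave the object defined while refusing compilation.

```typescript
// Returns true when a trivial WebAssembly module can be compiled here.
function canCompileWasm(): boolean {
  // Accessed through globalThis so the sketch does not depend on DOM typings.
  const wasm = (globalThis as any).WebAssembly;
  if (typeof wasm !== "object" || wasm === null) return false;
  try {
    // Smallest valid module: the "\0asm" magic followed by version 1.
    const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
    new wasm.Module(emptyModule);
    return true;
  } catch {
    // Compilation was refused or failed.
    return false;
  }
}
```

If this returns false on the tool's page, the libarchive formats will likely fail to load there, and a ZIP or TAR copy is the practical fallback.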

Edge cases and what actually happens

Encrypted archive

Rejected

No password input exists; the tool extracts without one, so encrypted entries throw "Archive contains encrypted entries... Provide a password to extract." Decrypt with multi-format-extractor first.

Unrecognized or corrupt file

Failed

If the magic bytes match nothing and the fallback ZIP read fails, you get "Could not detect or extract archive format." Verify with archive-integrity-tester; repair a broken ZIP with corrupted-zip-repair.

Over the tier cap

Rejected

Pro allows 500 MB / 50,000 entries; Pro-media and Developer 2 GB / 500,000. An oversized archive is rejected before analysis. Split with archive-splitter or upgrade.

Free-tier account

Blocked

The tool's minimum tier is Pro. Free accounts cannot run it regardless of archive size. Upgrade to Pro.

Single-stream gz/bz2/xz returns no groups

Expected

These hold one inner file, so there is nothing to duplicate. Zero groups is correct. Analyze a multi-file container instead.

All empty files grouped together

Expected

Every zero-byte file shares the SHA-256 of the empty string, so they group with perFileSize: 0 and wastedBytes: 0. Correct behavior, not a bug.

Same-named files not grouped

By design

Grouping is purely by content hash. Two files with the same name in different folders only group if their bytes are identical. Different content = different hash = separate.
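
Content-based grouping, including the empty-file case, can be sketched as follows. This is an illustration under assumptions rather than the tool's code: `groupDuplicates` and the entry shape are hypothetical, each entry is assumed to arrive with its SHA-256 hex digest already computed, and `wastedBytes` is assumed to be the size of every copy beyond the first. The `perFileSize` and `wastedBytes` field names come from the report described above.

```typescript
interface HashedEntry {
  path: string;
  size: number;
  sha256: string; // hex digest, assumed precomputed
}

interface Group {
  sha256: string;
  paths: string[];
  perFileSize: number;
  wastedBytes: number;
}

function groupDuplicates(entries: HashedEntry[]): Group[] {
  const byHash = new Map<string, HashedEntry[]>();
  for (const entry of entries) {
    if (entry.path.endsWith("/")) continue; // folder markers are skipped, not hashed
    const bucket = byHash.get(entry.sha256) ?? [];
    bucket.push(entry);
    byHash.set(entry.sha256, bucket);
  }
  const groups: Group[] = [];
  byHash.forEach((bucket, sha256) => {
    const first = bucket[0];
    if (!first || bucket.length < 2) return; // a single file is not a duplicate
    groups.push({
      sha256,
      paths: bucket.map((e) => e.path),
      perFileSize: first.size,
      // Assumption: every copy after the first counts as waste.
      wastedBytes: (bucket.length - 1) * first.size,
    });
  });
  return groups;
}
```

Two zero-byte files both carry the digest of empty input (e3b0c442...7852b855), so they land in one group with zero waste, while two files named readme.md with different digests never share a bucket.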

Group count capped at the slider value

Truncated

The report returns at most the Top-N value of the highest-waste groups. If duplicateGroups equals your slider setting exactly, raise the slider (max 500) to reveal the rest.

WebAssembly blocked

Failed

7z/RAR/bz2/xz/ISO need libarchive WASM. A browser policy or extension blocking WebAssembly breaks those formats; ZIP/GZIP/TAR still work via fflate. Allow WASM or use a fflate-supported format.

Large archive seems stuck

Expected

Hashing every entry is CPU-bound and runs in the browser; a 50,000-entry archive takes time. There is no timeout, so keep the tab focused (browsers may throttle work in background tabs) and let it finish rather than reloading.

Frequently asked questions

Why does it say the archive contains encrypted entries?

The tool extracts without a password and has no password field, so any encrypted entry stops it. Decrypt the archive first with multi-format-extractor, then analyze the result.

Why do I get "Could not detect or extract archive format"?

The file is not a recognized archive or is corrupt — its magic bytes matched nothing and a fallback ZIP read failed. Verify with archive-integrity-tester or repair with corrupted-zip-repair.

Why does my .gz show zero duplicates?

A bare .gz holds a single inner file, so there is nothing to compare. Analyze a multi-file container like zip, tar, or 7z. A .tar.gz will show inner-file groups.

Why is the duplicate count exactly 100?

That is the default Top-N slider value. If your archive has more groups, the report is capped to the highest-waste ones. Raise the slider toward 500 to see the rest.

Why are all my empty files in one group?

Zero-byte files share the same SHA-256, so they all match — with zero wasted bytes. This is expected behavior.

Why aren't same-named files grouped?

Grouping is by content, not name. Two files with the same name but different bytes have different SHA-256 digests and stay separate.

Why was my archive rejected before it ran?

It probably exceeds your tier cap: Pro is 500 MB / 50,000 entries, higher tiers 2 GB / 500,000. Split it with archive-splitter or upgrade.

Why can't I run it on the Free plan?

The tool's minimum tier is Pro. Free accounts cannot use it at all. Upgrade to Pro or higher.

Why do 7z or RAR files fail while ZIP works?

7z/RAR/bz2/xz/ISO need libarchive WebAssembly. If your browser or an extension blocks WASM, those formats fail; ZIP/GZIP/TAR still work via fflate.

It seems stuck — is it frozen?

Large archives are CPU-bound; hashing thousands of entries takes time and there is no timeout. Keep the tab focused and wait rather than reloading.

Why won't it delete the duplicates it found?

It is report-only by design. To act, keep wanted files with selective-extractor and re-pack with folder-to-zip.

How do I compare two archives instead of finding dupes in one?

Use archive-diff. The redundancy analyzer only inspects a single archive at a time.

Privacy first

Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.
