How to choose between the Filename Sanitiser and command-line cleanup scripts
- Step 1: Decide if you need general extraction or just safe names. If you want to inspect, selectively extract, or repack, a CLI extractor or a sibling JAD tool fits better. If you only want safe names for an untrusted archive, the Sanitiser is the narrow, correct tool.
- Step 2: Check your format. ZIP, GZIP, TAR, 7z, RAR, bzip2, and xz are all readable. The Sanitiser always emits ZIP, so if you need to preserve a 7z or tar.gz container the CLI wins on that one axis.
- Step 3: Weigh privacy. Any upload-based online sanitiser sends your file to a server. The CLI keeps it local; so does this tool, which runs in your browser. For sensitive archives, both local options beat an uploading service.
- Step 4: Run the Sanitiser for the quick path. Drop the archive on Filename Sanitiser, read the Renames count, download <stem>-sanitized.zip.
- Step 5: Or script it for repeatable pipelines. If you sanitise thousands of archives nightly, a local script or the JAD runner (headless-browser automation; there is no REST API for archive tools) is the durable choice.
- Step 6: Verify either way. After extraction, confirm no path escaped your target directory and that file contents are intact. Both approaches should leave payload bytes unchanged.
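The escape check in Step 6 can be sketched in a few lines of Python. This is a minimal illustration, not part of the Sanitiser: the function name `escapes_target` is hypothetical, and it only handles `/`-separated names on a POSIX-style filesystem.

```python
import os

def escapes_target(target_dir: str, entry_name: str) -> bool:
    """Return True if extracting entry_name under target_dir would land outside it."""
    # Resolve both paths so ".." segments and symlinks are taken into account.
    root = os.path.realpath(target_dir)
    dest = os.path.realpath(os.path.join(root, entry_name))
    # The destination is safe only if the target directory is its common ancestor.
    return os.path.commonpath([root, dest]) != root
```

Running this over every entry name before extraction flags the traversal case (`../../config/secret`) and absolute paths, while the sanitised form `_/_/config/secret` passes.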
Approach comparison
Fixing unsafe archive names: browser Sanitiser vs the common command-line and scripted alternatives. 'Safety' refers to traversal + forbidden-char handling out of the box.
| Approach | Traversal fix | Forbidden chars / reserved names | Reads 7z/RAR | Privacy | Setup |
|---|---|---|---|---|---|
| JAD Filename Sanitiser | Yes — .. runs collapse to _ | Yes, fixed rule set | Yes (libarchive WASM) | Local, no upload | None (browser) |
| Modern unzip / bsdtar | Newer versions reject traversal | No — extracts names as-is | bsdtar partial; unzip no | Local | Install / version check |
| 7-Zip (GUI/CLI) | Generally safe on extract | No automatic name fixing | Yes | Local | Install |
| Custom Python/PowerShell | Whatever you code | Whatever you code | With libs | Local | Write + maintain |
| Upload-based online unzip | Varies | Varies | Varies | File leaves your machine | None |
When each one wins
Pick by goal, not by habit.
| Your goal | Best choice | Why |
|---|---|---|
| Make one untrusted ZIP safe fast | JAD Sanitiser | No install, fixed rules, local |
| Selectively extract a few files | selective-extractor or CLI | Sanitiser rewrites all names, doesn't filter |
| Keep the original 7z/tar container | 7-Zip / bsdtar | Sanitiser only outputs ZIP |
| Sanitise thousands nightly | Local script / JAD runner | Automation without a UI click |
| Audit exactly what changed | JAD Sanitiser | Single auditable rule set + rename count |
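For the local-script row above, the repack step might look roughly like the following Python sketch using the standard `zipfile` module. The names `rewrite_names` and `fix` are hypothetical; `fix` stands for whatever name-mapping rules you choose, and nothing here is claimed to match the Sanitiser's internals. Timestamps and permissions are not preserved in this simplified version.

```python
import io
import zipfile
from typing import Callable

def rewrite_names(data: bytes, fix: Callable[[str], str]) -> tuple[bytes, int]:
    """Copy every entry of a ZIP into a new ZIP under fix(name).

    Returns the new archive bytes and how many names changed.
    """
    out = io.BytesIO()
    renamed = 0
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            new_name = fix(info.filename)
            if new_name != info.filename:
                renamed += 1
            # Payload bytes are copied untouched; only the name changes.
            dst.writestr(new_name, src.read(info))
    return out.getvalue(), renamed
```

A nightly job could call this per archive and log the returned count, which plays the same role as the rename count the Sanitiser reports.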
Cookbook
Side-by-side outcomes for the same hostile archive run through each approach.
Traversal entry: Sanitiser vs old unzip
An entry named ../../config/secret. The Sanitiser collapses the dot-runs; a vulnerable legacy unzip would happily write outside the target.
Entry: ../../config/secret
JAD Sanitiser -> _/_/config/secret (cannot escape)
Legacy unzip -> writes to ../../config/secret (escapes!)
Modern unzip -> refuses with a warning (but no rewrite)
Colon in name on Windows
macOS-created entry meeting:notes.txt. The Sanitiser fixes it; plain extractors fail or create an alternate data stream.
Entry: meeting:notes.txt
JAD Sanitiser -> meeting_notes.txt
Windows unzip -> error / ADS surprise
bsdtar -> extracts as-is, may error on NTFS
Reserved name CON.log
Windows refuses to create a file whose stem is a device name. The Sanitiser prefixes it; extractors do not.
Entry: logs/CON.log
JAD Sanitiser -> logs/_CON.log
7-Zip extract -> Windows blocks the write
7z input, normalised to safe ZIP
A 7z whose entries contain forbidden characters. The Sanitiser reads it via libarchive and emits one safe ZIP — no 7-Zip install needed to fix names.
delivery.7z entries:
build|win.exe
build:mac
JAD Sanitiser output (delivery-sanitized.zip):
build_win.exe
build_mac
Scripted equivalent (for scale)
The rough logic you'd write yourself to match the Sanitiser. The point of the tool is not writing this per project.
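# rough equivalent of the fixed rule set
import re

def sanitise(name):
    name = name.replace('\\', '/')                    # normalise separators
    name = re.sub(r'\.\.+', '_', name)                # collapse dot-runs
    name = name.replace('\0', '')                     # drop NUL bytes
    name = re.sub(r'[<>:"|?*\x00-\x1f]', '_', name)   # forbidden chars and control bytes
    # + reserved-name prefix + slash cleanup
    return name
Edge cases and what actually happens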
You need to keep the source format
By design: Every CLI extractor preserves the container; the Sanitiser always outputs ZIP. If keeping 7z/tar.gz matters, sanitise then convert with archive-format-converter.
Sanitiser has no flags to tune the rules
By design: Unlike a script, you cannot change which characters map to _ or pick a different replacement. The rule set is fixed and auditable; if you need custom rules, scripting is the alternative.
Two entries collide after sanitising
Last write wins: A script could suffix duplicates; the Sanitiser overwrites the earlier entry when two names map to the same safe form. Watch the rename count vs entry count.
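If you script this yourself, suffixing duplicates could look like the sketch below. The helper name `suffix_duplicates` and the `_1`, `_2` suffix scheme are assumptions chosen for illustration; the comparison is case-sensitive, so names that differ only by case would still collide on Windows.

```python
import posixpath

def suffix_duplicates(names: list[str]) -> list[str]:
    """Give later duplicates a numeric suffix so no entry overwrites an earlier one."""
    seen: set[str] = set()
    result = []
    for name in names:
        # Split off the extension of the last path component only.
        stem, ext = posixpath.splitext(name)
        candidate, n = name, 1
        while candidate in seen:
            candidate = f"{stem}_{n}{ext}"
            n += 1
        seen.add(candidate)
        result.append(candidate)
    return result
```

For example, `build:mac` and `build|mac` both sanitise to `build_mac`; passing the two sanitised names through this helper keeps both entries.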
Encrypted input
Read error: CLI tools take a password; the Sanitiser does not, so an encrypted ZIP fails. Decrypt first, or use a CLI for that step.
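To find out in advance whether a ZIP will hit this, you can check the encryption bit (bit 0 of each entry's general-purpose flags) with Python's `zipfile`. This is a sketch for ZIP input only; the function name `encrypted_entries` is hypothetical.

```python
import io
import zipfile

def encrypted_entries(data: bytes) -> list[str]:
    """List entry names whose general-purpose flag marks them as encrypted."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # Bit 0 of flag_bits is set when the entry is encrypted.
        return [info.filename for info in zf.infolist() if info.flag_bits & 0x1]
```

An empty list means no entry is flagged as encrypted; a non-empty list means the archive needs decrypting with a CLI first.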
Free-tier 50 MB / 500-entry cap
413 rejected: A CLI has no such cap. For very large or very busy untrusted archives, upgrade the tier (Pro 500 MB / 50,000 entries) or sanitise locally with a script.
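A quick pre-check for ZIP input can tell you whether an archive is likely to fit the free tier before you try it. The sketch below makes two assumptions the page does not confirm: that the 50 MB cap applies to the archive file size, and that "MB" means 1,000,000 bytes. The names `FREE_MAX_BYTES`, `FREE_MAX_ENTRIES`, and `fits_free_tier` are hypothetical.

```python
import io
import zipfile

FREE_MAX_BYTES = 50 * 1000 * 1000  # assumption: cap is on archive size, MB = 10**6 bytes
FREE_MAX_ENTRIES = 500

def fits_free_tier(data: bytes) -> bool:
    """Return True if a ZIP is within the assumed free-tier size and entry caps."""
    if len(data) > FREE_MAX_BYTES:
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return len(zf.infolist()) <= FREE_MAX_ENTRIES
```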
COM5–COM9 / LPT3–LPT9
Known gap: A thorough script would escape all reserved device names; the Sanitiser only covers COM1–4 and LPT1–2 (plus CON/PRN/AUX/NUL). Higher numbers pass through.
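A script that closes this gap could prefix every path component whose stem is a Windows device name, including COM1 through COM9 and LPT1 through LPT9. The sketch below is one way to do it; the `_` prefix mirrors the Sanitiser's output shown earlier, while the function name `prefix_reserved` and the case-insensitive match are assumptions of this example.

```python
import re

# Device names Windows reserves regardless of extension.
RESERVED = re.compile(r"(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])", re.IGNORECASE)

def prefix_reserved(path: str) -> str:
    """Prefix '_' to any path component whose stem is a reserved device name."""
    parts = []
    for part in path.split("/"):
        # The stem is everything before the first dot, e.g. "CON" in "CON.log".
        stem = part.split(".", 1)[0]
        parts.append("_" + part if RESERVED.fullmatch(stem) else part)
    return "/".join(parts)
```

Names that merely start with a reserved word, such as `console.log` or `COM10.txt`, are left alone because the whole stem must match.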
WASM blocked in a locked-down browser
WASM error: 7z/RAR/bz2/xz need WebAssembly; a CLI doesn't. In hardened environments that block wasm, use the CLI for those formats.
You want to inspect, not rewrite
Wrong tool: The Sanitiser rewrites every name and re-zips. To just look inside, use archive-previewer or file-listing-generator.
Frequently asked questions
Is the Sanitiser safer than modern unzip?
It is more proactive: it rewrites traversal sequences rather than only refusing them, and it also fixes forbidden characters and reserved names, which extractors don't. Modern unzip will reject traversal but won't hand you a cleaned archive.
Why use a browser tool instead of a CLI?
No install, a single fixed rule set you can audit, and no version-dependent safety surprises. For one-off untrusted archives it's faster; for huge nightly batches a local script or the runner scales better.
Does the browser tool upload my file like online unzip sites?
No. It runs entirely in your browser via fflate, @zip.js/zip.js, and libarchive WASM. Upload-based 'online unzip' services do send your file to a server — that's the key privacy difference.
Can the CLI read 7z that the Sanitiser also reads?
7-Zip and bsdtar read 7z natively; stock unzip does not. The Sanitiser reads 7z through libarchive WASM and outputs a safe ZIP, so you don't need 7-Zip installed just to fix names.
Can I customise which characters become underscore?
No. The replacement is fixed at _ for all forbidden characters and control bytes. If you need different mapping (e.g. delete instead of replace), a custom script is the route.
Which is better for keeping the original container format?
The CLI. The Sanitiser always emits ZIP. If you must keep 7z or tar.gz, fix names here then convert with archive-format-converter, or do the whole job in 7-Zip/bsdtar.
Does the Sanitiser handle reserved names better than extractors?
Yes for the common ones — it prefixes CON, PRN, AUX, NUL, COM1–4, LPT1–2 so they extract on Windows. Extractors just fail when Windows refuses the write. It does miss COM5+ and LPT3+, which a thorough script would catch.
Is there an API I can call instead of clicking?
Archive tools have no REST API (apiAvailable: false). Automation runs through the JAD runner in a short-lived headless-browser session. For pure scripting flexibility, a local CLI/script is simpler.
How do the size limits compare?
A CLI has no built-in cap. The Sanitiser caps by tier: free 50 MB / 500 entries, Pro 500 MB / 50,000, Pro-media and Developer 2 GB / 500,000. For larger jobs, upgrade or use a script.
What does the Sanitiser NOT do that a CLI does?
It doesn't selectively extract, preview, repack to non-ZIP formats, accept passwords, or let you change the rules. It does exactly one thing: make names safe and re-zip.
Which is more auditable for a security review?
The Sanitiser's rule set is a small fixed transform that's identical for everyone — easy to document. A team script can drift; CLI behaviour varies by version. For a repeatable, reviewable control, the fixed tool is attractive.
Can I combine the Sanitiser with other JAD tools?
Yes — a common chain is archive-integrity-tester to confirm the file is sound, then the Sanitiser to fix names, then path-prefix-remover or empty-folder-pruner to tidy structure.
Privacy first
Every JAD Archive tool runs entirely in your browser using fflate, @zip.js/zip.js, and the libarchive WASM bridge. Your archives never leave your device — verified by zero outbound network requests during processing.