How to compress a scanned PDF using lossless compression
- Step 1: Decide whether you actually need lossless. If you need the scan smaller and can accept some image softening (and losing selectable text), the lossy compressor is the better fit. Choose lossless only when every pixel must be preserved exactly.
- Step 2: OCR first if you want searchable text. Lossless keeps an existing text layer but never adds one. To make a scan searchable, run OCR before compressing so the text layer is included in the output.
- Step 3: Open the lossless compressor. Go to PDF Compress (Lossless) and drop the scan. The header shows its size and page count.
- Step 4: Let it run, and set expectations. The tool runs automatically (no options panel, no scan-type or JBIG2 selector). It copies pages into a new document, clears Producer/Creator, and re-saves with compressed object streams. On a scan, expect a small reduction.
- Step 5: Check the result honestly. If the scan barely shrank, that's expected: lossless can't re-encode the image pixels. If you needed a real drop, that's your signal to switch to the lossy tool.
- Step 6: Verify and download. Open the output at high zoom; the scan should be pixel-identical to the source. Download the file; it's a standard PDF that opens in any standard PDF reader.
What lossless does (and doesn't) do to a scan
Verified against the engine implementation. The crucial point: image pixels are never touched, so scans barely shrink. Claims of JBIG2/JPEG 2000 modes do not apply to this tool.
| Capability | In this tool? | Effect on a scan |
|---|---|---|
| Copy page images through unchanged | Yes | Pixels preserved exactly |
| Re-pack structure (object streams) | Yes | Small saving — scans have little structural slack |
| Clear Producer/Creator metadata | Yes | Tiny saving |
| JBIG2 / JPEG 2000 image re-encoding | No (does not exist in this tool) | n/a (lossy modes would change pixels anyway) |
| Scan-type (B&W vs colour) selector | No — no options panel at all | n/a |
| Downsample / re-encode images for size | No (that's the lossy tool) | Use lossy compress for real scan shrink |
Lossless vs lossy on a scanned PDF
Pick by what matters most: exact fidelity (lossless) or small size (lossy). For most scans where size is the goal, lossy is the right tool.
| Your priority | Use | Size result | Trade-off |
|---|---|---|---|
| Preserve every pixel exactly | Lossless (this tool) | Barely smaller (0-10%) | None — fidelity is the point |
| Make the scan much smaller | Lossy compress | Often 50-90% smaller | Image softening; selectable text lost |
| Hit a hard size cap (e.g. 1 MB) | Lossy compress | Targets the number | Quality/resolution stepped down to fit |
| Make the scan searchable | OCR (then compress) | OCR doesn't shrink it | Adds a text layer; compress separately |
Cookbook
What actually happens when you run lossless on a scan. Sizes are illustrative; the tool reports the real figures.
A 24 MB colour scan barely moves
The honest result: lossless can't re-encode the page images, so the file stays close to its original size. Use this only when you must preserve the scan exactly.
Input: contract-scan.pdf 24 MB, 14 pages (colour scan)
Output: contract-scan.pdf 23 MB, 14 pages
Result: ~4% smaller; pixels preserved; for real shrink use the lossy tool
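The "~4%" figure is simple arithmetic on the two sizes. A minimal sketch of that calculation follows; the function name and the `MB` constant are illustrative and not part of the tool, and whether the tool counts a megabyte as 1024 × 1024 or 10^6 bytes is an assumption that does not affect the percentage.

```python
def reduction_percent(original_bytes: int, output_bytes: int) -> float:
    """Percentage saved relative to the original; negative if the output grew."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return (original_bytes - output_bytes) / original_bytes * 100


# Assumption: binary megabytes. The ratio is the same with 10**6.
MB = 1024 * 1024

saving = reduction_percent(24 * MB, 23 * MB)
print(f"{saving:.1f}% smaller")  # 4.2% smaller
```

A negative result means the output came out larger than the input, which can occasionally happen on a scan with no dead structure to prune.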
The same scan, shrunk with the lossy tool instead
For comparison, when size matters more than exact fidelity, the lossy compressor re-encodes pages as JPEGs and shrinks dramatically.
Input: contract-scan.pdf 24 MB → /pdf-tools/pdf-compress-lossy, target 2 MB
Output: contract-scan.pdf 1.9 MB
Trade-off: image softens; text (if any) no longer selectable
OCR first, then lossless keeps the text layer
Lossless preserves an existing text layer but won't add one. Run OCR before compressing so the searchable layer is included.
1. /pdf-tools/pdf-ocr → adds invisible text layer over the scan
2. /pdf-tools/pdf-compress-lossless → structural rebuild
Result: scan stays full-fidelity AND searchable; size ~unchanged
A scan with leftover structure from a merge
When scans were merged into one PDF, orphaned objects accumulate. Lossless prunes those even though the images themselves don't shrink.
Input: merged-scans.pdf 60 MB, 300 pages (merged batch)
Output: merged-scans.pdf 54 MB, 300 pages
Result: ~10% from dead-object pruning; images untouched
Proving the scan is byte-faithful
For a contract or exhibit scan, confirm the output is pixel-identical before relying on it.
1. Open source + output at 400% on the signature block
2. Pixels match exactly; nothing re-encoded
3. If OCR'd, the text still selects
Confirmed: scan preserved exactly
Edge cases and what actually happens
The scan barely shrinks
By design: This is the expected outcome, not a failure. Lossless compression never re-encodes image pixels, and a scan is essentially all image, so there's little structural slack to reclaim. If you need the scan meaningfully smaller, use lossy compress, which re-renders each page as a JPEG and can hit a target size, at the cost of softer images and the loss of any selectable text.
Expecting a JBIG2 or JPEG 2000 option
Not available: This tool has no JBIG2, JPEG 2000, or any other image-codec option; it does not re-encode images at all. Those modes appear in some desktop scanners and, in their lossy settings, change the pixels (JBIG2's lossy mode is even known for digit-substitution errors). If your aim is image-level scan compression, that's the lossy compressor, which uses JPEG re-encoding.
Looking for a scan-type (B&W / colour) selector
No options panel: The lossless tool runs automatically with no settings. There's no black-and-white vs colour selector, because it doesn't process images by type. It treats every page identically: copy through, re-pack structure. Earlier descriptions suggesting a scan-type choice were inaccurate; this guide reflects the tool's real behaviour.
You want the scan to be searchable
Use OCR first: Lossless preserves an existing text layer but never creates one. A raw scan has no searchable text and compression won't add it. Run OCR first to add an invisible text layer, then compress; the layer is preserved in the output. Note that OCR doesn't reduce file size; it adds searchability.
The scan is digitally signed
Signature invalidated: Re-saving the file breaks the hash a digital signature covers, so a signed scan will no longer validate after compression. For a signed contract scan, keep the original as the authoritative copy. Check status with Verify Signature; to re-sign after processing, use Digital Signature.
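To see why any re-save is enough to break validation, here is a small sketch of how sensitive a digest is to byte changes. It uses placeholder bytes, not a real PDF, and it is not how this tool or any signature validator is implemented. Real PDF signatures hash the byte ranges named in the signature dictionary and wrap the result in a cryptographic container, but the sensitivity shown here is the same.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes standing in for a signed scan (hypothetical content).
signed_bytes = b"placeholder bytes standing in for a signed PDF"

# Any rebuild rewrites the file's structure, so the stored bytes differ
# even when every image pixel is untouched.
resaved_bytes = signed_bytes + b"\n% structure repacked"

print(digest(signed_bytes) == digest(resaved_bytes))  # False
```

Because the digest no longer matches, the only copy that still validates is the untouched original.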
The scan is encrypted / password-protected
Loaded with ignoreEncryption: The engine can open it (encryption ignored on load), but the output is not re-encrypted. If the scan must stay protected, remove the password with Remove Password, compress, then re-apply protection. Don't assume the password carries through the rebuild.
Scan exceeds your tier's size limit
Blocked: Scans are large, and PDF caps by plan are Free 2 MB, Pro 50 MB, Pro+Media 500 MB, Developer 2 GB. A multi-page scan easily exceeds the free 2 MB cap and is blocked with an upgrade prompt. Split it with Split by Range or upgrade, though for size reduction the lossy tool is usually the better answer.
Scan exceeds your tier's page limit
Blocked: Page caps are Free 50, Pro 500, Pro+Media 2,000, Developer 10,000. A long scanned batch over the cap won't process. Break it into smaller files with Split by Range first.
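The size and page caps can be checked before you drop a file. The sketch below encodes the published limits in a lookup table; the names (`TIER_LIMITS`, `limit_problems`) are hypothetical, and two details are assumptions rather than documented behaviour: that a megabyte means 1024 × 1024 bytes, and that a file exactly at the cap is allowed.

```python
from dataclasses import dataclass
from typing import List

MB = 1024 * 1024  # assumption: binary megabytes
GB = 1024 * MB


@dataclass(frozen=True)
class TierLimit:
    max_bytes: int
    max_pages: int


# Caps as listed in this guide.
TIER_LIMITS = {
    "Free": TierLimit(2 * MB, 50),
    "Pro": TierLimit(50 * MB, 500),
    "Pro+Media": TierLimit(500 * MB, 2_000),
    "Developer": TierLimit(2 * GB, 10_000),
}


def limit_problems(tier: str, size_bytes: int, pages: int) -> List[str]:
    """Return which caps ("size", "pages") the scan exceeds for a tier."""
    limit = TIER_LIMITS[tier]
    problems: List[str] = []
    if size_bytes > limit.max_bytes:  # assumption: exactly at the cap is OK
        problems.append("size")
    if pages > limit.max_pages:
        problems.append("pages")
    return problems


# The 24 MB, 14-page contract scan from the cookbook:
print(limit_problems("Free", 24 * MB, 14))  # ['size']
print(limit_problems("Pro", 24 * MB, 14))   # []
```

An empty list means the scan fits the tier; otherwise split it with Split by Range or move up a plan.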
Corrupt scan won't load
Error: If the scanned PDF is malformed or truncated, pdf-lib can't parse it and compression can't run. Repair the structure with Repair PDF first, then retry.
Output slightly larger than the original scan
Possible: On a scan with no dead structure, the object-stream overhead can occasionally make the output marginally larger. The image pixels are preserved exactly either way. If a smaller file is the goal, this confirms lossless is the wrong lever for a scan; use the lossy tool.
Frequently asked questions
How much smaller will my scanned PDF be after lossless compression?
Usually very little — typically 0-10%, and sometimes nothing. A scan is essentially one image per page, and lossless compression never re-encodes image pixels, so there's almost no structural slack to reclaim. If you need a scan meaningfully smaller, the honest answer is to use lossy compress instead, which re-encodes the pages and can shrink a scan by 50-90%.
Does this tool use JBIG2 or JPEG 2000 to compress scans?
No. It has no JBIG2, JPEG 2000, or any other image-codec option; it does not re-encode images at all. It copies the page images through unchanged and only re-packs the file's structure. In their lossy modes those codecs change pixel data (JBIG2's lossy mode can even substitute digits), which is the opposite of lossless. For image-level scan compression, use the lossy compressor.
Is there a setting to choose black-and-white vs colour scans?
No. The lossless tool runs automatically with no options panel — there's no scan-type selector, quality slider, or mode. It treats every page the same way: copy the image through and re-pack the structure. If you've read elsewhere that this tool has a scan-type choice, that was inaccurate; it doesn't process images by type.
Then why would I ever use lossless on a scan?
When preserving every pixel exactly is the whole point — a signed contract scan, a diagnostic-grade image, a legal exhibit, or any record where image fidelity is mandatory. Lossless guarantees the stored scan is byte-faithful while still trimming structural slack and clearing metadata. If fidelity isn't critical and you just want it smaller, the lossy tool is the better choice.
Will lossless compression keep my scan searchable?
It preserves an existing OCR text layer, but it never adds one — a raw scan has no searchable text and compression won't create it. To make a scan searchable, run OCR first; the invisible text layer it adds is preserved when you then compress. Bear in mind OCR adds searchability but doesn't reduce file size.
What's the best way to actually shrink a scanned PDF?
Use lossy compress. It rasterises each page and re-encodes it as a JPEG, and you can set a target size such as 1 MB or 2 MB; the tool then steps quality (and resolution) down until the output lands under it. The trade-offs: images soften at low quality, and because pages become images, selectable text is lost. For most scans where small size is the goal, that's the right tool.
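To make the "target size" idea concrete, here is one simple way such a search could work: try quality levels from high to low and stop at the first one whose output fits. This is an illustrative sketch only. The lossy tool's actual search strategy is not documented here, and `first_quality_under`, the quality ladder, and the stand-in encoder are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def first_quality_under(
    encoded_size: Callable[[int], int],
    target_bytes: int,
    qualities: Sequence[int] = (90, 80, 70, 60, 50, 40, 30, 20, 10),
) -> Optional[int]:
    """Return the highest quality whose encoded size fits the target.

    `encoded_size` is any callable that reports the output size in bytes
    for a given quality setting. Returns None if nothing fits.
    """
    for quality in qualities:
        if encoded_size(quality) <= target_bytes:
            return quality
    return None


def fake_encoded_size(quality: int) -> int:
    """Stand-in encoder: pretends size grows linearly with quality."""
    return quality * 25_000


print(first_quality_under(fake_encoded_size, 1_000_000))  # 40
```

When no quality fits, a real tool would have to fall back to something else, such as lowering resolution, which is why the trade-off table mentions both quality and resolution.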
Is lossless suitable for medical or diagnostic scans?
Yes — when fidelity is the requirement, lossless is correct because it preserves every pixel exactly and never applies any lossy codec. Just understand it won't reduce the file size much, since it can't re-encode the image data. If you're working with DICOM specifically, see the DICOM to PDF tool for converting medical imaging into PDF in the first place.
Should I OCR before or after lossless compression?
Before. Run OCR first to add the searchable text layer, then compress — lossless preserves that layer in the output. If you OCR after compressing you'd have to re-run it on the new file, with no benefit. Order: OCR, then lossless compress.
Is my scan uploaded anywhere?
No. Everything runs in your browser, so the scan — often a contract, ID, or medical record — never leaves your device. Only an anonymous usage counter is recorded server-side if you're signed in, never the document. This matters for scans, which frequently contain sensitive personal information.
Why won't my big scan even load?
It probably exceeds your tier's size or page limit: Free allows 2 MB / 50 pages, Pro 50 MB / 500 pages, Pro+Media 500 MB / 2,000 pages, Developer 2 GB / 10,000 pages. Scans are large, so the free 2 MB cap is hit easily. Split the scan with Split by Range, upgrade, or — since size is usually the real concern with scans — reach for the lossy tool.
Does compressing a scan affect a digital signature on it?
Yes — it invalidates it. Re-saving the file changes the bytes the signature's hash covers, so a signed scan won't validate afterward. Keep the original signed scan as the authoritative copy and compress only unsigned working copies. You can check a signature's state with Verify Signature.
Can I batch-process scanned PDFs automatically?
Yes, via the @jadapps/runner: fetch the schema with GET /api/v1/tools/pdf-compress-lossless, pair the runner once, then POST each scan to 127.0.0.1:9789/v1/tools/pdf-compress-lossless/run. It runs locally so scans never leave your machine. For batches where the goal is smaller files, point the same workflow at the lossy compressor instead — lossless won't shrink scans much.
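A batch loop over a folder of scans might look like the sketch below. Only the host, port, and path come from this guide. Everything else is an assumption to check against the schema returned by GET /api/v1/tools/pdf-compress-lossless: the `http` scheme, sending the raw PDF bytes as the request body, the `application/pdf` content type, the response body being the compressed PDF, and any credentials the pairing step may require.

```python
import urllib.request
from pathlib import Path

# Host, port, and path are from this guide; the http scheme is an assumption.
RUNNER_URL = "http://127.0.0.1:9789/v1/tools/pdf-compress-lossless/run"


def build_run_request(pdf_bytes: bytes, url: str = RUNNER_URL) -> urllib.request.Request:
    """Build (but do not send) a POST for one scan.

    ASSUMPTION: the runner accepts raw PDF bytes as the body. Check the
    tool schema for the real payload shape before relying on this.
    """
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        headers={"Content-Type": "application/pdf"},
    )


def compress_folder(folder: Path, out_dir: Path) -> None:
    """POST every .pdf in `folder` and save each response in `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for scan in sorted(folder.glob("*.pdf")):
        request = build_run_request(scan.read_bytes())
        with urllib.request.urlopen(request) as response:
            # ASSUMPTION: the response body is the compressed PDF itself.
            (out_dir / scan.name).write_bytes(response.read())
```

Calling `compress_folder(Path("scans"), Path("compressed"))` would process each file in turn once the runner is paired and listening. Writing to a separate output folder keeps the originals intact, which matters for signed scans.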
Privacy first
All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.