How to split a large scanned PDF into numbered volumes
- Step 1: Note the total page count from your scanning software. Your scanner app reports the page count. You'll use it to predict the number of volumes (ceil of total ÷ volume size) and to sanity-check the result.
- Step 2: Open the tool and drop the scanned PDF. Load the document into PDF Split (Fixed). The page count appears once the file is read; confirm it matches what the scanner reported.
- Step 3: Set the volume size in Pages per chunk. Enter the pages-per-volume your records policy specifies (e.g. 100) in the single Pages per chunk field. There are no presets; any whole number ≥ 1 works.
- Step 4: Process and verify the volume count. Click Process. The result panel shows Output files = ceil(total ÷ volume size). Confirm it matches your prediction before downloading, so a mistyped volume size doesn't reach your records system.
- Step 5: Download the numbered volumes. Click Download; volumes save as name.split-fixed.1.pdf, name.split-fixed.2.pdf, … (separate files, no zip, ~200 ms apart).
- Step 6: Apply retention labels and OCR if required. Rename each volume to your classification/retention scheme (the tool can't do this for you), run PDF OCR on image-only volumes that need to be searchable, then file them in order.
Volume size vs. volume count for a 1,200-page scan
Records policies set the volume size; this is how it translates to file count. Volume count is ceil(total ÷ pages-per-volume).
| Pages per volume | Volumes produced | Best for | Trade-off |
|---|---|---|---|
| 50 | 24 | Fine-grained retrieval policies | Many files to label and track |
| 100 | 12 | Common records default | Balanced |
| 200 | 6 | Loose page caps | Heavier volumes to open |
| 500 | 3 (last = 200) | Minimal volume sprawl | Large per-volume file size |
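The rows above can be reproduced with a few lines of arithmetic. This is a minimal Python sketch, not code from the tool; the function name `volume_plan` is illustrative.

```python
def volume_plan(total_pages: int, pages_per_volume: int) -> tuple[int, int]:
    """Return (volume count, page count of the last volume)."""
    if total_pages < 1 or pages_per_volume < 1:
        raise ValueError("both values must be whole numbers >= 1")
    # Integer ceiling division: ceil(total ÷ pages-per-volume).
    count = (total_pages + pages_per_volume - 1) // pages_per_volume
    # Every volume before the last is full; the last holds whatever remains.
    last = total_pages - (count - 1) * pages_per_volume
    return count, last


for size in (50, 100, 200, 500):
    count, last = volume_plan(1200, size)
    print(f"{size} pages per volume -> {count} volumes, last holds {last} pages")
```

Comparing the computed count with the Output files figure in the result panel is a quick way to catch a mistyped volume size.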
Scan splitting — preserved vs. not carried over
What survives the split, and what records staff should re-apply per volume.
| Item | After split |
|---|---|
| Scanned image fidelity | Preserved — pages copied, no re-encoding or down-sampling |
| Existing OCR text layer | Preserved — carried into each volume |
| Page order | Preserved — strictly sequential |
| Document-level metadata / bookmarks | Not carried over — re-apply per volume in your DMS |
| Retention / classification naming | Not applied — rename downloads manually |
| File-size-based volume cap | Not honoured directly — estimate via pages |
Cookbook
Volume recipes for scanned records. Volume count is always ceil(total ÷ pages-per-volume).
1,200-page scan into 100-page volumes
A records policy mandates 100-page volumes. 1,200 ÷ 100 = 12 even volumes.
Input: intake-scan.pdf (1,200 pages)
Pages per chunk: 100
Output files panel: 12
intake-scan.split-fixed.1.pdf pages 1-100
...
intake-scan.split-fixed.12.pdf pages 1101-1200
Odd total — last volume is short
A 1,150-page scan at 100 per volume gives 11 full volumes and a 50-page final one — nothing lost.
Input: case-scan.pdf (1,150 pages)
Pages per chunk: 100
Output files panel: 12
...11.pdf pages 1001-1100 (100 pages)
...12.pdf pages 1101-1150 (50 pages, remainder)
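To check which pages should land in which volume, the ranges can be listed ahead of time. The sketch below is an illustration under the assumption that volumes are cut at strictly equal page intervals, as described here; `volume_ranges` is a hypothetical helper, not part of the tool.

```python
def volume_ranges(total_pages: int, pages_per_volume: int) -> list[tuple[int, int]]:
    """Return 1-based inclusive (first_page, last_page) pairs, one per volume."""
    if total_pages < 1 or pages_per_volume < 1:
        raise ValueError("both values must be whole numbers >= 1")
    ranges = []
    for first in range(1, total_pages + 1, pages_per_volume):
        # The final volume is clipped to the last page of the scan.
        last = min(first + pages_per_volume - 1, total_pages)
        ranges.append((first, last))
    return ranges


for number, (first, last) in enumerate(volume_ranges(1150, 100), start=1):
    print(f"case-scan.split-fixed.{number}.pdf pages {first}-{last}")
```

Summing the range lengths gives back the original page count, which confirms that no page is dropped or duplicated.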
OCR image-only volumes after splitting
If the scan has no text layer, split first, then OCR each volume to make it searchable for retrieval.
Step 1: Split (Fixed) -> scan.split-fixed.1..12.pdf
Step 2: for each volume:
PDF OCR /pdf-tools/pdf-ocr (lang: eng)
-> searchable text layer added per volume
Rename to a chain-of-custody scheme
The tool can't apply your naming convention, so batch-rename the downloads to your records ID.
Downloaded: scan.split-fixed.1.pdf ... .12.pdf
Rename to: RM-2026-0481_v01.pdf ... RM-2026-0481_v12.pdf
(use your OS file manager or a batch-rename utility)
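If you prefer a script to a batch-rename utility, something like the following could work. It is a sketch under stated assumptions: the downloads follow the name.split-fixed.N.pdf pattern, the target scheme is `<records ID>_vNN.pdf` as in the example above, and the helper names (`records_name`, `rename_volumes`) are invented for illustration.

```python
import re
from pathlib import Path

# Matches the tool's download pattern, e.g. "scan.split-fixed.7.pdf".
VOLUME_PATTERN = re.compile(r"^(?P<stem>.+)\.split-fixed\.(?P<number>\d+)\.pdf$")


def records_name(download_name: str, records_id: str, width: int = 2) -> str:
    """Map a downloaded volume name to '<records_id>_vNN.pdf'."""
    match = VOLUME_PATTERN.match(download_name)
    if match is None:
        raise ValueError(f"not a split-fixed volume: {download_name}")
    number = int(match.group("number"))
    # Zero-pad the volume number so names sort correctly as plain text.
    return f"{records_id}_v{number:0{width}d}.pdf"


def rename_volumes(folder: Path, records_id: str, dry_run: bool = True) -> list[tuple[str, str]]:
    """Plan (and optionally perform) the renames; returns (old, new) pairs."""
    plan = []
    for path in sorted(folder.iterdir()):
        if VOLUME_PATTERN.match(path.name):
            new_name = records_name(path.name, records_id)
            plan.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return plan


print(records_name("scan.split-fixed.1.pdf", "RM-2026-0481"))
print(records_name("scan.split-fixed.12.pdf", "RM-2026-0481"))
```

Running `rename_volumes` with the default `dry_run=True` only returns the planned pairs, so the mapping can be reviewed before any file is touched. For records with more than 99 volumes, pass a larger `width` to `records_name`.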
Recombine for a disclosure / FOIA request
Need the whole record assembled again? Merge the volumes in numeric order.
Tool: PDF Merge /pdf-tools/pdf-merge
Add: scan.split-fixed.1.pdf ... .12.pdf (in order)
Merge -> single 1,200-page record
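One thing to watch when adding volumes "in order": a listing that sorts names as plain text puts .10.pdf before .2.pdf, because the volume numbers are not zero-padded. The sketch below shows one way to get the numeric order and to spot a missing volume before merging. The helper names are illustrative, and it assumes the original download names are still in place.

```python
import re


def volume_number(filename: str) -> int:
    """Extract N from a name ending in '.split-fixed.N.pdf'."""
    match = re.search(r"\.split-fixed\.(\d+)\.pdf$", filename)
    if match is None:
        raise ValueError(f"not a split-fixed volume: {filename}")
    return int(match.group(1))


def merge_order(filenames: list[str]) -> list[str]:
    """Sort volumes by their numeric suffix rather than as plain text."""
    return sorted(filenames, key=volume_number)


def missing_volumes(filenames: list[str], expected_count: int) -> list[int]:
    """List the volume numbers in 1..expected_count that are not present."""
    present = {volume_number(name) for name in filenames}
    return [n for n in range(1, expected_count + 1) if n not in present]


names = [f"scan.split-fixed.{n}.pdf" for n in (10, 1, 12, 2, 11, 3)]
print(merge_order(names))
print(missing_volumes(names, 12))
```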
Edge cases and what actually happens
Last volume shorter than the rest
By design: When the page count isn't a multiple of the volume size, the final volume holds the remainder. A 1,150-page scan at 100 per volume gives eleven 100-page volumes and one 50-page volume. The result is complete and correct: no scanned page is dropped or duplicated.
Scan fidelity preserved
Preserved: Pages are copied with pdf-lib copyPages, so there is no re-encoding of the scanned image and no down-sampling. Each volume contains the exact image bytes of the original scan, so archival quality is maintained for legal admissibility and long-term retention.
Existing OCR text layer survives
Preserved: Because pages are copied rather than re-rendered, an invisible OCR text layer already present on the scan is carried into the volumes intact, keeping them searchable. (Lossy compression would discard it.) If the scan is image-only, OCR each volume afterward with PDF OCR.
Records policy caps by file size, not pages
Estimate pages: This tool splits by page count only. To meet a megabyte cap, divide the cap by the average scanned-page weight (total MB ÷ total pages) and round down. Colour scans weigh more per page than bitonal ones, so verify the largest output volume against the cap.
Document metadata not carried into volumes
Re-apply per volume: Each volume is a new PDF built from copied pages, so original document-level metadata and bookmarks don't propagate. Apply retention metadata in your DMS per volume, or normalise it with the Metadata Scrubber to a known clean state.
Free tier — scan over 2 MB
Blocked (upgrade prompt): Large scans are almost always over the Free 2 MB ceiling and are blocked at upload with an upgrade prompt. Pro raises this to 50 MB, Pro+Media to 500 MB, and Developer to 2 GB; these are the relevant tiers for multi-hundred-page scans. The block precedes any split.
Free tier — more than 50 pages
Blocked (upgrade prompt): Free caps PDFs at 50 pages, so a scanner's bulk output trips the page block. Pro lifts it to 500 pages, Pro+Media to 2,000, and Developer to 10,000, which is the practical range for splitting large scanned records.
Retention naming not applied automatically
Rename after: Volumes download as name.split-fixed.1.pdf etc. The tool can't apply your classification or retention naming. Rename the downloads (a batch-rename utility helps for many volumes) to your records scheme before filing, e.g. RM-2026-0481_v01.pdf.
Volumes are separate files, not zipped
Expected: Each volume downloads individually with a ~200 ms stagger; there's no zip bundle. Your browser may prompt to allow multiple downloads. Approve it, then file the saved volumes into your records folder structure.
Password-protected scan
Often supported: The PDF is loaded with ignoreEncryption: true, so many protected scans split fine. If pdf-lib can't parse the encryption, remove the password first with PDF Remove Password, then split into volumes.
Frequently asked questions
Will image quality be preserved in each volume?
Yes. Scanned page images are copied with pdf-lib copyPages — there's no re-compression and no down-sampling — so each volume holds the exact image bytes of the original scan. Archival fidelity is maintained, which matters for legal admissibility and long-term retention.
How can I make the volumes searchable?
If the scan already has an OCR text layer, splitting preserves it, so the volumes stay searchable automatically. If it's image-only, run each volume through PDF OCR after splitting to embed a searchable text layer. OCR is per file, so process the volumes one at a time.
My organisation requires specific naming — can I rename volumes?
Yes, after download. Volumes come out as scan.split-fixed.1.pdf, .2.pdf, and so on; the tool can't apply your classification or retention naming. Rename them in your OS file manager or a batch-rename utility to match your scheme, e.g. RM-2026-0481_v01.pdf, before filing.
What if our records policy limits by file size instead of pages?
This tool splits by page count, so estimate a safe pages-per-volume from the average scanned-page weight: total MB ÷ total pages gives MB per page, then your size cap ÷ MB-per-page gives pages, rounded down. Colour scans weigh more per page than bitonal scans, so verify the largest output volume against the cap.
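The estimate can be written out as follows. This is a rough sketch: `pages_for_size_cap` is a hypothetical helper and the figures in the example (a 480 MB, 1,200-page scan against a 25 MB cap) are made up for illustration. Because it works from an average, a run of unusually heavy pages can still push a volume over the cap.

```python
import math


def pages_for_size_cap(total_mb: float, total_pages: int, cap_mb: float) -> int:
    """Estimate pages-per-volume so an average volume stays under cap_mb."""
    if total_mb <= 0 or total_pages < 1 or cap_mb <= 0:
        raise ValueError("sizes must be positive and total_pages >= 1")
    # cap ÷ (total MB ÷ total pages), rearranged to a single division
    # so that round results are not lost to floating-point error.
    pages = math.floor(cap_mb * total_pages / total_mb)
    if pages < 1:
        raise ValueError("the cap is smaller than one average page")
    return pages


# Illustrative numbers only: 480 MB ÷ 1,200 pages = 0.4 MB per page.
print(pages_for_size_cap(total_mb=480, total_pages=1200, cap_mb=25))
```

Treat the result as a starting value for Pages per chunk, then check the size of the largest downloaded volume as described above.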
Does the last volume lose any pages if the total isn't even?
No. The final volume simply contains the leftover pages. A 1,150-page scan at 100 pages per volume produces eleven 100-page volumes and one 50-page volume — all 1,150 pages are accounted for, with nothing dropped or duplicated.
Will document metadata carry into each volume?
Page content carries over, but document-level metadata (Title, Author) and bookmarks generally do not, because each volume is a new PDF built from copied pages. Apply retention metadata per volume in your DMS, or normalise it with the Metadata Scrubber.
How many volumes will I get?
The volume count is the total page count divided by your pages-per-volume, rounded up: ceil(total ÷ size). A 1,200-page scan at 100 pages per volume gives 12 volumes; the result panel shows the exact Output files count before you download, so you can sanity-check it against the scanner's reported total.
Are the scanned records uploaded anywhere?
No. The split runs entirely in your browser and the panel reports "0 bytes uploaded". For sensitive records subject to retention, privacy, or chain-of-custody rules, the file never leaves your machine — there's no third-party copy and no disclosure surface.
Can I recombine volumes for a disclosure or FOIA request?
Yes — PDF Merge recombines the numbered volumes into the full record. Add them in numeric order so the page sequence is restored, then merge and download a single PDF for the requestor.
What if records cross my fixed volume boundaries?
Fixed split cuts at equal page intervals regardless of content, so an individual record can straddle two volumes. If your policy requires one logical record per volume, use PDF Split by Range and specify the exact ranges that match each record's boundaries.
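Before choosing between a fixed split and a range split, it can help to see which records would be cut. The sketch below assumes you already know each record's first and last page; the record names and page numbers are invented for illustration, and `straddling_records` is not a feature of the tool.

```python
def straddling_records(
    records: dict[str, tuple[int, int]], pages_per_volume: int
) -> dict[str, tuple[int, int]]:
    """Map record name -> (first volume, last volume) for records cut by a boundary."""
    if pages_per_volume < 1:
        raise ValueError("pages_per_volume must be >= 1")

    def volume_of(page: int) -> int:
        # Pages 1..size fall in volume 1, size+1..2*size in volume 2, and so on.
        return (page - 1) // pages_per_volume + 1

    cut = {}
    for name, (first_page, last_page) in records.items():
        first_volume, last_volume = volume_of(first_page), volume_of(last_page)
        if first_volume != last_volume:
            cut[name] = (first_volume, last_volume)
    return cut


# Hypothetical record boundaries (1-based, inclusive page numbers).
records = {"A": (1, 80), "B": (81, 130), "C": (131, 200)}
print(straddling_records(records, 100))
```

An empty result means the fixed split respects every record boundary; otherwise the listed records are candidates for PDF Split by Range.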
Is there a faster way to OCR many volumes?
OCR in this suite is per-file, so each volume is processed individually. For very large records projects, you may prefer to OCR the whole scan once (if your tooling allows) before splitting — since the split preserves the resulting text layer into every volume, that avoids re-OCR per file.
How large a scan can I split per job?
Free is limited to 2 MB and 50 pages — too small for a bulk scan. Pro allows 50 MB and 500 pages, Pro+Media 500 MB and 2,000 pages, Developer 2 GB and 10,000 pages. The limit is checked on upload, so an oversized scan is flagged before you set the volume size.
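The tier limits quoted above can be turned into a quick pre-check. This sketch makes two assumptions that the text does not state: 2 GB is treated as 2,000 MB, and a scan exactly at a limit is treated as allowed. The function name `smallest_tier` is illustrative.

```python
from typing import Optional

# (tier, max size in MB, max pages), from the limits listed above.
# Assumption: 2 GB is taken as 2,000 MB.
TIERS = [
    ("Free", 2, 50),
    ("Pro", 50, 500),
    ("Pro+Media", 500, 2_000),
    ("Developer", 2_000, 10_000),
]


def smallest_tier(size_mb: float, pages: int) -> Optional[str]:
    """Return the first tier whose size and page limits both fit, else None."""
    for name, max_mb, max_pages in TIERS:
        # Assumption: limits are inclusive.
        if size_mb <= max_mb and pages <= max_pages:
            return name
    return None


print(smallest_tier(size_mb=30, pages=1200))
```

Both limits apply together, so a 1,200-page scan needs at least Pro+Media even when its file size would fit under the Pro ceiling.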
Privacy first
All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.