How to compare and diff PDF contract revisions
- Step 1: Open the PDF Compare tool. Open PDF Compare / Diff. It is browser-local, which matters for privileged or confidential agreements: nothing is sent anywhere.
- Step 2: Add your prior/template version first. Drop your last-agreed draft or template as file A. It shows in the queue with its name, size, and page count. A's lines are the removed side, so put your baseline first.
- Step 3: Add the counterparty's returned version. Drop the version you received as file B; its new lines become added. The tool takes exactly two files; there is no reorder handle, so add them in the right order.
- Step 4: Click Process 2 files. The Process button enables once both files are queued. There is no options panel (the diff settings are fixed), so just run it.
- Step 5: Read the changed clauses. In the JSON result, textDiff.removed lists your baseline lines that disappeared and textDiff.added lists the counterparty's new wording. The unified-diff report interleaves them with -/+ prefixes so a removed clause sits right above its replacement.
- Step 6: Save the report to the matter file. Click Download to save the comparison as .json. Keep it as a record of the revision you reviewed; pair it with the two PDF versions for a complete change history.
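Once the report is saved, the fields named in Step 5 can be read with a short script. This is a minimal sketch, assuming the downloaded JSON nests the removed and added lists under a textDiff key as described above; the helper names and the file name are hypothetical, not part of the tool.

```python
import json
from pathlib import Path


def load_report(path: str) -> dict:
    """Read a downloaded .json comparison report from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_report(report: dict) -> dict:
    """Collect the clause-level changes from a report.

    Assumes a "textDiff" object with "removed" and "added" lists of
    strings. Missing keys fall back to empty lists.
    """
    text_diff = report.get("textDiff", {})
    removed = text_diff.get("removed", [])
    added = text_diff.get("added", [])
    return {
        "removed_count": len(removed),
        "added_count": len(added),
        "removed": removed,
        "added": added,
    }


if __name__ == "__main__":
    # Illustrative data only; a real run would call
    # load_report("comparison.json") with a hypothetical file name.
    sample = {"textDiff": {"removed": ["old cap wording"], "added": ["new cap wording"]}}
    summary = summarize_report(sample)
    for line in summary["removed"]:
        print("-", line)
    for line in summary["added"]:
        print("+", line)
```

Printing removed entries before added ones mirrors the order in which the unified-diff report presents a changed clause.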
Reading the diff like a redline
How the report's fields map to the contract-review questions you actually have.
| Report field | Contract-review question it answers | How to read it |
|---|---|---|
| textDiff.removed | Which of my terms did they delete or change? | Each entry is a page's worth of your baseline text that no longer matches: the clause(s) on that page were edited or removed |
| textDiff.added | What did they put in instead? | Each entry is the counterparty's new text for that page; pair it with the matching removed line in the report |
| textDiff.report | Show me the change in context | Unified-diff string: - your wording, + their wording, two-space prefix for untouched pages |
| differences (page count) | Did they add or drop pages/clauses? | A page-count-differs note means a page was inserted or removed; check for a smuggled schedule or deleted exhibit |
| differences (page size) | Did they re-paginate the document? | A size-differs note means the page geometry changed; re-pagination can shift where a clause lands |
What contract reviewers should NOT rely on it for
Honest limits so you don't over-trust the diff in a negotiation.
| Reviewer expectation | Reality | Do this instead |
|---|---|---|
| Word-level redline of a clause | Diff is per-page line granularity, not per-word | The page is pinpointed; read the removed/added pair to find the exact word |
| Detect a bolded/highlighted change | Formatting is not compared, only text | Compare visually for emphasis changes; the diff catches the words |
| Compare a Word redline PDF | Embedded Track Changes adds markup text that pollutes the diff | Accept/clean both versions to flat PDFs first, then compare |
| Diff a scanned/wet-ink contract | No text layer → text diff skipped | Run PDF OCR on both, then compare |
| Cryptographic proof of who changed it | The diff shows what changed, not who or when | Use PDF Signature Verify for signed-document integrity |
Cookbook
Realistic contract-revision scenarios and the exact report shape. Clause text is illustrative; the per-page line alignment means a change shows as a removed/added pair for that page.
Counterparty raised the liability cap
They changed a single number deep in a limitation-of-liability clause. The page that holds it shows as removed and re-added — read the pair to spot the number.
textDiff.report:
Page 6 text … indemnification …
- Page 7 … liability shall not exceed the fees paid in the prior 12 months …
+ Page 7 … liability shall not exceed two times the total contract value …
Page 8 text … governing law …
addedCount: 1 removedCount: 1
An entire clause was deleted
The counterparty removed a termination-for-convenience clause and the document is one page shorter.
Result:
pageCountA: 14
pageCountB: 13
differences: [
"Page count differs: 14 vs 13",
"Text differs: 0 line(s) added, 1 line(s) removed"
]
textDiff.removed: ["Page-9 text … either party may terminate for convenience on 30 days …"]
textDiff.added: []
A new schedule was inserted
They appended a pricing schedule. Page count goes up and the new page's text appears as a single added line.
Result:
pageCountA: 11
pageCountB: 12
differences: [
"Page count differs: 11 vs 12",
"Text differs: 1 line(s) added, 0 line(s) removed"
]
textDiff.added: ["Schedule C — Pricing … per-seat fee …"]
textDiff.removed: []
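The differences notes in the two results above follow a regular wording, so they can be turned into numbers for a quick triage. This is a sketch that assumes the notes always match the two formats shown in these examples; any other note (for instance a page-size note, whose exact wording is not shown here) is passed through untouched. The function name is hypothetical.

```python
import re

# Patterns copied from the example notes in this section.
PAGE_COUNT = re.compile(r"Page count differs: (\d+) vs (\d+)")
TEXT_LINES = re.compile(r"Text differs: (\d+) line\(s\) added, (\d+) line\(s\) removed")


def parse_notes(differences: list) -> dict:
    """Convert the human-readable notes into counts.

    page_delta is pages in B minus pages in A, so a negative value
    means the returned version is shorter than the baseline.
    """
    summary = {"page_delta": 0, "lines_added": 0, "lines_removed": 0, "other": []}
    for note in differences:
        pages = PAGE_COUNT.fullmatch(note)
        lines = TEXT_LINES.fullmatch(note)
        if pages:
            summary["page_delta"] = int(pages.group(2)) - int(pages.group(1))
        elif lines:
            summary["lines_added"] = int(lines.group(1))
            summary["lines_removed"] = int(lines.group(2))
        else:
            # Unrecognised wording is kept for manual reading.
            summary["other"].append(note)
    return summary
```

A negative page_delta with removed lines and no added lines matches the deleted-clause scenario; a positive page_delta with only added lines matches the inserted-schedule scenario.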
Clean returned draft — no substantive change
The counterparty only re-saved the PDF (e.g. printed to PDF again) without editing the words. Identical text confirms you can sign the version you already reviewed.
Result:
differences: []
textDiff.identical: true
textDiff.unchanged: 12
textDiff.report: "No text differences — the two documents have identical text content."
Returned contract is a flattened scan
The counterparty printed, signed, and scanned the contract, so it has no text layer. The text diff can't run until you OCR it.
Result:
textDiff.extracted: false
textDiff.report:
"Text layer could not be extracted from one or both PDFs
(e.g. a scanned/image-only document). Run OCR first to
enable the text diff."
Fix: OCR the scan at /pdf-tools/pdf-ocr, then diff against your text-based draft.
Edge cases and what actually happens
Embedded Word Track Changes in the PDF
Noisy diff: If a draft was exported from Word with Track Changes shown, the inserted/deleted markup text gets extracted too and pollutes the comparison. Accept or reject all changes in Word and export a clean flat PDF for both versions, then compare. The diff is most reliable on final, clean PDFs.
Counterparty returned a scanned, signed contract
Structural only: A printed-and-scanned contract is image-only, so textDiff.extracted is false and only the structural diff runs. Run PDF OCR on the scan to add a text layer, then diff it against your text-based draft to see the clause changes.
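When reports are checked by script, the scan case is worth testing for before reading any clause changes. This is a small sketch, assuming the report carries the textDiff.extracted and textDiff.identical flags shown in the Cookbook; the status labels and the function name are made up for illustration.

```python
def text_diff_status(report: dict) -> str:
    """Classify a report before anyone reads the clause changes.

    Returns one of four illustrative labels:
      "needs-ocr"  text layer missing, only the structural diff ran
      "identical"  both documents have the same text
      "changed"    at least one removed or added entry exists
      "unknown"    none of the expected fields were present
    """
    text_diff = report.get("textDiff", {})
    if text_diff.get("extracted") is False:
        return "needs-ocr"
    if text_diff.get("identical") is True:
        return "identical"
    if text_diff.get("removed") or text_diff.get("added"):
        return "changed"
    return "unknown"
```

Checking extracted first matters: an empty removed/added list on a scan means the text diff was skipped, not that the wording is unchanged.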
A clause moved to a different page
Expected: If a clause was relocated, its old page shows as removed and its new page shows as added; the LCS doesn't track moves as moves. Read both entries together: matching text appearing as one removal and one addition usually means a relocation, not new wording.
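Reading both entries together can be partly automated by splitting the removed and added page texts into sentences and checking which sentences appear on both sides. This is a heuristic sketch, not something the tool does itself: the sentence splitter is deliberately naive (it breaks on a full stop or semicolon followed by whitespace, so abbreviations will confuse it), and all names are hypothetical.

```python
import re


def split_sentences(entries: list) -> list:
    """Naively split page texts into whitespace-normalised sentences."""
    sentences = []
    for entry in entries:
        for part in re.split(r"(?<=[.;])\s+", entry.strip()):
            part = " ".join(part.split())
            if part:
                sentences.append(part)
    return sentences


def separate_moves(removed: list, added: list) -> dict:
    """Sort sentences into those on both sides and those on one side only.

    "in_both" holds wording that survived, either in place or relocated.
    The two "only_in_*" lists are the candidates for real edits.
    """
    old = split_sentences(removed)
    new = split_sentences(added)
    old_set, new_set = set(old), set(new)
    return {
        "in_both": [s for s in old if s in new_set],
        "only_in_baseline": [s for s in old if s not in new_set],
        "only_in_returned": [s for s in new if s not in old_set],
    }
```

If both only_in lists come back empty, the removed/added pages contain the same sentences in a different arrangement, which points to a relocation rather than new wording. The comparison is exact, so a single changed character still puts a sentence in the only_in lists.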
Only a defined term's capitalisation changed
Reported: The diff is case-sensitive and exact: changing "the Services" to "the services" on a page makes that page differ. That is correct, since capitalisation can matter in a contract, but expect the whole page to show as a removed/added pair, so scan it for the actual change.
Page re-pagination shifts every page
Heads up: If the counterparty reflowed the document so content shifted across page boundaries, many pages can differ even with few real edits, because each page's joined text changed. Use the page-count and structural notes as your first signal, then read pages in order to separate reflow from substance.
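One quick way to test for reflow without substance is to join all removed entries and all added entries and compare them with whitespace collapsed. This is a heuristic sketch under two assumptions: that both lists come back in page order, and that the only difference a reflow introduces is where the page breaks fall. Hyphenation or running headers added by the reflow would defeat it. The function names are hypothetical.

```python
def normalize(entries: list) -> str:
    """Join page texts and collapse every run of whitespace to one space."""
    return " ".join(" ".join(entries).split())


def looks_like_pure_reflow(removed: list, added: list) -> bool:
    """True when the changed pages hold the same words in the same order.

    An empty removed list returns False, because there is nothing to
    explain as reflow in that case.
    """
    return bool(removed) and normalize(removed) == normalize(added)
```

A True result suggests the page boundaries moved and the wording did not. A False result says nothing more than that at least one word differs, so the pages still need reading in order.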
Files added in the wrong order
By design: File A's lines are reported as removed and file B's as added. Add your baseline first and the returned version second. If additions and deletions look swapped, remove a file with the X and re-queue in the right order; there is no reorder control.
Password-protected contract
May fail: An encrypted contract can block pdfjs text extraction, leaving textDiff.extracted false. Remove the password first with PDF Unlock, then compare. Light encryption may still load for the structural diff.
Contract exceeds the size limit
Limit: Free tier caps each PDF at 2 MB; Pro raises it to 50 MB. A large exhibit-heavy agreement may exceed the free cap; compress it losslessly first with PDF Compress (lossless) or upgrade for the higher limit.
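The two caps can be checked before queuing a file. This is a sketch that assumes "MB" means 1024 × 1024 bytes; the tool may count in decimal megabytes instead, so treat files close to a limit as uncertain. The function name and return labels are hypothetical.

```python
import os

# Assumption: binary megabytes. The tool's exact definition is not stated.
FREE_LIMIT_BYTES = 2 * 1024 * 1024
PRO_LIMIT_BYTES = 50 * 1024 * 1024


def tier_needed(size_bytes: int) -> str:
    """Return the lowest tier whose per-file cap fits the given size."""
    if size_bytes <= FREE_LIMIT_BYTES:
        return "free"
    if size_bytes <= PRO_LIMIT_BYTES:
        return "pro"
    return "too-large"


def tier_for_file(path: str) -> str:
    """Same check for a file on disk (the path is supplied by the caller)."""
    return tier_needed(os.path.getsize(path))
```

Both files have to fit, so run the check on the baseline and on the returned version.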
You wanted a rendered redline document
Not available: Output is a JSON report and a unified-diff report string, not a marked-up PDF with strike-through and accept/reject. For a formal redline to circulate, run the two versions through your word processor's compare feature or use a dedicated legal-redline tool; this tool's job is fast, private change detection.
Comparing across two unrelated contracts
Expected: There is no similarity gate; comparing two different agreements will show nearly all of A as removed and all of B as added. That's accurate, just not useful. This tool is for two versions of the same document.
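Since the tool applies no similarity gate, a reviewer can add a rough one when reading the report. This is a sketch that assumes textDiff.unchanged (seen in the Cookbook's clean-draft result) is the number of baseline lines that matched; that meaning is inferred from the example, not confirmed. The function name is hypothetical and where to draw the line is left to the reviewer.

```python
from typing import Optional


def baseline_survival(report: dict) -> Optional[float]:
    """Fraction of baseline lines that also appear in the returned file.

    Computed as unchanged / (unchanged + removed). Returns None when the
    report holds no line counts at all, for example on a scanned file.
    """
    text_diff = report.get("textDiff", {})
    unchanged = text_diff.get("unchanged", 0)
    removed = len(text_diff.get("removed", []))
    total = unchanged + removed
    if total == 0:
        return None
    return unchanged / total
```

A value near 0 means almost nothing of the baseline survived, which is what two unrelated agreements would produce. A value near 1 is what a lightly edited revision looks like.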
Frequently asked questions
Can I use this to review an NDA a counterparty sent back?
Yes. Drop your template NDA as file A and the returned version as file B, then click Process 2 files. textDiff.removed shows your wording they changed, textDiff.added shows their replacement, and the unified-diff report interleaves them so each removed clause sits above its new version. It all runs in your browser, so the NDA's confidential terms are never uploaded.
Does it produce a redline I can send to the other side?
No. The output is a JSON comparison plus a unified-diff report string (-/+ prefixes) — change detection, not a circulatable strike-through redline. To produce a formal redline, run the two versions through your word processor's document-compare feature, or use a dedicated legal redlining tool. Use this for fast, private review of what changed.
How precise is it — will it show the exact changed words?
It pinpoints the page, not the word. Because pdfjs joins each page's text into one line and the diff is line-level, an edit shows as that page's whole text removed and the new text added. You read the removed/added pair to find the changed words. It reliably tells you which pages were touched and what they now say.
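Finding the changed words inside a removed/added pair can be done with a word-level comparison outside the tool. This is a minimal sketch using Python's standard difflib module; it is not how the tool works internally, and the function name is hypothetical.

```python
from difflib import SequenceMatcher


def word_changes(old: str, new: str) -> list:
    """List the word runs that differ between two page texts.

    Each item is a (baseline_words, returned_words) tuple. An empty
    string on one side means a pure insertion or a pure deletion.
    """
    old_words, new_words = old.split(), new.split()
    # autojunk=False stops difflib from ignoring very frequent words
    # on long pages.
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


if __name__ == "__main__":
    pair = word_changes(
        "Provider shall deliver the Services promptly",
        "Provider shall deliver the services promptly",
    )
    print(pair)  # [('Services', 'services')]
```

Pass one textDiff.removed entry and the textDiff.added entry for the same page. The comparison is case-sensitive, like the tool's own diff, so a capitalisation-only edit is reported.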
What if the contract was returned as a signed scan?
A scanned, signed contract is image-only, so there's no text layer to diff — textDiff.extracted comes back false. Run the scan through PDF OCR to add a searchable text layer, then compare it against your text-based draft. The structural diff (page count, sizes) still works on the scan even before OCR.
Will it catch a clause that was just moved to a different page?
It will flag both pages — the clause's old location shows as removed and its new location as added — but it doesn't label it as a move. When you see the same wording appear once as a removal and once as an addition, that's the signature of a relocation rather than a substantive edit. Read both entries together to confirm.
Does it detect a change that's only formatting, like a bolded obligation?
No. The diff compares the extracted text, so making an obligation bold, highlighting it, or changing its font won't register if the words are unchanged. Page-size changes are detected structurally, but in-text styling is invisible to the comparison. Do a visual pass for emphasis changes.
Are the contract files secure?
Yes. Both PDFs are parsed and diffed entirely in your browser — pdf-lib for structure, pdfjs for text, an in-process LCS for the diff. The result panel confirms 0 bytes uploaded, and nothing is sent to a server. This makes it suitable for privileged or confidential agreements where uploading to a cloud diff service would be a problem.
What about contracts exported from Word with Track Changes still on?
Avoid them. If Track Changes markup is visible in the PDF, the inserted/deleted text gets extracted and clutters the diff. Accept or reject all changes in Word, export a clean flat PDF for both the baseline and the returned version, and compare those. Clean final PDFs give the cleanest clause-level diff.
Can the JSON report serve as evidence of what I reviewed?
It's a useful record of the differences between two specific files at review time, and it's deterministic — re-running on the same two PDFs reproduces it. For a formal, tamper-evident audit trail, store it alongside both source PDFs in your document/matter management system, which provides the versioning and access logging the diff itself does not.
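When filing the report alongside both PDFs, recording a checksum for each file makes it easy to confirm later that the three files still belong together. This is a sketch of one possible approach using SHA-256 from Python's standard library; it is an addition to the workflow described here, not a feature of the tool, and the file names are placeholders.

```python
import hashlib
import json


def build_manifest(files: dict) -> dict:
    """Map each file name to the SHA-256 hex digest of its bytes.

    `files` maps a name to the raw file content as bytes.
    """
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


if __name__ == "__main__":
    # Placeholder content; in practice read each file with open(path, "rb").
    manifest = build_manifest({
        "baseline.pdf": b"...baseline bytes...",
        "returned.pdf": b"...returned bytes...",
        "comparison.json": b"...report bytes...",
    })
    print(json.dumps(manifest, indent=2))
```

The checksums only show that the files are unchanged since the manifest was written. Versioning and access logging still come from the document/matter management system.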
Which version should I drop first?
Drop your prior/last-agreed version (or your template) as file A and the counterparty's returned version as file B. A's unique lines are reported as removed and B's as added, which reads naturally as 'they took out X and put in Y'. There's no reorder control, so if it reads backwards, remove a file and re-add in that order.
How big can a contract PDF be?
Up to 2 MB per file on the free tier and 50 MB per file on Pro. Most pure-text contracts are well under 2 MB; exhibit-heavy agreements with embedded images can be larger. If a file is bloated rather than genuinely large, shrink it first with PDF Compress (lossless).
Is the page-size check actually useful for contracts?
Sometimes. A change from A4 to Letter (or a custom size) is flagged because page geometry shifts where text falls, which can move a clause across a page boundary and disguise an insertion. Treat a size-differs note as a prompt to read the affected pages carefully, especially if the page count also changed.
Privacy first
All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.