Compare Two Versions of a PDF — Free Browser Diff Tool

How to compare two PDF documents to find differences

Step 1
Open the PDF Compare tool: Go to the PDF Compare / Diff tool. It is a multi-file tool, and the dropzone reads "Drop PDF files here". Everything runs in your browser; nothing is uploaded.
Step 2
Add the original (version A) PDF: Drop the first, earlier version. It appears in the queued-files list with its name, size, and page count. Order matters: the first file is treated as A, so lines found only in A are reported as removed.
Step 3
Add the revised (version B) PDF: Drop the second, later version. The tool accepts exactly two files (fileCountLimit: 2), and lines found only in B are reported as added. There is no drag-to-reorder; if you queued them backwards, remove one with the X and re-add in the right order.
Step 4
Click "Process 2 files": The tool does not auto-run for multi-file compares, and the "Process 2 files" button is disabled until both files are present. Click it to run. There is no options panel for this tool; comparison settings are fixed.
Step 5
Read the JSON result: The result panel shows the comparison object, which holds pageCountA, pageCountB, a differences array of human-readable structural notes, and a textDiff block with extracted, added, removed, unchanged, addedCount, removedCount, identical, and a unified-diff report string.
Step 6
Download the diff report: Click Download to save the full comparison as a .json file (named after the input). Attach it to a review thread, or parse it in a script. If textDiff.identical is true, the two documents have identical extractable text.
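
If you parse the downloaded report in a script, a small summary helper is often enough. The sketch below is illustrative only: summarize_report is a hypothetical name, and the inline sample stands in for a downloaded .json file whose fields follow the ones listed in Step 5.

```python
import json


def summarize_report(report: dict) -> str:
    """Return a one-line verdict for a comparison object (hypothetical helper)."""
    text = report.get("textDiff", {})
    pages = f"{report.get('pageCountA')} vs {report.get('pageCountB')} pages"
    # Check `extracted` first: without a text layer, `identical` says nothing useful.
    if text.get("extracted") is False:
        return f"{pages}; text layer missing, run OCR before trusting the text diff"
    if text.get("identical"):
        return f"{pages}; extractable text is identical"
    return f"{pages}; {text.get('addedCount', 0)} added, {text.get('removedCount', 0)} removed"


# Inline stand-in for a downloaded report; a real script would read the .json file.
sample_json = """
{
  "pageCountA": 12,
  "pageCountB": 13,
  "differences": ["Page count differs: 12 vs 13",
                  "Text differs: 4 line(s) added, 2 line(s) removed"],
  "textDiff": {"extracted": true, "identical": false,
               "addedCount": 4, "removedCount": 2, "unchanged": 9}
}
"""
print(summarize_report(json.loads(sample_json)))
```

Checking extracted before identical keeps a scanned, image-only input from being mistaken for a clean match.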

What the comparison report contains

The tool returns a single JSON object. Structural fields come from pdf-lib; the textDiff block comes from a framework-free LCS routine run over text extracted with pdfjs.

Field	Meaning	Example value
`pageCountA` / `pageCountB`	Page count of the first and second file	`12` / `13`
`differences`	Human-readable structural notes: page-count mismatch, per-page size mismatch (>0.5 pt), and a one-line text-changed summary	`["Page count differs: 12 vs 13", "Text differs: 4 line(s) added, 2 line(s) removed"]`
`textDiff.extracted`	Whether a text layer was readable from both files. False means scanned/image-only input	`true`
`textDiff.added` / `removed`	Arrays of the actual line strings present only in B / only in A	`["Page-4 text…"]`
`textDiff.unchanged`	Count of lines (pages) common to both files	`9`
`textDiff.identical`	True when no lines were added or removed	`false`
`textDiff.report`	Unified-diff-style string: removed lines start with `-`, added lines with `+`, unchanged lines with two spaces	`  Page 1 text\n- old line\n+ new line`
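
The report string can be split back into its three groups by prefix. This is a minimal sketch that assumes each marker is followed by one space ("- ", "+ ") and that unchanged lines carry two leading spaces; split_report is a hypothetical helper, not part of the tool.

```python
def split_report(report: str) -> dict[str, list[str]]:
    """Group the lines of a unified-diff-style report by their prefix."""
    buckets: dict[str, list[str]] = {"removed": [], "added": [], "unchanged": []}
    for line in report.split("\n"):
        if line.startswith("- "):
            buckets["removed"].append(line[2:])
        elif line.startswith("+ "):
            buckets["added"].append(line[2:])
        else:
            # Assumed two-space prefix for unchanged lines; tolerate its absence.
            buckets["unchanged"].append(line[2:] if line.startswith("  ") else line)
    return buckets


print(split_report("  Page 1 text\n- old line\n+ new line"))
```

When exact strings matter, prefer the added and removed arrays: a page whose own text begins with "- " or "+ " would be misread by a prefix parser like this one.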

What it compares — and what it does not

Capabilities grounded in the implementation. Where a capability is out of scope, the matching JAD tool is named.

Aspect	Detected?	Notes
Page count change	Yes	`pageCountA` vs `pageCountB`; surfaced in `differences`
Page size change	Yes	Per matched page, flagged when width or height differs by >0.5 pt (e.g. A4↔Letter)
Text additions / deletions	Yes (per page line)	LCS over text where each page is one line; granularity is per-page, not per-word
Word-level / inline redline	No	Output is whole-page lines added/removed, not character-level highlights
Formatting (bold, font, size, colour)	No	Only extracted text strings are compared; pure styling changes are invisible to the diff
Images / graphics / signatures	No	Image content is not diffed. For signature integrity use PDF Signature Verify
Scanned / image-only PDF	Structural only	No text layer → `extracted: false`; run PDF OCR first

Cookbook

Concrete inputs and the exact shape of what the tool returns. The text diff aligns one line per page, so changes are reported page-by-page.

Two versions, identical text, different page size

Someone re-saved an A4 draft as US Letter without touching the words. The text diff is identical, but the structural diff flags the size change — which is exactly the kind of silent drift a visual skim misses.

Input A: report-v1.pdf (A4, 595x842 pt)
Input B: report-v2.pdf (Letter, 612x792 pt)

Result:
  pageCountA: 8
  pageCountB: 8
  differences: [
    "Page 1: size differs (595x842 vs 612x792)",
    ... (one per page)
  ]
  textDiff.identical: true
  textDiff.report: "No text differences — the two documents have identical text content."
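
The per-page size check can be pictured as a tolerance comparison over matched pages. The sketch below is an assumption-laden illustration: size_notes and the list-of-(width, height) input are hypothetical, and the tool's actual number formatting may differ.

```python
TOLERANCE_PT = 0.5  # threshold stated in the report field description


def size_notes(sizes_a, sizes_b, tolerance=TOLERANCE_PT):
    """Return one note per matched page whose width or height differs beyond tolerance."""
    notes = []
    # zip() stops at the shorter list, so only pages present in both files are compared.
    for page_no, ((wa, ha), (wb, hb)) in enumerate(zip(sizes_a, sizes_b), start=1):
        if abs(wa - wb) > tolerance or abs(ha - hb) > tolerance:
            notes.append(f"Page {page_no}: size differs ({wa:g}x{ha:g} vs {wb:g}x{hb:g})")
    return notes


a4 = [(595, 842)] * 2
letter = [(612, 792)] * 2
for note in size_notes(a4, letter):
    print(note)
```

A tolerance rather than strict equality keeps sub-point rounding differences between PDF producers from being flagged as size changes.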

A single edited sentence on page 3

Because pdfjs joins each page into one string, an edit anywhere on page 3 shows as page 3's whole text removed and the new page 3 text added. You see exactly which page changed.

Result:
  textDiff.addedCount:   1
  textDiff.removedCount: 1
  textDiff.report:
      Page 1 text …
      Page 2 text …
    - Old page-3 text … fee is $4,000 …
    + New page-3 text … fee is $4,500 …
      Page 4 text …
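
To see why a one-word edit surfaces as a whole page removed and re-added, here is a minimal sketch of a line-level LCS diff in which each list element is one page's joined text. It is a textbook dynamic-programming LCS, not the tool's own code; lcs_diff and the sample strings are hypothetical.

```python
def lcs_diff(a: list[str], b: list[str]) -> list[str]:
    """Diff two lists of lines; prefixes: '- ' removed, '+ ' added, '  ' unchanged."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append("  " + a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            out.append("- " + a[i])
            i += 1
        else:
            out.append("+ " + b[j])
            j += 1
    out.extend("- " + line for line in a[i:])  # leftover lines only in A
    out.extend("+ " + line for line in b[j:])  # leftover lines only in B
    return out


pages_a = ["Page 1 text", "Page 2 text", "fee is $4,000", "Page 4 text"]
pages_b = ["Page 1 text", "Page 2 text", "fee is $4,500", "Page 4 text"]
print("\n".join(lcs_diff(pages_a, pages_b)))
```

Because equality is tested on whole page strings, the two versions of page 3 never match, so one is emitted as removed and the other as added.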

Version B added a page

B has one extra page. Page count differs, and the new page's text appears as a single added line.

Result:
  pageCountA: 10
  pageCountB: 11
  differences: [
    "Page count differs: 10 vs 11",
    "Text differs: 1 line(s) added, 0 line(s) removed"
  ]
  textDiff.added:   ["Appendix B text …"]
  textDiff.removed: []

Identical files

Comparing a file with itself (or a byte-for-byte copy) confirms a clean baseline.

Result:
  pageCountA: 5
  pageCountB: 5
  differences: []
  textDiff.identical: true
  textDiff.unchanged: 5
  textDiff.report: "No text differences — the two documents have identical text content."

One file is a scan with no text layer

If either PDF has no extractable text (a scanned page image), text extraction degrades gracefully: the structural diff still runs, but the text diff is skipped with a message.

Result:
  pageCountA: 6
  pageCountB: 6
  differences: []
  textDiff.extracted: false
  textDiff.report:
    "Text layer could not be extracted from one or both PDFs
     (e.g. a scanned/image-only document). Run OCR first to
     enable the text diff."

Fix: run both through /pdf-tools/pdf-ocr, then compare again.

Edge cases and what actually happens

Files queued in the wrong order

By design

The first file dropped is A (its lines become removed), the second is B (its lines become added). There is no drag-to-reorder. If your additions and deletions look swapped, remove one file with the X button and re-add the versions in chronological order, then click Process 2 files again.

Scanned / image-only PDF (no text layer)

Structural only

pdfjs returns no text for image-only pages, so textDiff.extracted is false and the report tells you to OCR first. The structural diff (page count, page sizes) still runs. Run both files through PDF OCR to add a text layer, then re-compare.

Only formatting changed (bold, font, colour)

Not detected

The tool compares extracted text strings, not styling. If the words are byte-for-byte the same but one version made a heading bold or changed the font, the text diff reports identical: true. Formatting-only changes are out of scope by design.

A whole page reflowed but the words are the same

Expected

Because each page is compared as a single joined line, two pages with the same words in the same reading order match even if line breaks moved. If reflow changed the pdfjs reading order, the joined strings differ and the page is reported as changed — review the page to confirm it is a layout-only change.
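
The effect of joining a page into one line can be illustrated as follows. This sketch assumes fragments are joined with single spaces and that runs of whitespace are collapsed; the tool's actual joining rule is not documented here, and page_line is a hypothetical helper.

```python
def page_line(text_items: list[str]) -> str:
    """Join one page's text fragments into a single comparable line."""
    words: list[str] = []
    for item in text_items:
        words.extend(item.split())  # assumption: whitespace runs are collapsed
    return " ".join(words)


original = ["The fee is $4,000 and", "is due on signing."]
reflowed = ["The fee is", "$4,000 and is due", "on signing."]
reordered = ["is due on signing.", "The fee is $4,000 and"]

print(page_line(original) == page_line(reflowed))   # same words, same order
print(page_line(original) == page_line(reordered))  # same words, different order
```

Moved line breaks leave the joined string unchanged, while a changed reading order produces a different string and therefore a flagged page.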

Trailing spaces or CRLF vs LF differences

Normalised

Before diffing, each line is normalised: CRLF/CR collapse to LF, the final trailing newline is dropped, and trailing whitespace on every line is trimmed. Leading whitespace is preserved so genuine indentation changes still show. Stray trailing spaces from extraction never produce false positives.
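
The normalisation described above can be sketched in a few lines. normalise_lines is a hypothetical name, and the exact set of characters the tool treats as trailing whitespace is an assumption (Python's default rstrip is used here).

```python
def normalise_lines(text: str) -> list[str]:
    """Unify line endings, drop one final newline, trim trailing whitespace per line."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    if unified.endswith("\n"):
        unified = unified[:-1]  # drop only the final trailing newline
    # rstrip() trims the right side only, so leading indentation survives.
    return [line.rstrip() for line in unified.split("\n")]


print(normalise_lines("  indented  \r\nsecond\t\rthird\n"))
```

Replacing CRLF before lone CR matters: doing it the other way round would turn each CRLF into two line breaks.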

Only one file dropped

Blocked

The Process button stays disabled until two files are present, and the handler throws "Upload two PDF files to compare." if it is run with fewer than two. Add the second file to enable the comparison.

File over the size limit

Limit

Free tier caps each PDF at 2 MB; Pro raises it to 50 MB per file. Large multi-hundred-page documents may also be slow to extract in the browser. Compress first with PDF Compress (lossless) if the file is bloated, or split the section you care about.

Encrypted / password-protected PDF

May fail

Structural loading ignores light encryption, but pdfjs text extraction can fail on a password-protected file, leaving extracted: false. Remove the password first with PDF Unlock or PDF Remove Password, then compare.

Two completely unrelated PDFs

Expected

There is no similarity threshold — the LCS simply finds whatever lines (pages) happen to match. Comparing two unrelated documents typically returns nearly everything as removed (from A) plus everything added (from B). That is correct behaviour, not an error.

You expected a colour-coded side-by-side view

Not available

This tool outputs a JSON report and a unified-diff report string (-/+/space prefixes). It does not render a green/red overlay PDF, and it has no accept/reject controls. Use the JSON or the report text; for a visual page render, open both files in a desktop PDF viewer alongside the report.

Frequently asked questions

Does this tool highlight changes in green and red on the PDF?

No. It returns a JSON report plus a unified-diff report string where removed lines start with "-", added lines with "+", and unchanged lines with two spaces. There is no rendered, colour-coded overlay PDF and no accept/reject buttons. The output is designed to be read or parsed, not visually marked up on the page.

How fine-grained is the text comparison — word level or line level?

It is a line-level LCS diff, and pdfjs joins all the text on a page into a single string. So in practice the comparison aligns one line per page: a change anywhere on a page surfaces as that page's whole text removed and the new text added. This reliably tells you which pages changed; it is not a word-by-word inline redline.

Can I export the diff as an annotated PDF?

No — the download is a .json file containing the full comparison object (page counts, structural differences, and the textDiff block including the unified-diff report). There is no annotated-PDF export. If you need a marked-up PDF, paste the report text into a desktop tool, or render both pages side by side manually.

Will it work on scanned PDFs?

Only for the structural part. A scanned/image-only PDF has no text layer, so pdfjs extracts nothing and textDiff.extracted comes back false with a message to run OCR first. Run both files through PDF OCR to add a searchable text layer, then compare again to get a real text diff.

Are formatting changes like bold or a font swap detected?

No. The diff compares extracted text strings only. If the words are identical but one version changed a heading to bold, used a different font, or changed text colour, the text diff reports the documents as identical. Page size changes are detected (structural diff), but in-text styling is not.

Does the order I add the files in matter?

Yes. The first file dropped is treated as A and the second as B. A's unique lines are reported as removed, B's unique lines as added. There is no reorder control, so add the older version first and the newer version second. If the result looks inverted, remove a file and re-add them in chronological order.

What exactly is in the differences array?

Human-readable structural notes: a "Page count differs: X vs Y" entry when counts don't match, a "Page N: size differs (WxH vs WxH)" entry per page whose width or height differs by more than 0.5 pt, and a one-line "Text differs: N line(s) added, M line(s) removed" summary when the text isn't identical. The detailed text lives in the textDiff block.
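
If a script needs these notes in structured form, they can be matched against the three patterns described above. This is a sketch built on the example strings shown in this guide; the regular expressions, PATTERNS, and classify are assumptions, and the numeric format of page sizes in real output may differ.

```python
import re

# Patterns inferred from the example notes in this guide (assumed, not guaranteed).
PATTERNS = {
    "page_count": re.compile(r"^Page count differs: (\d+) vs (\d+)$"),
    "page_size": re.compile(
        r"^Page (\d+): size differs \(([\d.]+)x([\d.]+) vs ([\d.]+)x([\d.]+)\)$"
    ),
    "text": re.compile(r"^Text differs: (\d+) line\(s\) added, (\d+) line\(s\) removed$"),
}


def classify(note: str) -> tuple[str, tuple[str, ...]]:
    """Return (kind, captured values) for one entry of the differences array."""
    for kind, pattern in PATTERNS.items():
        match = pattern.match(note)
        if match:
            return kind, match.groups()
    return "unknown", ()


for entry in ["Page count differs: 10 vs 11",
              "Page 1: size differs (595x842 vs 612x792)",
              "Text differs: 1 line(s) added, 0 line(s) removed"]:
    print(classify(entry))
```

Returning "unknown" rather than raising keeps the script usable if the wording of a note ever changes.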

Are my documents uploaded to a server?

No. Both PDFs are read and compared entirely in your browser — pdf-lib for structure, pdfjs for text, and a framework-free LCS routine for the diff. The result panel confirms 0 bytes uploaded. Only an anonymous usage counter is recorded when you are signed in; the document content never leaves your device.

Why do two visually similar pages show as changed?

The diff compares the text pdfjs reads, in the order it reads it. If text reflowed enough to change that reading order, or invisible characters differ, the joined page strings won't match and the page is flagged. Open the page and compare visually; if the words are truly identical, treat it as a layout/extraction-order difference rather than a content edit.

What's the maximum file size I can compare?

Free tier allows up to 2 MB per PDF; Pro raises it to 50 MB per file. Very long documents can also be slow to extract in the browser. If a file is over the limit because of bloat rather than real content, shrink it first with PDF Compress (lossless).

Can I compare a PDF against a Word doc or a plain text file?

Not directly — this tool takes two PDFs. Convert the other document to PDF first, or pull both documents' text and diff that. To get just the text from a PDF for an external diff, use PDF to Text, which extracts the same pdfjs text layer this tool compares.

Can I run this comparison from a script or API?

Yes. pdf-diff is exposed as a tool slug, so you can pair the JAD runner and POST both PDF buffers to the local runner endpoint to get the same JSON object back — page counts, structural differences, and the textDiff block — without anything leaving your machine.

Privacy first

All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.

