Scrub PDF Metadata Before Public Release — Free Online

How to scrub metadata from a PDF before public release

Step 1
Redact visible content first — Metadata scrubbing comes near the end of the chain, not the start. Black out names, emails, and sensitive text on the page with pdf-pii-redactor before you scrub the metadata; redaction changes the page content, and it is what regulators actually inspect.
Step 2
Remove comments and markup — Run pdf-annotation-remover to strip sticky notes, highlights, and reviewer names. These carry author identities the metadata scrubber does not reach.
Step 3
Flatten form fields if any — If the document has interactive fields, flatten them with pdf-flatten so field values become static content and can't be edited or leak default values.
Step 4
Drop the prepared PDF onto the scrubber — Now add the file here. It loads locally with pdf-lib and the scrub runs automatically — there is no options panel to set. All eight document-info fields are cleared.
Step 5
Drop XMP by re-saving losslessly — Because this tool does not rewrite the XMP packet, finish by re-saving through pdf-compress-lossless, which rebuilds the document and removes the stale XMP metadata. Then download.
Step 6
Audit before you publish — Open the final file in Acrobat (File → Properties → Description and Custom tabs) or run exiftool final.pdf. Confirm Author/Creator/Producer/Title are blank, dates read 1970-01-01, and no XMP author/date remains.

The pre-publication checklist — and which tool owns each item

A metadata scrub alone does not make a document safe to publish. This is the full chain; the metadata scrubber owns one row.

Risk before release	Owned by this tool?	Tool to use
Author / Creator / Producer in Document Properties	Yes	This tool (pdf-metadata-scrubber)
Title / Subject / Keywords naming internal projects	Yes	This tool
Creation / modification timeline	Yes (reset to epoch)	This tool
XMP metadata packet (dc:creator, xmp:CreateDate)	No	Re-save via pdf-compress-lossless
Names/emails in visible page text	No	pdf-pii-redactor
Comments, sticky notes, reviewer markup	No	pdf-annotation-remover
Interactive form field values	No	pdf-flatten

Document-info fields after a public-release scrub

Every field below is processed in the single pass; text fields are emptied and dates are reset.

Field	What it typically leaks	After scrubbing
`/Author`	The official who drafted the document	Empty
`/Creator`	Authoring app / template owner	Empty
`/Producer`	PDF library and version (toolchain fingerprint)	Empty
`/Title`	Internal working title or codename	Empty
`/Subject`	Classification line or internal brief text	Empty
`/Keywords`	Project tags	Cleared
`/CreationDate` + `/ModDate`	Drafting and last-edit timeline	Reset to 1970-01-01T00:00:00Z
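
The table above can be read as a simple field mapping. The following is a minimal Python sketch of that mapping, assuming the info dictionary is represented as a plain dict keyed by the PDF names in the table. `scrub_info` and `pdf_date` are hypothetical helpers for illustration, not the tool's own code (which runs in the browser with pdf-lib). The epoch is written in the PDF date-string form `D:YYYYMMDDHHmmSSZ`.

```python
from datetime import datetime, timezone

# The eight document-info fields named in the table above.
TEXT_FIELDS = ("Title", "Author", "Subject", "Keywords", "Creator", "Producer")
DATE_FIELDS = ("CreationDate", "ModDate")

EPOCH = datetime.fromtimestamp(0, tz=timezone.utc)


def pdf_date(moment: datetime) -> str:
    """Format a datetime as a UTC PDF date string, e.g. D:19700101000000Z."""
    return moment.astimezone(timezone.utc).strftime("D:%Y%m%d%H%M%SZ")


def scrub_info(info: dict) -> dict:
    """Return a copy with text fields emptied and both dates reset to the epoch.

    Keys outside the eight standard fields pass through unchanged in this
    sketch. That is an assumption: the source does not say how the tool
    treats custom properties.
    """
    scrubbed = dict(info)
    for name in TEXT_FIELDS:
        scrubbed[name] = ""
    for name in DATE_FIELDS:
        scrubbed[name] = pdf_date(EPOCH)
    return scrubbed
```

Calling `scrub_info({"Author": "Dr A. Researcher"})` yields a dict in which every text field is an empty string and both dates read `D:19700101000000Z`.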

Cookbook

Practical release workflows. The metadata-scrubber step is shown in context with the redaction, annotation, and XMP steps that surround it.

FOI response — full release chain

A document released under freedom-of-information law must withhold both the exempt content visible on the page and the hidden metadata. Metadata scrubbing comes after redaction and annotation removal, and is followed only by the lossless re-save and the final audit.

1. pdf-pii-redactor       -> black out exempt names/addresses
2. pdf-annotation-remover -> remove caseworker comments
3. pdf-metadata-scrubber  -> clear Author/Creator/dates  (this tool)
4. pdf-compress-lossless  -> drop XMP packet, finalise
5. exiftool final.pdf     -> audit: all fields blank?

Before/after Document Properties for a published report

A research report's pre-release metadata named the lead author and the lab's template. After scrubbing, Document Properties is empty.

Before (Acrobat → Description):
  Author:   Dr A. Researcher
  Creator:  LabReport-Template-v4
  Producer: Microsoft: Print To PDF
  Title:    Q2-internal-draft
  Created:  2026-04-09

After scrubbing:
  Author/Creator/Producer/Title:  (all empty)
  Created/Modified:               1970-01-01 UTC

The trap: scrubbed metadata but the dossier author is in XMP

The classic public-release mistake. The info-dictionary Author is blank, but the XMP packet still carries dc:creator with the real name — which is exactly what investigators read first.

After metadata-scrubber only:
  Info dict Author:  (empty)        ✓
  XMP dc:creator:    'A. Civil Servant'   ✗ STILL THERE

Fix: re-save via pdf-compress-lossless to drop XMP,
then confirm with: exiftool -XMP-dc:Creator final.pdf
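
For a quick check without ExifTool, the trap above can be sketched in Python as a raw byte scan. This assumes the XMP packet is stored uncompressed in the file, which is common but not guaranteed, so an empty result is not proof of a clean file. `xmp_creators` is a hypothetical helper name.

```python
import re

# An XMP packet is wrapped in <?xpacket begin=...?> ... <?xpacket end=...?>.
XMP_PACKET = re.compile(rb"<\?xpacket begin=.*?<\?xpacket end=[^>]*\?>", re.DOTALL)
DC_CREATOR = re.compile(rb"<dc:creator>(.*?)</dc:creator>", re.DOTALL)
RDF_LI = re.compile(rb"<rdf:li[^>]*>(.*?)</rdf:li>", re.DOTALL)


def xmp_creators(pdf_bytes: bytes) -> list:
    """Return dc:creator names found in any uncompressed XMP packet."""
    names = []
    for packet in XMP_PACKET.findall(pdf_bytes):
        for block in DC_CREATOR.findall(packet):
            for item in RDF_LI.findall(block):
                names.append(item.decode("utf-8", errors="replace").strip())
    return names
```

A non-empty list after the scrub means the lossless re-save is still outstanding. ExifTool remains the authoritative check, because it also reads packets this scan cannot see.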

Consultation paper with reviewer comments

Internal review comments were left on the draft. Metadata scrubbing won't remove them — they carry reviewer names and must be stripped separately before publication.

Metadata after scrub:  clean  ✓
Sticky notes:          'Legal: soften para 12 - JR'  ✗

Fix order: pdf-annotation-remover  ->  pdf-metadata-scrubber
Never publish before removing the markup layer.

Final audit command

One ExifTool command surfaces both the info-dictionary and XMP metadata so you can sign off the release.

$ exiftool -G1 -a -s final.pdf | grep -Ei 'author|creator|producer|date|title|subject|keywords'
[PDF]      Author:        (blank)
[PDF]      Creator:       (blank)
[PDF]      CreateDate:    1970:01:01 00:00:00Z
[XMP-dc]   (no Creator line)   <- good, XMP also clean
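
The same sign-off can be scripted. The sketch below assumes ExifTool's JSON output (`exiftool -j -G1 final.pdf`), where each key carries its group prefix, such as `PDF:Author` or `XMP-dc:Creator`. The `audit` function and its rules are illustrative, not part of any tool mentioned here.

```python
import json

EPOCH_PREFIX = "1970:01:01 00:00:00"
TEXT_TAGS = {"Author", "Creator", "Producer", "Title", "Subject", "Keywords"}
XMP_WORDS = ("creator", "author", "date")


def audit(exiftool_json: str) -> list:
    """List metadata problems found in JSON text from `exiftool -j -G1`."""
    problems = []
    for record in json.loads(exiftool_json):
        for key, value in record.items():
            # Keys look like "PDF:Author"; keys without a group yield tag == "".
            group, _, tag = key.partition(":")
            text = str(value)
            if group == "PDF" and tag in TEXT_TAGS and text.strip():
                problems.append(f"{key} is not blank: {value!r}")
            elif group == "PDF" and tag.endswith("Date") and not text.startswith(EPOCH_PREFIX):
                problems.append(f"{key} is not the epoch: {value!r}")
            elif group.startswith("XMP") and any(w in tag.lower() for w in XMP_WORDS):
                problems.append(f"{key} survives in XMP: {value!r}")
    return problems
```

An empty list corresponds to the clean output shown above. Anything else names the field that still needs attention.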

Edge cases and what actually happens

XMP author/date survives the scrub

XMP not rewritten

This is the single most important caveat for public release: the tool clears the classic document-info dictionary but does not rewrite the XMP metadata packet. A name in XMP dc:creator or a real xmp:CreateDate will survive — and these are exactly what journalists and analysts inspect. Finish the chain by re-saving through pdf-compress-lossless.

Reviewer names left in comments

Out of scope

Annotations carry their own author names and are not part of the metadata scrub. Always run pdf-annotation-remover before publishing a document that went through internal review.

Sensitive text still visible on the page

Not redacted

Scrubbing metadata does nothing to text or images on the page. Names, addresses, and exempt content must be redacted with pdf-pii-redactor — and true redaction must remove the underlying content, not just draw a black box, which is a separate concern this tool does not address.

Form fields hold default values

Not flattened

Interactive form fields can leak default values and field names. Flatten them with pdf-flatten before scrubbing so the values become static, non-editable content.

Dates display as 1970-01-01, not blank

Expected

The two date fields are reset to the Unix epoch rather than deleted, so Document Properties shows 01/01/1970. The real timeline is gone; the epoch value is the intended output.

Incremental-update history embedded in the file

May persist

PDFs saved with incremental updates can retain earlier revisions of the content. A plain metadata scrub does not collapse them. Re-saving through pdf-compress-lossless or pdf-flatten rebuilds the document and drops the earlier revisions before release.
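
One rough way to spot incremental updates is to count end-of-file markers, since each appended revision normally ends with its own `%%EOF`. The sketch below is a heuristic only: linearized ("fast web view") PDFs can also contain more than one marker, so a positive result is a prompt to re-save, not proof of hidden history. The function names are hypothetical.

```python
def count_eof_markers(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; each saved revision normally ends with one."""
    return pdf_bytes.count(b"%%EOF")


def looks_incrementally_updated(pdf_bytes: bytes) -> bool:
    """True when the file has more than one %%EOF marker (heuristic only)."""
    return count_eof_markers(pdf_bytes) > 1
```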

File over the tier size or page limit

Blocked

Free handles 2 MB / 50 pages; Pro 50 MB / 500 pages; Pro+Media 500 MB / 2,000 pages. A large publication-ready PDF may exceed Free — the tool blocks before processing with an upgrade prompt.
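
The limits above can be checked before you start the chain. This is a minimal sketch using the figures quoted here; treating 1 MB as 1024 × 1024 bytes and treating the limits as inclusive are both assumptions, and `smallest_tier` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    max_mb: int
    max_pages: int


# Figures quoted in this guide, ordered from smallest to largest.
TIERS = (
    Tier("Free", 2, 50),
    Tier("Pro", 50, 500),
    Tier("Pro+Media", 500, 2000),
)

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def smallest_tier(size_bytes: int, pages: int) -> Optional[str]:
    """Return the first tier that fits both limits, or None if none does."""
    for tier in TIERS:
        if size_bytes <= tier.max_mb * BYTES_PER_MB and pages <= tier.max_pages:
            return tier.name
    return None
```

Both limits apply at once: a 1 MB report with 120 pages does not fit Free, because the page count exceeds 50.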

Document is digitally signed

Signature breaks

Scrubbing re-saves the file and invalidates any existing signature. For a public release you usually want to scrub first and (if required) re-sign the final clean version. Verify with pdf-signature-verify.

PDF/A archival copy

Conflicts with PDF/A

PDF/A requires certain metadata to be present and consistent. A scrubbed copy is for distribution, not archiving. Keep an unscrubbed PDF/A master if you also need to archive — see the remove-software-info guide for the PDF/A interaction.

Frequently asked questions

Does this tool remove ALL metadata before I publish?

It removes all of the classic document-information dictionary: Title, Author, Subject, Keywords, Producer, Creator, and the two dates (reset to epoch). It does NOT rewrite the XMP metadata packet. For a true pre-publication scrub, follow up by re-saving through pdf-compress-lossless to drop the XMP, then audit with ExifTool.

What's the correct order of steps before public release?

Redact visible content (pdf-pii-redactor) → remove comments (pdf-annotation-remover) → flatten forms (pdf-flatten) → scrub metadata (this tool) → drop XMP and finalise (pdf-compress-lossless) → audit with ExifTool. Metadata is near the end because earlier steps re-save the file.

How do government and FOI teams verify a clean release?

Open Acrobat's File → Properties → Description and Custom tabs and confirm everything is blank, then run exiftool -G1 -a -s file.pdf to catch any residual XMP author or date. The Custom tab and XMP are where forensic checks find leaks.

Why might the author name still appear after scrubbing?

Because it's in the XMP packet (which this tool doesn't rewrite), in a comment, or printed on the page. Once the info dictionary is blank, a hidden name that survives is most likely in XMP or an annotation, so always finish with the annotation removal step and the lossless re-save.

Does scrubbing change the published document's appearance?

No. Metadata is invisible to readers. The pages look identical; only the hidden document-info fields and the date stamps change.

Is the document uploaded to a server?

No. The scrub runs in your browser with pdf-lib. A sensitive pre-release document never leaves your device. Only an anonymous run counter is recorded for signed-in users.

Does it remove tracked changes or revision history?

Not directly. Tracked-change names usually live in annotations (use pdf-annotation-remover); incremental-update history is collapsed by re-saving through pdf-compress-lossless or pdf-flatten.

Can I publish a scrubbed file as PDF/A?

No — PDF/A requires certain metadata to be present, so a scrubbed copy is for distribution rather than archiving. Keep a separate PDF/A master if you need to archive the document long-term.

What metadata does a published research PDF typically need stripped?

Author, Creator (template/app), Producer, Title (often a working draft name), and the dates — all handled here — plus the XMP equivalents, which need the lossless re-save. Subject and Keywords sometimes carry internal classification text and are cleared too.

What's the largest document I can scrub before release?

Free: 2 MB / 50 pages. Pro: 50 MB / 500 pages. Pro+Media: 500 MB / 2,000 pages. Large publication PDFs may need Pro; the metadata operation itself is fast.

Can I make this part of an automated publishing pipeline?

Yes. Pair the @jadapps/runner and POST files to 127.0.0.1:9789/v1/tools/pdf-metadata-scrubber/run (the tool takes no options). Chain it with the redactor and compressor endpoints for a repeatable release pipeline that runs entirely on your own machine.
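
A pipeline step might look like the Python sketch below. Only the host, port, and path come from this guide. The `http` scheme, sending the PDF as the raw request body with an `application/pdf` content type, and reading the scrubbed PDF back from the response body are all assumptions, so check the runner's own documentation before relying on them.

```python
import urllib.request

# Host, port, and path are from this guide; the http scheme is assumed.
RUNNER_URL = "http://127.0.0.1:9789/v1/tools/pdf-metadata-scrubber/run"


def build_scrub_request(pdf_bytes: bytes) -> urllib.request.Request:
    """Build the POST request. The raw-body upload format is an assumption."""
    return urllib.request.Request(
        RUNNER_URL,
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},  # assumed content type
        method="POST",
    )


def scrub(pdf_bytes: bytes, timeout: float = 30.0) -> bytes:
    """Send the file to the local runner and return the response body.

    Assumes the response body is the scrubbed PDF.
    """
    request = build_scrub_request(pdf_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping request construction separate from sending makes the step easy to test without a running runner.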

Does the tool guarantee anonymity of the document?

No single tool can. Metadata scrubbing removes the document-info fingerprint, but anonymity also depends on redacted content, removed annotations, dropped XMP, and even writing style. Treat this as one verified, deterministic layer in a broader checklist.

Privacy first

All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.

