How to anonymize a manuscript PDF for double-blind submission
- Step 1: Inspect the manuscript's properties before submitting. Open the compiled PDF and read File → Properties → Description. /Author is the field to watch: LaTeX's \author{} (via hyperref) and Word's account name can both flow into it even after you blank the title page. Note /Producer too; pdfTeX-1.40.25 plus your exact package set can fingerprint a group.
- Step 2: Drop the manuscript onto the sanitizer. Add the PDF to the tool above. It routes to the canonical PDF Metadata Scrubber, which processes the file with pdf-lib in your browser. There are no options: the field set is fixed, so you cannot accidentally leave the author in.
- Step 3: Run the single pass. pdf-lib loads with ignoreEncryption: true, empties Title, Author, Subject, Keywords, Producer, and Creator, sets both dates to new Date(0), and re-saves with updateFieldAppearances: false. Equations, figures, fonts, and references are untouched.
- Step 4: Download the anonymized PDF and upload that copy. Save the result and submit *this* file, not the original. The page content is byte-for-byte the same; only Document Properties changed. Keep the original locally for the camera-ready version after acceptance.
- Step 5: Verify the blind in Document Properties. Re-open File → Properties. Author, Creator, Producer, Title, Subject, and Keywords now read blank, and both dates show 1 January 1970 (the epoch). That is expected and privacy-neutral; every anonymized file shares it, so it cannot be cross-referenced to your other PDFs.
- Step 6: Check the page text for residual identity leaks. Metadata is only one channel. Acknowledgements, funding lines, and self-citations phrased as 'in our prior work [Smith 2024]' are visible page content and are not touched here. Review them by eye, and use pdf-pii-redactor if you need to remove a name from the body before submission.
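The single pass in Step 3 can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function names `blankInfo` and `anonymizePdf` are hypothetical, and the `PDFDocument` class is passed in as a parameter (an assumption made here so the sketch runs without a bundler). The setters, the `ignoreEncryption` load option, and the `updateFieldAppearances` save option are pdf-lib's documented API.

```javascript
// Empties the six text fields and resets both dates on a pdf-lib document.
// `doc` is expected to be a pdf-lib PDFDocument (or anything with the same setters).
function blankInfo(doc) {
  doc.setTitle('');
  doc.setAuthor('');
  doc.setSubject('');
  doc.setKeywords([]); // pdf-lib takes an array of keywords
  doc.setProducer('');
  doc.setCreator('');

  // Both timestamps are reset to the Unix epoch, 1970-01-01T00:00:00Z.
  const epoch = new Date(0);
  doc.setCreationDate(epoch);
  doc.setModificationDate(epoch);
}

// Hypothetical wrapper: load, blank the Info fields, re-save.
// Pass pdf-lib's PDFDocument class as the second argument.
async function anonymizePdf(inputBytes, PDFDocument) {
  // ignoreEncryption lets pdf-lib parse a file that carries an /Encrypt entry.
  const doc = await PDFDocument.load(inputBytes, { ignoreEncryption: true });
  blankInfo(doc);
  // Re-save without regenerating form-field appearance streams.
  return doc.save({ updateFieldAppearances: false });
}
```

In a browser you would call it as `anonymizePdf(new Uint8Array(await file.arrayBuffer()), PDFLib.PDFDocument)`, assuming pdf-lib is loaded as the `PDFLib` global.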
Manuscript document-info fields after anonymizing
All eight fields are handled in one pdf-lib pass. The six text fields become empty strings; the two dates reset to the Unix epoch.
| Field | How it breaks the blind | After sanitizing |
|---|---|---|
| /Author | Your real name: Word's account name or LaTeX \author{} | Empty string |
| /Creator | The authoring tool (LaTeX with hyperref, Microsoft Word) | Empty string |
| /Producer | pdfTeX-1.40.25 or similar; fingerprints your toolchain | Empty string |
| /Title | Working title, often with your name or a draft tag | Empty string |
| /Subject | A short abstract line or internal note | Empty string |
| /Keywords | Topic tags, sometimes a grant or project codename | Empty (cleared to no keywords) |
| /CreationDate | Compile date, cross-referenceable against your preprints | Reset to 1970-01-01T00:00:00Z (epoch) |
| /ModDate | Last compile; shows the final pre-submission edit | Reset to 1970-01-01T00:00:00Z (epoch) |
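The "After sanitizing" column can also be checked programmatically. The sketch below is an illustration under assumptions: `findInfoLeaks` is a hypothetical helper, and `doc` is any object exposing pdf-lib-style getters (`getTitle`, `getAuthor`, `getCreationDate`, and so on). If you load the sanitized file with pdf-lib to run this check, pass `updateMetadata: false` to `PDFDocument.load`, since pdf-lib otherwise refreshes Producer and ModDate on load and the check would report them.

```javascript
// Returns the names of Info fields that still hold something other than
// the expected post-sanitizing value (blank text, epoch dates).
function findInfoLeaks(doc) {
  const leaks = [];

  const textGetters = {
    Title: 'getTitle',
    Author: 'getAuthor',
    Subject: 'getSubject',
    Keywords: 'getKeywords',
    Producer: 'getProducer',
    Creator: 'getCreator',
  };
  for (const [field, getter] of Object.entries(textGetters)) {
    const value = doc[getter]();
    // undefined (key absent) and '' (key emptied) both count as blank.
    if (value !== undefined && value !== '') leaks.push(field);
  }

  const dateGetters = {
    CreationDate: 'getCreationDate',
    ModDate: 'getModificationDate',
  };
  for (const [field, getter] of Object.entries(dateGetters)) {
    const value = doc[getter]();
    // A date passes if it is absent or exactly the Unix epoch.
    if (value !== undefined && value.getTime() !== 0) leaks.push(field);
  }

  return leaks;
}
```

An empty array means all eight fields match the table; any names returned are the fields to look at in Document Properties.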
Metadata vs. the other ways a blind breaks
A metadata scrub closes the Document-Properties leak. The remaining rows are on-page or second-channel and need other handling.
| De-anonymizing vector | Owned by this tool? | How to handle it |
|---|---|---|
| /Author in Document Properties | Yes | This tool (pdf-metadata-scrubber) |
| /Producer toolchain fingerprint | Yes | This tool |
| Compile-date timeline | Yes — reset to epoch | This tool |
| Your name in acknowledgements / funding lines | No | Edit the source; or pdf-pii-redactor |
| Self-citation phrased as 'our prior work' | No — visible text | Reword in the source before compiling |
| XMP packet still carrying dc:creator | No (not rewritten) | Re-save via pdf-compress-lossless |
| A Word source .docx with author properties | No | office-doc-property-wiper |
File-size limits by tier (PDF input)
PDF input is file-based, so Security-family tier limits apply. Files are processed one at a time; the table lists how many each tier accepts per pass.
| Tier | Max file size | Files per pass |
|---|---|---|
| Free | 10 MB | 1 |
| Pro | 100 MB | 5 (processed one at a time) |
| Pro-media | 500 MB | 50 |
| Developer | 2 GB | Unlimited |
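A pre-flight size check against the table above might look like this. It is a sketch only: `TIER_CAPS` and `fitsTier` are hypothetical names, the caps are treated as inclusive, and 1 MB is taken as 1024 × 1024 bytes because the page does not say whether the limits are decimal or binary.

```javascript
// Size caps from the tier table, in bytes.
// Assumption: binary megabytes (1 MB = 1024 * 1024 bytes).
const BYTES_PER_MB = 1024 * 1024;
const TIER_CAPS = {
  'free': 10 * BYTES_PER_MB,
  'pro': 100 * BYTES_PER_MB,
  'pro-media': 500 * BYTES_PER_MB,
  'developer': 2048 * BYTES_PER_MB, // 2 GB
};

// Returns true when a file of `sizeBytes` is within the cap for `tier`.
function fitsTier(sizeBytes, tier) {
  const cap = TIER_CAPS[tier];
  if (cap === undefined) throw new Error(`Unknown tier: ${tier}`);
  return sizeBytes <= cap;
}
```

In a browser, `fitsTier(file.size, 'free')` on a `File` object gives the answer before any bytes are read.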
Cookbook
Real Document Properties before and after anonymizing a manuscript, plus the leaks a metadata pass cannot reach. Names and titles are illustrative.
LaTeX writing your name straight into /Author
Many LaTeX configurations route \author{} (or hyperref's pdfauthor) into the PDF's /Author field. You blanked the title page, but Properties still names you. This is the classic double-blind failure.
Before (File -> Properties):
  Author: Wei Chen
  Creator: LaTeX with hyperref
  Producer: pdfTeX-1.40.25
  Title: Manuscript_v3_for_NeurIPS
  Created: 2026-05-30 22:10:51
  Modified: 2026-06-02 17:44:08

After sanitizing:
  Author: (blank)
  Creator: (blank)
  Producer: (blank)
  Title: (blank)
  Created: 1970-01-01 00:00:00
  Modified: 1970-01-01 00:00:00
Producer string narrowing down the lab
pdfTeX-1.40.25 plus a distinctive package set is a soft fingerprint of a specific group's build environment. Emptying /Producer and /Creator removes that signal.
Before:
  Producer: pdfTeX-1.40.25
  Creator: LaTeX with hyperref and microtype

After:
  Producer: (blank)
  Creator: (blank)

The build-environment hint is gone.
Compile dates that cross-reference to a preprint
If the submission's CreationDate matches the timestamp on a preprint you posted publicly, a reviewer can link the two. Resetting both to the epoch breaks that correlation.
Before:
  CreationDate: D:20260530221051+02'00'
  ModDate: D:20260602174408+02'00'

After:
  CreationDate: D:19700101000000Z
  ModDate: D:19700101000000Z

No timestamp left to match against arXiv or a lab page.
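To see why the raw date strings are linkable, it helps to turn them into absolute instants. The sketch below parses the full `D:YYYYMMDDHHmmSS` form with a `Z` or `+HH'mm'` offset, as in the example above; `parsePdfDate` is a hypothetical helper, and the shortened date forms that the PDF format also permits are deliberately not handled.

```javascript
// Parses a full-length PDF date string into a JavaScript Date.
// Returns null for anything that is not in the full D:YYYYMMDDHHmmSS form.
function parsePdfDate(s) {
  const m = /^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})(?:Z|([+-])(\d{2})'(\d{2})'?)?$/.exec(s);
  if (!m) return null;
  const [, y, mo, d, h, mi, sec, sign, oh, om] = m;

  // Interpret the wall-clock fields as if they were UTC...
  const asUtc = Date.UTC(Number(y), Number(mo) - 1, Number(d), Number(h), Number(mi), Number(sec));
  if (!sign) return new Date(asUtc); // "Z" or no offset: treated as UTC here

  // ...then remove the local offset to get the real instant.
  const offsetMs = (Number(oh) * 60 + Number(om)) * 60 * 1000;
  return new Date(sign === '+' ? asUtc - offsetMs : asUtc + offsetMs);
}
```

The "before" CreationDate resolves to a specific second in UTC that can be compared with a preprint's timestamp, while the sanitized value resolves to time zero for every file.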
What the scrub does NOT fix — a self-citation on the page
Metadata cleaning cannot anonymize your prose. A sentence like 'extending our earlier method [Chen 2024]' identifies you and is visible page text. Reword it in the source before compiling.
Document Properties after sanitizing: all clean.

But page 4 still reads:
  "building on our prior work (Chen et al., 2024)"

This survives the scrub. Reword to a neutral third-person citation in the source, recompile, then sanitize the new PDF.
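A rough first-person scan of the manuscript source can flag sentences like the one above before you compile. This is only a heuristic sketch and no substitute for reading the text: the pattern list and the name `findSelfReferences` are assumptions made for illustration, and the helper is not part of the tool.

```javascript
// Phrases that commonly mark a self-citation. Illustrative, not exhaustive.
const SELF_REFERENCE_PATTERNS = [
  /\bour (?:prior|earlier|previous|own) (?:work|method|paper|study|approach)\b/i,
  /\bwe (?:previously|earlier) (?:showed|proposed|introduced|reported)\b/i,
];

// Returns the 1-based line number and text of every line that matches a pattern.
function findSelfReferences(sourceText) {
  const hits = [];
  sourceText.split(/\r?\n/).forEach((line, index) => {
    if (SELF_REFERENCE_PATTERNS.some((re) => re.test(line))) {
      hits.push({ line: index + 1, text: line.trim() });
    }
  });
  return hits;
}
```

Run it over the .tex or plain-text source; each hit is a candidate for rewording into a neutral third-person citation.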
Anonymize the Word source as well
If you submit a .docx instead of (or alongside) the PDF, the Word file carries its own author and revision history that the PDF scrub never sees.
Workflow:
1. Sanitize manuscript.pdf with this tool.
2. Run manuscript.docx through
/security-tools/office-doc-property-wiper
to clear author, company, and tracked-change
revision data.
Both submission formats are now anonymous.
Edge cases and what actually happens
Dates show 1970, not blank
Expected: The sanitizer sets CreationDate and ModDate to new Date(0), the Unix epoch (1970-01-01T00:00:00Z), written as D:19700101000000Z. It does not delete the keys. This is by design and helps blinding: a fixed constant shared by every file cannot be matched against the timestamp on your public preprint.
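The mapping from `new Date(0)` to the string `D:19700101000000Z` can be reproduced with a few lines. This is an illustrative formatter, not pdf-lib's own serializer; `toPdfDate` is a hypothetical name.

```javascript
// Formats a Date as a UTC PDF date string: D:YYYYMMDDHHmmSSZ.
function toPdfDate(date) {
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return (
    'D:' +
    pad(date.getUTCFullYear(), 4) +
    pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds()) +
    'Z'
  );
}

console.log(toPdfDate(new Date(0))); // D:19700101000000Z
```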
Your name is still in the acknowledgements
Not covered: Acknowledgements, funding statements, and author footnotes are visible page text, not metadata, so the scrub leaves them in place. Remove or neutralize them in the source before compiling, or use pdf-pii-redactor to strip a name from the body of the submission copy.
A self-citation reveals the authors
Not covered: Phrases like 'our prior work' or a cluster of citations to one group identify you, and they are page content the metadata pass cannot touch. Reword self-citations to neutral third person in the source and recompile before sanitizing.
XMP packet still carries dc:creator
Not covered: hyperref and many exporters write your name into an XMP packet (dc:creator, xmp:CreateDate) in addition to /Author. This tool empties /Author but does not rewrite XMP, and some viewers show the XMP values in preference to the /Info fields. Re-serialize through pdf-compress-lossless to drop the stale packet.
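Whether a file still carries an XMP creator can be roughly checked by scanning its bytes for the packet. The sketch below is a heuristic under assumptions: `indexOfAscii` and `hasXmpCreator` are hypothetical helpers, and the scan only sees packets stored as plain text. A metadata stream that is compressed or encrypted will not be found, so a `false` result is not proof that no XMP exists.

```javascript
// Finds the first occurrence of an ASCII string in a byte array, or -1.
function indexOfAscii(bytes, needle, from = 0) {
  outer: for (let i = from; i + needle.length <= bytes.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (bytes[i + j] !== needle.charCodeAt(j)) continue outer;
    }
    return i;
  }
  return -1;
}

// Returns true if an uncompressed XMP packet containing dc:creator is present.
function hasXmpCreator(pdfBytes) {
  const start = indexOfAscii(pdfBytes, '<x:xmpmeta');
  if (start === -1) return false;
  const end = indexOfAscii(pdfBytes, '</x:xmpmeta>', start);
  // If the closing tag is missing, scan to the end of the file.
  const packet = pdfBytes.subarray(start, end === -1 ? pdfBytes.length : end);
  return indexOfAscii(packet, '<dc:creator') !== -1;
}
```

A `true` result on the sanitized file is the signal to re-serialize it before submitting.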
The .docx source still names you
Not covered: If the venue accepts or requires a Word source, that file keeps its own author, company, and tracked-change history regardless of the PDF scrub. Wipe it with office-doc-property-wiper before submitting.
Encrypted / protected manuscript
Loaded with ignoreEncryption: The scrubber loads with ignoreEncryption: true, so it can often open a protected file to clear metadata, but it is not a password-removal tool. If the PDF needs a password to view its pages, remove it first with pdf-remove-password on your own file, then sanitize.
Supplementary figures carry their own EXIF
Not covered: If you submit figure source images separately (not embedded), those files can carry their own camera or software EXIF identifying your equipment or account. The PDF scrub does not touch separate image files, so clean those before upload if anonymity matters.
File exceeds the tier size cap
Rejected: A figure-heavy manuscript can be large. Limits: Free 10 MB, Pro 100 MB, Pro-media 500 MB, Developer 2 GB. If you are over the cap, shrink the file with pdf-compress-lossless (which also re-serializes the XMP packet) or move up a tier.
Corrupt or truncated PDF fails to load
Error: If pdf-lib cannot parse the file (a failed compile, a truncated download, or a non-PDF renamed to .pdf), the load throws and no output is produced. Confirm the bytes are a valid PDF first with magic-byte-validator.
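A quick byte-level sanity check can catch the two most common causes before the file reaches pdf-lib. This is a sketch under assumptions and not how magic-byte-validator is implemented: `checkPdfBytes` is a hypothetical helper that looks for the `%PDF-` signature at offset 0 (stricter than some readers, which tolerate leading bytes) and for a `%%EOF` marker in the last 1024 bytes.

```javascript
// Reports whether the bytes start with "%PDF-" and end with a "%%EOF" marker.
// Passing both checks does not guarantee the file parses; failing either
// is a strong hint that it will not.
function checkPdfBytes(bytes) {
  const matchesAt = (offset, ascii) => {
    if (offset < 0 || offset + ascii.length > bytes.length) return false;
    for (let i = 0; i < ascii.length; i++) {
      if (bytes[offset + i] !== ascii.charCodeAt(i)) return false;
    }
    return true;
  };

  const hasHeader = matchesAt(0, '%PDF-');

  // A truncated download usually lacks the trailer, so look for %%EOF near the end.
  let hasEofMarker = false;
  const tailStart = Math.max(0, bytes.length - 1024);
  for (let i = bytes.length - 5; i >= tailStart; i--) {
    if (matchesAt(i, '%%EOF')) {
      hasEofMarker = true;
      break;
    }
  }

  return { hasHeader, hasEofMarker };
}
```

A renamed non-PDF fails `hasHeader`; a cut-off download typically passes `hasHeader` but fails `hasEofMarker`.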
Frequently asked questions
Which fields does anonymizing clear?
Eight fields in the document-information dictionary. Six text fields are emptied to blank strings: /Title, /Author, /Subject, /Keywords, /Producer, and /Creator. The two timestamps — /CreationDate and /ModDate — are reset to the Unix epoch (1970-01-01T00:00:00Z). The same eight are processed every run; there is nothing to configure.
Why does my name appear in /Author even though the title page is blank?
Because metadata is a separate channel from page content. Word copies your account name into /Author, and LaTeX's \author{} (via hyperref) writes it into the PDF metadata even when it is not printed. Blanking the title page does nothing to the /Author field — this tool is what empties it.
Why do the dates show 1970 instead of being removed?
The tool sets both date stamps to new Date(0), the Unix epoch. For blinding this is an advantage: because every anonymized file shares the same 1970 value, a reviewer cannot match the compile timestamp against a public preprint or lab page. If you need the date keys fully absent, this tool does not offer that.
Does my unpublished manuscript get uploaded anywhere?
No. Processing runs entirely in your browser via pdf-lib. An embargoed or unpublished manuscript never leaves your machine — important when you cannot risk an online converter retaining the file before publication.
Will anonymizing change the equations, figures, or references?
No. Only the /Info dictionary entries change. Equations, figures, fonts, references, and every word on the page are preserved exactly — the file is re-saved, not re-rendered. The page count matches the input.
Is the XMP metadata cleared too?
No. hyperref and many exporters also write your name into an XMP packet (dc:creator), which this tool does not rewrite, and some viewers show the XMP values in preference to the /Info fields. Re-save through pdf-compress-lossless (/pdf-tools/pdf-compress-lossless) to drop the stale packet.
Does this remove my name from the acknowledgements or a self-citation?
No — those are visible page text, not metadata, so they survive the scrub. Reword self-citations to neutral third person in your source and recompile, and remove acknowledgements before submission (or strip a name from the body with pdf-pii-redactor at /pdf-tools/pdf-pii-redactor).
I'm submitting a Word file too — does this anonymize that?
No. This tool only handles PDFs. A .docx carries its own author, company, and tracked-change history. Wipe it with the Office Doc Property Wiper (/security-tools/office-doc-property-wiper) before submitting.
Can I anonymize a password-protected manuscript?
The scrubber loads with encryption ignored, so it can often open a protected file to clear metadata, but it is not a password-removal tool. If the PDF needs a password just to view its pages, remove it first with PDF Remove Password (/pdf-tools/pdf-remove-password) on your own file, then sanitize.
How big a manuscript can I anonymize?
PDF input is file-based, so tier limits apply: Free up to 10 MB and one file; Pro up to 100 MB; Pro-media up to 500 MB; Developer up to 2 GB. Figure-heavy papers hit the caps fastest. Shrink the file with pdf-compress-lossless (/pdf-tools/pdf-compress-lossless) if needed.
How is this different from the PDF Metadata Scrubber in the PDF suite?
It is the same engine. The PDF History Sanitizer is the Security-suite entry point and routes to the canonical PDF Metadata Scrubber (/pdf-tools/pdf-metadata-scrubber), which does the pdf-lib work. The field set, behaviour, and browser-local processing are identical — use whichever surface fits your workflow.
What is the full pre-submission anonymity checklist?
Five steps. (1) Blank the title page and reword self-citations in the source, then recompile. (2) Sanitize PDF metadata with this tool. (3) Re-serialize via pdf-compress-lossless (/pdf-tools/pdf-compress-lossless) to clear the XMP packet. (4) If you submit a Word source, wipe it with office-doc-property-wiper (/security-tools/office-doc-property-wiper). (5) Re-open Document Properties to confirm Author is blank before uploading.
Privacy first
Every JAD Security operation runs entirely in your browser. Files, passwords, and PGP private keys never leave your device — verified by zero outbound network requests during processing.