How to strip author identity from a .docx before journal submission
- Step 1: Accept or reject track changes in Word first. The single biggest blind-review leak is track changes, and this tool does NOT remove them: revision marks live inside word/document.xml (the body), not a metadata stream. Open the manuscript in Word and choose Review → Accept All Changes (or Reject All Changes) so no redline carrying an author name survives. Do this before you export.
- Step 2: Remove your name from the title page and acknowledgements yourself. The wiper scrubs metadata, not visible text. Delete author names, affiliations, the corresponding-author email, funding acknowledgements, and any 'as we showed in [your own prior paper]' self-citations from the body in Word. Those are content, not property streams, and survive the wipe.
- Step 3: Drop the .docx onto the wiper. The picker accepts .docx, .xlsx, and .pptx only. Drag the manuscript in. On the public browser path JSZip reads it with FileReader; nothing is transmitted, so an embargoed pre-print stays on your machine. A legacy .doc is a binary OLE file, not an OOXML ZIP, and is rejected; Save As .docx first.
- Step 4: Let it unzip, strip, and repack. There are no options; the schema is empty by design. The tool removes the fixed list of docProps/* entries, every Word comment/people stream, and the embedded thumbnail, runs a wildcard sweep for any numbered commentsN.xml / threadedComment*.xml, and repacks with generateAsync.
- Step 5: Check the removedEntries count. The result reports removedEntries: how many metadata files were deleted from this specific manuscript. A clean draft might show 2 (core + app); a heavily commented one with supervisor feedback can show 6 or more. Zero means the file already had no targeted property or comment streams.
- Step 6: Verify, then upload the -clean copy. Re-open the cleaned file and check File → Info → Properties: Author, Company, and the comment pane should all be blank. For certainty, copy it to a .zip name and confirm docProps/core.xml is absent. Upload that copy. Do NOT open-and-save it again afterwards: Word rewrites a fresh core.xml naming the current user on the next save.
Author-identifying streams removed for blind review
The OOXML entries this tool deletes and the specific blinding leak each one closes. A wildcard sweep also catches any numbered comment streams not named here.
| Stream removed | What it reveals about the author | Why it breaks blinding |
|---|---|---|
| docProps/core.xml | Creator, Last Modified By, Created / Modified dates, Revision count | Names the author and co-author directly in File → Properties |
| docProps/app.xml | Company (institution), Template name, Total Editing Time, application | The Company field is usually the author's university; the template name can be a departmental house style |
| docProps/custom.xml | Reference-manager document ID, grant number, internal draft code | A grant number or lab draft code is traceable to one group via funder databases |
| word/comments.xml + people.xml + threadedComments.xml | Reviewer display names, comment text, the author→GUID people map | A single named comment from a supervisor identifies the lab even if the title page is clean |
| docProps/thumbnail.jpeg / thumbnail.png | A rendered image of page one | Can display the un-anonymised title page in file explorers and some upload previews |
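The table above can be read as a deletion list. The following is a minimal Python sketch of an unzip, strip, and repack pass. It uses the standard zipfile module rather than JSZip, so it illustrates the idea and is not the tool's implementation. The FIXED_TARGETS set and the WILDCARD pattern are assumptions assembled from the streams this page names, and the sketch does not update [Content_Types].xml or relationship parts.

```python
import io
import re
import zipfile

# Assumed fixed list, assembled from the streams named on this page.
FIXED_TARGETS = {
    "docProps/core.xml",
    "docProps/app.xml",
    "docProps/custom.xml",
    "docProps/thumbnail.jpeg",
    "docProps/thumbnail.png",
    "word/comments.xml",
    "word/commentsExtended.xml",
    "word/commentsExtensible.xml",
    "word/commentsIds.xml",
    "word/people.xml",
    "word/threadedComments.xml",
}
# Hypothetical wildcard for numbered or threaded comment streams.
WILDCARD = re.compile(r"(^|/)(comments\d*|threadedComment[^/]*)\.xml$")


def wipe_ooxml(data: bytes) -> tuple[bytes, int]:
    """Return (repacked bytes, number of entries removed)."""
    removed = 0
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename in FIXED_TARGETS or WILDCARD.search(info.filename):
                removed += 1  # counted, not copied
                continue
            dst.writestr(info.filename, src.read(info.filename))
    return out.getvalue(), removed
```

The returned count plays the role of removedEntries: it counts deleted entries, not bytes.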
Blinding leaks this tool does NOT fix
Content that lives in the document body, not a property stream, and so survives the wipe. Handle each in Word before exporting.
| Leak | Where it lives | What to do |
|---|---|---|
| Track changes / revision marks | Nodes inside word/document.xml | Review → Accept/Reject All in Word before exporting |
| Author names on the title page / corresponding-author email | Visible body text | Delete manually from the manuscript; many journals want the identifying title page uploaded as a separate file |
| Self-identifying citations ('our prior work [12]') | Body text and bibliography | Reword to third person; cite anonymously per the journal's guidance |
| EXIF / GPS in figures pasted into the manuscript | Per-image metadata inside word/media/ | Scrub images before inserting; see gps-geotag-remover |
| A separate PDF version of the manuscript | PDF Info dictionary + XMP, not OOXML | Use pdf-history-sanitizer on the PDF |
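Because track changes are the leak most easily missed, a quick check before wiping can help. The sketch below scans word/document.xml for w:author attributes, which WordprocessingML revision elements such as w:ins and w:del carry. It assumes the conventional w: namespace prefix that Word writes; the function name revision_authors is hypothetical.

```python
import io
import re
import zipfile

# Revision elements such as w:ins and w:del carry a w:author attribute.
AUTHOR_ATTR = re.compile(r'\bw:author="([^"]*)"')


def revision_authors(docx_bytes: bytes) -> set[str]:
    """Return author names still attached to revision marks in the body."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        body = z.read("word/document.xml").decode("utf-8")
    return set(AUTHOR_ATTR.findall(body))
```

An empty set suggests no revision marks with author names remain; a non-empty set means Accept/Reject All has not been run yet.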
Plan, limits, and output
The tool requires the Pro plan; file-size and batch caps follow the Security family tier limits.
| Property | Value | Notes |
|---|---|---|
| Minimum plan | Pro | A Free run is rejected with "Office Doc Property Wiper requires the pro plan." |
| File-size limit (Pro) | 100 MB per file | Pro-media 500 MB, Developer 2 GB; manuscripts are almost always well under this |
| Accepted input | .docx, .xlsx, .pptx | Literal accept filter; legacy .doc is rejected |
| Output | Repacked file, original name + -clean suffix | Plus a removedEntries count of deleted streams |
| Where it runs | Browser (public site) or local runner (API) | The manuscript is never uploaded to JAD on either path |
Cookbook
Before/after structure from real manuscript containers. Author and lab names are anonymised; the entry paths are exactly what the tool acts on.
A manuscript with supervisor comments going to a double-blind journal
A paper your supervisor marked up with tracked comments. The title page is already anonymised, but the container names you in core.xml, your university in app.xml, and your supervisor in the comment streams. Accept track changes in Word first.
Before (unzip Manuscript-v7.docx):
  docProps/core.xml      <- Creator: A. Researcher; Last Modified By: Prof. Supervisor
  docProps/app.xml       <- Company: State University; TotalTime: 5210
  docProps/custom.xml    <- ZoteroDocID=..., Grant=NSF-2231104
  word/comments.xml      <- 14 comments
  word/people.xml        <- 2 reviewer names
  word/document.xml      <- still has track-changes nodes!
After wipe (Manuscript-v7-clean.docx):
  docProps/core.xml      REMOVED
  docProps/app.xml       REMOVED
  docProps/custom.xml    REMOVED    <- grant number gone
  word/comments.xml      REMOVED
  word/people.xml        REMOVED
  word/document.xml      UNCHANGED  <- accept track changes in Word!
removedEntries: 5
An anonymised supplementary spreadsheet
A supplementary .xlsx of raw data submitted alongside the manuscript. It carries cell comments from the data-checking pass and a custom property with the lab code. Macros, if any, are preserved.
Before (unzip Supplementary-Data.xlsx):
  docProps/core.xml        <- Creator + Last Modified By
  docProps/custom.xml      <- LabCode=BIO-LAB-3
  xl/comments1.xml         <- 6 data-check comments
  xl/persons/person.xml    <- author identity map
After wipe (Supplementary-Data-clean.xlsx):
  docProps/core.xml        REMOVED
  docProps/custom.xml      REMOVED
  xl/comments1.xml         REMOVED
  xl/persons/person.xml    REMOVED
removedEntries: 4
Confirming the blinding before upload
Some editorial offices unzip submissions to check for leftover metadata. Do the same: copy the cleaned file to a .zip name and confirm the author streams are gone before you hit submit.
$ cp Manuscript-v7-clean.docx verify.zip
$ unzip -l verify.zip | grep docProps
(no core.xml / app.xml / custom.xml)
$ unzip -l verify.zip | grep comments
(nothing)
# Result panel reported:
removedEntries: 5
inputBytes: 204880
outputBytes: 176320
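The same check can be scripted without renaming the file, since a .docx is already a ZIP. This is a small sketch using Python's zipfile module; the LEAK pattern and the function name leftover_streams are assumptions based on the streams this page lists, not something the tool exposes.

```python
import io
import re
import zipfile

# Assumed pattern: property streams plus comment/people streams in any folder.
LEAK = re.compile(
    r"^docProps/(core|app|custom)\.xml$"
    r"|(^|/)(comments|threadedComment|people)[^/]*\.xml$"
)


def leftover_streams(data: bytes) -> list[str]:
    """List entries in the container that still look author-identifying."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        return sorted(n for n in z.namelist() if LEAK.search(n))
```

For a file on disk, pass its bytes, for example `leftover_streams(pathlib.Path("Manuscript-v7-clean.docx").read_bytes())`. An empty list corresponds to the empty grep results above.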
A clean draft you authored from a blank document
If you started from a blank Word document and never invited comments, there is little to remove — but app/core still name you. The wipe still strips them.
Before (unzip Short-Communication.docx):
  docProps/core.xml    <- Creator: A. Researcher
  docProps/app.xml     <- Company: State University
  (no comment streams)
After wipe (Short-Communication-clean.docx):
  docProps/core.xml    REMOVED
  docProps/app.xml     REMOVED
removedEntries: 2
Batch-anonymising a folder via the local runner
Because this tool is server-safe, you can run it through an @jadapps/runner on your own machine so embargoed files never reach JAD. The schema has no options.
# 1. Read the schema (empty options array):
GET /api/v1/tools/office-doc-property-wiper
-> { options: [], minTier: "pro", outputType: "blob" }
# 2. POST each manuscript to the local runner:
POST http://127.0.0.1:9789/v1/tools/office-doc-property-wiper/run
(multipart: Manuscript-v7.docx)
# 3. Runner returns the cleaned file as base64:
{
  "outputBase64": "UEsDBBQA...",
  "inputBytes": 204880,
  "outputBytes": 176320,
  "removedEntries": 5,
  "mime": "application/octet-stream"
}
Edge cases and what actually happens
Track changes still name the author after the wipe
By design. Revision marks live inside word/document.xml (the body), which the wiper deliberately does not modify. A tracked insertion or comment-as-revision carries the author's name and survives the wipe. Accept or reject all changes in Word's Review tab before exporting; this is the most common reason a 'cleaned' blind submission still gets desk-rejected.
Author name still on the title page
Out of scope. The wiper removes metadata, not visible text. Your name, affiliation, and corresponding-author email on the title page are body content and are not touched. Delete them from the manuscript in Word, and follow the journal's instructions for supplying a separate, non-anonymised title page in a different upload slot.
Legacy .doc dropped
Rejected. A legacy .doc is a binary OLE compound file, not an OOXML ZIP, so JSZip cannot open it and the accept filter (.docx,.xlsx,.pptx) excludes it. Open it in Word and Save As .docx first, then wipe the converted copy.
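The difference is visible in the first few bytes of the file. A brief sketch, independent of the tool: OLE compound files begin with the signature D0 CF 11 E0 A1 B1 1A E1, while ZIP containers (including OOXML) begin with PK\x03\x04. The function name container_kind is hypothetical.

```python
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # legacy .doc/.xls/.ppt
ZIP_MAGIC = b"PK\x03\x04"                      # .docx/.xlsx/.pptx and other ZIPs


def container_kind(head: bytes) -> str:
    """Classify a file by its leading bytes."""
    if head.startswith(OLE_MAGIC):
        return "ole"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

Reading the first eight bytes of a file is enough, for example `container_kind(open(path, "rb").read(8))`. A file renamed from .doc to .docx still reports "ole", which is why renaming alone does not help.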
removedEntries comes back as 0
Expected. Zero means the manuscript had none of the targeted property or comment streams, which is common for a file already cleaned or built from a blank document with metadata already empty. It is not an error; the file is still repacked and saved with the -clean suffix.
Author reappears after you re-open and save the clean copy
Expected. Word writes a fresh, minimal docProps/core.xml listing the current logged-in user as Creator the next time you open and SAVE the file. The wipe was correct when it ran. Make the wipe the last step before upload and do not open-and-save the cleaned copy.
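To see whether a copy has picked up a new core.xml, the two identity fields can be read directly. A minimal sketch, assuming the standard OPC core-properties layout (dc:creator and cp:lastModifiedBy as direct children of the root); core_identity is a hypothetical helper name.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Optional

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def core_identity(docx_bytes: bytes) -> Optional[dict]:
    """Return creator / lastModifiedBy, or None when core.xml is absent."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        if "docProps/core.xml" not in z.namelist():
            return None  # the state expected right after a wipe
        root = ET.fromstring(z.read("docProps/core.xml"))
    return {
        "creator": root.findtext("dc:creator", default="", namespaces=NS),
        "lastModifiedBy": root.findtext("cp:lastModifiedBy", default="", namespaces=NS),
    }
```

A result of None is what you want on the copy you upload; a dictionary with names in it means the file was saved again after wiping.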
Running on the Free plan
Rejected: requires Pro. The tool's minimum plan is Pro; a Free-tier run is rejected with "Office Doc Property Wiper requires the pro plan." before any file is read.
Figures pasted into the manuscript still carry EXIF/GPS
Out of scope. The wiper strips document-level streams, not per-image metadata inside word/media/. A photo of fieldwork or a scanned figure can keep its GPS and camera EXIF. Scrub images before inserting them; see gps-geotag-remover.
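A rough way to spot affected figures is to look for an EXIF marker inside the embedded JPEGs. The sketch below is a heuristic only: it flags word/media/ entries that start with the JPEG signature FF D8 and contain the bytes "Exif\0\0" in their first 64 KiB. It does not parse the segment or confirm that GPS tags are present, and media_with_exif is a hypothetical name.

```python
import io
import zipfile


def media_with_exif(docx_bytes: bytes) -> list[str]:
    """Return embedded JPEGs that appear to carry an EXIF segment."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        for name in z.namelist():
            if not name.startswith("word/media/"):
                continue
            head = z.read(name)[:65536]
            # JPEG starts with FF D8; an EXIF APP1 segment carries "Exif\0\0".
            if head.startswith(b"\xff\xd8") and b"Exif\x00\x00" in head:
                flagged.append(name)
    return flagged
```

Any file it lists is worth re-exporting through an image scrubber before the figure is inserted again.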
Self-citations reveal the lab
Out of scope. Phrases like 'building on our earlier work [12]' and matching bibliography entries are body text, not metadata, and survive the wipe. Reword to the third person and follow the journal's anonymous-citation guidance before exporting.
A separate PDF version is being submitted
Use a different tool. PDFs are not OOXML ZIPs and are rejected by this tool. Run the PDF through pdf-history-sanitizer, which scrubs the Info dictionary and XMP where Creator and Producer fields hide.
Repacked file size differs from the original
Expected. JSZip re-compresses surviving entries on repack, so outputBytes rarely matches the source byte-for-byte even beyond the deleted streams. The manuscript opens identically in Word; the delta reflects ZIP recompression and removed XML, not content loss.
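One way to reassure yourself that only compression changed is to compare the CRC-32 that each ZIP entry records for its uncompressed content. A short sketch under that approach; changed_survivors is a hypothetical helper, not part of the tool.

```python
import io
import zipfile


def changed_survivors(original: bytes, cleaned: bytes) -> list[str]:
    """Entries present in the cleaned file whose content CRC differs from the original."""
    with zipfile.ZipFile(io.BytesIO(original)) as a, \
            zipfile.ZipFile(io.BytesIO(cleaned)) as b:
        before = {i.filename: i.CRC for i in a.infolist()}
        after = {i.filename: i.CRC for i in b.infolist()}
    return sorted(n for n in after if before.get(n) != after[n])
```

An empty list means every surviving entry has the same content checksum as before, even when the two files differ in size.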
Frequently asked questions
Will this make my .docx pass a journal's double-blind check?
It removes the author identity that lives in metadata — docProps/core.xml (Creator, Last Modified By), docProps/app.xml (Company / institution, Total Editing Time), docProps/custom.xml (reference-manager IDs, grant codes), and the Word comment / people streams. It does NOT remove your name from the title page, your tracked changes, or your self-citations, all of which are body content. Handle those in Word; this tool closes the metadata half of the check.
Does it remove track changes?
No. Revision marks are nodes inside word/document.xml (the body), not a property stream, and the wiper deliberately leaves the body untouched. A surviving tracked insertion carries the author's name. Accept or reject all changes in Word's Review tab before you export — this is the most common blinding leak.
What about my supervisor's comments?
They are removed. The tool deletes word/comments.xml, commentsExtended.xml, commentsExtensible.xml, commentsIds.xml, people.xml, and threadedComments.xml, plus a wildcard sweep for numbered comment files. A single named comment can de-anonymise a blind submission, so this is one of the most important streams it clears.
Will a grant number or Zotero ID be removed?
Yes, if it is stored as a custom document property. docProps/custom.xml is deleted entirely, and reference managers and grant-tracking add-ins commonly stamp IDs there. Note that a grant number written into the body text or acknowledgements is content and survives — remove that manually.
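To see what a given file actually stores there before wiping, the property names can be listed. A minimal sketch, assuming the standard OOXML custom-properties schema in which each property element carries a name attribute; custom_property_names is a hypothetical helper.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"


def custom_property_names(data: bytes) -> list[str]:
    """Return the names of custom document properties, or [] if none are stored."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        if "docProps/custom.xml" not in z.namelist():
            return []
        root = ET.fromstring(z.read("docProps/custom.xml"))
    return [p.get("name", "") for p in root.findall(f"{{{CUSTOM_NS}}}property")]
```

Names such as a grant field or a reference-manager ID showing up here confirm the property is metadata and will go with custom.xml; anything typed into the text will not appear in this list.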
Is my unpublished manuscript uploaded anywhere?
On the public website, no. JSZip unpacks, strips, and repacks the file entirely in your browser, so an embargoed pre-print never leaves your machine. The optional API path is also upload-free: it dispatches to an @jadapps/runner on your own machine. A counter of runs (no content) is recorded for audit.
Does it strip the embedded thumbnail?
Yes — docProps/thumbnail.jpeg and docProps/thumbnail.png are deleted. The thumbnail is a rendered image of page one that file explorers and some upload systems display; removing it stops an un-anonymised title page from flashing in a preview.
Can I anonymise several manuscripts at once?
The picker allows multi-select (Pro allows up to 5 in a batch), but the browser processor cleans the first selected file per run and returns one cleaned copy. Run them one at a time, or script the local runner endpoint for true unattended batch processing of a folder.
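If you script the runner, the response handling might look like the sketch below. It covers only the decoding side, using the field names outputBase64 and outputBytes from the runner response shown in the cookbook; the request itself is left out because the multipart field name is not documented here. Treating outputBytes as the decoded length is an assumption, and both helper names are hypothetical.

```python
import base64
from pathlib import Path


def clean_name(source: Path) -> Path:
    """Build the output name: original stem plus a -clean suffix."""
    return source.with_name(f"{source.stem}-clean{source.suffix}")


def decode_output(response: dict) -> bytes:
    """Decode the cleaned file from a runner response."""
    data = base64.b64decode(response["outputBase64"])
    # Assumption: outputBytes is the size of the decoded file.
    if len(data) != response["outputBytes"]:
        raise ValueError("decoded size does not match outputBytes")
    return data
```

A batch loop would then call `clean_name(path).write_bytes(decode_output(resp))` for each manuscript in the folder, one request per file.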
What plan and file size do I need?
The tool requires the Pro plan or higher. File-size caps follow the Security family limits: Pro 100 MB, Pro-media 500 MB, Developer 2 GB per file. Manuscripts and supplements are almost always far under these unless they embed large figures or media.
Why does my name come back after I edit the cleaned file?
Word writes a fresh docProps/core.xml naming the current user as Creator the next time you open and save the document. The wipe was correct when it ran. Make the wipe the last step before upload and do not open-and-save the cleaned copy afterwards.
How do I confirm the file is anonymous?
Check File → Info → Properties in Word — Author, Company, and the comment pane should be blank. For certainty, copy the cleaned file to a .zip and list it: docProps/core.xml, app.xml, custom.xml, and the comment streams should be absent. The result panel also reports a removedEntries count you can record.
What about figures with GPS or camera EXIF?
The wiper strips document-level streams, not per-image metadata inside word/media/. A field photo can keep its GPS coordinates. Scrub images before inserting them — see gps-geotag-remover. For a PDF version of the paper, use pdf-history-sanitizer.
How does this compare to the other JAD metadata tools?
This tool handles OOXML Office containers only. For a PDF manuscript use pdf-history-sanitizer; for audio supplements (MP3) use audio-id3-ghoster; to scrub figures use gps-geotag-remover. All share the same browser-side, no-upload model.
Privacy first
Every JAD Security operation runs entirely in your browser. Files, passwords, and PGP private keys never leave your device — verified by zero outbound network requests during processing.