How to black-box candidate contact details on resume PDFs
- Step 1: Open the canonical PDF PII redactor. This Security entry routes to the real engine at /pdf-tools/pdf-pii-redactor. It is Pro-tier (minTier: pro), so a Free account can't run this particular redactor.
- Step 2: Upload one resume PDF with a text layer. Drop a single CV at a time (acceptsMultiple: false). Born-digital resumes exported from Word, Google Docs, or a CV builder have a real text layer. A scanned or photographed CV does not; OCR it first or it returns untouched.
- Step 3: Let the scanner walk every page. pdfjs reads each page's getTextContent() items and pdf-lib loads the same document. For each text item, the four patterns run in order (email, phone, SSN, card) and the first match flags the item.
- Step 4: Boxes land over the contact details. When the contact email or phone matches, a black rectangle is drawn at that item's x/y, spanning its full width and height (plus 2 pt). One box per matched item covers the whole run, so the value is fully covered on the page.
- Step 5: Download the redacted resume. The result is saved as a new PDF blob. Page count and layout are preserved; only black boxes are added over matched contact text. The candidate's name, skills, and history stay as-is (the tool does not detect names).
- Step 6: Flatten before you forward it to the panel. Critical: open the file and run Ctrl+A → copy. If the email/phone still pastes out, the glyphs are still there. Flatten (/pdf-tools/pdf-flatten) or rasterise so the boxes become pixels, then send the anonymous copy to reviewers.
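Steps 3 and 4 can be sketched as a small loop. This is a minimal illustration, not the tool's source: `TextItemLike` mirrors only the pdfjs text-item fields the sketch reads, `EMAIL_ONLY` is a hypothetical stand-in for the real pattern list, and treating "plus 2 pt" as extra box height is an assumption.

```typescript
// Minimal shape of a pdfjs text item: only the fields this sketch reads.
interface TextItemLike {
  str: string;
  // [a, b, c, d, e, f]; e and f are the item's x/y origin on the page.
  transform: [number, number, number, number, number, number];
  width: number;
  height: number;
}

interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical stand-in for the tool's pattern list (email only, for brevity).
const EMAIL_ONLY: RegExp[] = [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/];

// Assumption: the "plus 2 pt" in step 4 is added to the box height.
const PAD_PT = 2;

function boxesForPage(items: TextItemLike[], patterns: RegExp[]): Box[] {
  const boxes: Box[] = [];
  for (const item of items) {
    // The first matching pattern flags the item; later patterns are not tried.
    const flagged = patterns.some((p) => p.test(item.str));
    if (!flagged) continue;
    // One box per flagged item, covering the whole run, not just the match.
    boxes.push({
      x: item.transform[4],
      y: item.transform[5],
      width: item.width,
      height: item.height + PAD_PT,
    });
  }
  return boxes;
}
```

Each returned box has the shape that pdf-lib's `page.drawRectangle` accepts (`x`, `y`, `width`, `height`, plus a fill such as `color: rgb(0, 0, 0)`). Nothing in this loop touches the text itself, which is why step 6 matters.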
What gets boxed on a typical resume
These are the exact PII_PATTERNS the redactor runs against each text item, in order. On a CV the email and phone almost always match; the SSN/card patterns rarely fire unless a candidate listed an ID or long number. There are no toggles — the set is fixed in code.
| Resume field | Pattern that catches it | Caught? | Notes |
|---|---|---|---|
| Personal email (jane.doe@gmail.com) | email — local@domain.tld, 2+ letter TLD | Yes | Standard addresses always match; no DNS check |
| Phone (+1 (415) 555-0182) | phone — optional country/area code, loose grouping | Yes | Loose by design; can also catch other phone-shaped digit strings |
| Candidate name | (none — no name detection) | No | Names are out of scope; redact manually if needed |
| Street address / city | (none — no address detection) | No | Not detected; box it yourself with signature-burner |
| LinkedIn / portfolio URL | (none — URL is not a pattern) | Usually no | A bare URL with no @ won't match email; can still identify a candidate |
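The "first match wins" ordering behind this table can be sketched as follows. The regexes are illustrative stand-ins written for this example; the tool's actual PII_PATTERNS are not reproduced in this guide and may differ in detail.

```typescript
type PiiKind = "email" | "phone" | "ssn" | "card";

// Illustrative stand-ins only, listed in the order the guide describes.
const PATTERNS: ReadonlyArray<{ kind: PiiKind; re: RegExp }> = [
  { kind: "email", re: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/ },
  // Loose on purpose: optional country code, optional area code, flexible separators.
  { kind: "phone", re: /(?:\+?\d{1,3}[\s.-]?)?(?:\(?\d{3}\)?[\s.-]?)?\d{3}[\s.-]?\d{4}/ },
  { kind: "ssn", re: /\b\d{3}-\d{2}-\d{4}\b/ },
  { kind: "card", re: /\b(?:\d[ -]?){13,16}\b/ },
];

// Returns the kind of the first pattern that matches, or null if none do.
function firstMatch(text: string): PiiKind | null {
  for (const { kind, re } of PATTERNS) {
    if (re.test(text)) return kind;
  }
  return null;
}

console.log(firstMatch("jane.doe@gmail.com"));      // email
console.log(firstMatch("+1 (415) 555-0182"));       // phone
console.log(firstMatch("Jane Doe"));                // null (no name pattern)
console.log(firstMatch("linkedin.com/in/janedoe")); // null (no URL pattern)
```

Because there is no name, address, or URL entry in the list, those fields fall through to `null` and stay visible.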
Redaction behaviour — what it does vs. does NOT do
The most important table for recruiters: "visual" means a filled rectangle is drawn over the text; the characters underneath are not removed.
| Behaviour | Reality in this tool | Why it matters for a CV |
|---|---|---|
| Redaction method | Filled black drawRectangle over the matched item (pdf-lib), at pdfjs coordinates | Real ink, visible everywhere — but an overlay, not a deletion |
| Text removal | Not removed — glyphs stay in the content stream | A curious reviewer can Ctrl+A → copy the boxed email/phone unless you flatten first |
| Granularity | Whole text item, not the exact substring | If the contact line is one run, the whole line is boxed — adjacent text goes too |
| Name / address | Not detected (no name/address patterns) | Blind hiring still needs you to handle name/address by hand |
| Options | None (needsOptions: false) | You can't pick "email only" or change the box colour |
Tier and file limits (PDF family)
This redactor is gated at Pro (minTier: pro) and runs through the PDF tool family, so PDF-family limits apply. One file at a time.
| Tier | Max file size | Max pages | Files per run |
|---|---|---|---|
| Free | Tool gated — Pro required | — | — |
| Pro | 50 MB | 500 pages | 5 (this tool: 1 at a time) |
| Pro-media | 500 MB | 2,000 pages | 50 (this tool: 1 at a time) |
| Developer | 2 GB | 10,000 pages | Unlimited (this tool: 1 at a time) |
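For a pre-flight check in your own tooling, the table can be expressed as a lookup. This is a sketch under stated assumptions: the names `PDF_LIMITS` and `checkRun` are hypothetical, and 1 MB is taken as 1024 × 1024 bytes, which may not be how the tool counts.

```typescript
type Tier = "free" | "pro" | "pro-media" | "developer";

interface Limits {
  maxBytes: number;
  maxPages: number;
}

// Assumption: binary megabytes; the tool may count 10^6 bytes instead.
const MB = 1024 * 1024;

// null means the redactor is gated for that tier.
const PDF_LIMITS: Record<Tier, Limits | null> = {
  free: null,
  pro: { maxBytes: 50 * MB, maxPages: 500 },
  "pro-media": { maxBytes: 500 * MB, maxPages: 2000 },
  developer: { maxBytes: 2048 * MB, maxPages: 10000 },
};

// Returns a reason string when the run would be blocked, or null when it is allowed.
function checkRun(tier: Tier, sizeBytes: number, pages: number): string | null {
  const limits = PDF_LIMITS[tier];
  if (limits === null) return "Pro required";
  if (sizeBytes > limits.maxBytes) return "file too large for tier";
  if (pages > limits.maxPages) return "too many pages for tier";
  return null;
}
```

A typical two-page CV of a few hundred kilobytes passes on any paid tier; the limits mostly matter for scanned portfolios bundled into one PDF.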
Cookbook
Before/after snippets from real resume layouts. Candidate details are fabricated. "Before" is the page text; "After" shows what the panel sees once boxes are drawn — and what copy/paste still recovers underneath until you flatten.
A standard contact header
Born-digital CV exported from a builder — full text layer. The email and phone in the header both match and get boxed. The name is not detected, so it stays visible.
Before (header text):
    Jane Doe
    jane.doe@gmail.com | +1 (415) 555-0182
    Senior Product Designer

After (what the panel sees):
    Jane Doe
    ██████████████████ | ███████████████
    Senior Product Designer

Note: the NAME is still visible (no name pattern). Flattening does not hide it; box it manually with signature-burner if you also need the name hidden.
Contact line is a single text item
Redaction is per text item. If the email and phone sit in one run, the whole run is boxed — useful here because it hides both at once, but it also covers the ' | ' separator and any nearby text in that item.
Before (single pdfjs item):
    'jane.doe@gmail.com | +1 (415) 555-0182 | San Francisco'

After:
    '█████████████████████████████████████████████████████'

The whole item is boxed because it contained an email match, so the city got covered too. Reflow/copy recovers all of it.
Candidate listed a national ID number
Some CVs (outside the US) list a long ID number. A 13–16 digit run matches the card pattern and is boxed — over-redaction in the safe direction, but worth a glance so you don't hide a wanted field.
Before:
    Passport / ID: 4002 8812 3456 7890
    Email: jane.doe@gmail.com

After:
    Passport / ID: █████████████████████
    Email: ██████████████████

The 16-digit ID matched the card pattern (no Luhn check), so it was boxed alongside the email.
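The redactor itself applies no Luhn check, which is why any long digit run is boxed. If you want to judge afterwards whether a boxed number was plausibly a payment card or just an ID, a reviewer-side Luhn test is enough. The helper below is a hypothetical sketch and is not part of the tool.

```typescript
// Standard Luhn checksum: from the right, double every second digit,
// subtract 9 from any result above 9, and require the total to end in 0.
function passesLuhn(candidate: string): boolean {
  const digits = candidate.replace(/\D/g, "");
  if (digits.length === 0) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits.charAt(digits.length - 1 - i));
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

console.log(passesLuhn("4002 8812 3456 7890")); // false: the fabricated ID above is not a valid card number
console.log(passesLuhn("4111 1111 1111 1111")); // true: a well-known test card number
```

A failed checksum suggests the field was an ID the panel may legitimately need, so it is worth checking whether it should have been hidden at all.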
Scanned CV with no text layer
A CV someone scanned to PDF is just images — no text items. The redactor finds nothing and returns the file unchanged. OCR it first to add a text layer, or box the contact block manually.
Input: scanned_resume.pdf (image-only)
Scan result: 0 text items -> 0 matches -> 0 boxes
Output: identical pages, no redactions.

Fix path:
1. Run OCR via /pdf-tools/pdf-ocr to add a text layer
2. Re-run this redactor, OR
3. Burn manual rectangles with /security-tools/signature-burner
A portfolio URL slips through
A bare website or LinkedIn URL has no '@', so it doesn't match the email pattern, and it's not digits, so no other pattern fires. It stays visible and can still identify a candidate — box it by hand for true anonymity.
Before:
    jane.doe@gmail.com
    linkedin.com/in/janedoe    <- NOT redacted (no @, not digits)
    janedoe.design             <- NOT redacted

After:
    ██████████████████
    linkedin.com/in/janedoe    <- still visible
    janedoe.design             <- still visible

Mitigation: redact those lines manually with signature-burner.
Edge cases and what actually happens
Boxed email/phone is still copy-pasteable
By design (visual only): The tool draws a filled rectangle over each matched item; it does not delete glyphs. The code comment is explicit that "the glyphs underneath are still in the file's content stream." A reviewer can Ctrl+A → copy the contact details out. For a genuinely anonymous CV, flatten (/pdf-tools/pdf-flatten) or rasterise the output so the boxes become pixels, then verify with copy-paste before sending.
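The copy-paste verification can be made less error-prone with a small check: paste the Ctrl+A → copy result into a string and test whether any known contact value survives. `leakedValues` is a hypothetical helper for illustration, not something the tool ships.

```typescript
// Returns every secret that can still be found in the text copied out of the PDF.
// Whitespace and case are ignored because copy/paste often re-spaces text runs.
function leakedValues(extractedText: string, secrets: string[]): string[] {
  const normalise = (s: string): string => s.replace(/\s+/g, "").toLowerCase();
  const haystack = normalise(extractedText);
  return secrets.filter((secret) => haystack.includes(normalise(secret)));
}

const contact = ["jane.doe@gmail.com", "+1 (415) 555-0182"];

// Boxed but not flattened: the glyphs still paste out.
const boxedOnly = "Jane Doe jane.doe@gmail.com | +1 (415) 555-0182 Senior Product Designer";
console.log(leakedValues(boxedOnly, contact).length); // 2

// Rasterised output: nothing pastes from the page.
console.log(leakedValues("", contact).length); // 0
```

An empty result is the signal that the copy is safe to forward; any non-empty result means the overlay is still the only thing hiding the value.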
Candidate name is not redacted
Out of scope: Only email, phone, SSN, and 13–16 digit runs are detected. There is no name pattern in this tool (a registry FAQ mentions name patterns, but the code does not implement them). For blind hiring you still need to box the name and any photo manually with signature-burner.
Whole contact line is boxed, not just the email
Expected: Redaction is one box per matched text item. If pdfjs returns the email inside a longer run (jane.doe@gmail.com | San Francisco), the entire run is covered ("one redaction box per item is enough"). Usually convenient for a contact line, but it can hide a city or title you wanted to keep.
Scanned / image-only CV produces no redactions
No matches: Detection reads the text layer via pdfjs. A scanned CV is just images: zero text items, zero matches, nothing redacted. Add a text layer with PDF OCR first, then re-run, or box the header with signature-burner.
LinkedIn / portfolio URL stays visible
Missed identifier: A bare URL has no '@' so it doesn't match email, and it isn't a digit run, so nothing fires. URLs can still identify a candidate. There is no URL pattern in this tool, so redact those lines manually.
Email split across two text items
Missed match: Regexes run per item. If a builder split jane.doe@ and gmail.com into separate items (justified text, certain exports), neither fragment matches and nothing is boxed. Spot-check the header; flatten + re-OCR can re-flow text into single items.
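The split-item miss is easy to reproduce. In this sketch the email regex is an illustrative stand-in for the tool's pattern, and joining fragments is shown only to explain the failure, not as something the redactor does.

```typescript
// Illustrative email pattern; the tool's own regex may differ.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

// How the redactor sees the page: each text item is tested on its own.
function flaggedPerItem(items: string[]): boolean[] {
  return items.map((text) => EMAIL.test(text));
}

// What a reader sees: the fragments sit side by side and read as one address.
function flaggedWhenJoined(items: string[]): boolean {
  return EMAIL.test(items.join(""));
}

const fragments = ["jane.doe@", "gmail.com"];
console.log(flaggedPerItem(fragments));    // [false, false]: nothing is boxed
console.log(flaggedWhenJoined(fragments)); // true: the address is plainly there
```

Neither fragment is a complete address on its own, so a per-item scan has nothing to flag even though the rendered page shows the full email.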
Phone-shaped employee ID gets boxed
Over-redaction: The phone pattern is deliberately loose, so a number formatted like a phone (an internal reference, fax, or grouped ID on the CV) can be flagged. It hides rather than leaks, but eyeball the output if you needed that value readable.
Free tier can't run this tool
Pro required: The redactor is gated at minTier: pro. On Free the run is blocked before processing. Pro allows up to 50 MB / 500 pages; Developer raises that to 2 GB / 10,000 pages. One CV at a time.
Box sits slightly off rotated text
Visual mismatch possible: The rectangle is axis-aligned at the item's x/y with its width/height. On a CV with a rotated sidebar or unusual transforms, the box can land slightly off the glyphs. Always inspect the rendered output before treating a resume as anonymised.
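If you post-process pdfjs output yourself, rotated items can be spotted from the text item's transform matrix before trusting an axis-aligned box. This is a sketch: `isAxisAligned` and `rotationDegrees` are hypothetical helpers, and they rely only on the standard PDF matrix layout [a, b, c, d, e, f], where b and c are zero for unrotated, unskewed text.

```typescript
type Matrix = [number, number, number, number, number, number];

// Unrotated, unskewed text has no off-diagonal terms in its matrix.
function isAxisAligned(transform: Matrix, eps = 1e-6): boolean {
  const [, b, c] = transform;
  return Math.abs(b) < eps && Math.abs(c) < eps;
}

// Counter-clockwise rotation of the text baseline, in degrees.
function rotationDegrees(transform: Matrix): number {
  const [a, b] = transform;
  return (Math.atan2(b, a) * 180) / Math.PI;
}

const headerItem: Matrix = [12, 0, 0, 12, 72, 700];  // normal horizontal text
const sidebarItem: Matrix = [0, 12, -12, 0, 40, 300]; // sidebar text turned 90 degrees

console.log(isAxisAligned(headerItem));  // true
console.log(isAxisAligned(sidebarItem)); // false: inspect this box by eye
```

Items that fail the check are the ones to zoom in on in the rendered output, since an axis-aligned rectangle at their x/y may not sit over the glyphs.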
Frequently asked questions
Does this actually remove the candidate's email and phone?
Not on its own. The tool draws a black rectangle over each matched item with pdf-lib, but the characters stay in the PDF content stream — the code comment says so. A reviewer can Ctrl+A → copy the boxed contact details out. Treat it as a fast first pass, then flatten (/pdf-tools/pdf-flatten) or rasterise the file before forwarding it to the panel.
Will it hide the candidate's name for blind hiring?
No. There is no name detection in this tool — only email, phone, SSN, and 13–16 digit runs. For a truly blind CV you still need to box the name (and any photo) by hand with signature-burner after running this pass.
Why did it black out the whole contact line instead of just the email?
Redaction is per text item, not per substring. pdfjs returns text in runs, and if the email is inside a longer run the whole run gets one box ("one redaction box per item is enough"). For a contact line that usually helps — it hides the phone and city too — but check the output if you wanted some of that text visible.
Does it work on a scanned resume?
No. Detection reads the PDF text layer via pdfjs, and a scanned CV is just images — zero text items, nothing redacted. Run PDF OCR to add a text layer first, then re-run, or box the header manually with signature-burner.
Is the resume uploaded anywhere?
No. pdfjs reads the pages, pdf-lib draws the boxes, and the result is saved locally — all in your browser. The CV and its personal data never leave your device, which is what makes it usable for GDPR/CCPA-sensitive candidate handling.
Can I redact a whole folder of CVs at once?
Not in one run — this tool processes a single PDF at a time (acceptsMultiple: false). Run each resume separately. There is no batch redaction queue on this path.
Can I choose to redact only the email, or change the box colour?
No. The tool has no options (needsOptions: false). All four patterns always run and the box is always black. If you need configurable masking with [REDACTED_*] labels on text, use email-phone-scrubber instead.
What about a LinkedIn or portfolio URL on the CV?
A bare URL has no '@', so it doesn't match the email pattern, and it isn't digits, so nothing else fires — it stays visible. URLs can still identify a candidate. Box those lines manually with signature-burner.
Will it catch an international phone number?
Usually. The phone pattern allows an optional + country code, an optional area code, and loose digit grouping with spaces, dots, or dashes — so most international formats match. The same looseness means a phone-shaped reference number can also get boxed; eyeball the result.
How do I make the anonymised CV permanent?
Run this tool to place the boxes, then destroy the underlying text: flatten the PDF (/pdf-tools/pdf-flatten) or print-to-PDF as an image so each page becomes pixels. Re-verify with Ctrl+A → copy — if nothing pastes from the boxed areas, the contact details are gone. Only then send it to the panel.
What file size and page limits apply?
The redactor is gated at Pro. Pro allows up to 50 MB and 500 pages per PDF; Pro-media 500 MB / 2,000 pages; Developer 2 GB / 10,000 pages. Free accounts can't run this redactor. One file at a time.
My candidate data is in a spreadsheet, not a PDF — what then?
Use the text-native siblings. email-phone-scrubber replaces PII in pasted text or .txt with [REDACTED_*] labels, and csv-json-data-scrambler handles structured candidate rows. Those genuinely replace the values rather than covering them, since text formats have no glyph-layer problem.
Privacy first
Every JAD Security operation runs entirely in your browser. Files, passwords, and PGP private keys never leave your device — verified by zero outbound network requests during processing.