How to run a pre-publication secret scrub for Markdown
- Step 1: Run your CI secret scanner first. gitleaks or trufflehog catches high-entropy strings and provider prefixes (`sk-`, `ghp_`, `xoxb-`) that this tool has no patterns for. Treat their output as the authoritative gate.
- Step 2: Open the redactor for the readable pass. Go to /markdown-tools/md-secret-redactor and paste or drop the final draft of the doc you are about to publish.
- Step 3: Set scope to match where your secrets live. Leave `scanAll` off for code-sample-only docs. Enable it when credentials might appear in prose, inline backtick spans, or headings (e.g. a quickstart that says `Your token is ...` in a sentence).
- Step 4: Run and confirm placeholders appear. Matches become `[REDACTED]`, `[REDACTED_AWS_KEY]`, `[REDACTED_JWT]`, or `[REDACTED_PRIVATE_KEY]`. Their presence confirms the corresponding pattern fired.
- Step 5: Grep for the prefixes the tool cannot match. Search the scrubbed doc for `sk-`, `ghp_`, `xoxb-`, `sk_live_`, and 40-char base64 blobs. None of these are detected unless keyword-prefixed.
- Step 6: Rotate, commit, publish. Any real credential you found was compromised the moment it was committed, so rotate it. Then commit the scrubbed Markdown and publish.
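Step 5 can be scripted instead of grepped by hand. The sketch below is one possible way to do it; `findUnmatchedPrefixes`, `PrefixHit`, `CHECKS`, and the regexes are illustrative assumptions, not part of the redactor or of gitleaks, and the base64 check is deliberately loose.

```typescript
// Hypothetical helper for Step 5. Names and regexes are illustrative
// assumptions; they are not taken from the redactor or from gitleaks.
interface PrefixHit {
  line: number;  // 1-based line number in the scrubbed document
  label: string; // which check fired
  match: string; // the text that matched
}

const CHECKS: Array<{ label: string; pattern: RegExp }> = [
  { label: "sk_live_", pattern: /\bsk_live_[A-Za-z0-9]+/g },
  { label: "sk-", pattern: /\bsk-[A-Za-z0-9_-]+/g },
  { label: "ghp_", pattern: /\bghp_[A-Za-z0-9]+/g },
  { label: "xoxb-", pattern: /\bxoxb-[A-Za-z0-9-]+/g },
  // Loose stand-in for "40-char base64 blob": any run of 40+ base64 characters.
  { label: "base64-40+", pattern: /[A-Za-z0-9+\/]{40,}/g },
];

// Report every line of the scrubbed document that still contains a prefix
// the redactor has no pattern for.
function findUnmatchedPrefixes(doc: string): PrefixHit[] {
  const hits: PrefixHit[] = [];
  doc.split("\n").forEach((text, index) => {
    CHECKS.forEach((check) => {
      const found: string[] = text.match(check.pattern) || [];
      found.forEach((match) => {
        hits.push({ line: index + 1, label: check.label, match });
      });
    });
  });
  return hits;
}
```

An empty result only means none of these particular checks fired; the CI scanner remains the gate.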
What the redactor actually detects
The five regex patterns the redactor applies, in the exact order it applies them, taken from lib/markdown/markdown-engine.ts. There are no other patterns — anything not matched here is left untouched.
| Pattern (what it matches) | Example that matches | Replaced with | Order |
|---|---|---|---|
| AWS access key id: AKIA + 16 uppercase letters/digits (case-sensitive) | AKIAIOSFODNN7EXAMPLE | [REDACTED_AWS_KEY] | 1 |
| Keyword assignment: api_key, api-key, apikey, token, secret, password, passwd, pwd, authorization followed by =/:/space, then an 8+ char value | api_key = abcd12345678 | `<keyword>=[REDACTED]` (separator normalized to =) | 2 |
| Bearer + an 8+ char token | Bearer eyJhbGci... | Bearer [REDACTED] | 3 |
| Three-segment JWT: eyJ + 10+ chars, dot, 10+ chars, dot, 10+ chars | eyJhbGci....eyJzdWIi....SflKxw... | [REDACTED_JWT] | 4 |
| PEM private-key block: -----BEGIN ... KEY----- ... -----END ... KEY----- | an RSA/EC/OPENSSH key block | [REDACTED_PRIVATE_KEY] | 5 |
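To make the ordering concrete, here is a minimal sketch of a five-rule pipeline reconstructed from the table alone. It is not the source of lib/markdown/markdown-engine.ts: the regexes, the `RULES` array, and the `redactSecrets` name are assumptions, and the real patterns may differ in character classes and flags.

```typescript
// Illustrative reconstruction from the table above. The real regexes in
// lib/markdown/markdown-engine.ts are not shown in this document and may differ.
const RULES: Array<{ pattern: RegExp; replacement: string }> = [
  // 1. AWS access key id: AKIA + 16 uppercase letters/digits, case-sensitive.
  { pattern: /AKIA[0-9A-Z]{16}/g, replacement: "[REDACTED_AWS_KEY]" },
  // 2. Keyword, then =, : or whitespace, then an 8+ char value.
  //    The separator is normalized to "=" and an opening quote is consumed.
  {
    pattern: /(api[_-]?key|token|secret|password|passwd|pwd|authorization)(?:[ \t]*[=:][ \t]*|[ \t]+)["']?[^\s"'`]{8,}/gi,
    replacement: "$1=[REDACTED]",
  },
  // 3. Bearer + an 8+ char token.
  { pattern: /Bearer\s+[A-Za-z0-9._~+\/=-]{8,}/g, replacement: "Bearer [REDACTED]" },
  // 4. Three-segment JWT on a single line.
  {
    pattern: /eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}/g,
    replacement: "[REDACTED_JWT]",
  },
  // 5. PEM key block, from the BEGIN line through the END line.
  {
    pattern: /-----BEGIN [A-Z ]*KEY-----[\s\S]*?-----END [A-Z ]*KEY-----/g,
    replacement: "[REDACTED_PRIVATE_KEY]",
  },
];

// Apply each rule to the whole text, in table order.
function redactSecrets(text: string): string {
  let out = text;
  for (const rule of RULES) {
    out = out.replace(rule.pattern, rule.replacement);
  }
  return out;
}
```

Under these assumed patterns, `Authorization: Bearer <token>` falls through rule 2 (the word `Bearer` is shorter than 8 characters) and is caught by rule 3, which is consistent with the Bearer placeholder shown in the table.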
Scope: which parts of the document are scanned
The single scanAll option controls scope. When off (the default), redaction is restricted to fenced ``` code blocks only; when on, the entire document is scanned. Inline backtick spans and 4-space indented code are treated as prose, not code blocks.
| Document region | scanAll: false (default) | scanAll: true |
|---|---|---|
| Fenced ``` code block | Scanned | Scanned |
| Prose / paragraph text | Left untouched | Scanned |
| Inline backtick code span | Left untouched (counts as prose) | Scanned |
| 4-space indented code block | Left untouched (only ``` fences count) | Scanned |
| Headings, tables, blockquotes | Left untouched | Scanned |
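The scope rule in the table can be sketched as a thin wrapper around any redaction function. This is an assumption-laden illustration, not the engine's code: `applyScope`, `FENCE`, and `FENCED_BLOCK` are hypothetical names, and the fence matcher only recognizes unindented three-backtick fences.

```typescript
// Three backticks, spelled with escapes so this sample nests inside Markdown.
const FENCE = "\u0060\u0060\u0060";

// Hypothetical fenced-block matcher: opening fence line, lazy body, closing fence line.
const FENCED_BLOCK = new RegExp(
  "^" + FENCE + "[^\\n]*\\n[\\s\\S]*?^" + FENCE + "[ \\t]*$",
  "gm",
);

// With scanAll off, only fenced blocks are passed to `redact`; prose, inline
// backtick spans, and 4-space indented code are returned untouched.
function applyScope(
  markdown: string,
  redact: (text: string) => string,
  scanAll: boolean = false,
): string {
  if (scanAll) {
    return redact(markdown);
  }
  return markdown.replace(FENCED_BLOCK, (block) => redact(block));
}
```

Passing the redaction step in as a callback keeps the scope decision separate from the patterns, so the same wrapper works with any set of rules.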
Defense-in-depth: this tool vs. a CI scanner
The redactor and a CI scanner cover different gaps. Use both. This table maps where each one wins.
| Capability | Secret Redactor | gitleaks / trufflehog |
|---|---|---|
| In-place readable redaction in the doc | Yes (placeholders) | No (reports findings only) |
| Provider prefixes (sk-, ghp_, xoxb-) | Only if keyword-prefixed | Yes (dedicated rules) |
| High-entropy / base64 detection | No | Yes |
| Scans Git history | No | Yes |
| Runs offline in browser, no upload | Yes | Local CLI (also offline) |
| AWS AKIA key id | Yes | Yes |
Cookbook
Publish-checklist scenarios taken from real open-source docs, run against the actual engine.
Sample request with a Bearer header
The classic README leak. The Bearer pattern matches the token and replaces it, leaving the header shape intact for readers.
Input:

```http
GET /v1/account
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMifQ.SflKxwRJSMeKKF2QT4
```

Output:

```http
GET /v1/account
Authorization: Bearer [REDACTED]
```
A .env snippet in the quickstart
Keyword assignments are the most reliable hits. SECRET, API_KEY, and PASSWORD all trigger the keyword pattern; the value is replaced and the separator normalized to =.
Input:

```dotenv
API_KEY=abcd1234efgh5678
DB_PASSWORD = s3cr3tlongvalue
PORT=3000
```

Output:

```dotenv
API_KEY=[REDACTED]
DB_PASSWORD=[REDACTED]
PORT=3000
```
A GitHub token that slips through
A bare ghp_... token on its own line has no matching pattern. This is exactly why you run gitleaks first — it has a dedicated GitHub rule.
Input:

```bash
git clone https://ghp_aBcD1234aBcD1234aBcD1234aBcD1234abcd@github.com/me/repo
```

Output (unchanged, no ghp_ pattern):

```bash
git clone https://ghp_aBcD1234aBcD1234aBcD1234aBcD1234abcd@github.com/me/repo
```
Secret in prose — enable scanAll
A quickstart that names a token in a sentence needs scanAll, because default scope is fenced blocks only.
Input:

Set `password = m2x9longsecret` in your config before running.

scanAll: false → unchanged (it's prose + inline code)

scanAll: true → Set `password=[REDACTED]` in your config before running.
PEM private key pasted into a troubleshooting section
The whole block from BEGIN to END is replaced with a single placeholder, regardless of key type (RSA, EC, OPENSSH).
Input:

```
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAAB
-----END OPENSSH PRIVATE KEY-----
```

Output:

```
[REDACTED_PRIVATE_KEY]
```
Edge cases and what actually happens
gitleaks flagged a key this tool missed
Expected. By design, they overlap only partially. The redactor has no entropy or provider-prefix rules; trust the CI scanner as the gate and use this for readable in-doc redaction.
Bare provider key (sk-, ghp_, xoxb-, sk_live_)
Not detected. No dedicated patterns exist. Caught only when keyword-prefixed (token=ghp_...). Grep for these prefixes before publishing.
Secret only in prose, scanAll off
Preserved. By design, the default scope is fenced code blocks. Enable scanAll for docs that name credentials in sentences.
40-char base64 secret with no keyword
Not detected. Indistinguishable from random data; no pattern matches it. This is gitleaks' entropy job.
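For intuition about what an entropy check measures, here is a minimal Shannon-entropy sketch. It is a generic textbook calculation, not gitleaks' implementation, and `shannonEntropy` is a hypothetical name; any threshold you compare against would be your own assumption.

```typescript
// Shannon entropy in bits per character. Random-looking secrets tend to score
// higher than words or placeholder text of similar length.
function shannonEntropy(value: string): number {
  const counts = new Map<string, number>();
  for (let i = 0; i < value.length; i++) {
    const ch = value.charAt(i);
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  let bits = 0;
  counts.forEach((count) => {
    const p = count / value.length;
    bits -= (p * Math.log(p)) / Math.LN2;
  });
  return bits;
}
```

A pattern-based tool like the redactor has no equivalent of this score, which is why a keyword-less blob passes through it unchanged.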
Multi-line / wrapped JWT
Not detected. The JWT pattern needs eyJ... with three dotted segments on one line. Wrapped tokens are missed.
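One way to spot a wrapped token before publishing is to compare JWT matches before and after joining broken lines. The sketch below assumes a single-line JWT regex shaped like the description above; `hasWrappedJwt` and `countJwts` are hypothetical helpers, not features of the redactor.

```typescript
// Assumed single-line JWT shape: eyJ + 10+ chars, dot, 10+ chars, dot, 10+ chars.
const JWT = /eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}/g;

function countJwts(text: string): number {
  return (text.match(JWT) || []).length;
}

// True when joining line breaks between token-like characters reveals a JWT
// that the single-line pattern could not see in the original text.
function hasWrappedJwt(text: string): boolean {
  const unwrapped = text.replace(
    /([A-Za-z0-9_.-])[ \t]*\r?\n[ \t]*(?=[A-Za-z0-9_.-])/g,
    "$1",
  );
  return countJwts(unwrapped) > countJwts(text);
}
```

If this returns true, rejoin the token onto one line before running the redactor, or remove it by hand.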
Placeholder value gets redacted
Expected. `api_key = your-key-here-now` matches the keyword pattern and is redacted. Over-redaction of placeholders is harmless but worth a glance.
Trailing quote left behind
By design. Quoted values like secret: "x" become secret=[REDACTED]". Cosmetic; clean up if it matters.
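If the stray quote matters, a small post-processing pass can remove it. This is a hypothetical cleanup step to run on the redactor's output, not something the tool does itself; `tidyTrailingQuotes` is an assumed name.

```typescript
// Remove a single quote character left directly after a keyword placeholder,
// e.g. secret=[REDACTED]" becomes secret=[REDACTED].
// Only the "=[REDACTED]" form is touched, so a quoted "Bearer [REDACTED]" stays intact.
function tidyTrailingQuotes(text: string): string {
  return text.replace(/=\[REDACTED\]["']/g, "=[REDACTED]");
}
```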
Doc exceeds Free limits
Rejected. Free is 1 MB / 500,000 chars / 1 file. Split a large docs bundle with md-splitter or upgrade.
Secret lives in committed Git history
Out of scope. Only the current text is rewritten. Purge history with BFG/git-filter-repo and rotate.
Frequently asked questions
Can this replace gitleaks or trufflehog?
No. It has no entropy detection and no provider-prefix rules. Use a CI scanner as the gate and this tool for readable in-document redaction.
What does it actually detect?
AWS AKIA key ids, keyword assignments (api_key/token/secret/password/passwd/pwd/authorization + 8+ char value), Bearer tokens, three-segment eyJ... JWTs, and PEM key blocks.
Why wasn't my GitHub token redacted?
There is no ghp_ pattern. A bare token is only redacted when a keyword precedes it. gitleaks has a dedicated rule for this.
Does it scan my whole doc by default?
No. Default scope is fenced ``` code blocks only. Enable scanAll to scan prose, inline code, and headings.
Is it safe for pre-release docs with live keys?
Yes. It runs in your browser; the document is never uploaded.
Does it scan Git history?
No. It rewrites the current document only. Use BFG or git-filter-repo for history, and rotate leaked keys.
Can I configure custom patterns?
No. The only option is the scanAll boolean.
Will it touch my prose narrative?
Not by default — only fenced code blocks are scanned unless scanAll is on.
What placeholders does it use?
[REDACTED], [REDACTED_AWS_KEY], [REDACTED_JWT], and [REDACTED_PRIVATE_KEY].
Can it process my whole /docs folder at once?
No. acceptsMultiple is false — one document per run. For multi-file flows, scrub each file or merge first with md-merger.
Does redacting the doc fix the leak?
No. A committed key is already exposed. Rotate it; redaction only prevents future re-exposure in the published doc.
What else should I run before publishing?
Lint with md-lint, validate links with md-link-validator, and strip emoji with md-emoji-remover.
Privacy first
All Markdown processing runs locally in your browser using JavaScript. No file is ever uploaded to JAD Apps servers — only metadata counters are saved for signed-in dashboard stats.