Scrub Secrets Before Publishing Docs — Free Pre-Publish Check

How to run a pre-publication secret scrub for Markdown

Step 1
Run your CI secret scanner first — gitleaks or trufflehog catches high-entropy strings and provider prefixes (sk-, ghp_, xoxb-) this tool has no patterns for. Treat their output as the authoritative gate.
Step 2
Open the redactor for the readable pass — Go to /markdown-tools/md-secret-redactor and paste or drop the final draft of the doc you are about to publish.
Step 3
Set scope to match where your secrets live — Leave scanAll off for code-sample-only docs. Enable it when credentials might appear in prose, inline backtick spans, or headings (e.g. a quickstart that says Your token is ... in a sentence).
Step 4
Run and confirm placeholders appear — Matches become [REDACTED], [REDACTED_AWS_KEY], [REDACTED_JWT], or [REDACTED_PRIVATE_KEY]. Their presence confirms the corresponding pattern fired.
Step 5
Grep for the prefixes the tool cannot match — Search the scrubbed doc for sk-, ghp_, xoxb-, sk_live_, and 40-char base64 blobs. None of these are detected unless keyword-prefixed.
Step 6
Rotate, commit, publish — Any real credential you found is compromised the moment it was committed — rotate it. Then commit the scrubbed Markdown and publish.
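
Step 5 above can be scripted rather than done by eye. The following is a minimal TypeScript sketch, not part of the tool: `findLeftovers` and its regexes are hypothetical, and the patterns only approximate the prefixes named in the step (they are not official provider token formats).

```typescript
// Hypothetical helper: list lines of a scrubbed document that still contain
// provider-style prefixes or long base64-looking runs.
// The regexes are illustrative assumptions, not official token formats.
const leftoverPatterns: Array<{ label: string; regex: RegExp }> = [
  { label: "sk_live_", regex: /sk_live_[A-Za-z0-9]+/ },
  { label: "sk-", regex: /\bsk-[A-Za-z0-9_-]{8,}/ },
  { label: "ghp_", regex: /ghp_[A-Za-z0-9]+/ },
  { label: "xoxb-", regex: /xoxb-[A-Za-z0-9-]+/ },
  { label: "40-char base64 blob", regex: /[A-Za-z0-9+\/]{40,}={0,2}/ },
];

interface LeftoverHit {
  line: number; // 1-based line number in the document
  label: string; // which pattern fired
}

function findLeftovers(markdown: string): LeftoverHit[] {
  const hits: LeftoverHit[] = [];
  markdown.split(/\r?\n/).forEach((text, index) => {
    for (const { label, regex } of leftoverPatterns) {
      if (regex.test(text)) hits.push({ line: index + 1, label });
    }
  });
  return hits;
}

// Example: a scrubbed doc where a GitHub token survived inside a clone URL.
const scrubbedSample = [
  "API_KEY=[REDACTED]",
  "git clone https://ghp_aBcD1234aBcD1234aBcD1234aBcD1234abcd@github.com/me/repo",
  "PORT=3000",
].join("\n");

console.log(findLeftovers(scrubbedSample));
```

For the sample above, the helper reports a single hit on line 2 with the label `ghp_`. An empty result only means these particular patterns found nothing; the CI scanner remains the gate.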

What the redactor actually detects

The redactor applies five regex patterns, listed here in the exact order it applies them (taken from lib/markdown/markdown-engine.ts). There are no other patterns; anything not matched here is left untouched.

| Pattern (what it matches) | Example that matches | Replaced with | Order |
| --- | --- | --- | --- |
| AWS access key id: `AKIA` + 16 uppercase letters/digits (case-sensitive) | `AKIAIOSFODNN7EXAMPLE` | `[REDACTED_AWS_KEY]` | 1 |
| Keyword assignment: `api_key`, `api-key`, `apikey`, `token`, `secret`, `password`, `passwd`, `pwd`, `authorization` followed by `=`/`:`/space, then an 8+ char value | `api_key = abcd12345678` | `<keyword>=[REDACTED]` (separator normalized to `=`) | 2 |
| `Bearer` + an 8+ char token | `Bearer eyJhbGci...` | `Bearer [REDACTED]` | 3 |
| Three-segment JWT: `eyJ` + 10+ chars, dot, 10+ chars, dot, 10+ chars | `eyJhbGci....eyJzdWIi....SflKxw...` | `[REDACTED_JWT]` | 4 |
| PEM private-key block: `-----BEGIN ... KEY-----` ... `-----END ... KEY-----` | an RSA/EC/OPENSSH key block | `[REDACTED_PRIVATE_KEY]` | 5 |
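
To make the ordering concrete, here is a rough TypeScript sketch of how five ordered rules like those in the table could be applied. The regexes are reconstructed from the descriptions in the table and are assumptions; they are not copied from lib/markdown/markdown-engine.ts, and `sketchRules` and `redactSketch` are hypothetical names.

```typescript
// Illustrative reconstruction of the five rules described in the table.
// Every regex here is an assumption based on the prose description,
// NOT the actual source of lib/markdown/markdown-engine.ts.
type RedactionRule = { pattern: RegExp; replacement: string };

const sketchRules: RedactionRule[] = [
  // 1. AWS access key id: AKIA + 16 uppercase letters/digits (case-sensitive)
  { pattern: /AKIA[0-9A-Z]{16}/g, replacement: "[REDACTED_AWS_KEY]" },
  // 2. Keyword assignment; the separator is normalized to "=".
  //    The optional opening quote is a guess that would explain a leftover closing quote.
  {
    pattern: /(api[_-]?key|token|secret|password|passwd|pwd|authorization)[ \t]*[=: ][ \t]*["']?[^\s"'`]{8,}/gi,
    replacement: "$1=[REDACTED]",
  },
  // 3. Bearer + an 8+ char token
  { pattern: /Bearer\s+[A-Za-z0-9._~+\/=-]{8,}/g, replacement: "Bearer [REDACTED]" },
  // 4. Three-segment JWT on a single line
  {
    pattern: /eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}/g,
    replacement: "[REDACTED_JWT]",
  },
  // 5. PEM private-key block, BEGIN through END, across lines
  {
    pattern: /-----BEGIN [A-Z ]*KEY-----[\s\S]*?-----END [A-Z ]*KEY-----/g,
    replacement: "[REDACTED_PRIVATE_KEY]",
  },
];

// Apply the rules one after another, in table order.
function redactSketch(text: string): string {
  let result = text;
  for (const rule of sketchRules) {
    result = result.replace(rule.pattern, rule.replacement);
  }
  return result;
}

console.log(redactSketch("api_key = abcd12345678")); // api_key=[REDACTED]
console.log(redactSketch("AKIAIOSFODNN7EXAMPLE")); // [REDACTED_AWS_KEY]
```

In this sketch a bare `ghp_...` token passes through unchanged because none of the five rules mentions that prefix, which mirrors the limitation described throughout this guide.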

Scope: which parts of the document are scanned

The single scanAll option controls scope. When off (the default), redaction is restricted to fenced ``` code blocks only; when on, the entire document is scanned. Inline `backtick` spans and 4-space indented code are treated as prose, not code blocks.

| Document region | scanAll: false (default) | scanAll: true |
| --- | --- | --- |
| Fenced ``` code block | Scanned | Scanned |
| Prose / paragraph text | Left untouched | Scanned |
| Inline `backtick` code span | Left untouched (counts as prose) | Scanned |
| 4-space indented code block | Left untouched (only ``` fences count) | Scanned |
| Headings, tables, blockquotes | Left untouched | Scanned |
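
One way to picture the two scopes is the sketch below. It is an assumption about how such scoping could work, not the engine's code: `applyScope` and `redactKeywords` are hypothetical names, and the keyword regex is a simplified stand-in for the full rule set.

```typescript
// Simplified stand-in for the redaction rules (assumption, keyword rule only).
function redactKeywords(text: string): string {
  return text.replace(
    /(password|token|secret)[ \t]*[=: ][ \t]*[^\s"'`]{8,}/gi,
    "$1=[REDACTED]"
  );
}

// Hypothetical scoping: with scanAll off, only text inside ``` fences is passed
// to the redaction rules; everything else is returned untouched.
function applyScope(markdown: string, scanAll: boolean): string {
  if (scanAll) return redactKeywords(markdown);
  const fencedBlock = /^```[^\n]*\n[\s\S]*?^```[ \t]*$/gm;
  return markdown.replace(fencedBlock, (block) => redactKeywords(block));
}

const scopeSample = [
  "Set `password = m2x9longsecret` in your config.",
  "",
  "```dotenv",
  "DB_PASSWORD = s3cr3tlongvalue",
  "```",
].join("\n");

console.log(applyScope(scopeSample, false)); // only the fenced line is redacted
console.log(applyScope(scopeSample, true)); // the inline span is redacted too
```

With `scanAll` set to `false`, the inline `password = ...` span in the first line survives, which is the behaviour the table describes for prose and inline code.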

Defense-in-depth: this tool vs. a CI scanner

The redactor and a CI scanner cover different gaps. Use both. This table maps where each one wins.

| Capability | Secret Redactor | gitleaks / trufflehog |
| --- | --- | --- |
| In-place readable redaction in the doc | Yes (placeholders) | No (reports findings only) |
| Provider prefixes (sk-, ghp_, xoxb-) | Only if keyword-prefixed | Yes (dedicated rules) |
| High-entropy / base64 detection | No | Yes |
| Scans Git history | No | Yes |
| Runs offline in browser, no upload | Yes | Local CLI (also offline) |
| AWS AKIA key id | Yes | Yes |

Cookbook

Publish-checklist scenarios taken from real open-source docs, run against the actual engine.

Sample request with a Bearer header

The classic README leak. The Bearer pattern matches the token and replaces it, leaving the header shape intact for readers.

Input:
```http
GET /v1/account
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMifQ.SflKxwRJSMeKKF2QT4
```

Output:
```http
GET /v1/account
Authorization: Bearer [REDACTED]
```

A .env snippet in the quickstart

Keyword assignments are the most reliable hits. SECRET, API_KEY, and PASSWORD all trigger the keyword pattern; the value is replaced and the separator normalized to =.

Input:
```dotenv
API_KEY=abcd1234efgh5678
DB_PASSWORD = s3cr3tlongvalue
PORT=3000
```

Output:
```dotenv
API_KEY=[REDACTED]
DB_PASSWORD=[REDACTED]
PORT=3000
```

A GitHub token that slips through

A bare ghp_... token, here embedded in a clone URL, has no matching pattern. This is exactly why you run gitleaks first: it has a dedicated GitHub rule.

Input:
```bash
git clone https://ghp_aBcD1234aBcD1234aBcD1234aBcD1234abcd@github.com/me/repo
```

Output (unchanged — no ghp_ pattern):
```bash
git clone https://ghp_aBcD1234aBcD1234aBcD1234aBcD1234abcd@github.com/me/repo
```

Secret in prose — enable scanAll

A quickstart that names a token in a sentence needs scanAll, because default scope is fenced blocks only.

Input:
Set `password = m2x9longsecret` in your config before running.

scanAll: false → unchanged (it's prose + inline code)

scanAll: true →
Set `password=[REDACTED]` in your config before running.

PEM private key pasted into a troubleshooting section

The whole block from BEGIN to END is replaced with a single placeholder, regardless of key type (RSA, EC, OPENSSH).

Input:
```
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAAB
-----END OPENSSH PRIVATE KEY-----
```

Output:
```
[REDACTED_PRIVATE_KEY]
```

Edge cases and what actually happens

gitleaks flagged a key this tool missed

Expected

By design they overlap only partially. The redactor has no entropy or provider-prefix rules; trust the CI scanner as the gate and use this for readable in-doc redaction.

Bare provider key (sk-, ghp_, xoxb-, sk_live_)

Not detected

No dedicated patterns exist. Caught only when keyword-prefixed (token=ghp_...). Grep for these prefixes before publishing.

Secret only in prose, scanAll off

Preserved

By design — default scope is fenced code blocks. Enable scanAll for docs that name credentials in sentences.

40-char base64 secret with no keyword

Not detected

Indistinguishable from random data; no pattern matches it. This is gitleaks' entropy job.
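
For readers unfamiliar with entropy checks, the idea can be sketched as a Shannon entropy score per string. This is a generic illustration in TypeScript, not gitleaks' actual algorithm; `shannonEntropy` is a hypothetical helper and the sample blob is made up.

```typescript
// Shannon entropy in bits per character: higher means the characters are
// spread more evenly, which is typical of random keys.
// Generic illustration only; real scanners combine this with other rules.
function shannonEntropy(value: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < value.length; i++) {
    const ch = value.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let bits = 0;
  Object.keys(counts).forEach((ch) => {
    const occurrences = counts[ch] || 0;
    if (occurrences === 0) return;
    const p = occurrences / value.length;
    bits -= p * (Math.log(p) / Math.LN2); // log base 2
  });
  return bits;
}

// A made-up 40-character blob versus an ordinary config line.
const randomLookingBlob = "Zx8Qm2Lr7Tn4Vb9Kc1Hs6Yd3Wf5Gj0PuEaRiOoNt";
console.log(shannonEntropy(randomLookingBlob)); // high: every character is distinct
console.log(shannonEntropy("PORT=3000")); // noticeably lower
```

A regex-only tool has no equivalent of this score, which is why a keyword-less blob passes through it untouched.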

Multi-line / wrapped JWT

Not detected

The JWT pattern needs eyJ... with three dotted segments on one line. Wrapped tokens are missed.
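
The miss is easy to reproduce with a single-line pattern. The TypeScript sketch below uses an assumed regex built from the table's description of the JWT rule (not the engine's own), plus a hypothetical `hasUnredactedJwtFragment` check you could run afterwards to catch wrapped leftovers for manual review.

```typescript
// Assumed single-line JWT pattern, reconstructed from the rule's description.
const jwtOneLinePattern = /eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}/;

// Looser, hypothetical follow-up check: any "eyJ" run of base64url characters.
const jwtFragmentPattern = /eyJ[A-Za-z0-9_-]{10,}/;

function hasUnredactedJwtFragment(text: string): boolean {
  return jwtFragmentPattern.test(text);
}

const oneLineJwt = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMifQ.SflKxwRJSMeKKF2QT4";
// Same token, hard-wrapped in the middle of the second segment.
const wrappedJwt = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIi\nOiIxMjMifQ.SflKxwRJSMeKKF2QT4";

console.log(jwtOneLinePattern.test(oneLineJwt)); // true
console.log(jwtOneLinePattern.test(wrappedJwt)); // false: the line break splits a segment
console.log(hasUnredactedJwtFragment(wrappedJwt)); // true: worth a manual look
```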

Placeholder value gets redacted

Expected

api_key = your-key-here-now matches the keyword pattern and is redacted. Over-redaction of placeholders is harmless but worth a glance.

Trailing quote left behind

By design

Quoted values like secret: "x" become secret=[REDACTED]". Cosmetic; clean up if it matters.
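
If the stray quote matters, a small post-processing pass can remove it. This is a hypothetical TypeScript helper (`tidyDanglingQuotes` is not part of the tool) that strips a quote directly after `[REDACTED]` only when the placeholder is not itself preceded by a quote.

```typescript
// Hypothetical cleanup: drop a closing quote left right after [REDACTED],
// but leave fully quoted placeholders such as "[REDACTED]" alone.
function tidyDanglingQuotes(redacted: string): string {
  return redacted.replace(/(^|[^"'])(\[REDACTED\])["']/gm, "$1$2");
}

console.log(tidyDanglingQuotes('secret=[REDACTED]"')); // secret=[REDACTED]
console.log(tidyDanglingQuotes('note: "[REDACTED]"')); // unchanged
```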

Doc exceeds Free limits

Rejected

Free is 1 MB / 500,000 chars / 1 file. Split a large docs bundle with md-splitter or upgrade.
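
A quick pre-check can tell you whether a document is likely to be rejected before you paste it. The TypeScript sketch below rests on two assumptions that the page does not confirm: that 1 MB means 1,048,576 UTF-8 bytes, and that "chars" means JavaScript string length. `checkFreeLimits` and `utf8ByteLength` are hypothetical names.

```typescript
// Assumed interpretation of the Free limits (not confirmed by the tool):
// 1 MB = 1,048,576 UTF-8 bytes; 500,000 chars = String.length (UTF-16 units).
const FREE_MAX_BYTES = 1024 * 1024;
const FREE_MAX_CHARS = 500000;

// Count UTF-8 bytes without relying on platform APIs.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i += 1; // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

function checkFreeLimits(text: string): { chars: number; bytes: number; withinLimits: boolean } {
  const chars = text.length;
  const bytes = utf8ByteLength(text);
  return { chars, bytes, withinLimits: chars <= FREE_MAX_CHARS && bytes <= FREE_MAX_BYTES };
}

console.log(checkFreeLimits("PORT=3000")); // { chars: 9, bytes: 9, withinLimits: true }
```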

Secret lives in committed Git history

Out of scope

Only the current text is rewritten. Purge history with BFG/git-filter-repo and rotate.

Frequently asked questions

Can this replace gitleaks or trufflehog?

No. It has no entropy detection and no provider-prefix rules. Use a CI scanner as the gate and this tool for readable in-document redaction.

What does it actually detect?

AWS AKIA key ids, keyword assignments (api_key/token/secret/password/passwd/pwd/authorization + 8+ char value), Bearer tokens, three-segment eyJ... JWTs, and PEM key blocks.

Why wasn't my GitHub token redacted?

There is no ghp_ pattern. A bare token is only redacted when a keyword precedes it. gitleaks has a dedicated rule for this.

Does it scan my whole doc by default?

No. Default scope is fenced ``` code blocks only. Enable `scanAll` to scan prose, inline code, and headings.

Is it safe for pre-release docs with live keys?

Yes. It runs in your browser; the document is never uploaded.

Does it scan Git history?

No. It rewrites the current document only. Use BFG or git-filter-repo for history, and rotate leaked keys.

Can I configure custom patterns?

No. The only option is the scanAll boolean.

Will it touch my prose narrative?

Not by default — only fenced code blocks are scanned unless scanAll is on.

What placeholders does it use?

[REDACTED], [REDACTED_AWS_KEY], [REDACTED_JWT], and [REDACTED_PRIVATE_KEY].

Can it process my whole /docs folder at once?

No. acceptsMultiple is false — one document per run. For multi-file flows, scrub each file or merge first with md-merger.

Does redacting the doc fix the leak?

No. A committed key is already exposed. Rotate it; redaction only prevents future re-exposure in the published doc.

What else should I run before publishing?

Lint with md-lint, validate links with md-link-validator, and strip emoji with md-emoji-remover.

Privacy first

All Markdown processing runs locally in your browser using JavaScript. No file is ever uploaded to JAD Apps servers — only metadata counters are saved for signed-in dashboard stats.

