How to detect and tag code block languages
- Step 1: Paste or upload your Markdown. Paste a doc into the text box, or switch to Upload file and pick one .md file. The tool processes a single document per run (acceptsMultiple is false for this tool).
- Step 2: Run Code Tagger. Click Run. There are no options to set: this tool exposes no controls, so the same heuristic pass runs every time.
- Step 3: Detection scans untagged fences. Each opening fence with nothing after the three backticks is matched; the first line of its body is tested against the language patterns in order.
- Step 4: A hint is inserted or the block stays bare. On the first matching pattern the tool appends the hint to the opening fence (for example, three backticks followed by python); if no pattern matches, the fence is left untagged rather than guessed.
- Step 5: Review the diff. Spot-check blocks against the language matrix below; short snippets and unsupported languages (TypeScript, YAML, Bash, CSS) will not be tagged.
- Step 6: Copy or download the result. Copy the tagged Markdown or download it as .md, then commit. Re-running later is safe because tagged blocks are skipped.
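For the review step, a small helper can list exactly which lines the tagger changed. This is a minimal sketch, not part of the tool: `changed_fences` is a hypothetical name, and it assumes the tagger only rewrites opening fence lines and never adds or removes lines.

```python
from typing import List, Tuple

FENCE = "`" * 3  # three backticks, built this way so the sketch can sit inside a fenced block


def changed_fences(before: str, after: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, old_line, new_line) for every line the tagger changed."""
    old_lines = before.split("\n")
    new_lines = after.split("\n")
    # Assumption: tagging edits fence lines in place, so line counts must agree.
    if len(old_lines) != len(new_lines):
        raise ValueError("line counts differ; tagging alone should not add or remove lines")
    changes = []
    for number, (old, new) in enumerate(zip(old_lines, new_lines), start=1):
        if old != new:
            changes.append((number, old, new))
    return changes


before = FENCE + "\nimport requests\n" + FENCE
after = FENCE + "python\nimport requests\n" + FENCE
for number, old, new in changed_fences(before, after):
    print(f"line {number}: {old!r} -> {new!r}")
```

Any reported line that is not an opening fence would be a sign that something other than tagging happened to the document.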
Languages the detector recognises
The full detection set. Patterns are tested top to bottom against the start of each untagged block; the first match wins, so order matters when a snippet could read as two languages.
| Hint inserted | Triggered by (start of block) | Order |
|---|---|---|
| python | import, from, def , class , or if __name__ | 1st |
| javascript | const , let , var , function , =>, or import { | 2nd |
| rust | pub fn, fn , use , let mut, impl , or struct | 3rd |
| go | package , import "fmt", func , type , var , or const | 4th |
| php | <?php, namespace , echo , function , or class | 5th |
| sql | SELECT, INSERT, UPDATE, DELETE, or CREATE (case-insensitive) | 6th |
| html | <html, <div, <span, or <!DOCTYPE (case-insensitive) | 7th |
| json | first non-space character is { or [ | 8th |
| cpp | #include, int main, void , or std:: | 9th |
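The first-match-wins behaviour in the table can be sketched as an ordered list of rules. This is an illustration under assumptions, not the tool's source: the prefixes are copied from the table, but `starts_with`, `javascript_rule`, `RULES`, and `detect` are hypothetical names, the exact matching (plain prefix tests on the first line, with `=>` allowed anywhere on that line) is a guess, and trailing spaces on the last keyword of each row cannot be recovered from the table.

```python
from typing import Callable, List, Optional, Tuple

Rule = Tuple[str, Callable[[str], bool]]


def starts_with(*prefixes: str, ignore_case: bool = False) -> Callable[[str], bool]:
    """Build a predicate that is true when a line starts with any of the prefixes."""
    wanted = tuple(p.lower() for p in prefixes) if ignore_case else prefixes

    def check(line: str) -> bool:
        probe = line.lower() if ignore_case else line
        return probe.startswith(wanted)

    return check


def javascript_rule(line: str) -> bool:
    # Assumption: "=>" may appear anywhere on the first line, not only at its start.
    return starts_with("const ", "let ", "var ", "function ", "import {")(line) or "=>" in line


# Order mirrors the table above; the first matching rule wins.
RULES: List[Rule] = [
    ("python", starts_with("import", "from", "def ", "class ", "if __name__")),
    ("javascript", javascript_rule),
    ("rust", starts_with("pub fn", "fn ", "use ", "let mut", "impl ", "struct")),
    ("go", starts_with("package ", 'import "fmt"', "func ", "type ", "var ", "const")),
    ("php", starts_with("<?php", "namespace ", "echo ", "function ", "class")),
    ("sql", starts_with("SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", ignore_case=True)),
    ("html", starts_with("<html", "<div", "<span", "<!DOCTYPE", ignore_case=True)),
    ("json", starts_with("{", "[")),
    ("cpp", starts_with("#include", "int main", "void ", "std::")),
]


def detect(body: str) -> Optional[str]:
    """Return the hint for a block body, or None to leave the fence bare."""
    first_line = body.split("\n", 1)[0].lstrip()
    for hint, matches in RULES:
        if matches(first_line):
            return hint
    return None


print(detect("import requests\nresp = requests.get(url)"))  # python
print(detect("const (\n\tA = 1\n)"))                        # javascript (Go loses on order)
print(detect("name: build\non: [push]"))                    # None
```

Because the list is walked in order, moving a rule up or down changes which language wins for ambiguous openers such as const or var.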
What this tool does and does not touch
Behaviour summary for fence types and existing tags. Anything outside the recognised set is left exactly as written.
| Input | Result |
|---|---|
| Untagged backtick fence, first line matches a pattern | Hint inserted (e.g. ```sql) |
| Untagged fence, no pattern matches | Left bare: no hint, no comment marker |
| Fence already tagged (e.g. ```python) | Returned unchanged |
| Tilde fence (~~~) | Not detected: only backtick fences are processed |
| Indented (4-space) code block | Not detected: only fenced blocks are processed |
| TypeScript, YAML, Bash, CSS, Java, Ruby snippets | Not in the detection set: left bare |
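The rows above can be read as a single line-by-line pass. The sketch below is one way such a pass might look, assuming fences start at column 0 and that a detector callback supplies the hint; `tag_fences` and `demo_detect` are hypothetical names, and `demo_detect` is a two-rule stand-in rather than the real pattern chain.

```python
from typing import Callable, List, Optional

FENCE = "`" * 3  # three backticks, built this way so the sketch can sit inside a fenced block


def tag_fences(markdown: str, detect: Callable[[str], Optional[str]]) -> str:
    """Add a hint to bare backtick fences; leave every other line untouched."""
    lines = markdown.split("\n")
    out: List[str] = []
    in_fence = False
    for index, line in enumerate(lines):
        # Tilde fences and indented code never start with FENCE, so they pass through.
        if line.startswith(FENCE):
            if in_fence:
                in_fence = False  # closing fence
            else:
                in_fence = True  # opening fence, tagged or bare
                if line.strip() == FENCE:
                    body_first = lines[index + 1] if index + 1 < len(lines) else ""
                    # An empty block has no body line to test.
                    hint = None if body_first.startswith(FENCE) else detect(body_first)
                    if hint:
                        line = FENCE + hint
        out.append(line)
    return "\n".join(out)


def demo_detect(first_line: str) -> Optional[str]:
    # Hypothetical stand-in for the real pattern chain: two rules only.
    text = first_line.lstrip()
    if text.startswith("import"):
        return "python"
    if text.upper().startswith("SELECT"):
        return "sql"
    return None


doc = "\n".join([FENCE, "SELECT 1;", FENCE, "", "~~~", "import os", "~~~"])
print(tag_fences(doc, demo_detect))
```

Only the first fence is rewritten here; the tilde block keeps its import line untagged, matching the table.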
Cookbook
Real before/after runs. Each Input is untagged Markdown; each Output is exactly what the tagger returns.
Python detected from an import line
The first line begins with import, which the Python pattern (first in the chain) matches, so the block is tagged python.
Input:
```
import requests
resp = requests.get(url)
```
Output:
```python
import requests
resp = requests.get(url)
```
JavaScript from a const declaration
const matches the JavaScript pattern. Note: TypeScript is not a separate target, so a .ts snippet that opens with const is also tagged javascript.
Input:
```
const api = '/v1/users';
fetch(api).then(r => r.json());
```
Output:
```javascript
const api = '/v1/users';
fetch(api).then(r => r.json());
```
JSON detected from a leading brace
When the trimmed body starts with { or [, the block is tagged json. This rule sits near the end of the chain, so any earlier pattern that also matches the first line takes precedence.
Input:
```
{
"name": "jad",
"version": "1.0.0"
}
```
Output:
```json
{
"name": "jad",
"version": "1.0.0"
}
```
Already-tagged block is preserved
A fence that already carries a hint is returned byte-for-byte, even if the content would have detected as something else.
Input:
```bash
npm run build
```
Output:
```bash
npm run build
```
Unrecognised snippet stays bare
A YAML block does not match any pattern, so it is left untagged — the tool never guesses a fallback like text.
Input:
```
name: build
on: [push]
```
Output:
```
name: build
on: [push]
```
Edge cases and what actually happens
TypeScript tagged as javascript
By design: There is no typescript target. A .ts snippet that opens with const, let, function, or import { matches the JavaScript pattern and is tagged javascript. If you need the typescript identifier for your renderer, change it by hand after the run.
Go const/var collides with detection order
By design: const and var appear in both the JavaScript and Go patterns, and JavaScript is tested first. A Go file that opens with const ( is tagged javascript. Move a package or func line to the top, or correct the hint manually.
Block opens with a comment or blank line
Left bare: Detection reads the start of the block body. If the first line is a # comment, a // comment, or blank, none of the keyword patterns match and the block stays untagged. Put a signature line (an import, a def, a SELECT) first to get a hit.
YAML, Bash, CSS, Java, Ruby snippets
Left bare: These languages are not in the nine-pattern set. Their blocks are never tagged automatically; add the hint yourself after running.
Tilde-fenced block
Not detected: Only backtick fences (```) are scanned. A ~~~-delimited block is skipped entirely. Convert tildes to backticks first if you want it tagged.
Indented (4-space) code block
Not detected: The tool only matches fenced blocks. A four-space-indented code block is treated as ordinary text and never receives a hint.
JSON wins over a real language
By design: A block whose first character is { or [ is tagged json. A Rust or Go snippet that happens to open with a brace can be mislabelled json if its earlier patterns did not match. Review brace-first blocks.
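One way to review brace-first blocks after a run is to check whether each json-tagged block actually parses as JSON. The sketch below is a hypothetical helper, not a feature of the tool: `suspicious_json_blocks` is an invented name, and it assumes fences start at column 0.

```python
import json
from typing import List

FENCE = "`" * 3  # three backticks, built this way so the sketch can sit inside a fenced block


def suspicious_json_blocks(markdown: str) -> List[int]:
    """Return 1-based line numbers of json-tagged fences whose body is not valid JSON."""
    suspects = []
    lines = markdown.split("\n")
    i = 0
    while i < len(lines):
        if lines[i].strip() == FENCE + "json":
            start = i
            body = []
            i += 1
            # Collect body lines up to the closing fence (or end of document).
            while i < len(lines) and not lines[i].startswith(FENCE):
                body.append(lines[i])
                i += 1
            try:
                json.loads("\n".join(body))
            except json.JSONDecodeError:
                suspects.append(start + 1)
        i += 1
    return suspects


tagged = "\n".join([FENCE + "json", "{ let x = 1; }", FENCE])
print(suspicious_json_blocks(tagged))  # [1]
```

A flagged block is only a candidate for a manual fix; JSON with comments or trailing commas would be flagged too.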
Re-running an already-tagged doc
Expected: Every tagged fence is returned unchanged, so a second pass produces identical output. The transform is idempotent and safe to wire into a pre-commit hook.
Input over the free character limit
Rejected: The free tier caps Markdown input at 500,000 characters (and 1 MB, 1 file). A doc beyond that is rejected before processing; split it with md-splitter or upgrade for the 5,000,000-character Pro limit.
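A document can be checked against these limits before pasting it. This is a rough sketch under assumptions: the character limits are the ones quoted above, but reading "1 MB" as 1,000,000 UTF-8 bytes is a guess (it could be 1,048,576), the tool may count characters differently than Python does, and `fits_free_tier` and `fits_pro_chars` are hypothetical names.

```python
FREE_CHAR_LIMIT = 500_000    # free-tier character cap quoted above
PRO_CHAR_LIMIT = 5_000_000   # Pro character cap quoted above
FREE_BYTE_LIMIT = 1_000_000  # assumption: "1 MB" read as 10**6 UTF-8 bytes


def fits_free_tier(markdown: str) -> bool:
    """True when the text is within both the character cap and the assumed byte cap."""
    within_chars = len(markdown) <= FREE_CHAR_LIMIT
    within_bytes = len(markdown.encode("utf-8")) <= FREE_BYTE_LIMIT
    return within_chars and within_bytes


def fits_pro_chars(markdown: str) -> bool:
    """True when the text is within the Pro character cap (no byte cap is stated for Pro)."""
    return len(markdown) <= PRO_CHAR_LIMIT


sample = "a" * 500_001
print(fits_free_tier(sample), fits_pro_chars(sample))  # False True
```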
Frequently asked questions
Which languages can it actually detect?
Exactly nine: Python, JavaScript, Rust, Go, PHP, SQL, HTML, JSON, and C++. There is no TypeScript, YAML, Bash, CSS, Java, Ruby, or Markdown target — those blocks are left bare.
Will it overwrite an existing language tag?
No. Any fence that already has a hint after the backticks is returned unchanged, even if the content would have detected differently.
What happens when no pattern matches?
The block is left untagged. The tool does not insert a fallback like text and does not add a comment marker — it simply leaves the bare fence.
Are there any options or settings?
No. This tool exposes no controls. The same heuristic pass runs on every Run, so there is nothing to configure or misconfigure.
Does it detect TypeScript?
Not as typescript. A TS snippet that opens with const, let, function, or import { is tagged javascript. Rename it afterward if your highlighter needs the typescript token.
How does detection decide between two possible languages?
Patterns are tested in a fixed order: Python, JavaScript, Rust, Go, PHP, SQL, HTML, JSON, C++. The first one that matches the start of the block wins.
Why didn't my Go file get tagged go?
JavaScript is tested before Go and shares the const and var keywords. A Go block opening with const ( is tagged javascript. Lead with package or func to get go.
Does it handle tilde fences or indented code?
No. Only backtick (```) fences are processed. Tilde (~~~) fences and four-space-indented blocks are skipped.
Is the output GitHub-compatible?
Yes. It emits lowercase identifiers like python and javascript that GitHub, GitLab, and VS Code preview all recognise. Repo language statistics are unaffected — those read file extensions, not fence hints.
Can I process a whole folder at once?
Not in one run — this tool takes a single document (paste or one uploaded file). Run files individually, or use md-merger to combine them into one document first if that suits your workflow.
Should I run other cleanup at the same time?
Tagging pairs well with a formatting pass. Run md-prettifier to normalise spacing, or md-lint to catch structural issues, then tag code blocks last so the fences are stable.
Where does processing happen?
Entirely in your browser. Nothing is uploaded; your source stays on the page. Free tier handles up to 500,000 characters / 1 MB per run.
Privacy first
All Markdown processing runs locally in your browser using JavaScript. No file is ever uploaded to JAD Apps servers — only metadata counters are saved for signed-in dashboard stats.