Whitelist vs Charset Subsetting — Which Strategy Wins?

How to whitelist vs charset: choosing a font subset strategy

Step 1
Decide whether your text is fixed or variable — Fixed = you know every character at build time and it never changes per-visitor (logo, hero headline, app name, ticker symbols). Variable = the text depends on data or user input (names, search results, comments, any CMS field). This single question decides the strategy.
Step 2
For fixed text, whitelist the exact characters — Use the [Character Whitelist Builder](/font-tools/character-whitelist-builder). Paste the literal string — including spaces and punctuation. It deduplicates by codepoint and keeps only those glyphs. Smallest output, but anything you forgot renders `.notdef`.
Step 3
For variable text, pick a Unicode range — Use the [Font Subsetter](/font-tools/font-subsetter) and choose a named subset: Latin, Latin-Ext, Cyrillic, Greek, Vietnamese, or symbols. The [Latin Filter](/font-tools/latin-filter) is a one-click Latin shortcut. The range covers any character in that block, so unexpected accents and names still render.
Step 4
Confirm coverage before either approach — Run the [Character Coverage Map](/font-tools/character-coverage-map) to see which Unicode blocks the source font actually fills (it scores against 346 real Unicode blocks). A 'Latin' charset only helps if the font has Latin glyphs; a whitelist only helps if the font has your specific characters.
Step 5
Run the subset, download the TTF — Both tools output `<stem>.<label>.ttf` (uncompressed). The result panel shows glyphs-in-source vs glyphs-kept so you can compare the two strategies' footprints on the same font.
Step 6
Compress and embed identically — Regardless of strategy, the last steps are the same: [TTF→WOFF2](/font-tools/ttf-to-woff2) then an `@font-face` from the [Font-Face Generator](/font-tools/font-face-generator). For charset subsets, set the CSS `unicode-range` to match so the browser only downloads the subset when it's needed.
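The "deduplicates by codepoint" behaviour in Step 2 can be pictured in a few lines of Python. This is only a sketch of the idea, not the Whitelist Builder's actual code, and the function name `unique_codepoints` is made up for illustration.

```python
def unique_codepoints(text: str) -> list[int]:
    """Return the sorted, de-duplicated Unicode codepoints in `text`."""
    return sorted({ord(ch) for ch in text})


if __name__ == "__main__":
    cps = unique_codepoints("JAD APPS")
    # Show each codepoint in U+XXXX form next to the character it encodes.
    for cp in cps:
        print(f"U+{cp:04X} {chr(cp)!r}")
    # At most one glyph per codepoint survives, plus the mandatory .notdef.
    print("glyphs kept, at most:", len(cps) + 1)
```

Repeated letters and the space each count once, which is why an eight-character string can shrink to six codepoints.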

Whitelist vs charset: the decision matrix

Both strategies use JAD's in-browser opentype.js engine, output uncompressed TTF, and drop kerning + OT features. The difference is what you specify and how forgiving it is.

| Dimension | Whitelist (exact list) | Charset (Unicode range) |
| --- | --- | --- |
| JAD tool | Character Whitelist Builder | Font Subsetter / Latin Filter |
| You specify | The literal characters (`JAD APPS`) | A named charset (`latin`, `cyrillic`, …) |
| Glyphs kept | Exactly your unique codepoints + `.notdef` | Every glyph in the range the font has + `.notdef` |
| Typical output | Smallest (logo: ~6–20 glyphs) | Larger (Latin: ~95–190 glyphs) |
| Breaks when… | Text contains a character you didn't paste | Text uses a character or script outside the chosen range |
| Best for | Logos, headlines, app names, fixed labels | Body copy, user names, CMS content, i18n |

Named charsets available in the Font Subsetter

The Font Subsetter's range dropdown maps to these UNICODE_SUBSETS definitions. The Whitelist Builder has no dropdown — it only takes a literal string. Ranges are inclusive.

| Charset key | Covers | Codepoint ranges |
| --- | --- | --- |
| `latin` | Basic Latin + Latin-1 Supplement | U+0020–007E, U+00A0–00FF |
| `latin-ext` | Latin Extended-A + Extended-B + Additional | U+0100–024F, U+1E00–1EFF |
| `cyrillic` | Cyrillic + Cyrillic Supplement | U+0400–04FF, U+0500–052F |
| `greek` | Greek + Greek Extended | U+0370–03FF, U+1F00–1FFF |
| `vietnamese` | Latin Ext ranges + đồng sign | U+0100–024F, U+1E00–1EFF, U+20AB |
| `symbols` | General punctuation + super/subscripts + currency | U+2000–206F, U+2070–209F, U+20A0–20CF |
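To get a feel for how large each charset can grow, the ranges can be transcribed as data and counted. The sketch below borrows the `UNICODE_SUBSETS` name from the guide, but the dict layout and the helper `codepoint_capacity` are assumptions made for illustration, not the tool's real source.

```python
# Inclusive (start, end) codepoint ranges for the six named charsets.
# Only the key names and ranges come from the guide; the layout is assumed.
UNICODE_SUBSETS: dict[str, list[tuple[int, int]]] = {
    "latin": [(0x0020, 0x007E), (0x00A0, 0x00FF)],
    "latin-ext": [(0x0100, 0x024F), (0x1E00, 0x1EFF)],
    "cyrillic": [(0x0400, 0x04FF), (0x0500, 0x052F)],
    "greek": [(0x0370, 0x03FF), (0x1F00, 0x1FFF)],
    "vietnamese": [(0x0100, 0x024F), (0x1E00, 0x1EFF), (0x20AB, 0x20AB)],
    "symbols": [(0x2000, 0x206F), (0x2070, 0x209F), (0x20A0, 0x20CF)],
}


def codepoint_capacity(charset: str) -> int:
    """Count the codepoints a charset spans (an upper bound on glyphs kept)."""
    return sum(end - start + 1 for start, end in UNICODE_SUBSETS[charset])


if __name__ == "__main__":
    for name in UNICODE_SUBSETS:
        print(f"{name:<11} spans {codepoint_capacity(name)} codepoints")
```

The count is a ceiling, not a prediction: a font only contributes the glyphs it actually maps inside each range.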

Cookbook

Concrete scenarios mapped to the right strategy. The pattern: fixed → whitelist, variable → charset, and confirm coverage either way.

Logo wordmark — whitelist wins

Example

Six unique characters (J, A, D, P, S, and the space), and the string never changes per visitor. A Latin charset would keep ~200 glyphs you'll never render. Whitelist the exact string.

Text: "JAD APPS" (fixed)
Strategy: WHITELIST
Tool: /font-tools/character-whitelist-builder
Paste: JAD APPS
-> 7 glyphs (6 unique characters + .notdef), ~5 KB TTF, ~2 KB WOFF2

(A 'latin' charset here would keep ~200 glyphs, ~30 KB.)

English blog body — charset wins

Example

Body copy varies per post and may include any Latin-1 character (accented author names, ©, £). Whitelisting build-time content would tofu on the next article. Use a Latin range, and add the 'symbols' charset if posts use curly quotes or em dashes, which sit outside Latin-1.

Text: article body (varies per post)
Strategy: CHARSET
Tool: /font-tools/font-subsetter  (charset: latin)
-> keeps U+0020-007E + U+00A0-00FF
-> survives 'naïve', 'café', ©, £
-> curly quotes and em dashes need the 'symbols' charset too

(A whitelist of today's article tofus on tomorrow's.)
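Before settling on a range, it can help to test a representative sample of real copy against it. The following is a minimal sketch, assuming the `latin` ranges listed in this guide; `outside_ranges` is a hypothetical helper name.

```python
LATIN_RANGES = [(0x0020, 0x007E), (0x00A0, 0x00FF)]  # the guide's 'latin' charset


def outside_ranges(text: str, ranges: list[tuple[int, int]]) -> list[str]:
    """Return the distinct characters of `text` not covered by any range, sorted."""
    def covered(cp: int) -> bool:
        return any(start <= cp <= end for start, end in ranges)

    return sorted({ch for ch in text if not covered(ord(ch))})


if __name__ == "__main__":
    sample = "A na\u00efve caf\u00e9 owner said \u201chello\u201d \u2014 twice."
    for ch in outside_ranges(sample, LATIN_RANGES):
        print(f"U+{ord(ch):04X} is outside 'latin'")
```

Accented Latin-1 letters pass, while typographic punctuation is reported. Line breaks and tabs sit below U+0020 and are reported too, so strip them first when checking multi-line text.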

Multilingual site — charset, possibly two

Example

An English + Russian site needs Latin and Cyrillic. Subset twice and ship two @font-face blocks with matching unicode-range so the browser fetches only what a page uses.

Strategy: CHARSET x2
Font Subsetter (charset: latin)    -> Brand.latin.ttf
Font Subsetter (charset: cyrillic) -> Brand.cyrillic.ttf

@font-face { font-family: "Brand"; src: url(brand.latin.woff2) format("woff2");    unicode-range: U+0020-007E, U+00A0-00FF; }
@font-face { font-family: "Brand"; src: url(brand.cyrillic.woff2) format("woff2"); unicode-range: U+0400-052F; }
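Keeping `unicode-range` in step with the subset ranges is easy to get wrong by hand, so one option is to generate the rules from the same range data. This is a sketch under assumptions: the family name, file names, and both helper functions are placeholders, not output of the Font-Face Generator.

```python
def unicode_range(ranges: list[tuple[int, int]]) -> str:
    """Format inclusive codepoint ranges as a CSS unicode-range value."""
    parts = []
    for start, end in ranges:
        # A single codepoint is written once; a span is written start-end.
        parts.append(f"U+{start:04X}" if start == end else f"U+{start:04X}-{end:04X}")
    return ", ".join(parts)


def font_face(family: str, url: str, ranges: list[tuple[int, int]]) -> str:
    """Build one @font-face rule for a WOFF2 subset file."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{url}") format("woff2");\n'
        f"  unicode-range: {unicode_range(ranges)};\n"
        "}"
    )


if __name__ == "__main__":
    # Family and file names are placeholders for illustration.
    print(font_face("Brand", "brand.latin.woff2", [(0x0020, 0x007E), (0x00A0, 0x00FF)]))
    print(font_face("Brand", "brand.cyrillic.woff2", [(0x0400, 0x052F)]))
```

Both rules share one `font-family` name, so the browser picks the file per character and skips the Cyrillic download on pages that contain none.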

Pricing-page numerals + currency — whitelist

Example

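The digits, separators, and currency symbols of a pricing widget are a fixed, known glyph set. Whitelist the digits, decimal point, comma, every currency symbol you display, and any fixed suffix such as "/mo". Because OpenType features are dropped, figures that are only reachable through a feature such as `tnum` will not survive; you get the font's default numerals.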

Text: "0123456789.,$€£ /mo"  (fixed glyph set)
Strategy: WHITELIST
Paste: 0123456789.,$€£ /mo
-> 20 glyphs (19 unique characters + .notdef), tiny TTF

Guard: include EVERY currency you show; a missing € = box.
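The guard can be automated by comparing the whitelist against the strings the widget can actually render. A minimal sketch follows; the price strings and the helper `missing_from_whitelist` are hypothetical.

```python
def missing_from_whitelist(whitelist: str, rendered: list[str]) -> list[str]:
    """Return characters used in `rendered` strings but absent from `whitelist`."""
    allowed = set(whitelist)
    used = set("".join(rendered))
    return sorted(used - allowed)


if __name__ == "__main__":
    whitelist = "0123456789.,$€£ /mo"
    # Hypothetical strings a pricing widget might display.
    prices = ["$9.99 /mo", "€1,299 /mo", "¥980 /mo"]
    gaps = missing_from_whitelist(whitelist, prices)
    print("missing:", gaps if gaps else "nothing")
```

Here the yen sign is flagged because the whitelist only lists dollar, euro, and pound. Wiring the same check into a build step would catch a new currency before it ships as a box.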

When you need kerning — neither in-browser strategy works

Example

Both JAD in-browser strategies drop kerning. If letter-pair spacing matters (fine typographic headlines), move to a layout-preserving pipeline engine.

Need kerning/ligatures preserved?
-> NOT the in-browser whitelist or charset tools (both drop GPOS/GSUB)
-> Use pyftsubset --layout-features='*'  OR  hb-subset
   in a build pipeline (Python or WASM, layout-safe)
-> See /font-tools/guides/automate-font-subsetting-build-pipeline
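For the pipeline route, a build script could assemble the `pyftsubset` call like this. It is a sketch, assuming fontTools is installed and that the file names shown are placeholders; the function only builds the argument list and does not run anything.

```python
import shlex


def pyftsubset_args(src: str, out: str, text: str) -> list[str]:
    """Build a pyftsubset command that keeps all layout features."""
    return [
        "pyftsubset",
        src,
        f"--text={text}",        # characters to keep (the whitelist)
        "--layout-features=*",   # keep every GSUB/GPOS feature, kerning included
        "--flavor=woff2",        # write WOFF2 directly (needs the brotli module)
        f"--output-file={out}",
    ]


if __name__ == "__main__":
    args = pyftsubset_args("Brand.ttf", "brand.logo.woff2", "JAD APPS")
    # Print a shell-safe command line; pass `args` to subprocess.run to execute it.
    print(shlex.join(args))
```

Passing the arguments as a list avoids shell quoting problems with the space in the whitelist and the `*` in the feature selector.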

Edge cases and what actually happens

Each case below names a failure mode or surprise, the short label under the heading summarizes it, and the paragraph explains what to do about it.

Whitelist tofus when the text changes

Strategy mismatch

The classic whitelist failure: you whitelist today's hero copy, marketing edits it next week to add an em dash or an accented name, and the new character renders .notdef. Whitelisting is only safe for text that is genuinely fixed and under your control. If an editor can change the string, use a charset range or re-run the whitelist on every content change (a build-pipeline job).

Charset keeps glyphs you'll never use

By design

A 'latin' charset keeps every Latin-1 glyph the font has — including currency symbols, fractions, and accented letters your English site never renders. That's the safety/size trade: you carry ~150–200 glyphs to guarantee you never tofu. For a fixed string this is pure waste; for variable text it's insurance. Pick based on whether the text varies, not on which file is smaller.

Both strategies drop kerning — the choice doesn't fix it

Kerning dropped

Neither the Whitelist Builder nor the Font Subsetter preserves kern/GPOS, because both call the same subsetByCodepoints core that rebuilds the font from new Font({ glyphs }). Switching strategies does not recover kerning. If kerning is non-negotiable, the answer is a different engine (hb-subset / pyftsubset in a pipeline), not a different in-browser strategy.

Charset range exists but the font doesn't fill it

Coverage gap

Choosing 'cyrillic' only helps if the source font actually contains Cyrillic glyphs. A Latin-only font subset to 'cyrillic' would keep nothing and error 'Subset would be empty'. Run the Character Coverage Map (it scores 346 real Unicode blocks) to confirm the font fills the range before you subset to it.
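The same check can be approximated offline if you have the set of codepoints a font maps. The sketch below uses a stand-in set rather than a real font, and `coverage` is a hypothetical helper; with fontTools installed, the keys of `TTFont(path).getBestCmap()` could supply the real set.

```python
def coverage(font_codepoints: set[int], ranges: list[tuple[int, int]]) -> tuple[int, int]:
    """Return (covered, total) for the codepoints a font maps inside `ranges`."""
    wanted = {cp for start, end in ranges for cp in range(start, end + 1)}
    return len(wanted & font_codepoints), len(wanted)


if __name__ == "__main__":
    # Stand-in for a Latin-only font: pretend it maps printable ASCII only.
    latin_only_font = set(range(0x20, 0x7F))
    cyrillic = [(0x0400, 0x04FF), (0x0500, 0x052F)]
    covered, total = coverage(latin_only_font, cyrillic)
    print(f"cyrillic: {covered}/{total} codepoints present")
    if covered == 0:
        print("subsetting to this charset would be empty")
```

A zero in the first position is the offline equivalent of the empty-subset error: there is nothing in the range to keep.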

Whitelist with a base + combining-mark cluster

Split into codepoints

If your text uses a decomposed character (e.g. e + U+0301 combining acute) rather than precomposed é, the whitelist must include both codepoints, and the rendered result depends on the font's mark positioning, which may sit incorrectly once GPOS is dropped. Prefer precomposed characters in fixed marketing strings. A charset range is not a fallback here: none of the six named charsets covers the Combining Diacritical Marks block (U+0300–036F), so decomposed text needs either normalization to precomposed form or a whitelist that lists the marks explicitly.
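One way to get precomposed characters reliably is to normalize the string to NFC before pasting it into the whitelist. A minimal sketch using Python's standard `unicodedata` module; `to_precomposed` is just an illustrative wrapper name.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Normalize to NFC so base + combining mark pairs collapse where possible."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    decomposed = "cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
    composed = to_precomposed(decomposed)
    print([f"U+{ord(c):04X}" for c in decomposed])
    print([f"U+{ord(c):04X}" for c in composed])
```

NFC only composes pairs that have a precomposed form in Unicode. Any combining mark that survives normalization still has to be whitelisted as its own codepoint.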

Mixing strategies on one site

Supported

It's common and correct to use both: whitelist the logo and hero headline to tiny files, charset-subset the body font to a Latin range. They're independent @font-face declarations. Just give them distinct font-family names so the logo's kerning-less micro-font isn't accidentally applied to body copy.

Vietnamese needs more than 'latin'

Use the right range

Vietnamese uses the Latin Extended ranges plus the đồng sign (U+20AB), so a plain 'latin' charset will tofu on ệ, ữ, ọ. Use the dedicated 'vietnamese' charset, which the Font Subsetter defines as U+0100–024F + U+1E00–1EFF + U+20AB. Those ranges do not include Basic Latin or Latin-1, so ship it alongside a 'latin' subset rather than instead of one. A whitelist of the exact Vietnamese characters in fixed copy also works and is smaller.

Symbols/punctuation fall outside 'latin'

Add the symbols charset

Em dashes (U+2014), curly quotes (U+2018–201D), and the euro sign (U+20AC) live in General Punctuation / Currency blocks, not Latin-1. An English body subset to 'latin' will tofu on typographic punctuation. Either add the 'symbols' charset (U+2000–20CF) via a second subset, or whitelist the specific punctuation you use.

Frequently asked questions

What's the actual difference between whitelisting and charset subsetting on JAD Apps?

Whitelisting (the Character Whitelist Builder) keeps an exact list of characters you paste. Charset subsetting (the Font Subsetter / Latin Filter) keeps every glyph in a chosen Unicode range. Both run in-browser on the same opentype.js engine, both output uncompressed TTF, and both drop kerning + OpenType features. The only difference is granularity: whitelist is smaller but brittle; charset is larger but safe for varying text.

Which one should I use for a logo?

Whitelist. A logo is a fixed string you know completely at build time, so keeping only those 6–20 glyphs gives the smallest possible file — often an order of magnitude smaller than even a Latin subset. Just be sure to include spaces, punctuation, and any symbol (®, ™, &) that appears in the mark. See the tiny-logo-font guide.

Which one should I use for body text?

Charset. Body copy varies — different articles, user-entered names, CMS edits — so whitelisting build-time content will tofu on the next character that wasn't on the page. A Latin (or Latin-Ext, or Cyrillic) range covers any glyph in that block, so unexpected accents and names still render. Pair it with a CSS unicode-range so the browser only downloads the subset when needed.

Does choosing one strategy over the other affect kerning?

No. Both JAD in-browser strategies drop kerning and all OpenType layout features, because they share the same subset core that rebuilds the font from scratch. If you need kerning preserved, the fix is a different engine — hb-subset or pyftsubset --layout-features='*' in a build pipeline — not a different in-browser strategy. See the build-pipeline guide.

What Unicode ranges can the Font Subsetter target?

Six named charsets: latin (U+0020–007E, U+00A0–00FF), latin-ext (U+0100–024F, U+1E00–1EFF), cyrillic (U+0400–052F), greek (U+0370–03FF, U+1F00–1FFF), vietnamese (Latin-Ext ranges + U+20AB), and symbols (U+2000–20CF). The Whitelist Builder has no range option at all — it only takes a literal character string.

Can I get a custom range that isn't in the dropdown?

Not directly in the charset tool — it only offers the six named subsets. But the whitelist approach effectively gives you arbitrary granularity: paste exactly the characters you want and you get exactly those glyphs, which is a custom 'range' of one or many characters. For programmatic arbitrary ranges (e.g. U+2190–21FF arrows), use pyftsubset's --unicodes flag in a pipeline.
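If you want to try the whitelist route for a whole range, the paste string can be generated rather than typed. A small sketch, with `range_to_text` as a hypothetical helper:

```python
def range_to_text(start: int, end: int) -> str:
    """Return every character from `start` to `end` inclusive as one string."""
    return "".join(chr(cp) for cp in range(start, end + 1))


if __name__ == "__main__":
    arrows = range_to_text(0x2190, 0x21FF)  # the Arrows block
    print(len(arrows), "characters")
    print(arrows)
```

The output can be pasted into the Whitelist Builder as-is. Only the characters the source font actually contains will end up in the subset.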

Is the output size really that different?

Yes, when text is fixed. A six-character logo whitelist might be ~5 KB TTF (~2 KB WOFF2), while the same font subset to 'latin' keeps ~200 glyphs at ~30 KB TTF (~12 KB WOFF2). For body text you can't whitelist safely, so the comparison is moot: you pay for the Latin range because you need the coverage.

Can I combine both on the same page?

Yes, and it's a good pattern: whitelist the logo and hero headline to micro-fonts, charset-subset the body font to a Latin range. Each is an independent @font-face with its own font-family name. Keep the names distinct so the kerning-less logo font isn't applied to running text where its missing glyphs and spacing would show.

How do I avoid tofu boxes with either strategy?

Run the Character Coverage Map first — it scores the font against 346 real Unicode blocks so you can confirm the glyphs you need exist. For whitelist, include every literal character your text uses (especially punctuation and symbols). For charset, pick a range wide enough for your content; add the 'symbols' charset for typographic punctuation, which lives outside Latin-1.

Why do both tools output TTF instead of WOFF2?

Because they both use opentype.js's Font.toArrayBuffer(), which writes sfnt/TTF. There's no WOFF2 writer in either tool. Whichever strategy you choose, the next step is the same: run the TTF through TTF→WOFF2. The strategy choice doesn't change the output format — only the glyph set.
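When files from several steps pile up in a downloads folder, the container format can be confirmed from the first four bytes. This is a rough sketch that relies on the standard sfnt, WOFF, and WOFF2 signatures; `sniff_font_format` is an illustrative name, not part of any JAD tool.

```python
def sniff_font_format(data: bytes) -> str:
    """Guess a font container format from its first four bytes."""
    signatures = {
        b"\x00\x01\x00\x00": "ttf",  # sfnt with TrueType outlines
        b"true": "ttf",              # legacy Apple TrueType tag
        b"OTTO": "otf",              # sfnt with CFF outlines
        b"wOFF": "woff",
        b"wOF2": "woff2",
    }
    return signatures.get(data[:4], "unknown")


if __name__ == "__main__":
    # In practice, read the first four bytes of the downloaded file instead.
    print(sniff_font_format(b"\x00\x01\x00\x00"))
    print(sniff_font_format(b"wOF2"))
```

A subset straight from either tool should report `ttf`, and only the file that has passed through the TTF→WOFF2 step should report `woff2`.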

Is whitelisting ever wrong even for fixed-looking text?

Yes — when 'fixed' isn't truly fixed. Navigation labels translated by a CMS, A/B-tested headlines, or any string an editor can change are deceptively variable. If a non-developer can edit the text, treat it as variable and use a charset, or automate the whitelist regeneration on every content change so it never falls out of sync.

Which strategy is better for performance?

Smaller files load faster, so whitelist wins on raw bytes when it's safe. But correctness beats size: a tofu box on a real user's name is worse than a few extra KB. Use whitelist where text is provably fixed (logos, hero copy under your control) and charset everywhere text can vary. Both compress to WOFF2 the same way, and both benefit equally from font-display: swap.

Privacy first

Every JAD Font tool runs entirely in your browser using opentype.js and the wawoff2 WASM Brotli encoder. Your fonts never leave your device — verified by zero outbound network requests during processing.
