How to estimate subsetting savings before you commit
- Step 1: Drop a TTF, OTF, WOFF, or WOFF2 onto the analyser. The drop zone accepts `.ttf`, `.otf`, `.woff`, and `.woff2`. WOFF2 is decompressed with the bundled `wawoff2` WASM, WOFF 1.0 is inflated table-by-table, and TTF/OTF are read directly. A TrueType Collection (`.ttc`) is detected but rejected with `Unsupported font format: ttc`; extract a single face first.
- Step 2: Press Process; there are no options to configure. The schema for this tool is empty (`options: []`). There is no charset picker, no quality slider, no format toggle. The same six subsets are always projected. The only control on the page is the file you upload.
- Step 3: Read total_glyphs against the free-tier ceiling. The result panel headlines `Total glyphs`. The free tier advertises a 1,000-glyph ceiling; the gate the page actually enforces before processing is the 5 MB file-size limit. A 1,000-glyph Latin font is ~30–60 KB, so most Latin fonts clear both comfortably.
- Step 4: Scan the projections table for the smallest viable subset. Each row is `{ subset, glyph_count, estimated_woff2_bytes, savings_pct }`. `glyph_count` is how many of your font's mapped codepoints land inside that subset's ranges. Pick the smallest subset whose `glyph_count` covers the languages your audience actually reads.
- Step 5: Treat the size as a projection, not a measurement. `estimated_woff2_bytes` = `max(1500, current × min(1, proportion × 1.25))`. It is an estimate within roughly ±5% for typical Latin fonts and noisier for complex scripts. For the exact byte count, run the chosen charset through [the font subsetter](/font-tools/font-subsetter).
- Step 6: Build the winner and verify. Subset to the charset you picked, then re-run this analyser on the output (or check the subsetter's own size metric). The real WOFF2 should land near the projection; if it's far off, your font has heavy hinting or composite glyphs that compress differently. In that case, strip hinting with [the hinting stripper](/font-tools/hinting-stripper) and re-measure.
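Step 4 can be automated once the JSON is in hand. The sketch below is one possible way to do it in Python, not part of the analyser: `pick_smallest_viable` and the `acceptable` set are hypothetical names, and the rows reuse the illustrative figures from the cookbook on this page. You decide which subset labels cover your audience; the function only picks the cheapest of them.

```python
from typing import Optional

def pick_smallest_viable(projections: list[dict], acceptable: set[str]) -> Optional[dict]:
    """Return the cheapest projected row among the subsets you consider viable."""
    candidates = [
        row for row in projections
        if row["subset"] in acceptable and row["glyph_count"] > 0
    ]
    if not candidates:
        return None  # the font covers none of the acceptable subsets
    return min(candidates, key=lambda row: row["estimated_woff2_bytes"])

# Illustrative rows in the analyser's output shape.
projections = [
    {"subset": "Latin Basic + Latin-1 Supplement", "glyph_count": 210,
     "estimated_woff2_bytes": 15750, "savings_pct": 81},
    {"subset": "Latin Extended-A + Extended-B", "glyph_count": 390,
     "estimated_woff2_bytes": 29250, "savings_pct": 65},
    {"subset": "Cyrillic", "glyph_count": 256,
     "estimated_woff2_bytes": 19200, "savings_pct": 77},
]

best = pick_smallest_viable(
    projections,
    {"Latin Basic + Latin-1 Supplement", "Latin Extended-A + Extended-B"},
)
print(best["subset"], best["estimated_woff2_bytes"])
# Latin Basic + Latin-1 Supplement 15750
```

A row with `glyph_count` of 0 is skipped on purpose: a subset the font does not cover at all is never a viable target.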
The six fixed subsets the analyser projects
These are the only charsets projected, and they are not user-selectable. The ranges are exactly what the analyser walks per codepoint to count coverage. There is no "custom" charset on this tool despite older copy — for an arbitrary character set use the subsetter or whitelist builder instead.
| Subset (as labelled in output) | Codepoint ranges walked | Covers |
|---|---|---|
| Latin Basic + Latin-1 Supplement | U+0020–007E, U+00A0–00FF | English, plus accented letters for French, German, Spanish, Italian, Portuguese, Nordic |
| Latin Extended-A + Extended-B | U+0100–024F, U+1E00–1EFF | Central/Eastern European Latin (Polish, Czech, Turkish, Romanian) plus the Latin Extended Additional block |
| Cyrillic | U+0400–04FF, U+0500–052F | Russian, Ukrainian, Bulgarian, Serbian, plus the Cyrillic Supplement |
| Greek | U+0370–03FF, U+1F00–1FFF | Modern Greek plus Greek Extended (polytonic / classical) |
| Vietnamese | U+0100–024F, U+1E00–1EFF, U+20AB | Vietnamese diacritics (overlaps Latin Extended) plus the đồng sign ₫ |
| General punctuation + symbols | U+2000–206F, U+2070–209F, U+20A0–20CF | Smart quotes, dashes, ellipsis, super/subscripts, currency symbols |
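The per-codepoint walk behind `glyph_count` can be sketched as follows. This is a minimal Python illustration, not the analyser's handler: `SUBSET_RANGES` transcribes the table above, while `count_coverage` and the sample codepoints are hypothetical.

```python
# Inclusive (start, end) codepoint ranges, transcribed from the subset table.
SUBSET_RANGES = {
    "Latin Basic + Latin-1 Supplement": [(0x0020, 0x007E), (0x00A0, 0x00FF)],
    "Latin Extended-A + Extended-B": [(0x0100, 0x024F), (0x1E00, 0x1EFF)],
    "Cyrillic": [(0x0400, 0x04FF), (0x0500, 0x052F)],
    "Greek": [(0x0370, 0x03FF), (0x1F00, 0x1FFF)],
    "Vietnamese": [(0x0100, 0x024F), (0x1E00, 0x1EFF), (0x20AB, 0x20AB)],
    "General punctuation + symbols": [(0x2000, 0x206F), (0x2070, 0x209F), (0x20A0, 0x20CF)],
}

def count_coverage(mapped_codepoints):
    """Count, per subset, how many distinct mapped codepoints fall inside its ranges."""
    unique = set(mapped_codepoints)
    return {
        name: sum(1 for cp in unique if any(lo <= cp <= hi for lo, hi in ranges))
        for name, ranges in SUBSET_RANGES.items()
    }

# Hypothetical font mapping: A, é, ă, Я and the đồng sign.
sample = [0x0041, 0x00E9, 0x0103, 0x042F, 0x20AB]
counts = count_coverage(sample)
print(counts["Vietnamese"])  # 2 (ă and the đồng sign)
```

Note that U+20AB is counted twice, once under Vietnamese and once under symbols, because the subsets overlap and each row is counted independently.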
How each projection number is computed
The exact maths from the handler. current is the current WOFF2 estimate (real file size for WOFF2 input, else sfnt bytes × 0.55). proportion = subset glyph_count ÷ total_glyphs.
| Output field | Formula | Why |
|---|---|---|
| total_glyphs | font.glyphs.length | Every glyph in the font: glyphs with codepoints, composites, .notdef, and unmapped glyphs all count |
| current_woff2_estimate_bytes | WOFF2 input → real file.size; else round(sfnt.byteLength × 0.55) | WOFF2 self-reports exactly; TTF/OTF/WOFF use a 0.55× typical-compression assumption |
| glyph_count (per subset) | count of mapped codepoints that fall inside the subset's ranges | Counts coverage by codepoint, so it can differ from how many glyph IDs a real subset keeps |
| estimated_woff2_bytes | max(1500, round(current × min(1, proportion × 1.25))) | The 1.25× factor models per-glyph overhead; the 1500-byte floor stops tiny subsets projecting to absurdly small sizes |
| savings_pct | round((1 − estimated ÷ current) × 100) | Headline savings, rounded to a whole percent |
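The two per-row formulas in the table translate directly into code. The Python sketch below is a transcription under one assumption: that `round` in the handler behaves like JavaScript's `Math.round` (half rounds up), which `js_round` imitates for the non-negative values involved. The function names are hypothetical.

```python
import math

def js_round(x: float) -> int:
    """Round half up, imitating Math.round for non-negative inputs (an assumption)."""
    return math.floor(x + 0.5)

def project(current: int, total_glyphs: int, glyph_count: int) -> tuple[int, int]:
    """Return (estimated_woff2_bytes, savings_pct) for one subset row."""
    proportion = glyph_count / total_glyphs
    estimated = max(1500, js_round(current * min(1.0, proportion * 1.25)))
    savings = js_round((1 - estimated / current) * 100)
    return estimated, savings

print(project(84000, 1400, 210))  # (15750, 81)
print(project(84000, 1400, 390))  # (29250, 65)
```

The sketch does not guard against `total_glyphs` or `current` being zero; a parsed font always has at least `.notdef`, but a production script may still want that check.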
Cookbook
Real JSON snapshots and how to act on them. All numbers are illustrative projections from the formulas above — your font's actual subset will differ slightly. If you want the language fitness of a font rather than its size, pair this with the character coverage map.
A pan-European font that only needs English
Example: A 1,400-glyph webfont where the team only ships English copy. The Latin Basic row shows the win; everything past it is dead weight you're paying for on every page load.
Output (abridged):
{
  "total_glyphs": 1400,
  "current_woff2_estimate_bytes": 84000,
  "projections": [
    { "subset": "Latin Basic + Latin-1 Supplement",
      "glyph_count": 210, "estimated_woff2_bytes": 15750, "savings_pct": 81 },
    { "subset": "Latin Extended-A + Extended-B",
      "glyph_count": 390, "estimated_woff2_bytes": 29250, "savings_pct": 65 },
    { "subset": "Cyrillic",
      "glyph_count": 256, "estimated_woff2_bytes": 19200, "savings_pct": 77 }
  ]
}
Read: Latin Basic projects 84 KB → ~16 KB (81% off).
Action: subset to U+0020-007E,U+00A0-00FF in the font subsetter.
WOFF2 input gives the tightest projection
Example: The same font, but you upload the already-built WOFF2 instead of the source TTF. Because the analyser anchors to the real file.size, the current estimate is exact and the projections are at their most trustworthy.
TTF input → current_woff2_estimate_bytes = round(sfntBytes * 0.55) (a guess)
WOFF2 input → current_woff2_estimate_bytes = file.size (exact)
Upload Inter-Regular.woff2 (not the .ttf) when you want the projection baseline to be real bytes rather than a 0.55x estimate.
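The baseline rule can be written as a small function. This is a hedged Python sketch of the rule as described on this page, not the handler itself: `baseline_woff2_bytes` and its parameters are hypothetical names, and the half-up rounding is an assumption about how `round` behaves.

```python
import math

def baseline_woff2_bytes(fmt: str, file_size: int, sfnt_byte_length: int) -> int:
    """Return current_woff2_estimate_bytes for a given input format."""
    if fmt == "woff2":
        return file_size  # exact: the uploaded file already is a WOFF2
    if fmt in ("ttf", "otf", "woff"):
        # 0.55x typical-compression assumption applied to the raw sfnt bytes
        return math.floor(sfnt_byte_length * 0.55 + 0.5)
    raise ValueError(f"Unsupported font format: {fmt}")

print(baseline_woff2_bytes("woff2", 84000, 190000))  # 84000
print(baseline_woff2_bytes("ttf", 200000, 200000))   # 110000
```

For WOFF 1.0 input the relevant length is the inflated sfnt, not the size of the `.woff` file on disk.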
The 1500-byte floor on a tiny subset
Example: An icon font with 40 glyphs, none of which map into the standard subsets except a handful of currency symbols. The symbols projection lands just above the floor, and a slightly smaller share would hit it: the analyser refuses to project below 1500 bytes because a real WOFF2 has irreducible header and table overhead.
current_woff2_estimate_bytes: 9000
symbols glyph_count: 6 (proportion 6/40 = 0.15)
raw projection: 9000 * (0.15 * 1.25) = 1687.5, rounded to 1688 bytes
floored projection: max(1500, 1688) = 1688 -> still above floor
If proportion were 0.10: 9000 * (0.10 * 1.25) = 1125 -> max(1500, 1125) = 1500 (floored)
savings_pct then = round((1 - 1500/9000)*100) = 83%
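To see where the floor starts to bite for a given font, solve `current × proportion × 1.25 = 1500` for `proportion`. The Python sketch below does that; the function names are hypothetical, and it ignores the final whole-byte rounding, so treat the threshold as approximate at the boundary.

```python
import math

FLOOR_BYTES = 1500
OVERHEAD_FACTOR = 1.25

def floor_proportion(current: int) -> float:
    """Proportion below which the raw projection falls under the 1500-byte floor."""
    return FLOOR_BYTES / (OVERHEAD_FACTOR * current)

def min_glyphs_above_floor(current: int, total_glyphs: int) -> int:
    """Smallest subset glyph_count whose raw projection is not floored."""
    return math.ceil(floor_proportion(current) * total_glyphs)

print(round(floor_proportion(9000), 4))   # 0.1333
print(min_glyphs_above_floor(9000, 40))   # 6
```

For the 9000-byte icon font, any subset holding fewer than 6 of the 40 glyphs projects to exactly 1500 bytes.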
Pipe the JSON to a build step
Example: The analyser is browser-only for interactive use, but the same engine is callable through the JAD runner's local HTTP API. The schema is empty, so you just POST the file.
# Schema confirms no options:
curl -sS http://127.0.0.1:9789/v1/tools/glyph-count-analyzer
# -> { "options": [] }
# Run it on a font, capture the JSON:
curl -sS -X POST http://127.0.0.1:9789/v1/tools/glyph-count-analyzer/run \
-F 'files=@src/fonts/Brand-Regular.ttf' \
-o brand.glyph-count.json
# Then read total_glyphs / best savings_pct in your script.
From projection to exact build, in two tools
Example: The analyser tells you Latin Basic is the win; it does not produce a subset. Hand the charset to the subsetter, then re-measure. This is the canonical workflow the analyser is designed for.
1. glyph-count-analyzer: Latin Basic projects 81% savings.
2. font-subsetter: keep "latin" -> outputs Brand-Latin.woff2 (real bytes).
3. glyph-count-analyzer on the output: total_glyphs now ~210, confirming the subset took. current_woff2_estimate_bytes is now the real subset size.
Edge cases and what actually happens
Every row below was probed against the live API.
total_glyphs is higher than the codepoints you can type
By design: total_glyphs is font.glyphs.length. It counts every glyph slot, including .notdef, ligature glyphs, small-cap and oldstyle-figure variants, and composite/component glyphs that have no Unicode codepoint of their own. So a font with 230 typable characters can report 600+ glyphs. The per-subset glyph_count counts only mapped codepoints inside a range, which is why subset counts can look small next to the total. Both numbers are correct; they measure different things.
Projection differs from the real subset size
Expected (estimate): The projection is current × min(1, proportion × 1.25), a linear model. Real subsetting is not linear: the kept glyphs may be the complex composite ones (heavy) or simple Latin letters (light), the glyf/CFF table doesn't shrink proportionally to glyph count, and OpenType layout tables (GSUB/GPOS) carry fixed overhead. Treat the number as ±5% for typical Latin and looser for complex scripts. The subsetter gives the exact answer.
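Once you have built the real subset, comparing it with the projection is simple arithmetic. The Python sketch below shows one way to flag a projection that missed by more than a chosen tolerance. The function names, the 5% default, and the "actual" byte counts are hypothetical and only there to exercise the check.

```python
def projection_error_pct(projected_bytes: int, actual_bytes: int) -> float:
    """Signed error of the projection relative to the measured subset size."""
    return (projected_bytes - actual_bytes) / actual_bytes * 100

def within_tolerance(projected_bytes: int, actual_bytes: int,
                     tolerance_pct: float = 5.0) -> bool:
    """True when the projection landed within tolerance_pct of the real size."""
    return abs(projection_error_pct(projected_bytes, actual_bytes)) <= tolerance_pct

# Projected 15750 bytes; two hypothetical measured outcomes.
print(within_tolerance(15750, 15200))  # True  (about 3.6% high)
print(within_tolerance(15750, 12000))  # False (about 31% high)
```

A large miss is a cue to look at hinting and composite glyphs rather than a sign the subset failed.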
TTF/OTF input projects from a 0.55× guess
Expected: Only WOFF2 input anchors current_woff2_estimate_bytes to a real measured size. For TTF, OTF, and WOFF input the baseline is round(sfnt.byteLength × 0.55), a typical TrueType-to-WOFF2 compression ratio. If your font compresses unusually well or badly (lots of hinting, sparse glyphs), every projection inherits that error. For the tightest projections, build the WOFF2 first and analyse that.
Vietnamese and Latin Extended overlap
By design: The Vietnamese subset's ranges are U+0100–024F and U+1E00–1EFF (the same two ranges as Latin Extended-A/B) plus the single đồng sign U+20AB. So for most fonts the Vietnamese glyph_count and the Latin Extended glyph_count are nearly identical, and the projected sizes track each other. That's correct: Vietnamese diacritics live in those Extended blocks. The extra codepoint is just the currency mark.
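The overlap is easy to confirm by expanding both range lists into sets and diffing them. A short Python check, with the ranges transcribed from this page and `expand` as a hypothetical helper:

```python
def expand(ranges):
    """Expand inclusive (start, end) ranges into a set of codepoints."""
    return {cp for lo, hi in ranges for cp in range(lo, hi + 1)}

LATIN_EXTENDED = [(0x0100, 0x024F), (0x1E00, 0x1EFF)]
VIETNAMESE = [(0x0100, 0x024F), (0x1E00, 0x1EFF), (0x20AB, 0x20AB)]

extra = expand(VIETNAMESE) - expand(LATIN_EXTENDED)
print([f"U+{cp:04X}" for cp in sorted(extra)])  # ['U+20AB']
print(len(expand(LATIN_EXTENDED)))              # 592
```

So the two rows can differ by at most one in `glyph_count`, and only when the font maps the đồng sign.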
A .ttc TrueType Collection is rejected
Rejected (unsupported format): The format detector recognises ttcf magic and returns format ttc, but fileToSfntBuffer only handles ttf, otf, woff, and woff2, so a collection throws Unsupported font format: ttc. Collections bundle multiple faces (e.g. a family in one file); extract the single face you want with a font editor or fonttools ttx, then analyse that face.
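If you want to catch collections before uploading, the first four bytes of the file are enough. The Python sketch below mirrors the detect-then-reject behaviour described above using the standard container signatures (`wOF2`, `wOFF`, `ttcf`, `OTTO`, `0x00010000`, `true`). It is an illustration, not the analyser's detector, and the function names are hypothetical.

```python
SUPPORTED = {"ttf", "otf", "woff", "woff2"}

def detect_format(data: bytes) -> str:
    """Classify a font file by its 4-byte signature."""
    magic = data[:4]
    if magic == b"wOF2":
        return "woff2"
    if magic == b"wOFF":
        return "woff"
    if magic == b"ttcf":
        return "ttc"  # TrueType/OpenType Collection
    if magic == b"OTTO":
        return "otf"  # CFF-flavoured OpenType
    if magic in (b"\x00\x01\x00\x00", b"true"):
        return "ttf"  # TrueType outlines
    return "unknown"

def ensure_supported(data: bytes) -> str:
    """Return the format, or raise with the same message the analyser reports."""
    fmt = detect_format(data)
    if fmt not in SUPPORTED:
        raise ValueError(f"Unsupported font format: {fmt}")
    return fmt

print(detect_format(b"ttcf" + b"\x00" * 8))  # ttc
```

Running `ensure_supported` over a fonts directory in a pre-commit hook is one way to find stray `.ttc` files before they reach the analyser.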
A symbol/icon font lands almost entirely in Private Use
Expected (low coverage): Icon fonts map their glyphs into the Private Use Area (U+E000–F8FF) or the symbol blocks. None of the six analysed subsets cover the PUA, so every subset's glyph_count will be near zero and savings_pct will read very high, because each projection sits at or near the 1500-byte floor. That is misleading, because subsetting to "latin" would delete every icon. For icon fonts, the analyser's projections are not meaningful; subset by an explicit codepoint list with the character whitelist builder instead.
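A quick way to tell whether a font is in this situation is to measure how much of its character map sits in the Private Use Area. The Python sketch below assumes you already have the list of mapped codepoints from some font parser; `pua_share` and the sample mapping are hypothetical.

```python
PUA_START, PUA_END = 0xE000, 0xF8FF  # Basic Multilingual Plane Private Use Area

def pua_share(mapped_codepoints) -> float:
    """Fraction of distinct mapped codepoints that fall in the PUA."""
    unique = set(mapped_codepoints)
    if not unique:
        return 0.0
    in_pua = sum(1 for cp in unique if PUA_START <= cp <= PUA_END)
    return in_pua / len(unique)

# Hypothetical icon font: 34 PUA icons plus 6 currency symbols.
icon_font = list(range(0xE900, 0xE922)) + list(range(0x20A0, 0x20A6))
print(pua_share(icon_font))  # 0.85
```

When the share is high, skip the fixed-subset rows and go straight to an explicit codepoint list.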
File over the tier size limit is refused before parsing
413-style reject: The page checks file size against the active tier (free 5 MB, pro 50 MB, developer 1 GB) and refuses with "exceeds the {tier} tier per-job limit" before the font is ever parsed. The advertised 1,000-glyph free ceiling is a tier characteristic shown in the dashboard; the hard gate the UI enforces for this analyser is the byte limit. A large CJK font will usually hit the size limit long before any glyph cap matters.
savings_pct can read 0% for an already-minimal font
Expected: If a subset's glyph_count is most of your font (a proportion of 0.8 or more), min(1, proportion × 1.25) clamps to 1, the projected size equals the current size, and savings_pct reads 0. That's the analyser telling you there is nothing to gain from that subset: the font is already close to that charset. Look at a smaller subset row, or accept that this font is already as lean as a fixed-subset cut will make it.
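The clamp is easiest to see by sweeping the proportion. The Python sketch below evaluates only the `min(1, proportion × 1.25)` part of the model, so it assumes a font large enough that the 1500-byte floor never applies. `savings_for` is a hypothetical name, and the half-up rounding is an assumption about the handler's `round`.

```python
import math

def savings_for(proportion: float) -> int:
    """savings_pct implied by the clamp alone (floor ignored)."""
    scale = min(1.0, proportion * 1.25)
    return math.floor((1 - scale) * 100 + 0.5)

for p in (0.4, 0.6, 0.85, 1.0):
    print(p, savings_for(p))
# 0.4 50
# 0.6 25
# 0.85 0
# 1.0 0
```

Because of the 1.25 factor, projected savings run out once a subset holds four fifths of the font's glyphs, not when it holds all of them.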
A static instance of a variable font reports only that instance
Expected: If you analyse a single static WOFF2 cut from a variable font, you see that one instance's glyph total. The variable font itself carries the same glyph set across all weights in one file. To compare "variable vs N statics" by size, freeze the instances you need with the variable font freezer and analyse each. The glyph total won't change, but the file size will.
Frequently asked questions
Can I project savings for a custom character set?
Not with this analyser. It always projects the same six fixed subsets (Latin, Latin Extended, Cyrillic, Greek, Vietnamese, symbols) and has no charset input — its schema is empty. Older copy mentioning "custom" is wrong for this tool. If you want size for an arbitrary character list, build the actual subset with the font subsetter or restrict to an exact glyph set with the character whitelist builder and read the real output size.
How accurate is the projected size?
Within roughly ±5% for typical Latin fonts, looser for CJK, icon, and heavily-hinted fonts. The projection is a linear model (current × min(1, proportion × 1.25), floored at 1500 bytes); real subsetting shrinks tables non-linearly. It's accurate enough to rank subsets and set a budget, not to promise an exact byte count. For the exact number, run the subset and measure the output.
Why does the total glyph count look so high?
Because it's font.glyphs.length — every glyph slot, including .notdef, ligatures, stylistic and figure variants, and composite component glyphs that have no codepoint. A font with ~230 typable Latin characters routinely reports 500–700 glyphs once OpenType features and composites are counted. The per-subset glyph_count counts only codepoints inside a range, which is the smaller, more intuitive number.
Which input formats can I upload?
TTF, OTF, WOFF 1.0, and WOFF2. WOFF2 is Brotli-decompressed in the browser, WOFF 1.0 is zlib-inflated table-by-table, and TTF/OTF are read directly by opentype.js. TrueType Collections (.ttc) are recognised but rejected — extract a single face first. EOT and SVG fonts are not supported.
Why is the WOFF2 projection more accurate for WOFF2 input?
Because for WOFF2 input the analyser uses the real file.size as the baseline current_woff2_estimate_bytes. For TTF/OTF/WOFF it estimates the baseline as sfnt bytes × 0.55 — a typical compression ratio that can be off if your font is unusually dense or sparse. Every per-subset projection scales off that baseline, so a real WOFF2 baseline yields tighter projections. Build the WOFF2 first, then analyse it.
What's the 1500-byte floor about?
A real WOFF2 file has irreducible overhead (header, table directory, a few required tables), so it can't shrink below roughly 1.5 KB no matter how few glyphs you keep. The analyser floors every projection with max(1500, …) so a tiny subset doesn't project to an impossibly small size. If you see exactly 1500 bytes in a row, the model floored it.
Does this tool actually subset my font?
No. It only counts and projects — it never modifies or downloads a new font. The output is JSON describing your font and the six projections. To produce a subset font file, take the charset the analyser recommends and run the font subsetter; to convert a finished subset to WOFF2 use the TTF to WOFF2 converter.
Why are the Vietnamese and Latin Extended rows almost the same size?
Because the Vietnamese subset reuses the Latin Extended ranges (U+0100–024F and U+1E00–1EFF) and only adds one codepoint, the đồng sign U+20AB. Vietnamese diacritics genuinely live in those Extended blocks, so for most fonts both rows count nearly the same glyphs and project nearly the same size. Pick whichever label matches your audience; the bytes are effectively identical.
Will my font be uploaded anywhere?
No. The font is parsed in your browser: wawoff2 (WebAssembly) for WOFF2, pako for WOFF, opentype.js for parsing. The bytes never leave the page. Only an anonymous "one file processed" counter is recorded for signed-in dashboard stats (no content), and you can opt out in account settings. This matters for unreleased or licensed foundry fonts you can't legally send to a server.
Can I run this in a build or CI pipeline?
Yes. GET /api/v1/tools/glyph-count-analyzer returns the (empty) option schema; pair the @jadapps/runner once and POST the font to http://127.0.0.1:9789/v1/tools/glyph-count-analyzer/run as a multipart files field. You get the same JSON back. Common use: read total_glyphs or the best savings_pct and fail the build if a font drifted. The design-system budget script guide has full patterns.
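A budget check over the saved JSON could look like the Python sketch below. It assumes the report has the fields shown on this page (`total_glyphs`, `current_woff2_estimate_bytes`, `projections`). The function names and the thresholds are hypothetical, so pick limits that match your own design-system budget.

```python
import json

def check_budget(report: dict, max_glyphs: int, max_bytes: int) -> list[str]:
    """Return human-readable budget violations; an empty list means the font passes."""
    problems = []
    if report["total_glyphs"] > max_glyphs:
        problems.append(f"total_glyphs {report['total_glyphs']} exceeds {max_glyphs}")
    if report["current_woff2_estimate_bytes"] > max_bytes:
        problems.append(
            f"current_woff2_estimate_bytes {report['current_woff2_estimate_bytes']} "
            f"exceeds {max_bytes}"
        )
    return problems

def best_savings(report: dict) -> int:
    """Highest savings_pct across the projected subsets (0 if there are none)."""
    return max((row["savings_pct"] for row in report["projections"]), default=0)

def check_file(path: str, max_glyphs: int, max_bytes: int) -> list[str]:
    """Load a report written by the runner and check it against the budget."""
    with open(path, encoding="utf-8") as handle:
        return check_budget(json.load(handle), max_glyphs, max_bytes)

report = {
    "total_glyphs": 1400,
    "current_woff2_estimate_bytes": 84000,
    "projections": [{"subset": "Cyrillic", "savings_pct": 77}],
}
print(check_budget(report, max_glyphs=1500, max_bytes=50_000))
# ['current_woff2_estimate_bytes 84000 exceeds 50000']
```

In a pipeline, call `check_file("brand.glyph-count.json", ...)` and exit non-zero when the returned list is not empty.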
Should I subset every font?
Subset when a charset that covers your audience is meaningfully smaller than the full font — the analyser's savings_pct quantifies it. A 30%+ projected saving on a font you serve on every page is worth the build step. A 10 KB icon font, or a font where the smallest viable subset already projects 0% savings, isn't worth subsetting. Let the projection table decide rather than a blanket rule.
How do I serve multiple languages without shipping one giant font?
Use unicode-range to ship one @font-face per subset; the browser fetches only the WOFF2 for the script actually on the page. Run the analyser to pick which subsets to ship, build each with the font subsetter, then wire them into CSS with the @font-face generator and add <link rel=preload> for the critical one via the preload tag builder.
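As an illustration of the CSS this produces, the Python sketch below renders one `@font-face` rule per subset from the codepoint ranges listed on this page. It is a stand-in for what a generator might emit, not the output of the @font-face generator itself. The family name is a placeholder and the function names are hypothetical.

```python
def unicode_range(ranges) -> str:
    """Format inclusive (start, end) ranges as a CSS unicode-range value."""
    parts = []
    for lo, hi in ranges:
        parts.append(f"U+{lo:04X}" if lo == hi else f"U+{lo:04X}-{hi:04X}")
    return ", ".join(parts)

def font_face(family: str, url: str, ranges) -> str:
    """Render one @font-face rule restricted to the given ranges."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{url}") format("woff2");\n'
        f"  unicode-range: {unicode_range(ranges)};\n"
        "}"
    )

latin = [(0x0020, 0x007E), (0x00A0, 0x00FF)]
print(unicode_range(latin))  # U+0020-007E, U+00A0-00FF
print(font_face("Brand", "Brand-Latin.woff2", latin))
```

Declaring the same `font-family` in every rule is what lets the browser treat the subsets as one font and fetch only the files whose ranges appear on the page.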
Privacy first
Every JAD Font tool runs entirely in your browser using opentype.js and the wawoff2 WASM Brotli encoder. Your fonts never leave your device — verified by zero outbound network requests during processing.