How to automate whitelist subsetting for marketing hero fonts
- Step 1: Mark up the hero text so it's machine-extractable. Give the hero element a stable selector (e.g. `[data-hero]`). After your framework renders the page, you'll scan the built HTML for that node's text content and collect its unique codepoints. Fixed copy in a layout or a content file works just as well.
- Step 2: Extract the unique characters from the rendered output. Read the built HTML (or the source string) and build a Set of codepoints with `Array.from(text)` + `codePointAt(0)`, the same decoding the browser tool uses. Include the space character if the hero has spaces.
- Step 3: Subset with the browser-parity opentype.js script. Load the font, keep glyph 0 (`.notdef`) plus any glyph whose `unicode`/`unicodes` intersects your set, and rebuild with `new opentype.Font({ familyName, styleName, unitsPerEm, ascender, descender, glyphs })`. This is the literal browser-tool algorithm, so output matches what the UI would produce.
- Step 4: Choose your engine based on whether kerning matters. The opentype.js path drops `kern`/`GPOS`/`GSUB`, which is fine for most caps heroes. If your hero needs kerning or ligatures, swap in `pyftsubset --text-file=hero-chars.txt --layout-features='*'` or `hb-subset`, which preserve layout tables.
- Step 5: Convert to WOFF2 and gate on size. The opentype.js script writes TTF; convert with [TTF→WOFF2](/font-tools/ttf-to-woff2) in the manual flow, or use `pyftsubset --flavor=woff2` to emit WOFF2 directly. Add a CI check that fails if the output exceeds a budget so a copy edit can't silently bloat the hero font.
- Step 6: Make it idempotent and wire into deploy. Cache by (font hash + hero string) and skip when unchanged, so the subset only runs when the hero copy or font changes. Run it as a prebuild/postbuild step and emit the WOFF2 into your assets directory with an `@font-face` from the [Font-Face Generator](/font-tools/font-face-generator).
Engines for whitelist subsetting in CI
The opentype.js script matches the JAD browser tool exactly but drops layout tables. pyftsubset/hb-subset preserve kerning and ligatures. Choose by whether your hero needs them.
| Engine | Runtime | Kerning / OT features | Output | Matches browser tool? |
|---|---|---|---|---|
| Node + opentype.js | Node (pure JS) | Dropped | TTF (then WOFF2 step) | Yes, byte-for-byte |
| pyftsubset (fontTools) | Python | Preserved (--layout-features='*') | TTF or WOFF2 (--flavor) | No, keeps more tables |
| hb-subset (harfbuzz) | C / WASM | Preserved | TTF / WOFF2 | No, keeps more tables |
| @jadapps/runner API | Node (local runner) | Engine-dependent | Per tool schema | Runs the JAD tool locally |
The whitelist-subset algorithm (browser parity)
These are the exact rules the in-browser Whitelist Builder applies. Replicate them in your script for identical output.
| Step | Rule |
|---|---|
| Build codepoint set | Array.from(text) then codePointAt(0) for each char; dedupe into a Set |
| Always keep glyph 0 | .notdef is pushed first, unconditionally (spec requirement) |
| Keep matching glyphs | Keep glyph i if g.unicode is in the set, or g.unicodes intersects it |
| Rebuild the font | new Font({ familyName, styleName, unitsPerEm, ascender, descender, glyphs }) |
| What's lost | kern, GPOS, GSUB, hinting tables not attached to kept glyphs, full name table |
| Output | font.toArrayBuffer() → uncompressed TTF |
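
The first rule in the table is easy to get subtly wrong, so here is a minimal sketch of why the decoding goes through `Array.from` and `codePointAt(0)` rather than a plain character split. The function names and the sample string (hero copy with a rocket emoji) are illustrative assumptions, not part of the browser tool.

```javascript
// Build the deduplicated codepoint set for a whitelist string.
function codepointSet(text) {
  // Array.from walks the string by codepoint, so a surrogate pair
  // (such as an emoji) becomes one element instead of two halves.
  return new Set(Array.from(text, (ch) => ch.codePointAt(0)));
}

// Naive variant for comparison: splits on UTF-16 code units.
function codeUnitSet(text) {
  return new Set(text.split("").map((ch) => ch.charCodeAt(0)));
}

const sample = "GO GO \u{1F680}"; // hypothetical hero copy
const byCodepoint = codepointSet(sample);
const byCodeUnit = codeUnitSet(sample);

// The codepoint set holds U+1F680 itself; the code-unit set only holds
// its two surrogate halves, which match no glyph in the font.
console.log(byCodepoint.has(0x1f680), byCodeUnit.has(0x1f680));
```

For hero copy that stays inside the Basic Multilingual Plane the two approaches agree, so the difference only shows up once someone adds an emoji or another astral character.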
Cookbook
Copy-paste starting points. The opentype.js script is browser-parity; the pyftsubset path is for kerning-sensitive heroes; the runner path needs no Python.
Extract hero characters from built HTML
Example: Scan the rendered output for the hero node and collect its unique codepoints, using the same decoding as the browser tool.
// extract-hero.mjs
import fs from "node:fs";

const html = fs.readFileSync("./dist/index.html", "utf8");
// Capture the text between the [data-hero] opening tag and the next "<".
const m = html.match(/data-hero[^>]*>([^<]*)</);
// Trim only the outer edges (markup indentation); inner spaces are kept.
const heroText = (m ? m[1] : "JAD APPS — Build Faster").trim();

// Array.from iterates by codepoint, so astral characters stay whole.
const chars = new Set();
for (const ch of Array.from(heroText)) {
  const cp = ch.codePointAt(0);
  if (cp != null) chars.add(String.fromCodePoint(cp));
}
fs.writeFileSync("./hero-chars.txt", [...chars].join(""));
console.log(`hero glyphs: ${chars.size}`);

Subset with browser-parity opentype.js
Example: This reproduces the in-browser Character Whitelist Builder exactly: .notdef kept, glyphs matched by unicode/unicodes, rebuilt with new Font(). Output is uncompressed TTF; convert to WOFF2 next.
// subset-hero.mjs (usage: node subset-hero.mjs Brand.ttf hero-chars.txt)
import fs from "node:fs";
import opentype from "opentype.js";

const [, , fontPath, charsFile] = process.argv;
// Strip only a trailing newline. A full .trim() would drop a space that ends
// the list: the hero "GO GO GO" dedupes to "GO ", with the space last.
const keep = new Set(
  Array.from(fs.readFileSync(charsFile, "utf8").replace(/\r?\n$/, ""))
    .map(c => c.codePointAt(0))
);

const font = opentype.loadSync(fontPath);
const glyphs = [font.glyphs.get(0)]; // .notdef, always
for (let i = 1; i < font.glyphs.length; i++) {
  const g = font.glyphs.get(i);
  // A glyph can map to several codepoints; fall back to the single one.
  const us = g.unicodes?.length ? g.unicodes : [g.unicode];
  if (us.some(u => u != null && keep.has(u))) glyphs.push(g);
}
if (glyphs.length === 1) throw new Error("Subset would be empty.");

const out = new opentype.Font({
  familyName: font.getEnglishName("fontFamily") || "Brand",
  styleName: font.getEnglishName("fontSubfamily") || "Regular",
  unitsPerEm: font.unitsPerEm,
  ascender: font.ascender,
  descender: font.descender,
  glyphs,
});
fs.writeFileSync("Brand.hero.ttf", Buffer.from(out.toArrayBuffer()));
console.log(`kept ${glyphs.length} glyphs (incl .notdef)`);
// NOTE: kern/GPOS/GSUB are dropped, same as the browser tool.

Kerning-preserving alternative (pyftsubset)
Example: If the hero needs kern pairs or ligatures, the opentype.js path won't do. pyftsubset keeps layout tables and can emit WOFF2 directly. Run pip install fonttools brotli first.
pyftsubset Brand.ttf \
  --text-file=hero-chars.txt \
  --layout-features='*' \
  --flavor=woff2 \
  --output-file=Brand.hero.woff2
# Layout preserved; web-ready WOFF2 in one command.
GitHub Actions — subset + size budget
Example: Regenerate the hero font in CI and fail the build if it regresses past a budget, catching the case where a copy edit widens the glyph set.
- name: Build hero font
  run: |
    node extract-hero.mjs
    node subset-hero.mjs src/fonts/Brand.ttf hero-chars.txt
    mkdir -p dist/fonts  # the redirect below fails if the directory is missing
    npx ttf2woff2 < Brand.hero.ttf > dist/fonts/Brand.hero.woff2
- name: Enforce hero font budget
  run: |
    SIZE=$(stat -c%s dist/fonts/Brand.hero.woff2)
    echo "Brand.hero.woff2 = $SIZE bytes"
    test "$SIZE" -le 4096  # fail if over 4 KB

No-toolchain path via the @jadapps/runner API
Example: Run the actual JAD Character Whitelist Builder locally via the runner when you don't want Python or native deps. The schema is option-only (whitelistChars); the runner executes on your machine so the font stays local.
# 1. Inspect the schema (one required option: whitelistChars)
curl -H "Authorization: Bearer $JAD_API_KEY" \
  https://jadapps.example/api/v1/tools/character-whitelist-builder
# 2. Dispatch to your paired local runner (font stays local):
#    POST file + inputs to the runner endpoint per the API docs:
#    127.0.0.1:9789/v1/tools/character-whitelist-builder/run
#    inputs={"whitelistChars":"JAD APPS — Build Faster"}
# Output is TTF; convert to WOFF2 as a follow-up step.

Edge cases and what actually happens
The cases below are the failure modes you are most likely to hit once this pipeline runs unattended, with what actually happens in each and how to handle it.
Pipeline output drops kerning vs a hand-tuned hero
Layout dropped: The browser-parity opentype.js script drops kern/GPOS/GSUB, exactly like the in-browser tool. If a designer hand-kerned the hero in mockups, the deployed micro-font will show default spacing. For caps geometric heroes this is usually fine; for refined editorial headlines, switch to pyftsubset --layout-features='*' or hb-subset in the pipeline, both of which preserve the tables.
A new copy edit introduces a glyph that wasn't subset
Regenerate: This is the whole point of automation. If marketing changes 'Build Faster' to 'Build Faster™', the ™ (U+2122) won't be in the old subset and renders as .notdef. Because the pipeline re-derives the whitelist from the built HTML on every deploy, regeneration picks it up automatically, provided your extraction step reads the final rendered text and not a stale source string.
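
To make a stale subset fail loudly instead of shipping, a pipeline could compare the current hero text against the codepoints the last subset covered. The sketch below is one way to do that; `missingCodepoints` and the way `oldCoverage` is built are assumptions for illustration (in a real build you might fill the set from the `unicode`/`unicodes` values of the glyphs the subset kept).

```javascript
// Return the codepoints in heroText that the existing subset does not cover.
function missingCodepoints(heroText, coveredCodepoints) {
  const missing = new Set();
  for (const ch of heroText) {            // for...of iterates by codepoint
    const cp = ch.codePointAt(0);
    if (!coveredCodepoints.has(cp)) missing.add(cp);
  }
  return [...missing];
}

// Coverage of a subset that was built for the old copy (hypothetical).
const oldCoverage = new Set(Array.from("Build Faster", (ch) => ch.codePointAt(0)));

// Marketing adds a trademark sign (U+2122) to the hero.
const missing = missingCodepoints("Build Faster\u2122", oldCoverage);
const labels = missing.map((cp) => "U+" + cp.toString(16).toUpperCase().padStart(4, "0"));
console.log(labels);
```

A non-empty result is a reasonable trigger for either regenerating the subset or failing the deploy.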
Extraction misses dynamically-injected text
Build-time only: Glyph extraction only sees text present in the built output. If the hero string is injected client-side (e.g. from an API after hydration), the build-time scan won't see it and the subset will tofu. For any runtime-variable hero, fall back to a charset subset (Latin range) instead of a whitelist so unexpected characters still render.
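
For the charset fallback, one option is to generate the character string for a range and feed it to the same subset script in place of the extracted hero characters. This is a sketch under assumptions: the two ranges (printable Basic Latin and the printable part of Latin-1 Supplement) are an illustrative choice of "Latin range", and `latinFallbackChars` is a hypothetical helper name.

```javascript
// Expand an inclusive codepoint range into a string of characters.
function rangeToString(first, last) {
  let out = "";
  for (let cp = first; cp <= last; cp++) out += String.fromCodePoint(cp);
  return out;
}

// Assumed fallback coverage: U+0020..U+007E and U+00A0..U+00FF.
const LATIN_RANGES = [
  [0x20, 0x7e],
  [0xa0, 0xff],
];

function latinFallbackChars() {
  return LATIN_RANGES.map(([first, last]) => rangeToString(first, last)).join("");
}

const fallback = latinFallbackChars();
console.log(`fallback charset: ${Array.from(fallback).length} characters`);
```

Writing `fallback` to `hero-chars.txt` would let the rest of the pipeline run unchanged, at the cost of a larger font than an exact whitelist.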
The script forgets the space character
Expected: If your extraction strips whitespace, the space glyph (U+0020) won't be in the set and 'JAD APPS' will render with a .notdef box between words. The browser tool keeps inner spaces; your script must too. Don't .replace(/\s/g,'') the hero text before building the codepoint set. Keep the space.
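
Text pulled out of built HTML often carries newlines and indentation from the markup, which is usually what tempts people into stripping whitespace. A gentler cleanup, sketched below, collapses runs of ASCII whitespace to a single space and trims the edges. `normalizeHeroText` is a hypothetical helper, and treating markup whitespace this way is an assumption based on how browsers collapse it under the default `white-space` setting.

```javascript
// Collapse markup whitespace without losing the spaces between words.
// Only ASCII whitespace is collapsed, so a no-break space (U+00A0)
// stays a distinct character in the whitelist.
function normalizeHeroText(raw) {
  return raw.replace(/[ \t\r\n\f]+/g, " ").trim();
}

const raw = "\n    JAD APPS\n  ";            // as it might appear in built HTML
const normalized = normalizeHeroText(raw);   // inner space survives
const stripped = raw.replace(/\s/g, "");     // the mistake: space is gone

console.log(JSON.stringify(normalized), JSON.stringify(stripped));
```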
CFF/OTF source font won't write via opentype.js
Writer error: Just like the browser tool, the opentype.js script's toArrayBuffer() can fail on CFF (PostScript-outline) OTF fonts. In a pipeline this surfaces as a thrown error mid-build. Either convert the source to a TrueType-outline TTF first, or use pyftsubset/hb-subset, which handle CFF and CFF2 properly.
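
Rather than discovering a CFF source through a thrown error mid-build, a pipeline could check the first four bytes of the font file up front. OpenType fonts with CFF or CFF2 outlines start with the tag `OTTO`, while TrueType-outline fonts start with `0x00010000` (or `true` in some Apple fonts). The sketch below assumes the bytes are already in memory; `sfntFlavor` is a hypothetical helper name.

```javascript
// Classify a font by its sfnt version tag (the first four bytes).
function sfntFlavor(bytes) {
  if (bytes.length < 4) return "unknown";
  const tag = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
  if (tag === "OTTO") return "cff";                                   // CFF or CFF2 outlines
  if (tag === "\x00\x01\x00\x00" || tag === "true") return "truetype"; // glyf outlines
  return "unknown";                                                    // WOFF, collections, etc.
}

// Hand-made headers for illustration; a real call would pass file bytes.
const otfHeader = Uint8Array.of(0x4f, 0x54, 0x54, 0x4f);
const ttfHeader = Uint8Array.of(0x00, 0x01, 0x00, 0x00);
console.log(sfntFlavor(otfHeader), sfntFlavor(ttfHeader));
```

A build script could call this on the result of `fs.readFileSync(fontPath)` and route `"cff"` sources to pyftsubset or hb-subset instead of the opentype.js path.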
Output ships as TTF and bloats the page
Convert to WOFF2: The opentype.js step emits an uncompressed TTF; shipping it directly is 2–3× larger than necessary. Always add a WOFF2 conversion (ttf2woff2, or pyftsubset --flavor=woff2 to skip the two-step flow). Gate on the WOFF2 size, not the TTF size, so the budget reflects the real wire cost.
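
A size gate can also confirm that the file being measured really is WOFF2, since an accidental TTF would otherwise be judged against the wrong budget. The sketch below checks the `wOF2` signature that WOFF2 files begin with and then compares the byte length to a budget. `checkHeroFont` and the 4096-byte figure in the example are assumptions that mirror the shell check in the cookbook.

```javascript
// Validate that `bytes` is a WOFF2 file within the size budget.
function checkHeroFont(bytes, budgetBytes) {
  const signature = String.fromCharCode(...bytes.subarray(0, 4));
  if (signature !== "wOF2") {
    return { ok: false, reason: "not a WOFF2 file" };
  }
  if (bytes.length > budgetBytes) {
    return { ok: false, reason: `${bytes.length} bytes exceeds budget of ${budgetBytes}` };
  }
  return { ok: true, reason: "within budget" };
}

// Fake payloads for illustration: only the signature and length matter here.
function fakeFont(signature, length) {
  const bytes = new Uint8Array(length);
  for (let i = 0; i < 4; i++) bytes[i] = signature.charCodeAt(i);
  return bytes;
}

console.log(checkHeroFont(fakeFont("wOF2", 3000), 4096));
console.log(checkHeroFont(fakeFont("wOF2", 5000), 4096));
```

In CI the function could be fed `fs.readFileSync("dist/fonts/Brand.hero.woff2")` and the process exited with a non-zero code when `ok` is false, which avoids relying on the GNU-specific `stat -c%s`.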
Build got slower because the subset runs every time
Make it idempotent: Subsetting on every build, including local dev, is wasteful. Cache by a hash of (source font bytes + hero string); skip when unchanged. Or restrict the subset step to production builds so local iteration stays fast. The output is deterministic for a given (font, string), so caching is safe.
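
The cache key described above can be sketched as a single hash over the font bytes followed by the hero string. To keep the example dependency-free it uses 32-bit FNV-1a, which is a stand-in chosen for brevity; a real pipeline would more likely use SHA-256 via `createHash` from `node:crypto`. The helper names and sample inputs are hypothetical.

```javascript
// 32-bit FNV-1a over a byte sequence, continuing from `seed`.
function fnv1a(bytes, seed = 0x811c9dc5) {
  let h = seed;
  for (const b of bytes) {
    h ^= b;
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return h;
}

// Key over (font bytes, separator, hero text) as an 8-digit hex string.
function cacheKey(fontBytes, heroText) {
  const afterFont = fnv1a(fontBytes);
  const afterSeparator = fnv1a([0xff], afterFont); // 0xFF never appears in UTF-8 text
  const final = fnv1a(new TextEncoder().encode(heroText), afterSeparator);
  return final.toString(16).padStart(8, "0");
}

// Rebuild only when there is no stored key or the inputs changed.
function shouldRebuild(previousKey, fontBytes, heroText) {
  return previousKey !== cacheKey(fontBytes, heroText);
}

const fontBytes = Uint8Array.of(0x00, 0x01, 0x00, 0x00, 0x12, 0x34); // placeholder bytes
const key = cacheKey(fontBytes, "JAD APPS");
console.log(key, shouldRebuild(key, fontBytes, "JAD APPS"));
```

The key would be stored next to the generated font (for example in a small sidecar file) and compared at the start of the next build.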
Multiple weights need separate runs
One file each: The Whitelist Builder (and the script) processes one font file at a time. A hero in Bold plus a sub-headline in Regular means two subset runs producing two micro-fonts. Loop your weights in the build script; give each output a distinct filename and @font-face font-weight so the browser picks the right one.
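
The per-weight loop can be driven from a small table, with one entry per source file. The sketch below only generates the `@font-face` rules; the family name, source paths, sample copy, and output filename pattern are all hypothetical, and the subset call itself would run once per entry using the scripts from the cookbook.

```javascript
// One entry per weight: source font, the text it must cover, and its CSS weight.
const WEIGHTS = [
  { weight: 400, source: "src/fonts/Brand-Regular.ttf", text: "Ship in days, not months" },
  { weight: 700, source: "src/fonts/Brand-Bold.ttf", text: "JAD APPS" },
];

// Distinct output filename per weight so the files never overwrite each other.
function outputName(entry) {
  return `Brand.hero.${entry.weight}.woff2`;
}

// Build the @font-face rule that points the browser at the right file.
function fontFaceRule(entry, family = "Brand Hero") {
  return [
    "@font-face {",
    `  font-family: "${family}";`,
    `  font-weight: ${entry.weight};`,
    `  src: url("/fonts/${outputName(entry)}") format("woff2");`,
    "  font-display: swap;",
    "}",
  ].join("\n");
}

// Use an arrow function so map's index argument is not passed as `family`.
const css = WEIGHTS.map((entry) => fontFaceRule(entry)).join("\n\n");
console.log(css);
```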
Frequently asked questions
Can I run the Character Whitelist Builder in CI?
The browser tool itself is interactive, but you have two automation paths. (1) Reproduce its algorithm in Node with opentype.js — the cookbook script keeps .notdef, matches glyphs by unicode/unicodes, and rebuilds with new Font(), producing byte-for-byte the same TTF. (2) Run the actual JAD tool locally via the @jadapps/runner: GET /api/v1/tools/character-whitelist-builder returns the schema (one option, whitelistChars) and the runner executes the job on your machine at 127.0.0.1:9789/v1/tools/character-whitelist-builder/run.
Does the automated subset keep kerning?
Not the opentype.js path. It drops kern/GPOS/GSUB, exactly like the in-browser tool, because opentype.js's writer doesn't carry those tables. If your hero needs kerning or ligatures, use pyftsubset --text-file=hero-chars.txt --layout-features='*' or hb-subset instead; both preserve layout tables. For most caps hero headlines the dropped kerning is invisible, so the JS path is fine.
How do I extract the exact hero characters automatically?
Scan your built HTML for the hero node (give it a stable selector like data-hero) and collect unique codepoints with Array.from(text) + codePointAt(0) — the same decoding the browser tool uses. Keep the space character. The cookbook's extract-hero.mjs does this; for whole-site discovery, glyphhanger crawls pages and collects all used glyphs.
Why is the output a TTF and not a WOFF2?
The browser tool and the opentype.js script both write sfnt/TTF via toArrayBuffer() — there's no WOFF2 writer in opentype.js. Add a WOFF2 step: ttf2woff2 on the file, or skip the two-step entirely with pyftsubset --flavor=woff2, which subsets and emits WOFF2 in one command (and preserves kerning).
Do I need to keep the space character in the whitelist?
Yes. The space is U+0020, a real glyph in the whitelist. If you strip whitespace from the hero text before building the codepoint set, 'JAD APPS' renders a .notdef box where the space should be. The browser tool keeps inner spaces (it only trims the outer edges of the input), so your script should too.
How do I gate font size in CI?
After producing the WOFF2, read its byte size and fail the build if it exceeds a budget — the cookbook shows test "$SIZE" -le 4096 for a 4 KB hero. Gate on the WOFF2 size (the real wire cost), not the intermediate TTF. This catches the regression where a copy edit widens the glyph set or someone swaps in an unsubsetted font.
What happens if the hero copy changes between deploys?
Because the pipeline re-derives the whitelist from the rendered output on every deploy, a copy change automatically updates the glyph set — provided your extraction reads the final built HTML, not a stale source string. This is the main reason to automate: a manual whitelist falls out of sync the moment marketing edits the hero.
Can I subset multiple weights in one run?
The tool and the script handle one font file per run, so loop your weights in the build script — one subset call per weight, each writing a distinct file. Give each @font-face the matching font-weight so the browser selects the right one. A hero in Bold and a sub-headline in Regular = two micro-fonts.
How do I keep the build reproducible?
The output is deterministic for a given (source font, character string), so pin your opentype.js / fontTools version, commit or hash the hero string, and cache by (font hash + string). Same inputs → same bytes, which keeps long-term HTTP caching effective and lets you skip the subset step when nothing changed.
What if my source font has CFF (PostScript) outlines?
opentype.js's writer can fail on CFF/OTF fonts — both the browser tool and the parity script throw on those. Convert the OTF to a TrueType-outline TTF before subsetting, or use pyftsubset/hb-subset, which handle CFF and CFF2 cleanly. This is one case where the kerning-preserving engines are also the more robust choice.
How does this differ from the general font-subsetting pipeline guide?
The font-subsetting build-pipeline guide covers charset/range subsetting (keep all of Latin, etc.) for body fonts. This guide is specifically about exact-character subsetting for fixed hero/marketing copy — smaller output, but it tofus if the text changes without a rebuild, which is exactly why automating the regeneration matters.
Is the font uploaded to JAD's servers when I use the runner API?
No. The @jadapps/runner executes the tool locally on your machine — GET /api/v1/tools/character-whitelist-builder returns only the option schema, and the actual font + processing stay on 127.0.0.1:9789. Nothing about the font bytes leaves your environment, which matters for unreleased brand fonts in a marketing build.
Privacy first
Every JAD Font tool runs entirely in your browser using opentype.js and the wawoff2 WASM Brotli encoder. Your fonts never leave your device — verified by zero outbound network requests during processing.