How to join two excel product catalogs without matching skus using fuzzy product names
- Step 1Prepare both catalogs — Export the supplier price list (product_name, price, supplier_sku) and your internal product master (internal_name, internal_sku) as XLSX or CSV.
- Step 2Drop your master as File A — File A is the internal product master; every internal product becomes one output row whether or not a supplier match is found.
- Step 3Drop the supplier list as File B — File B is the supplier price list whose price and SKU columns get appended as
right_*to the best match. - Step 4Set the join columns — Left join col =
internal_name; Right join col =product_name. Type the exact headers — matching is case-insensitive and trimmed. - Step 5Set Threshold % and run — Try 78 for product names. The panel reports 'X matched · Y unmatched (threshold: 78%)'. Reorder-heavy names may need a lower threshold.
- Step 6Verify before updating the master — Download, sort by
__match_scoreascending, and confirm low-confidence matches (a wrong match means wrong supplier pricing) before merging the data back into your master.
Why product names are hard to match
Scores from the live formula after lowercasing and trimming. Word reordering and attribute soup penalize Levenshtein heavily even when products are identical.
| Internal name | Supplier name | Score | Matched at 78%? |
|---|---|---|---|
| USB-C Hub 7-Port | USB Type-C 7 Port Hub | 52% | No |
| Samsung Galaxy S24 | S24 Samsung Phone | 39% | No |
| Logitech MX Master 3S | Logitech MX Master 3S | 100% | Yes |
| AA Alkaline Battery 24pk | AA Alkaline Batteries 24 pack | 76% | No |
| 4K HDMI Cable 2m | 4k hdmi cable 2 m | 88% | Yes |
Output columns you get back
For each internal product, the tool stamps two metadata columns and appends every supplier column (except the join column) with a right_ prefix.
| Column | Source | Meaning |
|---|---|---|
| internal_name, internal_sku | File A | Your original columns, unchanged |
__match_score | Added | 0–100 confidence of the best supplier match |
__matched_value | Added | The supplier name matched, or '(no match)' |
right_price, right_supplier_sku | File B | Supplier columns, prefixed right_ |
Tier and scale reality
Fuzzy Merger requires Developer tier; the join compares every internal product against every supplier line (O(n × m)).
| Catalog size | Comparisons | Expectation |
|---|---|---|
| 500 × 2,000 | 1 million | Near-instant |
| 5,000 × 5,000 | 25 million | A few seconds in-browser |
| 50,000 × 50,000 | 2.5 billion | Slow — split by category first |
Cookbook
Catalog merges shown with the exact right_ columns the tool produces. Prices are illustrative.
Appending supplier pricing to the master
Each internal product picks up the best-matching supplier row's price and SKU, so you can update your master in one pass.
File A (master.xlsx) File B (supplier.xlsx) internal_name sku product_name price supplier_sku Logitech MX Master 3S M-01 Logitech MX Master 3S 79.99 SUP-9981 USB-C Hub 7-Port M-02 USB Type-C 7 Port Hub 24.50 SUP-3320 Left: internal_name Right: product_name Threshold: 78 Output: internal_name sku __match_score __matched_value right_price right_supplier_sku Logitech MX Master 3S M-01 100 Logitech MX Master 3S 79.99 SUP-9981 USB-C Hub 7-Port M-02 52 (no match)
Finding products with no supplier coverage
Internal products that found no supplier match surface as '(no match)' — your gaps to negotiate or source elsewhere.
Filter __matched_value = (no match): internal_name sku __match_score USB-C Hub 7-Port M-02 52 Samsung Galaxy S24 M-07 39 Word-reordered names scored low — try a lower threshold or normalize names.
Lowering the threshold for reordered names
Product names that reorder words ('S24 Samsung Phone') score low. Drop the threshold deliberately and verify the new matches.
Threshold 78: USB-C Hub 7-Port -> (no match) [52%] Threshold 50: USB-C Hub 7-Port -> 'USB Type-C 7 Port Hub' right_price 24.50 Panel: '2 matched · 0 unmatched (threshold: 50%)' — but verify each.
Normalizing names to lift scores
Stripping packaging words and unifying separators before the merge pushes real matches over the threshold.
Before: 'AA Alkaline Battery 24pk' vs 'AA Alkaline Batteries 24 pack' -> 76 (no match at 78)
After normalizing 'Batteries'->'Battery', '24 pack'->'24pk':
'AA Alkaline Battery 24pk' vs 'AA Alkaline Battery 24pk' -> 100 -> matches
Normalize plurals and pack syntax in a helper column first.Avoiding the wrong variant
Two supplier variants (64GB vs 128GB) with near-identical names — only the single best score is returned, so confirm it's the right variant.
Internal: 'Phone X 128GB' Supplier rows: 'Phone X 128GB' (97), 'Phone X 64GB' (94) Output __matched_value: 'Phone X 128GB' (the higher score wins) If scores were reversed by a typo, you'd get the wrong variant — check __match_score.
Edge cases and what actually happens
Account is below Developer tier
Requires DeveloperCatalog merging needs Developer tier; Free, Pro, and Pro + Media get 'Fuzzy Merger requires Developer tier.' before any matching runs.
Product names reorder words
Low scoresLevenshtein is order-sensitive: 'USB-C Hub 7-Port' vs 'USB Type-C 7 Port Hub' scores ~52%. Lower the threshold or normalize word order in a helper column before merging.
Only the master file uploaded
Two files requiredBoth catalogs are required. One file fails with 'Two files are required for fuzzy merging.' Drop the supplier list into File B.
Join column header typo
RequiredBoth Left and Right join col are required and must match the real headers. A typo reads as empty for every row, so the whole catalog returns '(no match)'.
Internal product matches several supplier variants
By designOnly the single highest-scoring supplier row is returned. A 64GB and 128GB variant won't both appear — the closer name wins; ties go to the first supplier row encountered. Confirm the variant via __matched_value.
Internal product with no supplier listing
PreservedThe internal row is kept with __matched_value: "(no match)" and the best score found, so coverage gaps are visible rather than silently dropped.
Matching on SKU codes instead of names
Use exact joinIf both catalogs share a SKU, fuzzy matching on codes is unreliable. Use the exact Sheet Joiner on the SKU and reserve Fuzzy Merger for name-only matching.
Large catalogs on both sides
Slow but supportedThe join is O(n × m): 50,000 × 50,000 is billions of comparisons. Developer tier removes the row cap, but split the merge by product category to keep each pass fast.
Names contain trademark or accented characters
Counts as edits'Café Bustelo®' vs 'Cafe Bustelo' costs edits for é and ®. Strip them with the non-ASCII remover before merging if you want them to score identically.
Expecting CSV output for an upload
XLSX onlyOutput is a binary fuzzy-merged.xlsx. If your marketplace importer needs CSV, save-as CSV from Excel after downloading.
Frequently asked questions
What threshold works for product names?
Try 75–80%. Names with reordered words or extra attributes score lower than you'd expect, so you may need to drop toward 60% and verify the new matches manually.
Why didn't 'USB-C Hub 7-Port' match 'USB Type-C 7 Port Hub'?
Levenshtein is order-sensitive and penalizes the inserted 'Type-' and reordered 'Hub', dropping the score to ~52%. Normalize word order in a helper column or lower the threshold.
Can one internal product match two supplier variants?
No. Only the single highest-scoring supplier row is returned per internal product. Check __matched_value to confirm you got the right variant (e.g. 128GB, not 64GB).
What supplier columns get added?
Every supplier (File B) column except the join column, each prefixed right_ — so price becomes right_price and supplier_sku becomes right_supplier_sku.
Are internal products without a supplier match kept?
Yes. They appear with __matched_value: "(no match)" and the best score found, so you can see exactly which products lack supplier coverage.
Should I match on SKU if both files have one?
If the SKUs are truly shared, use exact join via the Sheet Joiner — fuzzy matching on codes invites false positives. Fuzzy Merger is for name-only matching.
Can Fuzzy Merger handle 10,000+ product rows?
Yes on Developer tier (unlimited rows), but the join is O(n × m) — 10,000 × 10,000 is 100 million comparisons. Split by category to keep each pass responsive.
What output format do I get?
A binary .xlsx file (fuzzy-merged.xlsx, sheet 'FuzzyMerged'). Save-as CSV from Excel if your marketplace importer needs CSV.
Is supplier pricing uploaded anywhere?
No. Matching runs in your browser via SheetJS — both catalogs stay on your machine.
How do I improve match rates without lowering the threshold?
Normalize both name columns first: unify casing, strip packaging words ('pack'/'pk'), and standardize separators. Higher input quality means higher honest scores.
What tier do I need?
Developer tier. Free, Pro, and Pro + Media accounts cannot run Fuzzy Merger.
Can I match on product name plus brand together?
Concatenate name and brand into a single helper column in each file, then join on that — the tool matches on one column per side only.
Privacy first
Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.