How to join crm and erp data in excel when company names are inconsistent
- Step 1Export both systems to Excel — Pull CRM data (company_name, deal_value, stage) and ERP data (entity_name, invoiced_amount) as separate XLSX or CSV files.
- Step 2Drop CRM as File A, ERP as File B — File A is the CRM export (one output row per CRM record); File B is the ERP export whose columns get appended as
right_*. - Step 3Set Left join col to the CRM name — Type the CRM name header into Left join col — for example
company_name. - Step 4Set Right join col to the ERP name — Type the ERP name header into Right join col — for example
entity_name. - Step 5Set Threshold % to 80 and run — Start at 80. The panel reports 'X matched · Y unmatched (threshold: 80%)'. Raise to 85–90 if you see false pairings.
- Step 6Validate in the score column — Download, sort by
__match_scoreascending, and have finance eyeball the lowest pairings before reporting revenue against pipeline.
Mapping CRM and ERP exports onto the two files
Fuzzy Merger has two drop zones and three text/number inputs. CRM is always File A so every CRM deal survives to the output.
| Field | CRM (File A) | ERP (File B) |
|---|---|---|
| Role | Left file — primary | Right file — reference |
| Join column input | Left join col | Right join col |
| Example header | company_name | entity_name |
| In output | Kept as-is, all rows | Appended as right_*, best match only |
Real CRM-vs-ERP name pairs and their scores
Scores from the live formula (lowercased, trimmed, Levenshtein over the longer string). 'Matched?' assumes the default 80% threshold.
| CRM name | ERP name | Score | Matched at 80%? |
|---|---|---|---|
| Amazon Web Services | AWS LLC | 16% | No |
| Salesforce.com Inc | Salesforce.com, Inc. | 90% | Yes |
| Acme Holdings | Acme Holdings Ltd | 76% | No |
| Globex Corporation | globex corporation | 100% | Yes |
| Initech | Initech Software | 44% | No |
Threshold tuning for company names
Guidance for choosing a threshold on CRM/ERP name joins. The control accepts 50–100; the default is 80.
| Threshold | Effect | Best for |
|---|---|---|
| 90–95 | Very strict — only near-identical names | Clean data; minimizing false revenue pairings |
| 80 (default) | Balanced — punctuation/casing variation joins | Most CRM/ERP joins as a starting point |
| 65–75 | Loose — abbreviations may join | Messy data; accept manual review of every match |
Cookbook
CRM-to-ERP joins shown with the exact right_ columns and score stamps the tool produces. Figures are illustrative.
Pipeline next to booked revenue
Each CRM deal gets the best-matching ERP entity's invoiced amount appended, so a board deck can show pipeline vs actuals in one sheet.
File A (crm.xlsx) File B (erp.xlsx) company_name deal_value entity_name invoiced Salesforce.com Inc 120000 Salesforce.com, Inc. 98000 Amazon Web Services 80000 AWS LLC 150000 Left: company_name Right: entity_name Threshold: 80 Output: company_name deal_value __match_score __matched_value right_invoiced Salesforce.com Inc 120000 90 Salesforce.com, Inc. 98000 Amazon Web Services 80000 16 (no match)
Spotting deals with no ERP record
Closed-won deals that never got billed surface as '(no match)'. Filter the column to hand them to finance.
Filter __matched_value = (no match): company_name deal_value __match_score Amazon Web Services 80000 16 Initech 45000 44 These closed deals have no matching ERP entity at 80% — investigate billing.
Tightening the threshold to drop a bad pairing
If a 78% pairing turns out wrong on review, raise the threshold so it falls back to '(no match)' instead of polluting the report.
Threshold 75: 'Acme Holdings' -> matched 'Acme Group' (76) [wrong entity] Threshold 80: 'Acme Holdings' -> (no match) [correct: flag it] Panel: '41 matched · 9 unmatched (threshold: 80%)'
Normalizing legal suffixes first
Stripping 'Inc', 'LLC', 'Ltd', and trailing punctuation before merging lifts scores so genuine matches clear the threshold.
Before: 'Amazon Web Services' vs 'AWS LLC' -> 16% (different strings)
After aliasing AWS in a helper column:
'AWS' vs 'AWS LLC' -> 43%
After stripping ' LLC':
'AWS' vs 'AWS' -> 100% -> matches
Fuzzy matching can't expand acronyms — pre-map known aliases.Keeping CRM and ERP region columns separate
If both exports have a region column, the ERP one arrives as right_region, so you can compare CRM-recorded region against ERP billing region.
CRM headers: company_name, region, deal_value
ERP headers: entity_name, region, invoiced
Right join col: entity_name
Output: company_name | region | deal_value | __match_score |
__matched_value | right_region | right_invoicedEdge cases and what actually happens
Account is not Developer tier
Requires DeveloperFuzzy Merger only runs on Developer tier. Free, Pro, and Pro + Media get 'Fuzzy Merger requires Developer tier.' Upgrade before attempting the CRM/ERP join.
Acronym vs full name (AWS vs Amazon Web Services)
Won't matchLevenshtein can't expand acronyms — 'AWS LLC' vs 'Amazon Web Services' scores ~16%. Pre-map known acronyms to full names in a helper column, then merge on that.
Only the CRM file is dropped
Two files requiredBoth exports are mandatory. With one file the run fails 'Two files are required for fuzzy merging.' Drop the ERP export into the File B zone.
Wrong join column header typed
Required / mismatchBoth join column inputs are required, and a misspelled header reads as empty everywhere — empty-vs-text scores 0, so the whole join returns '(no match)'. Confirm the exact header text in each export.
One CRM account, two ERP entities (parent + subsidiary)
By designOnly the single highest-scoring ERP entity is returned per CRM row. A parent and a subsidiary with similar names won't both appear — the closer string wins; ties go to the first ERP row encountered.
Closed deal with no ERP match
PreservedThe CRM row is kept with __matched_value: "(no match)" and the best score found. This is the feature, not a failure — it lets finance see deals lacking booked revenue.
Joining on numeric account IDs
Use exact joinIf both systems share a numeric account ID, fuzzy matching is the wrong tool — '1001' and '1010' score high by accident. Use the Sheet Joiner for clean ID joins.
Tens of thousands of accounts on both sides
Slow but supportedMatching is O(n × m). 20,000 CRM rows × 20,000 ERP rows is 400 million comparisons. Developer tier lifts the row cap but the browser will work hard — split by region or first letter to shrink each pass.
Trailing spaces from a copy-paste export
HandledJoin values are trimmed before scoring, so a trailing space in 'Globex ' doesn't break the match. Internal double-spaces still count as edits — clean those with the whitespace trimmer first.
Expecting a CSV to paste into another tool
XLSX onlyOutput is a binary fuzzy-merged.xlsx. If your downstream BI tool wants CSV, save-as CSV from Excel after downloading, or merge the CSV upstream.
Frequently asked questions
Which file should be File A — CRM or ERP?
Whichever you want one row of per output. Usually CRM is File A so every pipeline deal survives; ERP columns are then appended as right_* to the best-matching deal.
What threshold works for company names?
Start at the default 80%. It joins casing and punctuation variants like 'Globex Corporation' / 'globex corporation'. Raise to 85–90 if finance flags false pairings; the control accepts 50–100.
Why didn't 'AWS' match 'Amazon Web Services'?
Fuzzy matching scores character similarity, not meaning. Those two strings share almost no characters, so the score is ~16%. Map acronyms to full names in a helper column before merging.
Can a CRM account match two ERP entities?
No. Each CRM row gets exactly one ERP match — the single highest-scoring entity. There is no fan-out, so pipeline and revenue totals stay clean.
How do I find deals with no booked revenue?
After the merge, filter __matched_value for '(no match)'. Those CRM rows found no ERP entity above the threshold — exactly the deals finance needs to chase.
How are the ERP columns labeled?
Every ERP (File B) column except the right join column is appended with a right_ prefix, so invoiced becomes right_invoiced and never collides with a CRM column of the same name.
Is this safe for confidential revenue data?
Yes. The entire join runs in your browser via SheetJS — neither export is uploaded. Revenue and pipeline figures never leave your machine.
Should I match on numeric account IDs instead?
If both systems carry the same numeric ID, use exact join via the Sheet Joiner — fuzzy matching on numbers produces false positives. Reserve Fuzzy Merger for text names.
What tier do I need?
Developer tier. The tool checks isDeveloper before running; Free, Pro, and Pro + Media accounts cannot use it.
What output format do I get?
A binary .xlsx file (fuzzy-merged.xlsx, sheet 'FuzzyMerged') downloaded in your browser. There is no CSV or JSON export from this tool.
Can I join on multiple columns (name + city)?
Not directly — only one column per side. Concatenate name and city into a single helper column in each export first, then set that as the join column.
How big a join can it handle?
Developer tier allows up to 500 MB files and unlimited rows, but the join is O(n × m). Expect real processing time on six-figure inputs; pre-filtering by region keeps each pass fast.
Privacy first
Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.