Deduplicate an Excel Contact List Before Mailchimp Import

How to remove near-duplicate contacts from Excel before email marketing import

Step 1
Export your contact list — Download the subscriber list from your platform (or your source spreadsheet) as .xlsx or .csv, with both the email and name columns present. Fuzzy Dedup reads the first sheet only.
Step 2
Run exact email dedup first — Before fuzzy matching, dedup identical emails with the exact csv-deduplicator on the email column. This removes the unambiguous duplicates cheaply and shrinks the list before the fuzzy pass.
Step 3
Open Fuzzy Dedup and set the name Key column — Drop the email-deduped file onto this tool and type the name column's exact header into the Key column field (free text), e.g. full_name or Name.
Step 4
Choose a threshold for names — Default 85; for personal names 88–95 is safer to avoid merging different people. Enter a value from 50 to 100. Lower catches more variants but raises false positives on short names.
Step 5
Process and review the merged-contact report — The panel shows {removedCount} removed · {keptCount} kept and previews up to 5 merges (50 in the downloadable report). Scan for false merges — two different subscribers with similar names — before trusting the cut.
Step 6
Download and import — Download deduped-fuzzy.xlsx (sheet Deduped, kept contacts with all columns). Import to Mailchimp/Klaviyo/HubSpot. If real subscribers merged, raise the threshold and re-run on the original.

Two-pass dedup: email then name

Exact email dedup and fuzzy name dedup catch different duplicates. Run both for the cleanest list.

Pass	Tool	Column	Catches
1 — exact	csv-deduplicator	email	Identical emails (`a@x.com` twice)
2 — fuzzy	Fuzzy Dedup (this tool)	name	Same person, different emails (`Rob`/`Robert` Johnson)
Optional — composite	Fuzzy Dedup on a name+email key	concatenated	Similar name AND same email (safer)
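For readers who want to see what pass 1 amounts to, here is a minimal Python sketch of exact dedup on the email column with first-occurrence-wins. It is not the csv-deduplicator's code. The function name `exact_dedup` is hypothetical, and trimming and lowercasing the email before comparing is an assumption made for the sketch.

```python
import csv
import io


def exact_dedup(rows, column):
    """Keep the first row for each distinct value in `column` (file order)."""
    seen = set()
    kept = []
    for row in rows:
        # Assumption: compare emails ignoring case and outer spaces.
        value = row[column].strip().lower()
        if value in seen:
            continue  # an earlier row already carries this email
        seen.add(value)
        kept.append(row)
    return kept


sample = io.StringIO(
    "full_name,email\n"
    "Rob Johnson,a@x.com\n"
    "Rob Johnson,a@x.com\n"
    "Robert Johnson,robert@home.com\n"
)
rows = list(csv.DictReader(sample))
for row in exact_dedup(rows, "email"):
    print(row["full_name"], row["email"])
```

The two rows that survive still look like the same person, which is the case the fuzzy pass on the name column is meant to surface.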

Email-list name patterns and scores

Normalized Levenshtein similarity (case-/whitespace-insensitive) for common subscriber-name duplicates.

Pattern	Example	Approx. similarity	Removed at 88%?
Trailing space / case	`Rob Johnson` / `rob johnson`	100%	Yes
Nickname (long form)	`Rob Johnson` / `Robert Johnson`	~73%	No (needs ~70%)
Nickname (short diff)	`Mike Lee` / `Michael Lee`	~67%	No
Single typo	`Jennifer` / `Jenifer`	~89%	Yes
Different people, similar name	`Sara Cohen` / `Sarah Cohen`	~91%	Yes (possible false merge)
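To reason about where a pair will land relative to your threshold, it can help to compute a score yourself. The sketch below is one common way to normalize Levenshtein distance: divide by the length of the longer string after trimming and lowercasing. That divisor is an assumption, since the tool's exact formula is not stated here, so results can differ by a few points from the approximate values in the table. The names `levenshtein` and `similarity` are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after trimming and lowercasing both strings."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two blank values count as identical
    return 1.0 - levenshtein(a, b) / longest


for left, right in [("Rob Johnson", "rob johnson "),
                    ("Rob Johnson", "Robert Johnson"),
                    ("Sara Cohen", "Sarah Cohen")]:
    print(f"{left!r} vs {right!r}: {similarity(left, right):.0%}")
```

Whatever the exact normalization, the ordering is what matters for choosing a threshold: casing and whitespace variants score highest, one-letter differences next, and nickname expansions lowest.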

Tier and behavior

Fuzzy Dedup is Pro-gated and dedups one column only.

Aspect	Behavior
Tier required	Pro minimum (Free blocked)
Pro capacity	50 MB · 100,000 rows · 5 files
Key columns	One only (concatenate for name+email)
Survivor	First contact of each cluster (file order)
Email awareness	None — scores the chosen column's string only
Output	`deduped-fuzzy.xlsx`, sheet `Deduped`, all columns kept
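The survivor rule in the table can be sketched as a greedy pass: each row is compared against the rows already kept, and it is dropped if it scores at or above the threshold against any of them. This is an illustration of first-occurrence-wins under stated assumptions, not the tool's implementation. The function names, the comparison against kept rows only, and the longer-string normalization are all assumptions.

```python
def levenshtein(a, b):
    """Plain edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def score(a, b):
    """Percent similarity after trim + lowercase (assumed normalization)."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 100.0 if longest == 0 else 100.0 * (1 - levenshtein(a, b) / longest)


def fuzzy_dedup(rows, key, threshold):
    """Return (kept_rows, report_lines); the first row of each cluster survives."""
    kept, report = [], []
    for number, row in enumerate(rows, start=2):  # row 1 is the header
        value = row.get(key, "")  # a header that does not exist reads as blank
        match = next((k for k in kept
                      if score(value, k.get(key, "")) >= threshold), None)
        if match is None:
            kept.append(row)
        else:
            pct = round(score(value, match.get(key, "")))
            report.append(f'Row {number} "{value}" ≈ "{match.get(key, "")}" ({pct}%)')
    return kept, report


rows = [
    {"full_name": "Rob Johnson", "email": "rob@x.com"},
    {"full_name": "rob johnson ", "email": "rob@x.com"},
    {"full_name": "Sara Cohen", "email": "sara@a.com"},
]
kept, report = fuzzy_dedup(rows, "full_name", threshold=95)
print(f"{len(report)} near-duplicate row(s) removed · {len(kept)} rows kept.")
for line in report:
    print(line)
```

Because only the key column's string is scored, nothing in this loop ever looks at the email, which is why the order of rows and the choice of key column decide the outcome.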

Cookbook

Real email-list duplicate patterns, each with the threshold and pass that catch it and the report it produces. Report row numbers are 1-based and count the header row as row 1.

Same person, two emails (the fuzzy-name win)

Exact email dedup can't catch this — the emails differ. Fuzzy on the name surfaces Rob Johnson and Robert Johnson as likely the same subscriber. Note the ~73% score means you need a threshold around 70%, lower than the default.

After exact email dedup, remaining rows:
full_name,email
Rob Johnson,rob@work.com
Robert Johnson,robert@home.com

Fuzzy Dedup on full_name, threshold 70
Report
1 near-duplicate row(s) removed · 1 rows kept.
Row 3 "Robert Johnson" ≈ "Rob Johnson" (73%)

Output keeps the FIRST row (Rob Johnson, rob@work.com).
The second email is dropped — review the report first.

Signup-form casing/whitespace (always collapses)

Double signups often produce the same name with different casing or a trailing space. Trim+lowercase make these score 100%, so they collapse at any threshold — a safe, automatic win.

Input (column: full_name)
full_name,email
Rob Johnson,rob@x.com
rob johnson ,rob@x.com

threshold: 95
Report
1 near-duplicate row(s) removed · 1 rows kept.
Row 3 "rob johnson " ≈ "Rob Johnson" (100%)

Output: one Rob Johnson row. (Exact email dedup would also
catch this since the email is identical — fuzzy is a backstop.)

False-merge danger: Sara vs Sarah Cohen

Sara Cohen and Sarah Cohen differ by one letter (~91%) and would merge at 88% — but they could be two different people. For email lists, prefer a name+email composite key so only same-email near-names collapse.

Input (column: full_name)
full_name,email
Sara Cohen,sara@a.com
Sarah Cohen,sarah@b.com

Fuzzy on full_name, threshold 88
Report: Row 3 "Sarah Cohen" ≈ "Sara Cohen" (91%) removed
  -> deletes a possibly-real subscriber (different email!)

Safer: build a name|email key and dedup on that (see next).

Composite name+email key (safer for marketing)

To require a similar name AND the same email before merging, concatenate them into one column first. Then two different Cohens with different emails stay separate.

Step 1 — add a key column:
full_name,email,key
Rob Johnson,rob@x.com,Rob Johnson|rob@x.com
robert johnson,rob@x.com,robert johnson|rob@x.com
Sara Cohen,sara@a.com,Sara Cohen|sara@a.com
Sarah Cohen,sarah@b.com,Sarah Cohen|sarah@b.com

Step 2 — Fuzzy Dedup on key, threshold 85
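Data rows 1&2 (same email) -> high score, collapse.
Data rows 3&4 (different email) -> score drops below the name-only 91%; with addresses this short the drop is only a few points, so check the report to confirm BOTH are kept (raise the threshold if not).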
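If the list is large, adding the key column by hand is tedious. Below is a small Python sketch that appends a `key` column of the form name|email to CSV text. The function name `add_composite_key` and its default column names are hypothetical; adjust them to your own headers.

```python
import csv
import io


def add_composite_key(csv_text, name_col="full_name", email_col="email", key_col="key"):
    """Return CSV text with an extra `key_col` holding 'name|email' for each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=[*reader.fieldnames, key_col],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[key_col] = f"{row[name_col]}|{row[email_col]}"
        writer.writerow(row)
    return out.getvalue()


source = (
    "full_name,email\n"
    "Rob Johnson,rob@x.com\n"
    "Sarah Cohen,sarah@b.com\n"
)
print(add_composite_key(source))
```

Inside Excel, a formula such as =A2&"|"&B2 filled down a new column produces the same key without leaving the spreadsheet.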

Keep the subscriber record you want

First-occurrence-wins decides who survives. To keep the contact with the engaged email or richer profile, sort that row to the top before processing.

Before (stale record first):
full_name,email,last_open
Robert Johnson,old@x.com,2023-02-01
Rob Johnson,rob@x.com,2026-05-20

Sort by last_open DESC, then Fuzzy Dedup (threshold 70):
full_name,email,last_open
Rob Johnson,rob@x.com,2026-05-20    <- kept (engaged)
Robert Johnson,old@x.com,2023-02-01 <- removed

No "keep most engaged" option exists — sorting is the lever.
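The pre-sort can be done in Excel's own Sort dialog, or scripted. A minimal Python sketch follows, assuming the date column holds ISO dates (YYYY-MM-DD), which sort correctly as plain text. The function name `sort_most_recent_first` is hypothetical.

```python
def sort_most_recent_first(rows, date_col="last_open"):
    """Newest date first; rows with equal dates keep their original order."""
    # Assumption: ISO-formatted dates, so string order equals date order.
    # Blank dates sort to the bottom because "" is smaller than any date.
    return sorted(rows, key=lambda row: row[date_col], reverse=True)


rows = [
    {"full_name": "Robert Johnson", "email": "old@x.com", "last_open": "2023-02-01"},
    {"full_name": "Rob Johnson", "email": "rob@x.com", "last_open": "2026-05-20"},
]
ordered = sort_most_recent_first(rows)
for row in ordered:
    print(row["full_name"], row["email"], row["last_open"])
```

If your export uses another date format (for example 5/20/2026), convert the column to real dates before sorting, since text order would no longer match date order.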

Edge cases and what actually happens

Fuzzy name dedup ignores the email

False merge

Fuzzy Dedup scores only the column you name. Run on the name column, it has no idea the emails differ, so two different people with similar names (Sara/Sarah Cohen) can collapse and you'd lose a real subscriber. Use a name+email composite key, or review the report, before importing.

Long-form nicknames score below the default

Missed duplicates

Rob/Robert (~73%) and Mike/Michael (~67%) score below the 85% default, so they survive. Lower the threshold to ~65–70% to catch them — but that raises the false-merge rate on short names, so review carefully.

Free tier marketer

Pro required

The processor throws "Fuzzy Deduplicator requires Pro tier." for Free accounts. Email lists also often exceed Free's 10,000-row Excel cap. Pro gives 100,000 rows / 50 MB / 5 files; for the exact email pass, the csv-deduplicator is available on lower tiers.

Name column header typed wrong

Empty matches

The Key column is free text. If it doesn't match a header, every name reads empty, all blanks score 100%, and the whole list collapses to one contact. Copy the header verbatim and confirm the kept count looks right before importing.
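A quick pre-flight check can catch a mistyped header before you process anything. The sketch below reads only the header row of a CSV export and fails loudly if the key is absent. The function name `check_key_column` is hypothetical, and treating the comparison as case-sensitive is an assumption based on the advice to copy the header verbatim.

```python
import csv
import io


def check_key_column(csv_text, key):
    """Return the header list, or raise if `key` is not an exact header match."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])  # empty file -> no headers
    if key not in headers:
        raise ValueError(f"Key column {key!r} not found; headers are {headers}")
    return headers


sample = "full_name,email\nRob Johnson,rob@x.com\n"
print(check_key_column(sample, "full_name"))  # exact match passes
try:
    check_key_column(sample, "Full Name")     # different case and spacing
except ValueError as err:
    print(err)
```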

Wanting to dedup on email only

Wrong tool

For identical-email duplicates, fuzzy matching is overkill and risks false positives (a@x.com vs a@x.con). Use the exact csv-deduplicator on the email column. Reserve Fuzzy Dedup for the name column to catch the same person across different emails.

Survivor has the wrong email

Order-dependent

First-occurrence-wins keeps whichever row is first, which may carry a stale or unengaged email. Sort by last-open or signup date before processing so the preferred record survives — there's no "keep most engaged" setting.

Merged contact loses the second email/tags

By design

Only the first contact's row is kept; the duplicate's unique data (second email, extra tags, a phone) is not merged in and exists only in the report. If you need to combine subscriber attributes, reconcile from the report or use your platform's merge feature.

Platform also dedups on import

Expected

Mailchimp and others merge exact-email duplicates silently on import, so your row count may not match their contact count even after this tool. Do the email exact-dedup first so the numbers are predictable; fuzzy name dedup catches what the platform's email-only merge won't.

Multi-sheet export

First sheet only

Fuzzy Dedup reads only the first sheet. If your export has a summary tab, move the subscriber rows to the first sheet or export them alone before deduplicating.

Reconciling two lists before a migration

Wrong tool

To match subscribers across two separate exports (e.g. old platform vs new) by approximate name and merge their columns, use excel-fuzzy-merger (Developer tier), not this single-file deduper.

Frequently asked questions

Should I dedup on email or name?

Both, in order. First exact-dedup the email column with the csv-deduplicator to remove identical-email duplicates. Then run Fuzzy Dedup on the name column to catch the same person under different email addresses, which the email pass can't see.

What if the threshold flags too many legitimate unique contacts?

Raise it to 92–95%. You can also review the report (the count plus up to 50 previewed lines of the form Row N "value" ≈ "matched" (score%)) and re-run on the original at a higher threshold. For safety, dedup on a name+email composite key so different-email near-names don't merge.

Why didn't 'Rob Johnson' and 'Robert Johnson' merge at the default threshold?

They score about 73%, below the 85% default — Robert adds three characters to Rob. Lower the threshold to roughly 70% to catch long-form nicknames, then review the report because shorter names get riskier as you lower the bar.

Will it merge two different people with similar names?

Yes — it scores the name string only and ignores the email. Sara Cohen and Sarah Cohen (~91%) would merge at 88% even with different emails. Use a name+email composite key or a higher threshold, and review the report before importing.

Does it combine the two emails or tags when it merges contacts?

No. Only the first contact's row is kept; the duplicate's second email, extra tags, or phone are dropped from the file and appear only in the report. Reconcile from the report or use your email platform's merge feature for attribute-level merging.

How do I make sure the engaged record survives?

Sort the file so that row is first (e.g. by last-open date descending) before processing — first-occurrence-wins keeps it. There is no "keep most engaged" or "keep most complete" option in the tool.

Is the matching case-sensitive?

No. Names are lowercased and trimmed before scoring, so "Rob Johnson", "rob johnson", and "Rob Johnson " (trailing space) all score 100% and collapse even at a 100% threshold. Signup-form casing and whitespace artifacts disappear automatically.

Which platforms can I import the cleaned list to?

Any that accept .xlsx/.csv — Mailchimp, Klaviyo, HubSpot, ActiveCampaign, and others. The clean output is a standard .xlsx; export to CSV if your platform prefers it.

How big a list can I process?

Pro tier handles 100,000 rows / 50 MB / 5 files; Pro-media 500,000 rows; Developer is unlimited. Free tier cannot run Fuzzy Dedup, though the exact email pass via csv-deduplicator is available on lower tiers.

Can I preview the merges before they happen?

Deduplication runs when you process; the panel then shows what merged (count plus up to 5 previews, up to 50 in the report). There's no per-contact confirm. To change the result, adjust the threshold or build a composite key and re-run on the original.

Is my subscriber data uploaded anywhere?

No. Everything runs in your browser via SheetJS — names and emails stay on your machine, and the clean .xlsx is generated and downloaded locally.

What if I deduped too aggressively?

Your input is untouched; the output is a separate deduped-fuzzy.xlsx. Re-process the original at a higher threshold (or with a composite key) to keep more contacts, and compare the reports before importing either version.

Privacy first

Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.

