Detect Duplicate Leads in a CSV Before Outreach

How to find duplicate leads in a CSV before sales outreach

Step 1
Export or assemble the lead CSV — Download from your lead-gen tool, list provider, webinar platform, or form. If you have several lists, append them first with csv-merger so duplicates across sources are caught in one pass.
Step 2
Drop the file onto the tool — Parsing runs in your browser via PapaParse — contact data never reaches a server. Free runs handle up to 2 MB / 500 rows; Pro handles 100 MB / 100,000 rows.
Step 3
Select the Email column — In Find duplicates in column, choose Email (or Email Address). One key column per run, so do email first.
Step 4
Keep case-insensitive on (the default) — Leave Case-sensitive matching off so email casing is ignored — the correct setting for addresses. Click Find duplicates.
Step 5
Download, then run the phone pass — Click Download Marked CSV. Then re-run the tool on the same source selecting the Phone column to catch same-number, different-email duplicates. Reconcile the two marked files.
Step 6
Clean and import — Filter _is_duplicate = YES, decide which record to keep (usually the most enriched), remove the rest, and import the de-duplicated list into your CRM or sequencer.

What the lead duplicate finder does

The full control set. One key column per pass, one checkbox, flag-only output. No fuzzy matching and no cross-column key.

Control	Behaviour	Default
Find duplicates in column	Single key column (e.g. `Email`, then `Phone` on a second run); values are grouped to find repeats	First column
Case-sensitive matching	Off lowercases before comparing — correct for email. On requires identical casing	Off
`_is_duplicate` column	`YES` if the key value appears 2+ times, `NO` if once; first occurrence is `YES` as well	Always added
Removal / merge	None — leads are flagged, not merged or deleted. Use csv-deduplicator to drop surplus rows	Zero removed
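
The controls above can be sketched in a few lines of Python. This is a minimal illustration of the described behaviour, not the tool's actual code: the function name mark_duplicates and its parameters are hypothetical, and it assumes the key is compared as a whole cell with no trimming.

```python
import csv
import io
from collections import Counter

def mark_duplicates(csv_text, key_column, case_sensitive=False):
    """Return every row with an added _is_duplicate field (YES/NO)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    def key_of(row):
        value = row[key_column]
        # Whole-cell comparison: no trimming, only optional lowercasing.
        return value if case_sensitive else value.lower()

    counts = Counter(key_of(row) for row in rows)
    for row in rows:
        # Every member of a repeated group is flagged, including the first.
        row["_is_duplicate"] = "YES" if counts[key_of(row)] >= 2 else "NO"
    return rows

sample = (
    "Email,Source\n"
    "Sue@Acme.com,Webinar\n"
    "jon@borex.io,Cold list\n"
    "sue@acme.com,Newsletter\n"
)
flags = [row["_is_duplicate"] for row in mark_duplicates(sample, "Email")]
print(flags)  # ['YES', 'NO', 'YES']
```

No row is removed; the flag is the only change, which matches the flag-only output described in the table.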

Two-pass plan for lead lists

Because the key is one column, run separate passes to cover both identifiers. Reconcile the two marked outputs afterwards.

Pass	Key column	Catches	Note
1	`Email`	Same inbox imported twice (case ignored by default)	Plus-addressing (`me+a@x.com` vs `me@x.com`) is NOT merged — different text
2	`Phone`	Same person under a different email but identical phone	Normalise phone format first (strip spaces and `+`, and reconcile country code vs leading zero) so `+44 7…` and `07…` match
Optional	Combined `Email|Phone`	Exact same email AND phone	Build the column with csv-column-merger first
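
One way to reconcile the two marked outputs is to line up the flags row by row. The sketch below assumes both marked files come from the same source and keep the rows in the same order; the function name reconcile and the labels it returns are hypothetical.

```python
def reconcile(email_flags, phone_flags):
    """Combine per-row flags from the Email pass and the Phone pass."""
    if len(email_flags) != len(phone_flags):
        raise ValueError("both passes must come from the same source rows")
    combined = []
    for email_flag, phone_flag in zip(email_flags, phone_flags):
        if email_flag == "YES" and phone_flag == "YES":
            combined.append("both")
        elif email_flag == "YES":
            combined.append("email")
        elif phone_flag == "YES":
            combined.append("phone")
        else:
            combined.append("none")
    return combined

email_pass = ["YES", "NO", "YES", "NO"]
phone_pass = ["YES", "YES", "NO", "NO"]
print(reconcile(email_pass, phone_pass))  # ['both', 'phone', 'email', 'none']
```

Rows labelled "phone" are the ones the email pass alone would have missed.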

Cookbook

Before/after rows from real prospect exports. Emails and phones anonymised; the _is_duplicate column is exactly what the tool appends.

Case-different email captured by the default

Example

The same prospect signed up via two forms — once with autocapitalised email, once lowercase. Case-insensitive matching (default) treats them as one inbox and flags both rows.

Input (leads.csv):
Email,First Name,Source
Sue@Acme.com,Sue,Webinar
jon@borex.io,Jon,Cold list
sue@acme.com,Sue,Newsletter

Key column: Email  ·  Case-sensitive: off (default)

Output (leads.duplicates-marked.csv):
Email,First Name,Source,_is_duplicate
Sue@Acme.com,Sue,Webinar,YES
jon@borex.io,Jon,Cold list,NO
sue@acme.com,Sue,Newsletter,YES

Plus-addressing is NOT treated as the same lead

Example

Matching is whole-cell exact, so me+webinar@x.com and me@x.com are different strings and are not flagged — even though they reach the same inbox. If you want them merged, strip the plus-tag first.

Input:
Email,Campaign
me+webinar@x.com,Spring
me@x.com,Spring

Key column: Email  ·  Case-sensitive: off

Output (not flagged — text differs):
Email,Campaign,_is_duplicate
me+webinar@x.com,Spring,NO
me@x.com,Spring,NO

Fix: use csv-find-replace with pattern \+[^@]+ -> empty,
then re-run on Email.
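
To preview what the pattern does before editing the file, the same regular expression can be tried in Python. This is only a sketch of the substitution; strip_plus_tag is a hypothetical helper, and how csv-find-replace applies the pattern internally is not shown here.

```python
import re

# Same pattern as in the fix above: a "+" and everything up to the "@".
PLUS_TAG = re.compile(r"\+[^@]+")

def strip_plus_tag(email):
    """Remove the plus-tag from the local part, if there is one."""
    return PLUS_TAG.sub("", email)

for address in ["me+webinar@x.com", "me@x.com"]:
    print(strip_plus_tag(address))
# me@x.com
# me@x.com
```

After the substitution both cells hold the same text, so the email pass flags them.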

Trailing space hides a duplicate email

Example

Mobile autocomplete added a trailing space to one email. Because matching does not trim, the pair is not detected. Trim the column first so genuine duplicates surface.

Input (space after the first .com):
Email,Name
lead@x.com ,A
lead@x.com,B

Key column: Email  ·  Case-sensitive: off

Output (missed):
Email,Name,_is_duplicate
lead@x.com ,A,NO
lead@x.com,B,NO

Fix: run csv-whitespace-trimmer first, then re-check.
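
The effect of trimming can be checked with a short Python sketch. It uses str.strip(), which removes leading and trailing Unicode whitespace, including the non-breaking space; whether csv-whitespace-trimmer also removes non-breaking spaces is an assumption to verify on your own data.

```python
def trim_cell(value):
    """Strip leading and trailing whitespace from one cell."""
    return value.strip()

emails = ["lead@x.com ", "lead@x.com", "lead@x.com\u00a0"]
trimmed = [trim_cell(email) for email in emails]

# Three distinct strings before trimming, one after.
print(len(set(emails)), len(set(trimmed)))  # 3 1
```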

Phone pass after the email pass

Example

Two leads have different emails but the same phone — only the second (Phone) pass catches them. Normalise phone formatting first so spacing and country prefixes don't block the match.

Input (after normalising phone to bare digits):
Email,Phone
work@x.com,447700900111
personal@y.com,447700900111

Key column: Phone  ·  Case-sensitive: off

Output:
Email,Phone,_is_duplicate
work@x.com,447700900111,YES
personal@y.com,447700900111,YES
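
Normalising to bare digits could look like the sketch below. It assumes UK numbers with a default country code of 44, treats a leading 00 as an international prefix, and swaps a single leading 0 for the country code; normalise_phone and default_country_code are hypothetical names, and lists that mix countries need more care than this.

```python
import re

def normalise_phone(raw, default_country_code="44"):
    """Reduce a phone number to digits with a country code in front."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, brackets, "+"
    if digits.startswith("00"):
        return digits[2:]  # 0044... becomes 44...
    if digits.startswith("0"):
        return default_country_code + digits[1:]  # 07... becomes 447...
    return digits

for raw in ["+44 7700 900111", "07700900111", "0044 7700-900111"]:
    print(normalise_phone(raw))
# 447700900111
# 447700900111
# 447700900111
```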

Reading the list-health summary

Example

For a 900-row merged lead list, the summary tells you how much overlap the sources had before you decide what to import.

Summary after Find duplicates (Email pass):
  Duplicate groups : 64   (email addresses that repeat)
  Extra copies     : 71   (surplus rows to review/remove)
  Unique values    : 765  (emails appearing exactly once)

Meaning: 829 distinct emails across 900 rows; 64 addresses
repeat, some more than twice. 71 rows are surplus copies.
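
The three numbers always add up to the row count: 64 + 71 + 765 = 900. The sketch below shows one way such counts could be derived from a key column. It assumes "extra copies" means every row in a group beyond the first, which is consistent with the arithmetic above; summarise is a hypothetical helper, not the tool's code.

```python
from collections import Counter

def summarise(keys):
    """Return (duplicate groups, extra copies, unique values)."""
    counts = Counter(keys)
    duplicate_groups = sum(1 for n in counts.values() if n >= 2)
    extra_copies = sum(n - 1 for n in counts.values() if n >= 2)
    unique_values = sum(1 for n in counts.values() if n == 1)
    return duplicate_groups, extra_copies, unique_values

keys = ["a", "a", "a", "b", "b", "c", "d"]
groups, extras, uniques = summarise(keys)
print(groups, extras, uniques)  # 2 3 2

# Each group keeps one row; extras and uniques account for the rest.
print(groups + extras + uniques == len(keys))  # True
```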

Errors and edge cases

Real errors and silent failures you are likely to hit on lead lists. Match the wording to the row, then apply the fix that row describes.

You expected duplicate leads merged automatically

By design

This tool flags only — it appends _is_duplicate so you choose which record to keep (the more enriched one usually wins). To physically remove duplicate leads and keep one per group, use csv-deduplicator.

Plus-addressed or dotted Gmail variants

Not matched

me+a@x.com vs me@x.com, or j.smith@gmail.com vs jsmith@gmail.com, are different text and won't group, even though they hit the same inbox. Normalise with csv-find-replace (strip +tag; for Gmail, remove dots in the local part) before running.
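
The Gmail dot rule can be previewed with a small sketch. It assumes only gmail.com addresses should have dots removed from the local part and leaves every other domain untouched; normalise_gmail_dots is a hypothetical helper.

```python
def normalise_gmail_dots(email):
    """Remove dots from the local part of gmail.com addresses only."""
    local, at, domain = email.rpartition("@")
    if not at:
        return email  # no "@", leave the cell as it is
    if domain.lower() == "gmail.com":
        local = local.replace(".", "")
    return local + "@" + domain

for address in ["j.smith@gmail.com", "jsmith@gmail.com", "j.smith@acme.com"]:
    print(normalise_gmail_dots(address))
# jsmith@gmail.com
# jsmith@gmail.com
# j.smith@acme.com
```

Dots are kept for other domains because there they can distinguish two different mailboxes.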

Phone numbers in mixed formats

Not matched

+44 7700 900111 and 07700900111 are different strings. The phone pass only matches identical text, so normalise to a single format first with csv-find-replace: strip spaces and the +, then reconcile leading-zero and country-code forms.

Trailing spaces on emails from form fills

Not matched

Whole-cell matching does not trim, so "lead@x.com " (with a trailing space) and "lead@x.com" count as distinct. Run csv-whitespace-trimmer before the email pass to avoid missing real duplicates.

First occurrence marked YES too

Expected

All members of a duplicate group are flagged YES, including the first. This is so you can compare every record and keep the best one. For surplus-only removal, use csv-deduplicator.

Empty email cells

Grouped together

Rows with a blank email all share one empty key and get flagged YES together; the duplicate list shows (empty). Filter out blanks (or use a different identifier) before importing.
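
When reviewing a marked file, it can help to set the blank-key rows aside from the genuine repeats. A minimal sketch, assuming the rows have already been read into dictionaries with the _is_duplicate column present; split_flagged is a hypothetical helper.

```python
def split_flagged(rows, key_column):
    """Separate flagged rows with a blank key from flagged rows with a value."""
    blank_key, real_duplicates = [], []
    for row in rows:
        if row["_is_duplicate"] != "YES":
            continue
        if row[key_column] == "":
            blank_key.append(row)
        else:
            real_duplicates.append(row)
    return blank_key, real_duplicates

marked = [
    {"Email": "", "Name": "A", "_is_duplicate": "YES"},
    {"Email": "", "Name": "B", "_is_duplicate": "YES"},
    {"Email": "sue@acme.com", "Name": "C", "_is_duplicate": "YES"},
    {"Email": "SUE@acme.com", "Name": "D", "_is_duplicate": "YES"},
    {"Email": "jon@borex.io", "Name": "E", "_is_duplicate": "NO"},
]
blank_key, real_duplicates = split_flagged(marked, "Email")
print(len(blank_key), len(real_duplicates))  # 2 2
```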

Need to match on email AND phone in one pass

Single key only

The key is one column. Build a combined Email|Phone column with csv-column-merger and key on it for an exact composite match, or do two separate passes and reconcile.
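
A composite key only matches when both parts match. The sketch below joins the two columns with a | separator and lowercases the result, mirroring the default case-insensitive setting; composite_key and its parameters are hypothetical, and the output format of csv-column-merger is an assumption.

```python
from collections import Counter

def composite_key(row, columns=("Email", "Phone"), separator="|"):
    """Join the chosen columns into one lowercase key."""
    return separator.join(row[column].lower() for column in columns)

rows = [
    {"Email": "work@x.com", "Phone": "447700900111"},
    {"Email": "work@x.com", "Phone": "447700900111"},
    {"Email": "work@x.com", "Phone": "447700900222"},
]
counts = Counter(composite_key(row) for row in rows)
flags = ["YES" if counts[composite_key(row)] >= 2 else "NO" for row in rows]
print(flags)  # ['YES', 'YES', 'NO']
```

The third row shares the email but not the phone, so the composite key leaves it unflagged; the separate Email pass would still catch it.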

Lead list over the free 500-row / 2 MB cap

Upgrade required

Free runs cap at 2 MB and 500 rows; bigger lists are blocked with a Pro prompt. Pro raises it to 100 MB / 100,000 rows. Splitting with csv-row-splitter works for a one-off but won't catch duplicates across chunks.
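
Why splitting misses cross-chunk duplicates is easy to see on a tiny list. A sketch with a hypothetical helper repeated, which returns the values that occur at least twice, compared case-insensitively:

```python
from collections import Counter

def repeated(values):
    """Return the set of values that appear at least twice."""
    counts = Counter(value.lower() for value in values)
    return {value for value, n in counts.items() if n >= 2}

emails = ["a@x.com", "b@x.com", "c@x.com", "a@x.com"]
chunks = [emails[:2], emails[2:]]

whole_file = repeated(emails)
per_chunk = set().union(*(repeated(chunk) for chunk in chunks))

print(whole_file)  # {'a@x.com'}
print(per_chunk)   # set()
```

The two copies of a@x.com land in different chunks, so neither chunk sees a repeat.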

Frequently asked questions

Should I check email and phone as separate passes?

Yes. The key is a single column, so run Email first, then run Phone on the same source to catch leads that have different emails but the same number. Reconcile the two marked files. For an exact composite match, combine the columns with csv-column-merger and key on that.

Does this work across multiple lists merged into one CSV?

Yes, and it's the recommended workflow. Append your lists with csv-merger first (they should share a header schema), then run the duplicate finder on the combined file so cross-source duplicates are caught in one pass.

Does it merge or delete the duplicate leads?

Neither. It appends an _is_duplicate column (YES/NO) and keeps every row so you decide which record to retain. To actually remove duplicates and keep one per group, use csv-deduplicator.

Will it catch plus-addressed emails like me+tag@x.com?

No. Matching is exact text, so me+tag@x.com and me@x.com are treated as different even though they share an inbox. Strip the plus-tag with csv-find-replace (pattern \+[^@]+) before the email pass if you want them merged.

Why didn't it flag two leads with the same email?

Usually an invisible difference: a trailing space, a non-breaking space, or different casing with case-sensitive matching accidentally on. Keep case-sensitive off for email and trim with csv-whitespace-trimmer first.
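
To see what is hiding in a cell, list its whitespace and non-printable characters by name. A minimal diagnostic sketch using Python's unicodedata module; invisible_chars is a hypothetical helper.

```python
import unicodedata

def invisible_chars(value):
    """Name every whitespace or non-printable character in a cell."""
    return [
        unicodedata.name(ch, "UNKNOWN")
        for ch in value
        if ch.isspace() or not ch.isprintable()
    ]

print(invisible_chars("lead@x.com"))        # []
print(invisible_chars("lead@x.com "))       # ['SPACE']
print(invisible_chars("lead@x.com\u00a0"))  # ['NO-BREAK SPACE']
```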

Is contact data uploaded?

No. All parsing and detection happen in your browser. Prospect names, emails, and phone numbers never reach a server. Only an anonymous usage counter is recorded when signed in, and it can be turned off in settings.

How do I make phone numbers match across formats?

Normalise to a single format before the phone pass — strip spaces, dashes, and the +, and reconcile country-code vs leading-zero forms — using csv-find-replace. Then +44 7700 900111 and 07700900111 reduce to the same digits and will match.

What does each summary number mean for my list?

Duplicate groups = how many email (or phone) values repeat. Extra copies = surplus rows you'd remove on a clean import. Unique values = leads appearing exactly once. Together they tell you how much overlap your sources had.

How large a lead list can I check?

Free runs handle up to 2 MB and 500 rows; larger files are blocked with a Pro prompt. Pro handles 100 MB and 100,000 rows. For lists beyond that, split with csv-row-splitter, accepting that cross-chunk duplicates won't be detected.

What's the output file?

Your original CSV plus a trailing _is_duplicate column, saved as <yourfile>.duplicates-marked.csv. Filter that column to YES in your spreadsheet to see only the duplicate leads.

Can I keep the most recently captured duplicate?

Sort the file by a capture-date column descending with csv-sorter before running, so the first occurrence within each group is the newest. The flag still marks all copies; you then keep whichever row your policy prefers.
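
If your policy is "newest wins", the clean-up after flagging could be sketched as below. It assumes a capture-date column in ISO format (YYYY-MM-DD); the column name Captured and the function keep_newest are hypothetical, and the tool itself only flags rows.

```python
from datetime import date

def keep_newest(rows, key_column, date_column):
    """Keep one row per key: the one with the latest capture date."""
    ordered = sorted(
        rows,
        key=lambda row: date.fromisoformat(row[date_column]),
        reverse=True,  # newest first
    )
    kept, seen = [], set()
    for row in ordered:
        key = row[key_column].lower()  # case-insensitive, like the default
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

rows = [
    {"Email": "sue@acme.com", "Captured": "2024-01-10"},
    {"Email": "Sue@Acme.com", "Captured": "2024-03-02"},
    {"Email": "jon@borex.io", "Captured": "2024-02-15"},
]
for row in keep_newest(rows, "Email", "Captured"):
    print(row["Email"], row["Captured"])
# Sue@Acme.com 2024-03-02
# jon@borex.io 2024-02-15
```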

Should I dedupe before or after CRM import?

Before. Many CRMs either reject or silently merge duplicate emails on import, which makes your row counts unpredictable. Flagging and cleaning the CSV first gives you a list whose count matches what lands in the CRM.

Privacy first

Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.
