Deduplicate Excel Data Before SQL Import to Avoid Unique Constraint Errors

How to remove duplicates from excel before sql import to prevent unique constraint violations

Step 1
Find the constrained column — Check the target DDL for the UNIQUE or PRIMARY KEY column — e.g. email in CREATE TABLE users (email text UNIQUE ...).
Step 2
Open the deduplicator — Open the Excel deduplicator (it loads the CSV/Excel deduplicator workspace) and drop your .xlsx or .csv source.
Step 3
Key on the constraint column — Select that same column as the unique key. Two rows count as duplicates only when this column matches.
Step 4
Match your index's case rule — Leave Case-sensitive keys off only if your DB index is case-insensitive (e.g. citext / LOWER() index). For a default case-sensitive text index, tick it.
Step 5
Remove duplicates — Run it. Compare duplicates removed against what you expected; investigate if the number is surprisingly high.
Step 6
Load the clean file — Run your COPY / LOAD DATA / INSERT. With in-file duplicates gone, the constraint won't trip — assuming the table doesn't already hold those keys (see FAQ).

Excel dedup vs SQL UNIQUE semantics

Where the tool's matching and a SQL UNIQUE index agree — and where you must set the option to match. Getting case wrong is the usual gotcha.

Aspect	This tool's default	SQL UNIQUE (typical)	What to do
Case	Case-insensitive	Case-SENSITIVE for `text`/`varchar`	Tick `Case-sensitive keys` to match
Whitespace	Trims key before compare	`'x '` ≠ `'x'` (distinct)	Trim cells too before load (see edge cases)
Blank / NULL key	Kept (one row each)	`NOT NULL` rejects; multiple NULLs may be allowed	Strip blanks before load if column is NOT NULL
Composite key	Single column only	Multi-column UNIQUE supported	Concatenate into a helper column
Existing table rows	Not considered	Conflicts with existing keys still fail	Truncate or use ON CONFLICT / IGNORE

Loader behaviour on duplicate keys

How each loader reacts when in-file duplicates hit a UNIQUE constraint. Deduping first avoids all of these.

Loader	On duplicate key	Pre-dedup benefit
PostgreSQL `COPY`	Aborts entire load with constraint error	Whole load succeeds
PostgreSQL `INSERT ... ON CONFLICT`	Skips/updates conflicts (verbose)	Cleaner, smaller load
MySQL `LOAD DATA`	Errors, or last-wins with `REPLACE`	Deterministic first-wins instead
MySQL `INSERT IGNORE`	Silently drops conflicting rows	You control which row survives
`pg_restore` / bulk	Aborts on first violation	Restore completes

Tier limits for Excel deduplication

Real per-file limits applied when you drop an .xlsx onto the Excel hub. Row count is the parsed first-sheet body, not bytes.

Tier	Max file size	Max rows	Files at once
Free	5 MB	10,000	1
Pro	50 MB	100,000	5
Pro-media	200 MB	500,000	20
Developer	500 MB	unlimited	unlimited

Cookbook

Pre-load fixes for real constraint errors. SQL shown is illustrative; dedup behaviour is exactly as implemented.

Stop a PostgreSQL COPY from aborting

A source sheet has the same email twice. COPY users FROM ... aborts with a unique-constraint error. Dedup on email first so the whole load succeeds.

Target: CREATE TABLE users (email text UNIQUE, name text);

Input (key column = email, case-SENSITIVE to match the index):
email,name
sue@x.com,Sue
sue@x.com,Sue Dup
jon@x.com,Jon

Output -> COPY users FROM 'clean.csv' CSV HEADER;
-> no 'duplicate key value violates unique constraint'

Case mismatch: match your index's rule

A default Postgres text UNIQUE index is case-sensitive, so Sue@x.com and sue@x.com are TWO valid rows to the DB. The tool's default (case-insensitive) would wrongly drop one. Tick case-sensitive to mirror the index.

Index is case-sensitive (plain text column):

Key column = email, Case-sensitive keys = ON:
email
Sue@x.com
sue@x.com
Sue@x.com

Output:
email
Sue@x.com
sue@x.com
-> only the exact-case repeat removed (matches DB)

citext / LOWER() index: keep case-insensitive

If the column is citext or has a LOWER(email) functional unique index, the DB treats Sue@x.com = sue@x.com. Here the default case-insensitive dedup is correct — leave the box unchecked.

Index: CREATE UNIQUE INDEX ON users (lower(email));

Key column = email, Case-sensitive keys = OFF (default):
email
Sue@x.com
sue@x.com

Output:
email
Sue@x.com
-> collapsed, matching the case-insensitive index

Composite UNIQUE via a helper column

The table has UNIQUE (tenant_id, email). The tool keys on one column, so concatenate the two into a helper, dedup on it, then delete the helper before load.

Target: UNIQUE (tenant_id, email)

Add helper 'k' = tenant_id & '|' & email:
k,tenant_id,email
1|sue@x.com,1,sue@x.com
1|sue@x.com,1,sue@x.com
2|sue@x.com,2,sue@x.com

Key column = k -> 1 duplicate removed (tenant 2 kept).
Delete helper, then load.

Trailing space the DB would treat as distinct

'sue@x.com ' and 'sue@x.com' are DIFFERENT to PostgreSQL, so both load fine into a UNIQUE column — but they're the same person. The tool's trim collapses them; trim the cells too so the loaded value is clean.

Input (key column = email):
email
sue@x.com 
sue@x.com

This tool: keeps 1 ('sue@x.com ' verbatim) — trim matched.
BEFORE load, run the whitespace trimmer so the stored
value is 'sue@x.com' not 'sue@x.com ' (DB keeps the space).

Edge cases and what actually happens

DB index is case-sensitive but dedup default isn't

Mismatch risk

A plain text/varchar UNIQUE index in PostgreSQL/MySQL (non-ci collation) is case-sensitive — A@x and a@x are two valid rows. The tool's default case-insensitive dedup would drop one you needed. Tick Case-sensitive keys to match a case-sensitive index, or leave it off for citext / LOWER() indexes.

Trim collapses keys the DB keeps distinct

Watch

The tool always trims the comparison key, so 'x ' and 'x' collapse. But the database treats them as distinct values and stores the space on the surviving row. Run the whitespace trimmer on the column so the loaded value is actually clean, not just deduped.

Table already contains the key

Still fails

Deduping the file only removes in-file duplicates. If the target table already holds one of those keys, the load still violates the constraint. Truncate first, or use INSERT ... ON CONFLICT / INSERT IGNORE.

Composite UNIQUE constraint

Single column only

The tool keys on one column; a multi-column UNIQUE (a, b) needs a helper column concatenating a and b. Dedup on the helper, then delete it before generating the load.

NOT NULL column with blank cells

Will reject

Blank-key rows are kept by this tool, but a NOT NULL constrained column rejects them at load. Strip blank-key rows with the empty-row remover before loading.

Numeric key parsed inconsistently from XLSX

Type drift

After XLSX parsing, a key like 00123 may arrive as text in one row and the number 123 in another, so they won't match here but the DB (text column) treats them differently again. Standardise the column's type before deduping.

Only the first worksheet is loaded

First sheet only

If your import data is split across sheets, only sheet 1 is deduplicated and exported. Combine sheets with the sheet joiner before deduping so all rows are considered.

File exceeds the tier limit

Rejected

Free caps at 5 MB / 10,000 rows / 1 file. For large loads upgrade to Pro (100,000 rows) / Developer (unlimited), or split the file and dedup each chunk — but note cross-chunk duplicates won't be caught.

XLSX output is values-only

Values only

The deduplicated .xlsx is rebuilt as a single values-only sheet; formulas become their computed values. That's usually ideal for a load file, but export to CSV if your loader prefers plain text.

Near-duplicate keys (typos)

Not collapsed

Exact match only — acme and acme ltd stay separate and both load. For approximate dedup before a fuzzy upsert, use the fuzzy deduplicator (Pro).

Frequently asked questions

What if the table has a composite (multi-column) unique constraint?

Single-column key only. Concatenate the constrained columns into a helper column (e.g. tenant_id|email), dedup on the helper, then delete it before loading. That replicates a composite UNIQUE.

Does deduplicating guarantee no constraint violations?

It removes in-file duplicate keys. If the target table already holds some of those keys, the load still fails. Truncate the table first, or load with INSERT ... ON CONFLICT DO NOTHING/UPDATE (Postgres) or INSERT IGNORE (MySQL).

Should I use case-sensitive or case-insensitive dedup for SQL?

Match your index. A plain text/varchar UNIQUE index is case-sensitive — tick Case-sensitive keys. A citext column or a LOWER() functional index is case-insensitive — leave the box off (the default).

Why did the database keep two rows that look identical?

Almost always a trailing space: 'x ' and 'x' are distinct to SQL. The tool trims the comparison key so it dedups them, but it doesn't rewrite the cell — run the whitespace trimmer so the stored value is clean before loading.

Which row survives the dedup?

The first occurrence in row order. Sort the file so the row you want to keep (newest, most complete) is first before deduping, since that's the one that loads.

Can I generate INSERT statements from the cleaned data?

Yes — after deduping, feed the file to the Excel-to-SQL generator to produce INSERT statements (or use COPY/LOAD DATA on the file directly).

What about NULL / blank values in the key column?

The tool keeps blank-key rows (it doesn't merge them). A NOT NULL column will reject them at load, so strip blank-key rows with the empty-row remover first.

Does it read multiple sheets?

No — only the first worksheet. If your load data spans sheets, merge them with the sheet joiner before deduping so all rows are deduplicated together.

How large a dataset can I dedupe before loading?

Free handles 5 MB / 10,000 rows / 1 file. Pro raises it to 50 MB / 100,000 rows, Pro-media to 200 MB / 500,000, and Developer to 500 MB / unlimited — match this to your load size.

Does the tool talk to my database?

No. It runs entirely in your browser and only produces a clean file. You run the COPY / LOAD DATA / INSERT yourself — no credentials are ever involved.

My numeric keys aren't deduping — why?

After XLSX parsing, a key like 00123 may be text in one row and a number in another, so they compare as different. Standardise the column to text (preserve leading zeros) before deduping.

What about near-duplicate keys from typos?

This tool needs an exact match (after trim/case-fold), so acme and acme ltd both survive. For similarity-based dedup before a fuzzy upsert, use the fuzzy deduplicator on Pro.

Privacy first

Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.

How to remove duplicates from excel before sql import to prevent unique constraint violations

Step 1
Find the constrained column — Check the target DDL for the UNIQUE or PRIMARY KEY column — e.g. email in CREATE TABLE users (email text UNIQUE ...).
Step 2
Open the deduplicator — Open the Excel deduplicator (it loads the CSV/Excel deduplicator workspace) and drop your .xlsx or .csv source.
Step 3
Key on the constraint column — Select that same column as the unique key. Two rows count as duplicates only when this column matches.
Step 4
Match your index's case rule — Leave Case-sensitive keys off only if your DB index is case-insensitive (e.g. citext / LOWER() index). For a default case-sensitive text index, tick it.
Step 5
Remove duplicates — Run it. Compare duplicates removed against what you expected; investigate if the number is surprisingly high.
Step 6
Load the clean file — Run your COPY / LOAD DATA / INSERT. With in-file duplicates gone, the constraint won't trip — assuming the table doesn't already hold those keys (see FAQ).

Excel dedup vs SQL UNIQUE semantics

Where the tool's matching and a SQL UNIQUE index agree — and where you must set the option to match. Getting case wrong is the usual gotcha.

Aspect	This tool's default	SQL UNIQUE (typical)	What to do
Case	Case-insensitive	Case-SENSITIVE for `text`/`varchar`	Tick `Case-sensitive keys` to match
Whitespace	Trims key before compare	`'x '` ≠ `'x'` (distinct)	Trim cells too before load (see edge cases)
Blank / NULL key	Kept (one row each)	`NOT NULL` rejects; multiple NULLs may be allowed	Strip blanks before load if column is NOT NULL
Composite key	Single column only	Multi-column UNIQUE supported	Concatenate into a helper column
Existing table rows	Not considered	Conflicts with existing keys still fail	Truncate or use ON CONFLICT / IGNORE

Loader behaviour on duplicate keys

How each loader reacts when in-file duplicates hit a UNIQUE constraint. Deduping first avoids all of these.

Loader	On duplicate key	Pre-dedup benefit
PostgreSQL `COPY`	Aborts entire load with constraint error	Whole load succeeds
PostgreSQL `INSERT ... ON CONFLICT`	Skips/updates conflicts (verbose)	Cleaner, smaller load
MySQL `LOAD DATA`	Errors, or last-wins with `REPLACE`	Deterministic first-wins instead
MySQL `INSERT IGNORE`	Silently drops conflicting rows	You control which row survives
`pg_restore` / bulk	Aborts on first violation	Restore completes

Tier limits for Excel deduplication

Real per-file limits applied when you drop an .xlsx onto the Excel hub. Row count is the parsed first-sheet body, not bytes.

Tier	Max file size	Max rows	Files at once
Free	5 MB	10,000	1
Pro	50 MB	100,000	5
Pro-media	200 MB	500,000	20
Developer	500 MB	unlimited	unlimited

Cookbook

Pre-load fixes for real constraint errors. SQL shown is illustrative; dedup behaviour is exactly as implemented.

Stop a PostgreSQL COPY from aborting

A source sheet has the same email twice. COPY users FROM ... aborts with a unique-constraint error. Dedup on email first so the whole load succeeds.

Target: CREATE TABLE users (email text UNIQUE, name text);

Input (key column = email, case-SENSITIVE to match the index):
email,name
sue@x.com,Sue
sue@x.com,Sue Dup
jon@x.com,Jon

Output -> COPY users FROM 'clean.csv' CSV HEADER;
-> no 'duplicate key value violates unique constraint'

Case mismatch: match your index's rule

Index is case-sensitive (plain text column):

Key column = email, Case-sensitive keys = ON:
email
Sue@x.com
sue@x.com
Sue@x.com

Output:
email
Sue@x.com
sue@x.com
-> only the exact-case repeat removed (matches DB)

citext / LOWER() index: keep case-insensitive

If the column is citext or has a LOWER(email) functional unique index, the DB treats Sue@x.com = sue@x.com. Here the default case-insensitive dedup is correct — leave the box unchecked.

Index: CREATE UNIQUE INDEX ON users (lower(email));

Key column = email, Case-sensitive keys = OFF (default):
email
Sue@x.com
sue@x.com

Output:
email
Sue@x.com
-> collapsed, matching the case-insensitive index

Composite UNIQUE via a helper column

The table has UNIQUE (tenant_id, email). The tool keys on one column, so concatenate the two into a helper, dedup on it, then delete the helper before load.

Target: UNIQUE (tenant_id, email)

Add helper 'k' = tenant_id & '|' & email:
k,tenant_id,email
1|sue@x.com,1,sue@x.com
1|sue@x.com,1,sue@x.com
2|sue@x.com,2,sue@x.com

Key column = k -> 1 duplicate removed (tenant 2 kept).
Delete helper, then load.

Trailing space the DB would treat as distinct

Input (key column = email):
email
sue@x.com 
sue@x.com

This tool: keeps 1 ('sue@x.com ' verbatim) — trim matched.
BEFORE load, run the whitespace trimmer so the stored
value is 'sue@x.com' not 'sue@x.com ' (DB keeps the space).

Edge cases and what actually happens

DB index is case-sensitive but dedup default isn't

Mismatch risk

Trim collapses keys the DB keeps distinct

Watch

Table already contains the key

Still fails

Composite UNIQUE constraint

Single column only

The tool keys on one column; a multi-column UNIQUE (a, b) needs a helper column concatenating a and b. Dedup on the helper, then delete it before generating the load.

NOT NULL column with blank cells

Will reject

Blank-key rows are kept by this tool, but a NOT NULL constrained column rejects them at load. Strip blank-key rows with the empty-row remover before loading.

Numeric key parsed inconsistently from XLSX

Type drift

Only the first worksheet is loaded

First sheet only

If your import data is split across sheets, only sheet 1 is deduplicated and exported. Combine sheets with the sheet joiner before deduping so all rows are considered.

File exceeds the tier limit

Rejected

XLSX output is values-only

Values only

The deduplicated .xlsx is rebuilt as a single values-only sheet; formulas become their computed values. That's usually ideal for a load file, but export to CSV if your loader prefers plain text.

Near-duplicate keys (typos)

Not collapsed

Exact match only — acme and acme ltd stay separate and both load. For approximate dedup before a fuzzy upsert, use the fuzzy deduplicator (Pro).

Frequently asked questions

What if the table has a composite (multi-column) unique constraint?

Single-column key only. Concatenate the constrained columns into a helper column (e.g. tenant_id|email), dedup on the helper, then delete it before loading. That replicates a composite UNIQUE.

Does deduplicating guarantee no constraint violations?

Should I use case-sensitive or case-insensitive dedup for SQL?

Why did the database keep two rows that look identical?

Which row survives the dedup?

The first occurrence in row order. Sort the file so the row you want to keep (newest, most complete) is first before deduping, since that's the one that loads.

Can I generate INSERT statements from the cleaned data?

Yes — after deduping, feed the file to the Excel-to-SQL generator to produce INSERT statements (or use COPY/LOAD DATA on the file directly).

What about NULL / blank values in the key column?

The tool keeps blank-key rows (it doesn't merge them). A NOT NULL column will reject them at load, so strip blank-key rows with the empty-row remover first.

Does it read multiple sheets?

No — only the first worksheet. If your load data spans sheets, merge them with the sheet joiner before deduping so all rows are deduplicated together.

How large a dataset can I dedupe before loading?

Free handles 5 MB / 10,000 rows / 1 file. Pro raises it to 50 MB / 100,000 rows, Pro-media to 200 MB / 500,000, and Developer to 500 MB / unlimited — match this to your load size.

Does the tool talk to my database?

No. It runs entirely in your browser and only produces a clean file. You run the COPY / LOAD DATA / INSERT yourself — no credentials are ever involved.

My numeric keys aren't deduping — why?

After XLSX parsing, a key like 00123 may be text in one row and a number in another, so they compare as different. Standardise the column to text (preserve leading zeros) before deduping.

What about near-duplicate keys from typos?

This tool needs an exact match (after trim/case-fold), so acme and acme ltd both survive. For similarity-based dedup before a fuzzy upsert, use the fuzzy deduplicator on Pro.

Remove Duplicates from Excel Before SQL Import to Prevent Unique Constraint Violations

How to remove duplicates from excel before sql import to prevent unique constraint violations

Excel dedup vs SQL UNIQUE semantics

Loader behaviour on duplicate keys

Tier limits for Excel deduplication

Cookbook

Stop a PostgreSQL COPY from aborting

Case mismatch: match your index's rule

citext / LOWER() index: keep case-insensitive

Composite UNIQUE via a helper column

Trailing space the DB would treat as distinct

Edge cases and what actually happens

DB index is case-sensitive but dedup default isn't

Trim collapses keys the DB keeps distinct

Table already contains the key

Composite UNIQUE constraint

NOT NULL column with blank cells

Numeric key parsed inconsistently from XLSX

Only the first worksheet is loaded

File exceeds the tier limit

XLSX output is values-only

Near-duplicate keys (typos)

Frequently asked questions

What if the table has a composite (multi-column) unique constraint?

Does deduplicating guarantee no constraint violations?

Should I use case-sensitive or case-insensitive dedup for SQL?

Why did the database keep two rows that look identical?

Which row survives the dedup?

Can I generate INSERT statements from the cleaned data?

What about NULL / blank values in the key column?

Does it read multiple sheets?

How large a dataset can I dedupe before loading?

Does the tool talk to my database?

My numeric keys aren't deduping — why?

What about near-duplicate keys from typos?

Privacy first

Related guides

Remove Duplicates from Excel Before SQL Import to Prevent Unique Constraint Violations

How to remove duplicates from excel before sql import to prevent unique constraint violations

Excel dedup vs SQL UNIQUE semantics

Loader behaviour on duplicate keys

Tier limits for Excel deduplication

Cookbook

Stop a PostgreSQL COPY from aborting

Case mismatch: match your index's rule

citext / LOWER() index: keep case-insensitive

Composite UNIQUE via a helper column

Trailing space the DB would treat as distinct

Edge cases and what actually happens

DB index is case-sensitive but dedup default isn't

Trim collapses keys the DB keeps distinct

Table already contains the key

Composite UNIQUE constraint

NOT NULL column with blank cells

Numeric key parsed inconsistently from XLSX

Only the first worksheet is loaded

File exceeds the tier limit

XLSX output is values-only

Near-duplicate keys (typos)

Frequently asked questions

What if the table has a composite (multi-column) unique constraint?

Does deduplicating guarantee no constraint violations?

Should I use case-sensitive or case-insensitive dedup for SQL?

Why did the database keep two rows that look identical?

Which row survives the dedup?

Can I generate INSERT statements from the cleaned data?

What about NULL / blank values in the key column?

Does it read multiple sheets?

How large a dataset can I dedupe before loading?

Does the tool talk to my database?

My numeric keys aren't deduping — why?

What about near-duplicate keys from typos?

Privacy first

Related guides