Remove Duplicate SKU Rows - CSV Deduplicator

How to remove duplicate SKU rows from a CSV

Step 1
Combine product feeds into one file — The tool dedupes within a single CSV. If you have a master catalogue plus supplier feeds, concatenate them with csv-merger, placing the feed whose data you trust (your master price list) first so its row wins.
Step 2
Drop the catalogue CSV onto the deduplicator — Accepts .csv and Excel/ODS (.xlsx/.xls/.ods — first sheet only, auto-converted). PapaParse auto-detects whether the file is comma- or semicolon-delimited.
Step 3
Select your SKU / identifier column — From the Unique key column dropdown, pick the column that uniquely identifies a product variant — usually SKU. Use ASIN for Amazon feeds, Barcode/UPC for retail, or an internal Product ID if SKUs are inconsistent across suppliers.
Step 4
Set case sensitivity to match your SKU scheme — Leave Case-sensitive keys off for typed SKUs where case is incidental. Turn it on if your identifier scheme treats AB12 and ab12 as different products (some internal and Amazon codes are case-distinct).
Step 5
Run and verify against your variant count — Click Remove duplicates and read the tiles: Rows in, Rows out, Duplicates, Unique keys, Empty keys. The Unique keys count should match the number of distinct products you expect. A non-zero Empty keys means rows with no SKU — fix those before import.
Step 6
Download and import the deduplicated feed — Click Download CSV (or .xlsx if you uploaded a spreadsheet). One row per SKU survives, in original order. Import into your store or ERP — the duplicate-SKU rejections disappear.

The deduplicator's two controls

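There are exactly two options: the key column and the case-sensitivity toggle. Whitespace trimming and blank-key handling are fixed behaviours, listed in the same table for reference. There is no multi-column key, no merge-on-collision, and no quantity-summing; the tool removes whole duplicate rows only.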

Control	Effect	Default	SKU-feed guidance
Unique key column	Rows sharing this value (trimmed, optionally lowercased) are duplicates; first is kept	First column	Set to `SKU`, or `ASIN`/`Barcode`/`Product ID` per channel
Case-sensitive keys	Off: `AB12` matches `ab12`. On: exact case required	Off	Off for typed SKUs; on for case-distinct codes
Whitespace in key	Trimmed before comparison; the stored SKU value is unchanged	Always trimmed	A trailing-space SKU still matches its clean form
Blank SKU rows	Never deduped; preserved and counted as Empty keys	Always kept	Filter out rows with no SKU first if your importer rejects them
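
The rules in the table can be sketched in a few lines of Python. This is an illustrative reimplementation, not the tool's own code: the function name `dedupe_rows`, the stats dictionary keys, and the choice to exclude blank keys from the unique-key count are assumptions made for the example.

```python
import csv
import io


def dedupe_rows(rows, key, case_sensitive=False):
    """Keep the first row per key; pass blank-key rows through untouched."""
    seen = set()
    kept = []
    stats = {"rows_in": 0, "duplicates": 0, "empty_keys": 0}
    for row in rows:
        stats["rows_in"] += 1
        norm = (row.get(key) or "").strip()   # trim only the comparison key
        if not case_sensitive:
            norm = norm.lower()
        if norm == "":
            stats["empty_keys"] += 1          # blank keys are never deduped
            kept.append(row)
            continue
        if norm in seen:
            stats["duplicates"] += 1          # later occurrences are dropped
            continue
        seen.add(norm)
        kept.append(row)                      # the stored value stays unchanged
    stats["rows_out"] = len(kept)
    stats["unique_keys"] = len(seen)          # assumption: blanks not counted
    return kept, stats


sample = "sku,title\nabc-12 ,Mug\nABC-12,Mug\n,Pending\nxyz-99,Cup\n"
rows = list(csv.DictReader(io.StringIO(sample)))
kept, stats = dedupe_rows(rows, "sku")
print([r["sku"] for r in kept])
print(stats)
```

With `case_sensitive=True` the same sample keeps both `abc-12 ` and `ABC-12`, because only the trim step is applied before comparison.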

Which identifier column to use per channel

Match the dedup key to how each marketplace or system uniquely identifies a product.

Channel / system	Identifier column	Notes
Shopify product import	`Variant SKU`	Shopify keys variants on SKU; duplicate SKUs across products cause overwrite/rejection
Amazon Seller flat file	`seller-sku` (or `asin`)	Dedupe on `seller-sku` for your listings; `asin` to find catalogue overlaps
Retail / POS export	`Barcode` / `UPC` / `EAN`	GTIN barcodes (UPC/EAN) are designed to be globally unique; reliable when SKUs are inconsistent
Supplier price-list merge	`Product ID` or `MPN`	Use the field both suppliers share; concatenate master feed first to keep your pricing
WooCommerce product CSV	`SKU`	Woo requires unique SKUs; a duplicate blocks the whole import row

Cookbook

Real before/after rows from product and inventory feeds. The tool keeps the first row per identifier and removes whole duplicate rows — it does not sum quantities or merge fields.

Marketplace sync re-added an existing SKU

Example

A nightly sync appended rows that already existed in the master feed. Concatenating master-first and deduping on SKU keeps the master row and drops the re-added copy.

Input (master rows above synced rows):
sku,title,price
SKU-1001,Blue Mug,12.00
SKU-1002,Red Mug,12.00
SKU-1001,Blue Mug,11.50

Key column: sku   ·   Case-sensitive keys: OFF

Output (master price kept):
sku,title,price
SKU-1001,Blue Mug,12.00
SKU-1002,Red Mug,12.00

Stats: Rows in 3 · Rows out 2 · Duplicates 1 · Unique keys 2

Hand-typed SKU casing differs

Example

Two team members entered the same product with different SKU casing. With default case-insensitive matching they collapse to one product.

Input:
sku,supplier
abc-12,Acme
ABC-12,Acme
xyz-99,Globex

Key column: sku   ·   Case-sensitive keys: OFF

Output:
sku,supplier
abc-12,Acme
xyz-99,Globex

If abc-12 and ABC-12 are genuinely different variants in your
scheme, turn Case-sensitive keys ON to keep both.

Trailing space on a barcode from a CSV export

Example

A POS export left a trailing space on some barcodes. Trim-before-compare recognises the padded barcode as the same product and removes the duplicate.

Input (trailing space on row 1):
barcode,name
5012345678900 ,Widget
5012345678900,Widget

Key column: barcode

Output:
barcode,name
5012345678900 ,Widget

The surviving barcode still has its space — trim affects only
the key. Clean the value afterward with csv-whitespace-trimmer.

Blank-SKU rows kept for cataloguing

Example

New products awaiting a SKU have a blank identifier. They aren't duplicates of each other — every blank-key row is preserved and counted as an Empty key.

Input:
sku,title
,New Hat (pending)
,New Scarf (pending)
SKU-7,Old Belt

Key column: sku

Output (both blank-SKU rows kept):
sku,title
,New Hat (pending)
,New Scarf (pending)
SKU-7,Old Belt

Stats: Rows in 3 · Rows out 3 · Duplicates 0 · Empty keys 2

Two suppliers, same UPC: keep the preferred supplier's row

Example

Both suppliers list the same UPC. You want the row from the supplier you placed first (your preferred vendor) to survive. First-occurrence-wins does exactly that.

Input (preferred supplier rows first):
upc,supplier,cost
0049000000443,VendorA,3.10
0049000000443,VendorB,3.45

Key column: upc

Output (VendorA kept):
upc,supplier,cost
0049000000443,VendorA,3.10

To keep the CHEAPEST instead, sort ascending by cost with
csv-sorter first, then dedupe.
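
The sort-then-dedupe approach can be sketched as follows. This is a minimal illustration in Python rather than the csv-sorter tool itself; the sample rows are made up, and costs are parsed with `Decimal` so that they sort numerically rather than as text. Blank-key handling is omitted because every sample row has a UPC.

```python
import csv
import io
from decimal import Decimal

feed = (
    "upc,supplier,cost\n"
    "0049000000443,VendorB,3.45\n"
    "0049000000443,VendorA,3.10\n"
    "0012000001086,VendorB,1.20\n"
)
rows = list(csv.DictReader(io.StringIO(feed)))

# Step 1: sort ascending by cost. sorted() is stable, so equal costs keep file order.
by_cost = sorted(rows, key=lambda r: Decimal(r["cost"]))

# Step 2: first-occurrence-wins dedupe on the trimmed UPC.
seen, cheapest = set(), []
for row in by_cost:
    key = row["upc"].strip()
    if key in seen:
        continue
    seen.add(key)
    cheapest.append(row)

print([(r["upc"], r["supplier"], r["cost"]) for r in cheapest])
```

Note that the output follows the sorted order, not the original file order.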

Errors and edge cases

Errors and silent failures sourced from each platform's own documentation. Find the entry that matches your situation and apply the fix it describes.

Product feed over the free 500-row limit

Pro required

This is a Pro tool; free is capped at 500 rows / 2 MB. Real catalogues exceed that. Pro raises it to 100,000 rows / 100 MB. For feeds beyond 100k, split with csv-row-splitter, dedupe each chunk, concatenate, then dedupe once more.
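
The final pass matters because a per-chunk dedupe cannot see duplicates that sit in different chunks. The sketch below shows this with a tiny chunk size; the `dedupe` and `chunks` helpers are hypothetical stand-ins for the deduplicator and csv-row-splitter.

```python
def dedupe(rows, key):
    """First-occurrence-wins dedupe on a trimmed, lowercased key."""
    seen, kept = set(), []
    for row in rows:
        k = row[key].strip().lower()
        if k and k in seen:
            continue
        if k:
            seen.add(k)
        kept.append(row)
    return kept


def chunks(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = [{"sku": s} for s in ["A1", "B2", "a1", "C3", "B2", "A1", "D4"]]

per_chunk = [dedupe(part, "sku") for part in chunks(rows, 3)]
combined = [row for part in per_chunk for row in part]   # concatenate
final = dedupe(combined, "sku")                          # final pass

print([r["sku"] for r in combined])
print([r["sku"] for r in final])
```

After concatenation, `B2` and `A1` still appear twice because their copies landed in different chunks; the final pass removes them and gives the same result as deduping the whole feed at once.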

Same SKU, different sizes/colours that you want to keep

Over-collapse risk

If your file repeats one parent SKU across size/colour variants in a single column, deduping on that column would wrongly collapse legitimate variants. Dedupe on the variant-level identifier (Variant SKU, Barcode) instead, which is unique per variant.

Need to sum quantities across duplicate SKUs

Not supported

The deduplicator removes whole rows; it does NOT aggregate. If SKU-1001 appears twice with quantities 5 and 3, you get one row with whichever quantity was first — not 8. For summing, use a spreadsheet pivot or your ERP's import aggregation; the deduplicator is for collapsing, not totalling.
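
If you would rather script the totalling than use a pivot table, a short Python sketch is one option. The column names `sku` and `qty` are assumptions, quantities are assumed to be whole numbers, and every row is assumed to have a SKU.

```python
import csv
import io

feed = (
    "sku,title,qty\n"
    "SKU-1001,Blue Mug,5\n"
    "SKU-1002,Red Mug,2\n"
    "SKU-1001,Blue Mug,3\n"
)

totals = {}  # normalised SKU -> first row seen, carrying a running quantity
for row in csv.DictReader(io.StringIO(feed)):
    k = row["sku"].strip().lower()
    if k in totals:
        totals[k]["qty"] += int(row["qty"])       # add to the first row's total
    else:
        totals[k] = {**row, "qty": int(row["qty"])}

summed = list(totals.values())                    # dicts keep insertion order
print(summed)
```

Here SKU-1001 ends up with a quantity of 8, where the deduplicator would have kept 5.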

Want the cheapest/most-recent row kept

First-row only

Only the first occurrence is kept. To keep the lowest price or newest entry, sort first with csv-sorter (price ascending, or date descending) so the desired row sits first, then dedupe on the SKU.

SKUs differ only by case and that matters

Toggle case-sensitive

By default AB12 and ab12 collapse. If your scheme treats them as distinct products, enable Case-sensitive keys so that SKUs must match exactly, including case (surrounding whitespace is still trimmed). This is the one place case sensitivity commonly matters for product data.

Leading-zero barcode lost as a number

Preserved

The tool is text-only and never reinterprets a barcode as a number, so a leading-zero UPC like 0049000000443 is preserved exactly. (If Excel mangled it to 4.9E+10 before export, fix that in the source — the deduplicator can't recover digits the spreadsheet already dropped.)
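
A small illustration of why text-only handling matters, written in Python as an analogy rather than as the tool's code: reading the value as a string keeps the zeros, while a numeric conversion discards them for good.

```python
import csv
import io

feed = "upc,name\n0049000000443,Cola\n"
row = next(csv.DictReader(io.StringIO(feed)))

as_text = row["upc"]              # csv yields strings, so nothing is lost
as_number = str(int(row["upc"]))  # numeric round trip drops the leading zeros

print(as_text, as_number)
```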

Composite key (SKU + warehouse) needed

Single key only

One key column only. To dedupe per SKU per warehouse, merge the two columns first with csv-column-merger into a combined key, dedupe on it, then split back with csv-column-value-splitter.
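
The merge, dedupe, and split-back workflow can be sketched as follows. The helper column name `sku_wh` and the `|` separator are hypothetical choices for this example; pick a separator that never occurs in either column.

```python
SEP = "|"  # hypothetical separator; must not occur in either column

rows = [
    {"sku": "SKU-1", "warehouse": "LON", "qty": "5"},
    {"sku": "SKU-1", "warehouse": "NYC", "qty": "7"},
    {"sku": "sku-1", "warehouse": "LON ", "qty": "9"},
]

# 1. Merge the two columns into one helper key column.
for row in rows:
    row["sku_wh"] = f'{row["sku"].strip()}{SEP}{row["warehouse"].strip()}'

# 2. Dedupe on the helper column (case-insensitive, first row wins).
seen, kept = set(), []
for row in rows:
    k = row["sku_wh"].lower()
    if k in seen:
        continue
    seen.add(k)
    kept.append(row)

# 3. Drop the helper column again.
for row in kept:
    del row["sku_wh"]

print(kept)
```

The same SKU survives once per warehouse: the LON and NYC rows are kept, and the second LON row is removed.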

Blank-SKU rows all kept

Preserved

Rows with no SKU value pass through untouched and count as Empty keys — they are never treated as duplicates of each other. If your importer rejects blank SKUs, filter them out first with csv-column-filter (sku is_not_empty).

Semicolon-delimited supplier feed

Supported

Delimiter auto-detection handles ;-separated supplier exports without configuration. Output is comma-delimited. No values are altered.
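
To show what "auto-detected in, comma-delimited out" means for the data, here is a deliberately simple Python sketch. The `guess_delimiter` heuristic (count candidates in the header line) is an assumption for illustration and is not PapaParse's detection algorithm.

```python
import csv
import io


def guess_delimiter(text, candidates=(",", ";")):
    """Pick the candidate that occurs most often in the header line."""
    first_line = text.splitlines()[0]
    return max(candidates, key=first_line.count)


feed = "sku;title;price\nSKU-1;Mug, blue;12,00\n"

delim = guess_delimiter(feed)
rows = list(csv.reader(io.StringIO(feed), delimiter=delim))

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)   # comma-delimited output
print(out.getvalue())
```

Values that contain a comma, such as a decimal-comma price, are wrapped in quotes on output, so the field contents themselves stay unchanged.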

You only want to find duplicate SKUs, not remove them

Use the finder

To audit which SKUs are duplicated and how many times before deleting, use csv-duplicate-finder — it marks rows YES/NO and groups matches. The deduplicator is the cleanup step once you've reviewed.
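
An audit pass of this kind can be sketched in Python. The `_is_duplicate` column name matches the one described for csv-duplicate-finder, but the flagging rule here (every row in a repeated group is marked YES, including the first) is an assumption; the finder's exact behaviour may differ.

```python
from collections import Counter

rows = [
    {"sku": "SKU-1001"},
    {"sku": "SKU-1002"},
    {"sku": "sku-1001 "},
    {"sku": ""},
]


def norm(value):
    """Trim and lowercase, mirroring the default key comparison."""
    return value.strip().lower()


# Count how often each non-blank key occurs.
counts = Counter(norm(r["sku"]) for r in rows if norm(r["sku"]))

# Flag rows whose key occurs more than once; blank keys are never flagged.
flagged = [
    {**r, "_is_duplicate": "YES" if counts[norm(r["sku"])] > 1 else "NO"}
    for r in rows
]

print([(r["sku"], r["_is_duplicate"]) for r in flagged])
```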

Frequently asked questions

Which column should I use to dedupe a product feed?

The variant-level unique identifier: SKU (or Variant SKU) for most stores, ASIN/seller-sku for Amazon, Barcode/UPC/EAN for retail, or an internal Product ID/MPN when SKUs are inconsistent across suppliers. Pick one column — the tool keys on a single field.

Does it match SKUs case-insensitively?

Yes by default. abc-12 and ABC-12 collapse to one product unless you turn on Case-sensitive keys. Enable that checkbox only if your SKU scheme genuinely treats case as significant.

Will it sum the quantities of duplicate SKUs?

No. It removes whole duplicate rows and keeps the first — it does not aggregate or total quantities. If SKU-1001 appears with qty 5 then qty 3, the output keeps one row with qty 5. For summing, use a spreadsheet pivot or your ERP's import aggregation.

Which duplicate row survives?

The first occurrence in file order. To keep the cheapest, sort ascending by price with csv-sorter first; to keep the newest, sort descending by date — then dedupe so the desired row is first.

What happens to products with no SKU yet?

They're kept. Blank-key rows are never deduped and are counted as Empty keys. If your import tool rejects blank SKUs, pre-filter them out with csv-column-filter (sku is_not_empty).

Will leading zeros on my UPC/barcode survive?

Yes — the tool is text-only and never converts identifiers to numbers, so 0049000000443 is preserved. If a spreadsheet already stripped the zeros before you exported, that loss happened upstream and can't be recovered here.

Can I dedupe per SKU per warehouse in one pass?

Not directly — the key is one column. Merge SKU and warehouse into a single key with csv-column-merger, dedupe on that, then split it back with csv-column-value-splitter if needed.

How large a catalogue can it process?

Free tier: 500 rows / 2 MB (this is a Pro tool). Pro: 100,000 rows / 100 MB. For larger feeds, split with csv-row-splitter, dedupe each part, concatenate, and run a final pass.

Can I upload an Excel product sheet?

Yes — .xlsx, .xls, and .ods are accepted; the first sheet is converted to CSV, deduped, and downloadable back as .xlsx. Plain .csv works too, and the delimiter is auto-detected.

Is my wholesale cost / supplier data uploaded?

No. Parsing and deduplication run in your browser. Costs, supplier names, and SKUs never reach a server — only an anonymous run counter is recorded for signed-in dashboards.

How do I just see which SKUs are duplicated first?

Use csv-duplicate-finder — it adds an _is_duplicate YES/NO column and groups duplicate SKUs so you can review before removing anything. Then use this deduplicator to collapse them.

It collapsed legitimate size variants — what went wrong?

You likely keyed on a parent SKU shared across variants. Dedupe on the variant-level identifier instead (Variant SKU, Barcode), which is unique per size/colour, so genuine variants are preserved.

Privacy first

Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

