How to dedupe an ecommerce orders csv
- Step 1Decide: order-level count, or line-item detail? — If you need one row per order (counts, finance reconciliation), deduping is right. If you need every line item (per-SKU fulfilment), do NOT dedupe on Order ID — it would discard the extra line-item rows. This tool keeps exactly one row per key value.
- Step 2Export and combine your orders — Pull the orders CSV from Shopify (Orders → Export), WooCommerce, or Magento. If you exported multiple overlapping date ranges, concatenate them with csv-merger first, oldest export on top so its row wins.
- Step 3Drop the orders CSV onto the deduplicator — Accepts
.csvand Excel/ODS (.xlsx/.xls/.ods, first sheet). The delimiter is auto-detected. Shopify and Woo both export comma-delimited UTF-8. - Step 4Choose the order-identifier column — Pick
Name(Shopify's order number, e.g.#1001),Order ID, orOrder Numberfrom the Unique key column dropdown. This is the column that's identical across all line-item rows of one order. - Step 5Keep case sensitivity off and run — Leave Case-sensitive keys unchecked — order numbers aren't case-meaningful. Click Remove duplicates. The tiles show Rows in, Rows out (= order count), Duplicates (= extra line-item rows removed), Unique keys (= distinct orders), Empty keys.
- Step 6Verify against your store's order count — Cross-check Unique keys against your store admin's order total for the period. They should match. Then Download CSV (or
.xlsx) — one row per order, in original order, ready for reconciliation.
What deduping an orders export actually does
The two real controls plus the consequences specific to line-item-style exports. There is no row-level merge or quantity sum.
| Control / behaviour | Effect on an orders CSV | Default |
|---|---|---|
| Unique key column | Set to Name/Order ID; all line-item rows of one order collapse to the first | First column |
| Case-sensitive keys | Off matches #1001 regardless of case; leave off for order numbers | Off |
| Whitespace in key | #1001 matches #1001 — trimmed for comparison, original kept in the cell | Always trimmed |
| Line-item rows after the first | Discarded — per-SKU detail on rows 2..n of an order is lost | By design |
| Blank order-number rows | Kept and counted as Empty keys, not collapsed | Always preserved |
How major platforms shape their orders export
Whether an order spans multiple rows determines whether deduping on Order ID is the right move.
| Platform | Rows per order | Dedup key | Caveat |
|---|---|---|---|
| Shopify (Orders export) | One row per line item; first row has the full header | Name | Deduping keeps the header row but drops per-item lines |
| WooCommerce (order CSV plugins) | Varies; bundles/line items often expand to multiple rows | Order ID / order_number | Check whether your plugin already exports one row per order |
| Magento / Adobe Commerce | Often one row per order at order grid; per-item at item grid | increment_id | Use the order-grid export if you want order-level dedup |
| BigCommerce | One row per order in the standard export | Order ID | Duplicates here usually come from overlapping date ranges, not line items |
| Etsy | Order CSV is one row per order; 'Sold items' CSV is per item | Order ID | Pick the order-level file before deduping |
Cookbook
Real before/after rows from orders exports. Customer data anonymised. Remember: deduping a line-item export keeps one row per order and drops the rest of the line items.
Shopify line-item export collapsed to one row per order
ExampleOrder #1001 has two line items, so Shopify wrote two rows — the second repeats the order name but leaves customer/total blank. Deduping on Name keeps the first (full) row.
Input (Shopify-style: row 2 of #1001 has blank header fields): Name,Email,Lineitem name,Total #1001,a@x.com,Blue Mug,24.00 #1001,,Red Mug, #1002,b@y.com,Hat,15.00 Key column: Name · Case-sensitive keys: OFF Output (one row per order, header row kept): Name,Email,Lineitem name,Total #1001,a@x.com,Blue Mug,24.00 #1002,b@y.com,Hat,15.00 Note: the 'Red Mug' line item is gone — fine for an order count, NOT fine if you need per-item fulfilment detail.
Overlapping date-range exports duplicate whole orders
ExampleTwo exports (Jan 1–15 and Jan 10–31) both include order #1042. Concatenated, it appears twice as a full duplicate. Deduping on the order number removes the repeat.
Input (two exports concatenated): Order,Date,Total 1042,2026-01-12,80.00 1099,2026-01-20,45.00 1042,2026-01-12,80.00 Key column: Order Output: Order,Date,Total 1042,2026-01-12,80.00 1099,2026-01-20,45.00 Stats: Rows in 3 · Rows out 2 · Duplicates 1 · Unique keys 2
Order number with stray whitespace from a re-export
ExampleA re-export padded the order number with a trailing space. Trim-before-compare treats it as the same order and removes the duplicate.
Input (trailing space on row 1): order_id,total 1001 ,50.00 1001,50.00 Key column: order_id Output: order_id,total 1001 ,50.00 The kept cell still has its space — only the comparison key is trimmed. Run csv-whitespace-trimmer to clean the value too.
Blank order-number rows preserved (malformed export)
ExampleA truncated export produced rows with no order number. They aren't duplicates of each other — each blank-key row is kept and counted as an Empty key for inspection.
Input: order_id,total 1001,50.00 ,?? ,?? Key column: order_id Output (blank rows kept): order_id,total 1001,50.00 ,?? ,?? Stats: Rows in 3 · Rows out 3 · Duplicates 0 · Empty keys 2 → investigate the source export; these rows are likely corrupt.
Keep line items, just remove FULL duplicate rows
ExampleIf you need every line item but a clumsy export duplicated entire rows verbatim, do NOT dedupe on Order ID. Instead use csv-duplicate-finder or remove only exact full-row repeats — keying on Order ID here would wrongly drop distinct line items.
Input (each line item distinct; no true duplicates): Order,SKU,Qty #1001,MUG,1 #1001,HAT,1 #1002,MUG,2 Deduping on 'Order' WOULD wrongly collapse #1001 to one line. Right approach: keep all rows; if whole rows are duplicated, audit with csv-duplicate-finder on a composite Order+SKU key (merge those columns first with csv-column-merger).
Errors and edge cases
Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.
Deduping a line-item export drops line items
By designKeying on Order ID keeps one row per order and discards rows 2..n — which for a Shopify-style export means losing every line item after the first. This is correct for order-level counts and finance summaries, but wrong for per-SKU fulfilment. If you need all line items, don't dedupe on Order ID; audit full-row duplicates instead with csv-duplicate-finder.
Orders export over the free 500-row limit
Pro requiredThis is a Pro tool; free is capped at 500 rows / 2 MB. A month of line-item rows easily exceeds that. Pro raises it to 100,000 rows / 100 MB. For bigger periods, split with csv-row-splitter, dedupe each chunk, concatenate, dedupe once more.
Need to sum order totals across duplicate rows
Not supportedThe deduplicator removes whole rows; it does not add up totals or quantities. For revenue you need per order, dedupe to get one row per order then sum the (already order-level) total — or use a spreadsheet pivot if the totals are spread across line-item rows.
First row of the order has blank totals
Check your export shapeMost exports put the full header (customer, total) on the FIRST line-item row, so keeping the first row is correct. If your export instead repeats totals on every line, or puts them on the last row, verify before trusting the dedup — sort with csv-sorter so the complete row is first if needed.
Want the most recent export's row to win
First-row onlyOnly the first occurrence is kept. To make a newer export's version of an order win over an older one, concatenate the newer export first, or sort by export/updated date descending with csv-sorter before deduping.
Order numbers reused across years
Collision riskIf your store reset order numbering and #1001 exists in both 2024 and 2025, deduping on the number alone would merge two different orders. Dedupe on a globally-unique Order ID (the internal one), or build a composite key (year + number) with csv-column-merger first.
Refund / partial-refund rows duplicated
Review before dedupSome exports add separate rows for refunds or adjustments sharing the order number. Deduping on Order ID would drop the refund row. If refund detail matters, keep all rows and filter instead with csv-column-filter; dedupe only when you truly want one row per order.
Blank order-number rows kept
PreservedRows with no order number pass through untouched as Empty keys — usually a sign of a truncated or corrupt export. Investigate the source; filter them out with csv-column-filter (order_id is_not_empty) before reconciliation if they're junk.
Semicolon-delimited or BOM-prefixed export
SupportedDelimiter auto-detection handles ;-separated EU-locale exports, and a leading UTF-8 BOM doesn't corrupt the first header. Output is comma-delimited. For a guaranteed Excel-clean round-trip, upload and download as .xlsx.
You only want to count duplicates, not remove them
Use the finderTo see which orders are duplicated and how many times before deciding, use csv-duplicate-finder — it marks each row YES/NO and groups matches. The deduplicator is the step that actually collapses them.
Frequently asked questions
Why does my Shopify export have multiple rows per order?
Shopify writes one row per line item. A 3-product order is 3 rows that share the same Name (order number); only the first row carries the customer, shipping, and totals. Deduping on Name collapses them to one row per order — keep this in mind: it discards the extra line items.
Which column should I dedupe orders on?
The order identifier that's identical across an order's rows: Shopify Name (#1001), WooCommerce Order ID/order_number, Magento increment_id, BigCommerce/Etsy Order ID. Pick one column — the tool keys on a single field.
Will deduping lose my line-item detail?
Yes, if you key on Order ID on a line-item export. The first row per order is kept and the rest are dropped, so per-SKU lines after the first are lost. That's right for order counts and finance, wrong for fulfilment. If you need every line item, don't dedupe on Order ID.
Does it sum order totals or quantities?
No. It removes whole duplicate rows and keeps the first — there's no aggregation. For revenue per order, dedupe to one row per order (which already holds the order total in Shopify exports) or use a spreadsheet pivot to sum line-item totals.
How do I remove orders duplicated by overlapping date-range exports?
Concatenate the exports with csv-merger, then dedupe on the order number — full-duplicate orders collapse to one. Cross-check the resulting Unique keys count against your store admin's order total for the period.
Which copy of a duplicated order is kept?
The first occurrence in file order. To make a newer export win, concatenate it first or sort by date descending with csv-sorter before deduping.
My store reuses order numbers each year — is that a problem?
Yes. If #1001 exists in two years, deduping on the number alone merges two different orders. Dedupe on the internal globally-unique Order ID, or build a composite key (year + number) with csv-column-merger first.
Is order and customer data uploaded anywhere?
No. Parsing and deduplication run in your browser. Order totals, customer addresses, and payment references stay on your device — only an anonymous run counter is recorded for signed-in dashboards.
How many order rows can it handle?
Free tier: 500 rows / 2 MB (Pro tool). Pro: 100,000 rows / 100 MB. For larger periods, split with csv-row-splitter, dedupe each chunk, concatenate, then dedupe once more.
Can I upload the export as an Excel file?
Yes — .xlsx, .xls, and .ods are accepted (first sheet), and you can download the deduped result back as .xlsx. Plain .csv works directly, with auto-detected delimiter.
How do I just audit duplicate orders before removing them?
Use csv-duplicate-finder — it adds an _is_duplicate YES/NO column and groups matching orders so you can confirm before deleting. Then use this deduplicator to collapse them.
What about refund rows that share the order number?
Deduping on Order ID would drop a separate refund/adjustment row. If refund detail matters, keep all rows and filter with csv-column-filter instead; only dedupe when you genuinely want a single row per order.
Privacy first
Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.