How to find duplicate orders in a CSV before fulfilment
- Step 1: Export the orders CSV. Download from Shopify (Orders → Export), WooCommerce (Orders → Export), or your OMS. A standard comma-delimited export works; the parser auto-detects the delimiter from the file.
- Step 2: Drop the file onto the tool. Parsing runs in your browser via PapaParse. Order numbers, customer names, and addresses never reach a server. The free tier handles files up to 2 MB / 500 rows; Pro unlocks 100 MB / 100,000 rows.
- Step 3: Pick the key column. Use the Find duplicates in column dropdown to select Order ID (or Name / Order Number, depending on platform). Only one column can be the key per run; there is no multi-column key in this tool.
- Step 4: Choose case sensitivity. Leave Case-sensitive matching off (default) so SO-1 and so-1 match regardless of any stray casing. Tick it only if your IDs are case-meaningful.
- Step 5: Review the on-screen summary. After clicking Find duplicates, read the three stat cards (duplicate groups, extra copies, unique values) and the duplicate-values list to confirm the scale before exporting.
- Step 6: Download and resolve before despatch. Click Download Marked CSV for a *.duplicates-marked.csv file. Filter _is_duplicate = YES in your spreadsheet, cancel or merge the surplus orders in your OMS, then re-export the clean list to the warehouse.
What the duplicate finder actually does
The complete control set — this tool has exactly one key column and one checkbox. There is no multi-column key, no fuzzy match, no auto-removal.
| Control | What it does | Default |
|---|---|---|
| Find duplicates in column | Dropdown of your header names; picks the single column whose values are grouped to detect duplicates. One column only per run | First column |
| Case-sensitive matching | When off, values are lowercased before comparison (SO-1 = so-1). When on, byte-for-byte casing must match | Off (case-insensitive) |
| Output: _is_duplicate column | Appended to every data row as YES (value appears 2+ times in the key column) or NO (appears once). First occurrence of a group is also YES | Always added |
| Rows removed | None. The tool flags only — use csv-deduplicator to actually remove duplicate rows | Zero removed |
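The behaviour in the table can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the tool's actual source (which runs in the browser with PapaParse); the function name `mark_duplicates` and its parameters are hypothetical.

```python
import csv
import io
from collections import Counter


def mark_duplicates(csv_text, key_column, case_sensitive=False):
    """Return CSV text with an _is_duplicate column appended to every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fieldnames = list(reader.fieldnames or [])

    def normalise(value):
        # Only casing is normalised; whitespace is deliberately left alone.
        return value if case_sensitive else value.lower()

    # Count how often each (normalised) key value occurs.
    counts = Counter(normalise(row[key_column]) for row in rows)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames + ["_is_duplicate"],
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Every member of a repeated group is YES, including the first.
        repeated = counts[normalise(row[key_column])] > 1
        row["_is_duplicate"] = "YES" if repeated else "NO"
        writer.writerow(row)
    return out.getvalue()


sample = "Order ID,Channel\nSO-2050,Phone\nSO-2051,Web\nso-2050,Phone\n"
print(mark_duplicates(sample, "Order ID"))
```

No row is dropped and the original order is preserved; only the extra column changes. Blank cells are just another value in this sketch, so two empty keys would also be flagged as one group.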
Why the same Order ID appears twice
Common upstream causes of duplicate order rows in e-commerce exports, and where the duplicate finder fits in the fix.
| Cause | How it looks in the CSV | What to do after flagging |
|---|---|---|
| Double-fired payment webhook | Two rows, identical Order ID, same totals, timestamps seconds apart | Cancel the surplus order in the OMS before the pick list is generated |
| Customer double-clicked checkout | Two adjacent Order IDs for the same cart, or one ID repeated if the platform deduped the number but not the row | Refund/void one; confirm only one was charged |
| Manual order re-keyed after a sync error | Same Order ID with slightly different customer casing or address | Compare the marked rows and keep the correct version |
| Two exports concatenated | Whole order rows repeated because the date ranges overlapped | De-dupe with csv-deduplicator, or merge cleanly with csv-merger next time |
Cookbook
Real before/after order exports. Order numbers and customer fields anonymised; the _is_duplicate column is exactly what the tool appends.
Webhook fired twice — same Order ID, marked YES
Example: A payment gateway retried its webhook, so the OMS wrote the order row twice with an identical Order ID. Selecting Order ID as the key and downloading the marked CSV makes both copies obvious. Note that both rows are YES, including the first.
Input (orders.csv):

    Order ID,Customer,Total,Status
    1001,Sam Lee,49.00,paid
    1002,Dana Roe,18.50,paid
    1001,Sam Lee,49.00,paid

Key column: Order ID · Case-sensitive: off

Output (orders.duplicates-marked.csv):

    Order ID,Customer,Total,Status,_is_duplicate
    1001,Sam Lee,49.00,paid,YES
    1002,Dana Roe,18.50,paid,NO
    1001,Sam Lee,49.00,paid,YES
Re-keyed order with different casing
Example: A manual order was re-entered as so-2050 after the original SO-2050 failed to sync. Case-insensitive matching (the default) collapses the casing so both are caught as one group.
Input:

    Order ID,Channel
    SO-2050,Phone
    SO-2051,Web
    so-2050,Phone

Key column: Order ID · Case-sensitive: off (default)

Output:

    Order ID,Channel,_is_duplicate
    SO-2050,Phone,YES
    SO-2051,Web,NO
    so-2050,Phone,YES
Case-sensitive run keeps distinct IDs apart
Example: Here WEB-77 (web channel) and web-77 (a legacy POS order) are genuinely different in the source system. Ticking Case-sensitive matching keeps them as two unique values rather than one duplicate group.
Input:

    Order ID,Source
    WEB-77,Online
    web-77,POS legacy

Key column: Order ID · Case-sensitive: ON

Output (no duplicates found):

    Order ID,Source,_is_duplicate
    WEB-77,Online,NO
    web-77,POS legacy,NO
Trailing space hides a duplicate
Example: Matching is whole-cell exact; it does not trim. A trailing space on one Order ID makes it a distinct value, so the pair is NOT flagged. Trim first with the whitespace trimmer if your export has stray spaces.
Input (note the space after 3003 on row 1):

    Order ID,Total
    3003 ,12.00
    3003,12.00

Key column: Order ID · Case-sensitive: off

Output (not detected, because the values differ by a space):

    Order ID,Total,_is_duplicate
    3003 ,12.00,NO
    3003,12.00,NO

Fix: run csv-whitespace-trimmer first, then re-check.
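To see why the pair above is missed and what trimming changes, here is a small Python sketch. It only imitates the exact-match comparison described here; `find_repeated` is a hypothetical helper, not part of the tool or of csv-whitespace-trimmer.

```python
from collections import Counter


def find_repeated(values, trim=False):
    """Return the set of values that occur more than once (case-insensitive)."""
    keys = [v.strip() if trim else v for v in values]
    counts = Counter(k.lower() for k in keys)
    return {k for k, n in counts.items() if n > 1}


order_ids = ["3003 ", "3003"]  # the first ID carries a trailing space

print(find_repeated(order_ids))             # nothing repeats: the cells differ by a space
print(find_repeated(order_ids, trim=True))  # the trimmed values now form one group
```

Python's `str.strip()` with no arguments also removes non-breaking spaces, which are another common invisible difference in exported IDs.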
Reading the summary before you download
Example: The stat cards tell you the scale before exporting. For a 1,200-row export with 30 repeated Order IDs each appearing twice, the numbers read like this.
Summary after Find duplicates:

    Duplicate groups : 30     (distinct Order IDs that repeat)
    Extra copies     : 30     (sum of count-1 across groups)
    Unique values    : 1,140  (Order IDs appearing exactly once)

Meaning: 1,170 distinct Order IDs, 30 of them duplicated once each, so 1,200 data rows total. Download to see the YES/NO flags.
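The arithmetic behind the three numbers can be checked with a short Python sketch. The function `summarise` and the generated IDs are hypothetical; they only reproduce the shape of the example (1,140 IDs appearing once, 30 IDs appearing twice), not real order data.

```python
from collections import Counter


def summarise(key_values):
    """Compute the three summary numbers from a list of key-column values."""
    counts = Counter(key_values)
    duplicate_groups = sum(1 for n in counts.values() if n > 1)
    extra_copies = sum(n - 1 for n in counts.values() if n > 1)
    unique_values = sum(1 for n in counts.values() if n == 1)
    return duplicate_groups, extra_copies, unique_values


# Made-up IDs: 1,140 appear once, 30 appear twice.
ids = [f"A{i}" for i in range(1140)] + [f"B{i}" for i in range(30)] * 2

groups, extras, uniques = summarise(ids)
print(len(ids), groups, extras, uniques)  # 1200 30 30 1140
```

The row count always equals unique values + duplicate groups + extra copies, which here is 1,140 + 30 + 30 = 1,200.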
Errors and edge cases
Real errors and silent failures sourced from each platform's own documentation. Find the heading that matches your situation and apply the fix it describes.
You expected duplicates removed, not flagged
By design: This tool never deletes rows. It appends an _is_duplicate column so you can audit before acting. To actually remove duplicate order rows, use csv-deduplicator, which keeps the first of each group and drops the rest.
You need to match on customer + product, not just Order ID
Not supported: The key is a single column. There is no multi-column or composite key in this tool. To catch 'same customer ordered the same SKU twice', build a combined key column first with csv-column-merger (e.g. Customer|SKU), then run the duplicate finder on that new column.
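If you would rather prepare the combined column in a script, the idea looks roughly like this in Python. It is a sketch only: `add_combined_key`, the `combined_key` column name, and the sample columns are hypothetical, and csv-column-merger may behave differently in detail.

```python
import csv
import io


def add_combined_key(csv_text, columns, new_column="combined_key", separator="|"):
    """Append a column that joins the given columns, e.g. Customer|SKU."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fieldnames = list(reader.fieldnames or []) + [new_column]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Join the chosen cells into one value the duplicate finder can key on.
        row[new_column] = separator.join(row[c] for c in columns)
        writer.writerow(row)
    return out.getvalue()


orders = (
    "Order ID,Customer,SKU\n"
    "1001,Sam Lee,MUG-01\n"
    "1002,Dana Roe,MUG-01\n"
    "1003,Sam Lee,MUG-01\n"
)
print(add_combined_key(orders, ["Customer", "SKU"]))
```

Selecting `combined_key` as the key column would then group orders 1001 and 1003, which share the same customer and SKU despite having different Order IDs. Pick a separator that never occurs inside the joined values.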
Trailing or leading spaces around the Order ID
Not matched: Comparison is exact (whole-cell), with optional lowercasing only. "3003 " (with a trailing space) and "3003" are different values and will not group. Clean the column with csv-whitespace-trimmer before running so genuine duplicates are not missed.
First occurrence is also marked YES
Expected: Every member of a repeated group gets YES, including the earliest row. This is intentional: you want to compare all copies. If you only want the surplus copies, filter to YES and then exclude the first row per group manually, or remove duplicates outright with csv-deduplicator.
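Excluding the first row per group by hand gets tedious on larger files. The following Python sketch shows one way to pull out only the surplus copies from a marked CSV; `surplus_rows` is a hypothetical helper, and it assumes the file's row order reflects which copy came first.

```python
import csv
import io


def surplus_rows(marked_csv_text, key_column, case_sensitive=False):
    """Return flagged rows except the first occurrence of each key."""
    seen = set()
    surplus = []
    for row in csv.DictReader(io.StringIO(marked_csv_text)):
        if row["_is_duplicate"] != "YES":
            continue
        key = row[key_column] if case_sensitive else row[key_column].lower()
        if key in seen:
            surplus.append(row)  # a later copy: candidate to cancel or merge
        else:
            seen.add(key)        # first copy of the group: keep
    return surplus


marked = (
    "Order ID,Customer,Total,Status,_is_duplicate\n"
    "1001,Sam Lee,49.00,paid,YES\n"
    "1002,Dana Roe,18.50,paid,NO\n"
    "1001,Sam Lee,49.00,paid,YES\n"
)
for row in surplus_rows(marked, "Order ID"):
    print(row["Order ID"], row["Customer"], row["Total"])
```

The number of rows returned should equal the Extra copies figure from the summary, which makes a quick sanity check before cancelling anything.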
File over 2 MB or 500 rows on the free tier
Upgrade required: Free runs cap at 2 MB and 500 rows; a larger orders export is blocked with a Pro prompt. Pro lifts the ceiling to 100 MB / 100,000 rows. For a one-off over the free cap, split the export with csv-row-splitter, though duplicates spanning two chunks will not be detected across files.
Empty Order ID cells
Grouped together: Blank values are treated as a single key. Every row with an empty Order ID groups together and is flagged YES if there are two or more. The duplicate-values list shows it as (empty). Fill or filter blanks before relying on the result.
Two exports concatenated into one file
Detected: If you pasted two overlapping date-range exports together, the repeated whole rows will share an Order ID and be flagged. For cleaner combining next time, use csv-merger, which appends files sharing a header schema.
Order ID stored with a currency-style thousands format
Compared as text: Everything is compared as text. If one row stores 1,001 and another 1001 they are different values and will not match. Normalise the format (find/replace the comma) with csv-find-replace first.
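The effect of that normalisation is easy to demonstrate in Python. This sketch assumes the IDs never legitimately contain commas; `normalise_order_id` is a hypothetical helper, not a feature of csv-find-replace.

```python
def normalise_order_id(value):
    """Drop comma thousands separators so '1,001' and '1001' compare equal."""
    return value.replace(",", "")


raw_ids = ["1,001", "1001", "1002"]
cleaned = [normalise_order_id(v) for v in raw_ids]

print(len(set(raw_ids)))  # distinct values before cleaning
print(len(set(cleaned)))  # distinct values after cleaning
```

Before cleaning, the three strings are all distinct; afterwards the first two collapse into one value and would be flagged as a duplicate group.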
Frequently asked questions
Does this remove duplicate orders or just flag them?
It flags them. All rows are kept and a new _is_duplicate column is appended with YES for rows whose Order ID appears two or more times, NO otherwise. To physically remove duplicates and keep one per group, use csv-deduplicator.
Can I check duplicates on Order ID and customer email at the same time?
No. The tool uses a single key column per run. To detect 'same customer, same order content', first build a combined column (e.g. Email|SKU) with csv-column-merger, then run the duplicate finder on that column.
Is the first copy of a duplicate order marked too?
Yes. Every row in a repeated group is marked YES, including the first occurrence. This lets you compare all copies side by side. If you only want the surplus rows, filter to YES and drop the earliest per group, or use the deduplicator instead.
Why didn't it catch two orders I know are the same?
Matching is whole-cell exact (optionally lowercased). The most common reason is a difference you can't see — a trailing space, a non-breaking space, or a thousands separator. Trim the column with csv-whitespace-trimmer or normalise it with csv-find-replace and re-run.
What does case-sensitive matching change?
By default values are lowercased before comparison, so SO-1 and so-1 count as the same. Tick Case-sensitive matching to require identical casing, which keeps WEB-77 and web-77 as separate unique values.
Is order data uploaded anywhere?
No. The file is parsed and processed entirely in your browser. Order numbers, customer names, totals, and addresses never reach a server. Only a usage counter (no file content) is recorded when you are signed in, and that can be turned off in account settings.
What do the three summary numbers mean?
Duplicate groups = how many distinct Order IDs repeat. Extra copies = the surplus rows (sum of group size minus one for each group), i.e. how many rows you'd remove if you deduplicated. Unique values = Order IDs that appear exactly once.
How large an orders export can I check?
Free runs handle up to 2 MB and 500 rows; larger files are blocked with a Pro prompt. Pro handles up to 100 MB and 100,000 rows. For very large exports beyond Pro, split with csv-row-splitter, noting that duplicates straddling two chunks won't be detected across files.
What does the downloaded file look like?
It is your original CSV with one extra column, _is_duplicate, appended at the end of every row. The file is named <yourfile>.duplicates-marked.csv. Open it in Excel or Google Sheets and filter that column to YES to see only the suspect rows.
Will it work with Shopify order exports that have line-item rows?
Shopify exports repeat the order Name across multiple line-item rows for a single order, which the tool will correctly flag as duplicates of the Name. If you want one row per order, deduplicate or roll up line items first; otherwise pick a truly unique column (Shopify's Id) as the key.
Can it sort the duplicates together?
No — the marked CSV keeps your original row order. After downloading, sort by the _is_duplicate column (or by Order ID) with csv-sorter to group all flagged rows together for review.
How do I combine multiple daily order files before checking?
Use csv-merger to append the daily exports into one file (they share a header schema), then run the duplicate finder on the combined file so duplicates spanning days are caught in a single pass.
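For readers who script this step instead, appending files with a shared header can be sketched in Python as follows. `merge_exports` is a hypothetical function and the sample data is made up; csv-merger itself may handle mismatched headers differently.

```python
import csv
import io


def merge_exports(csv_texts):
    """Append CSV texts that share an identical header row into one CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for text in csv_texts:
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue  # skip empty exports
        if header is None:
            header = rows[0]
            writer.writerow(header)  # write the header once
        elif rows[0] != header:
            raise ValueError("header mismatch between exports")
        writer.writerows(rows[1:])
    return out.getvalue()


monday = "Order ID,Total\n1001,49.00\n1002,18.50\n"
tuesday = "Order ID,Total\n1002,18.50\n1003,7.25\n"
print(merge_exports([monday, tuesday]))
```

Order 1002 appears in both daily files, so it only shows up as a duplicate once the two exports sit in the same file.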
Privacy first
Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.