How to detect duplicate invoice numbers in an accounting CSV
- Step 1: Export the invoices or AR CSV. Download from Xero (Business → Invoices → Export), QuickBooks (Reports → export to CSV), Sage, or your billing platform. A standard comma-delimited file is fine; the delimiter is auto-detected.
- Step 2: Drop the file onto the tool. Parsing happens in your browser via PapaParse. Invoice numbers, client names, and amounts never leave the machine. Free runs handle up to 2 MB / 500 rows; Pro handles 100 MB / 100,000 rows.
- Step 3: Select the invoice-number column. In Find duplicates in column, choose Invoice Number (or InvoiceNo, Reference, Document No.). Only one key column is used per run.
- Step 4: Set case sensitivity. Leave Case-sensitive matching off so casing differences are ignored. Tick it only if A-100 and a-100 are genuinely distinct invoices in your ledger.
- Step 5: Read the summary. Click Find duplicates and review the three counters plus the duplicate-values list to see exactly which invoice numbers repeat and how many times.
- Step 6: Download and void in your accounting system. Click Download Marked CSV to get *.duplicates-marked.csv, filter _is_duplicate = YES, and void or merge the duplicate invoices in Xero/QuickBooks before the next billing cycle. Re-export to confirm they are gone.
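
If you prefer to script the filter in Step 6 rather than do it in a spreadsheet, the sketch below lists the invoice numbers flagged YES in a downloaded marked file. It assumes the file has the _is_duplicate column described on this page; the function name flagged_invoice_numbers and the default column name are hypothetical.

```python
import csv
import io
from collections import Counter


def flagged_invoice_numbers(marked_csv_text, key_column="Invoice Number"):
    """Count how often each flagged invoice number appears in a marked CSV."""
    reader = csv.DictReader(io.StringIO(marked_csv_text))
    # Keep only rows the tool marked YES, then tally their key values.
    return Counter(
        row[key_column] for row in reader if row["_is_duplicate"] == "YES"
    )


marked = (
    "Invoice Number,Client,Amount,Date,_is_duplicate\n"
    "INV-1001,Acme Ltd,1200.00,2026-05-02,YES\n"
    "INV-1002,Borex,640.00,2026-05-02,NO\n"
    "INV-1001,Acme Ltd,1200.00,2026-05-02,YES\n"
)
for number, count in flagged_invoice_numbers(marked).items():
    print(number, count)
```

The resulting list is a convenient checklist of numbers to void or merge in the accounting system.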
The audit controls in full
The complete option set. There is no fuzzy matching, no multi-column key, and no automatic removal — this is a flag-only tool.
| Control | Behaviour | Default |
|---|---|---|
| Find duplicates in column | Single key column chosen from your header names; values in it are grouped to find repeats | First column |
| Case-sensitive matching | Off lowercases values before comparing (INV-1 = inv-1); on requires identical casing | Off |
| _is_duplicate column | Appended to every data row: YES if the invoice number occurs 2+ times, NO if once. First occurrence is YES too | Always added |
| Removal | None — nothing is deleted. Use csv-deduplicator to drop surplus rows | Zero removed |
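
The behaviour in the table can be sketched in a few lines of Python. This is an illustration of the described rules (single key column, optional lowercasing, YES for every member of a repeated group, nothing removed), not the tool's own code, which runs in the browser. The function name flag_duplicates and its parameters are hypothetical.

```python
import csv
import io
from collections import Counter


def flag_duplicates(csv_text, key_column, case_sensitive=False):
    """Return CSV text with a trailing _is_duplicate column (YES/NO)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fieldnames = list(reader.fieldnames or [])

    def norm(value):
        # Assumed rule: with case sensitivity off, values are lowercased first.
        value = value or ""
        return value if case_sensitive else value.lower()

    # Count occurrences of each (normalised) key value.
    counts = Counter(norm(row[key_column]) for row in rows)

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames + ["_is_duplicate"],
        lineterminator="\n",
        extrasaction="ignore",
    )
    writer.writeheader()
    for row in rows:
        # Every member of a repeated group is YES, including the first.
        repeated = counts[norm(row[key_column])] >= 2
        row["_is_duplicate"] = "YES" if repeated else "NO"
        writer.writerow(row)  # every row is kept; nothing is removed
    return out.getvalue()
```

Calling flag_duplicates(text, "Invoice Number") mirrors the default settings; passing case_sensitive=True corresponds to ticking Case-sensitive matching.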
Why invoice numbers duplicate in exports
Frequent causes of repeated invoice numbers in AR/billing CSVs and the recommended follow-up after flagging.
| Cause | How it appears | Resolution after flagging |
|---|---|---|
| Batch posted twice | Whole rows repeated, identical invoice number and amount | Void the duplicate batch; reconcile the ledger |
| Credit note reused the original number | Same number, opposite-sign amount | Renumber the credit note per your numbering policy |
| Manual re-entry after a sync failure | Same number, minor differences in date or client casing | Keep the correct line; remove the stray |
| Two period exports merged | Overlapping months produce repeated invoices | Merge cleanly with csv-merger and de-dupe with csv-deduplicator |
Cookbook
Before/after rows from accounting exports. Invoice numbers, clients, and amounts anonymised; the _is_duplicate column is exactly what the tool writes.
Double-posted batch — invoice number repeated
Example: A nightly sync ran twice, so the same invoice posted to the AR ledger twice. Selecting Invoice Number as the key marks both copies YES, including the first, so the full pair is preserved for the audit file.

Input (ar_export.csv):

    Invoice Number,Client,Amount,Date
    INV-1001,Acme Ltd,1200.00,2026-05-02
    INV-1002,Borex,640.00,2026-05-02
    INV-1001,Acme Ltd,1200.00,2026-05-02

Key column: Invoice Number · Case-sensitive: off

Output (ar_export.duplicates-marked.csv):

    Invoice Number,Client,Amount,Date,_is_duplicate
    INV-1001,Acme Ltd,1200.00,2026-05-02,YES
    INV-1002,Borex,640.00,2026-05-02,NO
    INV-1001,Acme Ltd,1200.00,2026-05-02,YES
Credit note reused the invoice number
Example: A credit note was issued under the original invoice number INV-2050 instead of a new one. The amounts differ, but the number is the same; the tool flags both because it keys on the number column only.

Input:

    Invoice Number,Type,Amount
    INV-2050,Invoice,500.00
    INV-2050,Credit Note,-500.00

Key column: Invoice Number · Case-sensitive: off

Output:

    Invoice Number,Type,Amount,_is_duplicate
    INV-2050,Invoice,500.00,YES
    INV-2050,Credit Note,-500.00,YES
Leading zeros hide a duplicate
Example: Excel stripped a leading zero from one export, so 00123 became 123. Because matching is exact text, the two are different values and are NOT flagged. Normalise the format first if a spreadsheet has dropped zeros.

Input:

    Invoice Number,Amount
    00123,90.00
    123,90.00

Key column: Invoice Number · Case-sensitive: off

Output (not detected, text differs):

    Invoice Number,Amount,_is_duplicate
    00123,90.00,NO
    123,90.00,NO

Fix: re-pad with csv-find-replace, or re-export with the number column stored as text, then re-check.
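
As an alternative to re-padding by hand, the normalisation can be sketched in Python. This assumes your all-digit invoice numbers share a fixed width (5 here, purely as an example); the helper name pad_invoice_number is hypothetical and is not part of the tool.

```python
def pad_invoice_number(value, width=5):
    """Left-pad all-digit invoice numbers with zeros to a fixed width."""
    # Only touch pure ASCII-digit values; prefixed numbers such as
    # INV-1001 pass through unchanged.
    if value.isascii() and value.isdigit():
        return value.zfill(width)
    return value


for raw in ["00123", "123", "INV-1001"]:
    print(raw, "->", pad_invoice_number(raw))
```

After padding, 123 and 00123 become the same text, so an exact-text comparison would group them. Pick the width from your own numbering policy rather than from this example.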
Case-sensitive run for case-meaningful prefixes
Example: A firm uses A- for one entity and a- for another, so A-100 and a-100 are different invoices. Ticking Case-sensitive matching keeps them apart instead of merging them into one duplicate group.

Input:

    Invoice Number,Entity
    A-100,EntityOne
    a-100,EntityTwo

Key column: Invoice Number · Case-sensitive: ON

Output (no duplicates):

    Invoice Number,Entity,_is_duplicate
    A-100,EntityOne,NO
    a-100,EntityTwo,NO
Reading the audit summary
Example: For a 480-row AR export with 12 invoice numbers each appearing twice, the summary quantifies exposure before you export the marked file.

Summary after Find duplicates:

    Duplicate groups : 12   (invoice numbers that repeat)
    Extra copies     : 12   (surplus rows = sum of count-1)
    Unique values    : 456  (invoice numbers appearing once)

Interpretation: 468 distinct invoice numbers across 480 rows; 12 are duplicated once each. Download to flag every row.
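
The arithmetic behind the three counters can be checked with a short Python sketch. It follows the definitions given in the summary (groups that repeat, sum of count-1, values appearing once); the function name summarise and the dictionary keys are hypothetical.

```python
from collections import Counter


def summarise(values, case_sensitive=False):
    """Compute the three summary counters for a list of key values."""
    keys = values if case_sensitive else [v.lower() for v in values]
    counts = Counter(keys)
    return {
        # Distinct values that occur two or more times.
        "duplicate_groups": sum(1 for n in counts.values() if n >= 2),
        # Surplus rows: every occurrence beyond the first in each group.
        "extra_copies": sum(n - 1 for n in counts.values() if n >= 2),
        # Distinct values that occur exactly once.
        "unique_values": sum(1 for n in counts.values() if n == 1),
    }


# Rebuild the 480-row scenario: 468 distinct numbers, 12 of them repeated once.
numbers = [f"INV-{i}" for i in range(468)] + [f"INV-{i}" for i in range(12)]
print(len(numbers), summarise(numbers))
```

Duplicate groups plus unique values gives the number of distinct invoice numbers (12 + 456 = 468), and adding the extra copies gives the row count (468 + 12 = 480).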
Errors and edge cases
Real errors and silent failures sourced from each platform's own documentation. Find the entry that matches your symptom, then apply the fix it describes.
You wanted duplicates removed for the next import
By design: This is a flag-only audit tool. It appends _is_duplicate and keeps every row, which is what you want for an audit trail. To produce a deduplicated file for re-import, use csv-deduplicator, which keeps the first row of each group.
Leading zeros dropped by Excel
Not matched: Comparison is exact text. If Excel coerced 00123 to 123 in one export, the two won't group. Re-export with the invoice column formatted as text, or re-pad with csv-find-replace, before checking.
Invoice number and reference are separate columns
Single key only: Only one key column is used per run. If you must detect duplicates across Invoice Number plus Reference, build a combined column with csv-column-merger and key on that, or run the finder twice (once per column) and reconcile the two marked files.
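
To make the combined-column idea concrete, here is a minimal Python sketch that appends a joined key column to a CSV. It illustrates the approach only and is not how csv-column-merger works internally; the function name add_combined_key, the new column name, and the "|" separator are assumptions.

```python
import csv
import io


def add_combined_key(csv_text, columns, new_column="combined_key", sep="|"):
    """Append a column that joins the given columns into one key value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fieldnames = list(reader.fieldnames or []) + [new_column]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # A separator keeps ("AB", "C") distinct from ("A", "BC").
        row[new_column] = sep.join(row[c] or "" for c in columns)
        writer.writerow(row)
    return out.getvalue()


source = (
    "Invoice Number,Reference,Amount\n"
    "INV-1,PO-9,10.00\n"
    "INV-1,PO-8,10.00\n"
)
print(add_combined_key(source, ["Invoice Number", "Reference"]))
```

Keying on the combined column flags only rows where both the number and the reference repeat. Choose a separator that cannot occur in either column.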
Whitespace or non-breaking spaces around the number
Not matched: A stray space (or a CHAR(160) non-breaking space pasted from a PDF) makes a value distinct, so a genuine duplicate is missed. Run csv-whitespace-trimmer first; note it targets ordinary whitespace, so review pasted-from-PDF cells.
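
For cells pasted from a PDF, a small Python sketch shows one way to neutralise non-breaking spaces before checking. This is an illustrative clean-up step, not a description of csv-whitespace-trimmer; the helper name clean_key is hypothetical.

```python
NBSP = "\u00a0"  # the non-breaking space, CHAR(160) in spreadsheet terms


def clean_key(value):
    """Turn non-breaking spaces into ordinary ones, then trim both ends."""
    return value.replace(NBSP, " ").strip()


dirty = ["INV-1001", "INV-1001" + NBSP, "  INV-1001 "]
cleaned = [clean_key(v) for v in dirty]
print(len(set(dirty)), "distinct before;", len(set(cleaned)), "distinct after")
```

Spaces inside a value are left alone apart from the non-breaking-to-ordinary conversion, so a number such as "INV 1001" keeps its internal space.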
First instance also flagged YES
Expected: Every member of a repeated group is marked YES, including the earliest. This preserves the full set for comparison. To isolate only surplus rows, filter to YES then exclude the first per group, or use csv-deduplicator.
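
The "filter to YES, then exclude the first per group" step can be sketched in Python against a downloaded marked file. The function name surplus_rows is hypothetical, and the lowercasing mirrors the default case-insensitive setting described on this page.

```python
import csv
import io


def surplus_rows(marked_csv_text, key_column, case_sensitive=False):
    """Return flagged rows except the first occurrence in each group."""
    seen = set()
    surplus = []
    for row in csv.DictReader(io.StringIO(marked_csv_text)):
        if row["_is_duplicate"] != "YES":
            continue  # not part of any duplicate group
        key = row[key_column] if case_sensitive else row[key_column].lower()
        if key in seen:
            surplus.append(row)  # second or later copy
        else:
            seen.add(key)  # first occurrence: remember it, keep it out
    return surplus


marked = (
    "Invoice Number,Client,Amount,Date,_is_duplicate\n"
    "INV-1001,Acme Ltd,1200.00,2026-05-02,YES\n"
    "INV-1002,Borex,640.00,2026-05-02,NO\n"
    "INV-1001,Acme Ltd,1200.00,2026-05-02,YES\n"
)
for row in surplus_rows(marked, "Invoice Number"):
    print(row["Invoice Number"], row["Amount"])
```

"First" here means first in file order, which matches the marked CSV because it preserves the original row order.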
Blank invoice-number cells
Grouped together: Every row with an empty key value is grouped under one empty key and flagged YES if two or more exist; the duplicate list shows it as (empty). Filter or fill blank invoice numbers before trusting the audit.
Export larger than the free 500-row / 2 MB cap
Upgrade required: Free runs are capped at 2 MB and 500 rows; a larger AR/AP export is blocked with a Pro prompt. Pro raises the limit to 100 MB / 100,000 rows. To split a one-off, use csv-row-splitter, but duplicates across chunks won't be detected.
Amounts differ but the number is the same
Detected: The tool keys on the invoice-number column only, so it flags repeated numbers regardless of differing amounts or dates, which is exactly what you want when catching a credit note or a corrected invoice that reused the number. Compare the marked rows to decide the fix.
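
Comparing the marked rows can be partly scripted. The Python sketch below sorts each flagged group into one of three rough buckets based on its amounts. The bucket labels, the function name classify_groups, and the default column names are all assumptions for illustration; the tool itself reports no such classification.

```python
from collections import defaultdict
from decimal import Decimal


def classify_groups(rows, key_column="Invoice Number", amount_column="Amount"):
    """Label each flagged group by how its amounts compare."""
    groups = defaultdict(list)
    for row in rows:
        if row.get("_is_duplicate") == "YES":
            # Lowercased keys mirror the default case-insensitive matching.
            groups[row[key_column].lower()].append(Decimal(row[amount_column]))

    labels = {}
    for key, amounts in groups.items():
        if len(set(amounts)) == 1:
            labels[key] = "identical amounts"    # e.g. a batch posted twice
        elif sum(amounts) == 0:
            labels[key] = "amounts net to zero"  # e.g. a reused credit note
        else:
            labels[key] = "amounts differ"       # e.g. a corrected invoice
    return labels


rows = [
    {"Invoice Number": "INV-1001", "Amount": "1200.00", "_is_duplicate": "YES"},
    {"Invoice Number": "INV-1001", "Amount": "1200.00", "_is_duplicate": "YES"},
    {"Invoice Number": "INV-2050", "Amount": "500.00", "_is_duplicate": "YES"},
    {"Invoice Number": "INV-2050", "Amount": "-500.00", "_is_duplicate": "YES"},
]
print(classify_groups(rows))
```

The labels are only hints about the likely cause; the actual fix still depends on reviewing the rows against the ledger.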
Frequently asked questions
Can I see which rows are the original and which are duplicates?
The _is_duplicate column marks every member of a duplicate group as YES, including the first occurrence, so all versions of an invoice number stay visible for comparison. If you specifically need only the surplus rows, filter to YES and exclude the earliest per group, or remove duplicates with csv-deduplicator.
Does this work for purchase (AP) invoice numbers too?
Yes. The tool is column-agnostic — point it at any invoice-number or document-reference column in any CSV, whether it's AR, AP, or expenses. The behaviour is identical.
Is there a row limit for the audit?
Free runs handle up to 2 MB and 500 rows; larger files are blocked with a Pro prompt. Pro lifts that to 100 MB and 100,000 rows. (Earlier descriptions citing 2,000 rows were inaccurate — the free row limit is 500.)
Why didn't it flag two invoices with the same number?
Matching is exact text (optionally lowercased). The usual culprits are leading zeros dropped by Excel (00123 vs 123), a trailing space, or a comma in the number. Normalise with csv-find-replace or csv-whitespace-trimmer and re-run.
Will it detect a credit note that reused the invoice number?
Yes. It keys on the number column only, so a credit note carrying the original invoice number is flagged as a duplicate even though the amount differs. That's exactly the kind of numbering error an audit should surface.
Is financial data uploaded?
No. Parsing and detection run entirely in your browser. Invoice numbers, client names, and amounts never reach a server. Only an anonymous usage counter is recorded when signed in, and it can be disabled in account settings.
Can I check across two columns, e.g. number and reference?
Not in one pass — the key is a single column. Either combine the columns first with csv-column-merger and key on the combined value, or run the finder twice and reconcile the two marked files.
What's the difference versus the CSV Deduplicator?
This finder flags duplicates and keeps every row (audit-first). The csv-deduplicator actually removes them, keeping one row per group. Use the finder to investigate, the deduplicator to produce a clean import file.
How do I combine multiple period exports before checking?
Use csv-merger to append the monthly files (they share a header schema) into one CSV, then run the duplicate finder so invoices repeated across periods are caught in a single audit.
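
For readers who script the append step themselves, a minimal Python sketch follows. It assumes every export has exactly the same header row and stops if one does not; the function name append_exports is hypothetical and this is not a description of csv-merger.

```python
import csv
import io


def append_exports(csv_texts):
    """Concatenate CSV exports that share one header, writing it once."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        file_header = next(reader, None)
        if file_header is None:
            continue  # empty export, nothing to append
        if header is None:
            header = file_header
            writer.writerow(header)
        elif file_header != header:
            raise ValueError("exports do not share a header")
        writer.writerows(reader)  # data rows, in original order
    return out.getvalue()


may = "Invoice Number,Amount\nINV-1,10.00\nINV-2,20.00\n"
june = "Invoice Number,Amount\nINV-2,20.00\nINV-3,30.00\n"
print(append_exports([may, june]))
```

In this example INV-2 appears in both months, so it would show up as a repeated invoice number once the merged file is checked.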
What is the output filename and format?
Your original CSV with one extra trailing column, _is_duplicate, saved as <yourfile>.duplicates-marked.csv. Open it in your accounting spreadsheet and filter that column to YES.
Does case-sensitive matching matter for invoice numbers?
Usually not — most numbering schemes are case-insensitive, so the default catches INV-1 vs inv-1. Turn it on only if your prefixes encode different entities by case (e.g. A- vs a-).
Can I group all flagged rows together for the auditor?
The marked CSV preserves original order. After download, sort by the _is_duplicate column or by invoice number with csv-sorter so all YES rows sit together in the working paper.
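
The same grouping can be sketched in Python for a marked file already loaded as rows. The function name group_flagged and the default column name are hypothetical; the sort puts YES rows first and keeps each invoice number's copies adjacent.

```python
def group_flagged(rows, key_column="Invoice Number"):
    """Sort so flagged rows come first, grouped by invoice number."""
    return sorted(
        rows,
        # False sorts before True, so YES rows lead; ties keep file order
        # because Python's sort is stable.
        key=lambda r: (r["_is_duplicate"] != "YES", r[key_column].lower()),
    )


rows = [
    {"Invoice Number": "INV-1001", "_is_duplicate": "YES"},
    {"Invoice Number": "INV-1002", "_is_duplicate": "NO"},
    {"Invoice Number": "INV-1001", "_is_duplicate": "YES"},
]
for row in group_flagged(rows):
    print(row["Invoice Number"], row["_is_duplicate"])
```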
Privacy first
Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.