How to deduplicate Excel rows by a unique key column
- Step 1: Open the deduplicator. Land on the Excel deduplicator and you are taken to the CSV/Excel deduplicator workspace, which accepts .xlsx, .xls, .ods, and .csv from the same drop zone.
- Step 2: Drop your spreadsheet. Drag the file in. SheetJS parses the first worksheet's header row locally and builds the column picker; your data stays in the tab.
- Step 3: Choose the unique key column. Pick the column that should be unique, such as `Email`, `SKU`, or `Order ID`. Two rows count as duplicates only when this one column matches.
- Step 4: Decide on case sensitivity. Leave `Case-sensitive keys` unchecked for the usual case-insensitive match, or tick it when `ABC123` and `abc123` are genuinely different IDs.
- Step 5: Run Remove duplicates. Click Remove duplicates. The summary reports rows in, rows out, duplicates removed, unique keys, and how many blank-key rows were kept.
- Step 6: Download the clean file. Download the result. XLSX in gives a `.deduped.xlsx` (single sheet, values only); CSV in gives a `.deduped.csv`. A 10-row preview shows the survivors.
What you can and can't control
The deduplicator exposes exactly two controls; everything else is fixed behaviour. These are the only knobs that exist.
| Control / behaviour | Type | Default | Effect |
|---|---|---|---|
| Unique key column | Dropdown of the header row; you pick exactly one column | First column | Two rows are duplicates when this one column matches |
| Case-sensitive keys | Checkbox | Off (case-insensitive) | Off folds ABC = abc; on keeps them distinct |
| Whitespace handling | Always on (not a toggle) | Always trims the key before compare | ` x ` matches `x`; output keeps the original spacing |
| Which row survives | Fixed behaviour | First occurrence kept | Later duplicates are dropped; order is preserved |
| Empty / blank keys | Fixed behaviour | Always kept | Blank-key rows are never collapsed into one |
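The rules in the table above can be sketched in a few lines of Python. This illustrates the described behaviour and is not the tool's own source, which runs in the browser. The function name `dedupe_rows`, the dict-per-row layout, and the use of plain lowercasing for the case-insensitive fold are all assumptions.

```python
def dedupe_rows(rows, key_column, case_sensitive=False):
    """Keep the first row per key; always keep rows whose key is blank."""
    seen = set()
    survivors = []
    for row in rows:
        raw = row.get(key_column)
        # Trimming is always applied to the comparison key, never to the row.
        key = "" if raw is None else str(raw).strip()
        if key == "":
            survivors.append(row)  # blank keys are never collapsed
            continue
        if not case_sensitive:
            key = key.lower()  # assumption: simple lowercase folding
        if key in seen:
            continue  # later duplicate is dropped
        seen.add(key)
        survivors.append(row)  # first occurrence, original text untouched
    return survivors


rows = [
    {"SKU": "AB-01", "Name": "Widget"},
    {"SKU": " ab-01 ", "Name": "Widget"},
    {"SKU": "", "Name": "Anon"},
    {"SKU": "", "Name": "Anon2"},
    {"SKU": "CD-02", "Name": "Gadget"},
]
kept = dedupe_rows(rows, "SKU")
print([r["SKU"] for r in kept])
```

With the default settings the padded, lower-case ` ab-01 ` row is dropped and both blank-key rows survive. Passing `case_sensitive=True` keeps all five rows, because `AB-01` and `ab-01` then count as different keys.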
How common key columns behave
The same case-insensitive, always-trimmed comparison applies to every column type. Pick the one that defines a unique record for your data.
| Key column | Typical duplicate cause | Default match result | Tip |
|---|---|---|---|
| Email | Same person signs up twice with different casing | Sue@x.com = sue@x.com collapse | Keep default (case-insensitive) |
| SKU | Re-imported catalog appends rows | AB-01 = ab-01 = ` AB-01 ` collapse | Keep default; trim is automatic |
| Order ID | Webhook delivered the order twice | Exact ID match, first kept | Either case mode is fine for numeric IDs |
| Customer ID (base64) | Merge of two systems | Case matters: aB ≠ Ab | Tick Case-sensitive keys |
| Phone | Formatting differs (+1 555 vs 1555) | Only collapses if the cell text is identical after trim | Normalise format first, then dedup |
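For the Phone row, "normalise format first" could look like the sketch below, run on the column before the file is dropped into the tool. The function name `normalise_phone` is hypothetical, and the rule that a 10-digit number is missing a leading country code `1` is an assumption that only suits North American style numbers. Real-world data usually needs a dedicated phone-number library.

```python
import re


def normalise_phone(text, default_country_code="1"):
    """Reduce a phone cell to digits only, adding a country code if missing."""
    digits = re.sub(r"\D", "", text)  # drop spaces, +, brackets, dashes
    # Assumption: a 10-digit result is a national number without country code.
    if len(digits) == 10:
        digits = default_country_code + digits
    return digits


a = normalise_phone("+1 555 010 0199")
b = normalise_phone("(555) 010-0199")
print(a == b)
```

Once both cells hold the same digit string, the exact-after-trim comparison treats them as one key.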
Tier limits for Excel deduplication
These per-file limits apply when you drop an .xlsx onto the Excel hub. The row limit counts the parsed body rows of the first sheet; it is separate from the file-size limit in bytes.
| Tier | Max file size | Max rows | Files at once |
|---|---|---|---|
| Free | 5 MB | 10,000 | 1 |
| Pro | 50 MB | 100,000 | 5 |
| Pro-media | 200 MB | 500,000 | 20 |
| Developer | 500 MB | unlimited | unlimited |
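As a quick way to read the table, the sketch below picks the lowest tier that would accept a given upload. The numbers come from the table; the names `TIERS` and `smallest_tier` are hypothetical, and treating each limit as inclusive (a file exactly at the limit passes) is an assumption.

```python
# Limits copied from the tier table; None means "unlimited".
TIERS = {
    "Free": {"max_mb": 5, "max_rows": 10_000, "max_files": 1},
    "Pro": {"max_mb": 50, "max_rows": 100_000, "max_files": 5},
    "Pro-media": {"max_mb": 200, "max_rows": 500_000, "max_files": 20},
    "Developer": {"max_mb": 500, "max_rows": None, "max_files": None},
}


def smallest_tier(size_mb, rows, files=1):
    """Return the first tier whose limits all hold, or None if none fits."""
    for name, limits in TIERS.items():  # dicts keep insertion order
        if size_mb > limits["max_mb"]:
            continue
        if limits["max_rows"] is not None and rows > limits["max_rows"]:
            continue
        if limits["max_files"] is not None and files > limits["max_files"]:
            continue
        return name
    return None


print(smallest_tier(size_mb=3, rows=25_000))
```

A 3 MB file fits the Free size cap, but 25,000 rows exceeds the Free row cap, so the sketch reports Pro.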
Cookbook
Real before/after rows from CRM, catalog, and order-list cleanups. Values anonymised; behaviour is exactly what the tool does.
Collapse repeated SKUs in a catalog export
An inventory re-import appended rows instead of updating them, so each SKU appears twice with a different Updated timestamp. Whole-row dedup misses these because the timestamp differs. Keying on SKU keeps the first (oldest) row per SKU.
Input (key column = SKU, default case-insensitive):

    SKU,Name,Stock,Updated
    AB-01,Widget,40,2026-01-03
    ab-01,Widget,40,2026-02-11
    CD-02,Gadget,12,2026-01-09

Output (.deduped.xlsx):

    SKU,Name,Stock,Updated
    AB-01,Widget,40,2026-01-03
    CD-02,Gadget,12,2026-01-09

-> 1 duplicate removed (ab-01 matched AB-01)
Keep the first record per customer
A CRM export has one row per support ticket, so frequent customers repeat. Deduping on CustomerID produces one row per customer — the first ticket they ever filed.
Input (key column = CustomerID):

    CustomerID,Name,Ticket,Opened
    C-9,Lee,T-100,2026-03-01
    C-9,Lee,T-140,2026-04-02
    C-12,Roy,T-101,2026-03-02

Output:

    CustomerID,Name,Ticket,Opened
    C-9,Lee,T-100,2026-03-01
    C-12,Roy,T-101,2026-03-02
A leading space was hiding a duplicate
A copy-paste left a leading space on one ID. It looks unique in the grid but is the same record. The comparison key is always trimmed, so it collapses anyway — and the survivor keeps its untouched original text.
Input (key column = OrderID):

    OrderID,Total
    5012 ,49.00
    5012,49.00
    5099,12.50

Output (first row kept verbatim, spaces preserved):

    OrderID,Total
    5012 ,49.00
    5099,12.50

-> trim-before-compare caught the duplicate
Case-sensitive keys for base64 IDs
Two systems were merged and the join key is a base64 token where casing is significant. The default would wrongly collapse aB9x and Ab9X. Tick Case-sensitive keys to keep them apart.
Input (key column = Token, Case-sensitive keys = ON):

    Token,Source
    aB9x,System A
    Ab9X,System B
    aB9x,System A

Output:

    Token,Source
    aB9x,System A
    Ab9X,System B

-> only the exact-case repeat of aB9x removed
Blank keys are preserved, not merged
Some rows have no value in the key column. The tool keeps every blank-key row — it never collapses them into one — so use the empty-row tools separately if those rows are noise.
Input (key column = Email):

    Email,Name
    sue@x.com,Sue
    ,Anon
    jon@x.com,Jon
    ,Anon2
    sue@x.com,Sue Dup

Output:

    Email,Name
    sue@x.com,Sue
    ,Anon
    jon@x.com,Jon
    ,Anon2

-> sue@x.com duplicate removed; BOTH blank rows kept
Edge cases and what actually happens
Only the first worksheet is read
First sheet only: When you upload a multi-sheet .xlsx, only the first worksheet is parsed and deduplicated. Sheets 2+ are not included in the output. If your duplicates span sheets, combine them first with the sheet joiner, then dedup the result.
Whole-row duplicates that differ in another column
By design: The tool keys on one column only. If two rows share the Email but differ in Notes, they are still duplicates and the later one is dropped, so the differing note is lost with it. Pick the column that truly defines uniqueness, or merge the notes before deduping.
Multi-column composite key needed
Not supported: There is no multi-column / composite-key mode. To dedup on FirstName + LastName + DOB, concatenate them into one helper column first (use the conditional splitter or a formula), key on the helper, then delete it afterwards.
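If you prepare the file in a script rather than with a formula, the helper-column workaround might look like this sketch. The function name `add_helper_key`, the `_key` column name, and the `|` separator are all hypothetical choices, not anything the tool requires.

```python
def add_helper_key(rows, columns, helper="_key", sep="|"):
    """Return copies of the rows with one extra column joining the key parts."""
    keyed = []
    for row in rows:
        parts = [str(row.get(col, "")).strip() for col in columns]
        keyed.append({**row, helper: sep.join(parts)})
    return keyed


rows = [
    {"FirstName": "Ann", "LastName": "Lee", "DOB": "1990-01-02"},
    {"FirstName": "Ann", "LastName": "Lee", "DOB": "1991-05-06"},
    {"FirstName": "Ann ", "LastName": "Lee", "DOB": "1990-01-02"},
]
keyed = add_helper_key(rows, ["FirstName", "LastName", "DOB"])
print([r["_key"] for r in keyed])
```

The separator matters: without it, `ab` + `c` and `a` + `bc` would produce the same helper value. Choose a character that cannot occur in the source columns.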
Near-duplicates (typos, spacing in the value)
Not supported here: This is an exact-after-trim/case match, not fuzzy. Acme Inc and Acme, Inc. will NOT collapse. For approximate matching by similarity score, use the fuzzy deduplicator (Pro).
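The difference between the two kinds of match can be shown with Python's standard library. `difflib.SequenceMatcher` is used here only as a stand-in for "a similarity score"; it is an assumption for illustration and not the algorithm the fuzzy deduplicator uses.

```python
from difflib import SequenceMatcher


def exact_match(a, b):
    """Exact comparison after trim and lowercase, as this tool's default does."""
    return a.strip().lower() == b.strip().lower()


def similarity(a, b):
    """Score between 0 and 1; higher means the two strings are more alike."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


a, b = "Acme Inc", "Acme, Inc."
print(exact_match(a, b))
print(similarity(a, b) > 0.8)
```

The exact comparison fails because of the comma and the full stop, while the similarity score is high enough that a threshold-based matcher would treat the two names as the same company.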
Blank-key rows are never collapsed
Preserved: Rows whose key cell is empty or whitespace-only are kept verbatim, one output row each. They are counted under empty keys in the summary. Strip them with the empty-row remover if they are unwanted.
Formulas and formatting on XLSX output
Values only: XLSX output is rebuilt as a fresh single-sheet workbook from the deduplicated values. Formulas become their computed text, and cell formatting / merged cells / colours are not carried over. Run the format inspector first if you need to record the original formatting.
File exceeds the tier limit
Rejected: Free is capped at 5 MB / 10,000 rows / 1 file. A larger sheet is blocked before processing. Upgrade to Pro (50 MB / 100,000 rows) or split the file first.
No header row in the sheet
Error: The parser needs a header row to build the column picker; an empty sheet throws "No header row was found". Add a header row (or open the file and confirm row 1 has labels) and re-drop it.
Numbers vs text in the key column
Watch coercion: After XLSX parsing, the key is compared as text. A column where 01024 is stored as the number 1024 in one row and the text 01024 in another may not match. Standardise the column's type before deduping if leading zeros matter.
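The mismatch is easy to reproduce in isolation. In the sketch below, `as_key` mimics "compare as text after trim" and `zero_padded` is one possible way to standardise the column beforehand. Both names are hypothetical, and the fixed width of 5 is an assumption about the ID format.

```python
def as_key(cell):
    """Turn a cell into its text comparison key."""
    return str(cell).strip()


def zero_padded(cell, width):
    """Standardise an ID to text of a fixed width, restoring leading zeros."""
    return as_key(cell).zfill(width)


numeric_cell = 1024   # stored as a number, leading zero already lost
text_cell = "01024"   # stored as text

print(as_key(numeric_cell) == as_key(text_cell))
print(zero_padded(numeric_cell, 5) == zero_padded(text_cell, 5))
```

The first comparison fails because `1024` and `01024` are different strings; after padding both sides to the same width they agree. This only works for whole-number IDs, since a cell read as a float would carry a decimal part into the text.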
Two columns share the same header
Ambiguous: If your header row has two columns both named Email, the dropdown lists both by position. Pick by position; they are distinct columns to the tool even though the labels match.
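The sketch below shows why position is the safer handle when labels repeat: a lookup keyed by header text silently merges the two columns, while an index keeps them apart. The helper names `label_columns` and `column_values` and the sample addresses are hypothetical.

```python
def label_columns(headers):
    """Pair each header with its 1-based position so repeats stay distinct."""
    return [(pos, name) for pos, name in enumerate(headers, start=1)]


def column_values(rows, position):
    """Read one column by 1-based position rather than by header text."""
    return [row[position - 1] for row in rows]


headers = ["Email", "Name", "Email"]
rows = [
    ["sue@x.com", "Sue", "sue@work.example"],
    ["jon@x.com", "Jon", "jon@work.example"],
]

by_name = dict(zip(headers, rows[0]))  # the second Email overwrites the first
print(len(by_name))
print(label_columns(headers))
print(column_values(rows, 3))
```

Keying by name leaves only two entries for three columns, which is the ambiguity a position-based picker avoids.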
Frequently asked questions
Which row is kept when there are duplicates?
The first occurrence in row order. Every later row with the same key is removed. Original order among survivors is preserved exactly.
Can I deduplicate on more than one column at once?
No — the tool keys on a single column you pick from the dropdown. For a composite key, concatenate the columns into one helper column first, key on it, then delete the helper. There is no multi-column mode.
Is matching case-sensitive?
By default no — ABC and abc are treated as the same key. Tick Case-sensitive keys to make casing matter, which you'd want for base64 tokens or case-significant IDs.
Does it trim spaces before comparing?
Yes, always. The key value is trimmed of leading/trailing whitespace before comparison, so ` 1024 ` matches `1024`. Your visible cell text is never modified; only the comparison key is trimmed.
What happens to rows where the key cell is blank?
They are kept — every one of them, one row each. Blank keys are never collapsed together. They show up under empty keys in the run summary.
Does it work across multiple sheets?
No. Only the first worksheet of an .xlsx is read and deduplicated. Combine sheets first with the sheet joiner if duplicates span tabs.
Will my formulas and formatting survive?
On XLSX output, no: the result is a fresh single-sheet workbook of values. Formulas become their computed values and formatting is dropped. If that matters, keep the original workbook and run the deduplicator on a values-only copy.
What file types can I upload?
.xlsx, .xls, .ods, and .csv all work from the same drop zone. XLSX-family inputs return an XLSX; a CSV input returns a CSV.
How large a file can I deduplicate?
Free allows 5 MB / 10,000 rows / 1 file. Pro raises that to 50 MB / 100,000 rows / 5 files, Pro-media to 200 MB / 500,000 rows / 20 files, and Developer to 500 MB with unlimited rows and files.
Is my data uploaded anywhere?
No. Parsing and deduplication happen entirely in your browser via SheetJS. The file never leaves your device; only a usage counter is recorded if you're signed in.
How is this different from the fuzzy deduplicator?
This tool requires an exact key match (after trim / optional case-fold). The fuzzy deduplicator collapses near-matches like Acme Inc ≈ Acme, Inc. using a similarity threshold — use it for messy names, this one for clean keys.
Can I see how many duplicates were removed before downloading?
Yes. The results panel reports input rows, output rows, duplicates removed, unique keys, and empty keys kept, plus a 10-row preview of the deduplicated sheet.
Privacy first
Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.