How to normalise status column values across a csv
- Step 1List the variants and the one canonical value — Scan the status column for every form (
Active,Y,Yes,1,TRUE,t) and decide the target your schema expects —active, or1, ortrue. - Step 2Drop the CSV onto the tool — Accepts
.csv,.tsv,.txtwith auto delimiter detection. The whole file is processed; the header row is skipped automatically. - Step 3Turn on Regex mode and anchor the value — Tick Regex mode. Find
^Y$, Replace withactive. The anchors stopYfrom matching inside other words and stop1from hitting ID columns. - Step 4Run and sanity-check the counter — Click Replace. Replacements made is the count of cells changed — it should roughly match how many rows had that status value. A wildly high count means the match leaked outside the status column.
- Step 5Chain the remaining variants — Download, re-feed the
.replaced.csv, and map the next value (^Yes$→active,^TRUE$→active). Or do them in one pass with an alternation^(Y|Yes|TRUE|t)$. - Step 6Download and import to your database or BI tool — Click Download CSV (saved
<name>.replaced.csv) and run your enum import. The constraint should now accept every row.
Status-variant folding plan
Common status/boolean spellings and the canonical value. Always anchor in Regex mode so a status token doesn't match inside IDs, quantities, or other words.
| Find (Regex, anchored) | Replace with | Why anchor it | Same-target group |
|---|---|---|---|
| ^(Active|Y|Yes|TRUE|t|1)$ | active | active would match inside Inactive; 1 would match IDs | All true/active forms |
| ^(Inactive|N|No|FALSE|f|0)$ | inactive | 0 would match inside quantities; No inside Norway | All false/inactive forms |
| ^Pending$ | pending | Pending is safe but anchoring keeps it consistent | Standalone state |
| ^Y$ | active | Bare Y matches inside Yes, Yellow, City | If mapping single letters |
| ^1$ | true | Bare 1 matches every ID, qty, and price digit | Boolean 1→0/true |
| ^0$ | false | Bare 0 matches inside 100, 2024, prices | Boolean 0→false |
Why whole-cell anchors are non-negotiable here
The effect of anchoring on a status pass. Without a column selector, anchors are the only thing protecting non-status columns.
| Find | Regex | On a `1` in an ID column (`1001`) | On a `1` in the Status column |
|---|---|---|---|
| 1 | Off (literal) | 1001 → truetrue0true (every 1 replaced) | 1 → true (also changed) |
| ^1$ | On | 1001 → 1001 (not a whole-cell match) | 1 → true (only this changes) |
| \b1\b | On | 1001 → 1001 (no word boundary inside) | 1 → true |
| 1 | On (no anchor) | 1001 → true00true (digits matched) | 1 → true |
Cookbook
Before/after status cells from real exports. Anchored regex keeps IDs and quantities safe; the header row is never touched.
Fold every 'active' spelling in one anchored pass
ExampleCase-insensitive matching plus an anchored alternation turns all the true-state forms into the canonical active while leaving everything else alone.
Input: ID,Status 1001,Active 1002,Y 1003,Yes 1004,TRUE 1005,Inactive Find (Regex ON): ^(Active|Y|Yes|TRUE|t)$ Replace with: active Output: ID,Status 1001,active 1002,active 1003,active 1004,active 1005,Inactive (not matched) Replacements made: 4 (ID column untouched — not whole-cell matches)
The unanchored '1' disaster
ExampleShows why a bare Find 1 is dangerous. Without anchors and a column selector, every 1 digit in IDs and quantities is rewritten too.
Input: ID,Qty,Status 101,15,1 Find: 1 Replace with: true (Regex OFF, no anchor) Output (every '1' clobbered): ID,Qty,Status true0true,true5,true Use Regex mode + ^1$ instead: Find: ^1$ Replace with: true → ID 101 and Qty 15 stay intact.
'active' inside 'inactive'
ExampleCase-insensitive matching means a literal Find active also matches the active inside Inactive, corrupting the opposite state. Anchor the whole cell to avoid it.
Input: User,Status Alice,Active Bob,Inactive Find: active Replace with: 1 (Regex OFF — case-insensitive) Wrong output: User,Status Alice,1 Bob,In1 (the 'active' inside 'Inactive' was matched!) Fix — Find (Regex ON): ^Active$ Replace with: 1 User,Status Alice,1 Bob,Inactive
Boolean 1/0 to true/false safely
ExampleTwo anchored passes convert a numeric boolean column. Anchors are essential so the digits inside IDs and prices survive.
Pass 1 — Find (Regex ON): ^1$ Replace with: true Pass 2 — Find (Regex ON): ^0$ Replace with: false (re-feed output) Input: After pass 1: After pass 2: ID,Flag ID,Flag ID,Flag 10,1 10,true 10,true 20,0 20,0 20,false ID column (10, 20) untouched throughout.
Header 'is_active' is preserved
ExampleSnake-case boolean headers often contain words you might map. Because row 0 is never processed, the heading survives even if a Find would otherwise match it.
Input: is_active,name active,Alice active,Bob Find (Regex ON): ^active$ Replace with: 1 Output (header row untouched): is_active,name 1,Alice 1,Bob Replacements made: 2
Errors and edge cases
Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.
Unanchored `1` rewrites IDs, quantities, and prices
By designThere is no column selector, so a Find 1 (literal or unanchored regex) matches every 1 digit in the file — IDs, SKUs, prices, dates. Fix: always use Regex mode with whole-cell anchors ^1$ (or word boundaries \b1\b) so only cells equal to the status value change. Test on a 5-row sample first.
`active` matches inside `inactive`
By designCase-insensitive matching (the default) means a literal Find active also matches the substring inside Inactive, turning the opposite state into garbage like In1. Fix: anchor the whole cell with ^Active$ in Regex mode so the match fires only when the cell is exactly Active.
Header row never normalises
PreservedRow 0 is returned unchanged, so headings like Status, is_active, flag, or enabled are safe even if a Find would match them. The trade-off: a headerless file has its first data row skipped — prepend a placeholder header before processing.
One value per pass — no mapping grid
By designYou can't paste Y→active, N→inactive, Pending→pending as a table. Map each value in its own pass (re-feed the output), or fold same-target values into one Regex alternation ^(Y|Yes|TRUE)$. Plan the normalisation as an ordered sequence of passes.
Order of passes matters for overlapping values
CautionIf you map Y→active and then active→1 in a later pass, the second pass also catches the actives you just created. Map to a final canonical value directly, or sequence passes so an intermediate value isn't re-matched. Anchoring reduces but doesn't eliminate this.
Invalid regex returns the file unchanged
No changesA malformed alternation (e.g. ^(Y|Yes with an unclosed group) is caught and the original rows are returned with 0 replacements. Fix: if the counter is 0 unexpectedly, check for a missing ) or $ in the pattern.
Empty Replace deletes the status value
CautionLeaving Replace with empty turns a matched status into a blank cell — occasionally useful (clearing a placeholder), but usually not what you want for a status column an enum import requires to be non-null. Double-check the Replace field has your canonical value in it.
Free tier limit on a large export
Tier limitFree caps at 2 MB / 500 rows. A full transactions export exceeds it. Fix: upgrade to Pro, or split with csv-row-splitter, normalise each chunk with the identical anchored passes, then recombine with csv-merger.
Frequently asked questions
Can I normalise the status column without touching ID numbers?
Yes — use Regex mode with whole-cell anchors. Find ^1$ (Regex on) only matches a cell whose entire value is 1, so a 1 inside an ID like 1001 or a quantity like 15 is left alone. There's no column selector, so anchoring is how you protect numeric columns from a status pass.
Why did mapping 'active' also break my 'Inactive' rows?
Matching is case-insensitive and, without anchors, substring-based — so active matches the tail of Inactive and turns it into In<replacement>. Anchor the whole cell with ^Active$ in Regex mode so only cells equal to Active change.
How do I fold Y, Yes, and TRUE onto one value at once?
Tick Regex mode and use an anchored alternation: Find ^(Y|Yes|TRUE|t)$, Replace with active. That collapses all four spellings in one pass. There's no multi-pair grid, so the alternative is one pass per value with the output re-fed each time.
Does this work for binary 0/1 to false/true conversion?
Yes, with anchors. Run two passes: Find ^1$ Replace with true, then re-feed and Find ^0$ Replace with false. The ^...$ anchors are essential — an unanchored 1 would rewrite every 1 digit in IDs, dates, and prices across the whole file.
Is matching case-sensitive?
Case-insensitive by default, which is ideal for Active/active/ACTIVE folding into one canonical value. Tick Case sensitive if casing is meaningful — for example if lowercase t/f is your boolean and you don't want to catch the letter T elsewhere.
Will my status header get rewritten?
No. The header row (row 0) is never modified, so Status, is_active, enabled, or flag headings are preserved even if your Find would match them. Note that a file without a header row will have its first data row skipped — add a placeholder header.
What does the replacements number count?
It's the number of cells that changed, not occurrences. For a status pass it should roughly equal the rows that held that value. A count much higher than your row total is a red flag that the match leaked into another column — re-run with whole-cell anchors.
Can I map multiple status columns in one run?
The tool already processes every cell in the file, so two status columns are both handled in one pass — there's no per-column scoping. The risk is the same value meaning different things in different columns; if that's a concern, split the file by column with csv-column-filter or normalise the columns in separate, carefully-anchored passes.
What if I accidentally blank out a status value?
That happens when the Replace field is empty — the matched value is deleted. For status columns that an enum import requires, always confirm your canonical value is in the Replace field before running. If you do blank a column, re-run from the original file.
What file types and sizes are supported?
.csv, .tsv, and .txt, with auto delimiter detection. Free tier handles up to 2 MB / 500 rows; Pro removes both. For large exports, split with csv-row-splitter and recombine with csv-merger.
Is my data uploaded to a server?
No. PapaParse runs entirely in your browser; status values and any PII in the file never reach a JAD server. Only an anonymous run counter is stored server-side for signed-in dashboard stats, which you can opt out of in settings.
Can I automate status normalisation in a pipeline?
Yes. GET /api/v1/tools/csv-find-replace returns the option schema (find, replace, caseSensitive, useRegex); pair the @jadapps/runner once and POST to 127.0.0.1:9789/v1/tools/csv-find-replace/run. Since each call is one pair, a status pipeline strings together one anchored runner call per canonical value on each scheduled export.
Privacy first
Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.