Clean Web Form Excel Data — Remove Invisible Unicode Characters

How to strip zero-width spaces and invisible characters from web form Excel exports

Step 1
Export responses to a spreadsheet — Typeform: Results → Responses → Download (XLSX/CSV). Google Forms: Responses → ⋮ → Download responses (.csv) or open in Sheets → File → Download → .xlsx. Jotform/Tally: Submissions → Export. Drop the file onto the tool.
Step 2
Keep the default toggles for invisible-character cleanup — Leave all four — Letters, Digits, Spaces, Punctuation — ticked. This keeps normal answer text (including accents) and deletes only invisible characters and stray symbols.
Step 3
Decide whether NBSP should be deleted or converted — With Spaces on, only the ASCII space is kept; NBSP and other Unicode spaces are deleted. If you would rather convert NBSP to a real space, do that step in the whitespace trimmer first, then run this tool for the remaining noise.
Step 4
Run the strip — Click Strip special chars. Each data cell is filtered character by character; the header row (your question text) is skipped and left exactly as exported.
Step 5
Verify with the cells-modified count and preview — The result panel shows Cells modified and Data rows, plus a first-10-rows preview. A surprisingly high cells-modified count usually means the form was injecting hidden characters on many responses.
Step 6
Download the cleaned export — Click Download: .stripped.xlsx for spreadsheet input, .stripped.csv for CSV. Re-import to your CRM, database, or mailing platform with the hidden characters gone.
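
The steps above can be sketched in Python as a rough model of what the tool does to a CSV. This is an illustration, not the tool's actual code: `keep_char`, `strip_cell`, and `strip_csv` are hypothetical names, Letters is modelled as Unicode general category L (the `\p{L}` class the page mentions), and treating Digits as ASCII 0-9 and Punctuation as Python's `string.punctuation` are assumptions.

```python
import csv
import io
import string
import unicodedata

def keep_char(ch, letters=True, digits=True, spaces=True, punctuation=True):
    """Return True if ch falls in one of the enabled classes (assumed definitions)."""
    if letters and unicodedata.category(ch).startswith("L"):
        return True
    if digits and ch in string.digits:           # assumption: ASCII digits only
        return True
    if spaces and ch == " ":                     # only the ASCII space is kept
        return True
    if punctuation and ch in string.punctuation: # assumption: ASCII punctuation
        return True
    return False

def strip_cell(value, **toggles):
    """Filter one cell character by character."""
    return "".join(ch for ch in value if keep_char(ch, **toggles))

def strip_csv(text, **toggles):
    """Clean every data cell, leave row 1 (the header) untouched, count changes."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return text, 0
    cleaned, modified = [rows[0]], 0
    for row in rows[1:]:
        new_row = [strip_cell(cell, **toggles) for cell in row]
        modified += sum(1 for old, new in zip(row, new_row) if old != new)
        cleaned.append(new_row)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(cleaned)
    return out.getvalue(), modified

sample = "email,name\nuser@x.com\u200b,Ana\nuser2@x.com,Bram\n"
cleaned, modified = strip_csv(sample)
print(repr(cleaned))
print("Cells modified:", modified)
```

The `modified` counter plays the role of the Cells modified figure in Step 5: it counts cells whose text changed, not the number of characters removed.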

Where form builders inject invisible characters

Typical sources of hidden Unicode in web-form exports and whether this tool removes them with default toggles (all four on).

Source	Injected character	Removed by default?	Symptom it causes
Copy-paste into a text field	Zero-width space U+200B	Yes	Duplicate slips past a unique constraint; search never matches
Mobile keyboard / autocorrect	NBSP U+00A0	Yes	`'New\u00a0York'` ≠ `'New York'` in `WHERE` clauses
First field / encoding	BOM U+FEFF (mid-cell)	Yes	Leading `\ufeff` breaks exact-match joins
Paste from Word/PDF	Curly quotes “ ”, em dash —	Yes	API JSON escaping / display glitches
Autofill control bytes	C0/C1 control chars	Yes	Boxes, corrupted CRM payloads
Emoji in free-text answers	🙂 and other symbols	Yes	Index bloat, downstream encoding errors
Respondent's accented name	é, ñ, ü, ø, CJK	No — kept	Legitimate data, preserved by `\p{L}`
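
To see why each row in the table lands where it does, the following sketch prints the Unicode general category of one sample character per row and whether a default-toggle filter would keep it. `kept_by_defaults` is a hypothetical stand-in for the tool's filter, with ASCII digits and Python's `string.punctuation` assumed for the Digits and Punctuation classes.

```python
import string
import unicodedata

def kept_by_defaults(ch):
    """Assumed default filter: Unicode letters, ASCII digits, ASCII space, ASCII punctuation."""
    return (
        unicodedata.category(ch).startswith("L")
        or ch in string.digits
        or ch == " "
        or ch in string.punctuation
    )

samples = {
    "zero-width space": "\u200b",
    "NBSP": "\u00a0",
    "BOM": "\ufeff",
    "left curly quote": "\u201c",
    "em dash": "\u2014",
    "control char (BEL)": "\u0007",
    "emoji": "\U0001F642",
    "e with acute": "\u00e9",
    "CJK ideograph": "\u4e2d",
}

for name, ch in samples.items():
    category = unicodedata.category(ch)
    print(f"{name:20} U+{ord(ch):04X}  {category}  kept={kept_by_defaults(ch)}")
```

Only the last two samples have a category starting with L, which is why accented and CJK text survives while everything else in the table is dropped.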

Toggle settings for common form-cleanup goals

Pick toggles to match the destination system. Defaults (all on) are right for most form data; untick to be stricter.

Goal	Letters	Digits	Spaces	Punctuation
General invisible-character cleanup (keep readable text)	On	On	On	On
Strict alphanumeric IDs from a 'reference' field	On	On	Off	Off
Digits-only phone field	Off	On	Off	Off
Letters-only first/last name field	On	Off	On	Off
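
The toggle table can be read as four presets. A minimal sketch, assuming the same class definitions as the page describes (`\p{L}` letters, ASCII space) plus ASCII digits and `string.punctuation` as assumptions; the `PRESETS` names are made up for illustration.

```python
import string
import unicodedata

# One entry per row of the toggle table (preset names are hypothetical).
PRESETS = {
    "general":   dict(letters=True,  digits=True,  spaces=True,  punctuation=True),
    "strict_id": dict(letters=True,  digits=True,  spaces=False, punctuation=False),
    "phone":     dict(letters=False, digits=True,  spaces=False, punctuation=False),
    "name":      dict(letters=True,  digits=False, spaces=True,  punctuation=False),
}

def strip_cell(value, letters=True, digits=True, spaces=True, punctuation=True):
    """Keep only characters that belong to an enabled class."""
    def keep(ch):
        return (
            (letters and unicodedata.category(ch).startswith("L"))
            or (digits and ch in string.digits)
            or (spaces and ch == " ")
            or (punctuation and ch in string.punctuation)
        )
    return "".join(ch for ch in value if keep(ch))

print(strip_cell("REF-00 12\u200b", **PRESETS["strict_id"]))  # REF0012
print(strip_cell("+1 (555) 010-9999", **PRESETS["phone"]))    # 15550109999
print(strip_cell("Anne-Marie O'Neil", **PRESETS["name"]))     # AnneMarie ONeil
```

The last line shows a side effect worth checking in the preview: with Punctuation off, hyphens and apostrophes inside names are removed as well.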

Cookbook

Real (anonymised) rows from form exports. The hidden characters are shown as escape sequences for clarity — in the actual file they are invisible.

Zero-width space breaking an email unique key

A Typeform email field with a trailing zero-width space from the respondent pasting their address. Invisible in the sheet, but the value is not byte-identical to the same email typed cleanly, so a unique-email constraint treats it as a different address and exact-match lookups miss it.

Input (CSV):
email,name
user@x.com\u200b,Ana
user2@x.com,Bram

Output (.stripped.csv, defaults):
email,name
user@x.com,Ana
user2@x.com,Bram
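
If you want to confirm the problem before cleaning, a short Python check makes the hidden character visible. This is only a diagnostic sketch; `find_hidden` is a hypothetical helper that reports format characters, control characters, and non-ASCII spaces.

```python
import unicodedata

clean = "user@x.com"
pasted = "user@x.com\u200b"   # same address with a trailing zero-width space

print(clean == pasted)         # False
print(len(clean), len(pasted)) # 10 11
print(pasted.encode("utf-8"))  # b'user@x.com\xe2\x80\x8b'

def find_hidden(value):
    """List (index, code point, name) for characters that render as nothing or as odd spaces."""
    hits = []
    for i, ch in enumerate(value):
        category = unicodedata.category(ch)
        if category in ("Cf", "Cc") or (category == "Zs" and ch != " "):
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits

print(find_hidden(pasted))     # [(10, 'U+200B', 'ZERO WIDTH SPACE')]
```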

NBSP in a city answer that never matched a filter

Google Forms response where the mobile keyboard inserted a non-breaking space inside 'New York'. Your dashboard's WHERE city = 'New York' returned nothing for this respondent. Spaces-on still deletes NBSP (only ASCII space is kept).

Input:
city
New\u00a0York
Los Angeles

Output (defaults):
city
NewYork
Los Angeles

(If you wanted 'New York' with a normal space, run the whitespace trimmer first.)
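
The difference between the two orders of operation can be sketched as follows. `nbsp_to_space` is a hypothetical stand-in for the whitespace trimmer's NBSP conversion, and `strip_cell` models the default toggles under the same assumptions as elsewhere (ASCII digits, Python's `string.punctuation`).

```python
import string
import unicodedata

def strip_cell(value):
    """Assumed default filter: Unicode letters, ASCII digits, ASCII space, ASCII punctuation."""
    return "".join(
        ch for ch in value
        if unicodedata.category(ch).startswith("L")
        or ch in string.digits
        or ch == " "
        or ch in string.punctuation
    )

def nbsp_to_space(value):
    """Stand-in for the whitespace trimmer: turn NBSP into a normal space."""
    return value.replace("\u00a0", " ")

city = "New\u00a0York"
print(strip_cell(city))                 # NewYork
print(strip_cell(nbsp_to_space(city)))  # New York
```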

BOM stuck to the first answer

Jotform export with a BOM (U+FEFF) prepended to the first data cell. The leading invisible character broke an exact-match join against a clean reference list.

Input:
ref,answer
\ufeffABC123,Yes
ABC124,No

Output (defaults):
ref,answer
ABC123,Yes
ABC124,No

Keep the accented name, drop the emoji

A free-text 'About you' field where the respondent's name is accented and they added an emoji. Defaults keep the name and remove the emoji and any control noise.

Input:
name,about
Søren Müller,Loves coffee ☕ 🙂
José,Dev 🚀

Output (defaults):
name,about
Søren Müller,Loves coffee  
José,Dev

XLSX export cleaned and re-saved as XLSX

Typeform XLSX with several text columns. The tool reads the first sheet, strips hidden noise across all data cells, and gives back a .stripped.xlsx. Header (question text) untouched.

Input: typeform-results.xlsx (Sheet1)
header: Email | Full name | Comments
row:    a@x.com | An\u00a0ya | great👍

Download: typeform-results.stripped.xlsx
row becomes: a@x.com | Anya | great

Edge cases and what actually happens

Respondent's accented name is preserved

Preserved

José, Søren, Ş, and non-Latin scripts are \p{L} letters and stay. Cleaning hidden characters never strips legitimate multilingual answers as long as Letters is on.

NBSP becomes nothing, not a space

Expected

With Spaces on, only the ASCII space is kept; NBSP (U+00A0) is deleted, so New\u00a0York becomes NewYork. If you need it to become New York, normalise NBSP→space in the whitespace trimmer first.

Header (form question text) is never cleaned

Preserved

Row 1 is returned verbatim, so your question wording survives as column names. If a question itself contains a junk byte, sanitise headers with the header rename tool.

Words fuse where a zero-width joiner sat between them

Expected

Removal deletes with no replacement. A zero-width joiner sitting between two visible characters is dropped and they close up. Usually harmless, but verify in the preview for languages that rely on ZWJ.
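
If your responses include scripts where joiners carry meaning, it may help to flag affected cells before stripping. A small sketch: `joiner_positions` is a hypothetical helper, and including the zero-width non-joiner (U+200C) alongside ZWJ is an addition not mentioned above.

```python
ZWJ = "\u200d"   # zero-width joiner
ZWNJ = "\u200c"  # zero-width non-joiner (added here as an assumption)

def joiner_positions(value):
    """Return (index, code point) for each joiner found in a cell."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(value) if ch in (ZWJ, ZWNJ)]

cell = "foo\u200dbar"
print(joiner_positions(cell))   # [(3, 'U+200D')]
print(cell.replace(ZWJ, ""))    # foobar
```

Cells that return a non-empty list are the ones worth inspecting in the preview after the strip.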

Emoji and pictographs are removed

Expected

Emoji are symbols, not letters, so they are deleted with default toggles. There is no option to keep them — if you must retain emoji, this is not the tool.

Curly quotes in a comment field are stripped

Expected

Smart quotes “ ” ‘ ’ are not in the ASCII punctuation set, so they are removed. Straight quotes ' and " are kept. The tool deletes the curly glyph rather than converting it to a straight quote.
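
If you would rather keep the quotation marks as straight quotes, one option is a conversion pass before stripping. The sketch below is not a feature of the tool: `QUOTE_MAP` and `straighten` are hypothetical, and `strip_cell` models the default toggles with ASCII digits and `string.punctuation` assumed.

```python
import string
import unicodedata

# Hypothetical pre-pass: map curly quotes to their straight ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
})

def straighten(value):
    return value.translate(QUOTE_MAP)

def strip_cell(value):
    """Assumed default filter: Unicode letters, ASCII digits, ASCII space, ASCII punctuation."""
    return "".join(
        ch for ch in value
        if unicodedata.category(ch).startswith("L")
        or ch in string.digits
        or ch == " "
        or ch in string.punctuation
    )

comment = "\u201cGreat\u201d service, didn\u2019t wait"
print(strip_cell(comment))              # Great service, didnt wait
print(strip_cell(straighten(comment)))  # "Great" service, didn't wait
```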

Multi-sheet export

First sheet only

Only the first sheet of an XLSX/ODS is read and written back. Additional response tabs are not included in the output.

File exceeds the free limit

Rejected

Free tier: 5 MB / 10,000 rows / 1 file. Large response sets need Pro (50 MB / 100,000 rows / 5 files) or higher. Oversized files are rejected at the dropzone.

All toggles unticked

Strips everything

If no class is kept, every data cell becomes empty. Keep at least Letters on for form text.

Decomposed accents from some keyboards

Edge

A precomposed é is kept; a decomposed e + combining acute (U+0301) keeps the e but may drop the combining mark (it is \p{M}, not \p{L}). Normalise responses to NFC upstream if exact glyphs matter.
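
The effect of normalising to NFC first can be checked with Python's standard `unicodedata` module. `strip_cell` here is a hypothetical model of the default toggles (ASCII digits and `string.punctuation` are assumptions), used only to show what happens to the combining mark.

```python
import string
import unicodedata

def strip_cell(value):
    """Assumed default filter: Unicode letters, ASCII digits, ASCII space, ASCII punctuation."""
    return "".join(
        ch for ch in value
        if unicodedata.category(ch).startswith("L")
        or ch in string.digits
        or ch == " "
        or ch in string.punctuation
    )

decomposed = "Jose\u0301"                            # e followed by combining acute
composed = unicodedata.normalize("NFC", decomposed)  # single precomposed character

print(unicodedata.category("\u0301"))  # Mn, a mark and not a letter
print(strip_cell(decomposed))          # Jose
print(strip_cell(composed))            # José
```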

Frequently asked questions

Will this delete my respondents' accented names?

No. The Letters toggle keeps all Unicode letters (\p{L}), including accented Latin and non-Latin scripts. José and Søren survive. Only invisible characters and symbols are removed.

Does it remove zero-width spaces?

Yes. U+200B (zero-width space), zero-width joiner, BOM (U+FEFF) appearing mid-cell, and soft hyphens are all deleted with default toggles, because none of them is a kept letter/digit/ASCII-space/punctuation.

Why does 'New York' from a form not match my database?

Almost always an NBSP (U+00A0) from a mobile keyboard sitting where a space should be. This tool deletes the NBSP. If you specifically want it turned into a normal space instead, run the whitespace trimmer first, then this tool for the remaining noise.

Does it strip the BOM at the start of the file?

It removes BOM characters that appear inside data cells. A BOM at the very start of a CSV file is normally handled by the parser; if a BOM ends up wedged in the first cell value, it is removed as noise.
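
For readers who pre-process exports in Python, the same distinction shows up in the standard codecs. This sketch illustrates the two cases and does not describe the tool's parser: decoding with `utf-8-sig` drops a BOM at the very start of the data, while a BOM inside a later cell survives decoding and has to be removed as a character.

```python
# BOM at the very start of the file.
raw = "\ufeffref,answer\nABC123,Yes\n".encode("utf-8")
print(raw[:3])                            # b'\xef\xbb\xbf'

plain = raw.decode("utf-8")               # BOM stays as the first character
sig = raw.decode("utf-8-sig")             # leading BOM is dropped
print(plain.startswith("\ufeff"), sig.startswith("ref"))

# BOM wedged into the first data cell instead.
mid_cell = "ref,answer\n\ufeffABC123,Yes\n".encode("utf-8").decode("utf-8-sig")
print("\ufeff" in mid_cell)               # True: the codec only handles a leading BOM
```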

Are emoji removed?

Yes. Emoji are symbols, not letters, so they are deleted with default toggles. There is no keep-emoji option.

Does it change my form's question text (the headers)?

No. The header row is preserved exactly. To clean a dirty question/header, use the header rename tool.

What formats can I upload and download?

Upload XLSX, XLS, ODS, or CSV. Download mirrors the input: .stripped.xlsx for spreadsheets, .stripped.csv for CSV.

Is respondent data uploaded to a server?

No. All parsing and stripping run in your browser. PII in form responses never leaves your machine.

Can I clean just the email column?

Not directly — the filter runs on all data columns. To isolate one field, export a single-column file, or pull the field first with the regex extractor.

Will it fix duplicate responses?

No. It removes characters; it does not deduplicate rows. After cleaning hidden characters (which is what makes duplicates look distinct), deduplicate with the deduplicator.
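
The order matters because exact-match deduplication only works once the hidden characters are gone. A minimal sketch with made-up rows; `dedupe` is a hypothetical stand-in for the deduplicator and `strip_cell` models the default toggles (ASCII digits and `string.punctuation` assumed).

```python
import string
import unicodedata

def strip_cell(value):
    """Assumed default filter: Unicode letters, ASCII digits, ASCII space, ASCII punctuation."""
    return "".join(
        ch for ch in value
        if unicodedata.category(ch).startswith("L")
        or ch in string.digits
        or ch == " "
        or ch in string.punctuation
    )

def dedupe(rows):
    """Drop exact duplicate rows, keeping the first occurrence and the original order."""
    return [list(row) for row in dict.fromkeys(tuple(row) for row in rows)]

rows = [
    ["user@x.com\u200b", "Ana"],   # pasted address with a zero-width space
    ["user@x.com", "Ana"],         # same respondent, typed cleanly
    ["user2@x.com", "Bram"],
]

print(len(dedupe(rows)))           # 3: the hidden character hides the duplicate
cleaned = [[strip_cell(cell) for cell in row] for row in rows]
print(len(dedupe(cleaned)))        # 2: the duplicate is now detectable
```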

How many responses can I process at once?

Free: 10,000 rows / 5 MB / 1 file. Pro: 100,000 rows / 50 MB / 5 files. Pro-media: 500,000 rows / 200 MB / 20 files. Developer: unlimited rows / 500 MB.

Does removing a character leave a gap?

No. Deleted characters close up — An\u00a0ya becomes Anya. Check the preview if you are worried a removed separator caused words to merge.

Privacy first

Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.

