Fix CRM Export Whitespace in Excel Before Database Import

How to trim crm excel exports before you load them into sql

Step 1
Export from the CRM (don't pre-clean) — Run your Salesforce report export / Data Loader, HubSpot list export, Zoho or Pipedrive export to .xlsx or .csv. Leave it as-is — let the trimmer do the cleaning.
Step 2
Drop the export on the trimmer — Drag the file in. SheetJS parses it locally; the data never leaves the browser. Multi-sheet exports are processed first-sheet-only, so export one object (Accounts, Contacts, Deals) per file.
Step 3
Let the single pass run — There is nothing to configure. Every cell gets a leading/trailing trim — including edge CHAR 160 — in one in-memory pass.
Step 4
Download the cleaned staging file — An .xlsx export comes back as <name>-processed.xlsx; a CSV stays CSV. The result is values-only, ready to stage.
Step 5
Load into your database — Run your \copy / COPY FROM / LOAD DATA INFILE / BULK INSERT against the cleaned file. Foreign-key joins on name fields and UNIQUE constraints on email/account now behave because the keys match byte-for-byte.
Step 6
Spot-check the keys — After load, run SELECT company, COUNT(*) FROM staging GROUP BY company HAVING COUNT(*)<>1 (or compare against the CRM's record count). Counts that finally line up confirm the edge-space variants collapsed.

Where CRM exports hide whitespace

Common edge-whitespace sources by platform and the database symptom they cause. The trimmer addresses the edge-space causes; case and duplicate issues need the linked siblings.

CRM source	Typical edge-whitespace	Database symptom
Salesforce report / Data Loader	Trailing space in Account Name, Billing fields	`JOIN` on account name returns fewer rows than expected
HubSpot list export	Edge CHAR 160 in rich-text company / notes fields	`WHERE email = ?` misses; constraint admits a near-duplicate
Zoho CRM export	Leading space from copy-pasted phone / address	Phone lookup fails; dedup keeps both variants
Pipedrive export	Trailing newline pasted into deal title	`GROUP BY` splits one deal stage into two buckets
Any web-sourced field	Tabs / NBSP from pasted web content	Exact-match key comparison silently fails

What this tool fixes vs. what needs a sibling tool

Scope it precisely so the pre-load checklist routes each problem to the right tool. All siblings below accept XLSX as well as CSV.

Problem	Right tool	Link
Leading/trailing spaces, tabs, newlines, edge CHAR 160	This trimmer	(current tool)
Internal `Acme Corp` double spaces	Find/replace `{2,}` →	csv-find-replace
Case mismatch (`acme` vs `Acme`) breaking keys	Case converter	csv-case-converter
Duplicate account/contact rows	Deduplicator	csv-deduplicator
Garbled non-ASCII / smart quotes in keys	Special-char stripper	csv-special-char-stripper
Blank rows from empty CRM groups	Empty-row remover	csv-empty-row-remover

Tier limits for CRM-sized exports

Per-file Excel-family limits. A 10,000-contact Salesforce export fits the free tier; full account/contact dumps usually need Pro.

Tier	Max file size	Max rows	Files at once
Free	5 MB	10,000	1
Pro	50 MB	100,000	5
Pro-media	200 MB	500,000	20
Developer	500 MB	Unlimited	Unlimited

Cookbook

Pre-load recipes for the most common CRM-to-SQL whitespace failures. PII shown is anonymised.

Salesforce account name trailing space breaks a JOIN

Accounts export carries Acme Corp (trailing space). The Opportunities table has Acme Corp. The join on account name silently drops every Acme opportunity. Trim the Accounts export before staging.

accounts.xlsx (before):     accounts-processed.xlsx (after):
account_name                account_name
Acme Corp·  (trailing)      Acme Corp
Globex                      Globex

SQL after load:
SELECT * FROM opps o JOIN accounts a USING(account_name);
-- before: 0 Acme rows    after: all Acme opps join

HubSpot CHAR 160 defeats Excel TRIM but not this tool

A rich-text company field pasted from a web page ends in CHAR 160. A teammate already ran Excel =TRIM() and it 'looked clean' — but CHAR 160 survives Excel's TRIM and the unique constraint still admits a duplicate. This tool removes the edge CHAR 160.

Before (· = CHAR 160):
company
Initech·
Initech

Excel =TRIM("Initech·")  →  "Initech·"   (still distinct → 2 rows pass UNIQUE)
Whitespace Trimmer       →  "Initech"    (both collapse → constraint catches dup)

Pipedrive deal title trailing newline splits a GROUP BY

A deal title pasted from another tab ends in a newline. After load, GROUP BY stage shows 'Negotiation' and 'Negotiation\n' as two buckets. The trimmer removes the trailing newline.

Before (\n = newline):       After:
deal_title                   deal_title
Negotiation\n                Negotiation
Negotiation                  Negotiation

GROUP BY before: 2 buckets    after: 1 bucket

Full pre-load pipeline for a Salesforce contact dump

Order matters: trim edges first so case-folding and dedup operate on clean keys. Each step is a separate tool; all accept XLSX.

1. Whitespace Trimmer   strip edge spaces/CHAR 160      → contacts-processed.xlsx
2. csv-case-converter   lower-case the email column       (/tool/csv-case-converter)
3. csv-deduplicator     drop duplicate emails             (/tool/csv-deduplicator)
4. \copy staging FROM 'contacts-clean.csv' CSV HEADER;

Why not just TRIM() in SQL on every column?

Wrapping every text column in TRIM() inside your load mapping is brittle (you forget a column; TRIM in SQL also ignores CHAR 160 in many engines) and re-runs on every reload. Cleaning the file once before load is simpler and catches CHAR 160 that SQL TRIM misses.

Brittle:  INSERT ... SELECT TRIM(c1), TRIM(c2), ... (24 columns, easy to miss one)
          -- and Postgres TRIM(' ' ...) does not strip CHAR 160 by default
Clean:    run file through Whitespace Trimmer once → \copy directly

Edge cases and what actually happens

SQL TRIM() ignored CHAR 160 but this tool removes it

Expected

Many database TRIM() implementations strip only the ASCII space by default, leaving the non-breaking space (CHAR 160) that rich-text CRM fields carry. This tool's trim removes edge CHAR 160, so a value that stayed 'distinct' after SELECT TRIM(col) will collapse correctly here before it ever reaches the database.

Internal double spaces in addresses remain

By design

12 Main St keeps its internal double spaces — the tool trims edges only. If your address-matching depends on single-spaced strings, follow with csv-find-replace using {2,} → before staging.

Case mismatch is not fixed

Out of scope

Acme and acme will still be two distinct keys after trimming — this tool does not change case. Fold case with csv-case-converter if your join or unique key is case-sensitive in the database.

Duplicate rows are not removed

Out of scope

Trimming makes near-duplicates identical, but it does not delete them. Run csv-deduplicator after trimming so the now-identical rows collapse to one before they hit a UNIQUE constraint.

Multi-object export in one workbook — sheet 1 only

First sheet only

If a single .xlsx has Accounts on sheet 1 and Contacts on sheet 2, only Accounts is processed and returned. Export each CRM object to its own file (Salesforce, HubSpot, Zoho, Pipedrive all let you do this) and trim them separately.

Number/date formatting not preserved

Formatting lost

Because the file passes through a plain-text stage, currency/date cell formats are not carried to the output .xlsx. The displayed text is preserved. For loads this is usually fine (you cast types on import), but cast date and numeric columns explicitly in your COPY/LOAD mapping.

Phone numbers with leading zeros

Text preserved

If a phone or account number with a leading zero is stored as text in the source, its text is preserved (only edge whitespace is removed). If it was stored as a number in Excel, the leading zero was already gone before export — that is a CRM/Excel typing issue, not a trimming one.

Export exceeds the free tier

Blocked

A full contact dump above 5 MB / 10,000 rows is blocked on the free tier. Either upgrade to Pro (50 MB / 100,000 rows) or split the export into chunks and trim each before loading.

Embedded line breaks inside a quoted CSV field

Preserved

A correctly quoted multi-line notes field is parsed as one cell; the trim removes only leading/trailing whitespace of that whole value, leaving internal line breaks intact. If the source CSV had an *unquoted* newline, the row was already malformed at export — re-export with proper quoting.

Encrypted CRM export

Cannot open

Some CRMs offer password-protected exports. An encrypted workbook cannot be read by the browser parser — export without encryption (or decrypt in Excel) before trimming.

Frequently asked questions

Does Salesforce really add trailing spaces?

Frequently — usually because a rep typed a stray space into a free-text field (Account Name, Billing Street) and Salesforce stores values verbatim. Report exports and Data Loader output carry those spaces straight into the file, where they silently break joins after load. Trimming the export before staging removes them.

Will this also remove non-breaking spaces from HubSpot rich-text fields?

At the edges, yes. Rich-text CRM fields pasted from web content commonly carry CHAR 160 (the non-breaking space). This tool's trim removes edge CHAR 160 — which both Excel's TRIM() and many database TRIM() functions ignore — so your keys actually match after load.

Should I trim before or after loading into the database?

Before. Cleaning the file once is simpler and more reliable than wrapping every text column in TRIM() in your load mapping (easy to miss a column, and SQL TRIM() often skips CHAR 160). Trim → optionally case-fold → dedup → load.

Does it fix case differences like 'acme' vs 'Acme'?

No — it only removes whitespace. For case-sensitive keys, fold case with csv-case-converter after trimming. Many DBAs lowercase email and code columns as part of staging for exactly this reason.

Will it remove duplicate account/contact rows?

No, but it is the right first step. Trimming makes Acme and Acme identical; then csv-deduplicator collapses the now-identical rows. Running dedup before trimming would miss the whitespace-hidden duplicates.

Is my CRM data sent to a server?

No. Parsing and trimming happen entirely in your browser via SheetJS. Account names, contact emails, deal values, and notes never leave your machine — important for PII and any data-processing-agreement constraints. Only an anonymous run counter is recorded when signed in.

My export has Accounts and Contacts on separate tabs — will both be cleaned?

No. The XLSX path processes the first worksheet only and returns a single-sheet file. Export each CRM object to its own file and trim them individually.

Can I use this for Shopify or WooCommerce order exports too?

Yes. Any export that produces .xlsx or .csv works. E-commerce exports are a common source of whitespace-padded SKUs and product titles that break inventory joins exactly the way CRM keys do.

Will trimming change my numeric fields like amount or quantity?

No. Numeric cells have no surrounding text to trim and are passed through unchanged in value. Note that custom number/currency *formatting* is not preserved across the round-trip, so cast numeric columns explicitly in your load mapping.

What about internal double spaces in addresses?

Those are left in place — the tool trims edges only. If your address matching needs single-spaced strings, run csv-find-replace with the regex {2,} replaced by one space after trimming.

How large a CRM export can I clean?

Free tier handles 5 MB / 10,000 rows — fine for a mid-size contact list. Pro raises it to 50 MB / 100,000 rows, Pro-media to 200 MB / 500,000, and Developer to 500 MB / unlimited. Browser memory is the practical ceiling on very large dumps.

Can I automate this in an ETL pipeline?

The trimming logic is identical to the canonical CSV whitespace trimmer at /tool/csv-whitespace-trimmer, which exposes the runner API for scripted local runs. The Excel page itself is a browser tool — for a headless pipeline, point your job at the canonical CSV tool, which accepts XLSX input.

Privacy first

Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.

How to trim crm excel exports before you load them into sql

Step 1
Export from the CRM (don't pre-clean) — Run your Salesforce report export / Data Loader, HubSpot list export, Zoho or Pipedrive export to .xlsx or .csv. Leave it as-is — let the trimmer do the cleaning.
Step 2
Drop the export on the trimmer — Drag the file in. SheetJS parses it locally; the data never leaves the browser. Multi-sheet exports are processed first-sheet-only, so export one object (Accounts, Contacts, Deals) per file.
Step 3
Let the single pass run — There is nothing to configure. Every cell gets a leading/trailing trim — including edge CHAR 160 — in one in-memory pass.
Step 4
Download the cleaned staging file — An .xlsx export comes back as <name>-processed.xlsx; a CSV stays CSV. The result is values-only, ready to stage.
Step 5
Load into your database — Run your \copy / COPY FROM / LOAD DATA INFILE / BULK INSERT against the cleaned file. Foreign-key joins on name fields and UNIQUE constraints on email/account now behave because the keys match byte-for-byte.
Step 6
Spot-check the keys — After load, run SELECT company, COUNT(*) FROM staging GROUP BY company HAVING COUNT(*)<>1 (or compare against the CRM's record count). Counts that finally line up confirm the edge-space variants collapsed.

Where CRM exports hide whitespace

Common edge-whitespace sources by platform and the database symptom they cause. The trimmer addresses the edge-space causes; case and duplicate issues need the linked siblings.

CRM source	Typical edge-whitespace	Database symptom
Salesforce report / Data Loader	Trailing space in Account Name, Billing fields	`JOIN` on account name returns fewer rows than expected
HubSpot list export	Edge CHAR 160 in rich-text company / notes fields	`WHERE email = ?` misses; constraint admits a near-duplicate
Zoho CRM export	Leading space from copy-pasted phone / address	Phone lookup fails; dedup keeps both variants
Pipedrive export	Trailing newline pasted into deal title	`GROUP BY` splits one deal stage into two buckets
Any web-sourced field	Tabs / NBSP from pasted web content	Exact-match key comparison silently fails

What this tool fixes vs. what needs a sibling tool

Scope it precisely so the pre-load checklist routes each problem to the right tool. All siblings below accept XLSX as well as CSV.

Problem	Right tool	Link
Leading/trailing spaces, tabs, newlines, edge CHAR 160	This trimmer	(current tool)
Internal `Acme Corp` double spaces	Find/replace `{2,}` →	csv-find-replace
Case mismatch (`acme` vs `Acme`) breaking keys	Case converter	csv-case-converter
Duplicate account/contact rows	Deduplicator	csv-deduplicator
Garbled non-ASCII / smart quotes in keys	Special-char stripper	csv-special-char-stripper
Blank rows from empty CRM groups	Empty-row remover	csv-empty-row-remover

Tier limits for CRM-sized exports

Per-file Excel-family limits. A 10,000-contact Salesforce export fits the free tier; full account/contact dumps usually need Pro.

Tier	Max file size	Max rows	Files at once
Free	5 MB	10,000	1
Pro	50 MB	100,000	5
Pro-media	200 MB	500,000	20
Developer	500 MB	Unlimited	Unlimited

Cookbook

Pre-load recipes for the most common CRM-to-SQL whitespace failures. PII shown is anonymised.

Salesforce account name trailing space breaks a JOIN

Accounts export carries Acme Corp (trailing space). The Opportunities table has Acme Corp. The join on account name silently drops every Acme opportunity. Trim the Accounts export before staging.

accounts.xlsx (before):     accounts-processed.xlsx (after):
account_name                account_name
Acme Corp·  (trailing)      Acme Corp
Globex                      Globex

SQL after load:
SELECT * FROM opps o JOIN accounts a USING(account_name);
-- before: 0 Acme rows    after: all Acme opps join

HubSpot CHAR 160 defeats Excel TRIM but not this tool

Before (· = CHAR 160):
company
Initech·
Initech

Excel =TRIM("Initech·")  →  "Initech·"   (still distinct → 2 rows pass UNIQUE)
Whitespace Trimmer       →  "Initech"    (both collapse → constraint catches dup)

Pipedrive deal title trailing newline splits a GROUP BY

A deal title pasted from another tab ends in a newline. After load, GROUP BY stage shows 'Negotiation' and 'Negotiation\n' as two buckets. The trimmer removes the trailing newline.

Before (\n = newline):       After:
deal_title                   deal_title
Negotiation\n                Negotiation
Negotiation                  Negotiation

GROUP BY before: 2 buckets    after: 1 bucket

Full pre-load pipeline for a Salesforce contact dump

Order matters: trim edges first so case-folding and dedup operate on clean keys. Each step is a separate tool; all accept XLSX.

1. Whitespace Trimmer   strip edge spaces/CHAR 160      → contacts-processed.xlsx
2. csv-case-converter   lower-case the email column       (/tool/csv-case-converter)
3. csv-deduplicator     drop duplicate emails             (/tool/csv-deduplicator)
4. \copy staging FROM 'contacts-clean.csv' CSV HEADER;

Why not just TRIM() in SQL on every column?

Brittle:  INSERT ... SELECT TRIM(c1), TRIM(c2), ... (24 columns, easy to miss one)
          -- and Postgres TRIM(' ' ...) does not strip CHAR 160 by default
Clean:    run file through Whitespace Trimmer once → \copy directly

Edge cases and what actually happens

SQL TRIM() ignored CHAR 160 but this tool removes it

Expected

Internal double spaces in addresses remain

By design

12 Main St keeps its internal double spaces — the tool trims edges only. If your address-matching depends on single-spaced strings, follow with csv-find-replace using {2,} → before staging.

Case mismatch is not fixed

Out of scope

Acme and acme will still be two distinct keys after trimming — this tool does not change case. Fold case with csv-case-converter if your join or unique key is case-sensitive in the database.

Duplicate rows are not removed

Out of scope

Trimming makes near-duplicates identical, but it does not delete them. Run csv-deduplicator after trimming so the now-identical rows collapse to one before they hit a UNIQUE constraint.

Multi-object export in one workbook — sheet 1 only

First sheet only

Number/date formatting not preserved

Formatting lost

Phone numbers with leading zeros

Text preserved

Export exceeds the free tier

Blocked

A full contact dump above 5 MB / 10,000 rows is blocked on the free tier. Either upgrade to Pro (50 MB / 100,000 rows) or split the export into chunks and trim each before loading.

Embedded line breaks inside a quoted CSV field

Preserved

Encrypted CRM export

Cannot open

Some CRMs offer password-protected exports. An encrypted workbook cannot be read by the browser parser — export without encryption (or decrypt in Excel) before trimming.

Frequently asked questions

Does Salesforce really add trailing spaces?

Will this also remove non-breaking spaces from HubSpot rich-text fields?

Should I trim before or after loading into the database?

Does it fix case differences like 'acme' vs 'Acme'?

No — it only removes whitespace. For case-sensitive keys, fold case with csv-case-converter after trimming. Many DBAs lowercase email and code columns as part of staging for exactly this reason.

Will it remove duplicate account/contact rows?

Is my CRM data sent to a server?

My export has Accounts and Contacts on separate tabs — will both be cleaned?

No. The XLSX path processes the first worksheet only and returns a single-sheet file. Export each CRM object to its own file and trim them individually.

Can I use this for Shopify or WooCommerce order exports too?

Yes. Any export that produces .xlsx or .csv works. E-commerce exports are a common source of whitespace-padded SKUs and product titles that break inventory joins exactly the way CRM keys do.

Will trimming change my numeric fields like amount or quantity?

What about internal double spaces in addresses?

Those are left in place — the tool trims edges only. If your address matching needs single-spaced strings, run csv-find-replace with the regex {2,} replaced by one space after trimming.

Trim CRM Excel Exports Before You Load Them into SQL

How to trim crm excel exports before you load them into sql

Where CRM exports hide whitespace

What this tool fixes vs. what needs a sibling tool

Tier limits for CRM-sized exports

Cookbook

Salesforce account name trailing space breaks a JOIN

HubSpot CHAR 160 defeats Excel TRIM but not this tool

Pipedrive deal title trailing newline splits a GROUP BY

Full pre-load pipeline for a Salesforce contact dump

Why not just TRIM() in SQL on every column?

Edge cases and what actually happens

SQL TRIM() ignored CHAR 160 but this tool removes it

Internal double spaces in addresses remain

Case mismatch is not fixed

Duplicate rows are not removed

Multi-object export in one workbook — sheet 1 only

Number/date formatting not preserved

Phone numbers with leading zeros

Export exceeds the free tier

Embedded line breaks inside a quoted CSV field

Encrypted CRM export

Frequently asked questions

Does Salesforce really add trailing spaces?

Will this also remove non-breaking spaces from HubSpot rich-text fields?

Should I trim before or after loading into the database?

Does it fix case differences like 'acme' vs 'Acme'?

Will it remove duplicate account/contact rows?

Is my CRM data sent to a server?

My export has Accounts and Contacts on separate tabs — will both be cleaned?

Can I use this for Shopify or WooCommerce order exports too?

Will trimming change my numeric fields like amount or quantity?

What about internal double spaces in addresses?

How large a CRM export can I clean?

Can I automate this in an ETL pipeline?

Privacy first

Related guides

Trim CRM Excel Exports Before You Load Them into SQL

How to trim crm excel exports before you load them into sql

Where CRM exports hide whitespace

What this tool fixes vs. what needs a sibling tool

Tier limits for CRM-sized exports

Cookbook

Salesforce account name trailing space breaks a JOIN

HubSpot CHAR 160 defeats Excel TRIM but not this tool

Pipedrive deal title trailing newline splits a GROUP BY

Full pre-load pipeline for a Salesforce contact dump

Why not just TRIM() in SQL on every column?

Edge cases and what actually happens

SQL TRIM() ignored CHAR 160 but this tool removes it

Internal double spaces in addresses remain

Case mismatch is not fixed

Duplicate rows are not removed

Multi-object export in one workbook — sheet 1 only

Number/date formatting not preserved

Phone numbers with leading zeros

Export exceeds the free tier

Embedded line breaks inside a quoted CSV field

Encrypted CRM export

Frequently asked questions

Does Salesforce really add trailing spaces?

Will this also remove non-breaking spaces from HubSpot rich-text fields?

Should I trim before or after loading into the database?

Does it fix case differences like 'acme' vs 'Acme'?

Will it remove duplicate account/contact rows?

Is my CRM data sent to a server?

My export has Accounts and Contacts on separate tabs — will both be cleaned?

Can I use this for Shopify or WooCommerce order exports too?

Will trimming change my numeric fields like amount or quantity?

What about internal double spaces in addresses?

How large a CRM export can I clean?

Can I automate this in an ETL pipeline?

Privacy first

Related guides