Parse Salesforce XML Data Export to JSON — Free Online Tool

How to convert salesforce xml data export to json

Step 1
Run a Salesforce Data Export — In Salesforce Setup → Data → Data Export, request a full or incremental export (XML format). When the email arrives, download the ZIP and extract the per-object XML files — Account.xml, Contact.xml, Opportunity.xml, etc.
Step 2
Open the XML to JSON tool — This is a Pro tool. Drop one object's .xml file onto the dropzone. The free tier converts files up to 2 MB for evaluation; signed-in Pro raises the per-file limit to 100 MB, which covers most single-object exports.
Step 3
Enable Strip namespaces — Tick Strip namespaces to drop the sf: prefix so keys match Salesforce API names. This also removes the xmlns:sf declaration attribute, leaving clean prefix-free JSON ready for your warehouse schema.
Step 4
Choose your type-coercion stance — Leave Coerce types on for numeric Amount/Quantity fields and boolean flags. Turn it OFF if any object has zero-padded external IDs or reference codes you must preserve exactly — coercion would strip the leading zeros.
Step 5
Convert and inspect the record array — Click Convert to JSON. Salesforce wraps records in a root element with repeated record children, so multi-record exports produce an array of record objects. Verify the array shape, then download the JSON.
Step 6
Load into your pipeline, then flatten if needed — Read the JSON with pd.read_json(), stage it for COPY INTO in Snowflake, or feed your migration script. Nested lookup objects are NOT flattened by this tool — use json-flattener to collapse Account.Name into Account_Name columns for a tabular load.

Options mapped to Salesforce ETL outcomes

The real controls in the tool and what they do to a Salesforce export. attributePrefix (@) and textNodeName (#text) are fixed and not shown in the UI.

Option	Salesforce effect	Default
Strip namespaces	`<sf:Name>`→`Name`; matches API field names and warehouse columns. Removes the `xmlns:sf` declaration too	Off (enable for SF exports)
Coerce types	`<Amount>5000</Amount>`→`5000` (number), `<IsWon>true</IsWon>`→`true` (boolean). Off keeps all values as strings	On
Parse attributes	Salesforce data XML rarely uses attributes; leaving on or off has little effect on field values. On maps any attribute to `@key`	On
Indent	Output formatting only — choose 2/4 spaces for readable diffs, Minified for compact loads	2 spaces

Salesforce field types after conversion

How common Salesforce field shapes land in the JSON, and the follow-up step they often need.

Salesforce field	JSON result	Follow-up
18-char Record ID `001D000000abcXY...`	String (kept as-is — has letters)	None — safe; never coerced to a number
Currency `<Amount>129.50</Amount>`	`129.5` (number) with Coerce types on	Fine for analytics; cast to DECIMAL on load if exactness matters
Custom field `<Region__c>EMEA</Region__c>`	`"Region__c": "EMEA"` (suffix kept)	json-key-renamer to drop `__c` if your target dislikes it
Lookup `<Account><Name>Acme</Name><Id>001..</Id></Account>`	Nested object `{Account:{Name,Id}}` — NOT flattened	json-flattener → `Account.Name` / `Account_Name`
Multi-select picklist `<Tags>A;B;C</Tags>`	Single string `"A;B;C"` (Salesforce uses `;` inside one field)	Split in dbt/pandas — the tool does not split delimited strings
Zero-padded external ID `<Ref__c>00042</Ref__c>`	`42` with Coerce types ON (zeros lost) / `"00042"` with it OFF	Turn Coerce types OFF to preserve

Cookbook

Real Salesforce export fragments and the exact JSON the converter returns. Org IDs and PII anonymised.

Stripping the sf: namespace to match API names

Example

Salesforce wraps record fields in the sf: namespace. Strip namespaces gives you keys that match the API names your warehouse schema already uses.

Input (Account.xml fragment):
<records xmlns:sf="urn:sobject.enterprise.soap.sforce.com">
  <sf:Account>
    <sf:Id>001D000000abcXY</sf:Id>
    <sf:Name>Acme Corp</sf:Name>
    <sf:AnnualRevenue>5000000</sf:AnnualRevenue>
  </sf:Account>
</records>

Options: Strip namespaces ON, Coerce types ON

Output:
{
  "records": {
    "Account": {
      "Id": "001D000000abcXY",
      "Name": "Acme Corp",
      "AnnualRevenue": 5000000
    }
  }
}

Single record is an object, many records are an array

Example

A test export with one record produces an object on the record key; production exports with many records produce an array. Plan your load loop to handle both, or always test with a multi-record file.

One <Account> in the file:
{ "records": { "Account": { "Id": "001..", "Name": "Acme" } } }

Many <Account> in the file:
{ "records": { "Account": [ {..}, {..}, {..} ] } }

Safe pandas load:
import pandas as pd, json
d = json.load(open('Account.json'))
recs = d['records']['Account']
df = pd.json_normalize(recs if isinstance(recs, list) else [recs])

Lookup relationships stay nested — flatten for a table

Example

The tool does NOT flatten lookups to Account_Name. A nested <Account> becomes a nested object. Run json-flattener afterward to get the flat columns dbt and Snowflake expect.

Converted JSON (lookup kept nested):
{ "Contact": { "LastName": "Lee",
               "Account": { "Name": "Acme", "Id": "001.." } } }

Then /tool/json-flattener (dot mode):
{ "Contact.LastName": "Lee",
  "Contact.Account.Name": "Acme",
  "Contact.Account.Id": "001.." }
=> now one row, one column per leaf, ready for COPY INTO.

Protecting zero-padded reference fields

Example

Coerce types turns numeric-looking strings into numbers, stripping leading zeros from padded external IDs. Turn coercion off to keep them exact, then cast only the truly numeric columns in dbt.

Input:  <Account><ExternalRef__c>000417</ExternalRef__c>
                  <NumEmployees>250</NumEmployees></Account>

Coerce types ON:
{ "ExternalRef__c": 417, "NumEmployees": 250 }   <-- ref broken

Coerce types OFF:
{ "ExternalRef__c": "000417", "NumEmployees": "250" } <-- safe
=> cast NumEmployees to INT in your model, keep the ref as TEXT.

Working through a multi-object export ZIP

Example

The export ZIP holds one XML per object. Convert each separately, name the JSON after the object, then chain warehouse-prep tools so the mapping stays auditable.

ZIP contents:  Account.xml  Contact.xml  Opportunity.xml ...

Per object:
  Account.xml   -> xml-to-json (Strip NS) -> Account.json
  Contact.xml   -> xml-to-json (Strip NS) -> Contact.json
  ...
Then:
  /tool/json-flattener  -> tabular columns
  /tool/json-to-csv     -> CSV for a Snowflake stage load
Import order: Account before Contact before Opportunity
(so foreign-key lookups resolve).

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

Lookup relationships are NOT flattened

By design

A nested <Account> inside a <Contact> stays a nested JSON object — the tool does not produce Account_Name or Account.Name flat keys on its own. Run json-flattener after conversion to collapse the hierarchy into dot- or underscore-keyed columns for a tabular load.

Single-record export yields an object, not an array

Singleton trap

If an object's XML contains exactly one record, the record key holds an object; with two-or-more it holds an array. ETL code that always does for rec in data['records']['Account'] will iterate the object's keys when there is one record. Normalise to a list (recs if isinstance(recs, list) else [recs]) or test on a multi-record export.

Coerce types strips leading zeros from reference fields

Data loss

Zero-padded external IDs or reference codes (000417) become plain integers (417) when Coerce types is on. Salesforce auto-number and external-ID fields frequently use padding. Turn Coerce types OFF to keep all values as strings, then cast only the genuinely numeric columns downstream.

Multi-select picklists arrive as one delimited string

Not split

Salesforce stores multi-select picklist values inside a single field separated by ; (e.g. Hot;Warm). The XML has one element, so you get one string — the tool does not split it. Split it in pandas/dbt, or after exporting to CSV use a CSV column splitter, then re-import.

18-character IDs are safe; large numeric fields stay strings

Preserved

Salesforce 15/18-character IDs contain letters, so they are always strings — no precision risk. Any genuinely large integer field (rare in Salesforce, but possible in custom number fields) that would exceed JavaScript's safe-integer range is also kept as a string by fast-xml-parser, so no silent rounding occurs.

No filtering of records or fields during conversion

Not supported

There is no option to keep only certain objects, statuses, or fields. The tool converts the whole file. To extract a subset (e.g. only Won Opportunities or only specific columns), convert first, then use json-key-filter or json-path-extractor on the JSON.

Empty Salesforce fields become empty strings

Expected

A null/blank Salesforce field exported as <Description/> or <Description></Description> becomes "" (empty string), not null. If your warehouse needs true nulls, post-process with json-null-stripper (to drop empties) or convert ""→null in your load logic.

Malformed export XML parses without an error

Silent parse

fast-xml-parser tolerates minor malformation without throwing. If a Data Export was truncated or corrupted in transit, you may get a structurally wrong object rather than a clear failure. Verify the file size against the export report and validate in a strict XML validator if a load looks short.

Free tier caps evaluation files at 2 MB

Plan limit

XML to JSON is a Pro tool; the free tier converts files up to 2 MB. Many single-object Salesforce exports exceed that. Use Pro (100 MB per file) for full objects, or split a large object by date range with Salesforce's incremental export before converting.

Frequently asked questions

How do Salesforce lookup (relationship) fields appear in the JSON?

As nested objects, exactly as the XML nests them — {Contact:{Account:{Name,Id}}}. The tool does NOT flatten them to Account_Name or Account.Name. Use json-flattener after conversion to collapse the hierarchy into flat columns for a warehouse load.

Will my 18-character Salesforce IDs stay intact?

Yes. They contain letters, so they are always parsed as strings — there is no risk of numeric coercion or precision loss. The same is true for 15-character case-sensitive IDs.

How do I keep zero-padded external IDs from being mangled?

Turn Coerce types OFF before converting. With it on, 000417 becomes 417. With it off, every value stays a string and your padded references are preserved exactly; you then cast only the genuinely numeric columns in dbt or pandas.

Does the tool split multi-select picklist values?

No. Salesforce stores multi-select picklists as a single ;-delimited string inside one field, and the tool returns that as one string. Split it downstream (pandas .str.split(';'), dbt, or a CSV column splitter after exporting to CSV).

Can I drop the __c suffix from custom field names?

Not during conversion — __c is kept verbatim so the mapping stays explicit. Rename after conversion with json-key-renamer, or alias the columns in your warehouse model. Many teams keep __c precisely because it makes custom-vs-standard fields obvious in the schema.

My export ZIP has dozens of object files — how should I work through them?

Convert each object's XML separately and name the JSON after the object (Account.json, Contact.json). This keeps the mapping auditable and lets you load in dependency order (Account → Contact → Opportunity) so lookup references resolve. Chain json-flattener and json-to-csv for warehouse staging.

Why is a one-record export not an array?

fast-xml-parser only creates an array when an element repeats. A single record stays an object on the record key. Normalise in your loader (recs if isinstance(recs, list) else [recs]) or test against a multi-record export so your code handles both shapes.

Can I filter to only certain objects or statuses while converting?

No — the tool converts the whole file with no filtering controls. Convert first, then use json-key-filter to keep specific fields or json-path-extractor to pull, say, only Won opportunities with a filter expression.

Are empty Salesforce fields converted to null?

No — <Field/> becomes an empty string "", not null. If your warehouse needs real nulls, run json-null-stripper to drop empties, or convert ""→null in your load step.

Is my CRM data uploaded during conversion?

No. Conversion is entirely client-side via fast-xml-parser. Account names, pipeline revenue, and contact PII never reach JAD Apps servers — only an anonymous run counter (no content) is recorded for signed-in dashboard stats.

How large a Salesforce object can I convert at once?

This is a Pro tool: free-tier evaluation caps files at 2 MB; signed-in Pro raises the per-file limit to 100 MB. For very large objects, use Salesforce's incremental (date-range) export to produce smaller files, then convert and load in batches.

Can I export Salesforce data as JSON directly instead of XML?

Salesforce's weekly Data Export only offers CSV or XML. If you choose CSV instead, use csv-to-json — it is simpler for flat objects. Pick XML (and this tool) when you need the nested relationship structure that CSV flattens away, then use json-flattener only where you want tabular columns.

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

How to convert salesforce xml data export to json

Step 1
Run a Salesforce Data Export — In Salesforce Setup → Data → Data Export, request a full or incremental export (XML format). When the email arrives, download the ZIP and extract the per-object XML files — Account.xml, Contact.xml, Opportunity.xml, etc.
Step 2
Open the XML to JSON tool — This is a Pro tool. Drop one object's .xml file onto the dropzone. The free tier converts files up to 2 MB for evaluation; signed-in Pro raises the per-file limit to 100 MB, which covers most single-object exports.
Step 3
Enable Strip namespaces — Tick Strip namespaces to drop the sf: prefix so keys match Salesforce API names. This also removes the xmlns:sf declaration attribute, leaving clean prefix-free JSON ready for your warehouse schema.
Step 4
Choose your type-coercion stance — Leave Coerce types on for numeric Amount/Quantity fields and boolean flags. Turn it OFF if any object has zero-padded external IDs or reference codes you must preserve exactly — coercion would strip the leading zeros.
Step 5
Convert and inspect the record array — Click Convert to JSON. Salesforce wraps records in a root element with repeated record children, so multi-record exports produce an array of record objects. Verify the array shape, then download the JSON.
Step 6
Load into your pipeline, then flatten if needed — Read the JSON with pd.read_json(), stage it for COPY INTO in Snowflake, or feed your migration script. Nested lookup objects are NOT flattened by this tool — use json-flattener to collapse Account.Name into Account_Name columns for a tabular load.

Options mapped to Salesforce ETL outcomes

The real controls in the tool and what they do to a Salesforce export. attributePrefix (@) and textNodeName (#text) are fixed and not shown in the UI.

Option	Salesforce effect	Default
Strip namespaces	`<sf:Name>`→`Name`; matches API field names and warehouse columns. Removes the `xmlns:sf` declaration too	Off (enable for SF exports)
Coerce types	`<Amount>5000</Amount>`→`5000` (number), `<IsWon>true</IsWon>`→`true` (boolean). Off keeps all values as strings	On
Parse attributes	Salesforce data XML rarely uses attributes; leaving on or off has little effect on field values. On maps any attribute to `@key`	On
Indent	Output formatting only — choose 2/4 spaces for readable diffs, Minified for compact loads	2 spaces

Salesforce field types after conversion

How common Salesforce field shapes land in the JSON, and the follow-up step they often need.

Salesforce field	JSON result	Follow-up
18-char Record ID `001D000000abcXY...`	String (kept as-is — has letters)	None — safe; never coerced to a number
Currency `<Amount>129.50</Amount>`	`129.5` (number) with Coerce types on	Fine for analytics; cast to DECIMAL on load if exactness matters
Custom field `<Region__c>EMEA</Region__c>`	`"Region__c": "EMEA"` (suffix kept)	json-key-renamer to drop `__c` if your target dislikes it
Lookup `<Account><Name>Acme</Name><Id>001..</Id></Account>`	Nested object `{Account:{Name,Id}}` — NOT flattened	json-flattener → `Account.Name` / `Account_Name`
Multi-select picklist `<Tags>A;B;C</Tags>`	Single string `"A;B;C"` (Salesforce uses `;` inside one field)	Split in dbt/pandas — the tool does not split delimited strings
Zero-padded external ID `<Ref__c>00042</Ref__c>`	`42` with Coerce types ON (zeros lost) / `"00042"` with it OFF	Turn Coerce types OFF to preserve

Cookbook

Real Salesforce export fragments and the exact JSON the converter returns. Org IDs and PII anonymised.

Stripping the sf: namespace to match API names

Example

Salesforce wraps record fields in the sf: namespace. Strip namespaces gives you keys that match the API names your warehouse schema already uses.

Input (Account.xml fragment):
<records xmlns:sf="urn:sobject.enterprise.soap.sforce.com">
  <sf:Account>
    <sf:Id>001D000000abcXY</sf:Id>
    <sf:Name>Acme Corp</sf:Name>
    <sf:AnnualRevenue>5000000</sf:AnnualRevenue>
  </sf:Account>
</records>

Options: Strip namespaces ON, Coerce types ON

Output:
{
  "records": {
    "Account": {
      "Id": "001D000000abcXY",
      "Name": "Acme Corp",
      "AnnualRevenue": 5000000
    }
  }
}

Single record is an object, many records are an array

Example

A test export with one record produces an object on the record key; production exports with many records produce an array. Plan your load loop to handle both, or always test with a multi-record file.

One <Account> in the file:
{ "records": { "Account": { "Id": "001..", "Name": "Acme" } } }

Many <Account> in the file:
{ "records": { "Account": [ {..}, {..}, {..} ] } }

Safe pandas load:
import pandas as pd, json
d = json.load(open('Account.json'))
recs = d['records']['Account']
df = pd.json_normalize(recs if isinstance(recs, list) else [recs])

Lookup relationships stay nested — flatten for a table

Example

The tool does NOT flatten lookups to Account_Name. A nested <Account> becomes a nested object. Run json-flattener afterward to get the flat columns dbt and Snowflake expect.

Converted JSON (lookup kept nested):
{ "Contact": { "LastName": "Lee",
               "Account": { "Name": "Acme", "Id": "001.." } } }

Then /tool/json-flattener (dot mode):
{ "Contact.LastName": "Lee",
  "Contact.Account.Name": "Acme",
  "Contact.Account.Id": "001.." }
=> now one row, one column per leaf, ready for COPY INTO.

Protecting zero-padded reference fields

Example

Coerce types turns numeric-looking strings into numbers, stripping leading zeros from padded external IDs. Turn coercion off to keep them exact, then cast only the truly numeric columns in dbt.

Input:  <Account><ExternalRef__c>000417</ExternalRef__c>
                  <NumEmployees>250</NumEmployees></Account>

Coerce types ON:
{ "ExternalRef__c": 417, "NumEmployees": 250 }   <-- ref broken

Coerce types OFF:
{ "ExternalRef__c": "000417", "NumEmployees": "250" } <-- safe
=> cast NumEmployees to INT in your model, keep the ref as TEXT.

Working through a multi-object export ZIP

Example

The export ZIP holds one XML per object. Convert each separately, name the JSON after the object, then chain warehouse-prep tools so the mapping stays auditable.

ZIP contents:  Account.xml  Contact.xml  Opportunity.xml ...

Per object:
  Account.xml   -> xml-to-json (Strip NS) -> Account.json
  Contact.xml   -> xml-to-json (Strip NS) -> Contact.json
  ...
Then:
  /tool/json-flattener  -> tabular columns
  /tool/json-to-csv     -> CSV for a Snowflake stage load
Import order: Account before Contact before Opportunity
(so foreign-key lookups resolve).

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

Lookup relationships are NOT flattened

By design

Single-record export yields an object, not an array

Singleton trap

Coerce types strips leading zeros from reference fields

Data loss

Multi-select picklists arrive as one delimited string

Not split

18-character IDs are safe; large numeric fields stay strings

Preserved

No filtering of records or fields during conversion

Not supported

Empty Salesforce fields become empty strings

Expected

Malformed export XML parses without an error

Silent parse

Free tier caps evaluation files at 2 MB

Plan limit

Frequently asked questions

How do Salesforce lookup (relationship) fields appear in the JSON?

Will my 18-character Salesforce IDs stay intact?

Yes. They contain letters, so they are always parsed as strings — there is no risk of numeric coercion or precision loss. The same is true for 15-character case-sensitive IDs.

How do I keep zero-padded external IDs from being mangled?

Does the tool split multi-select picklist values?

Can I drop the __c suffix from custom field names?

My export ZIP has dozens of object files — how should I work through them?

Why is a one-record export not an array?

Can I filter to only certain objects or statuses while converting?

Are empty Salesforce fields converted to null?

No — <Field/> becomes an empty string "", not null. If your warehouse needs real nulls, run json-null-stripper to drop empties, or convert ""→null in your load step.

Is my CRM data uploaded during conversion?

How large a Salesforce object can I convert at once?

Can I export Salesforce data as JSON directly instead of XML?

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

Convert Salesforce XML Data Export to JSON

How to convert salesforce xml data export to json

Options mapped to Salesforce ETL outcomes

Salesforce field types after conversion

Cookbook

Stripping the sf: namespace to match API names

Single record is an object, many records are an array

Lookup relationships stay nested — flatten for a table

Protecting zero-padded reference fields

Working through a multi-object export ZIP

Errors and edge cases

Lookup relationships are NOT flattened

Single-record export yields an object, not an array

Coerce types strips leading zeros from reference fields

Multi-select picklists arrive as one delimited string

18-character IDs are safe; large numeric fields stay strings

No filtering of records or fields during conversion

Empty Salesforce fields become empty strings

Malformed export XML parses without an error

Free tier caps evaluation files at 2 MB

Frequently asked questions

How do Salesforce lookup (relationship) fields appear in the JSON?

Will my 18-character Salesforce IDs stay intact?

How do I keep zero-padded external IDs from being mangled?

Does the tool split multi-select picklist values?

Can I drop the __c suffix from custom field names?

My export ZIP has dozens of object files — how should I work through them?

Why is a one-record export not an array?

Can I filter to only certain objects or statuses while converting?

Are empty Salesforce fields converted to null?

Is my CRM data uploaded during conversion?

How large a Salesforce object can I convert at once?

Can I export Salesforce data as JSON directly instead of XML?

Privacy first

Related guides

Convert Salesforce XML Data Export to JSON

How to convert salesforce xml data export to json

Options mapped to Salesforce ETL outcomes

Salesforce field types after conversion

Cookbook

Stripping the sf: namespace to match API names

Single record is an object, many records are an array

Lookup relationships stay nested — flatten for a table

Protecting zero-padded reference fields

Working through a multi-object export ZIP

Errors and edge cases

Lookup relationships are NOT flattened

Single-record export yields an object, not an array

Coerce types strips leading zeros from reference fields

Multi-select picklists arrive as one delimited string

18-character IDs are safe; large numeric fields stay strings

No filtering of records or fields during conversion

Empty Salesforce fields become empty strings

Malformed export XML parses without an error

Free tier caps evaluation files at 2 MB

Frequently asked questions

How do Salesforce lookup (relationship) fields appear in the JSON?

Will my 18-character Salesforce IDs stay intact?

How do I keep zero-padded external IDs from being mangled?

Does the tool split multi-select picklist values?

Can I drop the __c suffix from custom field names?

My export ZIP has dozens of object files — how should I work through them?

Why is a one-record export not an array?

Can I filter to only certain objects or statuses while converting?

Are empty Salesforce fields converted to null?

Is my CRM data uploaded during conversion?

How large a Salesforce object can I convert at once?

Can I export Salesforce data as JSON directly instead of XML?

Privacy first

Related guides