Flatten JSON for a BigQuery Flat Schema — Free Online Tool

How to flatten nested json for bigquery schema design

Step 1
Save a representative sample as a .json file — The flattener reads a dropped file, not pasted text. Save 1 document (for schema design) or a small representative set as .json. Cover the full variety of nested fields — rare fields absent from the sample will not appear in the flat key set, so they will be missing from the schema you derive.
Step 2
Drop the file onto the dropzone — Flattening runs in your browser; nothing is uploaded. The header shows the file name and size. Invalid JSON throws a SyntaxError — note that a Firestore export is often an array or NDJSON, so flatten one document at a time (see edge cases).
Step 3
Set the delimiter to underscore — Type _ into the Delimiter box. BigQuery column names must match [A-Za-z_][A-Za-z0-9_]* and cannot contain dots — so event_properties_utm_campaign is legal where event.properties.utm.campaign is not. The default . is wrong for BigQuery flat columns.
Step 4
Choose array handling deliberately — dot.0.key makes positional columns (tags_0, tags_1) — sparse and brittle for BigQuery. keep arrays keeps the array so you can declare it REPEATED/ARRAY. Arrays of objects should stay intact (keep arrays) and be modelled as ARRAY<STRUCT> — do not flatten them to positional keys.
Step 5
Review flat keys for length and legality — Long keys like event_properties_page_properties_utm_campaign are legal but unwieldy. Trim or rename overly verbose keys with json-key-renamer. Also confirm no key starts with a digit or contains a character outside [A-Za-z0-9_] — rename any that do.
Step 6
Load the flat JSON as NDJSON — Convert the flattened object(s) to newline-delimited JSON (one object per line), upload to GCS, and run bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect myproject:mydataset.mytable gs://bucket/flat.ndjson. With a consistent flat shape, autodetect produces a clean flat schema.

Delimiter and BigQuery column-name legality

BigQuery column names must match [A-Za-z_][A-Za-z0-9_]* — dots are illegal. The delimiter you choose determines whether the flat key is a legal column.

Delimiter	Example key	Legal BigQuery column?	Notes
`_` (underscore)	`event_props_utm_campaign`	Yes	Recommended — clean bare identifier
`.` (dot, default)	`event.props.utm.campaign`	No	Dots are illegal in a flat column name; must rename before load
`[i]` (bracket array mode)	`items[0]`	No	Brackets are illegal; rename or use keep arrays + REPEATED
`__` (double underscore)	`event__props__utm`	Yes	Use when source keys already contain single underscores

How nested structures map to BigQuery types

Recommended BigQuery modelling per structure. Flattening is right for objects; arrays of objects belong in ARRAY<STRUCT>, not positional columns.

Source structure	Flattener mode	BigQuery model	Why
Nested object	Flatten (underscore)	Flat scalar columns	Simplest to query; no STRUCT access needed
Array of primitives	keep arrays	`ARRAY<STRING>` (REPEATED)	Native repeated field; dot mode would make sparse `_0`/`_1` columns
Array of objects	keep arrays	`ARRAY<STRUCT<...>>` (REPEATED RECORD)	Preserves per-element relationships flattening would destroy
Highly variable payload	keep arrays / leave nested	`JSON` column type	Avoids schema churn for unpredictable event properties

Cookbook

Before/after documents headed for a BigQuery table. The pattern: flatten objects to legal underscore columns, keep arrays for REPEATED fields.

Nested object → legal flat columns

Example

Underscore delimiter turns a nested object into BigQuery-legal column names you can SELECT without backticks.

Input (doc.json):
{ "event": { "name": "click", "props": { "page": "/home" } } }

Delimiter: _   Array handling: keep arrays

Output:
{ "event_name": "click", "event_props_page": "/home" }

Query: SELECT event_name, event_props_page FROM t
(no UNNEST, no backticks)

Dot delimiter produces illegal column names

Example

The default dot delimiter yields keys BigQuery rejects as flat columns. Always switch to underscore for BigQuery loads.

Delimiter: .  →  { "event.name": "click" }

bq load with this NDJSON:
  Error: Invalid field name "event.name".
  Fields must contain only letters, numbers, and
  underscores, start with a letter or underscore...

Delimiter: _  →  { "event_name": "click" }   ✓ loads

Array of objects: keep it for ARRAY<STRUCT>

Example

Flattening an order's line_items to positional keys loses the per-line relationship. Keep the array and model it as a REPEATED RECORD.

Input:
{ "order_id": 9,
  "line_items": [ {"sku":"A","qty":2}, {"sku":"B","qty":1} ] }

dot.0.key  →  line_items_0_sku, line_items_0_qty, ...
  (positional, relationship lost)

keep arrays  →  line_items stays an array → model as
  line_items ARRAY<STRUCT<sku STRING, qty INT64>>

Verbose flat key, then rename

Example

Deeply nested event properties produce long but legal column names. Shorten them with the key renamer before defining the schema.

Input:
{ "event": { "properties": { "page": { "utm": { "campaign": "spring" } } } } }

Flatten (underscore)  →
{ "event_properties_page_utm_campaign": "spring" }

Legal, but long. Use json-key-renamer to map it to
  utm_campaign before creating the table.

Firestore export must be flattened per document

Example

Firestore exports are typically arrays or NDJSON. A whole array flattens to numeric-prefixed keys, not per-document rows — flatten one document to design the schema.

Firestore export (array):
[ {"user":{"name":"Sue"}}, {"user":{"name":"Jon"}} ]

Flattener output (one object, numeric prefixes):
{ "0.user.name": "Sue", "1.user.name": "Jon" }  ← not rows

Instead: flatten ONE document to design columns, then
flatten each doc per-record and write NDJSON for bq load.

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

Dot delimiter yields columns BigQuery rejects

Load error

The default . delimiter makes keys like event.name. bq load rejects them with Invalid field name because flat column names cannot contain dots. Always set the Delimiter box to _ for BigQuery; if you already flattened with dots, rename with json-key-renamer before loading.

Array of objects loses relationships when flattened

Modelling error

Flattening line_items to line_items_0_sku, line_items_1_sku discards which qty belongs to which sku. Use keep arrays and model the field as ARRAY<STRUCT<...>> (REPEATED RECORD) so BigQuery keeps per-element structure.

Primitive arrays become sparse positional columns

By design

In dot mode tags: ["a","b"] becomes tags_0, tags_1. With variable-length arrays this produces sparse columns (tags_7 populated only occasionally). Prefer keep arrays and declare the field REPEATED.

Top-level array flattens to numeric keys, not rows

By design

A Firestore/array export flattens to one object with 0.field, 1.field keys — not one row per document. Flatten a single document to design the schema, then flatten each document individually and emit NDJSON (one object per line) for bq load.

Column name starts with a digit or has illegal chars

Load error

If a source key is numeric or contains characters outside [A-Za-z0-9_] (a dot, dash, space), the flat key is an illegal BigQuery column even with underscore delimiter. Rename offenders with json-key-renamer; a numeric leading char needs a prefix like f_.

Empty object/array drops the column

Expected

{} and [] produce no key. A field empty in your sample document is absent from the flat schema, so the column will not be created. Pick a sample where every column is populated, or merge representative documents to cover the full field set.

Very deep document stops at level 10

Expected

Max depth is fixed at 10. Below that the subtree stays nested as a value — which BigQuery autodetect would then type as a RECORD or JSON column, not flat scalars. Re-flatten that fragment, or model the deep portion as a JSON column on purpose.

Underscore collision in source keys

Data loss

A literal user_id key and a nested user → id both flatten to user_id; the later one silently overwrites the earlier. With underscore-heavy source keys, use __ in the Delimiter box so the paths stay distinct columns.

Invalid JSON / NDJSON throws

error

The flattener parses the file as a single JSON value. A multi-line NDJSON export or any syntax error throws a SyntaxError. Flatten one well-formed document at a time. Repair with json-format-fixer or locate the error with json-validator.

Free-tier 2 MB / one-file limit

Tier limit

Free tier caps JSON input at 2 MB, one file. A large Firestore export will be rejected. Flatten one representative document for schema design (all you need), or upgrade to Pro (100 MB, 10-file batches) for bigger files.

Frequently asked questions

Which delimiter should I use for BigQuery columns?

Underscore (_). BigQuery column names must match [A-Za-z_][A-Za-z0-9_]* and cannot contain dots, so the default . delimiter produces illegal column names that bq load rejects with Invalid field name. Set the Delimiter box to _ to get legal bare identifiers like event_props_utm_campaign.

Should I always prefer a flat schema over nested RECORD types?

For analytical workloads with aggregations and GROUP BY, flat is simpler — no UNNEST everywhere. For highly variable payloads (event properties), BigQuery's JSON column type or STRUCT preserves structure more efficiently. Use flat for stable, known field sets; keep JSON/STRUCT for unpredictable ones. The flattener supports the flat path and keep arrays supports the repeated path.

How do I handle arrays of objects for BigQuery?

Don't flatten them. Use keep arrays so the array stays intact, then model it as ARRAY<STRUCT<...>> (a REPEATED RECORD). Flattening line_items to line_items_0_sku destroys the per-element relationship and produces sparse positional columns. Keep arrays + REPEATED RECORD is the correct BigQuery pattern.

Why does bq load reject my flattened column names?

Almost certainly you flattened with the dot delimiter — event.name is illegal in BigQuery. Re-flatten with the _ delimiter, or rename existing dotted keys with json-key-renamer. Also check for keys that start with a digit or contain spaces/dashes; those need renaming too.

My Firestore export is an array — can I flatten the whole thing?

Flattening a whole array gives one object with numeric-prefixed keys (0.user.name), not per-document rows. For a load, flatten one document to design the schema, then flatten each document individually and write newline-delimited JSON (one object per line) for bq load --source_format=NEWLINE_DELIMITED_JSON.

Can I change the maximum nesting depth?

No — it is fixed at 10 levels and not exposed in the UI. Subtrees deeper than 10 are left nested as a value, which BigQuery autodetect types as a RECORD/JSON column rather than flat scalars. Re-flatten the deep fragment, or deliberately model that portion as a JSON column.

Why is a column missing from my derived schema?

The field was probably an empty object {} or empty array [] in the sample document — those emit no key. Use a sample where every field is populated, or flatten several representative documents and union their flat key sets to cover the full schema.

Can I paste JSON instead of dropping a file?

No — the tool reads a dropped .json file and has no paste box. Save your document to a file first. A Firestore/NDJSON export should be split so you flatten one well-formed JSON document per pass.

Does flattening change my data types?

No. Leaf values keep their JSON type, so BigQuery autodetect infers STRING/INT64/FLOAT64/BOOL correctly. The flattener only restructures keys; it does not coerce values. Dates remain strings — define them as DATE/TIMESTAMP explicitly or parse after load.

Is my Firestore or API data uploaded?

No. Flattening runs entirely in your browser. Firestore document data, event payloads, and any PII never reach JAD Apps servers. Only an anonymous file-processed counter (no content) is kept for signed-in dashboard stats, which you can opt out of.

How large a file can I flatten for free?

Free tier caps JSON input at 2 MB, one file. That is enough for a representative schema-design document. Pro raises it to 100 MB with 10-file batches; higher tiers go further. For full exports, design the schema from a sample and load the bulk file via NDJSON + bq load.

Can I flatten in an automated load pipeline?

Yes. GET /api/v1/tools/json-flattener returns the option schema; pair the @jadapps/runner and POST documents plus options (delimiter _, arrayHandling keep arrays) to /api/v1/tools/json-flattener/run locally. Data stays on your machine. A common flow: extract Firestore docs → flatten per-record → write NDJSON to GCS → bq load.

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

How to flatten nested json for bigquery schema design

Step 1
Save a representative sample as a .json file — The flattener reads a dropped file, not pasted text. Save 1 document (for schema design) or a small representative set as .json. Cover the full variety of nested fields — rare fields absent from the sample will not appear in the flat key set, so they will be missing from the schema you derive.
Step 2
Drop the file onto the dropzone — Flattening runs in your browser; nothing is uploaded. The header shows the file name and size. Invalid JSON throws a SyntaxError — note that a Firestore export is often an array or NDJSON, so flatten one document at a time (see edge cases).
Step 3
Set the delimiter to underscore — Type _ into the Delimiter box. BigQuery column names must match [A-Za-z_][A-Za-z0-9_]* and cannot contain dots — so event_properties_utm_campaign is legal where event.properties.utm.campaign is not. The default . is wrong for BigQuery flat columns.
Step 4
Choose array handling deliberately — dot.0.key makes positional columns (tags_0, tags_1) — sparse and brittle for BigQuery. keep arrays keeps the array so you can declare it REPEATED/ARRAY. Arrays of objects should stay intact (keep arrays) and be modelled as ARRAY<STRUCT> — do not flatten them to positional keys.
Step 5
Review flat keys for length and legality — Long keys like event_properties_page_properties_utm_campaign are legal but unwieldy. Trim or rename overly verbose keys with json-key-renamer. Also confirm no key starts with a digit or contains a character outside [A-Za-z0-9_] — rename any that do.
Step 6
Load the flat JSON as NDJSON — Convert the flattened object(s) to newline-delimited JSON (one object per line), upload to GCS, and run bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect myproject:mydataset.mytable gs://bucket/flat.ndjson. With a consistent flat shape, autodetect produces a clean flat schema.

Delimiter and BigQuery column-name legality

BigQuery column names must match [A-Za-z_][A-Za-z0-9_]* — dots are illegal. The delimiter you choose determines whether the flat key is a legal column.

Delimiter	Example key	Legal BigQuery column?	Notes
`_` (underscore)	`event_props_utm_campaign`	Yes	Recommended — clean bare identifier
`.` (dot, default)	`event.props.utm.campaign`	No	Dots are illegal in a flat column name; must rename before load
`[i]` (bracket array mode)	`items[0]`	No	Brackets are illegal; rename or use keep arrays + REPEATED
`__` (double underscore)	`event__props__utm`	Yes	Use when source keys already contain single underscores

How nested structures map to BigQuery types

Recommended BigQuery modelling per structure. Flattening is right for objects; arrays of objects belong in ARRAY<STRUCT>, not positional columns.

Source structure	Flattener mode	BigQuery model	Why
Nested object	Flatten (underscore)	Flat scalar columns	Simplest to query; no STRUCT access needed
Array of primitives	keep arrays	`ARRAY<STRING>` (REPEATED)	Native repeated field; dot mode would make sparse `_0`/`_1` columns
Array of objects	keep arrays	`ARRAY<STRUCT<...>>` (REPEATED RECORD)	Preserves per-element relationships flattening would destroy
Highly variable payload	keep arrays / leave nested	`JSON` column type	Avoids schema churn for unpredictable event properties

Cookbook

Before/after documents headed for a BigQuery table. The pattern: flatten objects to legal underscore columns, keep arrays for REPEATED fields.

Nested object → legal flat columns

Example

Underscore delimiter turns a nested object into BigQuery-legal column names you can SELECT without backticks.

Input (doc.json):
{ "event": { "name": "click", "props": { "page": "/home" } } }

Delimiter: _   Array handling: keep arrays

Output:
{ "event_name": "click", "event_props_page": "/home" }

Query: SELECT event_name, event_props_page FROM t
(no UNNEST, no backticks)

Dot delimiter produces illegal column names

Example

The default dot delimiter yields keys BigQuery rejects as flat columns. Always switch to underscore for BigQuery loads.

Delimiter: .  →  { "event.name": "click" }

bq load with this NDJSON:
  Error: Invalid field name "event.name".
  Fields must contain only letters, numbers, and
  underscores, start with a letter or underscore...

Delimiter: _  →  { "event_name": "click" }   ✓ loads

Array of objects: keep it for ARRAY<STRUCT>

Example

Flattening an order's line_items to positional keys loses the per-line relationship. Keep the array and model it as a REPEATED RECORD.

Input:
{ "order_id": 9,
  "line_items": [ {"sku":"A","qty":2}, {"sku":"B","qty":1} ] }

dot.0.key  →  line_items_0_sku, line_items_0_qty, ...
  (positional, relationship lost)

keep arrays  →  line_items stays an array → model as
  line_items ARRAY<STRUCT<sku STRING, qty INT64>>

Verbose flat key, then rename

Example

Deeply nested event properties produce long but legal column names. Shorten them with the key renamer before defining the schema.

Input:
{ "event": { "properties": { "page": { "utm": { "campaign": "spring" } } } } }

Flatten (underscore)  →
{ "event_properties_page_utm_campaign": "spring" }

Legal, but long. Use json-key-renamer to map it to
  utm_campaign before creating the table.

Firestore export must be flattened per document

Example

Firestore exports are typically arrays or NDJSON. A whole array flattens to numeric-prefixed keys, not per-document rows — flatten one document to design the schema.

Firestore export (array):
[ {"user":{"name":"Sue"}}, {"user":{"name":"Jon"}} ]

Flattener output (one object, numeric prefixes):
{ "0.user.name": "Sue", "1.user.name": "Jon" }  ← not rows

Instead: flatten ONE document to design columns, then
flatten each doc per-record and write NDJSON for bq load.

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

Dot delimiter yields columns BigQuery rejects

Load error

Array of objects loses relationships when flattened

Modelling error

Primitive arrays become sparse positional columns

By design

Top-level array flattens to numeric keys, not rows

By design

Column name starts with a digit or has illegal chars

Load error

Empty object/array drops the column

Expected

Very deep document stops at level 10

Expected

Underscore collision in source keys

Data loss

Invalid JSON / NDJSON throws

error

Free-tier 2 MB / one-file limit

Tier limit

Frequently asked questions

Which delimiter should I use for BigQuery columns?

Should I always prefer a flat schema over nested RECORD types?

How do I handle arrays of objects for BigQuery?

Why does bq load reject my flattened column names?

My Firestore export is an array — can I flatten the whole thing?

Can I change the maximum nesting depth?

Why is a column missing from my derived schema?

Can I paste JSON instead of dropping a file?

No — the tool reads a dropped .json file and has no paste box. Save your document to a file first. A Firestore/NDJSON export should be split so you flatten one well-formed JSON document per pass.

Does flattening change my data types?

Is my Firestore or API data uploaded?

How large a file can I flatten for free?

Can I flatten in an automated load pipeline?

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

Flatten Nested JSON for BigQuery Schema Design

How to flatten nested json for bigquery schema design

Delimiter and BigQuery column-name legality

How nested structures map to BigQuery types

Cookbook

Nested object → legal flat columns

Dot delimiter produces illegal column names

Array of objects: keep it for ARRAY<STRUCT>

Verbose flat key, then rename

Firestore export must be flattened per document

Errors and edge cases

Dot delimiter yields columns BigQuery rejects

Array of objects loses relationships when flattened

Primitive arrays become sparse positional columns

Top-level array flattens to numeric keys, not rows

Column name starts with a digit or has illegal chars

Empty object/array drops the column

Very deep document stops at level 10

Underscore collision in source keys

Invalid JSON / NDJSON throws

Free-tier 2 MB / one-file limit

Frequently asked questions

Which delimiter should I use for BigQuery columns?

Should I always prefer a flat schema over nested RECORD types?

How do I handle arrays of objects for BigQuery?

Why does bq load reject my flattened column names?

My Firestore export is an array — can I flatten the whole thing?

Can I change the maximum nesting depth?

Why is a column missing from my derived schema?

Can I paste JSON instead of dropping a file?

Does flattening change my data types?

Is my Firestore or API data uploaded?

How large a file can I flatten for free?

Can I flatten in an automated load pipeline?

Privacy first

Related guides

Flatten Nested JSON for BigQuery Schema Design

How to flatten nested json for bigquery schema design

Delimiter and BigQuery column-name legality

How nested structures map to BigQuery types

Cookbook

Nested object → legal flat columns

Dot delimiter produces illegal column names

Array of objects: keep it for ARRAY<STRUCT>

Verbose flat key, then rename

Firestore export must be flattened per document

Errors and edge cases

Dot delimiter yields columns BigQuery rejects

Array of objects loses relationships when flattened

Primitive arrays become sparse positional columns

Top-level array flattens to numeric keys, not rows

Column name starts with a digit or has illegal chars

Empty object/array drops the column

Very deep document stops at level 10

Underscore collision in source keys

Invalid JSON / NDJSON throws

Free-tier 2 MB / one-file limit

Frequently asked questions

Which delimiter should I use for BigQuery columns?

Should I always prefer a flat schema over nested RECORD types?

How do I handle arrays of objects for BigQuery?

Why does bq load reject my flattened column names?

My Firestore export is an array — can I flatten the whole thing?

Can I change the maximum nesting depth?

Why is a column missing from my derived schema?

Can I paste JSON instead of dropping a file?

Does flattening change my data types?

Is my Firestore or API data uploaded?

How large a file can I flatten for free?

Can I flatten in an automated load pipeline?

Privacy first

Related guides