CSV to NDJSON for BigQuery and Snowflake Loads

How to convert csv to ndjson for bigquery / snowflake

Step 1
Export the source table as CSV — From a database, spreadsheet, or app, export a CSV with a header row. Headers become NDJSON field names and, with BigQuery autodetect, the column names — so use warehouse-legal names (letters, digits, underscores; no leading digit).
Step 2
Drop the CSV onto the converter above — Accepts .csv, .tsv, .txt. PapaParse auto-detects the delimiter and is quote-aware, so embedded commas and newlines inside quoted cells do not break records — a common cause of CSV load failures.
Step 3
Select NDJSON / JSON Lines as the output mode — This produces one compact JSON object per line with no enclosing array brackets — the line-delimited format every major warehouse load command reads. Indentation does not apply to NDJSON.
Step 4
Keep type inference on for clean autodetect — Leave Infer numbers, booleans, null on so numeric and boolean columns load as their real types. Turn it off only if you want every column loaded as STRING (e.g. you will CAST in SQL afterwards).
Step 5
Consider Skip empty cells for NULL handling — By default a blank cell is "" (an empty string in the column). Turn on Skip empty cells to omit the key, which BigQuery loads as NULL — useful when blank should mean missing, not empty.
Step 6
Download the .ndjson and run the load — Click Download NDJSON, then bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect dataset.table data.ndjson, or stage the file and COPY INTO it in Snowflake. Confirm the Records stat matches your expected row count.

NDJSON load command by warehouse

The NDJSON this tool produces feeds each of these. The tool generates the file; it does not run the load.

Warehouse	Source format / file format	Load command (abbreviated)
BigQuery	`NEWLINE_DELIMITED_JSON`	`bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect ds.t data.ndjson`
Snowflake	`FILE_FORMAT = (TYPE = JSON)`	stage file → `COPY INTO t FROM @stage FILE_FORMAT=(TYPE=JSON)`
Redshift	`FORMAT AS JSON 'auto'`	`COPY t FROM 's3://…/data.ndjson' FORMAT AS JSON 'auto'`
Athena	JSON SerDe	`CREATE EXTERNAL TABLE … ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'`

Inference and how the warehouse types it

With BigQuery --autodetect on. Inference is a single global toggle; off means every value is STRING.

CSV cell	NDJSON value (inference on)	Autodetected type	Note
`42`	`42`	INTEGER	Safe integers only
`3.14`	`3.14`	FLOAT	Decimal / scientific
`true`	`true`	BOOLEAN	Case-insensitive
`null`	`null`	(NULL)	Loads as SQL NULL
`` (empty), Skip empty off	`""`	STRING (empty)	Empty string, not NULL
`` (empty), Skip empty on	key omitted	NULL	Field absent → NULL
`9007199254740993`	`"9007…"`	STRING	Beyond safe int — load to NUMERIC/STRING and CAST
`2026-06-12`	`"2026-06-12"`	STRING	Loads as string; CAST to DATE in SQL or schema

Cookbook

NDJSON conversion recipes for warehouse loads. Values illustrative; the focus is the shape and the load command.

Typed NDJSON for bq load --autodetect

Example

NDJSON mode, inference on. BigQuery autodetect reads events as INTEGER and active as BOOLEAN.

Input (users.csv):
user_id,events,active
1,12,true
2,0,false

Output (NDJSON):
{"user_id":1,"events":12,"active":true}
{"user_id":2,"events":0,"active":false}

bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect \
  analytics.users users.ndjson

NULL vs empty string for a nullable column

Example

Skip empty cells on: a blank country omits the key, so BigQuery loads NULL. With it off you would get an empty string instead.

Input:
user_id,country
1,US
2,

Output (Skip empty cells ON):
{"user_id":1,"country":"US"}
{"user_id":2}

→ row 2's country loads as NULL, not ''

Snowflake COPY INTO from a JSON stage

Example

Stage the NDJSON, then COPY INTO with a JSON file format. Each line becomes one VARIANT or typed-column row.

Output (NDJSON, staged as @my_stage/users.ndjson):
{"user_id":1,"plan":"pro"}
{"user_id":2,"plan":"free"}

COPY INTO users
  FROM @my_stage/users.ndjson
  FILE_FORMAT = (TYPE = JSON);

Keep a big surrogate key as a string

Example

A 19-digit key exceeds safe-integer range; inference keeps it a string so it loads to a STRING/NUMERIC column without rounding.

Input:
event_key,name
9223372036854775807,launch

Output (NDJSON):
{"event_key":"9223372036854775807","name":"launch"}
→ load into a STRING or NUMERIC column, then CAST as needed

Load everything as STRING then CAST in SQL

Example

Turn inference off when you would rather do all typing in the warehouse. Every value is a quoted string; CAST in the load SELECT or a downstream view.

Input:
id,amount,paid
1,19.99,true

Output (NDJSON, inference OFF):
{"id":"1","amount":"19.99","paid":"true"}
→ SELECT CAST(amount AS NUMERIC), CAST(paid AS BOOL) …

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

Empty cell loads as empty string instead of NULL

By design

By default a blank cell becomes "", which warehouses load as an empty string, not NULL — these are distinct and can break WHERE col IS NULL logic. Turn on Skip empty cells to omit the key so the column loads as NULL. The literal text null (any case) is always emitted as JSON null regardless of this setting.

Leading-zero key autodetected as INTEGER

By design

With inference on, 00042 becomes 42 and autodetect picks INTEGER. For a key that must stay exact, turn off Infer numbers, booleans, null so the column is a string, then load it to a STRING column. Inference is global — if you need other columns typed, load everything as STRING and CAST in SQL.

Big integer can't fit BigQuery INTEGER (INT64)

Preserved

An integer beyond JavaScript's safe-integer range is kept as a string by inference, avoiding silent corruption. BigQuery's INT64 actually goes higher than JS safe-int, so if you need the numeric value, load this column into NUMERIC/BIGNUMERIC (or STRING) and CAST — the string is exact, so no precision is lost in the file.

Date / timestamp loads as STRING

Expected

2026-06-12 and ISO timestamps don't match the numeric/boolean/null patterns, so they stay strings. Autodetect types them as STRING. Either define an explicit schema with a DATE/TIMESTAMP column (BigQuery parses ISO strings into it), or CAST in a downstream view.

Autodetect picks the wrong type from a sparse column

Schema risk

BigQuery autodetect samples rows; a column that is mostly numeric but has one stray non-numeric cell may type as STRING or fail. This is a warehouse behaviour, not a converter behaviour — the NDJSON faithfully reflects the cells. Supply an explicit schema to bq load (--schema) for predictable typing on messy columns.

Dotted header becomes a flat field, not a RECORD

Not supported here

address.city is a literal field name, not a nested BigQuery RECORD/STRUCT. To load nested structures, convert here, then expand dotted keys with json-unflattener before loading, or define a STRUCT schema and provide nested NDJSON.

Field name illegal for the warehouse

Load reject

A header with spaces or a leading digit (1st_visit, User Name) becomes a JSON key the same way, but BigQuery/Snowflake reject it as a column name on autodetect. Rename the columns in the source CSV (or with csv-header-rename) to legal identifiers before converting.

Free tier file/row cap on a warehouse-scale CSV

Limit

Free tier caps at 2 MB / 500 rows — fine for a sample, not a load. Pro is 100 MB / 100,000 rows, Developer 5 GB with no row cap. For very large loads, split the CSV into chunks under your cap, convert each to NDJSON, and load them as a wildcard set (bq load … data_*.ndjson).

Frequently asked questions

Why NDJSON instead of just loading the CSV into BigQuery?

NDJSON carries the field name on every record, so a schema change (new column) doesn't break older files the way a CSV column-order change does, and JSON null is distinct from empty string for nullable columns. With inference on, autodetect also gets real INTEGER/BOOLEAN types instead of guessing from CSV strings.

Which output mode do I pick for a warehouse load?

NDJSON / JSON Lines. That is the exact format BigQuery's NEWLINE_DELIMITED_JSON, Snowflake's TYPE = JSON, Redshift's FORMAT AS JSON, and Athena's JSON SerDe read. Do not use Array mode for these loads — they expect one record per line, not an enclosing array.

How do I make blank cells load as NULL?

Turn on Skip empty cells. That omits the key for a blank cell, and a missing key loads as SQL NULL. With it off, a blank becomes an empty string "", which is a different value. The literal word null always becomes JSON null regardless of this setting.

Will my date columns load as DATE/TIMESTAMP?

Not automatically — dates stay JSON strings, so autodetect types them STRING. Define an explicit schema with a DATE or TIMESTAMP column (BigQuery parses ISO-8601 strings into those types on load), or CAST in a downstream view/query.

What happens to very large integer IDs?

Inference keeps them as strings when they exceed JavaScript's safe-integer range, so the value in the file is exact (no rounding). Load that column into NUMERIC/BIGNUMERIC or STRING and CAST as needed. BigQuery INT64 can hold larger values than JS safe-int, so the string-then-CAST path preserves full precision.

Is the data uploaded anywhere during conversion?

No. PapaParse parses and the NDJSON is generated entirely in your browser. Warehouse-bound data never reaches a JAD Apps server. Only an anonymous run counter is recorded when signed in, and you can opt out.

Can I load nested STRUCT / RECORD data this way?

Not from a flat CSV directly — each row becomes a flat object, and dotted headers stay literal field names. To get nested NDJSON, convert here, then run the records through json-unflattener to expand a.b.c keys into nested objects, and load against a STRUCT schema.

How large a CSV can I convert?

Free: 2 MB / 500 rows. Pro: 100 MB / 100,000 rows. Pro+Media: 500 MB / 500,000 rows. Developer: 5 GB, no row cap. For warehouse-scale loads, split the source CSV into chunks under your cap and convert each — warehouses happily load a wildcard set of NDJSON files.

Can I generate a BigQuery schema from the data too?

Not in this tool, but the sibling json-schema-generator produces a JSON Schema from the converted records, which you can adapt into a bq load --schema definition. Or rely on --autodetect, which works well once inference has typed the values.

My CSV is semicolon-delimited (EU locale) — will it still work?

Yes. PapaParse auto-detects the delimiter from the file content, so comma, semicolon, and tab all work without configuration. The NDJSON output is identical regardless of the source delimiter.

Can I automate CSV→NDJSON in an ELT pipeline?

Yes. Pair the @jadapps/runner once and POST the CSV to 127.0.0.1:9789/v1/tools/csv-to-json/run with outputMode: ndjson. A common pattern: source export → runner converts to NDJSON → land in GCS/S3 → bq load / COPY INTO. The data stays on your machine and never reaches JAD's servers.

How do I go from warehouse JSON back to CSV for a report?

Use the sibling json-to-csv tool. Export query results as a JSON array, then convert to a flat CSV for a spreadsheet or BI handoff.

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

How to convert csv to ndjson for bigquery / snowflake

Step 1
Export the source table as CSV — From a database, spreadsheet, or app, export a CSV with a header row. Headers become NDJSON field names and, with BigQuery autodetect, the column names — so use warehouse-legal names (letters, digits, underscores; no leading digit).
Step 2
Drop the CSV onto the converter above — Accepts .csv, .tsv, .txt. PapaParse auto-detects the delimiter and is quote-aware, so embedded commas and newlines inside quoted cells do not break records — a common cause of CSV load failures.
Step 3
Select NDJSON / JSON Lines as the output mode — This produces one compact JSON object per line with no enclosing array brackets — the line-delimited format every major warehouse load command reads. Indentation does not apply to NDJSON.
Step 4
Keep type inference on for clean autodetect — Leave Infer numbers, booleans, null on so numeric and boolean columns load as their real types. Turn it off only if you want every column loaded as STRING (e.g. you will CAST in SQL afterwards).
Step 5
Consider Skip empty cells for NULL handling — By default a blank cell is "" (an empty string in the column). Turn on Skip empty cells to omit the key, which BigQuery loads as NULL — useful when blank should mean missing, not empty.
Step 6
Download the .ndjson and run the load — Click Download NDJSON, then bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect dataset.table data.ndjson, or stage the file and COPY INTO it in Snowflake. Confirm the Records stat matches your expected row count.

NDJSON load command by warehouse

The NDJSON this tool produces feeds each of these. The tool generates the file; it does not run the load.

Warehouse	Source format / file format	Load command (abbreviated)
BigQuery	`NEWLINE_DELIMITED_JSON`	`bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect ds.t data.ndjson`
Snowflake	`FILE_FORMAT = (TYPE = JSON)`	stage file → `COPY INTO t FROM @stage FILE_FORMAT=(TYPE=JSON)`
Redshift	`FORMAT AS JSON 'auto'`	`COPY t FROM 's3://…/data.ndjson' FORMAT AS JSON 'auto'`
Athena	JSON SerDe	`CREATE EXTERNAL TABLE … ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'`

Inference and how the warehouse types it

With BigQuery --autodetect on. Inference is a single global toggle; off means every value is STRING.

CSV cell	NDJSON value (inference on)	Autodetected type	Note
`42`	`42`	INTEGER	Safe integers only
`3.14`	`3.14`	FLOAT	Decimal / scientific
`true`	`true`	BOOLEAN	Case-insensitive
`null`	`null`	(NULL)	Loads as SQL NULL
`` (empty), Skip empty off	`""`	STRING (empty)	Empty string, not NULL
`` (empty), Skip empty on	key omitted	NULL	Field absent → NULL
`9007199254740993`	`"9007…"`	STRING	Beyond safe int — load to NUMERIC/STRING and CAST
`2026-06-12`	`"2026-06-12"`	STRING	Loads as string; CAST to DATE in SQL or schema

Cookbook

NDJSON conversion recipes for warehouse loads. Values illustrative; the focus is the shape and the load command.

Typed NDJSON for bq load --autodetect

Example

NDJSON mode, inference on. BigQuery autodetect reads events as INTEGER and active as BOOLEAN.

Input (users.csv):
user_id,events,active
1,12,true
2,0,false

Output (NDJSON):
{"user_id":1,"events":12,"active":true}
{"user_id":2,"events":0,"active":false}

bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect \
  analytics.users users.ndjson

NULL vs empty string for a nullable column

Example

Skip empty cells on: a blank country omits the key, so BigQuery loads NULL. With it off you would get an empty string instead.

Input:
user_id,country
1,US
2,

Output (Skip empty cells ON):
{"user_id":1,"country":"US"}
{"user_id":2}

→ row 2's country loads as NULL, not ''

Snowflake COPY INTO from a JSON stage

Example

Stage the NDJSON, then COPY INTO with a JSON file format. Each line becomes one VARIANT or typed-column row.

Output (NDJSON, staged as @my_stage/users.ndjson):
{"user_id":1,"plan":"pro"}
{"user_id":2,"plan":"free"}

COPY INTO users
  FROM @my_stage/users.ndjson
  FILE_FORMAT = (TYPE = JSON);

Keep a big surrogate key as a string

Example

A 19-digit key exceeds safe-integer range; inference keeps it a string so it loads to a STRING/NUMERIC column without rounding.

Input:
event_key,name
9223372036854775807,launch

Output (NDJSON):
{"event_key":"9223372036854775807","name":"launch"}
→ load into a STRING or NUMERIC column, then CAST as needed

Load everything as STRING then CAST in SQL

Example

Turn inference off when you would rather do all typing in the warehouse. Every value is a quoted string; CAST in the load SELECT or a downstream view.

Input:
id,amount,paid
1,19.99,true

Output (NDJSON, inference OFF):
{"id":"1","amount":"19.99","paid":"true"}
→ SELECT CAST(amount AS NUMERIC), CAST(paid AS BOOL) …

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

Empty cell loads as empty string instead of NULL

By design

Leading-zero key autodetected as INTEGER

By design

Big integer can't fit BigQuery INTEGER (INT64)

Preserved

Date / timestamp loads as STRING

Expected

Autodetect picks the wrong type from a sparse column

Schema risk

Dotted header becomes a flat field, not a RECORD

Not supported here

Field name illegal for the warehouse

Load reject

Free tier file/row cap on a warehouse-scale CSV

Limit

Frequently asked questions

Why NDJSON instead of just loading the CSV into BigQuery?

Which output mode do I pick for a warehouse load?

How do I make blank cells load as NULL?

Will my date columns load as DATE/TIMESTAMP?

What happens to very large integer IDs?

Is the data uploaded anywhere during conversion?

Can I load nested STRUCT / RECORD data this way?

How large a CSV can I convert?

Can I generate a BigQuery schema from the data too?

My CSV is semicolon-delimited (EU locale) — will it still work?

Yes. PapaParse auto-detects the delimiter from the file content, so comma, semicolon, and tab all work without configuration. The NDJSON output is identical regardless of the source delimiter.

Can I automate CSV→NDJSON in an ELT pipeline?

How do I go from warehouse JSON back to CSV for a report?

Use the sibling json-to-csv tool. Export query results as a JSON array, then convert to a flat CSV for a spreadsheet or BI handoff.

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

Convert CSV to NDJSON for BigQuery / Snowflake

How to convert csv to ndjson for bigquery / snowflake

NDJSON load command by warehouse

Inference and how the warehouse types it

Cookbook

Typed NDJSON for bq load --autodetect

NULL vs empty string for a nullable column

Snowflake COPY INTO from a JSON stage

Keep a big surrogate key as a string

Load everything as STRING then CAST in SQL

Errors and edge cases

Empty cell loads as empty string instead of NULL

Leading-zero key autodetected as INTEGER

Big integer can't fit BigQuery INTEGER (INT64)

Date / timestamp loads as STRING

Autodetect picks the wrong type from a sparse column

Dotted header becomes a flat field, not a RECORD

Field name illegal for the warehouse

Free tier file/row cap on a warehouse-scale CSV

Frequently asked questions

Why NDJSON instead of just loading the CSV into BigQuery?

Which output mode do I pick for a warehouse load?

How do I make blank cells load as NULL?

Will my date columns load as DATE/TIMESTAMP?

What happens to very large integer IDs?

Is the data uploaded anywhere during conversion?

Can I load nested STRUCT / RECORD data this way?

How large a CSV can I convert?

Can I generate a BigQuery schema from the data too?

My CSV is semicolon-delimited (EU locale) — will it still work?

Can I automate CSV→NDJSON in an ELT pipeline?

How do I go from warehouse JSON back to CSV for a report?

Privacy first

Related guides

Convert CSV to NDJSON for BigQuery / Snowflake

How to convert csv to ndjson for bigquery / snowflake

NDJSON load command by warehouse

Inference and how the warehouse types it

Cookbook

Typed NDJSON for bq load --autodetect

NULL vs empty string for a nullable column

Snowflake COPY INTO from a JSON stage

Keep a big surrogate key as a string

Load everything as STRING then CAST in SQL

Errors and edge cases

Empty cell loads as empty string instead of NULL

Leading-zero key autodetected as INTEGER

Big integer can't fit BigQuery INTEGER (INT64)

Date / timestamp loads as STRING

Autodetect picks the wrong type from a sparse column

Dotted header becomes a flat field, not a RECORD

Field name illegal for the warehouse

Free tier file/row cap on a warehouse-scale CSV

Frequently asked questions

Why NDJSON instead of just loading the CSV into BigQuery?

Which output mode do I pick for a warehouse load?

How do I make blank cells load as NULL?

Will my date columns load as DATE/TIMESTAMP?

What happens to very large integer IDs?

Is the data uploaded anywhere during conversion?

Can I load nested STRUCT / RECORD data this way?

How large a CSV can I convert?

Can I generate a BigQuery schema from the data too?

My CSV is semicolon-delimited (EU locale) — will it still work?

Can I automate CSV→NDJSON in an ELT pipeline?

How do I go from warehouse JSON back to CSV for a report?

Privacy first

Related guides