Generate Analytics Schema SQL Seeds from JSON Events

How to convert json event data to analytics schema sql

Step 1
Export events as a JSON array — You need a top-level array of event objects: [{"event":"page_view",...},{...}]. Segment/Mixpanel exports and GA4 BigQuery extracts often nest events under a key — pull the array out with json-path-extractor ($.events[*]) first.
Step 2
Flatten nested properties if you want flat columns — Event properties/event_params are stored as one TEXT column unless you flatten them. For a true fact-table shape with one column per property, run the array through json-flattener so properties.plan becomes a top-level properties_plan key first.
Step 3
Drop the file on the converter — Drag the .json file onto the dropzone. .xlsx/.xls/.ods work too (first sheet → JSON array). A raw .csv export needs csv-to-json first.
Step 4
Name the fact table and pick a dialect — Set Table name (e.g. fct_events) and choose PostgreSQL for a Postgres/Redshift/Snowflake stand-in, MySQL, or SQLite for a quick local file. Postgres double-quoting is the safest default for warehouse-style identifiers.
Step 5
Exclude PII and high-cardinality fields — List fields to drop in Exclude columns, e.g. ip, anonymous_id, user_agent. They're removed from the schema and every row — useful for keeping a shareable test seed free of identifiers.
Step 6
Generate, review types, and load — Click Generate SQL. Review inferred types — a revenue field shows as DOUBLE PRECISION/REAL; tighten to NUMERIC for your real warehouse. Copy or Download SQL and run against your local warehouse or dev Postgres. Stats report event (row) count and statement count.

What this tool does (and doesn't) for analytics data

Grounded in the actual converter behaviour, not warehouse-specific features it doesn't have.

Capability	Behaviour here	If you need more
Dialects	PostgreSQL, MySQL, SQLite only	No BigQuery/Redshift/Snowflake dialect; use Postgres output as the closest base
Nested `properties` / `event_params`	Stored as one JSON-stringified `TEXT` column	json-flattener to get one column per property
Sparse fields across events	Union of all keys; missing → `NULL`	Built in — no action needed
Partitioning / clustering DDL	Not generated (columns + types only)	Add `PARTITION BY`/`CLUSTER BY` to the DDL manually
PII / high-cardinality fields	Removed via Exclude columns	List them comma-separated before generating
Upsert / MERGE	Only ON CONFLICT DO NOTHING / INSERT IGNORE (skip)	Hand-edit for MERGE/upsert

Event-field type inference by dialect

Inferred per field from all non-null values across events. One off-type value loosens the column type.

Event field example	PostgreSQL	MySQL	SQLite
`count` (integers)	`INTEGER`	`INTEGER`	`INTEGER`
`revenue` (decimals)	`DOUBLE PRECISION`	`REAL`	`REAL`
`is_new_user` (bool)	`BOOLEAN` (`TRUE`/`FALSE`)	`TINYINT(1)` (`1`/`0`)	`BOOLEAN` (`TRUE`/`FALSE`)
`event` / `timestamp` (string)	`TEXT`	`TEXT`	`TEXT`
`properties` (object)	`TEXT` (stringified)	`TEXT` (stringified)	`TEXT` (stringified)

Free vs Pro limits

JSON to SQL is a Pro tool; limits come from the json family in the tier table.

Limit	Free	Pro
Max input file size	2 MB	100 MB
Files per run	1	10
Rows per INSERT batch	100 (fixed)	100 (fixed)

Cookbook

Real event payloads in, warehouse-seed SQL out — copied from the converter so you can see how event fields, nested properties, and dialect differences resolve.

Flat event seed for a fact table

Example

A page_view event array with flat fields. Counts infer as INTEGER, the boolean as BOOLEAN, strings as TEXT — a clean fact-table seed.

Input:
[
  {"event": "page_view", "user_id": "u1", "is_new": true, "duration": 12},
  {"event": "page_view", "user_id": "u2", "is_new": false, "duration": 47}
]

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "user_id" TEXT,
  "is_new" BOOLEAN,
  "duration" INTEGER
);

INSERT INTO "fct_events" ("event", "user_id", "is_new", "duration")
VALUES
  ('page_view', 'u1', TRUE, 12),
  ('page_view', 'u2', FALSE, 47);

Nested properties become one TEXT column

Example

A Segment-style event with a properties object. The object is NOT exploded into columns — it lands as a single stringified TEXT value.

Input:
[{"event": "order_completed", "properties": {"total": 89.9, "items": 3}}]

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "properties" TEXT
);

INSERT INTO "fct_events" ("event", "properties")
VALUES
  ('order_completed', '{"total":89.9,"items":3}');

-- For separate total/items columns, run json-flattener first.

Sparse events: key union with NULLs

Example

Different event types carry different fields. The column set is the union of all keys; events missing a field get NULL — so no field is silently dropped.

Input:
[
  {"event": "signup", "plan": "pro"},
  {"event": "page_view", "path": "/home"}
]

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "plan" TEXT,
  "path" TEXT
);

INSERT INTO "fct_events" ("event", "plan", "path")
VALUES
  ('signup', 'pro', NULL),
  ('page_view', NULL, '/home');

Excluding PII before seeding a shared warehouse

Example

Drop identifiers and raw user-agent strings so a test seed is safe to commit or share.

Input:
[{"event": "click", "ip": "203.0.113.4", "anonymous_id": "a-99", "button": "buy"}]

Config: Exclude columns = ip, anonymous_id

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "button" TEXT
);

INSERT INTO "fct_events" ("event", "button")
VALUES
  ('click', 'buy');

Redshift/Snowflake target via Postgres output

Example

There's no dedicated cloud-warehouse dialect. Postgres double-quoted output runs almost unchanged on Redshift and Snowflake; just plan the JSON-stringified columns as SUPER (Redshift) or VARIANT (Snowflake).

Config: dialect = PostgreSQL

Output:
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "revenue" DOUBLE PRECISION,
  "properties" TEXT
);

INSERT INTO "fct_events" ("event", "revenue", "properties")
VALUES
  ('purchase', 19.99, '{"sku":"A1"}');

-- Redshift: change properties TEXT -> SUPER, DOUBLE PRECISION
-- is native. Snowflake: TEXT -> VARIANT, FLOAT for revenue.

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

No BigQuery / Redshift / Snowflake dialect

Not supported

Only Postgres, MySQL, and SQLite are offered. There is no cloud-warehouse dialect, so you won't get STRUCT, SUPER, VARIANT, or PARTITION BY syntax. Use the Postgres output as a base and adapt the JSON-stringified columns to your warehouse's semi-structured type by hand.

Nested event properties not flattened

By design

properties, event_params, context, and similar nested blocks become a single JSON-stringified TEXT column, not separate columns. For a flat fact-table shape, run the events through json-flattener first so each nested key is promoted to a top-level column.

GA4 event_params array stored as a string

By design

GA4's event_params is an array of key/value objects. As an array, it is JSON.stringify-ed into one TEXT column — you won't get a column per param. Flatten or pre-process the GA4 export into a flat shape before converting if you need queryable param columns.

Revenue/amount columns inferred as float

Expected

Decimal revenue fields type as DOUBLE PRECISION (Postgres) or REAL (MySQL/SQLite) — binary floats that round. For accurate financial aggregation in a warehouse, change the generated type to NUMERIC/DECIMAL, or store amounts as integer cents in the source events.

High-cardinality timestamp strings as TEXT

Expected

ISO timestamps are strings, so they become TEXT. That's fine for a seed, but for a real fact table you'll want TIMESTAMP/TIMESTAMPTZ. Edit the column type in the generated DDL, or convert timestamps to epoch integers upstream.

Mixed numeric event field widens to float

Expected

If a field is 1 in one event and 1.5 in another, the column types as float, not INTEGER. With heterogeneous event data this happens often — review the inferred types before relying on them for aggregation.

Events wrapped under a key

Invalid input

The top level must be the events array. A Segment/GA4 export shaped as {"events":[...]} won't convert directly — extract the inner array with json-path-extractor ($.events[*]) first.

Large event export over the size limit

Upgrade required

Free tier caps input at 2 MB, Pro at 100 MB. Event exports are big — sample a representative subset for a seed, or split the array into files and load them in passes. A seed rarely needs the full event volume.

No deduplication of duplicate events

Expected

Every event object becomes a row; duplicate events (common in at-least-once delivery from Segment/Kafka) are all inserted. Dedupe upstream, or enable ON CONFLICT DO NOTHING with a unique constraint on a message-id column in your target table.

Frequently asked questions

Can this output BigQuery / Redshift / Snowflake SQL?

Not as a dedicated dialect — the only options are PostgreSQL, MySQL, and SQLite. The Postgres output is the closest fit: it runs nearly unchanged on Redshift and Snowflake. You'll typically swap the JSON-stringified TEXT columns to SUPER (Redshift) or VARIANT (Snowflake), and tighten floats to NUMERIC for revenue.

Why isn't my event's properties object split into columns?

Nested objects and arrays are JSON-stringified into a single TEXT column — the tool doesn't flatten them. For one column per property, run your event array through json-flattener first; it promotes properties.plan to a top-level properties_plan key, which then becomes its own column here.

How does it handle events with different fields?

It takes the union of all keys across every event, so a field present in only some events still becomes a column — events lacking it get NULL. Nothing is silently dropped, which is important when event types in one stream carry different property sets.

Will it dedupe duplicate events?

No. Every object in the array becomes a row, including duplicates from at-least-once delivery. Dedupe upstream, or add a unique constraint on a message-id column in your target table and enable ON CONFLICT DO NOTHING (Postgres/SQLite) / INSERT IGNORE (MySQL) so re-inserts are skipped.

Can I strip PII before generating the seed?

Yes. List fields like ip, anonymous_id, email, or user_agent in Exclude columns and they're removed from both the schema and every row — handy for producing a test seed that's safe to commit or share. For deeper anonymisation of values you keep, see json-anonymizer.

Why is my revenue column a float?

Decimal values infer as DOUBLE PRECISION (Postgres) or REAL (MySQL/SQLite). Floats round, which is bad for financial aggregation. Change the type to NUMERIC/DECIMAL in the generated DDL, or store amounts as integer cents in your events so the column infers as INTEGER.

Does it generate partitioning or clustering DDL?

No. The CREATE TABLE has columns and inferred types only — no PARTITION BY, CLUSTER BY, DISTKEY, or SORTKEY. Add those for your warehouse manually, or use the converter for INSERT-only output (turn off Include CREATE TABLE) against a table you defined with the right physical layout.

How are timestamps handled?

ISO timestamp strings become TEXT columns, single-quoted. For a real fact table you'll usually want TIMESTAMP/TIMESTAMPTZ — edit the generated column type, or convert timestamps to epoch integers in the source events so they infer as INTEGER.

How many events go into each INSERT?

100 rows per multi-row INSERT ... VALUES, fixed. A 1,000-event array becomes ten INSERT statements plus the CREATE TABLE. The stats line shows the exact statement count after generating.

Is my event data uploaded?

No. Conversion runs entirely in your browser, so event exports — including user identifiers — never reach a server. Only an anonymous run counter (no content) is logged for dashboard stats. This makes it safe to test with real exports.

How big an event export can I convert?

Free tier caps input at 2 MB, Pro at 100 MB. Full event exports often exceed that, but a seed rarely needs the whole volume — sample a representative subset, or split the array and load it in passes. The downstream warehouse, not this tool, is the real ceiling for production loads.

Can I seed multiple fact/dimension tables at once?

No — one run, one table. Split your JSON by entity (events vs users vs sessions), run the converter once per table changing the Table name each time, and concatenate the resulting .sql files in dependency order before loading.

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

How to convert json event data to analytics schema sql

Step 1
Export events as a JSON array — You need a top-level array of event objects: [{"event":"page_view",...},{...}]. Segment/Mixpanel exports and GA4 BigQuery extracts often nest events under a key — pull the array out with json-path-extractor ($.events[*]) first.
Step 2
Flatten nested properties if you want flat columns — Event properties/event_params are stored as one TEXT column unless you flatten them. For a true fact-table shape with one column per property, run the array through json-flattener so properties.plan becomes a top-level properties_plan key first.
Step 3
Drop the file on the converter — Drag the .json file onto the dropzone. .xlsx/.xls/.ods work too (first sheet → JSON array). A raw .csv export needs csv-to-json first.
Step 4
Name the fact table and pick a dialect — Set Table name (e.g. fct_events) and choose PostgreSQL for a Postgres/Redshift/Snowflake stand-in, MySQL, or SQLite for a quick local file. Postgres double-quoting is the safest default for warehouse-style identifiers.
Step 5
Exclude PII and high-cardinality fields — List fields to drop in Exclude columns, e.g. ip, anonymous_id, user_agent. They're removed from the schema and every row — useful for keeping a shareable test seed free of identifiers.
Step 6
Generate, review types, and load — Click Generate SQL. Review inferred types — a revenue field shows as DOUBLE PRECISION/REAL; tighten to NUMERIC for your real warehouse. Copy or Download SQL and run against your local warehouse or dev Postgres. Stats report event (row) count and statement count.

What this tool does (and doesn't) for analytics data

Grounded in the actual converter behaviour, not warehouse-specific features it doesn't have.

Capability	Behaviour here	If you need more
Dialects	PostgreSQL, MySQL, SQLite only	No BigQuery/Redshift/Snowflake dialect; use Postgres output as the closest base
Nested `properties` / `event_params`	Stored as one JSON-stringified `TEXT` column	json-flattener to get one column per property
Sparse fields across events	Union of all keys; missing → `NULL`	Built in — no action needed
Partitioning / clustering DDL	Not generated (columns + types only)	Add `PARTITION BY`/`CLUSTER BY` to the DDL manually
PII / high-cardinality fields	Removed via Exclude columns	List them comma-separated before generating
Upsert / MERGE	Only ON CONFLICT DO NOTHING / INSERT IGNORE (skip)	Hand-edit for MERGE/upsert

Event-field type inference by dialect

Inferred per field from all non-null values across events. One off-type value loosens the column type.

Event field example	PostgreSQL	MySQL	SQLite
`count` (integers)	`INTEGER`	`INTEGER`	`INTEGER`
`revenue` (decimals)	`DOUBLE PRECISION`	`REAL`	`REAL`
`is_new_user` (bool)	`BOOLEAN` (`TRUE`/`FALSE`)	`TINYINT(1)` (`1`/`0`)	`BOOLEAN` (`TRUE`/`FALSE`)
`event` / `timestamp` (string)	`TEXT`	`TEXT`	`TEXT`
`properties` (object)	`TEXT` (stringified)	`TEXT` (stringified)	`TEXT` (stringified)

Free vs Pro limits

JSON to SQL is a Pro tool; limits come from the json family in the tier table.

Limit	Free	Pro
Max input file size	2 MB	100 MB
Files per run	1	10
Rows per INSERT batch	100 (fixed)	100 (fixed)

Cookbook

Real event payloads in, warehouse-seed SQL out — copied from the converter so you can see how event fields, nested properties, and dialect differences resolve.

Flat event seed for a fact table

Example

A page_view event array with flat fields. Counts infer as INTEGER, the boolean as BOOLEAN, strings as TEXT — a clean fact-table seed.

Input:
[
  {"event": "page_view", "user_id": "u1", "is_new": true, "duration": 12},
  {"event": "page_view", "user_id": "u2", "is_new": false, "duration": 47}
]

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "user_id" TEXT,
  "is_new" BOOLEAN,
  "duration" INTEGER
);

INSERT INTO "fct_events" ("event", "user_id", "is_new", "duration")
VALUES
  ('page_view', 'u1', TRUE, 12),
  ('page_view', 'u2', FALSE, 47);

Nested properties become one TEXT column

Example

A Segment-style event with a properties object. The object is NOT exploded into columns — it lands as a single stringified TEXT value.

Input:
[{"event": "order_completed", "properties": {"total": 89.9, "items": 3}}]

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "properties" TEXT
);

INSERT INTO "fct_events" ("event", "properties")
VALUES
  ('order_completed', '{"total":89.9,"items":3}');

-- For separate total/items columns, run json-flattener first.

Sparse events: key union with NULLs

Example

Different event types carry different fields. The column set is the union of all keys; events missing a field get NULL — so no field is silently dropped.

Input:
[
  {"event": "signup", "plan": "pro"},
  {"event": "page_view", "path": "/home"}
]

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "plan" TEXT,
  "path" TEXT
);

INSERT INTO "fct_events" ("event", "plan", "path")
VALUES
  ('signup', 'pro', NULL),
  ('page_view', NULL, '/home');

Excluding PII before seeding a shared warehouse

Example

Drop identifiers and raw user-agent strings so a test seed is safe to commit or share.

Input:
[{"event": "click", "ip": "203.0.113.4", "anonymous_id": "a-99", "button": "buy"}]

Config: Exclude columns = ip, anonymous_id

Output (PostgreSQL):
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "button" TEXT
);

INSERT INTO "fct_events" ("event", "button")
VALUES
  ('click', 'buy');

Redshift/Snowflake target via Postgres output

Example

Config: dialect = PostgreSQL

Output:
CREATE TABLE IF NOT EXISTS "fct_events" (
  "event" TEXT,
  "revenue" DOUBLE PRECISION,
  "properties" TEXT
);

INSERT INTO "fct_events" ("event", "revenue", "properties")
VALUES
  ('purchase', 19.99, '{"sku":"A1"}');

-- Redshift: change properties TEXT -> SUPER, DOUBLE PRECISION
-- is native. Snowflake: TEXT -> VARIANT, FLOAT for revenue.

Errors and edge cases

Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.

No BigQuery / Redshift / Snowflake dialect

Not supported

Nested event properties not flattened

By design

GA4 event_params array stored as a string

By design

Revenue/amount columns inferred as float

Expected

High-cardinality timestamp strings as TEXT

Expected

Mixed numeric event field widens to float

Expected

Events wrapped under a key

Invalid input

The top level must be the events array. A Segment/GA4 export shaped as {"events":[...]} won't convert directly — extract the inner array with json-path-extractor ($.events[*]) first.

Large event export over the size limit

Upgrade required

No deduplication of duplicate events

Expected

Frequently asked questions

Can this output BigQuery / Redshift / Snowflake SQL?

Why isn't my event's properties object split into columns?

How does it handle events with different fields?

Will it dedupe duplicate events?

Can I strip PII before generating the seed?

Why is my revenue column a float?

Does it generate partitioning or clustering DDL?

How are timestamps handled?

How many events go into each INSERT?

100 rows per multi-row INSERT ... VALUES, fixed. A 1,000-event array becomes ten INSERT statements plus the CREATE TABLE. The stats line shows the exact statement count after generating.

Is my event data uploaded?

How big an event export can I convert?

Can I seed multiple fact/dimension tables at once?

Privacy first

Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.

Convert JSON Event Data to Analytics Schema SQL

How to convert json event data to analytics schema sql

What this tool does (and doesn't) for analytics data

Event-field type inference by dialect

Free vs Pro limits

Cookbook

Flat event seed for a fact table

Nested properties become one TEXT column

Sparse events: key union with NULLs

Excluding PII before seeding a shared warehouse

Redshift/Snowflake target via Postgres output

Errors and edge cases

No BigQuery / Redshift / Snowflake dialect

Nested event properties not flattened

GA4 event_params array stored as a string

Revenue/amount columns inferred as float

High-cardinality timestamp strings as TEXT

Mixed numeric event field widens to float

Events wrapped under a key

Large event export over the size limit

No deduplication of duplicate events

Frequently asked questions

Can this output BigQuery / Redshift / Snowflake SQL?

Why isn't my event's properties object split into columns?

How does it handle events with different fields?

Will it dedupe duplicate events?

Can I strip PII before generating the seed?

Why is my revenue column a float?

Does it generate partitioning or clustering DDL?

How are timestamps handled?

How many events go into each INSERT?

Is my event data uploaded?

How big an event export can I convert?

Can I seed multiple fact/dimension tables at once?

Privacy first

Related guides

Convert JSON Event Data to Analytics Schema SQL

How to convert json event data to analytics schema sql

What this tool does (and doesn't) for analytics data

Event-field type inference by dialect

Free vs Pro limits

Cookbook

Flat event seed for a fact table

Nested properties become one TEXT column

Sparse events: key union with NULLs

Excluding PII before seeding a shared warehouse

Redshift/Snowflake target via Postgres output

Errors and edge cases

No BigQuery / Redshift / Snowflake dialect

Nested event properties not flattened

GA4 event_params array stored as a string

Revenue/amount columns inferred as float

High-cardinality timestamp strings as TEXT

Mixed numeric event field widens to float

Events wrapped under a key

Large event export over the size limit

No deduplication of duplicate events

Frequently asked questions

Can this output BigQuery / Redshift / Snowflake SQL?

Why isn't my event's properties object split into columns?

How does it handle events with different fields?

Will it dedupe duplicate events?

Can I strip PII before generating the seed?

Why is my revenue column a float?

Does it generate partitioning or clustering DDL?

How are timestamps handled?

How many events go into each INSERT?

Is my event data uploaded?

How big an event export can I convert?

Can I seed multiple fact/dimension tables at once?

Privacy first

Related guides