Diff A/B Test JSON Data to Compare Experiment Variants

How to compare A/B test JSON data across experiment variants

Step 1
Export the control variant payload — Capture the JSON the control bucket produces: the feature-flag state object, the rendered page-config, or the API response served with the experiment flag off (or with the control header). Copy it.
Step 2
Export the treatment variant payload the same way — Capture the equivalent JSON for the treatment bucket — same endpoint, same export path, only the variant assignment different. Consistency between the two captures is what makes the diff meaningful.
Step 3
Paste control on the left — Paste the control (variant A) payload into JSON A (base). It must be a single valid JSON value.
Step 4
Paste treatment on the right — Paste the treatment (variant B) payload into JSON B (modified). Both panels are required before the tool will run.
Step 5
Click Compare and verify only the intended delta exists — Read the diff. The entries should match exactly the field(s) your experiment is testing. Anything else — an extra changed, an added, a removed — is a confound to investigate before launch.
Step 6
Record the delta in the experiment brief — Use Copy JSON to capture the diff and paste it into the experiment doc as the 'configuration change' section, so reviewers can confirm the setup varies only what was intended and approve the launch.

Reading the variant diff for confounds

How each diff entry maps to experiment validity. The verdict assumes you know which single field the experiment is meant to vary.

Diff entry	Means	If it's the tested field	If it's anything else
`changed`	Same key, different value between variants	Good — this is the treatment	Confound — an unintended difference that can bias results
`added` (in B / treatment only)	Treatment has a key control lacks	OK if the experiment adds a module/field	Confound — extra surface only treatment users see
`removed` (in A / control only)	Control has a key treatment lacks	OK if the experiment hides something	Confound — control users get something treatment doesn't
`changed` (type flip)	`"true"` vs `true` on a flag	Almost never intended	Bug — mis-typed flag can mis-bucket or break the variant
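
The table above can be applied mechanically once you know which path(s) the experiment is meant to vary. A minimal sketch, assuming each diff entry is a dict with hypothetical `op` and `path` keys (the tool's actual Copy JSON shape may differ):

```python
def find_confounds(entries, tested_paths):
    """Split diff entries into intended changes and confounds to investigate."""
    intended, confounds = [], []
    for entry in entries:
        # An entry on a tested path is the treatment; anything else needs explaining.
        if entry["path"] in tested_paths:
            intended.append(entry)
        else:
            confounds.append(entry)
    return intended, confounds


# Hypothetical entry list, mirroring the "second flag leaked" case.
entries = [
    {"op": "changed", "path": "flags.newCheckout"},
    {"op": "added", "path": "flags.betaSearch"},
]
intended, confounds = find_confounds(entries, {"flags.newCheckout"})
for entry in confounds:
    print("investigate:", entry["op"], entry["path"])
```

An empty `confounds` list is the condition you want before launch; this check does not decide whether an `added` or `removed` entry is acceptable, only whether it was declared.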

Common variant comparisons and what the diff shows

Verified against the diff engine. Paths use dot notation for keys, bracket notation for ordered lists.

Comparison	Control → Treatment	Diff output
Single flag flipped (clean)	`{"newCheckout":false}` → `{"newCheckout":true}`	`changed newCheckout - false + true` — only the tested flag
Confound: extra flag rode along	`{"f1":false}` → `{"f1":true,"f2":true}`	`changed f1` + `added f2` — `f2` is the confound
Price test with leaked region drift	`{"price":10,"region":"eu"}` → `{"price":12,"region":"us"}`	`changed price` + `changed region` — region is the confound
Reordered UI module list	`["hero","grid"]` → `["grid","hero"]`	`changed [0]` + `changed [1]` — array compared by index
Type-mistyped flag	`{"on":true}` → `{"on":"true"}`	`changed on - true + "true"` — a bug, not a treatment
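
To make the rows above concrete, here is a rough Python sketch of a diff that behaves the way this page describes: objects compared by key, arrays by index, dot paths for keys and bracket paths for list positions. It is an illustration under those assumptions, not the tool's engine; the `diff_json` name and the entry shape are made up for the example.

```python
from typing import Any


def _same_scalar(a: Any, b: Any) -> bool:
    # bool is an int subclass in Python (True == 1), so keep booleans distinct.
    if isinstance(a, bool) or isinstance(b, bool):
        return type(a) is type(b) and a == b
    return a == b


def diff_json(a: Any, b: Any, path: str = "") -> list[dict]:
    """Return entries describing how b (treatment) differs from a (control)."""
    entries: list[dict] = []
    if isinstance(a, dict) and isinstance(b, dict):
        # Objects: compared by key, so key order never matters.
        for key in a:
            child = f"{path}.{key}" if path else key
            if key not in b:
                entries.append({"op": "removed", "path": child, "from": a[key]})
            else:
                entries.extend(diff_json(a[key], b[key], child))
        for key in b:
            if key not in a:
                child = f"{path}.{key}" if path else key
                entries.append({"op": "added", "path": child, "to": b[key]})
    elif isinstance(a, list) and isinstance(b, list):
        # Arrays: compared by index, so a reorder shows up as changes.
        for i in range(max(len(a), len(b))):
            child = f"{path}[{i}]"
            if i >= len(b):
                entries.append({"op": "removed", "path": child, "from": a[i]})
            elif i >= len(a):
                entries.append({"op": "added", "path": child, "to": b[i]})
            else:
                entries.extend(diff_json(a[i], b[i], child))
    elif not _same_scalar(a, b):
        entries.append({"op": "changed", "path": path, "from": a, "to": b})
    return entries


for entry in diff_json({"f1": False}, {"f1": True, "f2": True}):
    print(entry["op"], entry["path"])
```

Running it on the "extra flag rode along" row prints a `changed` entry for `f1` followed by an `added` entry for `f2`.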

Cookbook

Anonymised control-vs-treatment payloads. Left is control (variant A), right is treatment (variant B). The diff is what you'd attach to the experiment brief.

A clean single-flag experiment

Example

The experiment tests one flag. The diff shows exactly one changed entry — proof the variants differ only in the tested dimension.

JSON A (control):              JSON B (treatment):
{                              {
  "flags": {                     "flags": {
    "newCheckout": false,          "newCheckout": true,
    "darkMode": false              "darkMode": false
  }                              }
}                              }

Diff:
~1 changed
changed  flags.newCheckout
  - false
  + true
→ Only the tested flag differs. Clean experiment.

Confound caught: a second flag leaked into treatment

Example

Treatment accidentally enabled an unrelated flag. The added entry is the confound that would have muddied the result.

JSON A (control):              JSON B (treatment):
{                              {
  "flags": {                     "flags": {
    "newCheckout": false           "newCheckout": true,
  }                                "betaSearch": true
}                                }
                               }

Diff:
~1 changed  +1 added
changed  flags.newCheckout  - false + true
added    flags.betaSearch   + true
→ betaSearch is a confound — remove it from treatment.

Price test with a leaked region difference

Example

The experiment tests price, but treatment also drifted region — which changes tax and shipping and would confound the conversion result.

JSON A (control):              JSON B (treatment):
{                              {
  "price": 10,                   "price": 12,
  "region": "eu-west-1"          "region": "us-east-1"
}                              }

Diff:
~2 changed
changed  price   - 10  + 12
changed  region  - "eu-west-1" + "us-east-1"
→ region is a confound; align both variants to one region.

A mistyped flag that would mis-bucket users

Example

Treatment stores the flag as the string "true" instead of boolean true. Some flag SDKs coerce the string inconsistently, so the flag may not evaluate the way you expect. The diff exposes the type flip.

JSON A (control):              JSON B (treatment):
{ "experimentOn": true }      { "experimentOn": "true" }

Diff:
~1 changed
changed  experimentOn
  - true
  + "true"
→ Type flipped to string — fix before launch.

Confirming variants are otherwise identical

Example

After fixing a confound, re-diff to confirm the only remaining difference is the intended one — or, for two payloads that should match, that they're identical.

JSON A (control):              JSON B (treatment):
{ "layout": "grid",           { "layout": "grid",
  "items": 12 }                  "items": 12 }

Diff:
No differences
→ The two JSON values are identical.
→ No unintended delta between these two payloads.

Errors and edge cases

Errors and silent failures you can hit when diffing variant payloads. Match the behaviour you see to the entry below and apply the fix it describes.

A mistyped flag is a `changed` entry, not a type error

By design

true (boolean) vs "true" (string) on a flag is caught — as a changed entry whose from/to reveal the type flip. There's no special type label; the value pair is the evidence. For experiments this matters because some flag SDKs coerce the string inconsistently, so a mistyped flag can silently mis-bucket users.
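
Since the diff gives no type label, you can add one yourself when reviewing entries. A small sketch, assuming you have the from/to values of a `changed` entry as parsed JSON; the helper names are hypothetical:

```python
import json


def json_type(value) -> str:
    """Name the JSON type of a parsed value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def is_type_flip(before, after) -> bool:
    """True when a changed entry switches JSON type, not just value."""
    return json_type(before) != json_type(after)


control = json.loads('{"experimentOn": true}')
treatment = json.loads('{"experimentOn": "true"}')
print(is_type_flip(control["experimentOn"], treatment["experimentOn"]))  # True
```

A flip from `false` to `true` is not flagged, because both sides are booleans; only the boolean-to-string case above is.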

Reordered module/rule list shows index-wise changes

Positional by design

An ordered list of UI modules or targeting rules is compared by index, so the same items in a different order produce changed [0], changed [1], etc. If order is part of the variant (it often is for UI), that's correct; if order is incidental, sort both lists the same way before pasting so only real differences surface.
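
If order is incidental, one way to pre-sort before pasting is sketched below. It assumes the list sits under a known top-level key (`modules` is a made-up name) and sorts items by their JSON text so that objects sort as well as strings.

```python
import json


def sort_list_at(payload: dict, key: str) -> dict:
    """Return a copy of payload with the list under `key` in a stable order."""
    copy = dict(payload)
    copy[key] = sorted(payload[key], key=lambda item: json.dumps(item, sort_keys=True))
    return copy


control = {"modules": ["hero", "grid"]}
treatment = {"modules": ["grid", "hero"]}

print(control == treatment)  # False
print(sort_list_at(control, "modules") == sort_list_at(treatment, "modules"))  # True
```

Only do this when the order really is incidental; for a layout test, sorting would hide the very difference the experiment is about.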

An added/removed key is a confound, not just noise

Investigate

Any added key (treatment-only) or removed key (control-only) beyond the field under test means one bucket has surface the other doesn't — a classic confound. Read every added/removed entry as a question: 'is this part of the experiment, or did it leak in?' Don't launch until each is explained.

Variant payload wrapped in an envelope won't parse

Parse error

Both inputs go through strict JSON.parse. A capture that includes an SDK envelope, log prefix, or trailing commas throws a parser error. Paste only the JSON payload; repair near-JSON first with /tool/json-format-fixer.
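
For the log-prefix case, the payload can often be pulled out before pasting. A minimal sketch using Python's standard `json.JSONDecoder.raw_decode`; the log line is invented for illustration, and the helper will not repair trailing commas or cope with a prefix that itself contains `{` or `[`.

```python
import json


def extract_json(capture: str):
    """Parse the first JSON object or array embedded in a captured string."""
    starts = [i for i in (capture.find("{"), capture.find("[")) if i != -1]
    if not starts:
        raise ValueError("no JSON object or array found in capture")
    # raw_decode parses one value starting at the index and ignores what follows.
    value, _end = json.JSONDecoder().raw_decode(capture, min(starts))
    return value


raw = 'INFO variant payload: {"flags": {"newCheckout": true}}'

try:
    json.loads(raw)  # strict parsing rejects the prefix
except json.JSONDecodeError as err:
    print("strict parse failed:", err.msg)

print(json.dumps(extract_json(raw)))  # {"flags": {"newCheckout": true}}
```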

Only one variant pasted

Invalid

Both JSON A and JSON B are required; an empty or whitespace-only side returns "Please provide both JSON A and JSON B." Capture both the control and treatment payloads before comparing.

Payload over 2 MB per side on the free plan

Upgrade required

Each pasted payload is capped at 2 MB on the free plan; anything larger returns "Free plan supports JSON inputs up to 2 MB. Upgrade to Pro for unlimited input size." Extract just the experiment-relevant sub-tree with /tool/json-path-extractor and diff that, or upgrade to Pro.
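
You can check a payload's size and trim it before pasting. A sketch under two assumptions: that "2 MB" means 2 * 1024 * 1024 UTF-8 bytes (the tool may count differently), and that the experiment data lives under a known path (the `page` and `experiment` keys are invented for the example).

```python
import json

# Assumption: the cap is measured in UTF-8 bytes and 1 MB = 1024 * 1024 bytes.
FREE_LIMIT_BYTES = 2 * 1024 * 1024


def payload_bytes(text: str) -> int:
    """Size of the pasted text once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def subtree(value, steps):
    """Follow a list of keys and indices down into a parsed payload."""
    for step in steps:
        value = value[step]
    return value


page_config = {"page": {"theme": "light", "experiment": {"newCheckout": True}}}
text = json.dumps(page_config)
print(payload_bytes(text) <= FREE_LIMIT_BYTES)  # True

# Keep only the experiment-relevant part and diff that instead.
print(json.dumps(subtree(page_config, ["page", "experiment"])))  # {"newCheckout": true}
```

Extract the same path from both variants so the two trimmed payloads stay comparable.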

Key order differs between variant generators

Supported

Objects are compared by key, not position, so if control and treatment are emitted by different code paths that order keys differently, the diff stays clean. Only genuine field differences appear — no phantom confounds from serialization order.
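
The same property is easy to confirm outside the tool. In this short Python check, two payloads with the same fields in a different key order parse to equal objects, which is the behaviour the diff relies on:

```python
import json

control = json.loads('{"price": 10, "region": "eu"}')
treatment = json.loads('{"region": "eu", "price": 10}')

# Parsed objects compare by key and value, not by the order keys were written in.
print(control == treatment)  # True
```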

`null` flag value vs absent flag

Removed / added

A flag set to null in one variant and absent in the other is reported as a removed or added entry whose value is null, because null counts as a present value. If your flag system treats null as 'unset / default', read these entries as semantically equal rather than as a confound.
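
If your flag system does treat null as unset, you can normalise both payloads before pasting so these entries disappear. A minimal sketch; `drop_nulls` is a hypothetical helper, and it leaves nulls inside arrays alone because removing them would shift the indexes the diff compares by.

```python
def drop_nulls(value):
    """Remove object keys whose value is null, at every nesting level."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # Keep list positions intact; only clean objects inside the list.
        return [drop_nulls(v) for v in value]
    return value


control = {"flags": {"newCheckout": False, "betaSearch": None}}
treatment = {"flags": {"newCheckout": True}}

print(drop_nulls(control))  # {'flags': {'newCheckout': False}}
print(drop_nulls(treatment))  # {'flags': {'newCheckout': True}}
```

After normalising, only `flags.newCheckout` differs between the two payloads.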

Frequently asked questions

What is a confound in an A/B test configuration?

A confound is any unintended difference between variants that could influence behaviour independently of the treatment. If treatment differs from control in both button colour and, say, response region, you can't attribute a conversion lift to colour alone. This tool lists every difference between the two payloads, so confounds show up as extra added/removed/changed entries beyond the field you meant to test — before the experiment runs.

How do I compare the API responses that each variant produces?

Make the call once with the experiment flag in the control state and once in the treatment state (or with the relevant variant header), copy each response body, and paste control on the left and treatment on the right. The diff shows which data fields differ between variants, confirming the data-layer change is limited to what the experiment intends.

Why does a reordered list of modules show as changes?

Ordered arrays are compared by index, so the same modules in a different order produce changed [0], changed [1], etc. If the order is itself part of the variant (a reordered layout test), that's the signal you want. If order is incidental, sort both arrays the same way before pasting so only true differences appear.

How do I catch a flag that's set to the wrong type?

The diff fires on serialized inequality, so a flag stored as "true" (string) in one variant and true (boolean) in the other shows as a changed entry exposing both values. This matters because some flag SDKs coerce the string inconsistently, which can mis-bucket users or break the variant in subtle ways.

Does object key order between variants cause false confounds?

No. Objects are compared by key, not position, so two payloads emitted by different code paths that order keys differently still diff clean. You won't see a phantom confound just because the serializer differs between buckets.

Can I diff more than two variants at once?

Not in one pass — the tool compares exactly two payloads. For a multi-arm experiment, diff each treatment against the control separately (B vs A, C vs A, D vs A). That isolates each arm's delta and keeps the confound check clean per arm.
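
The per-arm loop looks roughly like this. The `changed_keys` helper is a deliberately flat, top-level stand-in for the tool's diff, and the arm payloads are invented; the point is only the shape of "each arm against control".

```python
def changed_keys(control: dict, treatment: dict) -> dict:
    """Top-level stand-in for a diff: which keys changed, were added, or removed."""
    shared = control.keys() & treatment.keys()
    return {
        "changed": sorted(k for k in shared if control[k] != treatment[k]),
        "added": sorted(treatment.keys() - control.keys()),
        "removed": sorted(control.keys() - treatment.keys()),
    }


control = {"cta": "Buy", "price": 10}
arms = {
    "B": {"cta": "Buy now", "price": 10},
    "C": {"cta": "Buy", "price": 12, "badge": "sale"},
}

# One comparison per arm, always against the same control.
report = {name: changed_keys(control, arm) for name, arm in arms.items()}
for name, delta in report.items():
    print(name, delta)
```

Here arm B differs only in `cta`, while arm C changes `price` and also adds `badge`, which is the entry to question.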

How do I document the variant delta for reviewers?

Run the diff once the variants are final, hit Copy JSON, and paste the entry list into the experiment brief's 'configuration change' section. Reviewers can then confirm the delta is exactly the intended field(s) and nothing else before approving the launch.

Can I upload the variant config files?

No — the tool is paste-only, with a control panel and a treatment panel. Copy each payload and paste it. There's no file picker, and since everything runs locally, your experiment configs and targeting rules never leave the browser.

What size payload can I compare for free?

Up to 2 MB per side on the free plan. Full rendered page-configs can exceed that, so extract just the experiment-relevant sub-tree with /tool/json-path-extractor and diff that. Pro removes the cap entirely.

Should an added field in treatment always block the experiment?

Only if it's not part of the experiment. If the treatment is meant to add a module or field, the added entry is expected. If you didn't intend it, it's a confound — treatment users get surface control users don't, which can bias the metric. Explain every added/removed entry before launch.

Is the experiment configuration data transmitted to JAD Apps?

No. Both payloads are parsed and diffed in your browser. A/B test configs, feature-flag states, and targeting rules are never sent to a server. Clicking Compare triggers no network request — verifiable in your DevTools network tab.

How do I make order-insensitive arrays diff cleanly between variants?

The diff tool doesn't sort for you. Pre-process each variant so set-like arrays are in the same stable order — flatten and sort with /tool/json-flattener, or reshape with /tool/json-transposer into a keyed structure — then diff. Once both sides share an order, index-based comparison won't flag a pure reorder as a difference.
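
The keyed-structure option can be sketched in a few lines of Python. It assumes every item carries a unique identifying field; `id` and the rule payloads are hypothetical.

```python
def key_by(items: list[dict], field: str) -> dict:
    """Turn a list of objects into an object keyed by one of their fields."""
    keyed = {}
    for item in items:
        if item[field] in keyed:
            raise ValueError(f"duplicate {field}: {item[field]!r}")
        keyed[item[field]] = item
    return keyed


control_rules = [{"id": "geo", "pct": 50}, {"id": "device", "pct": 100}]
treatment_rules = [{"id": "device", "pct": 100}, {"id": "geo", "pct": 50}]

print(control_rules == treatment_rules)  # False
print(key_by(control_rules, "id") == key_by(treatment_rules, "id"))  # True
```

Once keyed, the payloads are compared by key like any other object, so a pure reorder no longer registers.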

Privacy first

The comparison runs locally in your browser. No file is uploaded; only metadata counters are saved for signed-in dashboard stats.

