How to privacy-scrub JSON events for analytics pipelines
- Step 1: Capture a sample event. Grab one real event your app sends (from the analytics debugger or network tab), save it as .json, and drop it on the dropzone. It is read in-browser; the event is never uploaded.
- Step 2: List the identifying keys. Type the PII field names into Keys (comma-separated), e.g. email, name, phone, ip, userId, deviceId. Decide per field whether a stable hashed id is acceptable to your privacy policy before removing it.
- Step 3: Keep Remove listed. Stay on the default Remove listed mode so the identifying keys you typed are the ones scrubbed and the analytic signal (event name, counts, non-PII properties) is preserved.
- Step 4: Leave Deep on. Keep Deep (all levels) ticked so identifiers are scrubbed from nested properties, traits, or user_properties blocks and from arrays of batched events, not just the top level.
- Step 5: Run and confirm. Click Filter Keys. The panel shows the scrubbed event and an "N keys removed" count; confirm it matches the identifiers you expected so nothing slips into the pipeline.
- Step 6: Reuse the scrubbed shape. Copy or download the scrubbed event as a fixture. If your policy needs a stable pseudonymous id instead of dropping the user entirely, follow up with the JSON Anonymizer to hash the value rather than remove the key.
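Steps 3 to 5 can be approximated in code. The sketch below is a minimal stand-in for Remove listed with Deep on, not the tool's actual implementation; the function name `scrub_keys` is hypothetical. It assumes exact, case-sensitive matching and counts every removed occurrence, as the steps describe.

```python
from __future__ import annotations

from typing import Any


def scrub_keys(data: Any, keys: set[str]) -> tuple[Any, int]:
    """Return a scrubbed copy of `data` and the number of key occurrences removed."""
    removed = 0

    def walk(node: Any) -> Any:
        nonlocal removed
        if isinstance(node, dict):
            kept = {}
            for key, value in node.items():
                if key in keys:  # exact, case-sensitive match
                    removed += 1
                else:
                    kept[key] = walk(value)
            return kept
        if isinstance(node, list):
            # Batched events: scrub every element of the array.
            return [walk(item) for item in node]
        return node

    return walk(data), removed


event = {
    "type": "identify",
    "userId": "u_91",
    "traits": {"email": "ada@x.com", "name": "Ada", "plan": "pro"},
}
scrubbed, count = scrub_keys(event, {"email", "name", "phone"})
print(scrubbed)  # {'type': 'identify', 'userId': 'u_91', 'traits': {'plan': 'pro'}}
print(count)     # 2
```

The count is 2 rather than 3 because phone is listed but absent from this event.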
Identifying keys by platform shape
Where PII tends to hide in events from common analytics SDKs. Deep on reaches all of these. Matching is exact and case-sensitive.
| Platform shape | Where PII nests | Keys to list | Note |
|---|---|---|---|
| Segment track/identify | properties, traits | email, name, phone, address | List the trait names your SDK sets. |
| Amplitude event | user_properties, event_properties | email, deviceId, ip | Amplitude auto-captures device/IP unless disabled. |
| Mixpanel event | properties (with $ prefixed keys) | $email, $name, $phone | Mixpanel reserved props start with $ — list them literally. |
| BigQuery row | top-level columns | email, user_ip, full_name | Snake-case column names must be listed exactly. |
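To see where listed keys actually nest in a captured event, it can help to print their paths before scrubbing. This is an illustrative sketch only; `find_key_paths` and its dotted path notation are assumptions, not something the tool outputs.

```python
from __future__ import annotations

from typing import Any


def find_key_paths(data: Any, keys: set[str], prefix: str = "") -> list[str]:
    """List the dotted paths at which any listed key occurs."""
    paths: list[str] = []
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else key
            if key in keys:
                paths.append(path)
            else:
                paths.extend(find_key_paths(value, keys, path))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            paths.extend(find_key_paths(item, keys, f"{prefix}[{index}]"))
    return paths


event = {
    "event": "Purchase",
    "properties": {"$email": "a@x.com", "$name": "Ada", "amount": 42},
}
print(find_key_paths(event, {"$email", "$name", "$phone"}))
# ['properties.$email', 'properties.$name']
```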
What scrubbing keys can and cannot do
Key removal is structural; it minimises PII fields but does not solve every privacy concern.
| Concern | Handled here? | Approach |
|---|---|---|
| PII in a dedicated property (email) | Yes, list the key | Remove listed + Deep on |
| PII inside a free-text property value | No | Redact the string value separately |
| Need a stable pseudonymous id, not removal | No, this removes the key | Hash with the JSON Anonymizer |
| A Mixpanel reserved prop like $email | Yes, list $email literally | Remove listed |
Cookbook
Before/after scrubs on real-shaped analytics events. Values are synthetic; this is what the result panel shows.
Scrub a Segment identify call
Example: Keep the event signal, drop the identifying traits. Keys: email, name, phone. Mode: Remove listed, Deep on.
BEFORE
{
"type": "identify",
"userId": "u_91",
"traits": { "email": "ada@x.com", "name": "Ada", "plan": "pro" }
}
AFTER (2 keys removed)
{
"type": "identify",
"userId": "u_91",
"traits": { "plan": "pro" }
}
Strip device and IP from an Amplitude event
Example: Remove auto-captured identifiers nested in user_properties. Keys: ip, deviceId. Deep on.
BEFORE
{
"event_type": "page_view",
"user_properties": { "ip": "203.0.113.4", "deviceId": "d_22", "tier": "gold" }
}
AFTER (2 keys removed)
{
"event_type": "page_view",
"user_properties": { "tier": "gold" }
}
Scrub a batch of events at once
Example: Deep mode removes the identifier from every event in an array. Keys: email.
BEFORE
[
{ "e": "signup", "email": "a@x.com" },
{ "e": "login", "email": "b@x.com" }
]
AFTER (2 keys removed)
[
{ "e": "signup" },
{ "e": "login" }
]
Remove a Mixpanel reserved property
Example: Mixpanel reserved props start with $; list them literally. Keys: $email, $name. Deep on.
BEFORE
{ "event": "Purchase", "properties": { "$email": "a@x.com", "$name": "Ada", "amount": 42 } }
AFTER (2 keys removed)
{ "event": "Purchase", "properties": { "amount": 42 } }
Lookalike key survives by design
Example: Listing email leaves emailDomain in place, which is handy when the domain is non-identifying analytics signal. Keys: email.
BEFORE
{ "e": "signup", "email": "ada@acme.com", "emailDomain": "acme.com" }
AFTER (1 key removed — emailDomain kept as signal)
{ "e": "signup", "emailDomain": "acme.com" }
Errors and edge cases
Real errors and silent failures sourced from each platform's own documentation. Find the entry that matches what you are seeing, then apply the fix it describes.
PII inside a free-text property
Not caught: Key removal drops whole keys, not substrings of values. An email inside a "searchQuery" or "message" string survives. Redact string values with a separate step before tracking.
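One way to handle that separate step is a pattern-based pass over string values. The sketch below is an assumption-laden illustration: `redact_emails`, the placeholder text, and the deliberately rough email pattern are all hypothetical, and a regex will miss unusual address formats and other kinds of PII.

```python
import re
from typing import Any

# Rough illustrative pattern (an assumption); real-world addresses vary more than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_emails(node: Any, placeholder: str = "[redacted-email]") -> Any:
    """Replace email-looking substrings in every string value, at any depth."""
    if isinstance(node, dict):
        return {key: redact_emails(value, placeholder) for key, value in node.items()}
    if isinstance(node, list):
        return [redact_emails(item, placeholder) for item in node]
    if isinstance(node, str):
        return EMAIL_RE.sub(placeholder, node)
    return node


event = {"e": "search", "searchQuery": "orders for ada@acme.com last week"}
print(redact_emails(event))
# {'e': 'search', 'searchQuery': 'orders for [redacted-email] last week'}
```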
Casing or prefix mismatch
Not removed: Matching is exact and case-sensitive. Listing email will not catch Email or the Mixpanel reserved $email. List every variant your SDK emits (email, Email, $email).
Lookalike key kept
Preserved: Listing email removes only email; userEmail, emailDomain, and email_hash stay. Sometimes that is intended (domain as signal); list the extra exact names if it is not.
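Because casing variants and lookalikes both survive an exact match, a quick audit of the sample can surface names worth reviewing. This is a hypothetical helper, not a feature of the tool: `find_lookalikes` flags any key that contains a listed key when case is ignored but is not an exact match.

```python
from __future__ import annotations

from typing import Any


def find_lookalikes(data: Any, keys: set[str]) -> set[str]:
    """Return key names that contain a listed key (ignoring case) without matching exactly."""
    lowered = {key.lower() for key in keys}
    found: set[str] = set()

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key not in keys and any(k in key.lower() for k in lowered):
                    found.add(key)
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(data)
    return found


event = {
    "e": "signup",
    "email": "ada@acme.com",
    "Email": "ada@acme.com",
    "emailDomain": "acme.com",
    "traits": {"userEmail": "ada@acme.com", "plan": "pro"},
}
print(sorted(find_lookalikes(event, {"email"})))
# ['Email', 'emailDomain', 'userEmail']
```

Short keys such as ip will also flag unrelated names (zip, description), so treat the result as a review list rather than a removal list.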
Need pseudonymous id, not removal
By design: Dropping userId breaks cross-event joins. If your policy allows a stable hashed id, keep the key and scramble the value with the JSON Anonymizer instead of removing it here.
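To illustrate what a stable pseudonymous id looks like, the sketch below replaces the value with a keyed hash so the same input always maps to the same output. The page does not say which algorithm the JSON Anonymizer uses; HMAC-SHA256 is swapped in here as an assumption, and `pseudonymize` and the secret are hypothetical.

```python
import hashlib
import hmac
from typing import Any


def pseudonymize(node: Any, keys: set, secret: bytes) -> Any:
    """Keep the listed keys but replace their values with a keyed SHA-256 digest."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key in keys and isinstance(value, (str, int)):
                digest = hmac.new(secret, str(value).encode("utf-8"), hashlib.sha256)
                out[key] = digest.hexdigest()
            else:
                out[key] = pseudonymize(value, keys, secret)
        return out
    if isinstance(node, list):
        return [pseudonymize(item, keys, secret) for item in node]
    return node


secret = b"example-secret"  # hypothetical; load from a secret store in practice
first = pseudonymize({"e": "signup", "userId": "u_91"}, {"userId"}, secret)
second = pseudonymize({"e": "login", "userId": "u_91"}, {"userId"}, secret)
print(first["userId"] == second["userId"])  # True: the join key stays stable
print(first["userId"] == "u_91")            # False: the raw id is gone
```

A keyed hash is used instead of a bare hash so that ids cannot be recovered by hashing a list of guesses without the secret.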
Captured event is not valid JSON
Invalid JSON: Debugger exports can be truncated or wrapped. If JSON.parse fails, nothing is scrubbed. Clean the snippet or run it through the JSON Format Fixer first.
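The same parse-first rule is worth keeping in any script that mirrors the tool. A minimal sketch, using Python's `json` module in place of the browser's JSON.parse; `parse_event` is a hypothetical helper that reports where parsing failed instead of scrubbing anything.

```python
from __future__ import annotations

import json
from typing import Any, Optional


def parse_event(raw: str) -> tuple[Optional[Any], Optional[str]]:
    """Return (data, None) on success or (None, error description) on failure."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        return None, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


# A truncated debugger export: the closing brace is missing.
data, error = parse_event('{"e": "signup", "email": "a@x.com"')
print(data)   # None
print(error)  # location and reason reported by the parser
```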
Listed key absent from this event
By design: An identifier not present in this sample is ignored and adds 0 to the removed count. Other event types may still carry it, so test each shape.
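A per-key tally makes those zero contributions visible, which the single total does not. The sketch below is an illustration under the same exact-match assumption; `count_key_hits` is a hypothetical helper and not part of the tool's output.

```python
from typing import Any


def count_key_hits(data: Any, keys: set) -> dict:
    """Count occurrences of each listed key; keys that never appear stay at 0."""
    hits = {key: 0 for key in sorted(keys)}

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key in hits:
                    hits[key] += 1  # the whole subtree would be removed, so stop here
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(data)
    return hits


event = {
    "event_type": "page_view",
    "user_properties": {"ip": "203.0.113.4", "deviceId": "d_22", "tier": "gold"},
}
hits = count_key_hits(event, {"ip", "deviceId", "email"})
print(hits)                                     # {'deviceId': 1, 'email': 0, 'ip': 1}
print([k for k, n in hits.items() if n == 0])   # ['email']
```

Running this over one sample per event type shows which listed keys each shape actually carries.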
Empty property block left behind
Preserved: Scrubbing all traits can leave "traits": {}. That is harmless to send; remove empties afterward with the JSON Null Stripper if you prefer.
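If you would rather drop those leftovers in your own code, a bottom-up pass is enough. This sketch handles empty objects only and is a hand-rolled assumption; it does not claim to reproduce what the JSON Null Stripper does, and `drop_empty_blocks` is a hypothetical name.

```python
from typing import Any


def drop_empty_blocks(node: Any) -> Any:
    """Remove keys whose value is an empty object, working from the leaves up."""
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            value = drop_empty_blocks(value)
            if isinstance(value, dict) and not value:
                continue  # skip blocks that are empty after cleaning their children
            cleaned[key] = value
        return cleaned
    if isinstance(node, list):
        return [drop_empty_blocks(item) for item in node]
    return node


event = {"type": "identify", "userId": "u_91", "traits": {}}
print(drop_empty_blocks(event))  # {'type': 'identify', 'userId': 'u_91'}
```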
Event over the 2 MB free limit
Blocked: Free accounts cap JSON at 2 MB. A large batch export may exceed it; scrub a representative sample or upgrade to Pro.
This is a design aid, not runtime scrubbing
Manual step: It validates your key list against a sample but does not run inside your app. Implement the same removal in your tracking middleware so every event is scrubbed before it is sent.
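What that middleware might look like depends on your SDK. As a rough sketch, the wrapper below scrubs each event before handing it to a send function; `scrubbing_middleware`, `remove_keys`, and `PII_KEYS` are hypothetical names, and the `send` callable stands in for whatever your analytics client exposes.

```python
from __future__ import annotations

from typing import Any, Callable

# The key list you validated against sample events (illustrative values).
PII_KEYS = {"email", "name", "phone", "ip", "deviceId"}


def remove_keys(node: Any, keys: set[str]) -> Any:
    """Drop listed keys at every level; exact, case-sensitive match."""
    if isinstance(node, dict):
        return {k: remove_keys(v, keys) for k, v in node.items() if k not in keys}
    if isinstance(node, list):
        return [remove_keys(item, keys) for item in node]
    return node


def scrubbing_middleware(
    send: Callable[[Any], None], keys: set[str] = PII_KEYS
) -> Callable[[Any], None]:
    """Wrap `send` so every event is scrubbed before it leaves the app."""

    def wrapped(event: Any) -> None:
        send(remove_keys(event, keys))

    return wrapped


sent: list = []  # stands in for the network call
track = scrubbing_middleware(sent.append)
track({"e": "signup", "email": "a@x.com", "properties": {"ip": "203.0.113.4", "plan": "pro"}})
print(sent)  # [{'e': 'signup', 'properties': {'plan': 'pro'}}]
```

Keeping the key list in one constant means the list you tested against samples is the same one the app enforces.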
Order and non-PII signal preserved
Preserved: Event name, counts, timestamps, and every non-listed property are kept exactly, so your analytic signal is intact after the scrub.
Frequently asked questions
Does this scrub events in my live app?
No — it is a design and testing aid. You validate the key list against a sample event, then implement the same removal in your tracking middleware so every event is scrubbed before it is sent to Mixpanel, Amplitude, Segment, or your warehouse.
Does it reach PII nested in properties or traits?
Yes, with Deep (all levels) on. It removes the listed keys from nested properties, traits, and user_properties blocks, and from each event in a batched array.
How do I handle Mixpanel reserved props like $email?
List them literally, including the $ — e.g. $email, $name, $phone. Matching is exact, so the prefix matters.
Will it remove emailDomain if I list email?
No. Matching is exact, not substring, so emailDomain is kept — often useful as non-identifying signal. List it explicitly if you want it gone too.
Can I keep a user join key without storing the raw id?
Not by removing it — that breaks the join. Keep the key and hash its value with the JSON Anonymizer for a stable pseudonymous id.
Is matching case-sensitive?
Yes. email and Email are different keys; list every casing your SDK emits.
Does scrubbing lose my analytics signal?
No. Only the listed identifying keys are removed; event names, counts, timestamps, and non-PII properties are preserved exactly.
What does the removed count tell me?
It is the number of identifying-key occurrences stripped across the event(s). Compare it to what you expected to catch casing or naming gaps.
Is the captured event uploaded anywhere?
No. Parsing and scrubbing run in your browser, so a real event with PII never leaves your machine.
What if the exported event is malformed?
Parsing fails and nothing is scrubbed. Repair it with the JSON Format Fixer or check it with the JSON Validator.
How large an event or batch can I process?
Up to 2 MB per JSON file on the free tier. Scrub a representative sample or upgrade to Pro for larger batches.
Can I instead keep only the allowed analytics fields?
Yes — switch to Keep listed (whitelist) mode, but note it behaves differently on nested data; see the public-API subset guide.
Privacy first
Conversion runs locally in your browser. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.