How to merge survey response CSV exports
- Step 1: Export responses from each survey platform.
  - Google Forms: open the form, go to the Responses tab, open the three-dot overflow menu, and choose Download responses (.csv). If the form has a file-upload question this arrives as a .zip, so extract the CSV first.
  - Typeform: Results → Responses → Download → CSV.
  - SurveyMonkey: Analyze Results → Export → All response data → CSV. Large surveys download as a ZIP of several CSVs.
  - Microsoft Forms: there is no direct CSV, so click Open results in Excel and then Save As → CSV UTF-8 in Excel.
  - Jotform: Submissions → Download All → CSV.
  Give each file a name you will recognise, such as nps-cohort-a.csv and nps-cohort-b.csv.
- Step 2: Confirm the question wording matches across files. This is the rule that decides whether your files merge into one clean schema. csv-merger keys columns on the header text, which for a survey export is the question text; matching ignores letter case and surrounding spaces, but nothing else. The headers 'How likely are you to recommend us?' and 'How likely are you to recommend us to a friend?' are different columns; a one-word edit, a corrected typo, or added punctuation forks the data. If every file is an export of the same form with no question edits, the headers are identical and the merge is a clean single-schema concatenation. If a question was reworded between exports, the merge still succeeds, but the output carries both versions as separate columns; reconcile them in Step 5.
- Step 3: Drop all the response CSVs onto the merger. The free tier merges 2 files at once; Pro raises the batch to 10. Files are processed in the order you add them, so add them oldest-export-first for chronological output. Leave the mode on Union (the default): headers are matched by name across every file, and any column missing from a given file becomes an empty cell for that file's rows. Switch to Strict only if you want the merge to fail when a file's column set differs. Open-ended answers and respondent PII are parsed in your browser and never reach a server.
- Step 4: Read the per-file breakdown before you download. The results panel reports files in, total rows in, rows out, and the unified column count, plus a per-file row and column breakdown. Make two checks. First, compare rows out against the response total each platform reported on its own dashboard. The merger only concatenates, so a merged file with fewer responses than you know you collected means a file is missing from the merge or an export came back short. Second, scan the per-file column counts. One file with a column count different from the rest is usually a SurveyMonkey two-row-header export or the wrong file, and spotting it here saves a re-merge after the download.
- Step 5: Reconcile near-duplicate question columns. If Step 4 shows more columns than a single export has, a question was reworded between exports and the union produced two columns for it, for example 'Overall satisfaction' and 'Overall satisfaction with our service'. Decide which wording is canonical, then run each affected input file through csv-header-rename to align the headers before re-merging. Once the header strings match exactly, the two columns collapse into one. Fix the inputs rather than the merged output, so the correction is reusable for the next export.
- Step 6: Add a source column for multi-form merges. Merging batched exports of one form needs no extra column; the responses are indistinguishable by design. Merging copies of a form, one per cohort, region, class, or event, is different: add a source column to each file before merging, with a value like cohort-a or london or 2026-spring, so analysis can group or filter by origin. The merger adds no provenance of its own; it only concatenates. Add the column in any spreadsheet by pasting a constant value down the column before you merge.
- Step 7: Dedupe overlapping exports, then use the merged file. If your batched exports covered overlapping date ranges, the same responses appear in more than one file and the merge carries the duplicates through. Run csv-deduplicator on the merged output, keyed on the platform's unique response identifier: Respondent ID for SurveyMonkey, Response ID for Typeform, ID for Microsoft Forms, Submission ID for Jotform. Google Forms exports have no response ID, so dedupe on Timestamp plus one or two answer columns instead. For analysis in Excel, pandas, R, or BigQuery the merged CSV is then ready as-is; re-sort it by Timestamp with csv-sorter if you added the files out of order.
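The wording check in Step 2 can also be run before any file reaches the merger. The sketch below is a stand-alone illustration, not part of csv-merger: it applies the matching rule described above (trim surrounding spaces, lower-case) and lists the headers that are not shared by every file. The function names and the two sample exports are hypothetical.

```python
import csv
import io


def header_key(header: str) -> str:
    """Normalise a header as the text describes: trim, then lower-case."""
    return header.strip().lower()


def read_header(csv_text: str) -> list[str]:
    """Return the first row of a CSV document as a list of header strings."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def unshared_headers(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each file name to the headers that at least one other file lacks."""
    keys = {
        name: {header_key(h) for h in read_header(text)}
        for name, text in files.items()
    }
    shared = set.intersection(*keys.values()) if keys else set()
    return {
        name: [h for h in read_header(text) if header_key(h) not in shared]
        for name, text in files.items()
    }


# Hypothetical sample exports: same Timestamp column, reworded question.
files = {
    "nps-cohort-a.csv": "Timestamp,How likely are you to recommend us?\n2026-01-05,9\n",
    "nps-cohort-b.csv": "timestamp,How likely are you to recommend us to a friend?\n2026-02-01,7\n",
}
report = unshared_headers(files)
for name, headers in report.items():
    print(name, "->", headers or "all headers shared")
```

An empty list for every file suggests the merge will produce a single schema; any header that is listed would become its own column in a union merge.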
Survey export scenarios a plain merge solves
When a single header-matched merge is the right tool, and when the files need a fix-up step first.
| Scenario | What you have | Schema across files | Plain merge? |
|---|---|---|---|
| Same form, batched exports | One large response set exported in date-range chunks, or paged by the platform | Identical — same form, same questions | Yes — clean single-schema concatenation |
| Copies of one form, per cohort or region | Several forms built from one template, each with its own export | Identical, as long as no question was edited after duplicating | Yes — add a source column to each file first |
| Form edited mid-collection | A before-edit export and an after-edit export of the same form | Differs by the reworded or added question | Yes — the union keeps every column; reconcile reworded ones |
| Same questionnaire on two platforms | For example a Google Forms copy plus a Typeform copy | Differs — each platform words headers and metadata its own way | Partly — align headers with csv-header-rename first |
| SurveyMonkey multi-file ZIP | A large export delivered as a ZIP of several CSVs | Identical — one survey, just chunked | Yes — extract the ZIP and drop all the CSVs in |
| Two unrelated surveys | Questionnaires that ask different questions | Completely different | No — the union is a wide, mostly-empty file; keep them separate |
How question text becomes the merge key
csv-merger matches columns on the header string after trimming spaces and lower-casing. In a survey export that string is the question text — so wording, not meaning, decides what lines up.
| Header in file A | Header in file B | Merge result |
|---|---|---|
| Email address | email address | One column — matching ignores letter case |
| Email address (padded) | Email address | One column — leading and trailing spaces are trimmed before matching |
| How satisfied are you? | How satisfied are you (no question mark) | Two columns — the missing question mark makes the strings different |
| Q1. Satisfaction | Satisfaction | Two columns — any wording difference forks the data |
| Timestamp present | column absent | One Timestamp column; file B's rows get an empty cell in it |
| blank header cell | blank header cell | Each becomes column_2, column_3 and so on by position — blanks merge only when they sit in the same column position |
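The rules in this table can be modelled in a few lines. The following is a minimal sketch of a union merge under those rules, not csv-merger's actual implementation: `column_keys` and `union_merge` are hypothetical names, the choice to keep the first spelling seen as the output header is an assumption, and the `column_<position>` naming for blank headers is one reading of the last table row.

```python
import csv
import io


def column_keys(header_row: list[str]) -> list[str]:
    """Trim and lower-case each header; name blanks column_<position> (1-based)."""
    keys = []
    for position, header in enumerate(header_row, start=1):
        key = header.strip().lower()
        keys.append(key if key else f"column_{position}")
    return keys


def union_merge(csv_texts: list[str]) -> tuple[list[str], list[list[str]]]:
    """Concatenate rows in file order; columns a file lacks become empty cells."""
    columns: list[str] = []      # output headers (assumption: first spelling wins)
    index: dict[str, int] = {}   # normalised key -> output column index
    parsed = []
    for text in csv_texts:
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue
        header, data = rows[0], rows[1:]
        keys = column_keys(header)
        for key, original in zip(keys, header):
            if key not in index:
                index[key] = len(columns)
                columns.append(original.strip() or key)
        parsed.append((keys, data))
    merged = []
    for keys, data in parsed:
        for row in data:
            out = [""] * len(columns)
            for key, cell in zip(keys, row):
                out[index[key]] = cell
            merged.append(out)
    return columns, merged


# Padded and differently-cased 'Email address' match; the missing '?' does not.
file_a = "Timestamp, Email address ,How satisfied are you?\n2026-01-05,a@example.com,4\n"
file_b = "email address,How satisfied are you\nb@example.com,5\n"
columns, rows = union_merge([file_a, file_b])
print(columns)
for row in rows:
    print(row)
```

In this example the two spellings of the email header land in one column, while the question with and without its question mark forks into two.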
Survey platform export quirks
How each platform's CSV export behaves, and the part of that behaviour that affects a merge.
| Platform | Export path | Quirk that affects merging |
|---|---|---|
| Google Forms | Responses tab → three-dot menu → Download responses (.csv) | First column is Timestamp; headers are the live question text, so editing a question changes the header. Checkbox answers are comma-joined into one cell. A form with a file-upload question exports as a ZIP. |
| Typeform | Results → Responses → Download → CSV | Adds submission-date and respondent metadata columns, plus any Hidden Fields you configured. These appear in every export, so they merge consistently across files from the same form. |
| SurveyMonkey | Analyze Results → Export → All response data → CSV | Uses two header rows — question text on row 1, answer-choice labels on row 2. csv-merger reads only row 1 as the header, so row 2 becomes a data row. Delete it first. Large exports arrive as a ZIP. |
| Microsoft Forms | Open results in Excel, then Save As CSV | No native CSV. Columns begin ID, Start time, Completion time, Email, Name. Re-saving through Excel can reformat dates to the saving machine's locale. |
| Jotform | Submissions → Download All → CSV | Includes a Submission ID and submission-date columns, which give you a reliable key for the dedupe step. |
Cookbook
Real survey-merge situations with the matching setup.
Batched exports of one form into a single dataset
Example: A 40,000-response customer survey exported in four date-range chunks so each file generated quickly. Same form, no question edits, so the schema is identical across all four.
Inputs (added oldest-first):
  customer-survey-jan.csv   9,800 rows
  customer-survey-feb.csv  11,200 rows
  customer-survey-mar.csv  10,400 rows
  customer-survey-apr.csv   8,600 rows
Merge config: Union mode (the default)
Output: customer-survey-merged.csv   40,000 rows
  Headers identical across all four files: one clean schema.
  Rows concatenated in file order (Jan first, Apr last).
  rows out (40,000) == sum of rows in, so nothing was lost or added.
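The rows-in versus rows-out comparison can be reproduced outside the results panel. This is a small sketch under the assumption that every file has exactly one header row; `count_rows` and `check_row_total` are hypothetical helpers, and the sample CSV strings stand in for real exports. Rows are counted with a CSV parser rather than by counting lines, because an open-ended answer can contain a line break inside a quoted cell.

```python
import csv
import io


def count_rows(csv_text: str) -> int:
    """Count data rows (everything after the header), honouring quoted newlines."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return max(len(rows) - 1, 0)


def check_row_total(inputs: dict[str, str], merged: str) -> tuple[int, int]:
    """Return (sum of input data rows, data rows in the merged file)."""
    rows_in = sum(count_rows(text) for text in inputs.values())
    return rows_in, count_rows(merged)


# Hypothetical exports; the first comment spans two physical lines.
jan = 'Timestamp,Comment\n2026-01-03,"Great,\nthanks"\n2026-01-09,Fine\n'
feb = "Timestamp,Comment\n2026-02-02,OK\n"
merged = 'Timestamp,Comment\n2026-01-03,"Great,\nthanks"\n2026-01-09,Fine\n2026-02-02,OK\n'

rows_in, rows_out = check_row_total({"jan.csv": jan, "feb.csv": feb}, merged)
print(f"rows in: {rows_in}, rows out: {rows_out}, match: {rows_in == rows_out}")
```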
Per-cohort copies of one form, tagged by source
Example: An NPS survey duplicated for three customer segments; each segment has its own form and its own export. A source column added to each file before merging keeps the segments separable in analysis.
Pre-merge: add a 'source' column to each export
  nps-enterprise.csv  -> source = enterprise
  nps-midmarket.csv   -> source = mid-market
  nps-smb.csv         -> source = smb
Merge: drop all three files in
Output: nps-all-segments.csv
  One schema (the template was never edited) plus a 'source' column to group by in the pivot table.
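For more than a handful of files, the tagging step may be easier to script than to do by hand in a spreadsheet. A minimal sketch, assuming each export is UTF-8 text with one header row; `add_source_column` is a hypothetical helper and the sample export is made up.

```python
import csv
import io


def add_source_column(csv_text: str, source: str, column: str = "source") -> str:
    """Return the CSV with a constant-valued column appended to every row."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return csv_text
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(rows[0] + [column])      # header row gets the column name
    for row in rows[1:]:
        writer.writerow(row + [source])      # data rows get the constant value
    return out.getvalue()


# Hypothetical one-row export standing in for nps-enterprise.csv.
enterprise = "Timestamp,How likely are you to recommend us?\n2026-03-01,9\n"
tagged = add_source_column(enterprise, "enterprise")
print(tagged)
```

Using the same column name in every file matters here: the merger matches on header text, so 'source' in one file and 'Source segment' in another would produce two columns.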
Form edited mid-collection — the union keeps both versions
Example: Question 4 was reworded after 600 responses had come in. The before-edit and after-edit exports no longer share that column, so the union produces both.
Inputs:
feedback-before-edit.csv 600 rows, 8 columns
column 4: 'Was support helpful?'
feedback-after-edit.csv 940 rows, 8 columns
column 4: 'Was our support team helpful?'
Merge output: feedback-merged.csv 1,540 rows, 9 columns
Both wordings appear as SEPARATE columns:
'Was support helpful?' filled for the first 600 rows
'Was our support team helpful?' filled for the last 940 rows
Fix: csv-header-rename one file so both use the same wording,
then re-merge; the two columns collapse into one.
Stripping SurveyMonkey's second header row
Example: SurveyMonkey's CSV puts question text on row 1 and answer-choice sub-labels on row 2. csv-merger treats row 1 as the header, so row 2 lands in the data as a near-empty junk row.
SurveyMonkey export, first three lines:
  Respondent ID,Collector ID,Start Date,...,How did you hear about us?
  ,,,,Search engine    <- row 2: sub-labels, not data
  10294831,4471002,03/14/2026,...,Search engine
Merged as-is, row 2 becomes the first data row of that file: a row of mostly-blank cells.
Fix: delete row 2 from each SurveyMonkey CSV before merging (a one-line edit in any spreadsheet). Every SurveyMonkey export has the same structure, so the fix is the same for each file.
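When several SurveyMonkey files need the same fix, the row can be dropped in a script instead. This is a sketch only: `drop_second_header_row` is a hypothetical helper, and the sample export is a shortened, four-column version of the lines shown above.

```python
import csv
import io


def drop_second_header_row(csv_text: str) -> str:
    """Remove row 2 (the sub-label row) and keep the header and all responses."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    kept = rows[:1] + rows[2:]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(kept)
    return out.getvalue()


# Shortened illustrative export: header, sub-label row, one response.
export = (
    "Respondent ID,Collector ID,Start Date,How did you hear about us?\n"
    ",,,Search engine\n"
    "10294831,4471002,03/14/2026,Search engine\n"
)
cleaned = drop_second_header_row(export)
print(cleaned)
```

The function removes row 2 unconditionally, so it should only be run on files known to be SurveyMonkey exports; on any other CSV it would delete a real response.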
Merging a Google Forms copy and a Typeform copy
Example: The same questionnaire ran as a Google Form on the website and a Typeform in an email campaign. The two platforms word headers and metadata differently, so the union works but is not clean until the headers are aligned.
Inputs:
  web-google-forms.csv   Timestamp, Q1, Q2, Q3, ...
  email-typeform.csv     Submit Date, Q1, Q2, Q3, plus metadata
Merged as-is: a union of ~14 columns. 'Timestamp' and 'Submit Date' stay separate; each platform's metadata columns are empty for the other platform's rows.
Better: csv-header-rename the Typeform file so the shared questions match word-for-word and 'Submit Date' becomes 'Timestamp', then merge -> one clean schema, with empty cells only where a platform genuinely lacks a field.
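The header alignment amounts to rewriting one file's first row from a rename map. The sketch below is a stand-in for illustration and is not csv-header-rename itself; `rename_headers` is a hypothetical helper, and the question wording in the sample is invented. Only the 'Submit Date' to 'Timestamp' mapping comes from the example above.

```python
import csv
import io


def rename_headers(csv_text: str, renames: dict[str, str]) -> str:
    """Rewrite the header row from an exact-match rename map; data rows are untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if rows:
        rows[0] = [renames.get(header, header) for header in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical Typeform export with one question worded differently.
typeform = "Submit Date,How satisfied were you?\n2026-04-02 10:15:00,4\n"
aligned = rename_headers(
    typeform,
    {
        "Submit Date": "Timestamp",
        "How satisfied were you?": "How satisfied are you?",
    },
)
print(aligned)
```

Keeping the rename map in a file alongside the exports makes the same alignment repeatable for the next export.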
Errors and edge cases
Real errors and silent failures sourced from each platform's own documentation. Match the wording to the row, fix what the row says to fix.
SurveyMonkey CSV has two header rows
Two-row header: SurveyMonkey's CSV export uses two rows for headers, with question text on row 1 and answer-choice sub-labels on row 2 for matrix and multiple-choice questions. csv-merger always reads row 1 as the header, which is correct for SurveyMonkey, but row 2 then becomes the first data row of that file: a row of mostly-empty cells. Delete row 2 from each SurveyMonkey CSV before merging, a one-line edit in any spreadsheet or text editor. Because every SurveyMonkey export shares this structure, a stray row that slips through is easy to spot in the merged preview, sitting near the top of that file's block.
A reworded question creates a duplicate column
Schema drift: Editing a question's wording, or adding a new question, after responses have arrived changes that column's header in later exports. The before-edit and after-edit files no longer share the header, so the union merge produces two columns for one logical question, with the old wording filled only for early rows and the new wording filled only for later rows. This is not an error and no data is lost, but it is untidy for analysis. Pick the canonical wording and align the inputs with csv-header-rename before merging. A genuinely new question is fine to leave split out: an empty cell there correctly means the question did not exist when that person responded.
Overlapping export ranges carry duplicate responses
Duplicates carried through: csv-merger never deduplicates, by design, since each survey response is a distinct submission. Duplicates only appear when batched exports cover overlapping date ranges, for example one export for 1-31 January and another for 15 January to 15 February. Responses in the overlap then sit in both files and the merge keeps both copies. After merging, run csv-deduplicator keyed on the platform's unique response ID: Respondent ID, Response ID, ID, or Submission ID. Google Forms exports carry no response ID, so dedupe on Timestamp plus one or two answer columns instead, and never on answer columns alone, because two respondents can legitimately give identical answers.
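The keyed dedupe described here can be sketched as a first-occurrence filter. This is an illustration of the idea, not csv-deduplicator's implementation; `dedupe` is a hypothetical helper and both sample files are invented. The second call shows the Google Forms fallback of Timestamp plus an answer column.

```python
import csv
import io


def dedupe(csv_text: str, key_columns: list[str]) -> str:
    """Keep the first row for each distinct combination of key-column values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    seen = set()
    for row in reader:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            writer.writerow(row)
    return out.getvalue()


# Overlapping exports: respondent 102 appears twice after the merge.
surveymonkey = "Respondent ID,Score\n101,9\n102,7\n102,7\n103,9\n"
by_id = dedupe(surveymonkey, ["Respondent ID"])

# No response ID: key on Timestamp plus an answer column instead.
google = (
    "Timestamp,Score\n"
    "2026-01-20 09:00:01,9\n"
    "2026-01-20 09:00:01,9\n"
    "2026-01-20 09:00:02,9\n"
)
by_timestamp = dedupe(google, ["Timestamp", "Score"])
print(by_id)
print(by_timestamp)
```

In the second file, the two rows one second apart share the same answer but survive as separate responses, which is why the key includes Timestamp.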
Google Forms checkbox answers sit in one cell
By design: Google Forms records a checkbox (multi-select) answer as a single cell holding every chosen option joined by commas. The CSV quotes that cell so the internal commas do not break parsing, and csv-merger preserves the quoting through the merge, so the multi-select cell survives intact. Nothing needs fixing for the merge itself; just be aware that the merged column holds delimited lists rather than single values when you analyse it. To break those into separate columns afterwards, use csv-column-splitter.
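A short sketch of what the quoted cell looks like to a CSV parser, and how it might be split at analysis time. The question text and options are invented, and `split_choices` is a hypothetical helper. It assumes no option label itself contains a comma; an option such as 'Email, phone' would be split in two.

```python
import csv
import io


def split_choices(cell: str) -> list[str]:
    """Split a comma-joined checkbox cell into its individual options."""
    return [choice.strip() for choice in cell.split(",") if choice.strip()]


# The checkbox cell is quoted, so the parser returns it as one value.
export = (
    "Timestamp,Which features do you use?\n"
    '2026-01-05 10:00:00,"Reports, Alerts, API"\n'
)
rows = list(csv.DictReader(io.StringIO(export)))
cell = rows[0]["Which features do you use?"]
choices = split_choices(cell)
print(repr(cell))
print(choices)
```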
A Google Forms file-upload question exports as a ZIP
Platform behaviour: If a Google Form includes a file-upload question, the Download responses (.csv) action delivers a .zip rather than a bare CSV; the archive holds the response CSV alongside the uploaded files. Extract the response CSV before dropping it onto the merger, because csv-merger reads CSV text, not archives. The same applies to SurveyMonkey's multi-file ZIP for large exports: unzip first, then merge the CSVs inside.
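Pulling the CSVs out of an archive can be scripted with Python's standard zipfile module. This is a sketch under assumptions: the member names and archive layout below are invented for the demonstration, the CSVs are assumed to be UTF-8, and `csv_members` is a hypothetical helper.

```python
import io
import zipfile


def csv_members(zip_bytes: bytes) -> dict[str, str]:
    """Return {member name: decoded text} for every .csv file inside a ZIP."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".csv"):
                # utf-8-sig also strips a leading byte-order mark if present
                found[name] = archive.read(name).decode("utf-8-sig")
    return found


# Build a small archive in memory so the sketch runs without files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("responses.csv", "Timestamp,Upload\n2026-01-05,photo.png\n")
    archive.writestr("uploads/photo.png", b"not really an image")
extracted = csv_members(buffer.getvalue())
print(list(extracted))
```

For a real download, the bytes would come from reading the .zip file, and each extracted CSV would be written back to disk before being dropped onto the merger.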
Microsoft Forms data re-saved through Excel
Excel side-effect: Microsoft Forms has no direct CSV export; you open results in Excel and Save As. Excel's CSV writer may prepend a UTF-8 byte-order mark and rewrite dates into the saving machine's locale. The byte-order mark does not affect the merge, since it is not part of the header text csv-merger matches on, but locale-dependent date reformatting can leave Start time inconsistent if different files were saved on machines with different regional settings. Save every file from the same machine, or normalise the date column after merging. csv-merger never alters cell values; it concatenates whatever the files contain.
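One way to normalise the dates is to rewrite the column to ISO 8601 while the format of each file is still known, which in practice means per input file rather than on the mixed merged output. The sketch below is hedged accordingly: the two date formats are assumptions chosen for illustration, not formats confirmed for any particular Excel locale, and `normalise_dates` is a hypothetical helper.

```python
import csv
import io
from datetime import datetime


def normalise_dates(csv_text: str, column: str, source_format: str) -> str:
    """Rewrite one date column to ISO 8601, given the format the file was saved in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column]:  # leave empty cells empty
            parsed = datetime.strptime(row[column], source_format)
            row[column] = parsed.isoformat(sep=" ")
        writer.writerow(row)
    return out.getvalue()


# Assumed formats: month-first on one machine, day-first on the other.
us_saved = "ID,Start time\n1,3/14/2026 9:05\n"
uk_saved = "ID,Start time\n2,14/03/2026 09:05\n"
us_fixed = normalise_dates(us_saved, "Start time", "%m/%d/%Y %H:%M")
uk_fixed = normalise_dates(uk_saved, "Start time", "%d/%m/%Y %H:%M")
print(us_fixed)
print(uk_fixed)
```

Both files end up with the same unambiguous representation, so the merged Start time column sorts and parses consistently.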
A file's first row is data, not a header
First row is always the header: csv-merger treats the first row of every file as that file's header. Drop in a file whose first row is actually a response, or the wrong file entirely, and those values become column names, so the union balloons with one-off columns. The per-file breakdown in the results panel is the catch: a file contributing columns no other file has is almost always a wrong or header-less file. Remove it and re-merge.
Merging genuinely different surveys
Wrong tool for the job: Merging two questionnaires that ask different questions produces the union of every question from both, a wide CSV where each row is filled for one survey's columns and empty for the other's. It is valid output but rarely useful: analysis tools then see one sparse table instead of two clean ones. Keep unrelated surveys as separate files and only merge response sets that answer the same questions. If you are not certain two files belong together, run the merge in Strict mode, which rejects the pair outright when their column sets differ.
Frequently asked questions
Why do I have multiple survey CSV files for one dataset?
Three common reasons. (1) Batched exports — a large response set is exported in date-range chunks so the file generates quickly, or the platform pages a big export automatically. (2) Copies of one form — the same questionnaire is duplicated per cohort, region, class, campaign, or event, and each copy is its own form with its own export. (3) A platform that splits the export — SurveyMonkey delivers large 'All response data' exports as a ZIP of several CSVs. In all three cases the files share, or nearly share, one schema, so a header-matched merge combines them into a single dataset for analysis.
How does csv-merger decide which columns line up?
By header text. csv-merger reads the first row of each file as its header and matches columns whose header strings are equal after trimming surrounding spaces and lower-casing — so Email, email, and EMAIL all become one column. In a survey export the header is the question text, so two files merge into one clean schema only when their questions are worded identically. Any difference — a corrected typo, an added question number, different punctuation — counts as a different column. A column a file does not have becomes an empty cell for that file's rows — nothing is ever dropped. That is Union mode, the default; Strict mode keeps the same name-matching but stops the merge if any file's column set differs from the first.
What happens if I reworded a question partway through the survey?
In the default Union mode the merge still works — csv-merger does not reject the mismatch — but you get two columns for that one question: the old wording, filled only for responses exported before the edit, and the new wording, filled only for responses after. No data is lost. For a single clean column, pick the canonical wording, run the mismatched input files through csv-header-rename so the headers match exactly, then re-merge. A question you added rather than reworded is fine to leave split out — its empty cells correctly mean the question did not exist yet.
Will merging remove duplicate survey responses?
No, and that is deliberate. Each survey response is a distinct submission, so the merger concatenates every row without deduplicating. Genuine duplicates appear only when your exports cover overlapping date ranges. To remove those, run csv-deduplicator on the merged file keyed on the platform's unique response identifier — Respondent ID, Response ID, ID, or Submission ID. Google Forms has no such ID, so dedupe on Timestamp plus a couple of answer columns. Do not dedupe on answer columns alone: two respondents can legitimately give identical answers.
How do I merge Google Forms responses from several forms?
Export each form: open it, go to the Responses tab, open the three-dot menu, and choose Download responses (.csv). If a form has a file-upload question the download is a ZIP, so extract the CSV. Drop all the CSVs onto the merger above — 2 files on the free tier, up to 10 on Pro. The merge is clean only if the forms ask identically-worded questions, which means copies made from one template before any edits. If you built the forms separately and the wording drifted, align the headers with csv-header-rename first. Add a source column to each file beforehand if you need to tell the forms apart in analysis.
How do I handle SurveyMonkey's two header rows?
SurveyMonkey's CSV puts question text on row 1 and answer-choice sub-labels on row 2. csv-merger uses row 1 as the header, which is correct, but that leaves row 2 sitting in the data as a near-empty junk row. Delete that second row from each SurveyMonkey CSV before merging — a one-line edit in any spreadsheet or text editor. Every SurveyMonkey export has the same structure, so if you miss one, the stray row is easy to spot in the merged preview as a row of blanks near the top of that file's block.
Can I merge a Google Forms export with a Typeform export?
Yes, but not as a clean one-step merge. Google Forms and Typeform word their metadata columns differently — Timestamp versus a submission-date column — and often word the shared questions differently too. A straight merge gives you the union: both metadata columns kept separate, each platform's extras empty for the other's rows. For a single clean schema, run one file through csv-header-rename so the shared questions and the timestamp column use identical headers, then merge.
Does the order I add the files matter?
Yes, for row order. csv-merger concatenates files in the order you add them — the first file's responses first, then the next file's, and so on. Add your exports oldest-first for a chronologically ordered result. If the order ends up wrong, re-sort the merged file on the Timestamp or submission-date column with csv-sorter. Column order does not matter — columns are matched by header name, so a re-export with its columns reordered still merges correctly.
How many survey files can I merge at once, and how large?
The free tier merges 2 files per run; Pro raises the batch to 10, and higher plans go further. Everything runs in your browser, so the practical ceiling is memory — comfortably into the millions of response rows on a desktop browser, far more than a survey usually produces. The real limit is downstream: a Google Sheet caps at 10,000,000 cells (rows times columns) and Excel on Windows stops at 1,048,576 rows. A merged survey dataset rarely approaches either, but if it does, analyse it in pandas, R, DuckDB, or BigQuery, or split it with csv-row-splitter.
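The downstream limits quoted above reduce to simple arithmetic on the merged file's dimensions. A minimal sketch, with a hypothetical helper name, that counts the header as one row:

```python
GOOGLE_SHEETS_MAX_CELLS = 10_000_000  # rows times columns, as quoted above
EXCEL_MAX_ROWS = 1_048_576            # worksheet row limit, as quoted above


def fits_downstream(data_rows: int, columns: int) -> dict[str, bool]:
    """Check a merged file's dimensions against the spreadsheet limits."""
    total_rows = data_rows + 1  # include the header row
    return {
        "google_sheets": total_rows * columns <= GOOGLE_SHEETS_MAX_CELLS,
        "excel": total_rows <= EXCEL_MAX_ROWS,
    }


typical = fits_downstream(40_000, 25)       # 40,001 x 25 = 1,000,025 cells
oversized = fits_downstream(1_200_000, 12)  # 1,200,001 x 12 = 14,400,012 cells
print(typical)
print(oversized)
```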
Will respondent answers and PII be uploaded to JAD Apps?
No. Parsing runs entirely in your browser through PapaParse — open-ended comments, email addresses, names, and any other PII your survey collected never leave the tab. The only thing stored server-side is a single counter (files merged, no content) for signed-in dashboard stats, which you can opt out of in account settings. Local-only processing is what keeps the tool usable for ethics-approved or IRB-reviewed research and for respondent data covered by GDPR and similar regimes.
My merged file has more columns than any single export — what happened?
Schema drift. The merged column count is the union of every header across your files, so more columns than any one export means the files did not share a single schema. Two usual causes: a question was reworded or added between exports, leaving the old and new wording as separate columns; or a wrong or header-less file slipped in and its first row became one-off column names. Use the per-file breakdown in the results panel to find the odd file out, fix the headers with csv-header-rename, and re-merge.
Can I automate survey merges in a pipeline?
Yes. csv-merger runs on the local @jadapps/runner, so survey data stays on your machine. GET /api/v1/tools/csv-merger returns the tool's option schema — mode (union or strict header matching) and delimiter. Pair the runner once, then POST the response CSVs to 127.0.0.1:9789/v1/tools/csv-merger/run. A typical pipeline: a scheduled export from each survey, a runner-side merge into the master dataset, then a push to your analysis store. No response data reaches JAD's servers.
Privacy first
Processing runs locally in your browser with PapaParse. No file is uploaded — only metadata counters are saved for signed-in dashboard stats.