Extract PDF Form Fields to JSON — Free Online Tool

How to extract a PDF form's field map to JSON

Step 1
Open the field extractor — Go to the PDF Form Field Extractor. Everything runs locally in your browser via pdf-lib — the PDF is never uploaded.
Step 2
Drop the PDF form — Drag the form (blank or filled — it makes no difference to the field map) onto the dropzone. There are no options to configure: the tool reads the form and runs automatically as soon as the file is added.
Step 3
Review the field list preview — The result panel shows the first 20 entries as { "name", "type", "value" } objects, with a count of the total number of fields detected below the preview.
Step 4
Download the JSON — Click Download to save the complete array as a .json file named after your PDF. Every field is included, in the order pdf-lib enumerates them.
Step 5
Map names to your schema — The name keys are the PDF's internal field names — often machine-generated. Build a lookup that maps them to your database columns or API fields. Keep this map in source control alongside your fill/validation code.
Step 6
Feed it into your fill or validation pipeline — Use the field list to generate a pre-fill template, or as the authoritative key set when you later read submitted values with a dedicated value-reading library.
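
Steps 5 and 6 might look like the following in TypeScript. This is a sketch under assumptions: the `FieldEntry` interface, the `NAME_TO_COLUMN` lookup, its column names, and the `toSchemaKeys` helper are all hypothetical and not part of the tool.

```typescript
// Shape of one element in the downloaded array (name chosen for illustration).
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Hypothetical lookup: PDF field name -> your own column or API key.
const NAME_TO_COLUMN: Record<string, string> = {
  "topmostSubform[0].Page1[0].first[0]": "first_name",
  "topmostSubform[0].Page1[0].dob[0]": "date_of_birth",
};

// Split the extracted fields into mapped pairs and names with no mapping yet.
function toSchemaKeys(fields: FieldEntry[], lookup: Record<string, string>) {
  const mapped: { pdfName: string; column: string }[] = [];
  const unmapped: string[] = [];
  for (const f of fields) {
    if (Object.prototype.hasOwnProperty.call(lookup, f.name)) {
      mapped.push({ pdfName: f.name, column: lookup[f.name] });
    } else {
      unmapped.push(f.name);
    }
  }
  return { mapped, unmapped };
}
```

Any name that lands in `unmapped` is a field your lookup does not cover yet, which makes it a useful check to run whenever the form is revised.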

What each JSON field means

Every entry in the output array has exactly three keys. Verified against the engine in lib/pdf/pdfEngine.ts.

Key	What it holds	Notes
`name`	The field's fully-qualified PDF name, e.g. `applicant.email` or `topmostSubform[0].Page1[0].dob[0]`	Comes from pdf-lib's `field.getName()`. Hierarchical fields are dotted; array-style names keep their `[0]` indices.
`type`	The pdf-lib class name for the field: `PDFTextField`, `PDFCheckBox`, `PDFRadioGroup`, `PDFDropdown`, `PDFOptionList`, `PDFButton`, or `PDFSignature`	Comes from `field.constructor.name`. This is the field's structural type, not a friendly label.
`value`	Always an empty string (`""`)	This tool maps structure, not content. It does not read the typed-in value — see the FAQ on reading values.
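
If you consume the file from TypeScript, the three-key shape can be typed and checked before you rely on it. The sketch below assumes only what the table states; the names `FieldEntry`, `FIELD_TYPES`, `isFieldEntry`, and `parseFieldMap` are illustrative and do not come from the tool.

```typescript
// The seven pdf-lib class names the tool reports in the `type` key.
const FIELD_TYPES = [
  "PDFTextField",
  "PDFCheckBox",
  "PDFRadioGroup",
  "PDFDropdown",
  "PDFOptionList",
  "PDFButton",
  "PDFSignature",
] as const;

type FieldType = (typeof FIELD_TYPES)[number];

// Hypothetical name for one element of the downloaded array.
interface FieldEntry {
  name: string;
  type: FieldType;
  value: string; // the tool always writes ""
}

// Runtime guard: true only for objects that carry the three documented keys.
function isFieldEntry(x: unknown): x is FieldEntry {
  if (typeof x !== "object" || x === null) return false;
  const o = x as Record<string, unknown>;
  const t = o["type"];
  return (
    typeof o["name"] === "string" &&
    typeof t === "string" &&
    (FIELD_TYPES as readonly string[]).indexOf(t) !== -1 &&
    typeof o["value"] === "string"
  );
}

// Parse the downloaded file's text and reject anything off-shape.
function parseFieldMap(json: string): FieldEntry[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data) || !data.every(isFieldEntry)) {
    throw new Error("Not a field-map array");
  }
  return data;
}
```

The guard does not reject extra keys; tighten it if you want to enforce the exact three-key shape.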

PDF field types you'll see in the `type` key

The seven pdf-lib field classes and what each represents in a real form.

type value	Form control	Typical use
`PDFTextField`	Single- or multi-line text box	Names, addresses, dates typed as text, comments
`PDFCheckBox`	Independent on/off checkbox	"I agree", opt-ins, yes/no toggles
`PDFRadioGroup`	Mutually-exclusive radio button set	Single-choice questions (one option of several)
`PDFDropdown`	Combo box (pick one, sometimes editable)	Country, state, category pickers
`PDFOptionList`	Scrollable list box (single or multi-select)	Long option lists, multi-select choices
`PDFButton`	Push button (submit, reset, JavaScript action)	Form-action buttons — usually not data fields
`PDFSignature`	Digital signature field	Placeholder for a cryptographic signature

Limits and behaviour

Real numbers from lib/tier-limits.ts and the tool client.

Aspect	Free	Pro
Max file size	2 MB	50 MB
Max pages	50	500
Files per run	1	1 (this tool processes a single file)
Output format	JSON only	JSON only
Options to configure	None — runs on drop	None — runs on drop

Cookbook

What the JSON actually looks like for common form shapes. Field names are illustrative; your form's real names appear verbatim in the output.

A simple application form

A flat AcroForm with text fields, a checkbox, and a dropdown. Note that every value is an empty string — the array describes the form's fields, not anyone's answers.

Output (extract-pdf-form-fields-to-json):
[
  { "name": "first_name", "type": "PDFTextField", "value": "" },
  { "name": "last_name",  "type": "PDFTextField", "value": "" },
  { "name": "email",      "type": "PDFTextField", "value": "" },
  { "name": "country",    "type": "PDFDropdown",  "value": "" },
  { "name": "agree",      "type": "PDFCheckBox",  "value": "" }
]

Hierarchical / nested field names

Forms authored in Acrobat or LiveCycle often use dotted, fully-qualified names. The tool reports them exactly as pdf-lib's getName() returns them.

[
  { "name": "topmostSubform[0].Page1[0].first[0]", "type": "PDFTextField", "value": "" },
  { "name": "topmostSubform[0].Page1[0].dob[0]",   "type": "PDFTextField", "value": "" },
  { "name": "topmostSubform[0].Page1[0].sex[0]",   "type": "PDFRadioGroup", "value": "" }
]

Use these full strings as the exact keys when you fill the form.
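
For display purposes only, a short label can be derived from the last segment of a fully-qualified name. The helper below is a hypothetical sketch (the name `leafName` is not part of the tool); leaf names can collide across subforms, so keep the full string as the key.

```typescript
// Return the last dotted segment with any trailing [n] index removed.
function leafName(fullName: string): string {
  const segments = fullName.split(".");
  const last = segments[segments.length - 1];
  return last.replace(/\[\d+\]$/, "");
}

// Example: build a label column next to the real key.
const exampleNames = [
  "topmostSubform[0].Page1[0].first[0]",
  "topmostSubform[0].Page1[0].dob[0]",
];
const labelled = exampleNames.map((n) => ({ key: n, label: leafName(n) }));
```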

Turning the map into a fill template

Once you have the names and types, generate a key/value scaffold your team can fill in or feed to a fill library. The extractor gives you the left column; you supply the right.

From the JSON, build a template (pseudocode):
  first_name -> ""
  last_name  -> ""
  email      -> ""
  country    -> ""   (one of the dropdown options)
  agree      -> false (PDFCheckBox)

The extractor confirms the names exist and their types,
so your fill code won't fail on a typo'd field name.
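
The pseudocode above could be generated from the downloaded array along these lines. This is a sketch, not the tool's behaviour: `buildTemplate` and `TemplateValue` are hypothetical names, and the placeholder chosen for each type (for example an empty array for list boxes) is an assumption you may want to change.

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

type TemplateValue = string | boolean | string[];

// Pick an empty placeholder per field type; skip fields that hold no data.
function buildTemplate(fields: FieldEntry[]): Record<string, TemplateValue> {
  const template: Record<string, TemplateValue> = {};
  for (const f of fields) {
    switch (f.type) {
      case "PDFCheckBox":
        template[f.name] = false;
        break;
      case "PDFOptionList":
        template[f.name] = []; // assumption: list boxes may be multi-select
        break;
      case "PDFButton":
      case "PDFSignature":
        break; // not data fields
      default:
        template[f.name] = ""; // text fields, dropdowns, radio groups
    }
  }
  return template;
}
```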

Counting fields to gauge complexity

The result panel shows the total count below the preview. It is a quick way to size up an unfamiliar form before you commit to automating it.

Preview shows first 20 entries, then:
  ... (137 total items)

137 fields means this form is non-trivial — budget time to
map names to your schema and to handle radio groups and
option lists, which need their option labels handled separately.
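
Beyond the total, a per-type tally of the downloaded array shows how many radio groups and option lists you will need to handle. A minimal sketch, with `countByType` as a hypothetical helper name:

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Tally how many fields of each pdf-lib type the form contains.
function countByType(fields: FieldEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const f of fields) {
    counts[f.type] = (counts[f.type] || 0) + 1;
  }
  return counts;
}
```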

Validating field names against your code

Diff the extracted names against the keys your fill script expects. Mismatches (renamed or removed fields after a form revision) surface immediately.

Expected by my script: first_name, last_name, email, phone
Extracted from new form: first_name, last_name, email

Missing: phone
→ The form was re-authored and dropped 'phone'. Update the
  script before it silently skips that data.
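
The same comparison can be scripted so it runs on every form revision. A minimal sketch, assuming you keep the expected names as a plain string array; `diffFieldNames` is a hypothetical helper name.

```typescript
// Compare the names a script expects with the names the extractor found.
function diffFieldNames(expected: string[], extracted: string[]) {
  // Expected by the script but absent from the form.
  const missing = expected.filter((n) => extracted.indexOf(n) === -1);
  // Present in the form but unknown to the script.
  const unexpected = extracted.filter((n) => expected.indexOf(n) === -1);
  return { missing, unexpected };
}

const nameDiff = diffFieldNames(
  ["first_name", "last_name", "email", "phone"],
  ["first_name", "last_name", "email"]
);
// nameDiff.missing is ["phone"]; nameDiff.unexpected is []
```

Failing a build when `missing` is non-empty is one way to catch a re-authored form before it reaches production.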

Edge cases and what actually happens

The `value` field is always empty

By design

This tool extracts the form's structure — names and types — not the data typed into it. Every entry's value is an empty string, even for a fully completed form. To read submitted answers you need a value-reading step; the field map gives you the canonical keys to read against.

PDF has no interactive form

Empty array

If the document has no AcroForm, the tool returns an empty array []. A form that's just lines and boxes printed on the page (a 'flat' form meant for handwriting) has no interactive fields to enumerate — use PDF OCR to read text off such a document instead.

XFA (LiveCycle / dynamic) forms

AcroForm only

The extractor reads the AcroForm dictionary via pdf-lib. Pure XFA forms (dynamic PDFs from Adobe LiveCycle) store their fields in an XML layer pdf-lib does not parse, so they may report few or no fields. Many XFA forms ship an AcroForm fallback layer — those still extract. For pure-XFA forms, use an XFA-aware desktop tool.

Flattened form

Empty array

If a form has been flattened (interactive fields baked into the page — see PDF Flatten), the interactive fields no longer exist, so the extractor returns an empty array. The visible text is still on the page; extract it with PDF to Text.

Fully-qualified / nested names look cryptic

Expected

Names like topmostSubform[0].Page1[0].field[0] are the form's real internal names, returned verbatim from pdf-lib. They are correct keys for automation even if they aren't human-readable. Build a name-to-label map in your own code.

File exceeds the size or page limit

Rejected

Free tier caps input at 2 MB and 50 pages; Pro raises this to 50 MB and 500 pages. A form larger than your tier's limit is rejected before processing. Form PDFs are usually small, so this rarely bites — but a scanned-image-heavy form can be large.

Encrypted / password-restricted PDF

Often supported

The engine loads with ignoreEncryption, so forms that merely set permission restrictions (no open password) typically still yield their field map. A PDF that requires a password just to open may fail to parse — remove the password first with PDF Unlock.

Push buttons appear in the output

Expected

Submit/reset/JavaScript buttons are real form fields and show up as PDFButton. They carry no data, so filter them out when building a data schema — key off the type to drop PDFButton (and usually PDFSignature) entries.
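
Keying off the type might look like this; `NON_DATA_TYPES` and `dataFields` are hypothetical names, and whether to drop `PDFSignature` as well is your call.

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Types that normally carry no user-entered data.
const NON_DATA_TYPES = ["PDFButton", "PDFSignature"];

// Keep only the entries that belong in a data schema.
function dataFields(fields: FieldEntry[]): FieldEntry[] {
  return fields.filter((f) => NON_DATA_TYPES.indexOf(f.type) === -1);
}
```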

Duplicate-looking names

Expected

Radio groups expose a single field name for the whole group even though there are several physical buttons. You won't see one entry per button — you'll see one PDFRadioGroup entry. The individual option labels are not part of this output.

Need CSV instead of JSON

JSON only

This tool outputs a JSON array only — there is no CSV export option in the UI. If you need a spreadsheet, convert the JSON downstream, or for tabular content elsewhere in the PDF use PDF Table to JSON or PDF to Excel.
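
Converting the field map downstream takes only a few lines. The sketch below quotes every cell and doubles embedded quotes in the usual CSV manner; `csvCell` and `toCsv` are hypothetical helper names, not part of the tool.

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Quote a cell: wrap it in double quotes and double any embedded quotes.
function csvCell(s: string): string {
  return '"' + s.replace(/"/g, '""') + '"';
}

// Produce a header row followed by one row per field, separated by CRLF.
function toCsv(fields: FieldEntry[]): string {
  const header: string[][] = [["name", "type", "value"]];
  const rows = header.concat(fields.map((f) => [f.name, f.type, f.value]));
  return rows.map((r) => r.map(csvCell).join(",")).join("\r\n");
}
```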

Frequently asked questions

Does this extract the values someone typed into the form?

No. This tool extracts the form's field map — each field's name and type — and returns the value as an empty string for every field. It tells you the structure of the form (what fields exist and what kind they are), not the data entered. That structure is exactly what you need to build pre-fill payloads, validate against a schema, or know the canonical keys before reading values with a dedicated value-reading library.

What does the JSON output look like?

A JSON array. Each element is an object with three keys: name (the field's fully-qualified PDF name), type (one of the seven pdf-lib field classes — PDFTextField, PDFCheckBox, PDFRadioGroup, PDFDropdown, PDFOptionList, PDFButton, PDFSignature), and value (always ""). It downloads as a .json file named after your PDF.

Why are the field names so cryptic?

The name is the form's internal field name, returned exactly as pdf-lib reports it. Acrobat- and LiveCycle-authored forms often use hierarchical, fully-qualified names like topmostSubform[0].Page1[0].first[0]. These are the correct keys to use when filling the form programmatically. To make them human-readable, build your own map from PDF field name to display label.

Does it work with Adobe XFA forms?

Standard AcroForms extract reliably. Pure XFA forms (dynamic PDFs from Adobe LiveCycle) keep their fields in an XML layer that pdf-lib does not parse, so they may report few or no fields. Many XFA forms include an AcroForm fallback that still extracts. For pure-XFA, use an XFA-aware desktop tool.

Are there any options to configure?

No. The tool runs automatically the moment you drop a file — there are no settings, formats, or toggles. It reads the form and produces the JSON field map. This keeps it fast and deterministic.

Can I export to CSV instead?

Not from this tool — the only output is a JSON array. If you need tabular data, convert the JSON downstream in a script or spreadsheet. For genuinely tabular content elsewhere in the PDF, see PDF Table to JSON or PDF to Excel.

How are checkboxes represented?

A checkbox appears as one entry with type: "PDFCheckBox". Because this tool maps structure rather than reading values, the value is an empty string — it does not report whether the box is checked. When you later read values, normalise checkbox state to a Boolean in your own code.

Why does my form return an empty array?

An empty array means no interactive AcroForm fields were found. Common causes: the form was flattened (fields baked into the page — see PDF Flatten), it's a pure-XFA form, or it's a printed/scanned form with no interactive layer. For scanned forms, use PDF OCR to read the text instead.

Will it handle a password-protected form?

The engine loads with encryption ignored, so permission-restricted forms (no open password) usually still yield their field map. A PDF that needs a password just to open may fail to parse — clear the password first with PDF Unlock, then run the extractor.

What are the file size and page limits?

Free tier accepts up to 2 MB and 50 pages per file; Pro raises this to 50 MB and 500 pages. Form PDFs are usually small, so the free limit is normally enough, though a form heavy with scanned images can exceed it.

Is my form uploaded anywhere?

No. Parsing happens entirely in your browser via pdf-lib. The file never leaves your device; the result panel notes '0 bytes uploaded'. Only an anonymous usage counter is recorded when you're signed in.

How do I get the data once I have the field map?

The field map gives you the authoritative list of names and types. To capture submitted values, pair it with a value-reading step (for example, pdf-lib's own getText() / isChecked() in your own script, or a desktop form-data export). The map ensures you read against the exact field names the form actually uses, avoiding silent mismatches.

Privacy first

All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.
