How to audit a PDF form's fields and structure
- Step 1: Open the field extractor. Go to the PDF Form Field Extractor. It runs in your browser; submitted data is never uploaded.
- Step 2: Drop a reference copy of the form. Use the blank master (or any submission). The field inventory is the same across copies. The tool runs automatically, with no options to set.
- Step 3: Capture the field inventory. The preview shows fields as { name, type, value } with the total count below. This is the baseline list every submission should match.
- Step 4: Download the JSON inventory. Save it as the form's field manifest. Decide which fields you treat as required for your process.
- Step 5: Build your required-field checklist. From the inventory, mark the fields that must be completed (the tool doesn't carry a required flag, so you define this). Note the PDFSignature fields that must be signed.
- Step 6: Compare submissions against the inventory. In a separate value-reading step, read each submission's values keyed by the inventory's names; flag any required field that's empty or any signature field that's unsigned, then request completion.
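Before building anything on the downloaded manifest, it can help to load it through a small shape check. The sketch below assumes only the { name, type, value } shape described in Step 3; the names `FieldEntry` and `parseManifest` are illustrative and not part of the tool.

```typescript
// One inventory entry, matching the { name, type, value } shape shown in the preview.
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Hypothetical helper: parse the downloaded JSON text and reject anything
// that is not an array of entries with a string name and type.
function parseManifest(jsonText: string): FieldEntry[] {
  const data: unknown = JSON.parse(jsonText);
  if (!Array.isArray(data)) {
    throw new Error("Manifest must be a JSON array");
  }
  return data.map((item: unknown, index: number): FieldEntry => {
    const record = item as { name?: unknown; type?: unknown; value?: unknown } | null;
    if (record === null || typeof record !== "object") {
      throw new Error(`Entry ${index} is not an object`);
    }
    const name = record.name;
    const type = record.type;
    const value = record.value;
    if (typeof name !== "string" || typeof type !== "string") {
      throw new Error(`Entry ${index} is missing a string name or type`);
    }
    // The tool reports value as "", so a missing value is treated as "" here.
    return { name, type, value: typeof value === "string" ? value : "" };
  });
}
```

Failing early on a malformed manifest keeps later checklist and comparison steps from silently auditing against the wrong field list.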
What an audit needs vs. what this tool provides
The tool covers the structural half; the value half is a downstream step. Verified against lib/pdf/pdfEngine.ts.
| Audit question | This tool | Where the rest comes from |
|---|---|---|
| What fields should exist? | Provides the full field inventory | — |
| What's each field called? | Fully-qualified name | — |
| Which fields are signatures? | type = PDFSignature | Verify signing separately |
| Is a given field empty? | Not reported (value always "") | Value-reading step per submission |
| Which fields are required? | Not reported | You define from your process / PDF properties |
Field types and what to check
How each field type factors into a completion audit.
| Field type | Audit consideration |
|---|---|
| PDFTextField | Required ones must be non-empty |
| PDFCheckBox | Required consents must be checked |
| PDFRadioGroup | A single-choice answer must be selected |
| PDFDropdown | A valid option must be chosen |
| PDFOptionList | At least one selection where required |
| PDFButton | Ignore: not a submission field |
| PDFSignature | Must be present/signed where required |
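
The table above can be expressed as one per-type check. This is a minimal sketch that assumes your value-reading step normalizes values to a string (text, radio, dropdown), a boolean (checkbox checked, signature signed), or a string array (option list); `FieldValue` and `isComplete` are hypothetical names, and the tool itself supplies none of these values.

```typescript
// Assumed normalized value shapes produced by a separate value-reading step.
type FieldValue = string | boolean | string[] | undefined;

// Returns true when a field of the given type counts as completed.
function isComplete(type: string, value: FieldValue): boolean {
  switch (type) {
    case "PDFTextField":
    case "PDFRadioGroup":
    case "PDFDropdown":
      // Non-empty after trimming. Checking a dropdown against its list of
      // valid options would need that list, which the inventory does not carry.
      return typeof value === "string" && value.trim().length > 0;
    case "PDFCheckBox":
    case "PDFSignature":
      // true means checked, or signed as established by a separate verification.
      return value === true;
    case "PDFOptionList":
      return Array.isArray(value) && value.length > 0;
    case "PDFButton":
      // Buttons are not submission data, so they never count as incomplete.
      return true;
    default:
      // Unknown type: report as incomplete so it gets a manual look.
      return false;
  }
}
```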
Cookbook
Using the field inventory to drive a real completion audit. Field names are illustrative; yours appear verbatim in the JSON.
The form's field manifest
Extract the inventory once. This is the list every submission must satisfy — note that values are blank because this is the structure, not a submission.
[
{ "name": "applicant_name", "type": "PDFTextField", "value": "" },
{ "name": "ssn", "type": "PDFTextField", "value": "" },
{ "name": "consent", "type": "PDFCheckBox", "value": "" },
{ "name": "applicant_sig", "type": "PDFSignature", "value": "" }
]
Defining the required-field checklist
The tool doesn't carry a required flag, so you decide what's mandatory from the inventory and your own rules.
From the manifest, mark required:
    applicant_name -> required (text, must be non-empty)
    ssn            -> required (text, must be non-empty)
    consent        -> required (checkbox, must be checked)
    applicant_sig  -> required (signature, must be signed)
Save this checklist next to your audit script.
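One way to keep the checklist tied to the manifest is to derive it in code. The sketch below is an assumption about how you might store it: `buildChecklist` and `ChecklistItem` are illustrative names, and the list of required names is something you supply, since the tool reports no required flag.

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

interface ChecklistItem {
  name: string;
  type: string;
  required: boolean;
}

// Build a checklist from the inventory plus your own list of required names.
// Throws if a required name is not in the inventory, which usually means a
// typo or an outdated form revision.
function buildChecklist(inventory: FieldEntry[], requiredNames: string[]): ChecklistItem[] {
  const known = inventory.map((f) => f.name);
  const missing = requiredNames.filter((n) => known.indexOf(n) === -1);
  if (missing.length > 0) {
    throw new Error(`Not in inventory: ${missing.join(", ")}`);
  }
  return inventory
    .filter((f) => f.type !== "PDFButton") // buttons are not submission data
    .map((f) => ({
      name: f.name,
      type: f.type,
      required: requiredNames.indexOf(f.name) !== -1,
    }));
}
```

Rejecting unknown names up front guards against checking submissions for a field the form never had.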
Auditing one submission against the manifest
Read a submission's values separately, then compare. Empty required fields and unsigned signatures are your incompletes.
Submission read (separate step) vs required checklist:
    applicant_name = "Dana Reyes"   OK
    ssn            = ""             MISSING (required)
    consent        = checked        OK
    applicant_sig  = unsigned       MISSING (required)
Incomplete: ssn, applicant_sig -> request resubmission.
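The comparison itself is a short filter once the values are in hand. This sketch assumes the values were already read by your separate step into a plain object keyed by field name, with booleans for checkboxes and signatures; `auditSubmission` and `isFilled` are hypothetical helpers.

```typescript
// Assumed normalized values from a separate value-reading step.
type FieldValue = string | boolean | string[] | undefined;

interface ChecklistItem {
  name: string;
  type: string;
  required: boolean;
}

// A value counts as filled when it is non-blank text, a non-empty selection, or true.
function isFilled(value: FieldValue): boolean {
  if (typeof value === "string") return value.trim().length > 0;
  if (Array.isArray(value)) return value.length > 0;
  return value === true;
}

// Returns the names of required fields that are empty, unchecked, or unsigned.
function auditSubmission(
  checklist: ChecklistItem[],
  values: Record<string, FieldValue>
): string[] {
  return checklist
    .filter((item) => item.required && !isFilled(values[item.name]))
    .map((item) => item.name);
}
```

Run against the example above (name filled, ssn empty, consent checked, signature unsigned), it returns ssn and applicant_sig.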
Catching a form revision that broke the audit
Re-extracting the inventory after a form change shows added/removed fields so your audit checklist stays accurate.
Old manifest: applicant_name, ssn, consent, applicant_sig
New manifest: applicant_name, ssn, consent, applicant_sig, dob
New field 'dob' added. Decide whether it is required, then update the checklist before auditing the new batch.
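Comparing two manifests can be automated so a revision never slips past the checklist. A minimal sketch, assuming both manifests are the JSON arrays the tool downloads; `diffManifests` is an illustrative name, and the `retyped` list (same name, different type) is an extra check beyond the added/removed comparison described above.

```typescript
interface FieldEntry {
  name: string;
  type: string;
}

interface ManifestDiff {
  added: string[];
  removed: string[];
  retyped: string[];
}

// Compare an old and a new inventory by field name.
function diffManifests(oldFields: FieldEntry[], newFields: FieldEntry[]): ManifestDiff {
  const hasName = (fields: FieldEntry[], name: string): boolean =>
    fields.some((f) => f.name === name);
  return {
    added: newFields.filter((n) => !hasName(oldFields, n.name)).map((n) => n.name),
    removed: oldFields.filter((o) => !hasName(newFields, o.name)).map((o) => o.name),
    // Same name in both manifests, but the field type changed between revisions.
    retyped: newFields
      .filter((n) => oldFields.some((o) => o.name === n.name && o.type !== n.type))
      .map((n) => n.name),
  };
}
```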
Listing signature fields to confirm presence
Filter the inventory by type to see exactly which signatures each submission must contain.
Filter type == PDFSignature:
    applicant_sig
    witness_sig
Every submission must carry both. Confirm they're actually signed and valid with PDF Signature Verify, not just present.
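The filter, and the matching presence check on a submission's own inventory, might look like the sketch below. Both function names are hypothetical, and the check covers presence only; it says nothing about whether a field is signed or valid.

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Names of all signature fields in an inventory.
function signatureFieldNames(inventory: FieldEntry[]): string[] {
  return inventory.filter((f) => f.type === "PDFSignature").map((f) => f.name);
}

// Signature fields the baseline expects but a submission's inventory lacks.
function missingSignatureFields(baseline: FieldEntry[], submission: FieldEntry[]): string[] {
  const present = signatureFieldNames(submission);
  return signatureFieldNames(baseline).filter((name) => present.indexOf(name) === -1);
}
```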
Edge cases and what actually happens
It does not report empty vs filled
Structure only: Every value is an empty string regardless of whether the field was completed. The tool inventories fields; it doesn't read their state. Detect empties in a downstream value-reading step that compares each submission against this inventory.
Required-field flag isn't in the output
Not reported: The output is name + type + empty value. It doesn't surface the PDF's 'required' flag, read-only state, or default values. Define which fields are mandatory yourself, based on the inventory and your process rules.
Signature field listed but not validated
Expected: A PDFSignature entry tells you a signature field exists, not whether it's signed or whether the signature is cryptographically valid. Confirm signing with PDF Signature Verify.
Auditing many submissions
Manual: The tool reads one file per run and returns structure, not data. Extract the inventory once for the baseline, then read each submission's values separately and compare. There's no batch audit in the UI.
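Outside the UI, the per-submission comparison can be looped over a batch. This is a sketch under stated assumptions: each submission's values were already read by your own step into an object keyed by field name, and `Submission`, `AuditRow`, and `auditBatch` are illustrative names rather than anything the tool provides.

```typescript
// Assumed normalized values: text as string, checkbox/signature state as boolean.
type FieldValue = string | boolean | undefined;

interface Submission {
  file: string;
  values: Record<string, FieldValue>;
}

interface AuditRow {
  file: string;
  missing: string[];
}

// Returns one row per incomplete submission, listing its missing required fields.
function auditBatch(requiredNames: string[], submissions: Submission[]): AuditRow[] {
  return submissions
    .map((s): AuditRow => ({
      file: s.file,
      missing: requiredNames.filter((name) => {
        const v = s.values[name];
        const filled = v === true || (typeof v === "string" && v.trim() !== "");
        return !filled;
      }),
    }))
    .filter((row) => row.missing.length > 0);
}
```

Complete submissions drop out of the result, so the output doubles as the list of files to send back for completion.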
Flattened submission
No fields: If a submission was flattened before you got it (see PDF Flatten), it has no interactive fields, so the inventory comes back empty for that file. Audit the visible text with PDF to Text instead.
Scanned/printed submission
No fields: A printed-and-scanned form is an image with no fields to inventory. Use PDF OCR to recognise the content, then audit completeness against the recognised text manually.
Pure-XFA dynamic form
AcroForm only: pdf-lib parses AcroForm fields, not the XFA XML layer, so a pure-XFA form may return few or no fields. AcroForm forms inventory reliably; XFA needs an XFA-aware tool.
Hidden fields appear in the inventory
Expected: All interactive fields are enumerated, including hidden/conditional ones. This is useful for an audit, since they're part of the form's true structure, but decide per field whether they're in scope for your completeness check.
Buttons in the inventory
Ignore: PDFButton fields (submit/reset) aren't submission data. Exclude them from your required-field checklist so they don't distort the audit.
Permission-restricted submission
Supported: Forms that only restrict editing still inventory fine, since the engine ignores encryption for parsing. A submission requiring a password to open may fail; clear it with PDF Unlock first.
Frequently asked questions
Does this tell me which fields a submission left empty?
Not on its own. It inventories the form's fields — every field's name and type — but returns an empty value for each, so it doesn't report filled-vs-empty state. The right audit workflow is: use this tool to capture the authoritative field inventory, then in a separate value-reading step compare each submission's values against that inventory to flag empties. This guarantees you're checking the right field names.
How does it identify required vs optional fields?
It doesn't — the output is name, type, and an empty value, with no required flag. You decide which fields are mandatory based on the inventory and your own process. In practice you mark the required fields once on the downloaded inventory and keep that checklist with your audit logic.
Why is the field inventory useful for auditing at all?
Because the most common audit failure is checking submissions against a wrong or outdated field name. The inventory gives you the form's real field names and types, straight from the PDF, so your completeness checks always target fields that actually exist. It also surfaces every interactive field — including hidden ones easy to miss in a reader.
Can it confirm whether a form was signed?
It can tell you a signature field exists (it appears as PDFSignature in the inventory), but it can't tell you whether that field is actually signed or whether the signature is valid. To verify signing, use PDF Signature Verify.
Can I audit many submissions automatically?
The tool itself reads one file per run and returns structure, not data. For automation, extract the inventory once to establish the baseline, then run a value-reading step over each submission keyed by the inventory's field names and flag incompletes programmatically. The inventory is the contract your automation checks against.
What about digitally signed PDF forms?
Their fields usually inventory fine because the engine loads the file with encryption ignored. The signature field shows as PDFSignature. Whether it's actually signed and valid is a separate check; use PDF Signature Verify for that.
What if a submission returns no fields?
An empty inventory means no interactive AcroForm fields — the submission was likely flattened, scanned, or pure-XFA. For flattened/scanned files, audit the visible content via PDF to Text or PDF OCR rather than the field layer.
Will hidden or conditional fields show up?
Yes. Every interactive field is enumerated, including hidden and conditional ones. That's helpful for a thorough audit, but you decide per field whether it's in scope — some conditional fields are legitimately empty when their branch wasn't triggered.
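One way to handle conditional fields is to make each requirement depend on a trigger value, so an untriggered branch is never flagged. The rule shape and the field names `marital_status` and `spouse_name` are purely hypothetical; the tool does not report which fields are conditional or what triggers them.

```typescript
// Assumed normalized values from a separate value-reading step.
type Values = Record<string, string | boolean | undefined>;

// Hypothetical rule: `field` is required only when `whenField` equals `equals`.
interface ConditionalRule {
  field: string;
  whenField: string;
  equals: string;
}

// A conditional field is in scope only when its trigger matches.
function inScope(rule: ConditionalRule, values: Values): boolean {
  return values[rule.whenField] === rule.equals;
}

// Returns in-scope conditional fields that were left empty.
function missingConditional(rules: ConditionalRule[], values: Values): string[] {
  return rules
    .filter((rule) => inScope(rule, values))
    .map((rule) => rule.field)
    .filter((name) => {
      const v = values[name];
      return !(v === true || (typeof v === "string" && v.trim() !== ""));
    });
}
```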
Should I include buttons in my checklist?
No. PDFButton fields are actions (submit/reset), not data. Exclude them from your required-field checklist so the audit only measures real inputs.
Can I export the inventory as CSV?
No — the output is a JSON array, downloaded as a .json file. Convert it to a checklist or spreadsheet downstream if you prefer that format for the audit.
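If a spreadsheet suits your audit better, the conversion is small. The sketch below assumes the downloaded array of { name, type, value } entries and adds an empty `required` column for you to fill in; that column and the name `inventoryToCsv` are illustrative choices, not tool output.

```typescript
interface FieldEntry {
  name: string;
  type: string;
  value: string;
}

// Quote a cell only when it contains a comma, a double quote, or a line break.
function csvCell(text: string): string {
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert the inventory to CSV text with a blank "required" column to mark by hand.
function inventoryToCsv(inventory: FieldEntry[]): string {
  const header: string[][] = [["name", "type", "required"]];
  const rows = header.concat(inventory.map((f) => [f.name, f.type, ""]));
  return rows.map((row) => row.map(csvCell).join(",")).join("\r\n");
}
```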
Is the submitted form data uploaded?
No. Parsing runs entirely in your browser via pdf-lib; submitted forms never reach a server. The result panel shows '0 bytes uploaded'. Only an anonymous usage counter is recorded when you're signed in — relevant when auditing forms containing personal data.
What are the file size and page limits?
Free tier accepts up to 2 MB and 50 pages per file; Pro raises that to 50 MB and 500 pages. Most form submissions sit well under the free limit; a heavily scanned or image-rich submission can exceed it.
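When sorting a batch, you could pre-check each file against these limits before opening it in the tool. The sketch assumes 1 MB means 1,048,576 bytes and that you already know each file's page count; how the tool itself measures size is not stated here, and `fitsTier` is a hypothetical helper.

```typescript
type Tier = "free" | "pro";

// Limits as stated above. Treating 1 MB as 1024 * 1024 bytes is an assumption.
const LIMITS: Record<Tier, { maxBytes: number; maxPages: number }> = {
  free: { maxBytes: 2 * 1024 * 1024, maxPages: 50 },
  pro: { maxBytes: 50 * 1024 * 1024, maxPages: 500 },
};

// True when a file is within both the size and page limits for the tier.
function fitsTier(tier: Tier, bytes: number, pages: number): boolean {
  const limit = LIMITS[tier];
  return bytes <= limit.maxBytes && pages <= limit.maxPages;
}
```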
Privacy first
All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.