How to convert a PDF bank statement to a spreadsheet
- Step 1: Download a real PDF statement. Export the statement as a PDF from your online-banking portal (not a screenshot). If you can select the transaction text in the PDF, it has a text layer and will extract. A scanned statement needs OCR first.
- Step 2: Drop the statement onto the converter. Add the file above. There are no options; extraction runs as soon as the file is read, in your browser, so banking data never leaves your device.
- Step 3: Review the transaction rows. The preview shows the first ~5,000 characters of CSV. Confirm transactions came through and note where the statement header, address block, and per-page balance summaries landed.
- Step 4: Download the CSV. Click Download; the file saves as .txt with CSV contents. Rename it to .csv if your importer keys off the extension.
- Step 5: Clean and type the columns. Delete header/branding/summary rows. Set the Date column to a date type and the amount columns to Number on import (Data → From Text/CSV) so Excel doesn't misread DD/MM dates or strip leading zeros.
- Step 6: Reconcile or import to your accounting tool. Sort and filter for reconciliation in Excel, or import the cleaned CSV into Xero/QuickBooks via their bank-statement import, mapping Date, Description, and Amount.
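If you convert statements regularly, the rename in Step 4 can be scripted. This is a minimal sketch, assuming the download is UTF-8 text; the function name and the `statement.txt` path in the usage note are hypothetical and not part of the tool.

```python
from pathlib import Path


def as_csv_copy(txt_path: str) -> Path:
    """Copy a downloaded .txt export to a sibling .csv file; the original is kept."""
    src = Path(txt_path)
    dst = src.with_suffix(".csv")
    # utf-8-sig drops a byte-order mark on read if one is present
    # (assumption: the export is UTF-8).
    dst.write_text(src.read_text(encoding="utf-8-sig"), encoding="utf-8")
    return dst
```

Calling `as_csv_copy("statement.txt")` would write `statement.csv` next to the download, leaving the `.txt` in place in case the copy needs to be redone.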
Bank-statement layouts and how they extract
Banks structure transaction tables differently; the position-based extraction follows whatever columns are in the source.
| Layout | How it extracts |
|---|---|
| Separate Debit / Credit / Balance columns | Each lands in its own column when grid-aligned in the source |
| Single signed Amount column | Comes through as one column; signs/parentheses kept as printed text |
| Running balance inline per row | Extracts as a column — verify by re-deriving it in your sheet |
| Wrapped long descriptions | Wrap onto a second visual line → a second CSV row; merge after import |
| Per-page 'balance brought forward' summaries | Extract as their own rows between transaction blocks — delete |
What the output is (and isn't)
Set expectations so the file fits a bookkeeping workflow.
| Aspect | Reality |
|---|---|
| File format | CSV-formatted text, downloaded as .txt (rename to .csv) |
| Date typing | Dates are text as printed (DD/MM/YYYY etc.); set the type on import |
| Amount typing | Amounts are text; strip symbols/commas and format as Number |
| Categorisation | None — no transaction tagging; you categorise in your tool |
| Multiple statements at once | No — single file per run; batch is not in this tool's UI |
File-size and page limits by tier
Long statement runs can be large; check your tier's limit before adding the file.
| Tier | Max file size | Max pages |
|---|---|---|
| Free | 2 MB | 50 pages |
| Pro | 50 MB | 500 pages |
| Pro + Media | 500 MB | 2,000 pages |
| Developer | 2 GB | 10,000 pages |
| Enterprise | Unlimited | Unlimited |
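The size column of the table can be checked before you add a file. The sketch below is illustrative only: it assumes 1 MB means 1024 × 1024 bytes (the tool may count differently), it does not check the page limit, and `smallest_tier_for` is a hypothetical helper, not part of the tool.

```python
MB = 1024 * 1024

# Size caps copied from the tier table, in table order; None means unlimited.
# Assumption: 1 MB = 1024 * 1024 bytes and 2 GB = 2 * 1024 MB.
MAX_BYTES = {
    "Free": 2 * MB,
    "Pro": 50 * MB,
    "Pro + Media": 500 * MB,
    "Developer": 2 * 1024 * MB,
    "Enterprise": None,
}


def smallest_tier_for(size_bytes: int) -> str:
    """Return the first tier, in table order, whose size cap admits the file."""
    for tier, cap in MAX_BYTES.items():
        if cap is None or size_bytes <= cap:
            return tier
    raise ValueError("no tier admits this file")
```

For a file on disk, pass `pathlib.Path("statement.pdf").stat().st_size` as `size_bytes` (the file name is a placeholder).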
Cookbook
Real statement extractions. Figures are illustrative; output is shown verbatim so you can see the header rows and column structure you'll clean up.
Debit / Credit / Balance columns
A statement with separate debit, credit, and balance columns extracts each to its own column when the source is grid-aligned.
CSV output:
"Date","Description","Debit","Credit","Balance"
"01/06","Opening balance","","","1,240.00"
"03/06","Card payment TESCO","42.10","","1,197.90"
"05/06","Salary ACME LTD","","2,100.00","3,297.90"
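With separate Debit and Credit columns, the printed running balance can be re-derived row by row as a check on the extraction. This is a sketch under stated assumptions: the first row is the opening balance, amounts use a comma for thousands and a dot for decimals, and the helper names are hypothetical.

```python
import csv
import io
from decimal import Decimal

SAMPLE = '''"Date","Description","Debit","Credit","Balance"
"01/06","Opening balance","","","1,240.00"
"03/06","Card payment TESCO","42.10","","1,197.90"
"05/06","Salary ACME LTD","","2,100.00","3,297.90"
'''


def money(text: str) -> Decimal:
    """Parse a printed amount such as '1,240.00'; a blank cell counts as zero."""
    cleaned = text.replace(",", "").strip()
    return Decimal(cleaned) if cleaned else Decimal("0")


def balance_mismatches(csv_text: str) -> list[str]:
    """Return the dates of rows whose printed balance differs from the re-derived one."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    running = money(rows[0]["Balance"])  # first row is treated as the opening balance
    bad = []
    for row in rows[1:]:
        running += money(row["Credit"]) - money(row["Debit"])
        if running != money(row["Balance"]):
            bad.append(row["Date"])
            running = money(row["Balance"])  # resync so one bad row is reported once
    return bad
```

`Decimal` is used rather than `float` so that pence-level comparisons are exact. An empty result from `balance_mismatches(SAMPLE)` means every printed balance agrees with the one derived from the debits and credits.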
Single signed-amount column
Some banks use one Amount column with signs. The sign comes through as printed; reformat for your accounting tool.
CSV output:
"Date","Description","Amount","Balance"
"03/06","TESCO","-42.10","1,197.90"
"05/06","ACME LTD","+2,100.00","3,297.90"
→ strip "+" and "," so amounts sum as numbers.
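The same cleanup can be done outside the spreadsheet. The following sketch assumes a dot decimal separator and comma thousands separator, and treats a leading minus or surrounding parentheses as negative (a common accounting convention, but an assumption here); `parse_amount` is a hypothetical helper.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert a printed amount such as '+2,100.00', '-42.10' or '(42.10)' to a Decimal."""
    s = text.strip()
    negative = s.startswith("-") or (s.startswith("(") and s.endswith(")"))
    # Keep ASCII digits and the decimal point only; this drops signs, commas,
    # parentheses, and currency symbols.
    digits = "".join(ch for ch in s if ch in "0123456789.")
    value = Decimal(digits)  # raises decimal.InvalidOperation on a blank cell
    return -value if negative else value
```

Once parsed, the values sum as numbers, for example `sum(parse_amount(a) for a in ["-42.10", "+2,100.00"])`. Formats this sketch does not cover, such as a trailing minus or a comma decimal separator, would need their own handling.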
Header and summary rows to delete
Bank branding, the account/address block, and per-page balance summaries extract as their own rows around the transactions.
CSV output:
"Your Bank plc — Statement"
"Account: ****1234 Sort: 00-00-00"
"Date","Description","Amount","Balance"
"03/06","TESCO","-42.10","1,197.90"
"Balance carried forward","","","1,197.90"
→ delete the title, account, and carried-forward rows.
DD/MM dates need typing on import
Dates extract as text exactly as printed. Set the column type on import so Excel doesn't read 03/06 as 6 March in a US locale (or vice versa).
Extracted: "03/06" "05/06" "11/06"
In Excel: Data → From Text/CSV → set Date column to Date (DMY) for UK statements so 03/06 = 3 June.
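If you clean the file in a script instead of Excel, parsing the text day-first avoids the locale ambiguity. The sketch below assumes UK-style dates; because the sample dates carry no year, the caller supplies one. `parse_uk_date` and the year 2024 used in the usage note are illustrative.

```python
from datetime import date, datetime


def parse_uk_date(text: str, statement_year: int) -> date:
    """Parse 'DD/MM' or 'DD/MM/YYYY' as day-first; the year is supplied when absent."""
    parts = text.strip().split("/")
    if len(parts) == 2:
        parts.append(str(statement_year))
    # %d/%m/%Y is explicit about day-first order, so no locale guess is involved.
    return datetime.strptime("/".join(parts), "%d/%m/%Y").date()
```

For example, `parse_uk_date("03/06", 2024)` gives 3 June 2024. A statement that spans a year boundary needs the year chosen per row, which this sketch leaves to the caller.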
Scanned statement needs OCR
A scanned or photographed statement has no text layer; OCR first, then extract.
Input: statement_scan.pdf (photo)
Output: (empty)
Fix:
1. /pdf-tools/pdf-ocr
2. re-run this tool → transactions extract
(verify OCR'd amounts and dates carefully)
Edge cases and what actually happens
Header, branding, and address rows mixed in
Noisy output: Bank branding, the account/sort-code block, and the address all print as text and extract as their own rows. The tool returns every line by position with no statement structure. Delete these so only transaction rows remain for reconciliation.
Per-page balance summaries become rows
Manual fixup: 'Balance brought forward' / 'carried forward' lines print on each page and extract as rows between transaction blocks. Delete them; they aren't transactions and will throw off any sum or running-balance check.
Long descriptions wrap onto two rows
By design: A transaction description that wraps to a second visual line sits at a different Y-position, so it becomes a separate CSV row beneath the transaction. Merge the two rows after import.
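The merge can be sketched in code. The shape of a continuation row is an assumption here: it is taken to be any row that does not start with a DD/MM or DD/MM/YYYY date, with its non-empty cells holding the rest of the description. The sketch also assumes header and summary rows have already been deleted, since they would otherwise be folded into the transaction above them. All names are hypothetical.

```python
import re

DATE_CELL = re.compile(r"^\d{2}/\d{2}(/\d{2,4})?$")  # assumed DD/MM[/YYYY] date format


def merge_wrapped(rows: list[list[str]], desc_col: int = 1) -> list[list[str]]:
    """Fold rows that do not start with a date into the description of the row above."""
    merged: list[list[str]] = []
    for row in rows:
        is_txn = bool(row) and DATE_CELL.match(row[0].strip()) is not None
        extra = " ".join(cell.strip() for cell in row if cell.strip())
        if is_txn or not merged or len(merged[-1]) <= desc_col:
            # A dated row, or nothing suitable above to merge into: keep as its own row.
            merged.append(list(row))
        elif extra:
            merged[-1][desc_col] = f"{merged[-1][desc_col]} {extra}"
    return merged
```

`desc_col=1` matches the Date, Description, ... column order used in the examples; change it if your statement orders columns differently.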
Dates misread by Excel locale
Excel coercion: Dates extract as text exactly as printed; Excel applies its locale on open and can flip DD/MM and MM/DD. Import via Data → From Text/CSV and set the Date column type explicitly (e.g. Date DMY for UK statements) so 03/06 reads as 3 June.
Amounts won't sum because of symbols/commas
Manual fixup: Amounts keep currency symbols, thousands separators, and any +/- signs as text, so Excel treats them as text. Strip the non-numeric characters and format the column as Number before totalling or reconciling.
Scanned or image-based statement
No text found: Image-only statements have no selectable text and yield nothing. OCR adds a text layer, but verify OCR'd digits on financial figures, since a misread amount silently breaks a reconciliation.
Debit/Credit columns shift on blank cells
Manual fixup: Where a row has only a debit or only a credit, that row has fewer fragments, so columns rebuilt from positions can shift. Re-align affected rows after import, or use a single signed-amount column in your sheet.
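When you cannot trust which column an amount landed in, one option is to take the sign from the running balance instead. This is a suggested technique rather than something the tool does. It assumes the balance column extracted correctly, that credits raise the balance and debits lower it, and that amounts use comma thousands and dot decimals; both helper names are hypothetical.

```python
from decimal import Decimal


def money(text: str) -> Decimal:
    """Parse a printed amount such as '1,197.90' (comma thousands, dot decimal assumed)."""
    return Decimal(text.replace(",", "").strip())


def signed_amount(prev_balance: str, balance: str, printed_amount: str) -> Decimal:
    """Derive the signed amount from the balance change and check it against the printed figure."""
    delta = money(balance) - money(prev_balance)
    if abs(delta) != money(printed_amount):
        raise ValueError(f"balance moved by {delta}, but the row prints {printed_amount}")
    return delta  # negative for a debit, positive for a credit
```

A `ValueError` here points at a missing or misaligned row between the two balances, which is worth fixing before normalising to a single Amount column.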
Statement run exceeds the free 2 MB / 50-page limit
Blocked on free tier: The free tier caps at 2 MB / 50 pages; a long multi-month statement can exceed it and is blocked before processing. Upgrade (Pro is 50 MB / 500 pages), or extract the pages for one period and process them.
Frequently asked questions
Does this create an Excel file?
It outputs CSV-formatted text that downloads as a .txt file. CSV opens in Excel or Google Sheets and imports into Xero/QuickBooks, so you get your transactions in a spreadsheet — it just isn't a native .xlsx with formatting. Rename to .csv if an importer needs the extension.
Will debit and credit columns be separated correctly?
For standard layouts with grid-aligned Date / Description / Debit / Credit / Balance columns, yes — each value lands in its column. On rows where only debit or only credit is filled, the column can shift because columns are rebuilt from text positions; re-align those rows, or normalise to a single signed Amount column in your sheet.
What if the balance column has running totals?
The running balance extracts as a column of values exactly as printed. Verify it by re-deriving it in Excel (previous balance ± transaction) — a mismatch flags a missing or misaligned row to fix before you rely on the data.
Does this work for credit-card statements?
Yes — credit-card statements use a similar transaction-table structure and extract the same way. Watch for the same header/summary rows and the same date/amount typing on import.
Why are my dates wrong after opening in Excel?
Dates extract as text as printed; Excel applies its locale on open and can swap day and month. Import via Data → From Text/CSV and set the Date column type explicitly (Date DMY for UK statements, MDY for US) so the dates read correctly.
Does it work on scanned statements?
Not directly — a scan has no text layer and yields nothing. Run OCR first, then re-run this tool. Verify OCR'd amounts and dates carefully; on a statement, a single misread digit breaks the reconciliation.
Can I categorise transactions automatically?
No — the tool extracts the rows; it doesn't tag or categorise them. Categorise in your spreadsheet or let your accounting tool's rules do it after import. This step saves the data entry, not the bookkeeping judgement.
Is my banking data uploaded anywhere?
No. Extraction runs entirely in your browser via pdf.js, so account numbers, transactions, and balances never reach a server, which is the privacy posture banking data calls for. When you're signed in, only anonymous usage counters are recorded.
Why do header and address lines appear as rows?
The tool returns every line of text on each page by position — bank branding, the account block, the address, and per-page balance summaries all print as text, so they extract as rows. Delete them in a quick cleanup; on your bank's stable template they land in the same place every month.
Can I process several months at once?
Not in one run — the tool takes a single file. Process each statement PDF individually. For volume, script extraction via the API and the local runner on a paid tier (Pro and above unlock API access).
How large a statement can I process?
Free tier handles up to 2 MB / 50 pages — enough for most monthly statements. Pro raises that to 50 MB / 500 pages, with higher tiers above for long consolidated runs.
Are formulas (e.g. the balance calculation) preserved?
No — PDFs store printed values only. You get the printed balances; re-create any running-balance or reconciliation formulas in your spreadsheet, which is also a useful check on the extracted figures.
Privacy first
All PDF processing runs locally in your browser using PDF-lib and pdf.js. No file is ever uploaded — only metadata counters are saved for signed-in dashboard stats.