Convert a PDF to Editable Word Text

How to convert a PDF to editable Word text

Step 1
Open the PDF to Word converter — Load the PDF to Word tool. It is a pure browser tool — pdf.js parses the file on your own machine and nothing is uploaded.
Step 2
Drop in your PDF — Drag the file onto the dropzone (or click to browse). It accepts a single PDF. Extraction starts automatically — there are no settings to choose first.
Step 3
Read the on-screen preview — The extracted text appears in a scrollable panel, truncated to the first 5,000 characters. Skim it to confirm the text layer came through (a blank or garbled preview is the signal you have a scanned PDF — see the cookbook).
Step 4
Download the .txt file — Click Download. You get a UTF-8 text file named after your PDF (for example contract.pdf → contract.txt) containing the full text, not just the preview.
Step 5
Open or paste into Word — Open the .txt in Word, Google Docs, or LibreOffice — or paste its contents into an existing document. Word imports plain text as a single body stream; you then apply Heading 1/2, lists, and tables using Word's own styles.
Step 6
Apply structure and clean line breaks — Use Find & Replace in Word to tidy the predictable artefacts: collapse runs of spaces, re-join lines that were hard-wrapped in the PDF, and promote section titles to heading styles. Five minutes of styling beats an hour of un-boxing a fake .docx.
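Steps 3 and 4 describe two small rules: the preview is cut at 5,000 characters, and the download takes the PDF's name with a .txt extension. A minimal sketch of those two rules follows. The helper names `previewOf` and `outputName` are hypothetical and are not taken from the tool's source.

```typescript
// Hypothetical helpers that illustrate the preview and naming rules described above.
const PREVIEW_LIMIT = 5000; // the guide states the preview shows the first 5,000 characters

// Return the portion of the extracted text a preview panel would show.
// Note: slice() counts UTF-16 code units, which is an assumption about how "characters" are counted.
function previewOf(fullText: string, limit: number = PREVIEW_LIMIT): string {
  return fullText.length <= limit ? fullText : fullText.slice(0, limit);
}

// Derive the download name, e.g. contract.pdf -> contract.txt (the .pdf suffix is matched case-insensitively).
function outputName(pdfName: string): string {
  return pdfName.replace(/\.pdf$/i, "") + ".txt";
}
```

The downloaded file always holds the full text; only the on-screen preview is shortened.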

What actually carries across (and what doesn't)

The converter extracts the PDF text layer only. Everything that isn't selectable text is, by definition, not in that layer.

PDF element	In the output?	Why / what to do instead
Body text & headings (text)	Yes — text only	All selectable characters are extracted in reading order. The words come across; the heading style does not — re-apply Heading 1/2 in Word.
Paragraph breaks	Partly	Pages are separated by a blank line (`\n\n`). Within a page, text items are joined with single spaces, so original line/paragraph wrapping is approximate — re-flow in Word.
Tables	Text only, not grids	Cell text is extracted but not as a Word table. For real tabular structure use PDF to Excel (CSV) or PDF table to JSON.
Images, logos, charts	No	Images are not text and are skipped. Re-insert them in Word, or pull page renders with PDF to JPG.
Fonts, colours, sizes	No	Styling is dropped — you get raw text. Apply your template's fonts and styles in Word after pasting.
Scanned / image-only pages	No (empty)	There is no text layer to read. Run PDF OCR first to add one, then convert.
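The "Paragraph breaks" row describes a simple assembly rule: fragments within a page are joined with single spaces, and pages are separated by a blank line. The sketch below shows only that rule, using plain string arrays as input. In pdf.js the fragments would typically come from each page's text content items, but the function name `assembleText` and its input shape are assumptions made for illustration.

```typescript
// One entry per page; each page is the list of text fragments in the order they were reported.
type PageItems = string[];

// Join fragments with single spaces inside a page, and pages with a blank line ("\n\n").
function assembleText(pages: PageItems[]): string {
  return pages
    .map((items) => items.join(" "))
    .join("\n\n");
}
```

Because every fragment boundary becomes a space, the result cannot tell a line break from a word gap, which is why wrapping in the output is only approximate.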

File-size & page limits by tier

These PDF-family limits are enforced before the converter runs (lib/tier-limits.ts).

Tier	Max file size	Max pages	Batch files
Free	2 MB	50 pages	1
Pro	50 MB	500 pages	5
Pro Media	500 MB	2,000 pages	50
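As a rough illustration of how a pre-flight check against this table could look, here is a sketch. It is not the contents of lib/tier-limits.ts, which this page does not show. The type names, the `checkLimits` function, and the choice of 1 MB = 1024 × 1024 bytes are all assumptions.

```typescript
// Hypothetical tier table built from the figures above; not the real lib/tier-limits.ts.
type Tier = "free" | "pro" | "proMedia";

interface TierLimit {
  maxBytes: number;
  maxPages: number;
  maxBatchFiles: number;
}

const MB = 1024 * 1024; // assumption: binary megabytes

const LIMITS: Record<Tier, TierLimit> = {
  free: { maxBytes: 2 * MB, maxPages: 50, maxBatchFiles: 1 },
  pro: { maxBytes: 50 * MB, maxPages: 500, maxBatchFiles: 5 },
  proMedia: { maxBytes: 500 * MB, maxPages: 2000, maxBatchFiles: 50 },
};

// Return a rejection reason, or null when the file fits within the tier.
function checkLimits(tier: Tier, sizeBytes: number, pageCount: number): string | null {
  const limit = LIMITS[tier];
  if (sizeBytes > limit.maxBytes) return "file too large";
  if (pageCount > limit.maxPages) return "too many pages";
  return null;
}
```

Under these assumptions a 120-page, 1 MB handbook is rejected on the free tier for its page count but passes on Pro.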

Cookbook

Concrete before/after for the everyday "I just need to edit this PDF in Word" job. The Output blocks show roughly what lands in the .txt file.

A two-page Word-exported PDF, round-tripped back to editable text

The cleanest case: the PDF was exported from Word, so it has a perfect text layer. Extraction is near-lossless at the character level — you lose styling, not words.

Input:  proposal.pdf  (2 pages, exported from Word, 180 KB)

Workflow:
  1. Drop proposal.pdf onto /pdf-tools/pdf-to-word
  2. Preview shows clean text  ->  click Download
  3. proposal.txt opens in Word

Output (proposal.txt, abbreviated):
Project Proposal
Prepared for Acme Ltd

(page 1 body text in reading order...)

(blank line marks the page break)

(page 2 body text...)

Spotting a scanned PDF before you waste time in Word

If the preview is blank or shows only stray characters, the PDF has no text layer. Catch it here, OCR it, then convert — don't paste an empty file into Word and wonder why.

Input:  scanned-letter.pdf  (1 page, photo of a printed letter)

Preview panel:
  (empty)        <- no extractable text layer

Fix:
  1. Run /pdf-tools/pdf-ocr (language: English) -> searchable PDF
  2. Drop the OCR'd PDF onto /pdf-tools/pdf-to-word
  3. Now the preview shows the recognised text -> Download
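The "blank or only stray characters" check can also be expressed as a small heuristic. The sketch below is an assumption-laden illustration: the function name `looksLikeScan` and the threshold of 20 non-whitespace characters are hypothetical, and the tool itself simply shows the preview and leaves the judgement to you.

```typescript
// Heuristic: treat the PDF as a probable scan when the extracted text has
// fewer than minChars non-whitespace characters. The threshold is an arbitrary assumption.
function looksLikeScan(extracted: string, minChars: number = 20): boolean {
  const nonWhitespace = extracted.replace(/\s+/g, "");
  return nonWhitespace.length < minChars;
}
```

A short but genuine document (a one-line cover note, say) would trip this check too, so treat a positive result as a prompt to look at the preview and not as proof.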

Re-joining lines that were hard-wrapped in the PDF

PDFs often hard-wrap each visual line. The extractor preserves the words but the breaks come through as the source had them. One Find & Replace in Word reflows the prose.

Output (.txt) — wrapped as in the PDF:
The quarterly results exceeded
expectations across every region
except EMEA.

In Word, Find & Replace (tick "Use wildcards"; in wildcard mode a paragraph mark is written ^13):
  Find:    ([a-z,])^13([a-z])
  Replace: \1 \2

Result:
The quarterly results exceeded expectations across every region except EMEA.
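If the .txt does contain hard line breaks and you would rather fix them before opening Word, the same idea can be sketched in code. The function name `rejoinWrappedLines` is hypothetical. It joins a line break only when the previous character is a lowercase letter or comma and the next character is a lowercase letter, so blank lines between pages and paragraphs are left alone.

```typescript
// Replace a single line break with a space when it sits between
// a lowercase letter or comma and a lowercase letter.
// The lookahead keeps the following letter available for the next match.
function rejoinWrappedLines(text: string): string {
  return text.replace(/([a-z,])\r?\n(?=[a-z])/g, "$1 ");
}
```

Lines that end a sentence (full stop) or start with a capital letter are deliberately not joined, which preserves most real paragraph boundaries at the cost of missing a few wrapped lines.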

Convert only the pages you'll actually edit

Editing one section of a 120-page PDF? Extract that range first so the .txt is short and the free-tier page limit isn't a factor.

Goal: edit pages 12-18 of handbook.pdf in Word

  1. /pdf-tools/pdf-extract-pages -> pages "12-18" -> handbook.extract-pages.pdf
  2. /pdf-tools/pdf-to-word on that 7-page PDF
  3. handbook.extract-pages.txt  ->  paste the section into Word
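The arithmetic behind "that 7-page PDF" is an inclusive range: pages 12 through 18 make 18 − 12 + 1 = 7 pages, well under the free tier's 50. A small sketch of parsing such a range follows. The function name `parsePageRange` and the accepted "first-last" syntax are assumptions and may not match what the Extract Pages tool actually accepts.

```typescript
// Parse an inclusive range such as "12-18" into the list of 1-based page numbers.
function parsePageRange(range: string): number[] {
  const match = /^\s*(\d+)\s*-\s*(\d+)\s*$/.exec(range);
  if (!match) throw new Error(`Unrecognised page range: ${range}`);
  const first = Number(match[1]);
  const last = Number(match[2]);
  if (first < 1 || last < first) throw new Error(`Invalid page range: ${range}`);
  const pages: number[] = [];
  for (let page = first; page <= last; page++) pages.push(page);
  return pages;
}
```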

Encrypted PDF: decrypt first, then convert

A password-protected PDF can't be parsed for text until it's decrypted. Remove the password (using the password you legitimately hold), then run the converter.

Input:  signed-offer.pdf  (opens only with a password)

Direct convert -> error: the file can't be parsed while encrypted

Fix:
  1. /pdf-tools/pdf-remove-password (enter your password) -> decrypted PDF
  2. /pdf-tools/pdf-to-word -> signed-offer.txt -> edit in Word

Edge cases and what actually happens

You expected a .docx but got a .txt

By design

This tool extracts the PDF text layer and downloads it as a UTF-8 .txt file (named yourfile.txt). It does not synthesise a Microsoft Word .docx with styles, tables, and images. Open or paste the .txt into Word and apply formatting there. If you specifically need structured tables out of the PDF, use PDF to Excel instead.

Scanned or photographed PDF

No text layer

Image-only PDFs (scans, phone photos, faxed pages) contain pixels, not selectable text, so the preview comes back empty. Run PDF OCR first — it recognises the glyphs and emits a searchable PDF — then convert that to text.

Encrypted / password-protected PDF

Blocked until decrypted

pdf.js cannot read the text of an encrypted PDF. Decrypt it first with Remove PDF Password (using the password you legitimately hold) or Unlock PDF for owner-restricted files, then run the converter.

Multi-column layout (newsletter, academic paper)

Reading order may differ

Text is extracted in the order pdf.js reports items, grouped per page. For two- and three-column layouts the columns can interleave rather than reading down one column then the next. Skim the preview; if columns are scrambled, paste into Word and re-order the blocks manually.

Spacing looks off — extra or missing spaces

Expected

Within each page, text fragments are joined with a single space, so kerned or justified text can pick up extra spaces, and some glyph runs can lose them. This is cosmetic — fix with Word's Find & Replace (collapse multiple spaces to one).
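The "collapse multiple spaces to one" clean-up can also be done on the .txt itself before it reaches Word. A minimal sketch, with the hypothetical name `collapseSpaces`:

```typescript
// Collapse every run of two or more spaces into a single space.
// Only the space character is touched, so line breaks and blank lines survive.
function collapseSpaces(text: string): string {
  return text.replace(/ {2,}/g, " ");
}
```

This fixes surplus spaces only; words that lost their space during extraction still have to be separated by hand.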

PDF over the tier size or page limit

Rejected

Free tier caps at 2 MB / 50 pages. A larger document is blocked before extraction. Either upgrade (Pro = 50 MB / 500 pages) or split it first with Extract Pages and convert each part.

Ligatures and special glyphs

Usually preserved

Whether fi, fl, or ﬀ come back as separate letters or as a single ligature character depends on the font's ToUnicode map embedded in the PDF. Output is UTF-8, so any glyph the PDF maps correctly survives; a small number of decorative fonts map ligatures to private-use code points that may look odd — search-and-replace fixes those.
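When ligatures arrive as single characters and you want plain letters instead, Unicode compatibility normalisation can expand them. The sketch below limits itself to the Latin ligature block U+FB00 to U+FB06, so other characters are untouched. The function name `expandLatinLigatures` is hypothetical, and ligatures mapped to private-use code points are not fixed by this approach because they carry no standard decomposition.

```typescript
// Expand the Latin ligatures U+FB00..U+FB06 (ff, fi, fl, ffi, ffl, long s-t, st)
// into their component letters using NFKC normalisation on just those characters.
function expandLatinLigatures(text: string): string {
  return text.replace(/[\uFB00-\uFB06]/g, (ligature) => ligature.normalize("NFKC"));
}
```

Normalising only the matched characters, not the whole string, avoids side effects such as NFKC rewriting superscripts or full-width forms elsewhere in the document.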

Hidden / off-page text comes through

Expected

The extractor returns the whole text layer, including text positioned off the visible page or set to a tiny/transparent size (sometimes used for SEO or watermarks). If unexpected strings appear, they were in the PDF's text layer — review and delete them in Word.

Form fields and their values

Partly extracted

Static text on a form is extracted; the contents of interactive AcroForm fields may not appear in the text layer. To pull field names and values specifically, use a form-aware tool such as PDF Form Extractor.

Frequently asked questions

Do I get a real .docx Word file?

No — and that's deliberate. The tool extracts the PDF's text layer and downloads it as a UTF-8 .txt file. You open or paste that into Microsoft Word, Google Docs, or LibreOffice and apply your own styles. There is no auto-generated .docx with reconstructed tables and images, because those reconstructions are usually more trouble to clean up than starting from clean text.

Why .txt instead of .docx?

Reliability and honesty. A faithful .docx reconstruction from a PDF requires guessing styles, table grids, and image placement, and the results are routinely messy. Clean extracted text drops into Word instantly and you control the formatting. If you need tabular data structured, PDF to Excel is the right tool; for Markdown, use PDF to Markdown.

Is my document uploaded anywhere?

No. The PDF is parsed in your browser with pdf.js. Nothing is sent to a server — the on-screen badge reads "Local browser processing · 0 bytes uploaded." This is the main reason to use it for offer letters, NDAs, and other confidential drafts.

Will formatting and fonts be preserved?

No. You get the text content; fonts, colours, sizes, headings, and layout are not carried into the .txt. Re-apply your styles in Word after pasting. Paragraph breaks are approximate — pages are separated by a blank line and intra-page wrapping comes through as the PDF had it.

Does it convert tables into Word tables?

No. Table cell text is extracted, but as plain text, not as a Word grid. For structured tables, use PDF to Excel (CSV output, columns detected by position) or PDF table to JSON.

What about images and charts?

They are skipped — images are not text. Re-insert graphics in Word, or render pages to images with PDF to JPG and paste those in.

It returned nothing / a blank preview. Why?

Your PDF is almost certainly a scan or photo — an image with no selectable text layer. Run PDF OCR first to create a searchable PDF, then convert that. A blank preview is the tool telling you there was no text to extract.

Are there any settings to configure?

No. The converter has no options panel — it runs automatically as soon as you drop the file, shows a preview, and offers a Download button. Any tuning (re-joining lines, fixing spacing, applying styles) happens afterwards in Word.

How big a PDF can I convert?

Free tier: up to 2 MB and 50 pages. Pro: 50 MB and 500 pages. Pro Media: 500 MB and 2,000 pages. Over the limit, split the file with Extract Pages and convert sections.

Can I open the .txt directly in Word without copy-pasting?

Yes. In Word choose File → Open and pick the .txt; Word imports it as a plain-text body you can then style. Google Docs and LibreOffice open .txt the same way. Copy-paste works too if you want it inside an existing document.

What's the difference between this and the PDF to Text tool?

Functionally none — both extract the text layer and produce a .txt file. The PDF to Text tool is framed for search/NLP/plumbing workflows; this page is framed for the "I want to edit it in Word" workflow. Pick whichever name matches your intent.

Will reading order always be correct?

For single-column documents, yes. Multi-column layouts (papers, newsletters) can interleave columns because extraction follows the order pdf.js reports text items in. Check the preview; re-order blocks in Word if needed.

Privacy first

All PDF processing runs locally in your browser using pdf-lib and pdf.js. No file is ever uploaded; only metadata counters are saved for signed-in dashboard stats.


