Convert PDF Documentation to Markdown — Free Browser Tool

How to convert a PDF manual or docs into Markdown

Step 1
Check that the manual has selectable text: Try to select a paragraph in the PDF. If it highlights, conversion will work. Text in a scanned manual can't be selected, so run PDF OCR first to add a text layer.
Step 2
Split a big manual into page-range chunks: The free tier caps at 50 pages and Pro at 500. For a long manual, slice it into chapters with PDF Extract Pages and convert each chunk, so each output maps to a docs section.
Step 3
Drop the PDF onto the converter: The file is read in your browser with pdf.js and converted automatically; there is no settings panel. You'll get a ## Page N heading before every page.
Step 4
Download and split by section: Save the .md, then divide it into one file per topic. The ## Page N markers and the manual's section titles are natural boundaries; many teams script this split by heading.
Step 5
Promote headings and fence code blocks: Turn chapter and section titles (which arrived as plain text) into #/##/###. Wrap every code sample in a fenced block with a language tag. Code comes through as plain text, so this is manual but mechanical.
Step 6
Commit to your docs repo and build: Add the new .md files to your docs framework's content tree, update the sidebar/nav, and run the build. The PDF can now become a 'download the old manual' link if you still need it.

How documentation elements convert

What a manual or technical PDF produces, and the cleanup you'll do to make it real docs.

Doc element	In the Markdown?	Notes
Body text / instructions	Yes	Extracted page by page, one sentence per line.
Chapter / section headings	As plain text	Not promoted to `#`/`##`. You set heading levels and split into pages.
Code samples & commands	As plain text	No fenced blocks, no language tag, no monospace. Wrap in ``` yourself and add the language.
Numbered step lists	As plain text	Step numbers may survive as literal characters, but no Markdown `1.` list is created.
Tables (spec sheets, params)	No (flattened)	Cells collapse into text. Use PDF Table to JSON or PDF to Excel for the data.
Screenshots & diagrams	No	Images are ignored. Re-capture or export with PDF to PNG and embed manually.
Callouts / admonitions (Note, Warning)	As plain text	The words survive; the box styling does not. Re-wrap as `:::note` / `>` blocks for your framework.
Cross-references & ToC links	As plain text	Page-number references and ToC entries come through as text, not working links.

Output format and tier limits

Fixed pipeline — no options for encoding, page range, or splitting.

Property	Value
Input	One `.pdf` at a time
Output	One `.md` file, UTF-8, `text/markdown`
Headings emitted	`## Page N` only
Section splitting	Manual / scripted after download
Free tier	2 MB / 50 pages
Pro tier	50 MB / 500 pages
Privacy	In-browser; 0 bytes uploaded
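
The limits in the table can be checked before you drop a file. The sketch below is illustrative only: the names `Tier`, `TIERS`, and `fits_tier` are hypothetical, and it assumes a megabyte means 1024 × 1024 bytes, which the tool may count differently.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: the tool may define a megabyte differently


@dataclass(frozen=True)
class Tier:
    max_bytes: int
    max_pages: int


# Limits copied from the table above.
TIERS = {
    "free": Tier(max_bytes=2 * MB, max_pages=50),
    "pro": Tier(max_bytes=50 * MB, max_pages=500),
}


def fits_tier(size_bytes: int, pages: int, tier: str) -> bool:
    """Return True if a PDF of this size and page count is within the tier's caps."""
    limit = TIERS[tier]
    return size_bytes <= limit.max_bytes and pages <= limit.max_pages
```

You would supply the file size and page count yourself, for example from your PDF viewer's document properties.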

Cookbook

Recipes for migrating a PDF manual into a Markdown docs site. Sample content is illustrative.

Convert a short guide, then split by topic

A small manual converts in one pass; you then break it into one file per topic using the page markers as guides.

Input:  setup-guide.pdf (12 pages)

Output (setup-guide.md):
## Page 1
Getting Started
Install the CLI before you begin.
## Page 2
Configuration
Edit config.yaml to set your token.

→ split into getting-started.md, configuration.md, ...
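
One way to script this split is sketched below. It relies only on the `## Page N` markers described above; the function names and the `topics` mapping (file name to an inclusive page range) are hypothetical inputs you would define after reading the converted file.

```python
import re

# Matches the "## Page N" marker lines the converter emits.
PAGE_MARKER = re.compile(r"^## Page (\d+)[ \t]*$", re.MULTILINE)


def split_pages(markdown: str) -> dict[int, str]:
    """Map each page number to the text that follows its '## Page N' marker."""
    pages = {}
    matches = list(PAGE_MARKER.finditer(markdown))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        pages[int(m.group(1))] = markdown[m.end():end].strip()
    return pages


def group_into_topics(
    pages: dict[int, str], topics: dict[str, tuple[int, int]]
) -> dict[str, str]:
    """Join the pages of each topic; `topics` maps file name -> (first, last) page."""
    out = {}
    for name, (first, last) in topics.items():
        parts = [pages[n] for n in range(first, last + 1) if n in pages]
        out[name] = "\n\n".join(parts) + "\n"
    return out
```

For the sample above, `group_into_topics(split_pages(text), {"getting-started.md": (1, 1), "configuration.md": (2, 2)})` returns one string per output file, which you can then write to disk.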

Re-fence a code sample by hand

Commands and code come through as plain text. Wrap them in fenced blocks with a language so they render as code.

As extracted:
## Page 2
Run the installer:
npm install -g acme-cli
acme login

After your edit:
Run the installer:
```bash
npm install -g acme-cli
acme login
```
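
If you have many samples to fence, the edit itself is easy to script once you know which lines are code. The helper below is a hypothetical sketch: it does not detect code, it only wraps a line range you identify by hand.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def fence_block(lines: list[str], start: int, end: int, lang: str) -> list[str]:
    """Wrap lines[start:end] in a fenced block tagged with `lang` (end is exclusive)."""
    return lines[:start] + [FENCE + lang] + lines[start:end] + [FENCE] + lines[end:]
```

Calling `fence_block(lines, 1, 3, "bash")` on the three extracted lines above produces the "After your edit" result.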

Re-create a numbered step list

Procedure steps lose their list structure. Convert the lines into a real Markdown ordered list.

As extracted:
Open Settings.
Click Integrations.
Paste your API key.

After your edit:
1. Open Settings.
2. Click Integrations.
3. Paste your API key.
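
The same edit can be sketched as a small function. It assumes each step sits on its own line and strips a leading step number if one survived extraction; `to_ordered_list` is a hypothetical name. Review the result, because a step that legitimately begins with a number would lose it.

```python
import re

# A leading step number such as "1 ", "1. " or "1) " left over from the PDF.
LEADING_NUMBER = re.compile(r"^\s*\d+[.)]?\s+")


def to_ordered_list(lines: list[str]) -> list[str]:
    """Renumber non-blank lines as a Markdown ordered list."""
    items = [LEADING_NUMBER.sub("", line).strip() for line in lines if line.strip()]
    return [f"{i}. {item}" for i, item in enumerate(items, start=1)]
```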

Chunk a 400-page manual on Pro

A large manual exceeds the free 50-page cap and is awkward as one file even on Pro. Slice it into chapters first.

Workflow:
  1. PDF Extract Pages → one page range per chapter: 1-40, 41-90, ...
  2. Convert each chunk here → ch1.md, ch2.md, ...
  3. Drop each .md into the matching docs section

(Free tier: keep chunks <= 50 pages and <= 2 MB.)
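
Planning the page ranges is simple arithmetic. The sketch below assumes you already know the page each chapter starts on (for example from the manual's table of contents); `plan_chunks` is a hypothetical helper, and it only enforces the page cap, not the file-size cap.

```python
def plan_chunks(
    chapter_starts: list[int], total_pages: int, cap: int
) -> list[tuple[int, int]]:
    """Return inclusive (first, last) page ranges, none longer than `cap` pages.

    Each chapter gets its own range; a chapter longer than `cap` is cut
    into several consecutive ranges.
    """
    ranges = []
    bounds = sorted(chapter_starts) + [total_pages + 1]
    for first, next_first in zip(bounds, bounds[1:]):
        page = first
        while page < next_first:
            last = min(page + cap - 1, next_first - 1)
            ranges.append((page, last))
            page = last + 1
    return ranges
```

With chapters starting on pages 1, 41, and 91 of a 120-page manual and a cap of 50, this yields the ranges 1-40, 41-90, and 91-120.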

Convert callouts into framework admonitions

A 'Warning' box is just text after extraction. Re-wrap it as your docs framework's admonition syntax.

As extracted:
Warning Do not delete the lock file while a job is running.

After your edit (Docusaurus):
:::warning
Do not delete the lock file while a job is running.
:::
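
A rough way to automate this for single-line callouts is sketched below. It assumes the callout label is the first word on the line, as in the sample above, and targets the Docusaurus `:::` syntax; `to_admonition` is a hypothetical name. Ordinary sentences that begin with "Note" will also match, so review each change.

```python
import re

# Label at the start of the line, followed by a colon or whitespace, then the body.
CALLOUT = re.compile(r"^(Note|Tip|Info|Warning|Danger)[:\s]\s*(.+)$")


def to_admonition(line: str) -> str:
    """Rewrite 'Warning some text' as a ':::warning' block; leave other lines as-is."""
    m = CALLOUT.match(line.strip())
    if not m:
        return line
    kind, body = m.group(1).lower(), m.group(2)
    return f":::{kind}\n{body}\n:::"
```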

Edge cases and what actually happens

Code samples lose their formatting

Expected

Code in the PDF arrives as plain text — no backticks, no language tag, no monospace. Whitespace and indentation may also be collapsed. Wrap samples in fenced blocks and fix indentation manually after conversion.

Chapter/section headings stay as body text

By design

Only ## Page N is emitted. Your manual's heading hierarchy comes through as plain text lines; promote them to #/##/### and split into files yourself.
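
Because the tool cannot tell you which lines were headings, promotion needs a list you compile yourself. The sketch below assumes such a mapping from exact title text to heading level; `promote_headings` and the mapping are hypothetical.

```python
def promote_headings(text: str, levels: dict[str, int]) -> str:
    """Prefix lines whose text exactly matches a known title with '#' marks."""
    out = []
    for line in text.splitlines():
        level = levels.get(line.strip())
        out.append("#" * level + " " + line.strip() if level else line)
    return "\n".join(out)
```

For example, `promote_headings(text, {"Getting Started": 1, "Configuration": 2})` turns those two plain lines into `# Getting Started` and `## Configuration` and leaves everything else untouched.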

Spec/parameter tables flatten

Flattened

Tables collapse into space-joined text and lose columns — bad for parameter references. Extract the table data with PDF Table to JSON and rebuild it as a Markdown table.
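
Rebuilding the table can be scripted once you have the rows. The sketch below assumes the extracted data can be loaded as a list of objects, one per row, keyed by column name; the actual shape of PDF Table to JSON output is not documented here, so adapt the loading step. It emits a pipe-style table, which is a common Markdown extension rather than core Markdown.

```python
def rows_to_markdown(rows: list[dict]) -> str:
    """Render a list of row dicts as a pipe-style Markdown table."""
    if not rows:
        return ""
    headers = list(rows[0].keys())

    def cell(value) -> str:
        # Escape pipes and flatten newlines so each row stays on one line.
        return str(value).replace("|", "\\|").replace("\n", " ")

    lines = [
        "| " + " | ".join(cell(h) for h in headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(cell(row.get(h, "")) for h in headers) + " |")
    return "\n".join(lines)
```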

Screenshots and diagrams are dropped

Expected

Images aren't extracted, so step screenshots and architecture diagrams disappear. Export them with PDF to PNG (or re-capture from the live product) and embed them where they belong.

Manual over 50 pages on free tier

Blocked

The free tier caps at 50 pages; larger manuals are blocked on drop. Pro allows 500. Slice the manual into chapters with PDF Extract Pages and convert each, or upgrade.

Manual over 2 MB on free tier

Blocked

Image-rich manuals often exceed 2 MB and are blocked on the free tier. Pro raises the limit to 50 MB. Compress with PDF Lossy Compress or convert on Pro.

Scanned / image-only manual

Empty output

A scanned manual has no text layer, so conversion yields empty pages. Run PDF OCR first, then convert the OCR'd file.
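
To spot this quickly in a large output, you can list the pages that came out empty. This sketch assumes the converter still writes a `## Page N` marker for a page with no text; `empty_pages` is a hypothetical helper.

```python
import re

# Matches the "## Page N" marker lines the converter emits.
PAGE_MARKER = re.compile(r"^## Page (\d+)[ \t]*$", re.MULTILINE)


def empty_pages(markdown: str) -> list[int]:
    """Return the page numbers whose marker is followed by no text at all."""
    matches = list(PAGE_MARKER.finditer(markdown))
    empty = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        if not markdown[m.end():end].strip():
            empty.append(int(m.group(1)))
    return empty
```

If every page is listed, the PDF almost certainly needs OCR; a few empty pages in the middle usually point to full-page figures.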

Running headers/footers repeat on every page

Noise

Manuals usually repeat a header/footer (product name, page number) on each page, and these extract as text at every page boundary. Strip them with a find-and-replace pass in your editor before publishing.
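
For long manuals, a script can find the repeating lines for you. The sketch below treats any line that appears on at least 80% of pages as a running header or footer, after replacing digit runs so that changing page numbers still match. The threshold and the name `strip_running_lines` are assumptions, and a content line that genuinely repeats on most pages would also be removed, so diff the result before publishing.

```python
import re
from collections import Counter

PAGE_MARKER = re.compile(r"^## Page \d+\s*$")


def _normalize(line: str) -> str:
    # Replace digit runs so "Page 3 of 120" and "Page 4 of 120" compare equal.
    return re.sub(r"\d+", "#", line.strip())


def strip_running_lines(markdown: str, min_share: float = 0.8) -> str:
    """Drop lines that repeat on at least `min_share` of the pages."""
    lines = markdown.splitlines()
    page_count = 0
    seen: Counter = Counter()
    current: set = set()
    for line in lines:
        if PAGE_MARKER.match(line):
            page_count += 1
            seen.update(current)  # count each distinct line once per page
            current = set()
        elif line.strip():
            current.add(_normalize(line))
    seen.update(current)
    repeated = {
        key for key, n in seen.items()
        if page_count > 1 and n / page_count >= min_share
    }
    return "\n".join(
        line for line in lines
        if PAGE_MARKER.match(line) or _normalize(line) not in repeated
    )
```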

Table of contents becomes plain text

Expected

A PDF ToC with dotted leaders and page numbers extracts as text lines, not working links. Delete it and let your docs framework generate the sidebar/nav instead.
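
Deleting the ToC lines can also be scripted. This sketch assumes the entries extract as a title, a run of at least three dots (with or without spaces between them), and a trailing page number; the exact extracted form may differ, and `drop_toc_lines` is a hypothetical name.

```python
import re

# Title, then three or more leader dots, then a page number at the end of the line.
TOC_LINE = re.compile(r"^.+?(?:\s*\.){3,}\s*\d+\s*$")


def drop_toc_lines(markdown: str) -> str:
    """Remove lines that look like dotted-leader table-of-contents entries."""
    return "\n".join(
        line for line in markdown.splitlines() if not TOC_LINE.match(line)
    )
```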

Frequently asked questions

Will my code samples extract as fenced code blocks?

No. Code comes through as plain text with no backticks, language tag, or monospace, and indentation may collapse. Wrap each sample in a fenced block (```), add the language, and fix indentation after conversion.

Are the manual's chapter and section headings preserved?

Only the ## Page N markers are headings. Your chapters and sections arrive as plain text because the tool can't infer heading levels from layout. Promote them to #/##/### and split the file into pages yourself.

What happens to parameter or spec tables?

They flatten into space-joined text and lose their columns. For reference tables, extract the data with PDF Table to JSON or PDF to Excel, then rebuild a clean Markdown table.

Can I keep the PDF and Markdown in sync automatically?

No. After conversion the Markdown becomes the source of truth; the tool is one-directional and has no link back to the PDF. Maintain the docs in Git going forward and keep the old PDF only as a download if needed.

How do I handle a 500-page manual?

Don't convert it as one blob. Slice it into chapters with PDF Extract Pages (each at most 50 pages / 2 MB to stay on the free tier, or up to 500 pages on Pro), convert each chunk, and drop the outputs into the matching docs sections.

Do screenshots and diagrams come across?

No — images are ignored entirely. Export figures with PDF to PNG or re-capture them from the live product, then embed them in the Markdown where they belong.

Will it work in Docusaurus, MkDocs, or Read the Docs?

Yes. The output is standard Markdown with no extended syntax to strip, so it works across the common docs frameworks. Promote the headings, fence the code, and add the pages to your nav.

How do I deal with repeated headers and footers?

They extract on every page (product name, page number, etc.). Do a find-and-replace pass in your editor to remove the repeating lines before you publish — the tool doesn't strip running headers.

My scanned manual converted to empty pages — why?

A scan is images, not text, so there's nothing for pdf.js to read. Run PDF OCR first to add a text layer, then convert the OCR'd manual here.

Is my documentation uploaded anywhere?

No. Conversion runs entirely in your browser via pdf.js, so internal and unreleased manuals stay on your machine. The result panel confirms '0 bytes uploaded'.

Can I script the whole docs migration?

Partly. On Pro, pdf-to-markdown is a runner-builtin you can call from the @jadapps/runner locally, then run your own script to split by heading and fence code. The PDF never reaches JAD's servers — it's processed on your machine.

How is this different from PDF to Text for docs?

PDF to Text gives a plain .txt with no structure at all. This adds ## Page N markers and sentence-per-line output, which are handier split points and diff units when you're building a Markdown docs site.

Privacy first

All PDF processing runs locally in your browser using PDF-lib and pdf.js. No file is ever uploaded — only metadata counters are saved for signed-in dashboard stats.
