PDF / Image to Markdown AI Premium
Extract Markdown + bbox JSON from a PDF or image in 22-30 seconds for ChatGPT, Claude, RAG, embeddings. Premium AI keeps headings, tables, formulas, Vietnamese diacritics. PDF ≤50 pages, images jpg/png/webp ≤20MB.
- 22-30s for a 9-page Vietnamese PDF, versus 60-180s for Marker and ~10s for Adobe Extract
- Tables ship as HTML <table> with colspan/rowspan intact — not pipe-markdown that mangles wide tables
- Scanned PDFs and photos read natively via Qwen2-VL vision — no traditional OCR layer
Upload 1 PDF (≤50 pages) or a photo of a document (≤20MB)
What's in the ZIP?
BetaPDF returns a ZIP with 2 main files (plus an images/ folder for extracted images) so you can use it flexibly across AI workflows:
- filename.md: plain Markdown. Paste it into ChatGPT / Claude, or embed it in your docs.
- filename_content_list.json: JSON list of blocks + bbox, for RAG, embedding, and automated OCR pipelines.
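As a minimal sketch of consuming that ZIP in a script, the snippet below reads the Markdown file and the block list using only the Python standard library. The helper names (`load_betapdf_zip`, `make_sample_zip`), the sample file names, and the `text` key are illustrative assumptions. Only the `.md` / `_content_list.json` naming and the `type` and `bbox` fields come from the description on this page.

```python
import io
import json
import zipfile


def load_betapdf_zip(zip_bytes: bytes) -> tuple[str, list]:
    """Return (markdown_text, blocks) from a ZIP shaped like the one described above."""
    markdown, blocks = "", []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith("_content_list.json"):
                blocks = json.loads(zf.read(name).decode("utf-8"))
            elif name.endswith(".md"):
                markdown = zf.read(name).decode("utf-8")
    return markdown, blocks


def make_sample_zip() -> bytes:
    """Build a tiny stand-in ZIP so the sketch runs without a real download."""
    sample_blocks = [
        # "text" is an assumed key; "type" and "bbox" are described on this page.
        {"type": "paragraph", "bbox": [100, 120, 900, 180], "text": "Nội dung."}
    ]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("report.md", "# Báo cáo\n\nNội dung.")
        zf.writestr("report_content_list.json", json.dumps(sample_blocks))
    return buf.getvalue()


markdown, blocks = load_betapdf_zip(make_sample_zip())
print(markdown.splitlines()[0])
print(blocks[0]["type"], blocks[0]["bbox"])
```

With a real download, you would pass the bytes of the downloaded ZIP instead of `make_sample_zip()`.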
How to Convert PDF / Image to Markdown in 3 Steps
Upload PDF (max 50 pages)
Pick language and toggle formulas / tables
Download ZIP with .md + .json for AI
How BetaPDF PDF → Markdown is different
Most online PDF→Markdown tools just lift the text layer and paste it into markdown, which breaks the moment you hit a scanned PDF, a complex table, or a multi-column layout. BetaPDF takes a different path: we read the PDF with a Qwen2-VL vision model running on vLLM (GB10 GPU), looking at each page the way a human does and reconstructing structure from the pixels. The result: tables ship as HTML <table> with colspan/rowspan intact, math formulas come out as native LaTeX, embedded images are extracted to their own files, and scans are read directly without an intermediate OCR layer. A 9-page Vietnamese PDF takes 22-30 seconds; the batched vlm-http-client backend is about 15× faster than the transformers-based vlm-auto-engine backend.
HTML tables — no structure loss
Full <table>/<tr>/<td> output with merged-cell support. Renders cleanly in Obsidian, GitHub, ChatGPT, Notion — none of the markdown-pipe garbling that wide tables suffer from.
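To illustrate why merged cells matter downstream, here is a hedged sketch that expands `colspan`/`rowspan` from an HTML table into a rectangular grid, which is often what a spreadsheet or dataframe step needs. It uses only the standard-library `html.parser`. The class and function names and the sample table are made up for the example, and it handles simple, non-nested tables only.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Collect <td>/<th> cells together with their colspan/rowspan values."""

    def __init__(self):
        super().__init__()
        self.rows = []     # each row is a list of (text, colspan, rowspan)
        self._cell = None  # [text_parts, colspan, rowspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            a = dict(attrs)
            self._cell = [[], int(a.get("colspan") or 1), int(a.get("rowspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, colspan, rowspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), colspan, rowspan))
            self._cell = None


def expand_table(html: str) -> list[list[str]]:
    """Repeat each merged cell's text across every grid slot it covers."""
    parser = TableGrid()
    parser.feed(html)
    grid = {}  # (row, col) -> text
    for r, row in enumerate(parser.rows):
        c = 0
        for text, colspan, rowspan in row:
            while (r, c) in grid:  # skip slots already taken by a rowspan from above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


sample = (
    '<table><tr><th colspan="2">Q1</th><th rowspan="2">Total</th></tr>'
    "<tr><td>Jan</td><td>Feb</td></tr></table>"
)
for row in expand_table(sample):
    print(row)
```

Repeating the merged value in every covered slot is one design choice among several; leaving the extra slots empty is equally valid if your consumer prefers that.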
Native LaTeX formulas
Extracts \frac, \sum, \int, matrices, and more. Paste straight into MathJax/KaTeX viewers or ChatGPT, with no flattening to garbled glyphs.
Scanned PDFs read via vision
Qwen2-VL looks directly at PDF/image pixels. No text-layer dependency, no Tesseract OCR — 99.7% accuracy on Vietnamese diacritics over 300-DPI scans.
JSON bbox for RAG pipelines
ZIP includes *_content_list.json: every block has bbox (0-1000 normalized) + type (paragraph/table/figure/formula). Drop-in compatible with the Landing AI ADE shape.
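A short sketch of working with those normalized boxes: the function below maps a 0-1000 bbox onto pixel coordinates for a rendered page, and the list comprehension filters blocks by type. The `[x0, y0, x1, y1]` ordering, the function name, and the sample blocks are assumptions for illustration; check them against your own `_content_list.json` before relying on this.

```python
def bbox_to_pixels(bbox, page_width, page_height, scale=1000):
    """Map a bbox normalized to 0..scale onto pixel coordinates.

    Assumes the order [x0, y0, x1, y1] (left, top, right, bottom).
    """
    x0, y0, x1, y1 = bbox
    return (
        round(x0 * page_width / scale),
        round(y0 * page_height / scale),
        round(x1 * page_width / scale),
        round(y1 * page_height / scale),
    )


# Hypothetical blocks shaped like the description above.
blocks = [
    {"type": "paragraph", "bbox": [100, 100, 900, 200]},
    {"type": "table", "bbox": [100, 250, 900, 500]},
]

tables = [b for b in blocks if b["type"] == "table"]

# An A4 page rendered at 300 DPI is 2480 x 3508 pixels.
print(bbox_to_pixels(tables[0]["bbox"], 2480, 3508))  # (248, 877, 2232, 1754)
```

Pixel boxes like these can be used to crop the source page image, for example to show a RAG citation next to the region it came from.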
Pipeline: MinerU 2.x → Qwen2-VL vision model → vLLM 0.16 serving on GB10 (sm_121). vlm-http-client backend with native batching, ~15× faster than vlm-auto-engine (transformers). Input: PDFs up to 100MB (sync ≤10 pages, async ≤50 pages) or jpg/png/webp images up to 20MB. Output: ZIP {.md + _content_list.json + images/}.
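The input limits above can be pre-checked on the client before uploading. This is a sketch under stated assumptions: the function name `pick_mode` and the returned strings are hypothetical, "MB" is taken as 1024 × 1024 bytes, and treating a single image as a synchronous job is a guess, since the page only states sync/async limits for PDFs.

```python
from pathlib import PurePath

MAX_PDF_BYTES = 100 * 1024 * 1024   # PDFs up to 100MB (MB = 2**20 bytes is an assumption)
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # images up to 20MB
SYNC_MAX_PAGES = 10                 # sync path: up to 10 pages
ASYNC_MAX_PAGES = 50                # async path: up to 50 pages
IMAGE_EXTS = {".jpg", ".png", ".webp"}


def pick_mode(filename: str, size_bytes: int, pages: int = 1) -> str:
    """Return "sync" or "async" for an accepted file, or raise ValueError."""
    ext = PurePath(filename).suffix.lower()
    if ext in IMAGE_EXTS:
        if size_bytes > MAX_IMAGE_BYTES:
            raise ValueError("image is larger than 20MB")
        return "sync"  # assumption: a single image is handled synchronously
    if ext != ".pdf":
        raise ValueError(f"unsupported file type: {ext or filename}")
    if size_bytes > MAX_PDF_BYTES:
        raise ValueError("PDF is larger than 100MB")
    if pages > ASYNC_MAX_PAGES:
        raise ValueError("PDF has more than 50 pages; split it first")
    return "sync" if pages <= SYNC_MAX_PAGES else "async"


print(pick_mode("contract.pdf", 5 * 1024 * 1024, pages=9))   # sync
print(pick_mode("thesis.pdf", 30 * 1024 * 1024, pages=40))   # async
```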
BetaPDF vs Marker / Adobe / CloudConvert / Landing AI ADE
Quick comparison of the questions that come up most often when choosing a PDF → Markdown tool for an AI workflow.
| Criterion | BetaPDF | Marker (OSS) | Adobe Extract | CloudConvert | Landing AI ADE |
|---|---|---|---|---|---|
| Speed (9-page VN PDF) | 22-30s | 60-180s | ~10s | ~20s | ~25s |
| Entry price (≥1000 pages/mo) | $9.99/mo | Free (self-host) | $14.99/mo | $8/mo (100pg) | $250/mo |
| HTML tables w/ merged cells | ✓ | — | ✓ | — | ✓ |
| LaTeX formulas | ✓ | ✓ | — | — | ✓ |
| Scanned PDF native (VLM) | ✓ | — | — | — | ✓ |
| Embedded images extracted | ✓ | — | — | — | ✓ |
| JSON bbox for RAG | ✓ | — | △ partial | — | ✓ |
| Vietnamese diacritics on scans (BetaPDF: 99.7%) | ✓ | △ | △ | △ | △ |
| Free web UI ≤50 pages | ✓ | — | — | △ | — |
Comparison as of 2026-05. Marker (github.com/VikParuchuri/marker) requires self-hosting on an 8GB+ GPU. CloudConvert caps the $8 plan at 100 pages/month. Landing AI ADE Team plan is $250/mo for 5,000 pages. Vietnamese diacritic accuracy: most Western tools don't report this metric but degrade visibly on scans; BetaPDF's 99.7% was measured on a 100-page 300-DPI scan benchmark.
Perfect for: feeding contracts/research/lecture notes into ChatGPT or Claude, building Vietnamese RAG knowledge bases, automated OCR pipelines. If your file is larger than 50 pages, run Split PDF first.
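For files over the 50-page limit, the split points are simple arithmetic. The sketch below computes 1-based, inclusive page ranges to feed into whichever splitting tool you use; the function name is illustrative and no PDF library is involved.

```python
def page_ranges(total_pages: int, max_pages: int = 50) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of at most max_pages each."""
    if total_pages < 1 or max_pages < 1:
        raise ValueError("total_pages and max_pages must be positive")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


print(page_ranges(120))  # [(1, 50), (51, 100), (101, 120)]
```

Each range can then be converted separately and the resulting Markdown files concatenated in order.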
📖 New: How to Convert PDF to Markdown for ChatGPT, Claude & RAG — full step-by-step guide.
Usage Examples
Math paper → MathJax
Frequently Asked Questions
How is PDF to Markdown different from PDF to Word?
PDF to Markdown returns structured output (headings, lists, tables, LaTeX formulas) ready for LLMs, RAG, and embeddings. PDF to Word is the right pick when you need to edit the document inside Microsoft Word.
Does it work on scanned PDFs and photos?
Yes, and this is the key differentiator. BetaPDF uses the Qwen2-VL vision model on vLLM, reading PDF pixels directly the way a human does, with no dependency on a text layer. Scanned contracts, photographed pages, and scanned books all work natively. You can also upload jpg/png/webp files directly (no need to convert to PDF first).
Are tables with merged cells preserved?
Yes. Output ships as HTML <table>/<tr>/<td>, including merged cells (colspan/rowspan). Markdown pipe tables have no syntax for merged cells, so tools that emit `| col1 | col2 |` syntax routinely garble wide or merged tables. BetaPDF sidesteps this by emitting HTML, which renders correctly in every markdown viewer that supports inline HTML (Obsidian, GitHub, ChatGPT).
Related Tools
PDF to Word
Convert PDF to editable Word document (.docx)
OCR PDF
Convert scanned PDF to searchable PDF with selectable text
PDF to Excel
Extract tables from PDF to editable Excel spreadsheet (.xlsx)
Extract Pages
Extract specific pages from a PDF
Compress PDF
Reduce PDF file size while preserving quality
Split PDF
Split a PDF file into multiple smaller files