How to convert a scanned PDF to Markdown
Scanned PDFs are everywhere — contracts, reports, old books, government forms — and they share one frustrating trait: the text is locked inside an image. You cannot select it, search it, or paste it into a document. This guide shows how to turn an image-only PDF into clean, structured Markdown in under a minute, without building an OCR pipeline.
Why Markdown?
Markdown is one of the most portable structured text formats available. It renders on GitHub, in wikis, and in static-site generators, and large language models tend to read it reliably. Converting a scan to Markdown gives you content that is searchable, version-controllable, and ready to feed into retrieval-augmented generation (RAG) pipelines and AI agents.
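To make the RAG point concrete, here is a minimal sketch of how converted Markdown could be split into heading-delimited chunks before indexing. It is not part of Scan Hero; the function name `split_by_heading` and the chunk shape are assumptions made for illustration, and it only recognises `#`-style headings.

```python
import re

# Matches ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per heading (hypothetical helper)."""
    chunks = []
    current = {"heading": None, "level": 0, "lines": []}
    in_fence = False

    def has_content(chunk: dict) -> bool:
        return chunk["heading"] is not None or any(ln.strip() for ln in chunk["lines"])

    for line in markdown.splitlines():
        # Lines inside fenced code blocks are never treated as headings.
        if line.strip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            if has_content(current):
                chunks.append(current)
            current = {
                "heading": match.group(2).strip(),
                "level": len(match.group(1)),
                "lines": [],
            }
        else:
            current["lines"].append(line)

    if has_content(current):
        chunks.append(current)

    return [
        {"heading": c["heading"], "level": c["level"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Each chunk keeps its heading as metadata, so a retriever can show where in the document a passage came from. Text that appears before the first heading is kept as a chunk whose heading is `None`.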
The old way vs. the AI way
Traditional OCR tools such as Tesseract extract raw characters, but their plain-text output loses most of the document's structure: headings collapse into body text, tables turn into jumbled rows, and multi-column layouts scramble. Cleaning that output by hand can take hours.
Scan Hero takes a different approach. Every conversion runs on Anthropic's Claude models, which read the page the way a person does — recognising headings, tables, and reading order — and emit well-formed Markdown. There is no pipeline to build, no model to host, and no prompt engineering required.
Step by step
- Sign in for free. Create an account with Google. You get 100 free credits, enough for about 10 full conversions, and no credit card is required.
- Upload your scanned PDF. Drag the file in. You see the exact credit cost before you convert.
- Choose Markdown as the output format. For photos and complex layouts, enable Describe images so figures are described inline.
- Download and refine. Get clean Markdown back. If anything needs tightening, use the AI adjustment feature to reformat in seconds — it never changes facts, only structure.
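Once you have the file, a quick structural check can show whether headings, tables, and lists survived the conversion. The sketch below is an illustration only, not a Scan Hero feature: `summarize_structure` is a hypothetical helper, and its pattern matching is a rough approximation of Markdown syntax rather than a full parser.

```python
import re


def summarize_structure(markdown: str) -> dict:
    """Count headings, table rows, and list items in a Markdown string."""
    headings = 0
    table_rows = 0
    list_items = 0
    in_fence = False

    for line in markdown.splitlines():
        stripped = line.strip()
        # Skip everything inside fenced code blocks.
        if stripped.startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue

        if re.match(r"#{1,6}\s+\S", stripped):
            headings += 1
        elif stripped.startswith("|") and stripped.endswith("|") and len(stripped) > 1:
            # Ignore delimiter rows such as "|------|------|".
            if not set(stripped) <= set("|-: "):
                table_rows += 1
        elif re.match(r"([-*+]|\d+[.)])\s+\S", stripped):
            list_items += 1

    return {"headings": headings, "table_rows": table_rows, "list_items": list_items}
```

If a scan that clearly contains a table comes back with `table_rows` equal to 0, that is a signal to check the output by eye or to try the adjustment step.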
Tips for the best results
- Use the highest-resolution scan you have. Crisp, well-lit pages transcribe far more accurately than blurry phone photos.
- For multi-page documents, headings and section order are preserved automatically.
- Large files (over 10 MB) are processed asynchronously and returned the moment they are ready.
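The tips above can be turned into a small pre-flight check that you run before uploading. This is a sketch under stated assumptions: the 300 DPI floor is a common OCR rule of thumb, not a documented Scan Hero requirement; "10 MB" is read as decimal megabytes; and `preflight` and `effective_dpi` are hypothetical helper names.

```python
import os

# Assumption: "10 MB" means 10,000,000 bytes. The service may count differently.
ASYNC_THRESHOLD_BYTES = 10 * 1000 * 1000
# Assumption: 300 DPI is a widely used OCR rule of thumb, not a Scan Hero limit.
MIN_DPI = 300


def effective_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Return the scan resolution implied by the image width and the paper width."""
    if page_width_inches <= 0:
        raise ValueError("page_width_inches must be positive")
    return pixel_width / page_width_inches


def preflight(path: str, pixel_width: int, page_width_inches: float = 8.5) -> dict:
    """Report file size and estimated resolution for a scan before uploading."""
    size = os.path.getsize(path)
    dpi = effective_dpi(pixel_width, page_width_inches)
    return {
        "size_bytes": size,
        "likely_async": size > ASYNC_THRESHOLD_BYTES,
        "dpi": round(dpi),
        "low_resolution": dpi < MIN_DPI,
    }
```

For example, a US Letter page (8.5 inches wide) scanned at 2550 pixels across works out to 300 DPI, while the same page at 1275 pixels is only 150 DPI and would be flagged as low resolution.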
Convert other formats too
The same engine handles much more than PDFs. If your source is different, jump straight to the right workflow:
- Convert handwritten notes and scans to text
- Convert scanned invoices to JSON
- Convert Excel spreadsheets to Markdown
Ready to try it? Start free with 100 credits — no code, no card, no commitment.