AI OCR to Markdown

Overview

This AI-powered OCR tool extracts text from images and PDF documents into structured Markdown, handling tables, hyperlinks, and embedded images alongside plain text. Upload a file and recognition runs automatically; results are returned page by page, with per-page copy and full-document download options.

What affects recognition accuracy

Source file quality is the main factor. For best results:

  • Scanned documents at 150 DPI or higher with clear, unobstructed text are recognized most accurately
  • Blurry photos, heavily skewed pages, dense watermarks, or very small type (below 6pt) introduce errors
  • Multi-column layouts and complex formatting are handled better than by traditional rule-based OCR
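
The 150 DPI guideline above can be checked before uploading. The following is a minimal sketch, not part of the tool: the function names are hypothetical, and it assumes you know the physical width of the scanned page in inches.

```python
MIN_DPI = 150  # threshold taken from the guideline above


def estimated_dpi(pixel_width: int, physical_width_in: float) -> float:
    """Estimate scan resolution from pixel width and physical page width."""
    if physical_width_in <= 0:
        raise ValueError("physical_width_in must be positive")
    return pixel_width / physical_width_in


def meets_min_dpi(pixel_width: int, physical_width_in: float,
                  min_dpi: float = MIN_DPI) -> bool:
    """Return True when the estimated resolution reaches the minimum."""
    return estimated_dpi(pixel_width, physical_width_in) >= min_dpi


# A US Letter page is 8.5 inches wide, so 1275 pixels across is 150 DPI.
print(estimated_dpi(1275, 8.5))   # 150.0
print(meets_min_dpi(850, 8.5))    # False (about 100 DPI)
```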

For PDFs, each page is processed independently. Processing time scales with page count, so keep each submission under 50 pages where possible.
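
For longer documents, one way to stay under the 50-page recommendation is to split the PDF into page ranges and submit each range separately. The sketch below only computes the ranges; the function name and the 1-based inclusive range convention are assumptions, and the actual splitting is left to whatever PDF utility you use.

```python
def page_batches(total_pages: int, max_pages: int = 49) -> list[tuple[int, int]]:
    """Split 1..total_pages into inclusive (start, end) ranges of at most max_pages."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages >= 1")
    batches = []
    start = 1
    while start <= total_pages:
        end = min(start + max_pages - 1, total_pages)
        batches.append((start, end))
        start = end + 1
    return batches


# A 120-page document becomes three submissions, each under 50 pages.
print(page_batches(120))  # [(1, 49), (50, 98), (99, 120)]
```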

What each page result contains

After recognition, each page returns:

  • Markdown body — headings, paragraphs, lists, code blocks
  • Tables — extracted as Markdown table syntax, copyable separately
  • Hyperlinks — URLs found in the document are listed individually
  • Embedded images — charts and illustrations are extracted as inline base64 images when detectable
  • Page dimensions and DPI — the original pixel dimensions of the source page
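
If you post-process downloaded results, it can help to model a page as a small record. The sketch below is illustrative only: the class and field names are hypothetical and do not describe the tool's actual output schema.

```python
from dataclasses import dataclass, field


@dataclass
class PageResult:
    """Hypothetical container for one recognized page (all names are assumptions)."""
    page_number: int
    markdown: str                                            # Markdown body
    tables: list[str] = field(default_factory=list)          # Markdown table strings
    links: list[str] = field(default_factory=list)           # URLs found on the page
    images_base64: list[str] = field(default_factory=list)   # extracted images
    width_px: int = 0
    height_px: int = 0
    dpi: int = 0


def summarize(page: PageResult) -> str:
    """Produce a one-line summary of what a page result contains."""
    return (
        f"page {page.page_number}: {len(page.tables)} table(s), "
        f"{len(page.links)} link(s), {len(page.images_base64)} image(s), "
        f"{page.width_px}x{page.height_px} px at {page.dpi} DPI"
    )


page = PageResult(page_number=1, markdown="# Title",
                  links=["https://example.com"],
                  width_px=1275, height_px=1650, dpi=150)
print(summarize(page))
```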

Supported file types

Image formats

  • JPEG, PNG, WEBP
  • GIF, BMP, TIFF
  • SVG (vector graphics)
  • Best for single-page scans and screenshots

Document format

  • PDF (any page count; submissions under 50 pages are recommended)
  • Each page recognized independently
  • Results displayed per page with individual download
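
A quick local check can confirm that a file falls into one of the supported categories before you upload it. This is a sketch under assumptions: the function name is hypothetical, and the extension spellings (for example .jpg and .tif) are the common ones for the formats listed above, not a list published by the tool.

```python
from pathlib import Path

# Common extensions for the image formats listed above (assumed spellings).
IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".webp",
    ".gif", ".bmp", ".tif", ".tiff", ".svg",
}


def classify_upload(filename: str) -> str:
    """Return "image", "pdf", or "unsupported" based on the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    return "unsupported"


print(classify_upload("scan.PDF"))     # pdf
print(classify_upload("photo.jpeg"))   # image
print(classify_upload("notes.docx"))   # unsupported
```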

Downloading results

Individual pages can be downloaded as .md (Markdown) or .txt (plain text). For multi-page documents, "Download All" merges every page into a single file with --- separators between pages.
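
If you download pages individually and want to combine them yourself, the merge described above can be approximated as follows. This is a minimal sketch: the function name is hypothetical, and the blank lines around each --- separator are an assumption, since the tool's exact whitespace is not documented here.

```python
def merge_pages(pages: list[str]) -> str:
    """Join per-page Markdown into one document with --- between pages."""
    # Strip each page so separators are surrounded by exactly one blank line.
    return "\n\n---\n\n".join(page.strip() for page in pages)


merged = merge_pages(["# Page one\n", "Text on page two.\n"])
print(merged)
```

Surrounding the separator with blank lines keeps Markdown renderers from treating --- as a heading underline for the preceding paragraph.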