HTML to Markdown

Overview

HTML to Markdown converts pasted HTML or uploaded .html/.htm files (up to 10 MB) into Markdown text using Turndown.js. Paste a snippet or drop a file, adjust the format options, then copy the result or download it as converted.md. The conversion runs entirely in your browser; nothing is sent to a server.
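
The upload limits described above can be sketched as a small check. This is an illustration only, not the tool's actual code: the names MAX_UPLOAD_BYTES and isAcceptedUpload are hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```typescript
// Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Hypothetical helper: accepts only .html/.htm files within the size limit.
function isAcceptedUpload(fileName: string, sizeBytes: number): boolean {
  const hasHtmlExtension = /\.html?$/i.test(fileName);
  return hasHtmlExtension && sizeBytes <= MAX_UPLOAD_BYTES;
}
```

In a browser, the two arguments would typically come from the name and size properties of a File object.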

What converts cleanly vs. what needs manual cleanup

Converts reliably

  • Paragraphs and H1–H6 headings
  • Ordered and unordered lists, including nested
  • Inline code (<code>) and code blocks (<pre>)
  • Inline links and image references
  • Blockquotes (<blockquote>)

May need manual adjustment

  • Tables with merged cells (colspan/rowspan not supported)
  • Deeply nested lists (indentation can shift)
  • Embedded images (converted to ![alt](src), but a relative src only resolves on the original site)
  • Inline styles inside <div> (all CSS is stripped)
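
To find out in advance whether a table will need manual cleanup, the HTML can be scanned for merged cells before converting. The sketch below is a rough regex-based check, not part of the tool; hasMergedCells is a hypothetical name, and a colspan or rowspan of 1 is treated as unmerged.

```typescript
// Hypothetical pre-check: true if any <td>/<th> spans more than one column or row.
function hasMergedCells(html: string): boolean {
  // Collect the opening tags of table cells (<thead> and <tr> do not match).
  const cellTags = html.match(/<t[dh]\b[^>]*>/gi) ?? [];
  return cellTags.some((tag) => {
    // Pull out every colspan/rowspan attribute with a numeric value.
    const spans = tag.match(/\s(?:colspan|rowspan)\s*=\s*["']?\d+/gi) ?? [];
    // Keep only the digits and compare against 1.
    return spans.some((attr) => parseInt(attr.replace(/\D+/g, ""), 10) > 1);
  });
}
```

If the check returns true, expect to rebuild that table by hand after conversion.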

Format options that actually matter

The "Conversion Options" panel has ten settings. Defaults work for most cases, but these three are worth adjusting:

  • Heading style: atx (uses # prefixes) has broader compatibility than setext (which only applies to H1/H2 using ===/--- underlines). Use atx unless you have a specific reason not to.
  • Code block style: fenced (triple backticks) supports syntax-language annotations; indented (four spaces) cannot carry a language annotation but is part of the original Markdown syntax, so any parser accepts it.
  • Link style: inlined keeps the URL next to the text; referenced moves all URLs to a reference list at the end, which helps in long articles where inline URLs clutter the text.
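
To make the heading and link options concrete, here is a minimal sketch of the output each setting produces. The option values (atx, setext, inlined, referenced) are the ones named above; the helper names renderHeading and renderLinks are hypothetical, and the fallback of H3–H6 to # prefixes under setext is an assumption based on how Turndown behaves.

```typescript
type HeadingStyle = "atx" | "setext";
type LinkStyle = "inlined" | "referenced";

interface Link {
  text: string;
  url: string;
}

// Render one heading the way each heading style would.
function renderHeading(text: string, level: number, style: HeadingStyle): string {
  if (style === "setext" && level <= 2) {
    // setext only has underlines for H1 (=) and H2 (-).
    const underline = (level === 1 ? "=" : "-").repeat(text.length);
    return `${text}\n${underline}`;
  }
  // atx, and the assumed fallback for H3-H6 under setext.
  return `${"#".repeat(level)} ${text}`;
}

// Render a run of links either inline or as numbered references.
function renderLinks(links: Link[], style: LinkStyle): string {
  if (style === "inlined") {
    return links.map((l) => `[${l.text}](${l.url})`).join("\n");
  }
  const body = links.map((l, i) => `[${l.text}][${i + 1}]`).join("\n");
  const refs = links.map((l, i) => `[${i + 1}]: ${l.url}`).join("\n");
  return `${body}\n\n${refs}`;
}
```

With referenced links the text stays readable while the URLs collect at the bottom, which is why that style suits long articles.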

What happens to scripts, styles, and modern HTML elements

<script> blocks, <style> declarations, and elements with no Markdown equivalent (such as <video>, <canvas>, and <form>) are stripped entirely. Only structure that maps to standard Markdown syntax is kept. This applies equally whether you upload a file or paste directly.
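
The effect of this stripping can be approximated with a short sketch. It is not how the tool itself is implemented: STRIPPED_TAGS and stripUnsupported are hypothetical names, the tag list is just the five elements named above, and the regex approach is deliberately simple (it does not handle nested elements of the same name or unclosed tags).

```typescript
// The elements this section names as being removed.
const STRIPPED_TAGS = ["script", "style", "video", "canvas", "form"];

// Remove each listed element together with its contents.
function stripUnsupported(html: string): string {
  const pattern = new RegExp(
    `<(${STRIPPED_TAGS.join("|")})\\b[^>]*>[\\s\\S]*?</\\1\\s*>`,
    "gi"
  );
  return html.replace(pattern, "");
}
```

Running this over a page before conversion shows which parts of the source will have no counterpart in the Markdown output.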

Getting better output from messy HTML

  • If the HTML comes from a CMS or email editor, strip outer <div> wrappers and class attributes before pasting. Neither produces Markdown output, but both add noise and can fragment paragraphs.
  • For full web pages, copy just the <article> or <main> body rather than the entire page source. Navigation, headers, and footers add clutter with no useful Markdown equivalent.
  • If embedded images need to stay usable, check that their src values are absolute URLs. Relative paths like ../images/photo.jpg are preserved as-is and won't resolve outside the original site.
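
Two of the cleanup steps above can be scripted before pasting. The sketch below is one possible approach, not part of the tool: the function names are hypothetical, the base URL is whatever page the HTML was copied from, and the regexes assume reasonably well-formed, quoted attributes.

```typescript
// Rewrite every <img src="..."> to an absolute URL, resolved against baseUrl.
function absolutizeImageSrc(html: string, baseUrl: string): string {
  return html.replace(
    /(<img\b[^>]*?\ssrc\s*=\s*)(["'])(.*?)\2/gi,
    (_match: string, prefix: string, quote: string, src: string) => {
      try {
        // The standard URL constructor resolves relative paths against the base.
        return `${prefix}${quote}${new URL(src, baseUrl).href}${quote}`;
      } catch {
        // Leave values that cannot be parsed untouched.
        return `${prefix}${quote}${src}${quote}`;
      }
    }
  );
}

// Drop class attributes, looking only inside tags so body text is left alone.
function stripClassAttributes(html: string): string {
  return html.replace(/<[^>]+>/g, (tag) =>
    tag.replace(/\s+class\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi, "")
  );
}
```

For example, with a base of https://example.com/blog/post/, the relative path ../images/photo.jpg resolves to https://example.com/blog/images/photo.jpg, so the image link keeps working wherever the Markdown ends up.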