HTML to Markdown Converter
Paste HTML and get back clean Markdown — perfect for migrating content into Notion, Obsidian, GitHub READMEs, or any Markdown-based system.
<h1>Hello</h1><p><strong>bold</strong></p> → # Hello
**bold**
This tool reverses the usual flow: paste rendered HTML (from a webpage, an email, an RSS export, or a CMS), and get back clean Markdown ready for a Markdown-based system. Inline styles, classes, and most non-content attributes are stripped automatically.
Useful when migrating content out of a WYSIWYG-only CMS (Wix, Squarespace, older WordPress) into a Markdown-first system (Notion, Obsidian, Hugo, GitHub Pages). The output is conservative — when in doubt, the tool drops formatting rather than guessing.
The converter parses the input HTML with a DOM parser (typically the browser's built-in DOMParser or a pure JavaScript implementation), then walks the resulting node tree recursively, mapping each HTML element to its Markdown equivalent. Block-level elements are separated by blank lines: <h1> becomes a # heading and <p> becomes a plain paragraph. Inline elements with semantic meaning are kept, so <strong> becomes **bold** and <em> becomes *italic*. Inline styles, class attributes, and non-semantic wrappers such as <span> are stripped, leaving only their text. The href and src attributes are extracted to build Markdown links and images. Void elements like <br> become a hard line break: two trailing spaces followed by a newline. A predefined rule set covers common HTML5 tags and structures, including lists, blockquotes, code blocks, and images. The resulting Markdown fragments are concatenated with whitespace normalized so that the output is valid Markdown.
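The traversal described above can be sketched in a few lines. This is a minimal illustration, not this tool's actual code: it uses Python's standard-library html.parser instead of a browser DOMParser, and the names Node, TreeBuilder, to_markdown, and html_to_markdown are hypothetical. It assumes well-formed input and handles only a handful of tags.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr"}  # elements that never have a closing tag


class Node:
    """A minimal element node: tag name, attributes, and children (nodes or strings)."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a small node tree. Assumes well-formed HTML; no error recovery."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def to_markdown(node):
    """Recursively map a node (or text string) to a Markdown fragment."""
    if isinstance(node, str):
        return node
    inner = "".join(to_markdown(child) for child in node.children)
    tag = node.tag
    if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
        return "#" * int(tag[1]) + " " + inner.strip() + "\n\n"
    if tag == "p":
        return inner.strip() + "\n\n"
    if tag in ("strong", "b"):
        return "**" + inner + "**"
    if tag in ("em", "i"):
        return "*" + inner + "*"
    if tag == "a":
        return "[" + inner + "](" + (node.attrs.get("href") or "") + ")"
    if tag == "img":
        alt = node.attrs.get("alt") or ""
        return "![" + alt + "](" + (node.attrs.get("src") or "") + ")"
    if tag == "br":
        return "  \n"  # hard line break: two trailing spaces, then a newline
    return inner  # span, div, root, unknown tags: keep the text, drop the wrapper


def html_to_markdown(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return to_markdown(builder.root).strip()


print(html_to_markdown("<h1>Hello</h1><p><strong>bold</strong></p>"))
```

Note that style and class attributes never reach the output here because only href, src, and alt are ever read from a node; everything else is ignored by construction.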
| | This tool | Pandoc CLI | Manual Find/Replace |
|---|---|---|---|
| Setup | No installation, runs in browser | Requires installing Pandoc (a separate command-line program written in Haskell) | No software, but extremely tedious for large documents |
| Accuracy | Handles most HTML5 tags, semantics, and common edge cases | Extremely accurate, supports many input/output formats, but can be overkill | Highly error-prone on nested structures and attributes |
| Speed | Instant for typical documents | Fast on the command line, with a small startup cost per run | Slow for anything beyond trivial changes |
Markdown was created by John Gruber in 2004 as a lightweight markup language that is easy to read and write as plain text. HTML, which predates it, has been the standard for web content since the early 1990s. The need to convert between the two arose as static site generators, note-taking apps, and documentation platforms adopted Markdown. This tool's output follows the syntax established by Gruber's original Perl script (which converts in the opposite direction, Markdown to HTML) and later refined by community-driven specifications like CommonMark.
Copy the rendered HTML from a Wix / Squarespace / old WordPress post, paste it here, and the Markdown output is ready to drop into Notion or Obsidian without manual cleanup.
Hugo, Jekyll, and Eleventy all run on Markdown content. Convert your existing site's HTML pages to Markdown for faster rebuilds.
Email clients export messy HTML with inline styles and conditional tags. Convert to Markdown to get just the content.
Reader-mode HTML from articles, when converted to Markdown, becomes a clean reading file for Obsidian or your knowledge base.
Convert internal docs from a corporate wiki (such as Confluence) to Markdown when migrating to GitBook, ReadMe, or a docs-as-code system.
Single-level lists work cleanly. Nested lists may flatten to a single level. For documents with deeply nested structures, use Turndown as a more sophisticated alternative.
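For readers curious what preserving nesting involves, here is a rough sketch of one way to do it: track how many lists are currently open and indent each item by that depth. It is neither this tool's code nor Turndown's; ListConverter and lists_to_markdown are hypothetical names, the four-space indent is an assumption, and text that follows a nested list inside the same item is not handled.

```python
import re
from html.parser import HTMLParser


class ListConverter(HTMLParser):
    """Emits one Markdown line per <li>, indented by list nesting depth."""

    def __init__(self):
        super().__init__()
        self.open_lists = []  # one [tag, item_count] entry per open <ul>/<ol>
        self.items = []       # [prefix, text] pairs, one per <li>

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self.open_lists.append([tag, 0])
        elif tag == "li" and self.open_lists:
            current = self.open_lists[-1]
            current[1] += 1
            marker = "-" if current[0] == "ul" else f"{current[1]}."
            indent = "    " * (len(self.open_lists) - 1)
            self.items.append([f"{indent}{marker} ", ""])

    def handle_endtag(self, tag):
        if tag in ("ul", "ol") and self.open_lists:
            self.open_lists.pop()

    def handle_data(self, data):
        # Collapse runs of whitespace; surrounding whitespace is trimmed at the end.
        if self.open_lists and self.items:
            self.items[-1][1] += re.sub(r"\s+", " ", data)


def lists_to_markdown(html):
    converter = ListConverter()
    converter.feed(html)
    converter.close()
    return "\n".join(prefix + text.strip() for prefix, text in converter.items)


html = "<ul><li>Fruit<ul><li>Apple</li><li>Pear</li></ul></li><li>Veg</li></ul>"
print(lists_to_markdown(html))
```

A flattening converter is what you get if the indent step is skipped, which is why deeply nested documents lose their structure.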
All inline style= and class= attributes are stripped — they have no Markdown equivalent. If your content depends on styling (specific font sizes, colors), you'll need to add it back in your destination system.
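To make the stripping step concrete, the sketch below re-serializes HTML while keeping only an allowlist of attributes. It is an illustration under assumptions, not the tool's implementation: the KEPT_ATTRS set and the names AttributeStripper and strip_presentational_attrs are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist: attributes that have a Markdown counterpart.
KEPT_ATTRS = {"href", "src", "alt", "title"}


class AttributeStripper(HTMLParser):
    """Re-emits the HTML it is fed, dropping every attribute not in KEPT_ATTRS."""

    def __init__(self):
        super().__init__()
        self.out = []

    def _render(self, tag, attrs, end):
        kept = "".join(
            ' {}="{}"'.format(name, escape(value or "", quote=True))
            for name, value in attrs
            if name in KEPT_ATTRS
        )
        return "<" + tag + kept + end

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, ">"))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.out.append("</" + tag + ">")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def strip_presentational_attrs(html):
    stripper = AttributeStripper()
    stripper.feed(html)
    stripper.close()
    return "".join(stripper.out)


sample = '<p class="lead" style="color:red">Hi <a href="https://example.com" class="btn">link</a></p>'
print(strip_presentational_attrs(sample))
```

An allowlist is the conservative choice here: anything not explicitly known to map to Markdown is dropped rather than guessed at.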
Not yet. Tables are converted to plain text. For HTML tables, manually rebuild them in your destination system using its table syntax.
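If your destination system accepts pipe-style Markdown tables (as GitHub does), a small script can take over the manual rebuild for simple tables. The following is a sketch only and not part of this tool: TableReader and table_to_markdown are hypothetical names, the first row is assumed to be the header, and colspan, rowspan, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collects the cell text of a simple table, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td") and self.rows:
            self.rows[-1].append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data


def table_to_markdown(html):
    reader = TableReader()
    reader.feed(html)
    reader.close()
    # Normalize whitespace and escape pipes so cell text cannot break the layout.
    rows = [
        [" ".join(cell.split()).replace("|", "\\|") for cell in row]
        for row in reader.rows
        if row
    ]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]  # pad short rows
    lines = ["| " + " | ".join(rows[0]) + " |", "|" + "---|" * width]
    lines += ["| " + " | ".join(row) + " |" for row in rows[1:]]
    return "\n".join(lines)


html = "<table><tr><th>Name</th><th>Qty</th></tr><tr><td>Apple</td><td>3</td></tr></table>"
print(table_to_markdown(html))
```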