SmartMarkdown

Markdown to JSON Converter

Convert Markdown documents to a structured JSON representation. The converter parses your Markdown hierarchy — headings become section objects with IDs, content, and nested subsections; code blocks are extracted with their language tags; tables become header/rows arrays — giving you a machine-readable document structure for use in APIs, content pipelines, and developer tools.


Reviewers

Sarah Chen, SEO Content Strategist


What Is a Markdown to JSON Converter

A Markdown to JSON converter parses a Markdown document's hierarchical structure — headings, sections, code blocks, tables — and serialises it as a structured JSON object. Rather than converting the visual representation of the document (to HTML, PDF, or Word), this converter extracts the document's semantic structure into a machine-readable format suitable for programmatic processing.

JSON output from a Markdown document is valuable in content pipeline contexts: building a search index from documentation, generating API responses from content files, feeding content to an LLM context window with structured metadata, or creating a content management layer on top of a Markdown file system. The JSON representation preserves the full information content of the Markdown document in a queryable, filterable structure.

SmartMarkdown uses the marked library to parse the Markdown into an abstract syntax tree (AST), then transforms the AST into a document hierarchy JSON object. The transformation runs entirely in-browser with no server round-trip.

The JSON Schema

The generated JSON follows a consistent schema with two top-level properties, metadata and document. The fields below are listed by path:

  • metadata: An object containing document-level statistics — wordCount (integer), sectionCount (integer), generatedAt (ISO 8601 UTC string), and sourceLength (character count of the Markdown source).
  • document.title: The text content of the first H1 heading, or null if no H1 is present.
  • document.sections: An array of section objects, one per top-level H2 heading (or H1 if no H2 headings exist).
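
As an illustration of the two top-level properties, the following minimal Python sketch assembles the same envelope. It is not SmartMarkdown's implementation: the function name build_envelope is hypothetical, and the way words and sections are counted here is an assumption, since the page does not specify either rule.

```python
from datetime import datetime, timezone


def build_envelope(source: str, title, sections: list) -> dict:
    """Wrap already-parsed sections in the metadata and document properties."""
    # ISO 8601 timestamp in UTC, e.g. 2024-01-31T12:00:00Z
    generated_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "metadata": {
            # Assumption: words are whitespace-separated tokens of the source.
            "wordCount": len(source.split()),
            # Assumption: only top-level sections are counted.
            "sectionCount": len(sections),
            "generatedAt": generated_at,
            "sourceLength": len(source),
        },
        # title is None when the document has no H1 heading.
        "document": {"title": title, "sections": sections},
    }


source = "# Guide\n\n## Install\n\nRun the installer.\n"
envelope = build_envelope(source, "Guide", [{"id": "install", "heading": "Install"}])
```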

Each section object in the sections array has the following shape:

  • id: A URL-safe slug generated from the heading text (lowercase, spaces to hyphens, special characters removed).
  • heading: The heading text as a plain string.
  • level: The heading level as an integer (1–6).
  • content: Concatenated paragraph text under the heading, before any sub-heading.
  • codeBlocks: Array of { language, code } objects.
  • tables: Array of { headers, rows } objects.
  • subsections: Array of nested section objects for child headings.
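
The section shape above can be sketched in Python as follows. The slug rules follow the description of the id field (lowercase, spaces to hyphens, special characters removed); the exact character set the tool keeps is not documented, so the ASCII-only pattern here is an assumption, and slugify and make_section are hypothetical helper names.

```python
import re


def slugify(heading: str) -> str:
    """Build a URL-safe slug: lowercase, special characters removed, spaces to hyphens."""
    text = heading.strip().lower()
    # Assumption: keep only ASCII letters, digits, whitespace, and hyphens.
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    # Collapse runs of whitespace or hyphens into a single hyphen.
    return re.sub(r"[\s-]+", "-", text).strip("-")


def make_section(heading: str, level: int) -> dict:
    """Return an empty section object with every documented field present."""
    return {
        "id": slugify(heading),
        "heading": heading,
        "level": level,
        "content": "",
        "codeBlocks": [],   # list of {"language": ..., "code": ...}
        "tables": [],       # list of {"headers": [...], "rows": [[...], ...]}
        "subsections": [],  # nested section objects
    }


section = make_section("Getting Started!", 2)
```

Creating every field up front, even when empty, means downstream code can read section["codeBlocks"] or section["subsections"] without checking whether the key exists.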

Section Hierarchy

SmartMarkdown builds the JSON section tree by tracking heading level transitions as it processes the document:

  • H2 = top-level sections: Each H2 heading creates a new object in the top-level sections array. Content (paragraphs, code blocks, tables) before the next heading is assigned to that section.
  • H3 = subsections of H2: H3 headings are added to the subsections array of the most recently opened H2 section.
  • H4 = sub-subsections: H4 headings are added to the subsections array of the most recently opened H3 section. The hierarchy continues for H5 and H6.
  • Depth-based nesting: The nesting is purely depth-based. If the document skips heading levels (H2 directly to H4), the H4 is added to the H2's subsections array with level: 4, maintaining structural accuracy without inserting phantom intermediate levels.

This structure mirrors how a table of contents represents document hierarchy — the JSON tree is effectively the document outline in JSON form with content and extracted elements at each node.
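
The level-tracking rules above can be sketched with a stack of open sections. This is an illustrative Python version under stated assumptions, not the tool's code: build_tree is a hypothetical name, the input is assumed to be a list of (level, heading) pairs in document order with the H1 title already removed, and content, code blocks, and tables are omitted for brevity.

```python
def build_tree(headings):
    """Nest (level, text) heading pairs into a section tree by depth."""
    roots = []
    stack = []  # currently open sections, outermost first
    for level, text in headings:
        node = {"heading": text, "level": level, "subsections": []}
        # A new heading closes every open section at the same or a deeper level.
        while stack and stack[-1]["level"] >= level:
            stack.pop()
        if stack:
            # The nearest shallower heading becomes the parent, even when
            # levels are skipped (H2 followed directly by H4).
            stack[-1]["subsections"].append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


tree = build_tree([
    (2, "Install"),
    (3, "Linux"),
    (4, "Debian"),
    (2, "Usage"),
    (4, "Flags"),  # skipped level: attaches to "Usage" with level 4
])
```

In this sketch, a heading with no shallower heading open before it (for example an H3 that appears before any H2) lands in the top-level list. How the tool treats that case is not stated on the page.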

Benefits of Markdown to JSON

A JSON representation of a Markdown document enables use cases that raw Markdown or rendered HTML cannot support efficiently:

  • Machine-readable content: JSON can be consumed by any programming language without a Markdown parser. The structured format enables querying section content, extracting code examples, and processing tables programmatically.
  • API integration: Content in JSON format can be served directly as API responses. Documentation sections, FAQ items, and code examples extracted from Markdown can be exposed as a REST or GraphQL API without a CMS.
  • Content search indexing: Search engines like Algolia, Typesense, and Elasticsearch ingest documents as JSON objects. Converting Markdown to JSON produces records with section IDs, headings, and content fields that map directly to a search index schema.
  • Programmatic manipulation: JSON document structure can be transformed, filtered, merged, and re-serialised with standard JSON tools — jq, JavaScript array methods, Python dictionaries — without needing a Markdown-specific library.
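
As a small example of programmatic manipulation, the Python sketch below walks the section tree and collects code blocks by language using only the standard library. The sample document is invented for illustration and follows the schema described earlier; iter_sections is a hypothetical helper name.

```python
import json

# Invented sample standing in for a file downloaded from the converter.
raw = json.dumps({
    "metadata": {"wordCount": 12, "sectionCount": 1},
    "document": {
        "title": "Guide",
        "sections": [{
            "id": "install", "heading": "Install", "level": 2,
            "content": "Install the package.",
            "codeBlocks": [{"language": "bash", "code": "pip install demo"}],
            "tables": [],
            "subsections": [{
                "id": "verify", "heading": "Verify", "level": 3,
                "content": "Check the import.",
                "codeBlocks": [{"language": "python", "code": "import demo"}],
                "tables": [],
                "subsections": [],
            }],
        }],
    },
})


def iter_sections(sections):
    """Yield every section in document order, descending into subsections."""
    for section in sections:
        yield section
        yield from iter_sections(section.get("subsections", []))


doc = json.loads(raw)
all_sections = list(iter_sections(doc["document"]["sections"]))
headings = [s["heading"] for s in all_sections]
python_snippets = [
    block["code"]
    for s in all_sections
    for block in s["codeBlocks"]
    if block["language"] == "python"
]
```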

Common Use Cases

Markdown to JSON conversion is used in these developer and content workflows:

  • Content APIs: Teams that manage documentation or blog content as Markdown files convert to JSON to build lightweight content APIs — serving section content, FAQ items, or code examples as structured API responses without a database.
  • Static site generator data files: Next.js, Astro, and similar frameworks can import JSON data files directly. Converting Markdown to JSON enables rich data-driven page generation beyond what MDX or remark plugins typically provide.
  • Documentation search: Algolia DocSearch and similar tools ingest JSON record arrays for indexing. Converting Markdown docs to JSON with section-level records produces a search index where users can find specific sections, not just whole pages.
  • AI training data and context: Structured JSON with explicit section boundaries and content fields can be easier to chunk and filter than raw Markdown when embedding documents in LLM context windows or preparing training data, because the document organisation is stated explicitly rather than implied by markup.
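
For the documentation search case, section-level records can be produced by flattening the tree. The Python sketch below is one possible shape: the record field names (id, heading, path, content) and the function name to_records are assumptions chosen for illustration and would need to be mapped to whatever your search engine's index schema expects.

```python
def to_records(sections, ancestors=()):
    """Flatten nested sections into one search record per section."""
    records = []
    for section in sections:
        path = ancestors + (section["heading"],)
        records.append({
            "id": section["id"],
            "heading": section["heading"],
            # Breadcrumb of parent headings, useful for displaying results.
            "path": " > ".join(path),
            "content": section["content"],
        })
        records.extend(to_records(section["subsections"], path))
    return records


# Invented sample sections following the schema described earlier.
sections = [{
    "id": "install", "heading": "Install", "level": 2,
    "content": "Install the package.",
    "codeBlocks": [], "tables": [],
    "subsections": [{
        "id": "verify", "heading": "Verify", "level": 3,
        "content": "Check the import.",
        "codeBlocks": [], "tables": [], "subsections": [],
    }],
}]

records = to_records(sections)
```

Keeping the heading path on each record lets a search result show where a section sits in the document without a second lookup.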

Tips for Better JSON Output

These practices produce the cleanest and most useful JSON output from Markdown:

  • Use a clear heading hierarchy. The JSON section tree directly reflects the Markdown heading structure. Documents with a consistent H1 → H2 → H3 hierarchy produce clean, predictably nested JSON. Inconsistent heading levels produce flat or oddly nested structures.
  • Use consistent heading levels. Avoid using the same heading level for both major and minor structural divisions. Doing so places sections and subsections at the same nesting depth in the JSON, making the hierarchy ambiguous.
  • Add language tags to code blocks. Add language hints to fenced code blocks (```javascript, ```python). These appear as the language field in the codeBlocks JSON array and are useful for syntax highlighting and content categorisation in downstream systems.
  • Check the JSON in the browser console. After downloading the JSON, parse it in your browser's developer console with JSON.parse(text), or paste it into jsonlint.com, to verify the structure is as expected before integrating it into your application.
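
The same structural check can be scripted outside the browser. The Python sketch below parses the downloaded text and reports sections that lack any of the documented fields. It is a hedged example rather than an official validator, and check_sections and check_document are hypothetical names.

```python
import json

REQUIRED_FIELDS = {
    "id", "heading", "level", "content", "codeBlocks", "tables", "subsections",
}


def check_sections(sections, path="sections"):
    """Return a list of problems found in the section tree (empty if none)."""
    problems = []
    for index, section in enumerate(sections):
        here = f"{path}[{index}]"
        missing = REQUIRED_FIELDS - section.keys()
        if missing:
            problems.append(f"{here}: missing {sorted(missing)}")
        problems.extend(
            check_sections(section.get("subsections", []), here + ".subsections")
        )
    return problems


def check_document(text):
    """Parse JSON text and check the top-level properties and every section."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    problems = [
        f"missing top-level property: {key}"
        for key in ("metadata", "document")
        if key not in data
    ]
    problems.extend(check_sections(data.get("document", {}).get("sections", [])))
    return problems


good = json.dumps({"metadata": {}, "document": {"title": None, "sections": []}})
bad = json.dumps({"document": {"sections": [{"id": "intro", "heading": "Intro"}]}})
good_problems = check_document(good)
bad_problems = check_document(bad)
```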

Frequently Asked Questions