Sarah Chen, SEO Content Strategist
What Is a Markdown to JSON Converter
A Markdown to JSON converter parses a Markdown document's hierarchical structure — headings, sections, code blocks, tables — and serialises it as a structured JSON object. Rather than converting the visual representation of the document (to HTML, PDF, or Word), this converter extracts the document's semantic structure into a machine-readable format suitable for programmatic processing.
JSON output from a Markdown document is valuable in content pipeline contexts: building a search index from documentation, generating API responses from content files, feeding content to an LLM context window with structured metadata, or creating a content management layer on top of a Markdown file system. The JSON representation preserves the full information content of the Markdown document in a queryable, filterable structure.
SmartMarkdown uses the marked library to parse the Markdown into an abstract syntax tree (AST), then transforms the AST into a document hierarchy JSON object. The transformation runs entirely in-browser with no server round-trip.
The JSON Schema
The generated JSON follows a consistent schema with two top-level properties:
- metadata: An object containing document-level statistics: wordCount (integer), sectionCount (integer), generatedAt (ISO 8601 UTC string), and sourceLength (character count of the Markdown source).
- document.title: The text content of the first H1 heading, or null if no H1 is present.
- document.sections: An array of section objects, one per top-level H2 heading (or H1 if no H2 headings exist).
Each section object in the sections array has the following shape:
- id: A URL-safe slug generated from the heading text (lowercase, spaces to hyphens, special characters removed).
- heading: The heading text as a plain string.
- level: The heading level as an integer (1–6).
- content: Concatenated paragraph text under the heading, before any sub-heading.
- codeBlocks: Array of { language, code } objects.
- tables: Array of { headers, rows } objects.
- subsections: Array of nested section objects for child headings.
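As a rough illustration of the section shape described above, here is a minimal Python sketch. The helper names slugify and make_section are hypothetical, and the slug rules are an approximation of "lowercase, spaces to hyphens, special characters removed"; SmartMarkdown's exact rules (for example, how it treats non-ASCII characters) may differ.

```python
import json
import re


def slugify(heading):
    """Approximate slug rule: lowercase, drop special characters, spaces to hyphens."""
    text = heading.strip().lower()
    text = re.sub(r"[^a-z0-9\s-]", "", text)        # remove special characters
    return re.sub(r"[\s-]+", "-", text).strip("-")  # collapse whitespace into single hyphens


def make_section(heading, level, content=""):
    """Build a section object with the seven fields listed above (hypothetical helper)."""
    return {
        "id": slugify(heading),
        "heading": heading,
        "level": level,
        "content": content,
        "codeBlocks": [],
        "tables": [],
        "subsections": [],
    }


section = make_section("Getting Started: Install & Run", 2, "Install the CLI first.")
section["codeBlocks"].append({"language": "shell", "code": "npm install"})
print(json.dumps(section, indent=2))
```

Printing the object shows an id of getting-started-install-run alongside the heading, level, content, and the three array fields.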
Section Hierarchy
SmartMarkdown builds the JSON section tree by tracking heading level transitions as it processes the document:
- H2 = top-level sections: Each H2 heading creates a new object in the top-level sections array. Content (paragraphs, code blocks, tables) before the next heading is assigned to that section.
- H3 = subsections of H2: H3 headings are added to the subsections array of the most recently opened H2 section.
- H4 = sub-subsections: H4 headings are added to the subsections array of the most recently opened H3 section. The hierarchy continues for H5 and H6.
- Depth-based nesting: The nesting is purely depth-based. If the document skips heading levels (H2 directly to H4), the H4 is added to the H2's subsections array with level: 4, maintaining structural accuracy without inserting phantom intermediate levels.
This structure mirrors how a table of contents represents document hierarchy — the JSON tree is effectively the document outline in JSON form with content and extracted elements at each node.
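The depth-based nesting described above can be sketched with a stack of open sections. This is an illustrative Python version, not SmartMarkdown's actual code: the event tuples and the build_sections name are assumptions, the sample input has no H1, and the H1-as-title rule, code blocks, and tables are left out to keep the example short.

```python
def build_sections(events):
    """Nest headings by depth.

    events is a list of ("heading", level, text) or ("paragraph", text) tuples,
    a simplified stand-in for the parser output.
    """
    top = []    # the top-level sections array
    stack = []  # currently open sections, shallowest first
    for event in events:
        if event[0] == "heading":
            _, level, text = event
            node = {"heading": text, "level": level, "content": "", "subsections": []}
            # Close every open section at the same or a deeper level.
            while stack and stack[-1]["level"] >= level:
                stack.pop()
            # Attach to the nearest shallower section, or to the top level.
            (stack[-1]["subsections"] if stack else top).append(node)
            stack.append(node)
        elif event[0] == "paragraph" and stack:
            # Paragraph text belongs to the most recently opened heading.
            current = stack[-1]
            current["content"] = (current["content"] + " " + event[1]).strip()
    return top


tree = build_sections([
    ("heading", 2, "Install"),
    ("paragraph", "Run the installer."),
    ("heading", 3, "macOS"),
    ("heading", 2, "Usage"),
    ("heading", 4, "Flags"),  # skipped level: nests directly under the H2
])
```

In this sample, "Flags" ends up in the subsections array of "Usage" with level 4, and no intermediate H3 node is created.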
Benefits of Markdown to JSON
A JSON representation of a Markdown document enables use cases that raw Markdown or rendered HTML cannot support efficiently:
- Machine-readable content: JSON can be consumed by any programming language without a Markdown parser. The structured format enables querying section content, extracting code examples, and processing tables programmatically.
- API integration: Content in JSON format can be served directly as API responses. Documentation sections, FAQ items, and code examples extracted from Markdown can be exposed as a REST or GraphQL API without a CMS.
- Content search indexing: Search engines like Algolia, Typesense, and Elasticsearch ingest documents as JSON objects. Converting Markdown to JSON produces records with section IDs, headings, and content fields that map directly to a search index schema.
- Programmatic manipulation: JSON document structure can be transformed, filtered, merged, and re-serialised with standard JSON tools — jq, JavaScript array methods, Python dictionaries — without needing a Markdown-specific library.
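To make the "no Markdown parser needed" point concrete, the following Python sketch loads a small document in the schema described earlier and collects every Python code block from any depth. The sample JSON and the iter_sections helper are made up for illustration.

```python
import json

# A small hand-written document in the schema described earlier (illustrative data).
raw = """
{
  "metadata": {},
  "document": {
    "title": "Guide",
    "sections": [
      {"id": "install", "heading": "Install", "level": 2, "content": "Run pip.",
       "codeBlocks": [{"language": "shell", "code": "pip install demo"}],
       "tables": [],
       "subsections": [
         {"id": "verify", "heading": "Verify", "level": 3, "content": "",
          "codeBlocks": [{"language": "python", "code": "import demo"}],
          "tables": [], "subsections": []}
       ]},
      {"id": "usage", "heading": "Usage", "level": 2, "content": "Call run().",
       "codeBlocks": [{"language": "python", "code": "demo.run()"}],
       "tables": [], "subsections": []}
    ]
  }
}
"""


def iter_sections(sections):
    """Yield every section depth-first, parents before their subsections."""
    for section in sections:
        yield section
        yield from iter_sections(section.get("subsections", []))


doc = json.loads(raw)
all_sections = list(iter_sections(doc["document"]["sections"]))
headings = [s["heading"] for s in all_sections]
python_snippets = [
    block["code"]
    for s in all_sections
    for block in s.get("codeBlocks", [])
    if block["language"] == "python"
]
print(headings)
print(python_snippets)
```

The same traversal works for tables or for filtering sections by level, since every node has the same shape.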
Common Use Cases
Markdown to JSON conversion is used in these developer and content workflows:
- Content APIs: Teams that manage documentation or blog content as Markdown files convert to JSON to build lightweight content APIs — serving section content, FAQ items, or code examples as structured API responses without a database.
- Static site generator data files: Next.js, Astro, and similar frameworks can import JSON data files directly. Converting Markdown to JSON enables rich data-driven page generation beyond what MDX or remark plugins typically provide.
- Documentation search: Algolia DocSearch and similar tools ingest JSON record arrays for indexing. Converting Markdown docs to JSON with section-level records produces a search index where users can find specific sections, not just whole pages.
- AI training data and context: Structured JSON with section boundaries and content fields is often easier to work with than raw Markdown when embedding documents in LLM context windows or preparing training data, because the document's organisation is explicit rather than implied by heading syntax.
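For the documentation search case, one possible approach is to flatten the section tree into one record per section. The Python sketch below assumes the section fields described earlier; the to_records helper and the record field names (such as breadcrumb) are illustrative and would need to be mapped to whatever schema the chosen search engine expects.

```python
def to_records(sections, parents=()):
    """Flatten nested sections into a list of section-level search records."""
    records = []
    for section in sections:
        path = parents + (section["heading"],)
        records.append({
            "id": section["id"],
            "heading": section["heading"],
            "level": section["level"],
            "breadcrumb": " > ".join(path),  # heading trail from the top-level section
            "content": section["content"],
        })
        records.extend(to_records(section.get("subsections", []), path))
    return records


# Illustrative input: one H2 section with a single H3 subsection.
sections = [
    {"id": "install", "heading": "Install", "level": 2, "content": "Run the installer.",
     "subsections": [
         {"id": "macos", "heading": "macOS", "level": 3, "content": "Use Homebrew.",
          "subsections": []},
     ]},
]

records = to_records(sections)
for record in records:
    print(record["breadcrumb"], "|", record["content"])
```

Keeping the breadcrumb on each record lets a search result show where a matching section sits in the document, which is what makes section-level results useful.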
Tips for Better JSON Output
These practices produce the cleanest and most useful JSON output from Markdown:
- Use clear heading hierarchy. The JSON section tree directly reflects the Markdown heading structure. Documents with consistent H1 → H2 → H3 hierarchy produce clean, deeply nested JSON. Inconsistent heading levels produce flat or oddly nested structures.
- Use consistent heading levels. Avoid using the same heading level for both major and minor structural divisions — this produces sections and subsections at the same nesting depth in the JSON, making the hierarchy ambiguous.
- Add language tags to code blocks. Add language hints to fenced code blocks (for example ```javascript or ```python). These appear as the language field in the codeBlocks JSON array and are useful for syntax highlighting and content categorisation in downstream systems.
- Check JSON in browser console. After downloading the JSON, parse it in your browser's developer console with JSON.parse(text), or paste it into jsonlint.com, to verify the structure is as expected before integrating it into your application.
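If you would rather script the check than use the console, a structural check along the following lines is one option. It is a sketch in Python based only on the schema described on this page; the check_document name and the problem messages are hypothetical, and it checks for the presence of keys, not their types.

```python
import json

SECTION_KEYS = {"id", "heading", "level", "content", "codeBlocks", "tables", "subsections"}


def check_document(text):
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    data = json.loads(text)  # raises json.JSONDecodeError if the text is not valid JSON
    for key in ("metadata", "document"):
        if key not in data:
            problems.append("missing top-level key: " + key)

    def walk(sections, where):
        for index, section in enumerate(sections):
            missing = SECTION_KEYS - section.keys()
            if missing:
                problems.append("{}[{}] missing {}".format(where, index, sorted(missing)))
            walk(section.get("subsections", []), "{}[{}].subsections".format(where, index))

    walk(data.get("document", {}).get("sections", []), "sections")
    return problems


good = json.dumps({
    "metadata": {},
    "document": {"title": None, "sections": [
        {"id": "a", "heading": "A", "level": 2, "content": "",
         "codeBlocks": [], "tables": [], "subsections": []},
    ]},
})
bad = json.dumps({"document": {"sections": [{"id": "a", "heading": "A", "level": 2}]}})

print(check_document(good))
print(check_document(bad))
```

The first call returns an empty list; the second reports the missing metadata key and the section fields that are absent.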