Reinventing mdx — Rendering untrusted markdown

The story begins with the second half of the title: rendering componentized, untrusted markdown on the client side.

We want a solution for rendering markdown on the client side. In the Vue community, however, the common practice is to set v-html directly, which makes it difficult to pass in custom components. The protagonist of this article, mdx, obtains the AST of the markdown document and renders it with JSX, which supports componentization by its very nature; it is JSX, after all.

The problem is that mdx positions itself as a programming language and is generally compiled ahead of time during the bundler's build phase. Because it uses ES modules and JSX, it also requires dynamically executing JavaScript code. For these reasons it cannot safely render untrusted markdown files.

Thus, the story continues with the reinvention of mdx.

What is mdx?

mdx is a writing format that allows you to seamlessly insert JSX code into Markdown documents. You can also import components, such as interactive charts or pop-ups, and embed them into the content you write. This makes writing long-form content with components much more pleasant.

# Hello, world!

<div className="note">
  > Some notable things in a block quote!
</div>

import { year } from './data.js'

export const name = 'world'

# Hello {name.toUpperCase()}

The current year is {year}

These two code snippets showcase the features of mdx. To summarize quickly, it essentially adds to markdown:

ES module import syntax: supports importing components, data, etc., from elsewhere for use
ES module export syntax: supports writing some code in mdx files to define components, data, etc., for use
JSX XML tags: <div>...</div>
JSX expressions: {name.toUpperCase()}

Of course, there are also some other markdown extension syntaxes:

mdc: an extension developed and used mainly by the Nuxt community
Generic directives/plugins syntax: a CommonMark proposal, with remark-directive as the corresponding remark implementation

Why not choose to reinvent (modify) their solutions?

The main issue is that they are not popular enough; it seems not many people use them.
Another issue is that both introduce some syntax outside of markdown and HTML.

For example, they might wrap an alert component block with two colons:

::alert{:type="type"}
Your warning
::

Of course, we won't argue here about which syntax is better. Such debates rarely reach a conclusion; it mostly comes down to users' habits and taste.

However, it is undeniable that JSX (JavaScript + XML) has an XML syntax that looks the same as HTML. For experienced web developers, it is simply the HTML they already use, and React users are even more familiar with JSX. Writers who only edit documents may be learning HTML or XML syntax for the first time, but HTML has a rich community: there are many tutorials to learn from, and many code snippets can be used without conversion. XML-style markup is also a broadly transferable skill that carries over to other fields once learned.

In summary, I prefer to directly use the markdown + JSX syntax solution, which has a lower learning curve for users and is more universal in skills.

Issues with mdx?

mdx actually positions itself as a programming language, as its documentation states:

Please remember, MDX is a programming language. If you trust your users, then all is well. But be very careful with user input, and do not allow anyone to upload MDX content. If you must, use <iframe> and sandbox, but security is hard, and that doesn’t seem to be 100%. For Node, vm2 is worth a try. However, you should still use tools like Docker to sandbox the entire operating system, limit execution frequency, and be able to terminate it when the process takes too long to execute.

Of course, there is nothing wrong with such a design; you can use it freely in various documentation, SSG, etc., letting the bundler package or pre-render your mdx code, and your mdx code is developer-controlled.

However, what if we want to render componentized, untrusted markdown on the client-side?

Then, solutions like mdx will have many issues:

As we all know, running untrusted code is a dangerous thing; providing the ability to run arbitrary code in scenarios like rendering documents, configuration files, etc., is not good, especially since these formats may come from untrusted sources.
Secondly, its ESM import and export are also unreasonable in such scenarios.
Finally, mdx relies on a JavaScript parser, namely acorn, and shipping it to the client does not seem practical.

Reinventing a subset of mdx

Thus, a new wheel was created: mdio.

The goal is to remove the dynamic syntax of JSX in mdx. Other than the dynamic parameters and custom components that developers can pass in, everything else can be statically inferred, with no code execution capabilities, only using the information passed in. Its features look like this:

---
title: 123123123
tags: [t1, t2, t3]
---

# Hello World

<TagList />

1. list 1
2. list 2

Some text format, **bold**. The title is {frontmatter.title}.

<InfoBox name={"hello"} info={{"key":"value"}} list={[1,2,3]} box={null} />

<div>
  Raw html is ok
</div>

Essentially, it adds what I jokingly call JSONX, that is, JSON + XML.

XML tags: <div>...</div>
JSON expressions: curly braces wrapping a valid JSON expression, for example: {1}, {"text"}, {{"key":"value"}}, {null}
Access Path expressions: curly braces wrapping an expression that accesses frontmatter or passed environment variables, for example: {frontmatter.title}, {env.abc.def[0].ghi} (currently only planning to support statically determined fields and array index access)
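
To make the access path syntax concrete, here is a minimal sketch of a parser for it. The names `AccessPath` and `parseAccessPath` are hypothetical and are not taken from the mdio source; the sketch only assumes the rules stated above, namely static field names and numeric array indices, with everything else rejected.

```typescript
type AccessPath = Array<string | number>;

// Parse "env.abc.def[0].ghi" into ["env", "abc", "def", 0, "ghi"].
// Returns undefined for anything that is not a plain static path, so calls,
// operators, and computed keys are rejected rather than evaluated.
function parseAccessPath(source: string): AccessPath | undefined {
  const text = source.trim();

  // The path must start with an identifier.
  const head = /^[A-Za-z_$][\w$]*/.exec(text);
  if (!head) return undefined;

  const path: AccessPath = [head[0]];
  let rest = text.slice(head[0].length);

  // Each following segment is either ".identifier" or "[digits]".
  const segment = /^(?:\.([A-Za-z_$][\w$]*)|\[(\d+)\])/;
  while (rest.length > 0) {
    const match = segment.exec(rest);
    if (!match) return undefined; // unexpected character: not a static path
    path.push(match[1] !== undefined ? match[1] : Number(match[2]));
    rest = rest.slice(match[0].length);
  }
  return path;
}

const examplePath = parseAccessPath('env.abc.def[0].ghi');
const rejected = parseAccessPath('name.toUpperCase()');
```

Here `examplePath` holds the five segments of the path, while `rejected` is undefined because a method call is not part of the static subset.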

In addition, it needs some supporting infrastructure, namely a client-side rendering component (taking Vue as an example) that handles:

Parsing mdio syntax to AST, then using Vue JSX to convert it into corresponding VNode to render for users;
Accepting Vue components from props (without any extra work, this also supports asynchronous components loaded via dynamic import) and substituting the real Vue components into the parsed AST;
Parsing YAML format data in the frontmatter of mdio, automatically passing all environment variables to custom components;
Composable functions: directly providing the document's frontmatter and AST information to deeply nested custom components within the document.

For the above document, usage looks like this:

<script setup lang="ts">
import { defineAsyncComponent } from 'vue';
import { Markdown } from '@breadio/vue';

const content = `... mdio syntax document string`;

// Custom components, loaded lazily via dynamic import
const components = {
  InfoBox: defineAsyncComponent(() => import('~/components/InfoBox.vue')),
  TagList: defineAsyncComponent(() => import('~/components/TagList.vue')),
};
</script>

<template>
  <Markdown :content="content" :components="components"></Markdown>
</template>

<script setup lang="ts">
// TagList.vue
// Automatically passes frontmatter props, can be used directly
const props = defineProps<{ frontmatter?: { tags?: string[] } }>();

// You can also use composable functions to get information
// import { useWikiContent } from '@breadio/vue'
// const { frontmatter } = useWikiContent();
</script>

<template>
  <p class="tag-list space-x-2">
    <span class="font-bold">Tags:</span>
    <span
      v-for="t in props.frontmatter?.tags ?? []"
      :key="t"
      class="rounded py-1 px-2 bg-gray-100"
      >{{ t }}</span
    >
  </p>
</template>

This is mdio, a subset of mdx that removes the dynamic syntax of mdx to support rendering componentized, untrusted markdown on the client-side. Next, we need to modify the mdx compiler to support the features that mdio wants.

The process of modifying mdx

mdx relies on the unified / remark ecosystem, which is quite complex and involves a lot of packages.

unified is a general framework for parsing text, and its plugin ecosystem has many things, one of which is remark, used for parsing markdown. mdx is built on this ecosystem.

To know how we can modify it to achieve our desired results, let's first analyze the main process and dependencies of mdx parsing.

In fact, you can follow its import points to get a general idea, but since it really involves too many packages, searching is quite exhausting, so I've kindly provided links and images for you to follow along.

mdx Source Code Analysis

The @mdx-js/mdx package is the overall entry point for the mdx project, exposing a bunch of core interfaces like compile, evaluate, etc. The core function for parsing mdx is in src/core.js, which essentially creates a unified instance.

You can see that it includes a bunch of plugins; I haven't looked closely at what each one does, but it references a remark-mdx plugin, which is what we want to look at.

The remark-mdx plugin is located in the same monorepo and is used to parse mdx syntax. It does very simple things, wrapping a few other plugins.

remark

At this point, we need to explain the structure of the remark project, and then discuss a few plugins that seem related to mdx.

remark is a unified plugin or parser used to parse markdown into AST and convert AST to various formats, among other markdown-related functionalities.

The remark package is a wrapper around unified, internally creating a unified instance and adding markdown parsing-related plugins.

This includes two plugins: remark-parse and remark-stringify. Here we only focus on remark-parse, the plugin that parses markdown.

remark-parse is itself just another layer: a thin wrapper around the mdast-util-from-markdown package.

This involves some things; let me explain: mdast stands for markdown AST, which is the abstract syntax tree representation of markdown. @types/mdast contains the definitions of the abstract syntax tree, and other mdast-util-* packages are various mdast-related packages. Additionally, you might see hast, which stands for HTML AST, the abstract syntax tree representation of HTML.

In addition, you can see that it adds two types of plugins.

One is the micromark plugin; micromark is a markdown parser that can be used independently of remark. The markdown parsing functionality of remark-parse is actually provided by micromark, so the plugins of micromark can also be used with remark; the core parsing logic is in micromark.

The second is the fromMarkdown plugin. This involves some basic compiler theory: source code goes through lexical analysis to become a token stream, then through syntax analysis to produce an intermediate representation, usually an AST. Here, micromark only parses markdown into a token stream that already carries syntactic structure (called Events in the source code), meaning its lexical and syntax analysis are done together (through an LL(1) recursive descent parser). The fromMarkdown plugin then takes this structured token stream and generates an AST (since it parses markdown, the result is mdast).
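
To illustrate the idea of turning a flat stream of events into a tree, here is a deliberately simplified sketch. The `SketchEvent` and `SketchNode` shapes and the `eventsToTree` function are hypothetical; micromark's real events carry token objects and a tokenizer context, and mdast-util-from-markdown does considerably more work per node type. Only the stack-based core is shown.

```typescript
// Hypothetical, simplified shapes: only what the sketch needs.
type SketchEvent =
  | { kind: 'enter'; type: string }
  | { kind: 'exit'; type: string }
  | { kind: 'text'; value: string };

interface SketchNode {
  type: string;
  value?: string;
  children: SketchNode[];
}

// Build a tree from a flat enter/exit stream using a stack of open nodes.
function eventsToTree(events: SketchEvent[]): SketchNode {
  const root: SketchNode = { type: 'root', children: [] };
  const stack: SketchNode[] = [root];

  for (const event of events) {
    const parent = stack[stack.length - 1];
    if (event.kind === 'enter') {
      // Open a new node under the current parent and descend into it.
      const node: SketchNode = { type: event.type, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (event.kind === 'exit') {
      // An exit must close the node that is currently open.
      if (stack.length === 1 || parent.type !== event.type) {
        throw new Error(`Unbalanced exit for "${event.type}"`);
      }
      stack.pop();
    } else {
      parent.children.push({ type: 'text', value: event.value, children: [] });
    }
  }

  if (stack.length !== 1) throw new Error('Unclosed node at end of stream');
  return root;
}

// Events one might expect for the markdown source "Hello **world**"
const tree = eventsToTree([
  { kind: 'enter', type: 'paragraph' },
  { kind: 'text', value: 'Hello ' },
  { kind: 'enter', type: 'strong' },
  { kind: 'text', value: 'world' },
  { kind: 'exit', type: 'strong' },
  { kind: 'exit', type: 'paragraph' }
]);
```

The resulting `tree` is a root with one paragraph, which contains a text node followed by a strong node wrapping the second text node.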

Then, when we look at mdast-util-from-markdown, we can see that it indeed directly uses the compilation mechanism provided by micromark.

mdast-util-from-markdown also contains a lot of additional code, whose job is to convert the Events stream produced by micromark into the mdast abstract syntax tree.

We can stop here for now: micromark is the package responsible for parsing markdown, implementing an LL(1) parser, and we won't go into its details. In simple terms, a possible follow-up step is to convert mdast to hast, and then serialize hast to HTML or render it using JSX.

To summarize what we have gone through:

@mdx-js/mdx -> remark-mdx -> remark responsible for parsing markdown + mdx-related plugin handling corresponding syntax
remark -> remark-parse -> mdast-util-from-markdown -> depends on micromark responsible for parsing markdown into Events stream, mdast-util-from-markdown generates mdast

Returning to the remark-mdx package, based on the above source code analysis, we can easily know:

micromark-extension-mdxjs corresponds to mdxjs in the code, which is a micromark plugin used to extend the original markdown syntax;
mdast-util-mdx corresponds to mdxFromMarkdown and mdxToMarkdown in the code, which are used to construct mdast from Events stream and serialize mdast to markdown text.

Now, let's first look at micromark-extension-mdxjs:

As usual, it simply wraps several plugins. Each of those plugins contains the actual logic, so let's introduce their functions:

micromark-extension-mdxjs-esm: ESM import/export syntax support
micromark-extension-mdx-expression: JSX curly brace { ... } expression
micromark-extension-mdx-jsx: JSX XML tag syntax
micromark-extension-mdx-md: disables some markdown features

Then, let's look back at mdast-util-mdx:

It’s still the classic plugin nesting. You can see that the plugin names are still the same, but they have become mdast plugins, and the functionality has shifted from parsing markdown to generating mdast, with the specific added feature support as mentioned above.

To summarize the mdx-related plugins: remark-mdx wraps micromark-extension-mdxjs and mdast-util-mdx, which correspond to the micromark and mdast stages respectively. Between them, these two packages implement the mdx features, which fall into three blocks:

ESM import/export syntax support
JSX curly brace { ... } expression
JSX XML tag syntax

Modifying Compilation from JSX to JSONX

At this point, our approach to modifying mdx is becoming clearer, corresponding to the three feature blocks:

Remove support for ESM import/export; just don't import the plugin.
Modify support for JSX curly brace { ... } expressions:
- Directly perform JSON.parse.
- Write a compiler that supports a.b.c.d[0].e.f.
XML tag syntax does not need modification (you can also remove support for <div {...obj}></div> etc., but I won't elaborate).

Thus, we only need to modify how the JS is compiled, which happens in the micromark-util-events-to-acorn package of the micromark-extension-mdx-expression repository. Of course, we have omitted some other modifications:

Converting the original code's JSDoc to TypeScript (thanks to GPT-4's assistance).
Passing options for parameters, methods corresponding to JSONX-related data, etc.
Removing JSX compilation.
Modifying the definitions of various AST node names.

The core code for compiling JS in micromark-util-events-to-acorn is roughly this:

Changing it to what we want would look something like this:

On line 145, we directly perform a JSON.parse to obtain the JSON data inside.

If JSON.parse throws an error, then on lines 152 to 156, we attempt to split it by access path; here I've only written a simple version that splits by . (I was too lazy to write array indices).
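
The two-step logic described above (JSON.parse first, access path as a fallback) might look roughly like the following. This is a sketch under stated assumptions, not the actual mdio source: the `ParsedExpression` type and the `parseExpression` function are hypothetical names, and, as in the text, the fallback only splits on dots and does not handle array indices.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

type ParsedExpression =
  | { kind: 'json'; json: Json }
  | { kind: 'path'; path: string[] }
  | { kind: 'invalid' };

function parseExpression(source: string): ParsedExpression {
  const text = source.trim();
  try {
    // Step 1: a valid JSON literal is taken as data, never executed.
    return { kind: 'json', json: JSON.parse(text) as Json };
  } catch {
    // Step 2: otherwise try a dot-separated access path (no array indices).
    const path = text.split('.');
    const isIdentifier = (part: string): boolean => /^[A-Za-z_$][\w$]*$/.test(part);
    return path.every(isIdentifier) ? { kind: 'path', path } : { kind: 'invalid' };
  }
}

const asJson = parseExpression('{"key":"value"}'); // contents of {{"key":"value"}}
const asPath = parseExpression('frontmatter.title'); // contents of {frontmatter.title}
const asInvalid = parseExpression('name.toUpperCase()'); // neither JSON nor a path
```

Anything that is neither valid JSON nor a plain path ends up as `invalid`, so there is no branch in which the expression text is evaluated as code.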

Extending mdast Conversion

In the previous section we generated the mdast for mdio. Next, we need to convert mdio's mdast nodes into real hast nodes. Specifically, we pass handler functions for the mdio node types to the remark-rehype library, roughly as in the following code.

const mdioHandlers: ToHastHandlers = {
  MdioTextElement(state, node: MdioTextElement | MdioFlowElement) {
    // Handle Fragment
    if (!node.name) {
      if (node.children.length > 0) {
        return state.all(node);
      } else {
        return undefined;
      }
    }

    const properties: Record<
      string,
      boolean | number | string | null | undefined | Array<string | number>
    > = {};

    for (const attr of node.attributes) {
      if (attr.type === 'MdioAttribute') {
        if (attr.value === null || attr.value === undefined || typeof attr.value === 'string') {
          properties[attr.name] = attr.value ?? '';
        } else if (attr.value.type === 'MdioAttributeValueExpression') {
          if (/^[A-Z]/.test(node.name)) {
            // For custom components, we directly use raw json data
            properties[attr.name] = attr.value.data?.json;
          } else {
            // For builtin DOM elements, serialize the JSON value to a string
            try {
              properties[attr.name] = JSON.stringify(attr.value.data?.json);
            } catch (_error) {
              properties[attr.name] = '';
            }
          }
        }
      }
    }

    return {
      type: 'element',
      tagName: node.name,
      properties,
      position: node.position,
      children: state.all(node)
    };
  },
  MdioTextExpression(state, node: MdioTextExpression) {
    const json = node.data?.json;
    // Compare against undefined/null explicitly so that 0 and '' still render
    if (json !== undefined && json !== null) {
      if (typeof json === 'string' || typeof json === 'number' || typeof json === 'bigint') {
        return {
          type: 'text',
          position: node.position,
          value: '' + json
        };
      } else if (typeof json === 'object') {
        try {
          return {
            type: 'text',
            position: node.position,
            value: JSON.stringify(json)
          };
        } catch (_error) {
          // Not serializable: fall through and render nothing
        }
      }
    }
    return undefined;
  }
};
  
  const processor = unified()
    .use(remarkParse)
    .use(mdio)
    .use(remarkGfm)
    .use(remarkRehype, {
      handlers: mdioHandlers,
      passThrough: [
        'MdioFlowExpression',
        'MdioFlowElement',
        'MdioTextElement',
        'MdioTextExpression',
      ]
    })

Since AST nodes are generally complex in structure, the code mostly follows the type definitions, which guide how to piece together the hast nodes. Running and debugging it shows what is actually happening, so I won't describe it in detail.
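
The attribute rule used by the handler above (custom components receive raw JSON data, builtin DOM elements receive strings) can be isolated into a small pure function. The sketch below is a variant rather than the original behaviour: it passes primitives through as plain text instead of running them through JSON.stringify, which avoids extra quotes around string values. The name `toProperty` is hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Component names start with an uppercase letter, as in the handler above.
const isComponent = (tagName: string): boolean => /^[A-Z]/.test(tagName);

function toProperty(tagName: string, json: Json | undefined): Json | undefined {
  if (isComponent(tagName)) return json; // components get structured data
  if (json === undefined || json === null) return '';
  if (typeof json === 'object') return JSON.stringify(json); // arrays, objects
  return String(json); // primitives become plain attribute text
}

const forComponent = toProperty('InfoBox', [1, 2, 3]); // stays an array
const forDom = toProperty('div', { key: 'value' }); // becomes a JSON string
```

Keeping this rule in one function makes it easier to test on its own and to change later, for example if builtin elements should reject object values entirely.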

In addition, to support access paths into object fields, we also need to traverse the mdast and replace each path with its real value. As mentioned above, the type definitions are complex enough that debugging is the best way to understand them; the rewrite function below looks up the real field for an access path.

function rewriteVariables(root: MdastRoot, env: Record<string, any>) {
  visit(root, function (node: MdastNodes) {
    if (node.type === 'MdioFlowExpression' || node.type === 'MdioTextExpression') {
      if (node.data?.path) {
        const real = rewrite(node.data.path);
        if (node.type === 'MdioTextExpression') {
          // @ts-expect-error ts2322
          node.type = 'text';
          node.value = real;
        }
      }
    } else if (node.type === 'MdioFlowElement' || node.type === 'MdioTextElement') {
      for (const attr of node.attributes) {
        if (attr.type === 'MdioExpressionAttribute' && attr.data?.path) {
          const real = rewrite(attr.data.path);
          attr.value = real;
        } else if (
          attr.type === 'MdioAttribute' &&
          attr.value &&
          typeof attr.value !== 'string' &&
          attr.value.data?.path
        ) {
          attr.value = rewrite(attr.value.data.path);
        }
      }
    }
  });

  function rewrite(path?: AccessPath) {
    if (!Array.isArray(path) || path.length === 0) return undefined;
    let cur: any = env;
    try {
      for (const p of path ?? []) {
        if (p in cur) {
          cur = cur[p];
        } else {
          cur = undefined;
          break;
        }
      }
    } catch (_error) {
      cur = undefined;
    }
    if (cur !== undefined && cur !== null) {
      // TODO: handle more cases (objects, arrays)
      return String(cur);
    } else {
      return '';
    }
  }
}
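
Because the documents are untrusted, one detail of the lookup may be worth tightening: the `p in cur` check also matches inherited properties such as constructor or toString. The following standalone sketch restricts the lookup to own properties. `resolvePath` and `renderValue` are hypothetical names, and this is one possible hardening rather than what mdio does.

```typescript
type AccessPath = Array<string | number>;

// Follow the path through plain data, looking only at own properties.
function resolvePath(env: unknown, path: AccessPath): unknown {
  let current: unknown = env;
  for (const key of path) {
    if (current === null || typeof current !== 'object') return undefined;
    // Inherited members ("constructor", "toString", ...) are not document data.
    if (!Object.prototype.hasOwnProperty.call(current, key)) return undefined;
    current = (current as Record<string | number, unknown>)[key];
  }
  return current;
}

// Turn a resolved value into text, keeping falsy values such as 0 and false.
function renderValue(value: unknown): string {
  if (value === undefined || value === null) return '';
  return typeof value === 'object' ? JSON.stringify(value) : String(value);
}

const env = {
  frontmatter: { title: 123123123, tags: ['t1', 't2', 't3'], draft: false }
};

const title = renderValue(resolvePath(env, ['frontmatter', 'title']));
const secondTag = resolvePath(env, ['frontmatter', 'tags', 1]);
const inherited = resolvePath(env, ['frontmatter', 'constructor']);
```

With the own-property check, `inherited` is undefined, whereas an `in` check would have returned the Object constructor function.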

Handling Some Corner Cases

First, once the syntax becomes JSX, the original HTML comment syntax <!-- ... --> is no longer supported; we could extend the parsing of XML tags or simply strip comments with a regex. However, adding this support would mean mdio is no longer a strict subset of mdx.

Second, writing the original markdown syntax in JSX might lead to some elements being wrapped in additional <p> tags. See the example below:

<table>
    <thead>
        <tr class="header">
            <th><p>播放地區</p></th>
            <th><p>播放平台</p></th>
            <th><p>播放日期</p></th>
            <th><p>播放時間（<a href="UTC+8" title="wikilink">UTC+8</a>）</p></th>
            <th><p>字幕語言</p></th>
            <th><p>備註</p></th>
        </tr>
    </thead>
    <tbody></tbody>
</table>

After compilation by mdx, it generates a result similar to the following:

table
- thead
  - p  // <---
    - tr
      - th1
      - th2
      - ...
- tbody

Notice that the contents of the thead element are unexpectedly wrapped in an additional p layer. mdx has a plugin, remark-mark-and-unravel, that removes such unnecessary wrapper nodes, which only wrap a single child node.
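
The unwrapping idea can be sketched on a simplified tree as follows. This is not the code of remark-mark-and-unravel; the `TreeNode` shape and the `unravel` function are hypothetical, and the rule used here (a paragraph that contains only elements and whitespace is replaced by its element children) is an assumption chosen to match the example above.

```typescript
interface TreeNode {
  type: string;
  name?: string;
  value?: string;
  children?: TreeNode[];
}

const isWhitespace = (node: TreeNode): boolean =>
  node.type === 'text' && (node.value ?? '').trim() === '';

// A paragraph is a redundant wrapper when it holds at least one element
// and nothing else except whitespace.
function isRedundantParagraph(node: TreeNode): boolean {
  if (node.type !== 'paragraph' || !node.children) return false;
  const meaningful = node.children.filter((child) => !isWhitespace(child));
  return meaningful.length > 0 && meaningful.every((child) => child.type === 'element');
}

// Return a copy of the tree in which redundant paragraphs are replaced
// by their element children.
function unravel(node: TreeNode): TreeNode {
  if (!node.children) return node;
  const children: TreeNode[] = [];
  for (const child of node.children.map((c) => unravel(c))) {
    if (isRedundantParagraph(child)) {
      children.push(...(child.children ?? []).filter((c) => !isWhitespace(c)));
    } else {
      children.push(child);
    }
  }
  return { ...node, children };
}

// thead > p > tr, as in the example above
const thead: TreeNode = {
  type: 'element',
  name: 'thead',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: '\n  ' },
        { type: 'element', name: 'tr', children: [] }
      ]
    }
  ]
};

const fixed = unravel(thead); // thead > tr
```

Paragraphs that contain real text are left alone, so ordinary prose is unaffected.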

There are also some other handwritten plugins in mdx whose functions I haven't looked into yet; we can address them as issues arise.

Vue Components

Finally, we wrap it with a Vue component, additionally accepting a components parameter to replace custom components with the corresponding Vue component constructors when converting hast to JSX.

import type { VNode } from '@vue/runtime-dom';
import type { Root as HastRoot } from 'hast';

import { visit } from 'unist-util-visit';
import { Fragment, jsx } from 'vue/jsx-runtime';
import { computed, defineComponent, h } from 'vue';

import { type ParseResult, createParser, toJsxRuntime } from '@breadio/markdown';

export const Markdown = defineComponent({
  name: 'Markdown',
  inheritAttrs: true,
  props: {
    parsed: {
      type: Object,
      required: false
    },
    content: {
      type: String,
      required: false
    },
    components: {
      type: Object,
      required: false
    }
  },
  setup(props, attrs) {
    const parser = createParser();

    const parsed = computed(() => {
      if (props.parsed) {
        return props.parsed as ParseResult<any>;
      } else {
        try {
          const result = parser.parseSync(props.content ?? '');
          return result;
        } catch (error) {
          return undefined;
        }
      }
    });
    const hast = computed(() => {
      if (parsed.value?.hast) {
        // Mutates the hast in place: injects frontmatter into the
        // properties of every custom component element
        unifyVueHast(parsed.value.hast, {
          frontmatter: parsed.value?.frontmatter
        });
      }
      return parsed.value?.hast;
    });
    const frontmatter = computed(() => parsed.value?.frontmatter);

    return () => {
      const header = attrs.slots.header?.({ frontmatter: frontmatter.value });
      const children = hast.value
        ? (toJsxRuntime(hast.value, {
            components: props.components,
            Fragment,
            // @ts-expect-error ts2322
            jsx,
            // @ts-expect-error ts2322
            jsxs: jsx,
            elementAttributeNameCase: 'html'
          }) as VNode)
        : null;
      const footer = attrs.slots.footer?.({ frontmatter: frontmatter.value });

      return h('div', null, [header, children, footer]);
    };
  }
});

function unifyVueHast(root: HastRoot, env: Record<string, any>) {
  const components = new Set<string>();
  visit(root, function (node) {
    if (node.type === 'element' && /^[A-Z]/.test(node.tagName)) {
      node.properties = { ...node.properties, ...env };
      components.add(node.tagName);
    }
  });
  return components;
}

Thus, we can use mdio like this.

<script setup lang="ts">
import { defineAsyncComponent } from 'vue';
import { Markdown } from '@breadio/vue';

const content = `... mdio syntax document string`;

// Custom components, loaded lazily via dynamic import
const components = {
  InfoBox: defineAsyncComponent(() => import('~/components/InfoBox.vue')),
  TagList: defineAsyncComponent(() => import('~/components/TagList.vue')),
};
</script>

<template>
  <Markdown :content="content" :components="components"></Markdown>
</template>