rehype-raw

rehype plugin to parse the tree (and raw nodes) again, keeping positional info intact.

What is this?

This package is a unified (rehype) plugin to parse a document again. Understanding how it works requires knowledge of ASTs (specifically, hast). The plugin passes each node and embedded raw HTML through an HTML parser (parse5) to recreate the tree exactly as a browser would parse it, while keeping the original data and positional info intact.

unified is a project that transforms content with abstract syntax trees (ASTs). rehype adds support for HTML to unified. hast is the HTML AST that rehype uses. This is a rehype plugin that parses the tree again.

When should I use this?

This plugin is particularly useful when coming from markdown and wanting to support HTML embedded inside that markdown (which requires passing allowDangerousHtml: true to remark-rehype). Markdown dictates how, say, a list item or emphasis can be parsed. We can use that to turn the markdown syntax tree into an HTML syntax tree. But markdown also dictates that things that look like HTML are passed through untouched, even when they only look like XML and don’t really make sense as HTML, so we can’t normally use these strings of “HTML” to create an HTML syntax tree. This plugin can: it takes those strings of HTML and includes them in the syntax tree as actual nodes.

If your final result is HTML and you trust the content, then “strings” are fine (you can pass allowDangerousHtml: true to rehype-stringify, which passes HTML through untouched). But there are two main cases where a proper syntax tree is preferred:

  • rehype plugins need a proper syntax tree because they operate on actual nodes to inspect or transform things; they can’t operate on strings of HTML
  • other output formats (React, MDX, etc) need actual nodes and can’t handle strings of HTML
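
To make the difference between strings and nodes concrete, here is a minimal sketch that uses no packages at all. Both trees are hand-written, simplified hast fragments (positional info omitted) that approximate the paragraph from the example in Use before and after the raw strings are reparsed; countElements is a hypothetical helper standing in for whatever a real plugin would do while walking the tree.

```javascript
// Assumption: a simplified picture of the paragraph
// `A mix of *markdown* and <em>HTML</em>.` before the raw strings are
// reparsed. The embedded HTML is still carried as opaque `raw` strings.
const before = {
  type: 'element',
  tagName: 'p',
  properties: {},
  children: [
    {type: 'text', value: 'A mix of '},
    {
      type: 'element',
      tagName: 'em',
      properties: {},
      children: [{type: 'text', value: 'markdown'}]
    },
    {type: 'text', value: ' and '},
    {type: 'raw', value: '<em>'},
    {type: 'text', value: 'HTML'},
    {type: 'raw', value: '</em>'},
    {type: 'text', value: '.'}
  ]
}

// Assumption: the same paragraph once the raw strings have become nodes.
const after = {
  type: 'element',
  tagName: 'p',
  properties: {},
  children: [
    {type: 'text', value: 'A mix of '},
    {
      type: 'element',
      tagName: 'em',
      properties: {},
      children: [{type: 'text', value: 'markdown'}]
    },
    {type: 'text', value: ' and '},
    {
      type: 'element',
      tagName: 'em',
      properties: {},
      children: [{type: 'text', value: 'HTML'}]
    },
    {type: 'text', value: '.'}
  ]
}

// Hypothetical helper: count elements with a given tag name, the way a
// plugin that inspects nodes might. It cannot see inside `raw` strings.
function countElements(node, tagName) {
  let count = node.type === 'element' && node.tagName === tagName ? 1 : 0
  for (const child of node.children || []) {
    count += countElements(child, tagName)
  }
  return count
}

console.log(countElements(before, 'em')) // 1
console.log(countElements(after, 'em')) // 2
```

The walker only finds the second em once it exists as an element, which is the situation both cases in the list describe.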

This plugin is built on hast-util-raw, which does the work on syntax trees. rehype focusses on making it easier to transform content by abstracting such internals away.

Install

This package is ESM only. In Node.js (version 16+), install with npm:

npm install rehype-raw

In Deno with esm.sh:

import rehypeRaw from 'https://esm.sh/rehype-raw@7'

In browsers with esm.sh:

<script type="module">
  import rehypeRaw from 'https://esm.sh/rehype-raw@7?bundle'
</script>

Use

Say we have the following markdown file example.md:

<div class="note">

A mix of *markdown* and <em>HTML</em>.

</div>

…and our module example.js looks as follows:

import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeRaw from 'rehype-raw'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import {read} from 'to-vfile'
import {unified} from 'unified'

const file = await unified()
  .use(remarkParse)
  .use(remarkRehype, {allowDangerousHtml: true})
  .use(rehypeRaw)
  .use(rehypeDocument, {title: '🙌'})
  .use(rehypeFormat)
  .use(rehypeStringify)
  .process(await read('example.md'))

console.log(String(file))

…now running node example.js yields:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>🙌</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>
  <body>
    <div class="note">
      <p>A mix of <em>markdown</em> and <em>HTML</em>.</p>
    </div>
  </body>
</html>

API

This package exports no identifiers. The default export is rehypeRaw.

unified().use(rehypeRaw[, options])

Parse the tree (and raw nodes) again, keeping positional info intact.

Parameters
  • options (Options, optional) — configuration

Returns

Transform (Transformer).

Options

Configuration (TypeScript type).

Fields
  • passThrough (Array<string>, default: []) — list of custom hast node types to pass through (as in, keep); this option is a bit advanced as it requires knowledge of ASTs, so we defer to the docs in hast-util-raw
  • tagfilter (boolean | null | undefined, default: false) — whether to disallow irregular tags in raw nodes according to GFM tagfilter; this affects the following tags, grouped by their kind: RAWTEXT (iframe, noembed, noframes, style, xmp), RCDATA (textarea, title), SCRIPT_DATA (script), PLAINTEXT (plaintext); when you know that you do not want authors to write these tags, enable this option to block them
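
The GFM tagfilter rule that the tagfilter field refers to can be sketched in a few lines. This is an illustration of the rule as described in the GFM specification (the leading < of a listed tag is replaced with &lt;), not necessarily how this plugin implements it; applyTagfilter is a hypothetical helper and not an export of this package.

```javascript
// Tag names taken from the list in the `tagfilter` field above.
// A match requires the name to be followed by whitespace, `/`, or `>`,
// so longer names such as `<scripts>` are left alone.
const tagfilterExpression =
  /<(\/?)(iframe|noembed|noframes|plaintext|script|style|textarea|title|xmp)(?=[\t\n\f\r />])/gi

// Hypothetical helper: neutralize filtered tags in a string of HTML by
// escaping their opening angle bracket. Other tags pass through untouched.
function applyTagfilter(html) {
  return html.replace(tagfilterExpression, '&lt;$1$2')
}

console.log(applyTagfilter('<strong> <title> <style> <em>'))
// → <strong> &lt;title> &lt;style> <em>
```

With the plugin itself, the option is passed as the second argument to use, following the signature above: unified().use(rehypeRaw, {tagfilter: true}).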

Types

This package is fully typed with TypeScript. It exports the additional type Options.

The Raw node type is registered by and exposed from remark-rehype.

Compatibility

Projects maintained by the unified collective are compatible with maintained versions of Node.js.

When we cut a new major release, we drop support for unmaintained versions of Node. This means we try to keep the current release line, rehype-raw@^7, compatible with Node.js 16.

Security

The allowDangerousHtml option in remark-rehype is dangerous; see that plugin’s documentation for how to make it safe. Otherwise, this plugin is safe.

Contribute

See contributing.md in rehypejs/.github for ways to get started. See support.md for ways to get help.

This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

License

MIT © Titus Wormer