As I already mentioned, our AST will consist of 3 node types, which means that we will have to create an enum that contains the values `element`, `props` and `value`. This way the node types won't be hardcoded and patching the code will be much easier:
@@ -1,5 +1,6 @@
┊1┊1┊{
┊2┊2┊ "name": "jsx-runtime",
+┊ ┊3┊ "version": "0.1.0",
┊3┊4┊ "description": "A runtime version of JSX",
┊4┊5┊ "main": "build/jsx-runtime.js",
┊5┊6┊ "repository": {
@@ -0,0 +1,5 @@
+┊ ┊1┊const types = {
+┊ ┊2┊ element: 'element',
+┊ ┊3┊ value: 'value',
+┊ ┊4┊ props: 'props',
+┊ ┊5┊}🚫↵
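As a small illustration of why a shared enum helps, here is a minimal sketch of code that branches on the node type. `Object.freeze` and the `describeNode()` helper are assumptions made for this example and are not part of the runtime itself:

```javascript
// Same three node types as above; Object.freeze is added here only to
// guard against accidental mutation (an assumption of this sketch).
const types = Object.freeze({
  element: 'element',
  value: 'value',
  props: 'props',
})

// Hypothetical helper: describes a node by switching on its type.
const describeNode = (node) => {
  switch (node.type) {
    case types.element:
      return `element <${node.name}>`
    case types.props:
      return `props with ${Object.keys(node.props).length} key(s)`
    case types.value:
      return `value "${node.value}"`
    default:
      throw new TypeError(`Unknown node type: ${node.type}`)
  }
}

console.log(describeNode({ type: types.value, value: 'text' }))
```

Because every comparison goes through `types`, renaming a node type only requires touching the enum.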
Since we have 3 node types, each of them should have a dedicated parsing function:
@@ -2,4 +2,40 @@
┊ 2┊ 2┊ element: 'element',
┊ 3┊ 3┊ value: 'value',
┊ 4┊ 4┊ props: 'props',
-┊ 5┊ ┊}🚫↵
+┊ ┊ 5┊}
+┊ ┊ 6┊
+┊ ┊ 7┊const parseElement = (str) => {
+┊ ┊ 8┊ let match
+┊ ┊ 9┊ let length
+┊ ┊10┊
+┊ ┊11┊ const node = {
+┊ ┊12┊ type: types.element,
+┊ ┊13┊ props: parseProps(''),
+┊ ┊14┊ children: [],
+┊ ┊15┊ length: 0,
+┊ ┊16┊ name: '',
+┊ ┊17┊ }
+┊ ┊18┊
+┊ ┊19┊ return node
+┊ ┊20┊}
+┊ ┊21┊
+┊ ┊22┊const parseProps = (str) => {
+┊ ┊23┊ let match
+┊ ┊24┊ let length
+┊ ┊25┊
+┊ ┊26┊ const node = {
+┊ ┊27┊ type: types.props,
+┊ ┊28┊ length: 0,
+┊ ┊29┊ props: {},
+┊ ┊30┊ }
+┊ ┊31┊
+┊ ┊32┊ return node
+┊ ┊33┊}
+┊ ┊34┊
+┊ ┊35┊const parseValue = (str) => {
+┊ ┊36┊ return {
+┊ ┊37┊ type: types.value,
+┊ ┊38┊ length: str.length,
+┊ ┊39┊ value: str.trim(),
+┊ ┊40┊ }
+┊ ┊41┊}
Each function creates the basic node for its type and returns it. Note that at the beginning of the scope of `parseElement()` and `parseProps()` I've defined a couple of variables:
- `let match` - which will be used to store regular expression matches on the fly.
- `let length` - which will be used to store the length of the match so we can trim the JSX code string right after and accumulate it in `node.length`.
For now the `parseValue()` function is pretty straightforward and just returns a node which wraps the given string.
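To see what such a node looks like, here is a minimal standalone sketch that repeats `parseValue()` and calls it on a sample string (the sample input is made up for this example). Note that `length` counts the untrimmed string, so the caller can still skip the surrounding whitespace:

```javascript
const types = { value: 'value' }

// Wraps a raw string in a value node.
const parseValue = (str) => {
  return {
    type: types.value,
    length: str.length, // length of the raw, untrimmed input
    value: str.trim(), // the text without surrounding whitespace
  }
}

const node = parseValue(' text ')

console.log(node) // { type: 'value', length: 6, value: 'text' }
```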
We will begin with the implementation of the element node and we will branch out to other nodes as we go. First we will try to figure out the name of the element. If an element tag opener is not found, we will assume that the current part of the code is a value:
@@ -16,6 +16,19 @@
┊16┊16┊ name: '',
┊17┊17┊ }
┊18┊18┊
+┊ ┊19┊ match = str.match(/<(\w+)/)
+┊ ┊20┊
+┊ ┊21┊ if (!match) {
+┊ ┊22┊ str = str.split('<')[0]
+┊ ┊23┊
+┊ ┊24┊ return parseValue(str)
+┊ ┊25┊ }
+┊ ┊26┊
+┊ ┊27┊ node.name = match[1]
+┊ ┊28┊ length = match.index + match[0].length
+┊ ┊29┊ str = str.slice(length)
+┊ ┊30┊ node.length += length
+┊ ┊31┊
┊19┊32┊ return node
┊20┊33┊}
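The matching step can be tried in isolation. The following is a minimal sketch where `openerOf()` is a hypothetical helper written for this example; it returns the element name together with the number of characters that would be trimmed off the string:

```javascript
// Hypothetical helper: finds the first tag opener in the string.
const openerOf = (str) => {
  const match = str.match(/<(\w+)/)

  // No opener found, e.g. plain text or a closing tag.
  if (!match) return null

  return {
    name: match[1],
    // Everything up to and including the tag name gets consumed.
    consumed: match.index + match[0].length,
  }
}

console.log(openerOf('<div class="a">')) // { name: 'div', consumed: 4 }
console.log(openerOf('plain text</span>')) // null
```

Since `</span>` starts with `</` and not with `<` followed by a word character, it is not treated as an opener, which is why the second call falls back to the value branch.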
Up next, we need to parse the props. To make things more efficient, we first need to find the tag closer so we can give the `parseProps()` function only the relevant part of the string:
@@ -29,6 +29,15 @@
┊29┊29┊ str = str.slice(length)
┊30┊30┊ node.length += length
┊31┊31┊
+┊ ┊32┊ match = str.match(/>/)
+┊ ┊33┊
+┊ ┊34┊ if (!match) return node
+┊ ┊35┊
+┊ ┊36┊ node.props = parseProps(str.slice(0, match.index))
+┊ ┊37┊ length = node.props.length
+┊ ┊38┊ str = str.slice(length)
+┊ ┊39┊ node.length += length
+┊ ┊40┊
┊32┊41┊ return node
┊33┊42┊}
Now that we've plucked the right substring, we can go ahead and implement the `parseProps()` function logic:
@@ -51,6 +51,27 @@
┊51┊51┊ props: {},
┊52┊52┊ }
┊53┊53┊
+┊ ┊54┊ const matchNextProp = () => {
+┊ ┊55┊ match =
+┊ ┊56┊ str.match(/ *\w+="(?:.*?[^\\]")?/) ||
+┊ ┊57┊ str.match(/ *\w+/)
+┊ ┊58┊ }
+┊ ┊59┊
+┊ ┊60┊ matchNextProp()
+┊ ┊61┊
+┊ ┊62┊ while (match) {
+┊ ┊63┊ const propStr = match[0]
+┊ ┊64┊ let [key, ...value] = propStr.split('=')
+┊ ┊65┊ node.length += propStr.length
+┊ ┊66┊ key = key.trim()
+┊ ┊67┊ value = value.join('=')
+┊ ┊68┊ value = value ? value.slice(1, -1) : true
+┊ ┊69┊ node.props[key] = value
+┊ ┊70┊ str = str.slice(0, match.index) + str.slice(match.index + propStr.length)
+┊ ┊71┊
+┊ ┊72┊ matchNextProp()
+┊ ┊73┊ }
+┊ ┊74┊
┊54┊75┊ return node
┊55┊76┊}
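To make the matching loop concrete, here is a standalone sketch of the same idea run against a sample attribute string. The name `parsePropsSketch()` and the sample input are made up for this example, and the quoted-value pattern uses a lazy `.*?`, which is an assumption of this sketch so that several props on one tag are picked up one at a time:

```javascript
// Hypothetical standalone version of the props loop.
const parsePropsSketch = (str) => {
  const props = {}
  let length = 0

  // Try key="value" first, then fall back to a bare key.
  const matchNextProp = () =>
    str.match(/ *\w+="(?:.*?[^\\]")?/) || str.match(/ *\w+/)

  let match = matchNextProp()

  while (match) {
    const propStr = match[0]
    let [key, ...value] = propStr.split('=')
    length += propStr.length
    key = key.trim()
    // Re-join in case the value itself contained '=' characters.
    value = value.join('=')
    // Strip the surrounding quotes; a bare key defaults to true.
    props[key] = value ? value.slice(1, -1) : true
    // Cut the matched prop out of the string and look for the next one.
    str = str.slice(0, match.index) + str.slice(match.index + propStr.length)
    match = matchNextProp()
  }

  return { props, length }
}

const result = parsePropsSketch(' src="icon.svg" hidden')

console.log(result) // { props: { src: 'icon.svg', hidden: true }, length: 22 }
```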
The logic is pretty straightforward: we iterate through the string, and each time we try to match the next key->value pair. Once no pair is found, we return the node with the accumulated props. Note that providing only an attribute with no value is also valid syntax, which sets its value to `true` by default, hence the `/ *\w+/` regexp. Let's proceed where we left off with the element parsing implementation.
We need to figure out whether the current element is self-closing or not. If it is, we will return the node; otherwise we will continue to parse its children:
@@ -38,6 +38,22 @@
┊38┊38┊ str = str.slice(length)
┊39┊39┊ node.length += length
┊40┊40┊
+┊ ┊41┊ match = str.match(/^ *\/ *>/)
+┊ ┊42┊
+┊ ┊43┊ if (match) {
+┊ ┊44┊ node.length += match.index + match[0].length
+┊ ┊45┊
+┊ ┊46┊ return node
+┊ ┊47┊ }
+┊ ┊48┊
+┊ ┊49┊ match = str.match(/>/)
+┊ ┊50┊
+┊ ┊51┊ if (!match) return node
+┊ ┊52┊
+┊ ┊53┊ length = match.index + 1
+┊ ┊54┊ str = str.slice(length)
+┊ ┊55┊ node.length += length
+┊ ┊56┊
┊41┊57┊ return node
┊42┊58┊}
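The self-closing check boils down to a single anchored regular expression. Here is a minimal sketch, where `isSelfClosing()` is a hypothetical helper and its argument is whatever remains of the tag after the props have been consumed:

```javascript
// Hypothetical helper: true if the rest of the tag is "/>",
// allowing optional spaces before and after the slash.
const isSelfClosing = (rest) => /^ *\/ *>/.test(rest)

console.log(isSelfClosing(' />')) // true
console.log(isSelfClosing('/ >')) // true
console.log(isSelfClosing('>text</span>')) // false
```

The `^` anchor matters here: without it, a `/>` belonging to a later element in the string would be mistaken for the end of the current tag.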
Accordingly, we're gonna implement the children parsing logic:
@@ -54,6 +54,16 @@
┊54┊54┊ str = str.slice(length)
┊55┊55┊ node.length += length
┊56┊56┊
+┊ ┊57┊ let child = parseElement(str)
+┊ ┊58┊
+┊ ┊59┊ while (child.type === types.element || child.value) {
+┊ ┊60┊ length = child.length
+┊ ┊61┊ str = str.slice(length)
+┊ ┊62┊ node.length += length
+┊ ┊63┊ node.children.push(child)
+┊ ┊64┊ child = parseElement(str)
+┊ ┊65┊ }
+┊ ┊66┊
┊57┊67┊ return node
┊58┊68┊}
Children parsing is recursive. We keep calling the `parseElement()` function on the remaining substring until there are no more matches. Once we've gone through all the children, we can finish the process by finding the closing tag:
@@ -64,6 +64,12 @@
┊64┊64┊ child = parseElement(str)
┊65┊65┊ }
┊66┊66┊
+┊ ┊67┊ match = str.match(new RegExp(`</${node.name}>`))
+┊ ┊68┊
+┊ ┊69┊ if (!match) return node
+┊ ┊70┊
+┊ ┊71┊ node.length += match.index + match[0].length
+┊ ┊72┊
┊67┊73┊ return node
┊68┊74┊}
The HTML parsing part is finished! Now we can call `parseElement()` with any given HTML string and we should get a JSON output which represents an AST, like the following:
{
"type": "element",
"props": {
"type": "props",
"length": 20,
"props": {
"onclick": "onclick()"
}
},
"children": [
{
"type": "element",
"props": {
"type": "props",
"length": 15,
"props": {
"src": "icon.svg"
}
},
"children": [],
"length": 18,
"name": "img"
},
{
"type": "element",
"props": {
"type": "props",
"length": 0,
"props": {}
},
"children": [
{
"type": "value",
"length": 4,
"value": "text"
}
],
"length": 12,
"name": "span"
}
],
"length": 74,
"name": "div"
}
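For reference, here is a sketch of the steps above assembled into a single runnable file. It is a condensed version rather than the exact final source: the input string is made up for this example, `parseProps()` is called with a single argument, and the quoted-value pattern uses a lazy `.*?` (both are assumptions of this sketch):

```javascript
const types = { element: 'element', value: 'value', props: 'props' }

const parseValue = (str) => ({
  type: types.value,
  length: str.length,
  value: str.trim(),
})

const parseProps = (str) => {
  const node = { type: types.props, length: 0, props: {} }
  // Lazy `.*?` (an assumption here) stops at the first unescaped quote.
  const matchNextProp = () =>
    str.match(/ *\w+="(?:.*?[^\\]")?/) || str.match(/ *\w+/)

  let match = matchNextProp()
  while (match) {
    const propStr = match[0]
    let [key, ...value] = propStr.split('=')
    node.length += propStr.length
    value = value.join('=')
    node.props[key.trim()] = value ? value.slice(1, -1) : true
    str = str.slice(0, match.index) + str.slice(match.index + propStr.length)
    match = matchNextProp()
  }
  return node
}

const parseElement = (str) => {
  const node = {
    type: types.element, props: parseProps(''), children: [], length: 0, name: '',
  }

  // 1. Tag opener and name; anything else is treated as a value.
  let match = str.match(/<(\w+)/)
  if (!match) return parseValue(str.split('<')[0])
  node.name = match[1]
  let length = match.index + match[0].length
  str = str.slice(length)
  node.length += length

  // 2. Props: everything up to the tag closer.
  match = str.match(/>/)
  if (!match) return node
  node.props = parseProps(str.slice(0, match.index))
  length = node.props.length
  str = str.slice(length)
  node.length += length

  // 3. Self-closing tag ends here; otherwise skip past the closer.
  match = str.match(/^ *\/ *>/)
  if (match) {
    node.length += match.index + match[0].length
    return node
  }
  match = str.match(/>/)
  if (!match) return node
  length = match.index + 1
  str = str.slice(length)
  node.length += length

  // 4. Children, parsed recursively until an empty value comes back.
  let child = parseElement(str)
  while (child.type === types.element || child.value) {
    str = str.slice(child.length)
    node.length += child.length
    node.children.push(child)
    child = parseElement(str)
  }

  // 5. Closing tag.
  match = str.match(new RegExp(`</${node.name}>`))
  if (match) node.length += match.index + match[0].length
  return node
}

const html = '<div onclick="onclick()"><img src="icon.svg" /><span>text</span></div>'
const ast = parseElement(html)

console.log(JSON.stringify(ast, null, 2))
```

A quick sanity check for a sketch like this is that the root node's `length` equals the length of the input string, meaning every character was accounted for.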