Front-end developer with back-end, SEO, and security skills. Open Source maintainer. Blogger. Polish Wikipedia editor.
Have you ever wanted to create your own programming language? In this article I will demonstrate how to quickly write a simple language that compiles to JavaScript using free tools and the PEG.js parser generator. It covers what you need to get started on a language of your own.
A parser generator, as the name suggests, is a program that generates a parser for you from a grammar, that is, a specification of the language written in a specific syntax. In this article we will use the PEG.js parser generator, which produces a JavaScript file that parses code written in your language and outputs an AST.
AST is an abbreviation of Abstract Syntax Tree. It is a way to represent code in a format that tools can understand. We will use the AST format of Esprima, a JavaScript parser that outputs an AST.
What is cool about the Esprima format is that there are tools that generate code from its AST. One example is escodegen, which takes an Esprima AST as input and outputs JavaScript code.
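To make the idea concrete, here is a minimal sketch of what a code generator does with such a tree. It is not escodegen, only a toy stand-in: the generate function is a hypothetical name and it handles just three node types.

```javascript
// Toy stand-in for a code generator such as escodegen (hypothetical, not its API).
// It walks an Esprima-style AST and returns JavaScript source text.
function generate(node) {
  switch (node.type) {
    case 'Literal':
      // JSON.stringify quotes strings and prints numbers as they are
      return JSON.stringify(node.value);
    case 'Identifier':
      return node.name;
    case 'BinaryExpression':
      return generate(node.left) + ' ' + node.operator + ' ' + generate(node.right);
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

const ast = {
  type: 'BinaryExpression',
  operator: '==',
  left: { type: 'Identifier', name: 'foo' },
  right: { type: 'Literal', value: 'bar' }
};

const output = generate(ast);
console.log(output); // foo == "bar"
```

A real generator covers every node type and takes care of indentation and operator precedence, which is why the article relies on escodegen rather than code like this.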
Here I will show you how to create a simple parser grammar for an if statement.
The syntax of PEG.js is not very complicated. Each rule consists of a name, a matching expression, and an optional block of JavaScript that is executed when the rule matches; the value returned from that block becomes the result of the rule.
Here is a simple example from the PEG.js documentation:
{
function makeInteger(o) {
return parseInt(o.join(""), 10);
}
}
start
= additive
additive
= left:multiplicative "+" right:additive { return left + right; }
/ multiplicative
multiplicative
= left:primary "*" right:multiplicative { return left * right; }
/ primary
primary
= integer
/ "(" additive:additive ")" { return additive; }
integer "integer"
= digits:[0-9]+ { return makeInteger(digits); }
It can parse and evaluate simple arithmetic expressions. For example, 10+2*3 evaluates to 16. You can test this parser with the PEG.js Online Tool.
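For comparison, here is a hand-written sketch of the same grammar as a recursive-descent evaluator in plain JavaScript. It is not what PEG.js generates; the inner function names simply mirror the rule names, and evaluate is a hypothetical entry point.

```javascript
// Hand-written equivalent of the arithmetic grammar (illustration only).
function evaluate(input) {
  let pos = 0;

  // additive = multiplicative "+" additive / multiplicative
  function additive() {
    const left = multiplicative();
    if (input[pos] === '+') {
      pos++;
      return left + additive();
    }
    return left;
  }

  // multiplicative = primary "*" multiplicative / primary
  function multiplicative() {
    const left = primary();
    if (input[pos] === '*') {
      pos++;
      return left * multiplicative();
    }
    return left;
  }

  // primary = integer / "(" additive ")"
  function primary() {
    if (input[pos] === '(') {
      pos++;
      const value = additive();
      if (input[pos] !== ')') {
        throw new SyntaxError('Expected ")" at position ' + pos);
      }
      pos++;
      return value;
    }
    return integer();
  }

  // integer = [0-9]+
  function integer() {
    const start = pos;
    while (pos < input.length && input[pos] >= '0' && input[pos] <= '9') {
      pos++;
    }
    if (start === pos) {
      throw new SyntaxError('Expected integer at position ' + pos);
    }
    return parseInt(input.slice(start, pos), 10);
  }

  const result = additive();
  if (pos !== input.length) {
    throw new SyntaxError('Unexpected input at position ' + pos);
  }
  return result;
}

console.log(evaluate('10+2*3'));   // 16
console.log(evaluate('(10+2)*3')); // 36
```

Writing this by hand for a bigger language gets tedious quickly, which is the work a parser generator takes over.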
But what we need is not to interpret the code and return a single value, but to return an Esprima AST. To see what an Esprima AST looks like, open AST Explorer, select Esprima as the parser, and type some JavaScript.
Here is an example of simple code:
if (foo == "bar") {
10 + 10
10 * 20
}
The output, in JSON format, looks like this:
{
"type": "Program",
"body": [
{
"type": "IfStatement",
"test": {
"type": "BinaryExpression",
"operator": "==",
"left": {
"type": "Identifier",
"name": "foo",
"range": [
4,
7
]
},
"right": {
"type": "Literal",
"value": "bar",
"raw": "\"bar\"",
"range": [
11,
16
]
},
"range": [
4,
16
]
},
"consequent": {
"type": "BlockStatement",
"body": [
{
"type": "ExpressionStatement",
"expression": {
"type": "BinaryExpression",
"operator": "+",
"left": {
"type": "Literal",
"value": 10,
"raw": "10",
"range": [
23,
25
]
},
"right": {
"type": "Literal",
"value": 10,
"raw": "10",
"range": [
28,
30
]
},
"range": [
23,
30
]
},
"range": [
23,
30
]
},
{
"type": "ExpressionStatement",
"expression": {
"type": "BinaryExpression",
"operator": "*",
"left": {
"type": "Literal",
"value": 10,
"raw": "10",
"range": [
34,
36
]
},
"right": {
"type": "Literal",
"value": 20,
"raw": "20",
"range": [
39,
41
]
},
"range": [
34,
41
]
},
"range": [
34,
41
]
}
],
"range": [
18,
43
]
},
"alternate": null,
"range": [
0,
43
]
}
],
"sourceType": "module",
"range": [
0,
43
]
}
You don't need to care about "range" and "raw". They are just part of the parser output.
Let's split the JSON into its parts.
The if statement needs to have this format:
{
"type": "IfStatement",
"test": {
},
"consequent": {
},
"alternate": null
}
Here "test" is any expression and "consequent" is the statement that runs when the test is true.
The condition can be any expression, but here we will use a binary expression that compares two things:
{
"type": "BinaryExpression",
"operator": "==",
"left": {},
"right": {}
}
Variable usage looks like this:
{
"type": "Identifier",
"name": "foo"
}
A string literal used in our code looks like this:
{
"type": "Literal",
"value": "bar"
}
The block inside the if is created like this:
{
"type": "BlockStatement",
"body": [ ]
}
And the whole program is created like this:
{
"type": "Program",
"body": [ ]
}
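The pieces above can be put together with a few small helper functions. This is only a sketch: the helper names (program, ifStatement, and so on) are made up for this example and are not part of Esprima or PEG.js.

```javascript
// Hypothetical helpers that build the Esprima-style nodes described above.
const identifier = (name) => ({ type: 'Identifier', name });
const literal = (value) => ({ type: 'Literal', value });
const binary = (operator, left, right) => ({
  type: 'BinaryExpression', operator, left, right
});
const expressionStatement = (expression) => ({
  type: 'ExpressionStatement', expression
});
const block = (body) => ({ type: 'BlockStatement', body });
const ifStatement = (test, consequent) => ({
  type: 'IfStatement', test, consequent, alternate: null
});
const program = (body) => ({ type: 'Program', body });

// Same tree as the JSON above, without "range", "raw" and "sourceType".
const ast = program([
  ifStatement(
    binary('==', identifier('foo'), literal('bar')),
    block([
      expressionStatement(binary('+', literal(10), literal(10))),
      expressionStatement(binary('*', literal(10), literal(20)))
    ])
  )
]);

console.log(JSON.stringify(ast, null, 2));
```

The actions in the grammar rules do the same job: each rule returns one of these plain objects, and the parent rule nests it inside its own node.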
For our demo language we will create code that looks similar to Ruby:
if foo == "bar" then
10 + 10
10 * 20
end
From it we will create an AST, which will then be turned into JavaScript code.
The PEG grammar for if looks like this:
if = "if" _ expression:(comparison / expression) _ "then" body:(statements / _) _ "end" {
return {
"type": "IfStatement",
"test": expression,
"consequent": {
"type": "BlockStatement",
"body": body
},
"alternate": null
};
}
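We have the "if" token, then an expression that is either a comparison or a plain expression, then the "then" token, a body that is either a list of statements or white space, and finally the "end" token. The _ rule, which is not shown here, is assumed to match optional white space.
A comparison looks like this: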
comparison = _ left:expression _ "==" _ right:expression _ {
return {
"type": "BinaryExpression",
"operator": "==",
"left": left,
"right": right
};
}
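An expression looks like this. As written it is only a variable or a literal, so arithmetic such as 10 + 10 would need extra rules like additive and multiplicative from the earlier example: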
expression = expression:(variable / literal) { return expression; }
A variable is created from three rules:
variable = !keywords variable:name {
return {
"type": "Identifier",
"name": variable
}
}
keywords = "if" / "then" / "end"
name = [A-Z_$a-z][A-Z_a-z0-9]* { return text(); }
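The !keywords part is a negative lookahead: the variable rule fails if the input at the current position starts with a keyword. Below is a rough sketch of that behavior in plain JavaScript; matchVariable is a hypothetical function, not something PEG.js generates. It also shows a side effect of the rule as written: a name that merely starts with a keyword, such as iffy, is rejected too.

```javascript
// Rough emulation of: variable = !keywords variable:name
const keywords = ['if', 'then', 'end'];

function matchVariable(input) {
  // !keywords: fail when the input starts with any keyword
  if (keywords.some((keyword) => input.startsWith(keyword))) {
    return null;
  }
  // name = [A-Z_$a-z][A-Z_a-z0-9]*
  const match = /^[A-Z_$a-z][A-Z_a-z0-9]*/.exec(input);
  return match ? { type: 'Identifier', name: match[0] } : null;
}

console.log(matchVariable('foo'));  // { type: 'Identifier', name: 'foo' }
console.log(matchVariable('if'));   // null
console.log(matchVariable('iffy')); // null, because it starts with "if"
```

A common way around this in PEG grammars is to let a keyword match only when it is not followed by a name character, for example "if" ![A-Z_a-z0-9].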
Now let's look at statements:
statements = _ head:(if / expression_statement) _ tail:(!"end" _ (if / expression_statement))* {
return [head].concat(tail.map(function(element) {
return element[2];
}));
}
expression_statement = expression:expression {
return {
"type": "ExpressionStatement",
"expression": expression
};
}
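The action in the statements rule may look cryptic. Each item matched by tail is an array with three entries, one per part of the sequence !"end" _ (if / expression_statement), so the statement sits at index 2. The sketch below replays that step on hand-made data; the values standing in for the results of !"end" and _ are assumptions, since they depend on how _ is defined.

```javascript
// Hand-made stand-ins for what the statements rule receives.
const statement = (name) => ({
  type: 'ExpressionStatement',
  expression: { type: 'Identifier', name }
});

const head = statement('a');
const tail = [
  // [result of !"end", result of _, the matched statement]
  [undefined, [' '], statement('b')],
  [undefined, ['\n'], statement('c')]
];

// Same expression as in the grammar action
const body = [head].concat(tail.map(function(element) {
  return element[2];
}));

console.log(body.map((node) => node.expression.name)); // [ 'a', 'b', 'c' ]
```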
The last thing is literals:
literal = value:(string / Integer) {
return {"type": "Literal", "value": value };
}
string = "\"" ("\\" . / [^"\\])* "\"" {
return JSON.parse(text());
}
Integer "integer"
= _ [0-9]+ { return parseInt(text(), 10); }
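Both literal actions turn the matched source text into a JavaScript value: JSON.parse decodes a quoted string, and parseInt reads the digits (it skips the leading white space that _ may have matched). Here is a small sketch of that conversion; makeLiteral is a hypothetical helper that mirrors the actions and is not part of the grammar.

```javascript
// Hypothetical helper mirroring the literal, string and Integer actions.
function makeLiteral(matchedText) {
  const trimmed = matchedText.trim();
  const value = trimmed.startsWith('"')
    ? JSON.parse(trimmed)      // quoted text becomes a string, escapes are decoded
    : parseInt(trimmed, 10);   // digits become a number
  return { type: 'Literal', value };
}

console.log(makeLiteral('"bar"'));         // { type: 'Literal', value: 'bar' }
console.log(makeLiteral(' 42'));           // { type: 'Literal', value: 42 }
console.log(makeLiteral('"a\\"b"').value); // a"b
```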
And that is the whole parser that generates the AST. Once we have an Esprima AST, all we have to do is generate the code with escodegen.
The code that generates the AST and creates the JavaScript code looks like this:
const ast = parser.parse(code);
const js_code = escodegen.generate(ast);
The parser variable is the name you give the parser when you generate it with PEG.js.
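In practice parser.parse throws when the input does not match the grammar, so it helps to wrap both steps. The sketch below assumes a parser object with a parse method and a generator object with a generate method, as in the snippet above. The compile function and the two stand-in objects are hypothetical and exist only so the example runs without PEG.js or escodegen installed.

```javascript
// Wraps parse + generate and returns a result object instead of throwing.
function compile(code, parser, generator) {
  try {
    const ast = parser.parse(code);
    return { ok: true, js: generator.generate(ast) };
  } catch (error) {
    return { ok: false, message: error.message };
  }
}

// Stand-ins for the generated parser and for escodegen (assumptions).
const fakeParser = {
  parse(code) {
    if (code.trim() === '') {
      throw new SyntaxError('Expected "if" but end of input found.');
    }
    return { type: 'Identifier', name: code.trim() };
  }
};
const fakeGenerator = {
  generate(ast) {
    return ast.name;
  }
};

console.log(compile('foo', fakeParser, fakeGenerator));
// { ok: true, js: 'foo' }
console.log(compile('', fakeParser, fakeGenerator).ok); // false
```

With the real libraries the call would be compile(code, parser, escodegen).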
And here is the simple demo that I used to write the parser. You can play with the grammar and create a different syntax for your own programming language that compiles to JavaScript.
This simple application saves your code in localStorage on each change, as long as it compiles without errors, so you can safely use it to create your own language. But I don't guarantee that you will not lose your work, so you may want to use something more robust.
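Here is one way such save-on-success behavior could be written. It is a sketch under assumptions, not the demo's actual code: saveIfCompiles, memoryStorage and compileFn are hypothetical names, and the storage object only needs a setItem method like the one on window.localStorage.

```javascript
// Saves the source only when it compiles.
function saveIfCompiles(key, code, compileFn, storage) {
  try {
    compileFn(code);   // expected to throw on a syntax error
  } catch (error) {
    return false;      // keep the last good version
  }
  storage.setItem(key, code);
  return true;
}

// In-memory stand-in for localStorage so the sketch runs outside a browser.
const memoryStorage = {
  data: {},
  setItem(key, value) { this.data[key] = String(value); },
  getItem(key) { return key in this.data ? this.data[key] : null; }
};

// Hypothetical compiler that only accepts code ending with "end".
const compileFn = (code) => {
  if (!code.trim().endsWith('end')) throw new SyntaxError('Expected "end"');
  return code;
};

saveIfCompiles('code', 'if foo then end', compileFn, memoryStorage); // true
saveIfCompiles('code', 'if foo then', compileFn, memoryStorage);     // false
console.log(memoryStorage.getItem('code')); // if foo then end
```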
Writing a language that compiles to JavaScript is simple. The techniques explained in this article should allow you to create your own programming language that compiles to JavaScript.