Welcome to the next part of creating your own programming language. In this part, we’ll continue improving our toy language by implementing Exceptions.
The full source code is available over on GitHub.
First, we’ll define the syntax rules for throwing and handling an Exception, similar to Ruby syntax:
The raise keyword:
raise
raise "This is an Exception"
class Exception [message]
end
raise new Exception ["This is an Exception message"]
1: do_something []
2:
3: fun do_something
4: new Test :: do_something_else []
5: end
6:
7: class Test
8:
9: fun do_something_else
10: do_even_more []
11: end
12:
13: fun do_even_more
14: raise "A message that describes the error."
15: end
16:
17: end
Output:
A message that describes the error.
at Test#do_even_more:14
at Test#do_something_else:10
at do_something:4
at test.toy:1
begin
# Statements raising an Exception
rescue
# Handle an Exception
ensure
# Always executed
end
The rescue keyword:
begin
# Statements raising an Exception
rescue error
# Access and handle an Exception using `error` variable
print error
end
In this section, we will cover lexical analysis as the first stage of the compiling process that divides the source code into language lexemes such as keyword, variable, operator, etc.
To define the lexemes in this toy-language implementation, I’m using the regular expressions listed in the TokenType enum:
package org.example.toylanguage.token;
...
public enum TokenType {
Comment("\\#.*"),
LineBreak("[\\n\\r]"),
Whitespace("[\\s\\t]"),
Keyword("(if|elif|else|end|print|input|class|fun|return|loop|in|by|break|next|assert)(?=\\s|$)(?!_)"),
GroupDivider("(\\[|\\]|\\,|\\{|}|\\.{2}|(\\:(?!\\:)))"),
Logical("(true|false)(?=\\s|$)(?!_)"),
Numeric("([-]?(?=[.]?[0-9])[0-9]*(?![.]{2})[.]?[0-9]*)"),
Null("(null)(?=,|\\s|$)(?!_)"),
This("(this)(?=,|\\s|$)(?!_)"),
Text("\"([^\"]*)\""),
Operator("(\\+|-|\\*{1,2}|/{1,2}|%|>=|>|<=|<{1,2}|={1,2}|!=|!|:{2}\\s+new|:{2}|\\(|\\)|(new|and|or|as|is)(?=\\s|$)(?!_))"),
Variable("[a-zA-Z_]+[a-zA-Z0-9_]*");
...
}
Every line of toy-language code is processed through these regular expressions, and with the help of LexicalParser, we transform source code into Token lexemes. To parse the new words declared in the Exception rules (raise, begin, rescue, ensure), we need to add them to the Keyword lexeme’s regular expression:
...
public enum TokenType {
...
Keyword("(if|elif|else|end|print|input|class|fun|return|loop|in|by|break|next|assert|raise|begin|rescue|ensure)(?=\\s|$)(?!_)"),
...
}
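To see how the lookaheads in the extended Keyword pattern behave, here is a minimal, standalone sketch. It only reuses the regular expression shown above; the class KeywordLexemeSketch and the method keywordAt are hypothetical names for illustration and are not part of the toy-language sources.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the toy-language sources.
class KeywordLexemeSketch {

    // The Keyword regex from TokenType, including the four new Exception words
    private static final Pattern KEYWORD = Pattern.compile(
            "(if|elif|else|end|print|input|class|fun|return|loop|in|by|break|next|assert"
                    + "|raise|begin|rescue|ensure)(?=\\s|$)(?!_)");

    // Returns the keyword found at the start of the line, or null if there is none
    static String keywordAt(String line) {
        Matcher matcher = KEYWORD.matcher(line);
        return matcher.lookingAt() ? matcher.group(1) : null;
    }
}
```

With this sketch, keywordAt("raise \"boom\"") yields raise, while keywordAt("raise_error") yields null, because the lookahead requires whitespace or the end of input right after the keyword. An identifier that merely starts with a keyword is therefore left for the Variable lexeme.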
In this section, we’ll convert the lexemes received from the lexical analysis into the final statements.
To convert the declared Keyword lexemes into statements, we first need to define the statements themselves. To implement a statement, we’re using the Statement interface. The RaiseExceptionStatement implementation should contain the raised expression:
package org.example.toylanguage.statement;
@RequiredArgsConstructor
@Getter
public class RaiseExceptionStatement implements Statement {
private final Expression expression;
@Override
public void execute() {
// TODO raise an Exception
}
}
The statement to handle an Exception will also implement the Statement interface, but unlike the RaiseExceptionStatement, it should include nested statements for each of these three blocks:
begin
# Begin block
rescue error
# Rescue block
ensure
# Ensure block
end
Each of these statements, being an implementation of CompositeStatement, will contain nested statements within itself, allowing multiple statements to be executed. To store the raised Exception in the error variable of the rescue block, we’ll define the errorVariable as a String field:
package org.example.toylanguage.statement;
@RequiredArgsConstructor
@Getter
public class HandleExceptionStatement implements Statement {
private final CompositeStatement bodyStatement;
private final CompositeStatement rescueStatement;
private final CompositeStatement ensureStatement;
private final String errorVariable;
@Override
public void execute() {
// TODO handle an Exception
}
}
The next step is to transform Exception tokens into the RaiseExceptionStatement and HandleExceptionStatement implementations. To convert tokens into statements, we use StatementParser. In this specific case, to convert the Keyword token, we need to modify the StatementParser#parseKeywordStatement(Token token) method, which picks a statement parser based on the first word of the statement.
Let’s add two new first words to the switch block to raise and handle an Exception: raise for RaiseExceptionStatement and begin for HandleExceptionStatement:
package org.example.toylanguage;
public class StatementParser {
...
private void parseKeywordStatement(Token token) {
switch (token.getValue()) {
...
case "raise":
parseRaiseExceptionStatement();
break;
case "begin":
parseHandleExceptionStatement();
break;
default:
throw new SyntaxException(String.format("Failed to parse a keyword: %s", token.getValue()));
}
}
...
}
In order to create the RaiseExceptionStatement, we only need to read the expression that is being raised. To read expressions, we use the ExpressionReader class, which parses a complete expression until it reaches the beginning of the next statement:
private void parseRaiseExceptionStatement() {
Expression expression = ExpressionReader.readExpression(tokens);
...
}
After creating the RaiseExceptionStatement, we need to add it to the StatementParser#compositeStatement as a nested statement in the outer statement:
private void parseRaiseExceptionStatement() {
Expression expression = ExpressionReader.readExpression(tokens);
RaiseExceptionStatement statement = new RaiseExceptionStatement(expression);
compositeStatement.addStatement(statement);
}
To create the HandleExceptionStatement, we need to read three blocks: begin (body), rescue, and ensure, not forgetting to read the end word at the end, which marks the end of the Exception handling statement:
private void parseHandleExceptionStatement() {
// read begin block
CompositeStatement beginStatement = ...;
// read rescue block
CompositeStatement rescueStatement = ...;
String errorVariable = ...;
// read ensure block
CompositeStatement ensureStatement = ...;
// skip the end keyword
tokens.next(TokenType.Keyword, "end");
// construct a statement
HandleExceptionStatement statement = new HandleExceptionStatement(beginStatement, rescueStatement, ensureStatement, errorVariable);
compositeStatement.addStatement(statement);
}
Let’s start with the begin block. To parse the nested statements inside it, we’ll need to use the StatementParser#parse(StatementParser, CompositeStatement, DefinitionScope) method. As the first argument, it accepts the StatementParser instance of the outer block of code. The second argument is the CompositeStatement, which will accumulate all the nested statements in a parsed block. The third argument is the DefinitionScope, which is used to store all the structures (classes and functions) declared inside a parsed block. To prevent the structures declared inside the begin block from being accessed from the outer block, we should open a new DefinitionScope:
// read begin block
CompositeStatement beginStatement = new CompositeStatement();
DefinitionScope beginScope = DefinitionContext.newScope();
StatementParser.parse(this, beginStatement, beginScope);
The StatementParser#parse(StatementParser, CompositeStatement, DefinitionScope) method will read all the nested statements until it reaches a finalizing word that marks the end of this block. Currently, to check whether we have reached a finalizing word, we use StatementParser#hasNextStatement(). Let’s add the new rescue and ensure words so that we stop parsing statements when we reach these blocks:
public class StatementParser {
...
private boolean hasNextStatement() {
if (!tokens.hasNext())
return false;
if (tokens.peek(TokenType.Operator, TokenType.Variable, TokenType.This))
return true;
if (tokens.peek(TokenType.Keyword)) {
return !tokens.peek(TokenType.Keyword, "elif", "rescue", "ensure", "else", "end");
}
return false;
}
...
}
Next, let's read the second block, rescue. It can be missing if a user doesn't want to catch and handle an Exception:
// read rescue block
CompositeStatement rescueStatement = ...;
String errorVariable = ...;
if (tokens.peek(TokenType.Keyword, "rescue")) {
tokens.next(); // skip rescue word
}
Before reading the nested statements, let’s check if the user specified a variable to refer to the raised Exception:
// read rescue block
CompositeStatement rescueStatement = ...;
String errorVariable = null;
if (tokens.peek(TokenType.Keyword, "rescue")) {
tokens.next(); // skip rescue word
if (tokens.peekSameLine(TokenType.Variable)) {
Token error = tokens.next();
errorVariable = error.getValue();
}
}
Now let’s read the nested statements as we previously read the begin statements:
// read rescue block
CompositeStatement rescueStatement = null;
String errorVariable = null;
if (tokens.peek(TokenType.Keyword, "rescue")) {
tokens.next(); // skip rescue word
if (tokens.peekSameLine(TokenType.Variable)) {
Token error = tokens.next();
errorVariable = error.getValue();
}
rescueStatement = new CompositeStatement();
DefinitionScope rescueScope = DefinitionContext.newScope();
StatementParser.parse(this, rescueStatement, rescueScope);
}
And finally, let’s read the third block, ensure. Like the rescue block, it is optional:
// read ensure block
CompositeStatement ensureStatement = null;
if (tokens.peek(TokenType.Keyword, "ensure")) {
tokens.next(); // skip ensure word
ensureStatement = new CompositeStatement();
DefinitionScope ensureScope = DefinitionContext.newScope();
StatementParser.parse(this, ensureStatement, ensureScope);
}
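The order of the three reads (a mandatory begin block, then an optional rescue with an optional variable, then an optional ensure, then end) can be tried out in isolation. The following is a simplified sketch that works on whole source lines instead of Token objects and does not support nested begin blocks; HandleBlockSketch and all of its members are hypothetical names, not the StatementParser API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical line-based stand-in for parseHandleExceptionStatement()
final class HandleBlockSketch {
    final List<String> body = new ArrayList<>();
    List<String> rescue;   // stays null when there is no rescue block
    List<String> ensure;   // stays null when there is no ensure block
    String errorVariable;  // stays null when rescue declares no variable

    private final List<String> lines;
    private int pos;

    HandleBlockSketch(List<String> lines) {
        this.lines = lines;
        expect("begin");
        readBlock(body);
        if (peekKeyword("rescue")) {
            // the variable, if any, must follow rescue on the same line
            String[] words = lines.get(pos++).trim().split("\\s+");
            if (words.length > 1) errorVariable = words[1];
            rescue = new ArrayList<>();
            readBlock(rescue);
        }
        if (peekKeyword("ensure")) {
            pos++; // skip ensure word
            ensure = new ArrayList<>();
            readBlock(ensure);
        }
        expect("end");
    }

    private boolean peekKeyword(String keyword) {
        if (pos >= lines.size()) return false;
        return lines.get(pos).trim().split("\\s+")[0].equals(keyword);
    }

    // Collects lines until a finalizing word is reached, like hasNextStatement()
    private void readBlock(List<String> target) {
        while (pos < lines.size()
                && !peekKeyword("rescue") && !peekKeyword("ensure") && !peekKeyword("end")) {
            target.add(lines.get(pos++).trim());
        }
    }

    private void expect(String keyword) {
        if (!peekKeyword(keyword)) throw new IllegalStateException("Expected " + keyword);
        pos++;
    }
}
```

For the lines begin, raise "x", rescue error, print error, end, the sketch fills body and rescue, sets errorVariable to error, and leaves ensure as null, which mirrors how the missing blocks are represented in HandleExceptionStatement.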
When we execute the RaiseExceptionStatement, each of the subsequent statements should be notified that the program crashed, and the execution should be stopped. To share this event between other statements, we’ll introduce the ExceptionContext class that will hold the Exception details:
package org.example.toylanguage.context;
public class ExceptionContext {
@Getter
private static Exception exception;
private static boolean raised;
@RequiredArgsConstructor
@Getter
public static class Exception {
private final Value<?> value;
@Override
public String toString() {
return value.toString();
}
}
}
The Exception class holds the details of the raised Exception. Later on, it will also record the path the program took before the failure, so that we can print a stack trace.
Next, we’ll add a few methods to raise and handle the exception:
public class ExceptionContext {
...
public static void raiseException(Value<?> value) {
exception = new Exception(value);
raised = true;
}
public static void rescueException() {
exception = null;
raised = false;
}
public static boolean isRaised() {
return raised;
}
}
Next, let’s complete RaiseExceptionStatement#execute() and notify the other statements through the ExceptionContext:
package org.example.toylanguage.statement;
public class RaiseExceptionStatement implements Statement {
private final Expression expression;
@Override
public void execute() {
Value<?> value = expression.evaluate();
ExceptionContext.raiseException(value);
}
}
If the user didn’t provide an error expression, we can fall back to a default text value:
public class RaiseExceptionStatement implements Statement {
private final Expression expression;
@Override
public void execute() {
Value<?> value = expression.evaluate();
if (value == NullValue.NULL_INSTANCE) {
value = new TextValue("Empty exception");
}
ExceptionContext.raiseException(value);
}
}
Knowing that the ExceptionContext will be notified about a raised Exception, we must make sure that no subsequent statements are executed once the Exception is raised. Currently, all statements in any block of code are executed by the CompositeStatement implementations. In every CompositeStatement implementation that iterates over nested statements with CompositeStatement#statements2Execute, we need to check after each executed statement whether an Exception has occurred and, if so, stop the execution:
package org.example.toylanguage.statement;
@Getter
public class CompositeStatement implements Statement {
...
@Override
public void execute() {
for (Statement statement : statements2Execute) {
statement.execute();
// stop the execution in case Exception occurred
if (ExceptionContext.isRaised())
return;
//stop the execution in case ReturnStatement is invoked
if (ReturnContext.getScope().isInvoked())
return;
}
}
}
package org.example.toylanguage.statement.loop;
public abstract class AbstractLoopStatement extends CompositeStatement {
...
@Override
public void execute() {
...
try {
...
while (hasNext()) {
...
try {
// execute inner statements
for (Statement statement : getStatements2Execute()) {
statement.execute();
// stop the execution in case Exception occurred
if (ExceptionContext.isRaised())
return;
// stop the execution in case ReturnStatement is invoked
if (ReturnContext.getScope().isInvoked())
return;
// stop the execution in case BreakStatement is invoked
if (BreakContext.getScope().isInvoked())
return;
// jump to the next iteration in case NextStatement is invoked
if (NextContext.getScope().isInvoked())
break;
}
} finally {
NextContext.reset();
MemoryContext.endScope(); // release each iteration memory
...
}
}
} finally {
MemoryContext.endScope(); // release loop memory
BreakContext.reset();
}
}
}
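The effect of these checks is that a raised Exception unwinds through every enclosing block without using Java exceptions at all. Below is a minimal sketch of that idea, assuming a single global flag in place of ExceptionContext; FlagPropagationSketch and its helper methods are hypothetical and only mimic the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of ExceptionContext plus CompositeStatement
class FlagPropagationSketch {
    static boolean raised;   // plays the role of ExceptionContext.isRaised()
    static String message;   // plays the role of the raised value
    static final List<String> log = new ArrayList<>();

    interface Statement {
        void execute();
    }

    // Stand-in for PrintStatement: records the text instead of printing it
    static Statement print(String text) {
        return () -> log.add(text);
    }

    // Stand-in for RaiseExceptionStatement
    static Statement raise(String text) {
        return () -> {
            message = text;
            raised = true;
        };
    }

    // Stand-in for CompositeStatement: stops as soon as the flag is set
    static Statement block(Statement... nested) {
        return () -> {
            for (Statement statement : nested) {
                statement.execute();
                if (raised) return;
            }
        };
    }

    // Resets the flag, runs the program and returns what was "printed"
    static List<String> run(Statement program) {
        raised = false;
        message = null;
        log.clear();
        program.execute();
        return new ArrayList<>(log);
    }
}
```

Running block(print("a"), block(raise("boom"), print("b")), print("c")) through run records only a: the inner block returns right after the raise, and the outer block sees the flag and returns as well, so neither b nor c is reached.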
With these changes in place, the statements will stop executing after an Exception is raised. At the end of program execution, we should check whether an Exception has been raised and, if so, print the Exception message:
package org.example.toylanguage;
public class ToyLanguage {
@SneakyThrows
public void execute(Path path) {
String source = Files.readString(path);
LexicalParser lexicalParser = new LexicalParser(source);
List<Token> tokens = lexicalParser.parse();
DefinitionContext.pushScope(DefinitionContext.newScope());
MemoryContext.pushScope(MemoryContext.newScope());
try {
CompositeStatement statement = new CompositeStatement();
StatementParser.parse(tokens, statement);
statement.execute();
} finally {
DefinitionContext.endScope();
MemoryContext.endScope();
if (ExceptionContext.isRaised()) {
ExceptionContext.printStackTrace();
}
}
}
}
To print an Exception, we’ll be using the ExceptionContext#printStackTrace() method, which later on will display the records of the application’s movement as well:
public class ExceptionContext {
...
public static void printStackTrace() {
System.err.println(exception);
}
}
To handle an Exception, let’s finish the HandleExceptionStatement#execute() implementation. It will consist of three parts for each of the defined blocks:
public class HandleExceptionStatement implements Statement {
private final CompositeStatement bodyStatement;
private final CompositeStatement rescueStatement;
private final CompositeStatement ensureStatement;
private final String errorVariable;
@Override
public void execute() {
//begin block
// rescue block
// ensure block
}
}
Each of the blocks should be executed in a new MemoryScope, restricting access to the variables declared in the nested block from the outer block:
//begin block
MemoryContext.pushScope(MemoryContext.newScope());
try {
bodyStatement.execute();
} finally {
MemoryContext.endScope();
}
The rescue block is optional and should be executed only if we caught an Exception in the ExceptionContext:
// rescue block
if (rescueStatement != null && ExceptionContext.isRaised()) {
MemoryContext.pushScope(MemoryContext.newScope());
try {
rescueStatement.execute();
} finally {
MemoryContext.endScope();
}
}
If this block rescues an Exception, we should inform the ExceptionContext that the error has been caught:
// rescue block
if (rescueStatement != null && ExceptionContext.isRaised()) {
MemoryContext.pushScope(MemoryContext.newScope());
ExceptionContext.rescueException();
try {
rescueStatement.execute();
} finally {
MemoryContext.endScope();
}
}
Lastly, for this block, we should initialize the error variable provided by a user with the Exception’s value retrieved from ExceptionContext:
// rescue block
if (rescueStatement != null && ExceptionContext.isRaised()) {
MemoryContext.pushScope(MemoryContext.newScope());
if (errorVariable != null) {
MemoryContext.getScope().setLocal(errorVariable, ExceptionContext.getException().getValue());
}
ExceptionContext.rescueException();
try {
rescueStatement.execute();
} finally {
MemoryContext.endScope();
}
}
The third block, ensure, is optional as well, just like the rescue block:
// ensure block
if (ensureStatement != null) {
MemoryContext.pushScope(MemoryContext.newScope());
try {
ensureStatement.execute();
} finally {
MemoryContext.endScope();
}
}
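Put together, the three parts run in a fixed order: the body always runs, the rescue block runs only when an Exception is pending, and the ensure block runs either way. Here is a compact sketch of that control flow, assuming a plain Map as a stand-in for a memory scope and a nullable String as the pending Exception; HandleExecutionSketch, Block and handle are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of HandleExceptionStatement#execute()
class HandleExecutionSketch {
    static String raisedValue; // non-null while an Exception is pending
    static final List<String> output = new ArrayList<>();

    interface Block {
        void run(Map<String, String> scope);
    }

    static void handle(Block body, Block rescue, Block ensure, String errorVariable) {
        // begin block, executed in its own scope
        body.run(new HashMap<>());

        // rescue block, executed only when an Exception is pending
        if (rescue != null && raisedValue != null) {
            Map<String, String> scope = new HashMap<>();
            if (errorVariable != null) {
                scope.put(errorVariable, raisedValue); // expose the error to the handler
            }
            raisedValue = null; // mark the Exception as rescued before handling it
            rescue.run(scope);
        }

        // ensure block, executed whether or not an Exception occurred
        if (ensure != null) {
            ensure.run(new HashMap<>());
        }
    }
}
```

Clearing the pending Exception before the rescue block runs matters: otherwise the flag would still be set while the handler executes, and a flag-checking block would stop after its first statement.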
In this subsection, we’ll collect records of the application's movement during its execution and display a complete stack trace for raised exceptions:
A message that describes the error.
at Test#do_even_more:14
at Test#do_something_else:10
at do_something:4
at test.toy:1
To collect a stack trace, each of our Statement implementations should contain information about the block name and the row number. We can transform the Statement interface into an abstract class defining these two fields, blockName and rowNumber:
package org.example.toylanguage.statement;
@RequiredArgsConstructor
@Getter
public abstract class Statement {
private final Integer rowNumber;
private final String blockName;
public abstract void execute();
}
The rowNumber can be accessed from the Token containing a word that marks the start of a statement:
public class StatementParser {
...
private void parseKeywordStatement(Token rowToken) {
switch (rowToken.getValue()) {
case "print":
parsePrintStatement(rowToken);
break;
case "input":
parseInputStatement(rowToken);
break;
case "if":
parseConditionStatement(rowToken);
break;
case "class":
parseClassDefinition(rowToken);
break;
case "fun":
parseFunctionDefinition(rowToken);
break;
case "return":
parseReturnStatement(rowToken);
break;
case "loop":
parseLoopStatement(rowToken);
break;
case "break":
parseBreakStatement(rowToken);
break;
case "next":
parseNextStatement(rowToken);
break;
case "assert":
parseAssertStatement(rowToken);
break;
case "raise":
parseRaiseExceptionStatement(rowToken);
break;
case "begin":
parseHandleExceptionStatement(rowToken);
break;
default:
throw new SyntaxException(String.format("Failed to parse a keyword: %s", rowToken.getValue()));
}
}
...
}
The structures in the toy-language we currently have are classes and functions.
To set ClassStatement#blockName, we can use the class name obtained from ClassDetails#getName():
public class StatementParser {
...
private void parseClassDefinition(Token rowToken) {
// read class details
ClassDetails classDetails = readClassDetails();
...
// add class definition
...
ClassStatement classStatement = new ClassStatement(rowToken.getRow(), classDetails.getName());
...
//parse class's statements
...
}
...
}
To set FunctionStatement#blockName, we can use the function name. In addition to the name, we can specify a class name if the function is declared inside the class:
public class StatementParser {
...
private void parseFunctionDefinition(Token rowToken) {
Token type = tokens.next(TokenType.Variable);
...
//add function definition
String blockName = type.getValue();
if (compositeStatement instanceof ClassStatement) {
blockName = compositeStatement.getBlockName() + "#" + blockName;
}
FunctionStatement functionStatement = new FunctionStatement(rowToken.getRow(), blockName);
...
}
...
}
Other statements do not define structures and can reuse classes’ and functions’ block names by referring to the outer block of code with StatementParser#compositeStatement#getBlockName(), e.g.:
public class StatementParser {
...
private void parsePrintStatement(Token rowToken) {
...
PrintStatement statement = new PrintStatement(rowToken.getRow(), compositeStatement.getBlockName(), expression);
...
}
...
private void parseInputStatement(Token rowToken) {
...
InputStatement statement = new InputStatement(rowToken.getRow(), compositeStatement.getBlockName(), variable.getValue(), scanner::nextLine);
...
}
...
}
With the current way of creating the root CompositeStatement, we have to provide a root name in the ToyLanguage class, which could be defined as a file name:
public class ToyLanguage {
@SneakyThrows
public void execute(Path path) {
String sourceCode = Files.readString(path);
List<Token> tokens = LexicalParser.parse(sourceCode);
DefinitionContext.pushScope(DefinitionContext.newScope());
MemoryContext.pushScope(MemoryContext.newScope());
try {
CompositeStatement statement = new CompositeStatement(null, path.getFileName().toString());
StatementParser.parse(tokens, statement);
statement.execute();
} finally {
DefinitionContext.endScope();
MemoryContext.endScope();
if (ExceptionContext.isRaised()) {
ExceptionContext.printStackTrace();
}
}
}
}
Now each Statement contains the block name and the row number. Let’s add a collection of statements to the ExceptionContext#Exception:
public class ExceptionContext {
@Getter
private static Exception exception;
private static boolean raised;
public static void raiseException(Value<?> value) {
exception = new Exception(value, new Stack<>());
raised = true;
}
public static void rescueException() {
exception = null;
raised = false;
}
public static boolean isRaised() {
return raised;
}
public static void addTracedStatement(Statement statement) {
if (isRaised()) {
exception.stackTrace.add(statement);
}
}
public static void printStackTrace() {
System.err.println(exception);
rescueException();
}
@RequiredArgsConstructor
@Getter
public static class Exception {
private final Value<?> value;
private final List<Statement> stackTrace;
@Override
public String toString() {
return String.format("%s%n%s",
value,
stackTrace
.stream()
.map(st -> String.format("%4sat %s:%d", "", st.getBlockName(), st.getRowNumber()))
.collect(Collectors.joining("\n"))
);
}
}
}
The ExceptionContext#addTracedStatement(Statement) should be invoked by every Statement containing an Expression after calling Expression#evaluate():
package org.example.toylanguage.statement;
public class ExpressionStatement extends Statement {
...
@Override
public void execute() {
expression.evaluate();
ExceptionContext.addTracedStatement(this);
}
}
package org.example.toylanguage.statement;
public class PrintStatement extends Statement {
...
@Override
public void execute() {
Value<?> value = expression.evaluate();
System.out.println(value);
ExceptionContext.addTracedStatement(this);
}
}
package org.example.toylanguage.statement;
public class RaiseExceptionStatement extends Statement {
...
@Override
public void execute() {
Value<?> value = expression.evaluate();
if (value == NullValue.NULL_INSTANCE) {
value = new TextValue("Empty exception");
}
ExceptionContext.raiseException(value);
ExceptionContext.addTracedStatement(this);
}
}
package org.example.toylanguage.statement;
public class ReturnStatement extends Statement {
...
@Override
public void execute() {
Value<?> result = expression.evaluate();
ReturnContext.getScope().invoke(result);
ExceptionContext.addTracedStatement(this);
}
}
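Because every statement registers itself only after its expression has been evaluated, the trace is filled from the innermost statement outwards while the flag unwinds the call chain. The following sketch reproduces the sample trace from the beginning of this section under that assumption; StackTraceSketch and its methods are hypothetical stand-ins, with nested lambdas playing the role of function calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of ExceptionContext#addTracedStatement
class StackTraceSketch {
    static String raisedMessage; // non-null while an Exception is pending
    static final List<String> trace = new ArrayList<>();

    interface Action {
        void run();
    }

    // A traced statement: evaluates first, then records itself if an Exception is pending
    static void statement(String blockName, int row, Action action) {
        action.run();
        if (raisedMessage != null) {
            trace.add(String.format("%4sat %s:%d", "", blockName, row));
        }
    }

    static void raise(String message) {
        raisedMessage = message;
    }

    static String render() {
        return raisedMessage + "\n" + String.join("\n", trace);
    }

    // Mirrors the call chain of the sample program: rows 1, 4, 10 and 14
    static String demo() {
        raisedMessage = null;
        trace.clear();
        statement("test.toy", 1, () ->
                statement("do_something", 4, () ->
                        statement("Test#do_something_else", 10, () ->
                                statement("Test#do_even_more", 14, () ->
                                        raise("A message that describes the error.")))));
        return render();
    }
}
```

The innermost statement (row 14) is recorded first and the top-level call (row 1) last, which gives the same ordering as the trace shown at the start of this section.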
In this part, we implemented a simple model to raise and handle exceptions. One more step towards making a complete programming language. Here are some examples you can run: raise_exception.toy and handle_exception.toy.