In this part of creating your own programming language, we will implement classes as an extension over the previously defined structures. Please check out the previous parts: Building Your Own Programming Language From Scratch Building Your Own Programming Language From Scratch: Part II - Dijkstra's Two-Stack Algorithm Build Your Own Programming Language Part III: Improving Lexical Analysis with Regex Lookaheads Building Your Own Programming Language From Scratch Part IV: Implementing Functions Building Your Own Programming Language From Scratch: Part V - Arrays Building Your Own Programming Language From Scratch: Part VI - Loops The full source code is available . over on GitHub 1. Lexical Analysis In the first section, we will cover lexical analysis. In short, it’s a process to divide the source code into language lexemes, such as keyword, variable, operator, etc. You may remember from the previous parts that I was using the regex expressions in the enum to define all the lexeme types. TokenType Let’s look at the following class prototype and add the missing regex parts to our lexemes: TokenType First, we need to put the word in the lexeme's expression to let the lexical analyzer know where our class declaration begins: class Keyword package org.example.toylanguage.token; ... public enum TokenType { ... Keyword("(if|elif|else|end|print|input|fun|return|loop|in|by|break|next|class)(?=\\s|$)"), ... private final String regex; } 2) Second, we need the new lexeme as a marker for a reference to the current object: This public enum TokenType { ... This("(this)(?=,|\\s|$)"); private final String regex; } 2. Syntax Analysis In the second section, we will transform the lexemes received from the lexical analyzer into the final statements following our language rules. 2.1 Definition Scope When we declare a class or a function, this declaration should be available within the defined isolated bounds. For example, if we declare a function named in the following listing, it will be available for execution after declaration: turn_on [] But if we declare the same function inside a class scope, this function will no longer be accessed directly from the main block: To implement these definition bounds, we’ll create the class and store all the declared definitions inside two sets for classes and functions: DefinitionScope package org.example.toylanguage.context.definition; public class DefinitionScope { private final Set<ClassDefinition> classes; private final Set<FunctionDefinition> functions; public DefinitionScope() { this.classes = new HashSet<>(); this.functions = new HashSet<>(); } } 2) In addition, we may want to access the parent’s definition scope. For example, if we declare two separate classes and create an instance of the first class inside the second one: To provide this ability, we will add the parent instance as a reference to the upper layer, which we will use to climb to the top definition layer. DefinitionScope public class DefinitionScope { private final Set<ClassDefinition> classes; private final Set<FunctionDefinition> functions; private final DefinitionScope parent; public DefinitionScope(DefinitionScope parent) { this.classes = new HashSet<>(); this.functions = new HashSet<>(); this.parent = parent; } } 3) Now, let’s finish the implementation by providing the interfaces to add a definition and retrieve it by name using parent scope: public class DefinitionScope { … public ClassDefinition getClass(String name) { Optional<ClassDefinition> classDefinition = classes.stream() .filter(t -> t.getName().equals(name)) .findAny(); if (classDefinition.isPresent()) return classDefinition.get(); else if (parent != null) return parent.getClass(name); else throw new ExecutionException(String.format("Class is not defined: %s", name)); } public void addClass(ClassDefinition classDefinition) { classes.add(classDefinition); } public FunctionDefinition getFunction(String name) { Optional<FunctionDefinition> functionDefinition = functions.stream() .filter(t -> t.getName().equals(name)) .findAny(); if (functionDefinition.isPresent()) return functionDefinition.get(); else if (parent != null) return parent.getFunction(name); else throw new ExecutionException(String.format("Function is not defined: %s", name)); } public void addFunction(FunctionDefinition functionDefinition) { functions.add(functionDefinition); } } 4) Finally, to manage declared definition scopes and switch between them, we create the context class using collection (LIFO): java.util.Stack package org.example.toylanguage.context.definition; public class DefinitionContext { private final static Stack<DefinitionScope> scopes = new Stack<>(); public static DefinitionScope getScope() { return scopes.peek(); } public static DefinitionScope newScope() { return new DefinitionScope(scopes.isEmpty() ? null : scopes.peek()); } public static void pushScope(DefinitionScope scope) { scopes.push(scope); } public static void endScope() { scopes.pop(); } } 2.2 Memory scope In this section, we will cover the to manage class and function variables. MemoryScope Each declared variable, similarly to the class or function definition, should be accessible only within an isolated block of code. For example, if we define a variable in the following listing, you can access it right after the declaration: But if we declare a variable within a function or a class, the variable will no longer be available from the main (upper) block of code: To implement this logic and store variables defined in a specific scope, we create the class that will contain a map with the variable name as a key and the variable as a value: MemoryScope Value public class MemoryScope { private final Map<String, Value<?>> variables; public MemoryScope() { this.variables = new HashMap<>(); } } 2) Next, similarly to the , we provide access to the parent’s scope variables: DefinitionScope public class MemoryScope { private final Map<String, Value<?>> variables; private final MemoryScope parent; public MemoryScope(MemoryScope parent) { this.variables = new HashMap<>(); this.parent = parent; } } 3) Next, we add methods to get and set variables. When we set a variable, we always re-assign the previously set value if there is already a defined variable in the upper layer: MemoryScope public class MemoryScope { ... public Value<?> get(String name) { Value<?> value = variables.get(name); if (value != null) return value; else if (parent != null) return parent.get(name); else return NullValue.NULL_INSTANCE; } public void set(String name, Value<?> value) { MemoryScope variableScope = findScope(name); if (variableScope == null) { variables.put(name, value); } else { variableScope.variables.put(name, value); } } private MemoryScope findScope(String name) { if (variables.containsKey(name)) return this; return parent == null ? null : parent.findScope(name); } } 4) In addition to the and methods, we add two more implementations to interact with the current (local) layer of : set get MemoryScope public class MemoryScope { ... public Value<?> getLocal(String name) { return variables.get(name); } public void setLocal(String name, Value<?> value) { variables.put(name, value); } } These methods will be used later to initialize function arguments or class’s instance arguments. For example, if we create an instance of the class and pass the predefined global variable, this variable shouldn’t be changed when we try to update the property: Lamp type lamp_instance :: type 5) Finally, to manage variables and switch between memory scopes, we create the implementation using collection: MemoryContext java.util.Stack package org.example.toylanguage.context; public class MemoryContext { private static final Stack<MemoryScope> scopes = new Stack<>(); public static MemoryScope getScope() { return scopes.peek(); } public static MemoryScope newScope() { return new MemoryScope(scopes.isEmpty() ? null : scopes.peek()); } public static void pushScope(MemoryScope scope) { scopes.push(scope); } public static void endScope() { scopes.pop(); } } 2.3 Class Definition In this section, we will read and store class definitions. First, we create the implementation. This statement will be executed every time we create a class’s instance: Statement package org.example.toylanguage.statement; public class ClassStatement { } 2) Each class can contain nested statements for the initialization and other operations, like a constructor. To store these statements, we extend the which contains the list of nested statements to execute: CompositeStatement public class ClassStatement extends CompositeStatement { } 3) Next, we declare the to store the class name, its arguments, constructor statements, and the definition scope with the class’s functions: ClassDefinition package org.example.toylanguage.context.definition; import java.util.List; @RequiredArgsConstructor @Getter @EqualsAndHashCode(onlyExplicitlyIncluded = true) public class ClassDefinition implements Definition { @EqualsAndHashCode.Include private final String name; private final List<String> arguments; private final ClassStatement statement; private final DefinitionScope definitionScope; } 4) Now, we’re ready to read the class declaration using . When we meet the Keyword lexeme with value, first we need to read the class name and its arguments inside the square brackets: StatementParser class package org.example.toylanguage; public class StatementParser { ... private void parseClassDefinition() { Token type = tokens.next(TokenType.Variable); List<String> arguments = new ArrayList<>(); if (tokens.peek(TokenType.GroupDivider, "[")) { tokens.next(TokenType.GroupDivider, "["); //skip opening square bracket while (!tokens.peek(TokenType.GroupDivider, "]")) { Token argumentToken = tokens.next(TokenType.Variable); arguments.add(argumentToken.getValue()); if (tokens.peek(TokenType.GroupDivider, ",")) tokens.next(); } tokens.next(TokenType.GroupDivider, "]"); //skip closing square bracket } } } 5) After the arguments, we expect to read the nested constructor statements: To store these statements, we create an instance of the previously defined : ClassStatement private void parseClassDefinition() { ... ClassStatement classStatement = new ClassStatement(); } 6) In addition to arguments and nested statements, our classes can also contain functions. To make these functions accessible only within the class definition, we initialize a new layer of : DefinitionScope private void parseClassDefinition() { ... ClassStatement classStatement = new ClassStatement(); DefinitionScope classScope = DefinitionContext.newScope(); } 7) Next, we initialize an instance of and put it to the : ClassDefinition DefinitionContext private void parseClassDefinition() { ... ClassStatement classStatement = new ClassStatement(); DefinitionScope classScope = DefinitionContext.newScope(); ClassDefinition classDefinition = new ClassDefinition(type.getValue(), arguments, classStatement, classScope); DefinitionContext.getScope().addClass(classDefinition); } 8) Finally, to read the class constructor statements and functions, we call the static method that will collect statements inside instance until we meet the finalizing lexeme that must be skipped at the end: StatementParser#parse() classStatement end private void parseClassDefinition() { ... //parse class statements StatementParser.parse(this, classStatement, classScope); tokens.next(TokenType.Keyword, "end"); } 2.4 Class Instance At this point, we can already read class definitions with constructor statements and functions. Now, let’s parse the class’s instance: First, we define , which will contain the state of each class instance. Classes, unlike functions, should have a permanent and this scope should be available with all the class's instance arguments and state variables every time we interact with the class instance: ClassValue MemoryScope public class ClassValue extends IterableValue<ClassDefinition> { private final MemoryScope memoryScope; public ClassValue(ClassDefinition definition, MemoryScope memoryScope) { super(definition); this.memoryScope = memoryScope; } } 2) Next, we provide methods to work with the class’s instance properties using : MemoryContext public class ClassValue extends IterableValue<ClassDefinition> { private final MemoryScope memoryScope; public ClassValue(ClassDefinition definition, MemoryScope memoryScope) { super(definition); this.memoryScope = memoryScope; } @Override public String toString() { return getValue().getArguments().stream() .map(t -> t + " = " + getValue(t)) .collect(Collectors.joining(", ", getValue().getName() + " [ ", " ]")); } public Value<?> getValue(String name) { Value<?> result = MemoryContext.getScope().getLocal(name); return result != null ? result : NULL_INSTANCE; } public void setValue(String name, Value<?> value) { MemoryContext.getScope().setLocal(name, value); } } 3) Please notice that by calling the and methods, we work with the current layer of variables. But before accessing the class’s instance state, we need to put its to the and release it when we finish: MemoryScope#getLocal() MemoryScope#setLocal() MemoryScope MemoryScope MemoryContext public class ClassValue extends IterableValue<ClassDefinition> { ... public Value<?> getValue(String name) { MemoryContext.pushScope(memoryScope); try { Value<?> result = MemoryContext.getScope().getLocal(name); return result != null ? result : NULL_INSTANCE; } finally { MemoryContext.endScope(); } } public void setValue(String name, Value<?> value) { MemoryContext.pushScope(memoryScope); try { MemoryContext.getScope().setLocal(name, value); } finally { MemoryContext.endScope(); } } } 4) Next, we can implement the remaining that will be used to construct defined class instances during syntax analysis. To declare class’s instance definition, we provide the and list of arguments that will be transformed into the final instances during statement execution: ClassExpression ClassDefinition Expression Value package org.example.toylanguage.expression; @RequiredArgsConstructor @Getter public class ClassExpression implements Expression { private final ClassDefinition definition; private final List<Expression> arguments; @Override public Value<?> evaluate() { ... } } 5) Let's implement the method that will be used during execution to create an instance of the previously defined . First, we evaluate the arguments into the arguments: Expression#evaluate() ClassValue Expression Value @Override public Value<?> evaluate() { //initialize class arguments List<Value<?>> values = arguments.stream() .map(Expression::evaluate) .collect(Collectors.toList()); } 6) Next, we create an empty memory scope that should be isolated from the other variables and can contain only the class's instance state variables: @Override public Value<?> evaluate() { //initialize class arguments List<Value<?>> values = arguments.stream().map(Expression::evaluate).collect(Collectors.toList()); //get class's definition and statement ClassStatement classStatement = definition.getStatement(); //set separate scope MemoryScope classScope = new MemoryScope(null); MemoryContext.pushScope(classScope); try { ... } finally { MemoryContext.endScope(); } } 7) Next, we create an instance of and write the class’s arguments to the isolated memory scope: ClassValue Value try { //initialize constructor arguments ClassValue classValue = new ClassValue(definition, classScope); IntStream.range(0, definition.getArguments().size()).boxed() .forEach(i -> MemoryContext.getScope() .setLocal(definition.getArguments().get(i), values.size() > i ? values.get(i) : NullValue.NULL_INSTANCE)); ... } finally { MemoryContext.endScope(); } Please notice that we transformed the arguments into the arguments before setting up an empty . Otherwise, we will not be able to access the class’s instance arguments, for example: Expression Value MemoryScope 8) And finally, we can execute the . But before that, we should set the class’s to be able to access the class’s functions within constructor statements: ClassStatement DefinitionScope try { //initialize constructor arguments ClassValue classValue = new ClassValue(definition, classScope); ClassInstanceContext.pushValue(classValue); IntStream.range(0, definition.getArguments().size()).boxed() .forEach(i -> MemoryContext.getScope() .setLocal(definition.getArguments().get(i), values.size() > i ? values.get(i) : NullValue.NULL_INSTANCE)); //execute function body DefinitionContext.pushScope(definition.getDefinitionScope()); try { classStatement.execute(); } finally { DefinitionContext.endScope(); } return classValue; } finally { MemoryContext.endScope(); ClassInstanceContext.popValue(); } 9*) One last thing, we can make the classes more flexible and allow a user to create class instances before declaring the class definition: This can be done by delegating the initialization to the and access it only when we evaluate an expression: ClassDefinition DefinitionContext public class ClassExpression implements Expression { private final String name; private final List<Expression> arguments; @Override public Value<?> evaluate() { //get class's definition and statement ClassDefinition definition = DefinitionContext.getScope().getClass(name); ... } } You can do the same delegation for the to invoke functions before definition. FunctionExpression 10) Finally, we can finish reading the class instances with the . There is no difference between the . We just need to read the arguments and construct the : ExpressionReader previously defined structure instances Expression ClassExpression public class ExpressionReader ... // read class instance: new Class[arguments] private ClassExpression readClassInstance(Token token) { List<Expression> arguments = new ArrayList<>(); if (tokens.peekSameLine(TokenType.GroupDivider, "[")) { tokens.next(TokenType.GroupDivider, "["); //skip opening square bracket while (!tokens.peekSameLine(TokenType.GroupDivider, "]")) { Expression value = ExpressionReader.readExpression(this); arguments.add(value); if (tokens.peekSameLine(TokenType.GroupDivider, ",")) tokens.next(); } tokens.next(TokenType.GroupDivider, "]"); //skip closing square bracket } return new ClassExpression(token.getValue(), arguments); } } 2.5 Class Function At this moment, we can create a class and execute the class's constructor statements. But we are still unable to execute the class's functions. Let’s overload the method that will accept the as a reference to the class instance we want to invoke a function from: FunctionExpression#evaluate ClassValue package org.example.toylanguage.expression; public class FunctionExpression implements Expression { ... public Value<?> evaluate(ClassValue classValue) { } } 2) The next step is to transform the function arguments into the arguments using current : Expression Value MemoryScope public Value<?> evaluate(ClassValue classValue) { //initialize function arguments List<Value<?>> values = argumentExpressions.stream() .map(Expression::evaluate) .collect(Collectors.toList()); } 3) Next, we need to pass the class's and to the context: MemoryScope DefinitionScope ... //get definition and memory scopes from class definition ClassDefinition classDefinition = classValue.getValue(); DefinitionScope classDefinitionScope = classDefinition.getDefinitionScope(); //set class's definition and memory scopes DefinitionContext.pushScope(classDefinitionScope); MemoryContext.pushScope(memoryScope); 4) Lastly for this implementation, we invoke default method and pass evaluated arguments: FunctionExpression#evaluate(List<Value<?>> values) Value public Value<?> evaluate(ClassValue classValue) { //initialize function arguments List<Value<?>> values = argumentExpressions.stream().map(Expression::evaluate).collect(Collectors.toList()); //get definition and memory scopes from class definition ClassDefinition classDefinition = classValue.getValue(); DefinitionScope classDefinitionScope = classDefinition.getDefinitionScope(); MemoryScope memoryScope = classValue.getMemoryScope(); //set class's definition and memory scopes DefinitionContext.pushScope(classDefinitionScope); MemoryContext.pushScope(memoryScope); try { //proceed function return evaluate(values); } finally { DefinitionContext.endScope(); MemoryContext.endScope(); } } 5) To invoke the class’s function, we will use the double colon operator. Currently, this operator is managed by the ( ) implementation, which is responsible for accessing the class’s properties: :: ClassPropertyOperator StructureValueOperator Let’s improve it to support function invocations with the same double colon character: :: Class’s function can be managed by this operator only when the left expression is the and the second one is the : ClassExpression FunctionExpression package org.example.toylanguage.expression.operator; public class ClassPropertyOperator extends BinaryOperatorExpression implements AssignExpression { public ClassPropertyOperator(Expression left, Expression right) { super(left, right); } @Override public Value<?> evaluate() { Value<?> left = getLeft().evaluate(); if (left instanceof ClassValue) { if (getRight() instanceof VariableExpression) { // access class's property // new ClassInstance[] :: class_argument return ((ClassValue) left).getValue(((VariableExpression) getRight()).getName()); } else if (getRight() instanceof FunctionExpression) { // execute class's function // new ClassInstance[] :: class_function [] return ((FunctionExpression) getRight()).evaluate((ClassValue) left); } } throw new ExecutionException(String.format("Unable to access class's property `%s``", getRight())); } @Override public void assign(Value<?> value) { Value<?> left = getLeft().evaluate(); if (left instanceof ClassValue && getRight() instanceof VariableExpression) { String propertyName = ((VariableExpression) getRight()).getName(); ((ClassValue) left).setValue(propertyName, value); } } } 3. Wrap Up In this part, we implemented classes. We can now create more complex things, such as stack implementation: main [] fun main [] stack = new Stack [] loop num in 0..5 # push 0,1,2,3,4 stack :: push [ num ] end size = stack :: size [] loop i in 0..size # should print 4,3,2,1,0 print stack :: pop [] end end class Stack [] arr = {} n = 0 fun push [ item ] this :: arr << item n = n + 1 end fun pop [] n = n - 1 item = arr { n } arr { n } = null return item end fun size [] return this :: n end end