In this part of creating your own programming language, we will implement classes as an extension over the previously defined structures. Please check out the previous parts: \ 1. [Building Your Own Programming Language From Scratch](https://hackernoon.com/building-your-own-programming-language-from-scratch?ref=hackernoon.com) 2. [Building Your Own Programming Language From Scratch: Part II - Dijkstra's Two-Stack Algorithm](https://hackernoon.com/building-your-own-programming-language-from-scratch-part-ii-dijkstras-two-stack-algorithm?ref=hackernoon.com) 3. [Build Your Own Programming Language Part III: Improving Lexical Analysis with Regex Lookaheads](https://hackernoon.com/build-your-own-programming-language-part-iii-improving-lexical-analysis-with-regex-lookaheads?ref=hackernoon.com) 4. [Building Your Own Programming Language From Scratch Part IV: Implementing Functions](https://hackernoon.com/building-your-own-programming-language-from-scratch-part-iv-implementing-functions?ref=hackernoon.com) 5. [Building Your Own Programming Language From Scratch: Part V - Arrays](https://hackernoon.com/building-your-own-programming-language-from-scratch-part-v-arrays?ref=hackernoon.com) 6. [Building Your Own Programming Language From Scratch: Part VI - Loops](https://hackernoon.com/building-your-own-programming-language-from-scratch-part-vi-loops) **The full source code is available __[over on GitHub](https://github.com/alexandermakeev/toy-language)__.** \ ## 1. Lexical Analysis In the first section, we will cover lexical analysis. In short, it’s a process to divide the source code into language lexemes, such as keyword, variable, operator, etc. \ You may remember from the previous parts that I was using the regex expressions in the [TokenType](https://github.com/alexandermakeev/toy-language/blob/master/src/main/java/org/example/toylanguage/token/TokenType.java) enum to define all the lexeme types. \ Let’s look at the following class prototype and add the missing regex parts to our `TokenType` lexemes:  \ 1) First, we need to put the `class` word in the `Keyword` lexeme's expression to let the lexical analyzer know where our class declaration begins: ```java package org.example.toylanguage.token; ... public enum TokenType { ... Keyword("(if|elif|else|end|print|input|fun|return|loop|in|by|break|next|class)(?=\\s|$)"), ... private final String regex; } ``` \ 2) Second, we need the new `This` lexeme as a marker for a reference to the current object: ```java public enum TokenType { ... This("(this)(?=,|\\s|$)"); private final String regex; } ``` \ ## **2. Syntax Analysis** In the second section, we will transform the lexemes received from the lexical analyzer into the final statements following our language rules. ## **2.1 Definition Scope** When we declare a class or a function, this declaration should be available within the defined isolated bounds. For example, if we declare a function named `turn_on []` in the following listing, it will be available for execution after declaration:  \ But if we declare the same function inside a class scope, this function will no longer be accessed directly from the main block:  \ 1) To implement these definition bounds, we’ll create the `DefinitionScope` class and store all the declared definitions inside two sets for classes and functions: ```java package org.example.toylanguage.context.definition; public class DefinitionScope { private final Set<ClassDefinition> classes; private final Set<FunctionDefinition> functions; public DefinitionScope() { this.classes = new HashSet<>(); this.functions = new HashSet<>(); } } ``` \ 2) In addition, we may want to access the parent’s definition scope. For example, if we declare two separate classes and create an instance of the first class inside the second one:  \ To provide this ability, we will add the `DefinitionScope` parent instance as a reference to the upper layer, which we will use to climb to the top definition layer. ```ruby public class DefinitionScope { private final Set<ClassDefinition> classes; private final Set<FunctionDefinition> functions; private final DefinitionScope parent; public DefinitionScope(DefinitionScope parent) { this.classes = new HashSet<>(); this.functions = new HashSet<>(); this.parent = parent; } } ``` \ 3) Now, let’s finish the implementation by providing the interfaces to add a definition and retrieve it by name using parent scope: ```ruby public class DefinitionScope { … public ClassDefinition getClass(String name) { Optional<ClassDefinition> classDefinition = classes.stream() .filter(t -> t.getName().equals(name)) .findAny(); if (classDefinition.isPresent()) return classDefinition.get(); else if (parent != null) return parent.getClass(name); else throw new ExecutionException(String.format("Class is not defined: %s", name)); } public void addClass(ClassDefinition classDefinition) { classes.add(classDefinition); } public FunctionDefinition getFunction(String name) { Optional<FunctionDefinition> functionDefinition = functions.stream() .filter(t -> t.getName().equals(name)) .findAny(); if (functionDefinition.isPresent()) return functionDefinition.get(); else if (parent != null) return parent.getFunction(name); else throw new ExecutionException(String.format("Function is not defined: %s", name)); } public void addFunction(FunctionDefinition functionDefinition) { functions.add(functionDefinition); } } ``` \ 4) Finally, to manage declared definition scopes and switch between them, we create the context class using `java.util.Stack` collection (LIFO): ```java package org.example.toylanguage.context.definition; public class DefinitionContext { private final static Stack<DefinitionScope> scopes = new Stack<>(); public static DefinitionScope getScope() { return scopes.peek(); } public static DefinitionScope newScope() { return new DefinitionScope(scopes.isEmpty() ? null : scopes.peek()); } public static void pushScope(DefinitionScope scope) { scopes.push(scope); } public static void endScope() { scopes.pop(); } } ``` \ ## **2.2 Memory scope** In this section, we will cover the `MemoryScope` to manage class and function variables. \ 1) Each declared variable, similarly to the class or function definition, should be accessible only within an isolated block of code. For example, if we define a variable in the following listing, you can access it right after the declaration:  \ But if we declare a variable within a function or a class, the variable will no longer be available from the main (upper) block of code:  \ To implement this logic and store variables defined in a specific scope, we create the `MemoryScope` class that will contain a map with the variable name as a key and the variable `Value` as a value: ```java public class MemoryScope { private final Map<String, Value<?>> variables; public MemoryScope() { this.variables = new HashMap<>(); } } ``` \ 2) Next, similarly to the `DefinitionScope`, we provide access to the parent’s scope variables: ```java public class MemoryScope { private final Map<String, Value<?>> variables; private final MemoryScope parent; public MemoryScope(MemoryScope parent) { this.variables = new HashMap<>(); this.parent = parent; } } ``` \ 3) Next, we add methods to get and set variables. When we set a variable, we always re-assign the previously set value if there is already a defined variable in the upper `MemoryScope` layer: ```java public class MemoryScope { ... public Value<?> get(String name) { Value<?> value = variables.get(name); if (value != null) return value; else if (parent != null) return parent.get(name); else return NullValue.NULL_INSTANCE; } public void set(String name, Value<?> value) { MemoryScope variableScope = findScope(name); if (variableScope == null) { variables.put(name, value); } else { variableScope.variables.put(name, value); } } private MemoryScope findScope(String name) { if (variables.containsKey(name)) return this; return parent == null ? null : parent.findScope(name); } } ``` \ 4) In addition to the `set` and `get` methods, we add two more implementations to interact with the current (local) layer of `MemoryScope`: ```java public class MemoryScope { ... public Value<?> getLocal(String name) { return variables.get(name); } public void setLocal(String name, Value<?> value) { variables.put(name, value); } } ``` \ These methods will be used later to initialize function arguments or class’s instance arguments. For example, if we create an instance of the `Lamp` class and pass the predefined global `type` variable, this variable shouldn’t be changed when we try to update the `lamp_instance :: type` property:  \ 5) Finally, to manage variables and switch between memory scopes, we create the `MemoryContext` implementation using `java.util.Stack` collection: ```java package org.example.toylanguage.context; public class MemoryContext { private static final Stack<MemoryScope> scopes = new Stack<>(); public static MemoryScope getScope() { return scopes.peek(); } public static MemoryScope newScope() { return new MemoryScope(scopes.isEmpty() ? null : scopes.peek()); } public static void pushScope(MemoryScope scope) { scopes.push(scope); } public static void endScope() { scopes.pop(); } } ``` \ ## **2.3 Class Definition** In this section, we will read and store class definitions.  \ 1) First, we create the [Statement](https://github.com/alexandermakeev/toy-language/blob/master/src/main/java/org/example/toylanguage/statement/Statement.java) implementation. This statement will be executed every time we create a class’s instance: ```java package org.example.toylanguage.statement; public class ClassStatement { } ``` \ 2) Each class can contain nested statements for the initialization and other operations, like a constructor. To store these statements, we extend the [CompositeStatement](https://github.com/alexandermakeev/toy-language/blob/master/src/main/java/org/example/toylanguage/statement/CompositeStatement.java) which contains the list of nested statements to execute: ```java public class ClassStatement extends CompositeStatement { } ``` \ 3) Next, we declare the `ClassDefinition` to store the class name, its arguments, constructor statements, and the definition scope with the class’s functions: ```java package org.example.toylanguage.context.definition; import java.util.List; @RequiredArgsConstructor @Getter @EqualsAndHashCode(onlyExplicitlyIncluded = true) public class ClassDefinition implements Definition { @EqualsAndHashCode.Include private final String name; private final List<String> arguments; private final ClassStatement statement; private final DefinitionScope definitionScope; } ``` \ 4) Now, we’re ready to read the class declaration using [StatementParser](https://github.com/alexandermakeev/toy-language/blob/master/src/main/java/org/example/toylanguage/StatementParser.java). When we meet the Keyword lexeme with `class` value, first we need to read the class name and its arguments inside the square brackets:  \ ```java package org.example.toylanguage; public class StatementParser { ... private void parseClassDefinition() { Token type = tokens.next(TokenType.Variable); List<String> arguments = new ArrayList<>(); if (tokens.peek(TokenType.GroupDivider, "[")) { tokens.next(TokenType.GroupDivider, "["); //skip opening square bracket while (!tokens.peek(TokenType.GroupDivider, "]")) { Token argumentToken = tokens.next(TokenType.Variable); arguments.add(argumentToken.getValue()); if (tokens.peek(TokenType.GroupDivider, ",")) tokens.next(); } tokens.next(TokenType.GroupDivider, "]"); //skip closing square bracket } } } ``` \ 5) After the arguments, we expect to read the nested constructor statements:  \ To store these statements, we create an instance of the previously defined `ClassStatement`: ```java private void parseClassDefinition() { ... ClassStatement classStatement = new ClassStatement(); } ``` \ 6) In addition to arguments and nested statements, our classes can also contain functions. To make these functions accessible only within the class definition, we initialize a new layer of`DefinitionScope`: ```java private void parseClassDefinition() { ... ClassStatement classStatement = new ClassStatement(); DefinitionScope classScope = DefinitionContext.newScope(); } ``` \ 7) Next, we initialize an instance of `ClassDefinition` and put it to the `DefinitionContext`: ```java private void parseClassDefinition() { ... ClassStatement classStatement = new ClassStatement(); DefinitionScope classScope = DefinitionContext.newScope(); ClassDefinition classDefinition = new ClassDefinition(type.getValue(), arguments, classStatement, classScope); DefinitionContext.getScope().addClass(classDefinition); } ``` \ 8) Finally, to read the class constructor statements and functions, we call the static [StatementParser#parse()](https://github.com/alexandermakeev/toy-language/blob/master/src/main/java/org/example/toylanguage/StatementParser.java#L32) method that will collect statements inside `classStatement` instance until we meet the finalizing `end` lexeme that must be skipped at the end: ```java private void parseClassDefinition() { ... //parse class statements StatementParser.parse(this, classStatement, classScope); tokens.next(TokenType.Keyword, "end"); } ``` \ ## **2.4 Class Instance** At this point, we can already read class definitions with constructor statements and functions. Now, let’s parse the class’s instance:  \ 1) First, we define `ClassValue`, which will contain the state of each class instance. Classes, unlike functions, should have a permanent `MemoryScope` and this scope should be available with all the class's instance arguments and state variables every time we interact with the class instance: ```java public class ClassValue extends IterableValue<ClassDefinition> { private final MemoryScope memoryScope; public ClassValue(ClassDefinition definition, MemoryScope memoryScope) { super(definition); this.memoryScope = memoryScope; } } ``` \ 2) Next, we provide methods to work with the class’s instance properties using `MemoryContext`: ```java public class ClassValue extends IterableValue<ClassDefinition> { private final MemoryScope memoryScope; public ClassValue(ClassDefinition definition, MemoryScope memoryScope) { super(definition); this.memoryScope = memoryScope; } @Override public String toString() { return getValue().getArguments().stream() .map(t -> t + " = " + getValue(t)) .collect(Collectors.joining(", ", getValue().getName() + " [ ", " ]")); } public Value<?> getValue(String name) { Value<?> result = MemoryContext.getScope().getLocal(name); return result != null ? result : NULL_INSTANCE; } public void setValue(String name, Value<?> value) { MemoryContext.getScope().setLocal(name, value); } } ``` \ 3) Please notice that by calling the`MemoryScope#getLocal()` and `MemoryScope#setLocal()` methods, we work with the current layer of `MemoryScope` variables. But before accessing the class’s instance state, we need to put its `MemoryScope` to the `MemoryContext` and release it when we finish: ```java public class ClassValue extends IterableValue<ClassDefinition> { ... public Value<?> getValue(String name) { MemoryContext.pushScope(memoryScope); try { Value<?> result = MemoryContext.getScope().getLocal(name); return result != null ? result : NULL_INSTANCE; } finally { MemoryContext.endScope(); } } public void setValue(String name, Value<?> value) { MemoryContext.pushScope(memoryScope); try { MemoryContext.getScope().setLocal(name, value); } finally { MemoryContext.endScope(); } } } ``` \ 4) Next, we can implement the remaining `ClassExpression` that will be used to construct defined class instances during syntax analysis. To declare class’s instance definition, we provide the `ClassDefinition` and list of `Expression` arguments that will be transformed into the final `Value` instances during statement execution: ```java package org.example.toylanguage.expression; @RequiredArgsConstructor @Getter public class ClassExpression implements Expression { private final ClassDefinition definition; private final List<Expression> arguments; @Override public Value<?> evaluate() { ... } } ``` \ 5) Let's implement the `Expression#evaluate()` method that will be used during execution to create an instance of the previously defined `ClassValue`. First, we evaluate the `Expression` arguments into the `Value` arguments: ```java @Override public Value<?> evaluate() { //initialize class arguments List<Value<?>> values = arguments.stream() .map(Expression::evaluate) .collect(Collectors.toList()); } ``` \ 6) Next, we create an empty memory scope that should be isolated from the other variables and can contain only the class's instance state variables: ```java @Override public Value<?> evaluate() { //initialize class arguments List<Value<?>> values = arguments.stream().map(Expression::evaluate).collect(Collectors.toList()); //get class's definition and statement ClassStatement classStatement = definition.getStatement(); //set separate scope MemoryScope classScope = new MemoryScope(null); MemoryContext.pushScope(classScope); try { ... } finally { MemoryContext.endScope(); } } ``` \ 7) Next, we create an instance of `ClassValue` and write the class’s `Value` arguments to the isolated memory scope: ```java try { //initialize constructor arguments ClassValue classValue = new ClassValue(definition, classScope); IntStream.range(0, definition.getArguments().size()).boxed() .forEach(i -> MemoryContext.getScope() .setLocal(definition.getArguments().get(i), values.size() > i ? values.get(i) : NullValue.NULL_INSTANCE)); ... } finally { MemoryContext.endScope(); } ``` \ Please notice that we transformed the `Expression` arguments into the `Value` arguments before setting up an empty `MemoryScope`. Otherwise, we will not be able to access the class’s instance arguments, for example:  \ 8) And finally, we can execute the `ClassStatement`. But before that, we should set the class’s `DefinitionScope` to be able to access the class’s functions within constructor statements: ```java try { //initialize constructor arguments ClassValue classValue = new ClassValue(definition, classScope); ClassInstanceContext.pushValue(classValue); IntStream.range(0, definition.getArguments().size()).boxed() .forEach(i -> MemoryContext.getScope() .setLocal(definition.getArguments().get(i), values.size() > i ? values.get(i) : NullValue.NULL_INSTANCE)); //execute function body DefinitionContext.pushScope(definition.getDefinitionScope()); try { classStatement.execute(); } finally { DefinitionContext.endScope(); } return classValue; } finally { MemoryContext.endScope(); ClassInstanceContext.popValue(); } ``` \ 9\*) One last thing, we can make the classes more flexible and allow a user to create class instances before declaring the class definition:  \ This can be done by delegating the `ClassDefinition` initialization to the `DefinitionContext` and access it only when we evaluate an expression: ```java public class ClassExpression implements Expression { private final String name; private final List<Expression> arguments; @Override public Value<?> evaluate() { //get class's definition and statement ClassDefinition definition = DefinitionContext.getScope().getClass(name); ... } } ``` \ You can do the same delegation for the [FunctionExpression](https://github.com/alexandermakeev/toy-language/blob/74350de0fe5303261b6817a6277acc8b9e7a307b/src/main/java/org/example/toylanguage/expression/FunctionExpression.java#L25) to invoke functions before definition. \ 10) Finally, we can finish reading the class instances with the `ExpressionReader`. There is no difference between the [previously defined structure instances](https://github.com/alexandermakeev/toy-language/blob/feature/loops/src/main/java/org/example/toylanguage/StatementParser.java#L329). We just need to read the `Expression` arguments and construct the `ClassExpression`: ```java public class ExpressionReader ... // read class instance: new Class[arguments] private ClassExpression readClassInstance(Token token) { List<Expression> arguments = new ArrayList<>(); if (tokens.peekSameLine(TokenType.GroupDivider, "[")) { tokens.next(TokenType.GroupDivider, "["); //skip opening square bracket while (!tokens.peekSameLine(TokenType.GroupDivider, "]")) { Expression value = ExpressionReader.readExpression(this); arguments.add(value); if (tokens.peekSameLine(TokenType.GroupDivider, ",")) tokens.next(); } tokens.next(TokenType.GroupDivider, "]"); //skip closing square bracket } return new ClassExpression(token.getValue(), arguments); } } ``` \ ## **2.5 Class Function** At this moment, we can create a class and execute the class's constructor statements. But we are still unable to execute the class's functions. \n 1) Let’s overload the `FunctionExpression#evaluate` method that will accept the `ClassValue` as a reference to the class instance we want to invoke a function from: ```java package org.example.toylanguage.expression; public class FunctionExpression implements Expression { ... public Value<?> evaluate(ClassValue classValue) { } } ``` \ 2) The next step is to transform the function `Expression` arguments into the `Value` arguments using current `MemoryScope`: ```java public Value<?> evaluate(ClassValue classValue) { //initialize function arguments List<Value<?>> values = argumentExpressions.stream() .map(Expression::evaluate) .collect(Collectors.toList()); } ``` \ 3) Next, we need to pass the class's `MemoryScope` and `DefinitionScope` to the context: ```java ... //get definition and memory scopes from class definition ClassDefinition classDefinition = classValue.getValue(); DefinitionScope classDefinitionScope = classDefinition.getDefinitionScope(); //set class's definition and memory scopes DefinitionContext.pushScope(classDefinitionScope); MemoryContext.pushScope(memoryScope); ``` \ 4) Lastly for this implementation, we invoke default [FunctionExpression#evaluate(List<Value<?>> values)](https://github.com/alexandermakeev/toy-language/blob/master/src/main/java/org/example/toylanguage/expression/FunctionExpression.java#L64) method and pass evaluated `Value` arguments: ```java public Value<?> evaluate(ClassValue classValue) { //initialize function arguments List<Value<?>> values = argumentExpressions.stream().map(Expression::evaluate).collect(Collectors.toList()); //get definition and memory scopes from class definition ClassDefinition classDefinition = classValue.getValue(); DefinitionScope classDefinitionScope = classDefinition.getDefinitionScope(); MemoryScope memoryScope = classValue.getMemoryScope(); //set class's definition and memory scopes DefinitionContext.pushScope(classDefinitionScope); MemoryContext.pushScope(memoryScope); try { //proceed function return evaluate(values); } finally { DefinitionContext.endScope(); MemoryContext.endScope(); } } ``` \ 5) To invoke the class’s function, we will use the double colon `::` operator. Currently, this operator is managed by the `ClassPropertyOperator` ([StructureValueOperator](https://github.com/alexandermakeev/toy-language/blob/feature/loops/src/main/java/org/example/toylanguage/expression/operator/StructureValueOperator.java)) implementation, which is responsible for accessing the class’s properties:  \ Let’s improve it to support function invocations with the same double colon `::` character:  \ Class’s function can be managed by this operator only when the left expression is the `ClassExpression` and the second one is the `FunctionExpression`: ```java package org.example.toylanguage.expression.operator; public class ClassPropertyOperator extends BinaryOperatorExpression implements AssignExpression { public ClassPropertyOperator(Expression left, Expression right) { super(left, right); } @Override public Value<?> evaluate() { Value<?> left = getLeft().evaluate(); if (left instanceof ClassValue) { if (getRight() instanceof VariableExpression) { // access class's property // new ClassInstance[] :: class_argument return ((ClassValue) left).getValue(((VariableExpression) getRight()).getName()); } else if (getRight() instanceof FunctionExpression) { // execute class's function // new ClassInstance[] :: class_function [] return ((FunctionExpression) getRight()).evaluate((ClassValue) left); } } throw new ExecutionException(String.format("Unable to access class's property `%s``", getRight())); } @Override public void assign(Value<?> value) { Value<?> left = getLeft().evaluate(); if (left instanceof ClassValue && getRight() instanceof VariableExpression) { String propertyName = ((VariableExpression) getRight()).getName(); ((ClassValue) left).setValue(propertyName, value); } } } ``` \ ## 3. Wrap Up In this part, we implemented classes. We can now create more complex things, such as stack implementation: ```ruby main [] fun main [] stack = new Stack [] loop num in 0..5 # push 0,1,2,3,4 stack :: push [ num ] end size = stack :: size [] loop i in 0..size # should print 4,3,2,1,0 print stack :: pop [] end end class Stack [] arr = {} n = 0 fun push [ item ] this :: arr << item n = n + 1 end fun pop [] n = n - 1 item = arr { n } arr { n } = null return item end fun size [] return this :: n end end ``` \