In this article we will improve mathematical expressions evaluation for our programming language (please see Part I) using Dijkstra's Two-Stack algorithm. The essence of this algorithm is that any complex expression ultimately comes down to the simple expression. In the end, It will be a unary or binary operation on one/two operands.
How Dijkstra's Two-Stack algorithm works:
Let’s suppose we have the following expression: 2 * (3 + 4) - 5
. When we calculate this expression first we summarize 3
and 4
in the parentheses. After that, we multiply the result by 2
. And in the end we subtract 5
from the result. Down below is the expression evaluation using Dijkstra's Two-Stack algorithm:
2
to the operands stack:
*
to operators:
(
to operands:
3
to operands:
+
to operators:
4
to operands:
)
to operators. Closing parenthesis pops an operation until a nearest opening parenthesis (
. Therefore, we pop top operator and apply it on the two top operands:
-
to operators. The operators stack's top element *
has greater precedence than the current operator -
. Therefore, first we apply top operator on two top operands:
5
to operands:
-
operator:
Let’s dive into the code. First, we will update our Operator
enum including operators precedence and introduce a few new operators:
@RequiredArgsConstructor
@Getter
public enum Operator {
Not("!", NotOperator.class, 5),
StructureValue("::", StructureValueOperator.class, 5),
Addition("+", AdditionOperator.class, 3),
Subtraction("-", SubtractionOperator.class, 3),
LessThan("<", LessThanOperator.class, 2),
GreaterThan(">", GreaterThanOperator.class, 2);
private final String character;
private final Class<? extends OperatorExpression> type;
private final Integer precedence;
Operator(String character, Integer precedence) {
this(character, null, precedence);
}
public static Operator getType(String character) {
return Arrays.stream(values())
.filter(t -> Objects.equals(t.getCharacter(), character))
.findAny().orElse(null);
}
public boolean greaterThan(Operator o) {
return getPrecedence().compareTo(o.getPrecedence()) > 0;
}
}
public class StatementParser {
...
private Expression readExpression() {
...
while (peek(TokenType.Operator)) {
Token operation = next(TokenType.Operator);
Operator operator = Operator.getType(operation.getValue());
Class<? extends OperatorExpression> operatorType = operator.getType();
...
}
...
}
...
}
We will no longer mark new
as the keyword lexeme, it will be the operator which will help us to instantiate a structure object inside a more complex expression:
public enum TokenType {
...
Keyword("(if|then|end|print|input|struct|arg)"),
Operator("(\\+|\\-|\\>|\\<|\\={1,2}|\\!|\\:{2}|new)");
...
}
We set it the highest priority:
public enum Operator {
...
StructureInstance("new", StructureInstanceOperator.class, 5),
...
}
public class StructureInstanceOperator extends UnaryOperatorExpression {
public StructureInstanceOperator(Expression value) {
super(value);
}
@Override
public Value<?> calc(Value<?> value) {
return value; //will return toString() value
}
}
We will also include two more operators with higher priority than addition and subtraction. Don’t forget to add regex expression for each operator to be counted when we’re doing lexical analysis:
public enum TokenType {
...
Operator("(\\+|\\-|\\>|\\<|\\={1,2}|\\!|\\:{2}|new|\\*|\\/)");
...
}
public enum Operator {
...
Multiplication("*", MultiplicationOperator.class, 4),
Division("/", DivisionOperator.class, 4),
...
}
These operators will be used to change operator’s precedence by surrounding expression in the parentheses. We will not create OperatorExpression
implementations as we make an expression calculation using parentheses:
public enum TokenType {
...
Operator("(\\+|\\-|\\>|\\<|\\={1,2}|\\!|\\:{2}|new|\\*|\\/|\\(|\\))");
...
}
public enum Operator {
...
LeftParen("(", 1),
RightParen(")", 1),
...
}
We also need to introduce a new assignment operator with a single equal =
sign. The variable value will be assigned to the variable only when we complete evaluation of the right expression. Therefore, this operator should have the lowest priority:
public enum Operator {
...
Assignment("=", AssignmentOperator.class, 6),
...
}
public class AssignOperator extends BinaryOperatorExpression {
public AssignOperator(Expression left, Expression right) {
super(left, right);
}
@Override
public Value<?> calc(Value<?> left, Value<?> right) {
if (getLeft() instanceof VariableExpression) {
((VariableExpression) getLeft()).setValue(right);
}
return left;
}
}
During the execution we expect to retrieve the left expression as a VariableExpression
- the variable we’re assigning value to. Thus, we introduce a new method in our VariableExpression
to set the variable value. As you may remember our variables Map is being stored in the StatementParser
. To set access it and set a new value we introduce a BiConsumer
instance, passing variable name as a key and variable value as a new value:
public class VariableExpression implements Expression {
...
private final BiConsumer<String, Value<?>> setter;
...
public void setValue(Value<?> value) {
setter.accept(name, value);
}
}
public class StatementParser {
...
private Expression nextExpression() {
...
return new VariableExpression(value, variables::get, variables::put);
}
...
}
Previously, to build a mathematical expression we expected to obtain an operator after each operand:
Expression left = nextExpression();
//recursively read an expression
while (peek(TokenType.Operator)) {
Token operation = next(TokenType.Operator);
Operator operator = Operator.getType(operation.getValue());
Class<? extends OperatorExpression> operatorType = operator.getType();
if (BinaryOperatorExpression.class.isAssignableFrom(operatorType)) {
Expression right = nextExpression();
left = operatorType
.getConstructor(Expression.class, Expression.class)
.newInstance(left, right);
} else if (UnaryOperatorExpression.class.isAssignableFrom(operatorType)) {
left = operatorType
.getConstructor(Expression.class)
.newInstance(left);
}
}
With the current behavior we can expect two subsequent operators in the row, e.g. addition of structure instantiation as an right operand or two following opening parentheses:
value + new Car [“E30” “325i”] :: model
or
( ( 1 + 2) * 3) - 4
To read an entire expression with any sequence of operands and operators we will introduce the LineBreak
lexeme which will help us to catch all expression tokens until we reach end of the row:
public class StatementParser {
LineBreak("[\\n\\r]"),
Whitespace("[\\s\\t]"),
...
}
The second divider we need to count is when we instantiate a structure object:
new Car [ “E30” “325i” new Engine [“M20B25”] :: model ]
This structure instantiation is quite complex for the StatementParser
to divide each passing argument into separate expression. To deal with it we can introduce new comma GroupDivider
:
public enum TokenType {
...
GroupDivider("(\\[|\\]|\\,)"),
...
}
new Car [ “E30”, “325i”, new Engine [“M20B25”] :: model ]
Now we can start refactoring our expression evaluation in the StatementParser
using Dijkstra’s Two-Stack algorithm.
Firstly, we will introduce skipLineBreaks()
method, that will increment tokens position while we reach LineBreak
token:
private void skipLineBreaks() {
while (tokens.get(position).getType() == TokenType.LineBreak
&& ++position != tokens.size()) ;
}
It will be used on the top of each next()
and peek()
methods to escape catching LineBreak
token:
private Token next(TokenType type, TokenType... types) {
skipLineBreaks();
...
}
private Token next(TokenType type, String value) {
skipLineBreaks();
...
}
private boolean peek(TokenType type, String content) {
skipLineBreaks();
...
}
private boolean peek(TokenType type) {
skipLineBreaks();
...
}
We will also override the next()
method without any arguments just returning the next token:
private Token next() {
skipLineBreaks();
return tokens.get(position++);
}
To read and calculate an entire expression using Dijkstra's Two-Stack algorithm we will need two stacks, the operators stack, and the operands stack. The first stack of operators will contain the Operator
enum types. The second stack of operands will have Expression
type, it will help us to store every type of expression including values, variables and composite operators. It will be clearer if we perform expression reading in a separate inner class ExpressionReader
:
private class ExpressionReader {
private final Stack<Expression> operands;
private final Stack<Operator> operators;
private ExpressionReader() {
this.operands = new Stack<>();
this.operators = new Stack<>();
}
}
Then we create a readExpression()
method with a while loop to read an entire expression. Our mathematical expression can begin with Operator
, Variable
, Numeric
, Logical
or Text
token types. In this case we can’t exclude the LineBreak
token during token retrieval because this token means the end of the expression. Therefore, we introduce an additional peek()
method inside our inner class that will read an expression until we reach the end of the row:
private class ExpressionReader {
...
private Expression readExpression() {
while (peek(TokenType.Operator, TokenType.Variable, TokenType.Numeric, TokenType.Logical, TokenType.Text)) {
Token token = next();
switch (token.getType()) {
case Operator:
...
default:
...
}
}
...
}
private boolean peek(TokenType type, TokenType... types) {
TokenType[] tokenTypes = ArrayUtils.add(types, type);
if (position < tokens.size()) {
Token token = tokens.get(position);
return Stream.of(tokenTypes).anyMatch(t -> t == token.getType());
}
return false;
}
}
The first lexeme type we will process is Operator
. We retrieve the Operator
enum by token’s value and use it in the switch block. There are 3 cases we will deal with:
LeftParen
- just put an operator into the operators stackRightParen
- calculate previously pushed expressions until we reach the left parenthesis LeftParen
.
...
case Operator:
Operator operator = Operator.getType(token.getValue());
switch (operator) {
case LeftParen:
operators.push(operator);
break;
case RightParen:
//until left parenthesis is not reached
while (!operators.empty() && operators.peek() != Operator.LeftParen)
applyTopOperator();
operators.pop(); //pop left parenthesis
break;
default:
//until top operator has greater precedence
while (!operators.isEmpty() && operators.peek().greaterThan(operator))
applyTopOperator();
operators.push(operator); // finally, add less prioritized operator
}
break;
...
To apply the top operator on the top operand(s) we create an additional applyTopOperator()
method. First we pop an operator and the first operand. Then if our operator is binary we pop the second operand. To create an OperatorExpression
object we retrieve a suitable constructor for unary or binary operator implementation and make an instance with earlier popped operands. In the end we push the OperatorExpression
instance to the operands stack. Thus, this pushed operand Expression
can be used later by other operators inside a more composite expression:
@SneakyThrows
private void applyTopOperator() {
// e.g. a + b
Operator operator = operators.pop();
Class<? extends OperatorExpression> operatorType = operator.getType();
Expression left = operands.pop();
if (BinaryOperatorExpression.class.isAssignableFrom(operatorType)) {
Expression right = operands.pop();
operands.push(operatorType
.getConstructor(Expression.class, Expression.class)
.newInstance(right, left));
} else if (UnaryOperatorExpression.class.isAssignableFrom(operatorType)) {
// e.g. new Instance []
operands.push(operatorType
.getConstructor(Expression.class)
.newInstance(left));
} else {
throw new SyntaxException(String.format("Operator `%s` is not supported", operatorType));
}
}
If our token is a variable or a literal, we transform it to the appropriate Expression
object inside the following switch block and then push the result to the operands stack:
...
default:
String value = token.getValue();
Expression operand;
switch (token.getType()) {
case Numeric:
operand = new NumericValue(Integer.parseInt(value));
break;
case Logical:
operand = new LogicalValue(Boolean.valueOf(value));
break;
case Text:
operand = new TextValue(value);
break;
case Variable:
default:
if (!operators.isEmpty() && operators.peek() == Operator.StructureInstance) {
operand = readInstance(token);
} else {
operand = new VariableExpression(value, variables::get, variables::put);
}
}
operands.push(operand);
...
A variable token can also indicate the start of the structure instantiation. In this case we read an entire structure expression with arguments inside square brackets and only then assign it to the StructureExpression
value. We will move existent readInstance()
method from StatementParser
class into the inner ExpressionReader
class with following changes:
...
private Expression readInstance(Token token) {
StructureDefinition definition = structures.get(token.getValue());
List<Expression> arguments = new ArrayList<>();
if (StatementParser.this.peek(TokenType.GroupDivider, "[")) {
next(TokenType.GroupDivider, "["); //skip open square bracket
while (!StatementParser.this.peek(TokenType.GroupDivider, "]")) {
Expression value = new ExpressionReader().readExpression();
arguments.add(value);
if (StatementParser.this.peek(TokenType.GroupDivider, ","))
next();
}
next(TokenType.GroupDivider, "]"); //skip close square bracket
}
if (definition == null) {
throw new SyntaxException(String.format("Structure is not defined: %s", token.getValue()));
}
return new StructureExpression(definition, arguments, variables::get);
}
...
After we finished processing expression tokens and reached the end of the expression we will need an extra while loop to pop remaining operators and apply it on the top operand(s). In the end we should have the only one resulting composite operand in the operands stack, we return it as a final expression result:
private Expression readExpression() {
...
while (!operators.isEmpty()) {
applyTopOperator();
}
return operands.pop();
}
Now we can now build AssignmentStatement
in the parseExpression()
method. The AssignStatement
can start with a Variable or an Operator token type (in case we have a new
operator for the structure instantiation). When we read an expression we expect to read an entire Assignment
expression accumulating left value as a variable name and right expression as a variable value. Therefore, we go to one tokens position back and start reading the expression using ExpressionReader
starting from variable’s name declaration:
private Statement parseExpression() {
Token token = next(TokenType.Keyword, TokenType.Variable, TokenType.Operator);
switch (token.getType()) {
case Variable:
case Operator:
position--;
Expression value = new ExpressionReader().readExpression();
...
case Keyword:
default:
...
}
}
In the end we expect to obtain the complete AssignmentOperator
and we can create an AssignmentStatement
instance:
...
case Variable:
case Operator:
position--;
Expression value = new ExpressionReader().readExpression();
if (value instanceof AssignmentOperator) {
VariableExpression variable = (VariableExpression) ((AssignmentOperator) value).getLeft();
return new AssignmentStatement(variable.getName(), ((AssignmentOperator) value).getRight(), variables::put);
} else {
throw new SyntaxException(String.format("Unsupported statement: `%s`", value));
}
...
The last thing we need to do is to update expressions reading for the input
and if
statements using our new inner ExpressionReader
class. It will allow us to read complex expression inside print
and if
statements:
...
case Keyword:
default:
switch (token.getValue()) {
case "print":
Expression expression = new ExpressionReader().readExpression();
...
case "if":
Expression condition = new ExpressionReader().readExpression();
...
}
...
In this article, we improved our own programming language with mathematical expressions evaluation using Dijkstra's Two-Stack algorithm. We injected operators priority with an ability to change it using parentheses, including complex structure instantiations. The full source code is available over on GitHub.