Welcome to other chapters of Let’s Understand Chrome V8
Bytecode is generated from the AST that the parser produces; it is an architecture-independent code for an abstract machine.
In this article, we start debugging from the AST, explain bytecode generation, and analyze the kernel code and important data structures, as shown in Figure 1.
V8 has hundreds of bytecodes, ranging from simple operations like Add and Sub to complex operations like LdaNamedProperty. Each bytecode can use registers and the accumulator as operands. The accumulator is a register like any other, except that it is read and written implicitly rather than named in the instruction.
For example, Add r1 adds the value of register r1 to the accumulator and leaves the result in the accumulator. The accumulator is not written out because it is the default operand.
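To make the implicit accumulator concrete, here is a minimal sketch of an accumulator machine in C++. It is not V8's interpreter: the Op enum, the Instr struct and the Run function are hypothetical names invented for this illustration, and only the instruction names mirror V8's.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical mini instruction set; the names mirror V8's, the encoding does not.
enum class Op : std::uint8_t { kLdaSmi, kStar, kLdar, kAdd, kSub };

struct Instr {
  Op op;
  int operand;  // immediate value for kLdaSmi, register index otherwise
};

// Runs the program and returns the final value of the accumulator.
inline int Run(const std::vector<Instr>& program) {
  std::array<int, 4> regs{};  // explicit registers r0..r3
  int acc = 0;                // the accumulator, never named by an instruction
  for (const Instr& i : program) {
    switch (i.op) {
      case Op::kLdaSmi: acc = i.operand; break;                 // acc <- immediate
      case Op::kStar:   regs.at(i.operand) = acc; break;        // rN  <- acc
      case Op::kLdar:   acc = regs.at(i.operand); break;        // acc <- rN
      case Op::kAdd:    acc = regs.at(i.operand) + acc; break;  // acc <- rN + acc
      case Op::kSub:    acc = regs.at(i.operand) - acc; break;  // acc <- rN - acc
    }
  }
  return acc;
}

// Computes 2 + 3. Only r1 is ever named; the accumulator is always implied.
inline int AddExample() {
  return Run({{Op::kLdaSmi, 2}, {Op::kStar, 1}, {Op::kLdaSmi, 3}, {Op::kAdd, 1}});
}
```

Calling AddExample() loads 2, stores it in r1, loads 3, and then executes Add r1, so the accumulator ends up holding 5.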
Bytecodes are defined in v8/src/interpreter/bytecodes.h. Here are some examples.
The above code is the macro definition of the bytecodes. Take V(Ldar, ImplicitRegisterUse::kWriteAccumulator, OperandType::kReg) as an example. Ldar loads the value of a register into the accumulator. ImplicitRegisterUse::kWriteAccumulator says that the accumulator is written implicitly, so it is the destination, and OperandType::kReg says that the single explicit operand is a register, which is the source. For instance, Ldar r1 copies the value of register r1 into the accumulator.
For other bytecode instructions, please refer to the instruction definition file of V8. To improve performance, V8 marks frequently executed bytecode as hot code and uses TurboFan to compile it into native machine code, as shown in Figure 4.
TurboFan compilation (from bytecode to native machine code) takes more time and resources, so it is only worthwhile for hot code. Interpreter compilation (from JS to bytecode) takes very little time and resources and suits the general case.
Typically, bytecode is upgraded to hot code once it has been executed many times.
But why would hot code be downgraded to bytecode? There are many reasons; a common one is debugging, for example when you press F12 to open DevTools and debug the JS.
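The upgrade and downgrade described above can be pictured with a toy model. This is only a sketch under loud assumptions: the FunctionState class, the plain call counter and the threshold parameter are all invented here, and V8's real tiering heuristics are considerably more involved than counting calls.

```cpp
// Toy model of tiering; not V8's actual heuristics.
enum class Tier { kInterpreted, kOptimized };

class FunctionState {
 public:
  // hot_threshold is a hypothetical knob: calls needed before tiering up.
  explicit FunctionState(int hot_threshold) : hot_threshold_(hot_threshold) {}

  // Records one call and returns the tier the function is in afterwards.
  Tier Call() {
    if (tier_ == Tier::kInterpreted && ++calls_ >= hot_threshold_) {
      tier_ = Tier::kOptimized;  // hot: hand over to the optimizing compiler
    }
    return tier_;
  }

  // Falls back to bytecode, e.g. because a debugger was attached.
  void Deoptimize() {
    tier_ = Tier::kInterpreted;
    calls_ = 0;
  }

  Tier tier() const { return tier_; }

 private:
  int hot_threshold_;
  int calls_ = 0;
  Tier tier_ = Tier::kInterpreted;
};
```

With a threshold of 3, the first two calls stay interpreted, the third call flips the function to optimized, and Deoptimize() sends it back to the interpreter with a fresh counter.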
Before diving into bytecode, we need to know the AST, because bytecode generation is the process of walking the AST. In V8, the AST walk works like a finite state automaton that, together with some predefined macro templates, generates the bytecode. Figure 5 shows the data structure of the AST.
All nodes of an AST inherit from the parent class AstNode, which has many member methods. The most important of them is NodeType: when an AstNode is translated into bytecode, its NodeType is used to convert the parent class AstNode into a specific subclass, such as an Expression or a Statement. The generator then reads the corresponding data and generates bytecode. The following code converts AstNode to Assignment.
In the above code, expr->target(), expr->value() and expr->op() may be called recursively because expressions can contain multiple subexpressions.
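The pattern of reading the node type, casting to the subclass, and recursing into subexpressions can be sketched in a few lines. This is a hypothetical mini AST, not V8's: the Literal and BinaryOperation classes, the BytecodeSketch generator and its text output are assumptions made for this illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mini AST; V8's real AstNode has far more node types.
class AstNode {
 public:
  enum NodeType { kLiteral, kBinaryOperation };
  virtual ~AstNode() = default;
  NodeType node_type() const { return node_type_; }

 protected:
  explicit AstNode(NodeType type) : node_type_(type) {}

 private:
  NodeType node_type_;
};

class Literal : public AstNode {
 public:
  explicit Literal(int value) : AstNode(kLiteral), value_(value) {}
  int value() const { return value_; }

 private:
  int value_;
};

class BinaryOperation : public AstNode {
 public:
  BinaryOperation(char op, std::unique_ptr<AstNode> left,
                  std::unique_ptr<AstNode> right)
      : AstNode(kBinaryOperation), op_(op),
        left_(std::move(left)), right_(std::move(right)) {}
  char op() const { return op_; }  // '+' or '-'
  const AstNode* left() const { return left_.get(); }
  const AstNode* right() const { return right_.get(); }

 private:
  char op_;
  std::unique_ptr<AstNode> left_, right_;
};

class BytecodeSketch {
 public:
  // Dispatches on the node type and casts to the matching subclass.
  void Visit(const AstNode* node) {
    switch (node->node_type()) {
      case AstNode::kLiteral:
        VisitLiteral(static_cast<const Literal*>(node));
        break;
      case AstNode::kBinaryOperation:
        VisitBinaryOperation(static_cast<const BinaryOperation*>(node));
        break;
    }
  }
  const std::vector<std::string>& code() const { return code_; }

 private:
  void VisitLiteral(const Literal* lit) {
    code_.push_back("LdaSmi [" + std::to_string(lit->value()) + "]");
  }
  void VisitBinaryOperation(const BinaryOperation* expr) {
    Visit(expr->left());  // recursion: the left result lands in the accumulator
    int reg = next_reg_++;
    code_.push_back("Star r" + std::to_string(reg));  // park it in a register
    Visit(expr->right());  // recursion: the right result lands in the accumulator
    code_.push_back(std::string(expr->op() == '+' ? "Add" : "Sub") + " r" +
                    std::to_string(reg));
  }

  std::vector<std::string> code_;
  int next_reg_ = 0;
};

// Builds (1 + 2) - 3 and returns the generated pseudo-bytecode.
inline std::vector<std::string> GenerateDemo() {
  auto sum = std::make_unique<BinaryOperation>(
      '+', std::make_unique<Literal>(1), std::make_unique<Literal>(2));
  BinaryOperation root('-', std::move(sum), std::make_unique<Literal>(3));
  BytecodeSketch gen;
  gen.Visit(&root);
  return gen.code();
}
```

For (1 + 2) - 3 the sketch emits LdaSmi [1], Star r0, LdaSmi [2], Add r0, Star r1, LdaSmi [3], Sub r1. The nested addition is generated by the recursive Visit call before the outer subtraction is emitted, which is the same shape of recursion the paragraph above describes.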
The above code is the entry point for bytecode generation. It finally calls VisitStatements(literal->body()), which is responsible for generating the bytecode.
Before generating bytecode, the generator needs to extract the type of the subclass. Below is AstNode->XXXtype(), which is responsible for extracting the type.
AstNode is composed of the three parts of code above. The first part corresponds to Figure 5.
The above code is the entry function of bytecode generation. Figure 6 shows the call stack of VisitStatements.
Okay, that wraps it up for this share. I’ll see you guys next time, take care!
Please reach out to me if you have any issues.
WeChat: qq9123013