Let’s Understand Chrome V8 — Chapter 5: Compilation Parser

Welcome to other chapters of Let’s Understand Chrome V8

In this article, we’ll talk about the source code and important data structures of V8’s JS parser. Our test case is the same as in Chapter 4.

1. Parser

In V8, the parser is the stage that follows the scanner: the tokens that the scanner outputs are the parser's input. While compiling JS, the parser repeatedly calls the scanner to produce the next token. With the help of our test case, let's analyze the parser's code. Our starting point is DoParseProgram, shown below.
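Separately from V8's own listing, the scanner/parser relationship described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyScanner, Tok and PullAllTokens are hypothetical names invented for this example, not V8 classes, and the tokenizing rules are deliberately minimal.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds; V8's real token list is much larger.
enum class Tok { kFunction, kIdentifier, kPunctuator, kEos };

// Toy scanner: each call to Next() produces exactly one token.
class ToyScanner {
 public:
  explicit ToyScanner(std::string source) : source_(std::move(source)) {}

  Tok Next() {
    SkipWhitespace();
    if (pos_ >= source_.size()) return Tok::kEos;
    if (IsIdentifierStart(source_[pos_])) {
      const std::size_t start = pos_;
      while (pos_ < source_.size() && IsIdentifierPart(source_[pos_])) ++pos_;
      assert(pos_ > start);  // An identifier has at least one character.
      const std::string word = source_.substr(start, pos_ - start);
      return word == "function" ? Tok::kFunction : Tok::kIdentifier;
    }
    ++pos_;  // Anything else counts as a one-character punctuator.
    return Tok::kPunctuator;
  }

 private:
  static bool IsIdentifierStart(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_';
  }
  static bool IsIdentifierPart(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
  }
  void SkipWhitespace() {
    while (pos_ < source_.size() &&
           std::isspace(static_cast<unsigned char>(source_[pos_])) != 0) {
      ++pos_;
    }
  }

  std::string source_;
  std::size_t pos_ = 0;
};

// The parser side drives the scanner: it asks for one token at a time
// until the scanner reports the end of the source.
std::vector<Tok> PullAllTokens(const std::string& source) {
  ToyScanner scanner(source);
  std::vector<Tok> tokens;
  for (Tok t = scanner.Next(); t != Tok::kEos; t = scanner.Next()) {
    tokens.push_back(t);
  }
  return tokens;
}
```

The point of the sketch is the direction of control: the scanner never runs ahead on its own, it only produces a token when the parser side asks for one.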

Line 6 defines result, which holds the AST. When DoParseProgram() returns, the AST has been generated. In our test case, the following method is called.

ParseStatementList() starts parsing the program's statements. On line 3, peek() returns the type of the next token. In our case that type is Token::FUNCTION, so the while condition is false; execution jumps to line 22 and calls ParseStatementListItem(), which is defined below.
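The control flow just described can be modeled in a few lines. This is not V8's ParseStatementList; it is a simplified sketch that assumes the first loop handles leading string-literal statements (directives such as "use strict") and the second loop handles everything else. ToyParser, Tok and the log strings are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for this toy model only.
enum class Tok { kString, kFunction, kOther, kEos };

class ToyParser {
 public:
  // The constructor appends an end-of-source token so peek() is always valid.
  explicit ToyParser(std::vector<Tok> tokens) : tokens_(std::move(tokens)) {
    tokens_.push_back(Tok::kEos);
  }

  Tok peek() const { return tokens_[pos_]; }

  void ParseStatementList() {
    // Phase 1: leading string literals are treated as directives.
    // For a source that starts with `function`, this loop is skipped.
    while (peek() == Tok::kString) {
      log_.push_back("directive");
      Advance();
    }
    // Phase 2: every remaining item goes through ParseStatementListItem().
    while (peek() != Tok::kEos) {
      ParseStatementListItem();
    }
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  void Advance() {
    assert(pos_ + 1 < tokens_.size());  // Never step past the end token.
    ++pos_;
  }

  // Toy rule: one token per statement; the first token picks the branch.
  void ParseStatementListItem() {
    log_.push_back(peek() == Tok::kFunction ? "function declaration"
                                            : "statement");
    Advance();
  }

  std::vector<Tok> tokens_;
  std::size_t pos_ = 0;
  std::vector<std::string> log_;
};
```

With a token stream that begins with a function token, the log starts with "function declaration", which mirrors the path our test case takes.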

On line 3, Consume() uses the token cache we mentioned in the third article: it takes a token from the cache, and on a cache miss it calls the scanner to generate more tokens.
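To make the cache behavior concrete, here is a minimal sketch of a lookahead cache sitting in front of a scanner. It is a toy model under stated assumptions, not V8's implementation: CachedTokenStream, ScanNext and scan_calls are hypothetical names, and the "scanner" is just a pre-filled vector.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical token kinds for this toy model only.
enum class Tok { kFunction, kIdentifier, kEos };

class CachedTokenStream {
 public:
  explicit CachedTokenStream(std::vector<Tok> source)
      : source_(std::move(source)) {}

  // Look at the next token without removing it. On a cache miss the
  // scanner is asked for one more token.
  Tok peek() {
    if (cache_.empty()) cache_.push_back(ScanNext());
    return cache_.front();
  }

  // Remove and return the next token.
  Tok Next() {
    const Tok t = peek();  // Fills the cache on a miss.
    cache_.pop_front();
    return t;
  }

  // Remove the next token, which the caller already knows from peek().
  void Consume(Tok expected) {
    const Tok next = Next();
    assert(next == expected);
    (void)next;  // Avoids an unused-variable warning when asserts are off.
  }

  int scan_calls() const { return scan_calls_; }

 private:
  // Stand-in for the real scanner: hands out tokens, then end-of-source.
  Tok ScanNext() {
    ++scan_calls_;
    return pos_ < source_.size() ? source_[pos_++] : Tok::kEos;
  }

  std::vector<Tok> source_;
  std::size_t pos_ = 0;
  std::deque<Tok> cache_;
  int scan_calls_ = 0;
};
```

Calling peek() twice and then Consume() touches the scanner only once, which is the saving the cache is meant to provide.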

In our case, once we know that the token is Token::FUNCTION, we need to determine the function's FunctionKind. FunctionKind is defined as follows:

Note: Don’t confuse FunctionKind with Token::FUNCTION. A token is a lexical concept from compiler technology, whereas FunctionKind reflects the kinds of functions described by the ECMAScript specification. In our case, the FunctionKind is kNormalFunction, so the parser goes on to analyze the function's Token::IDENTIFIER; the code is below.

CurrentSymbol checks whether the identifier is a one-byte or a two-byte string. In our case, JsPrint is a one-byte string. Figure 1 shows the call stack of CurrentSymbol.
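The one-byte versus two-byte distinction can be sketched as follows. This is an illustration, not V8's CurrentSymbol: it assumes the identifier is available as UTF-16 code units and that "one-byte" means every code unit fits in 8 bits. IsOneByte and ToOneByte are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Returns true when every UTF-16 code unit fits in a single byte (<= 0xFF).
bool IsOneByte(const std::u16string& text) {
  for (char16_t unit : text) {
    if (unit > 0xFF) return false;
  }
  return true;
}

// Narrows a string that passed IsOneByte() to one byte per character.
std::string ToOneByte(const std::u16string& text) {
  assert(IsOneByte(text));
  std::string narrow;
  narrow.reserve(text.size());
  for (char16_t unit : text) narrow.push_back(static_cast<char>(unit));
  return narrow;
}
```

For an ASCII identifier such as JsPrint the check succeeds, so the name can be stored with one byte per character instead of two.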

At this point, the parse is complete. In our case, the three most important things the parser does are:

When the parser meets the function token in the JS source, it knows that a function follows;
It determines the FunctionKind: is the function asynchronous or something else? In our case it is kNormalFunction;
It takes the next token, JsPrint, and parses it.
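The three steps above can be sketched over a pre-split token list. This is a simplified model, not V8 code: ToyFunctionKind is a small hypothetical subset of function kinds, FunctionHeader and ParseFunctionHeader are invented for this example, and the sketch does not validate that the name is a legal identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced set of function kinds for this sketch.
enum class ToyFunctionKind {
  kNormalFunction,
  kGeneratorFunction,
  kAsyncFunction,
  kAsyncGeneratorFunction
};

const char* KindName(ToyFunctionKind kind) {
  switch (kind) {
    case ToyFunctionKind::kNormalFunction: return "kNormalFunction";
    case ToyFunctionKind::kGeneratorFunction: return "kGeneratorFunction";
    case ToyFunctionKind::kAsyncFunction: return "kAsyncFunction";
    case ToyFunctionKind::kAsyncGeneratorFunction:
      return "kAsyncGeneratorFunction";
  }
  assert(false && "unreachable: all kinds are handled above");
  return "unknown";
}

struct FunctionHeader {
  bool ok = false;  // False when the tokens do not start a function.
  ToyFunctionKind kind = ToyFunctionKind::kNormalFunction;
  std::string name;
};

FunctionHeader ParseFunctionHeader(const std::vector<std::string>& tokens) {
  FunctionHeader header;
  std::size_t pos = 0;
  auto at = [&](const char* text) {
    return pos < tokens.size() && tokens[pos] == text;
  };

  // Step 1: meet the `function` token (optionally preceded by `async`).
  const bool is_async = at("async");
  if (is_async) ++pos;
  if (!at("function")) return header;
  ++pos;

  // Step 2: decide the kind from the async prefix and the `*` marker.
  const bool is_generator = at("*");
  if (is_generator) ++pos;
  if (is_async) {
    header.kind = is_generator ? ToyFunctionKind::kAsyncGeneratorFunction
                               : ToyFunctionKind::kAsyncFunction;
  } else {
    header.kind = is_generator ? ToyFunctionKind::kGeneratorFunction
                               : ToyFunctionKind::kNormalFunction;
  }

  // Step 3: take the next token as the function name.
  if (pos >= tokens.size() || tokens[pos].empty()) return header;
  header.name = tokens[pos];
  header.ok = true;
  return header;
}
```

Feeding it the tokens function, JsPrint, (, a, ) yields a normal function named JsPrint, which matches the walk-through above.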

2. Lazy Parser

What is a lazy parser? It is an optimization technique in V8: a function's code is not fully parsed until the function is about to be executed. Because of control flow, not all of the code in a script is actually executed. Based on this, V8 uses lazy parsing and lazy compilation to improve efficiency.
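The idea of deferring work until first use can be sketched like this. It is a toy model of the principle only, not of V8's mechanism: LazyFunction, ToyAst and FullParse are hypothetical, and the "parser" merely counts semicolons in the body text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the result of fully parsing a function body.
struct ToyAst {
  std::size_t statement_count = 0;
};

class LazyFunction {
 public:
  // Only the body's source text is stored up front; nothing is parsed yet.
  explicit LazyFunction(std::string body_source)
      : body_source_(std::move(body_source)) {}

  bool is_parsed() const { return parsed_; }
  int full_parse_count() const { return full_parse_count_; }

  // Called right before the function runs: parse on first use, then reuse.
  const ToyAst& EnsureParsed() {
    if (!parsed_) {
      ast_ = FullParse(body_source_);
      parsed_ = true;
      ++full_parse_count_;
    }
    assert(parsed_);
    return ast_;
  }

 private:
  // Toy "parser": counts semicolons as statements. A real parser would
  // not be fooled by semicolons inside string literals; this one would.
  static ToyAst FullParse(const std::string& body) {
    ToyAst ast;
    for (char c : body) {
      if (c == ';') ++ast.statement_count;
    }
    return ast;
  }

  std::string body_source_;
  ToyAst ast_;
  bool parsed_ = false;
  int full_parse_count_ = 0;
};
```

A function that is never called never pays for FullParse, and a function that is called many times pays for it once.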

In our case, JsPrint's FunctionKind is kNormalFunction. Whether it has to be parsed immediately depends on whether it is executed immediately, which we check below.

After analyzing the function name (JsPrint), ParseFunctionLiteral will be called, which is responsible for parsing the function body.

Figure 2 shows our test case. You can see that JsPrint is not executed immediately, so it meets the condition for lazy parsing. The JS code leads to the same conclusion: the console.log() statement runs first, and JsPrint is called only as part of it, so the condition for lazy parsing is satisfied.

Debugging the program is the best way to verify this conclusion. Step through ParseFunctionLiteral() and watch the values of is_lazy and is_top_level, and you will reach the same conclusion. Figure 3 shows the call stack of ParseFunctionLiteral.
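When watching those two values in the debugger, it helps to keep in mind the four combinations they can form. The sketch below only enumerates those combinations; it assumes is_lazy and is_top_level are simple booleans, and ToyParseMode, Classify and Describe are hypothetical names, not part of V8.

```cpp
#include <cassert>

// Hypothetical labels for the four combinations of the two flags.
enum class ToyParseMode {
  kEagerTopLevel,
  kLazyTopLevel,
  kEagerInner,
  kLazyInner
};

ToyParseMode Classify(bool is_lazy, bool is_top_level) {
  if (is_top_level) {
    return is_lazy ? ToyParseMode::kLazyTopLevel
                   : ToyParseMode::kEagerTopLevel;
  }
  return is_lazy ? ToyParseMode::kLazyInner : ToyParseMode::kEagerInner;
}

const char* Describe(ToyParseMode mode) {
  switch (mode) {
    case ToyParseMode::kEagerTopLevel:
      return "top-level function, body parsed now";
    case ToyParseMode::kLazyTopLevel:
      return "top-level function, body parsing deferred";
    case ToyParseMode::kEagerInner:
      return "inner function, body parsed now";
    case ToyParseMode::kLazyInner:
      return "inner function, body parsing deferred";
  }
  assert(false && "unreachable: all modes are handled above");
  return "unknown";
}
```

For a function where both values are true, Classify(true, true) gives kLazyTopLevel, the deferred case for a top-level function.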

Figure 4 shows the abstract syntax tree of JsPrint.

Okay, that wraps it up for this share. I’ll see you guys next time, take care!

Please reach out to me if you have any issues.

WeChat: qq9123013 Email: v8blink@outlook.com

This article was first published here.
