We’ve studied several zkEVM projects in the previously published article Exploring Popular zkEVM Solutions, among which the representative of native-based zkEVM is what is being researched by the AppliedZKP team of the Ethereum Foundation. The text below will guide you through the circuit code of the AppliedZKP zkEVM.
The code is still in the initial implementation, and many crates have not yet stabilized. Thus, all discussions in this article are subject to change in the future. Please always refer to the latest code if you are to do code development and code study. This text aims to provide an easy guide to the basic framework and logic of the AppliedZKP zkEVM code base. Hopefully, more developers will join us in the further research of zkEVM.
Code Base |
Path |
Branch |
Commit_ID |
Date |
---|---|---|---|---|
zkEVM-circuits |
main |
39a7bf7b30e25faa7db26a3cb94640240164542d |
2021-10-18 |
In the current code, there are three crates, namely bus-mapping, keccak256, and zkevm-circuits. The relationship among the three is shown in the figure below:
Also in Exploring Popular zkEVM Solutions, we pointed out that Bus-mapping is the state channel of evmcircuit and statecircuit in zkEVM. The devs organize the code in the way shown in the figure above. Let’s look at each crate in turn.
The objectives of bus-mapping are:
So the basic function is to process witness data, regardless of whether it is the path data executed by the EVM or the state data that changes before and after.
Let's look at each crate in bus-mapping specifically.
Basically, EVM defines ProgramCounter
, GlobalCounter
, EVMWord
, GasInfo
, and GasCost
include and contains data infrustractures like stack
, memory
, and opcodes
. It can be seen that pc
and gc
are encapsulation to usize
; EVMWord
is the encapsulation of the u8
array with a length of 32.
StackAddress
is usize, but Stack is a dynamic array of EVMWord
. MemoryAddress
is also usize, which should be between 0 and 1023 while Memory is an array of u8. Developers use Rust macro cleverly to extract key memory information, such as traits of index and range, and implement them in a unified way. Storage is a HashMap and its key values are all EVMWord
.
The opcodes crate defines the opcode traits of stack, memory, and storage, as well as the implementation of parts of the opcode traits.
pub trait Opcode: Debug {
/// Generate the associated [`MemoryOp`](crate::operation::MemoryOp)s,
/// [`StackOp`](crate::operation::StackOp)s, and
/// [`StorageOp`](crate::operation::StorageOp)s associated to the Opcode
/// is implemented for.
fn gen_associated_ops(
&self,
exec_step: &mut ExecutionStep,
container: &mut OperationContainer,
next_steps: &[ExecutionStep],
) -> Result<usize, Error>;
}
Let’s take mload
as an example to see how the trait is implemented. Here is the implementation of go-ethereum.
// core/vm/instructions.go
func opMload(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
v := scope.Stack.peek()
offset := int64(v.Uint64())
v.SetBytes(scope.Memory.GetPtr(offset, 32))
return nil, nil
}
There are three steps of mload:
1. Take the value of offset from the last position of the stack;
2. Take the value of [offset, offset+32)
bytes from memory;
3. Write bytes in the last position of the stack.
gen_associated_ops
of mload is to do these steps. For detailed code, please refer to
bus-mapping/src/EVM/opcodes/mload.rs.
This is the most important crate in bus-mapping, including BlockConstants
, ExecutionTrace
, and OperationRef
. BlockConstants
contains some constants of EVM, such as coinbase
, timestamp
, etc.
ExecutionTrace
is the core of bus-mapping, which is the interface form of witness. ExecutionStep
contains content used in every execution such as memory, stack, current instruction, and Operation-related bus_mapping_instance
. ExecutionStep
implements its own gen_associated_ops
function, which can call the corresponding gen_associated_ops
according to the current instruction.
pub struct ExecutionStep {
pub(crate) memory: Memory,
pub(crate) stack: Stack,
pub(crate) instruction: OpcodeId,
pub(crate) gas_info: GasInfo,
pub(crate) depth: u8,
pub(crate) pc: ProgramCounter,
pub(crate) gc: GlobalCounter,
// Holds refs to the container with the related mem ops.
pub(crate) bus_mapping_instance: Vec<OperationRef>,
}
ExecutionTrace
include all steps executed, and container includes each operation used in Trace. The most critical function in this structure is build, which is used to set gc
and generate a corresponding Operation for each step in the trace.
pub struct ExecutionTrace<F: FieldExt> {
steps: Vec<ExecutionStep>,
block_ctants: BlockConstants<F>,
container: OperationContainer,
}
The operation crate is relatively simple, defining RW
, Target
, MemoryOp
, StackOp
, StorageOp
, and Operation
, where RW
is of two types, i.e., read and write, and Target
includes Memory, Stack, and Storage. The encapsulation of Operation will specify what kind of Operation it is.
This crate requires the use of halo2 to implement the zkEVM circuits, which is also a more complex part of the implementation of zkEVM. It temporarily includes the following subcrates: evm_circuit
, gadget
, state_circuit
, and util
.
Util
is used for tool implementations, such as converting Rust's basic data types, such as bool
, u8
, etc., to the elements of a finite field.
Gadget
defines the basic tool chip. For example, evm_word
is a constraint check for 256 bit (or 32 bytes), is_zero
is to determine whether it is 0, and monotone
checks whether the advice column increases monotonically within a range. The implementation of these chip circuits follows the general architectural design. For more details, please refer to our previous publication Develop Circuits Using Halo2.
In zkEVM design specs, there are state proof and evm proof. State proof is used to help evm proof check whether all random read, write, and access records are effective. The records of random read and write are generated by state_circuit. State_circuit defines BusMapping and Config. From Develop Circuits Using Halo2, we can know that config is the most critical part to write the circuit of Halo2. It contains all the information on the lookup table needed for evm proof.
Take the code comments in state_circuit as an example. q_target
is the selector, where 1 represents the first line of each target, 2 represents memory, 3 represents stack, and 4 represents storage. As we can see, state is grouped by address first, and then sorted by global_counter
. Note that the address here is not the Ethereum address. For memory and stack, the address is the StackAddress
and MemoryAddress
mentioned in the previous section, namely usize, so the values in the example are 0 and 1. global_counter
is the counter, which will count different addresses separately. Value is the operation value, and flag marks read and write, where 1 is write, and 0 is read. padding is used to fill the gap between each target and has no practical meaning. storage_key
is exclusive to the storage operation, which is the key, and value_pre
is the value of the previous value in this target, which is 0 at the beginning.
/*
Example state table:
| q_target | address | global_counter | value | flag | padding | storage_key | value_prev |
-------------------------------------------------------------------------------------------
| 1 | 0 | 0 | 0 | 1 | 0 | | | // init row (write value 0)
| 2 | 0 | 12 | 12 | 1 | 0 | | |
| 2 | 0 | 24 | 12 | 0 | 0 | | |
| 2 | 1 | 0 | 0 | 1 | 0 | | | // init row (write value 0)
| 2 | 1 | 2 | 12 | 0 | 0 | | |
| 2 | | | | | 1 | | | // padding
| 2 | | | | | 1 | | | // padding
| 1 | 0 | 3 | 4 | 1 | 0 | | |
| 3 | 0 | 17 | 32 | 1 | 0 | | |
| 3 | 0 | 89 | 32 | 0 | 0 | | |
| 3 | 1 | 48 | 32 | 1 | 0 | | |
| 3 | 1 | 49 | 32 | 0 | 0 | | |
| 3 | | | | | 1 | | | // padding
| 1 | 1 | 55 | 32 | 1 | 0 | 5 | 0 | // first storage op at the new address has to be write
| 4 | 1 | 56 | 33 | 1 | 0 | 8 | 32 |
| 4 | | | | | 1 | | | // padding
*/
The evm_circuit
is the core of the entire projec, including the implementation of opcode, the definition of Constraint, the definition of EVM execution results, the element of each execution step (ExecutionStep), and the most important EVMCircuit.
For example, in the definition of Constraint, the name serves as an annotation. When the selector is a success, the gadget's selector - polys are the Constraint’s polynomials, and lookups are the lookup tables used by the Constraint. In the Halo2 circuit instruction manual, the constraints interface function in the OpGadget trait is an array that returns Constraint.
struct Constraint<F> {
name: &'static str,
selector: Expression<F>,
polys: Vec<Expression<F>>,
lookups: Vec<Lookup<F>>,
}
Lookup is divided into two categories. First, FixedLookup
, which an array of 3 Expression
s; second, BusMappingLookup
, which is an enum containing OpExecutionState
, Stack
, Memory
, AccountStorage
, etc.
pub(crate) enum Lookup<F> {
FixedLookup(FixedLookup, [Expression<F>; 3]),
BusMappingLookup(BusMappingLookup<F>),
}
The difference between CallField
and CallStateField
is that the former is used in the call of BusMappingLookup, and the latter is used in the OpExecutionState of BusMappingLookup.
The Cell
structure includes constraint expressions, columns, and the offset used by the selector.
pub(crate) struct Cell<F> {
// expression for constraint
expression: Expression<F>,
column: Column<Advice>,
// relative position to selector for synthesis
rotation: usize,
}
A word is represented by one expression and is positioned by an array of 32 cells.
struct Word<F> {
// random linear combination expression of cells
expression: Expression<F>,
// inner cells for synthesis
cells: [Cell<F>; 32],
}
The structure, CoreStateInstance
, is often used when writing opcode circuits and is used for context caching in the assign implementation in evm_circuit.
pub(crate) struct CoreStateInstance {
is_executing: bool,
global_counter: usize,
call_id: usize,
program_counter: usize,
stack_pointer: usize,
gas_counter: usize,
}
ExecutionStep
contains the opcode executed in this step, the execution result, and the value used. Although the name is the same as in the bus-mapping crate, its purpose and structure are different.
pub(crate) struct ExecutionStep {
opcode: OpcodeId,
case: Case,
values: Vec<BigUint>,
}
EVMCircuit
contains the selector, lookup table, read-write table, fixed table, and op execution gadget for each step.
struct EVMCircuit<F> {
q_step: Selector,
qs_byte_lookup: Column<Advice>,
fixed_table: [Column<Fixed>; 4],
rw_table: [Column<Advice>; 7],
op_execution_gadget: OpExecutionGadget<F>,
}
Currently, opcodes of arithmetic
, byte
, comparator
, dup
, pop
, and push
have been implemented. We take add/sub as an example to explain how to write an op circuit.
First, define the data structure required by op (when the operation is successful), which is also the operation data of the op in most cases. When it fails, no other data is needed, we just need to handle the corresponding case. For the AddSuccessAllocation
structure, the selector is used for activation in different cases. Because ADD
and SUB
are implemented together, swap is used to distinguish add or sub. a, b, c are used to represent the three values of the op operation, satisfying a + b == c. When sub, a + c == b.
struct AddSuccessAllocation<F> {
selector: Cell<F>,
swap: Cell<F>,
a: Word<F>,
b: Word<F>,
c: Word<F>,
carry: [Cell<F>; 32],
}
Second, define the corresponding Gadget structure, mainly the results of the op in different cases. For add/sub, there are three states: success, stackunderflow, and outofgas.
pub struct AddGadget<F> {
success: AddSuccessAllocation<F>,
stack_underflow: Cell<F>, // case selector
out_of_gas: (
Cell<F>, // case selector
Cell<F>, // gas available
),
}
The final step is to implement the OpGadget trait for the op. To begin with, define several case situations covered by this op, as well as the number of values, the number of cells, and whether or not to halt.
const RESPONSIBLE_OPCODES: &'static [OpcodeId] =
&[OpcodeId::ADD, OpcodeId::SUB]; // This operation implements opcodes ADD and SUB.
const CASE_CONFIGS: &'static [CaseConfig] = &[
CaseConfig {
case: Case::Success,
num_word: 3, // a + b + c
num_cell: 33, // 32 carry + swap
will_halt: false,
},
CaseConfig {
case: Case::StackUnderflow,
num_word: 0,
num_cell: 0,
will_halt: true,
},
CaseConfig {
case: Case::OutOfGas,
num_word: 0,
num_cell: 0,
will_halt: true,
},
]; // Several case situations by ADD and SUB
The construct()
function is relatively simple, which is to initialize our Gadget with case_allocations
. The constraints
function is a relatively critical interface, which needs the process of the current state and the next state to construct constraints.
fn constraints(
&self,
state_curr: &OpExecutionState<F>,
state_next: &OpExecutionState<F>,
) -> Vec<Constraint<F>> {
let OpExecutionState { opcode, .. } = &state_curr;
let common_polys = vec![
(opcode.expr() - OpcodeId::ADD.expr())
* (opcode.expr() - OpcodeId::SUB.expr()),
];
let success = {
// success constraint
};
let stack_underflow = {
// stack underflow constraint
};
let out_of_gas = {
// out_of_gas constraint
};
array::IntoIter::new([success, stack_underflow, out_of_gas])
.map(move |mut constraint| {
constraint.polys =
[common_polys.clone(), constraint.polys].concat();
constraint
})
.collect()
}
If you want to know how to use halo2 to develop circuits, please refer to Develop Circuits Using Halo2.
While we are studying the code, the crate of keccak256 is still in the construction stage. We will analyze it in the follow-up study.
As written in Sin7Y’s early articles, zkEVM is the brightest jewel in the crown of Ethereum scaling, and we look forward to more developers participating in this great project. Of course, as an open-source project, you may make your contribution in whatever way you like. Fixing a typo, improving documentation, or implementing a certain function are all helpful. If you are more familiar with the development of Halo2 and want to implement a certain opcode, you can take the above ADD/SUB as an example. The implementation process complies with writing the design specifications first. After the design specifications are merged by the core developers, you can write the opcode circuit code. Finally, don't forget to write the unit test code for the newly added opcode implementation.
ZkEVM Specifications https://github.com/appliedzkp/zkEVM-specs Circuits for zkEVM https://github.com/appliedzkp/zkEVM-circuits Develop Circuits Using Halo2 https://hackernoon.com/sin7y-tech-review-develop-circuits-using-halo-2 Exploring Popular zkEVM Solutions https://hackernoon.com/exploring-popular-zkEVM-solutions-appliedzkp-matter-labs-hermez-and-sin7y-quick-publication-ltq37ah
As a team of both vigorous and professional blockchain devs, Sin7Y has been doing constant research on zkEVM, layer 2, privacy computing, and cross-chain technology, delivering high-quality tech reviews on a regular basis. You can find us on Twitter or write to us at [email protected]. Feel free to contact us by DM or Email if you have any queries.