Note: To make it easy to understand, let’s ignore the features of zero-knowledge for a while.
We will express "lookup" by discussing the "subset argument" between two sets of data on one table (2^k rows, index starting from 0). As shown in the following figure, there are two sets of data, A and S, which are in two columns of the table, respectively.
The "subset argument" aims to prove that "each element in A appears in S, that is, A is a subset of S", which means that "all the elements in A must be contained in S, while all the elements in S may not be contained in A".
Take l_i be Lagrange baisi polynomial, that is, the value at row i is 1 while is 0 at the others.
Step1: Prover supply permutation columns of A and S
The permutation process is shown in the following figure
(code: halo2\src\plonk\lookup\prover.rs: permute_expression_pair
):
Step2: Constraints between A' and S'
Based on the above permutation results, it is required to make constraints to A' and S' (the black line in the figure) A′i = S′i or (the red line in the figure) A′i = A′i-1, that is:
(A′(X)- S′(X)).(A′(X) - A′(w^(-1).X)) = 0
In addition, it is required to make a constraint $A{'}$0 = $S{'}$0, that is
(A′(X) - S′(X)) ∙lo(X) = 0
To constrain that the permutations of A and S are A' and S′ respectively, it is needed to use following rules as the constraints:
Z(wX)∙(A'(X) + β)∙(S'(X) + γ)-Z(X)∙(A(X) + β)∙(S(X) + γ)=0
(1-Z(X))∙l_0(X) = 0
As is known, with d+1 points, a polynomial, of which the order is less than d+1, can be exclusively confirmed. Therefore, if the order of the polynomial is not enough, just with more queries than orders of the polynomial on different points, the verifier can get all data of the polynomial, losing the features of zero-knowledge. As a result, we need to scale the polynomial (Stark’s core concept) The principle is shown as follows (the polynomial value on y-axis indicates secret):
If the order of the original polynomial is 1, the verifier knows the values of the two points and the secret can be calculated;
If the order is greater than 1, then the verifier knows the values of the two points and any information about the secret cannot be revealed;
Therefore, to obtain the features of zero-knowledge, it requires to scale the original polynomial until its order is greater than the query times, so as to make sure that no information of the secret will be revealed.
In detail, for the table with a size of 2^k, if the size of A and S is less than 2^k, some random values need to be filled in it. But it must be noted that the filled random values, which don’t meet the requirements of the Lookup argument theory, need to be selected using the selector.
If the total number of rows is 2^k, in which the valid rows number is u while the invalid is t, it needs to meet u + t +1 = 2^k. The specific form is shown as follows:
There are two newly-added selectors:
q_blind, the last t lines, in which the value is 1 and others are 0
q_last, the value on row u is 1, and others are 0 (between usable rows and blind rows)
Then the above constraints need to be changed as follows:
(1 - (q_last(X) + q_blind(X)))∙(Z(wX)∙(A'(X) + β)∙(S'(X) + γ)-Z(X)∙(A(X) + β)∙(S(X) + γ))=0
(1 - (q_last(X) +q_blind(X) ))∙(A'(X) - S'(X)) ∙(A'(X) - A'(w^(-1)X)) = 0
The constraint to line 0 remains unchanged:
(A'(X) - S'(X)) ∙l_0(X) = 0
(1-Z(X))∙l_0(X) = 0
The constraint to line u:
(Z(X)^2-Z(X))∙q_last(X) = 0
According to the formula of Z(X) in polynomial, Z(u) = 1 (the value set of numerator and denominator is same). It is worth noting that this constraint, which constrains Z(u) within the range of {0,1}, is or quite small probability to be 0 (i.e., Ai + β = 0 or Si + γ = 0)
The above Lookup argument can be used for generalized conditions:
Multiple columns can be added to A and S. Compress with a random challenge, and the A' and S' can maintain one column (code: halo2\src\plonk\lookup\prover.rs:commit_permuted): a. The commit of each column in S can be calculated in advance, and according to the commit’s homomorphic features, the commit after compression can be easily calculated; b. Assuming that there is a M column, then compress_A = σm-1A0 + σm-2A1 +…+ A0, and A' = permuted (compress_A), and then the corresponding constraint becomes (code: halo2\src\plonk\lookup\prover .rs: commit_product):
(1 - (q_last(X) +q_blind(X) ))
∙(Z(wX)∙(A'(X) + β)∙(S'(X) + γ)-Z(X)
∙(∑iσ^(m-i). A_i(X) + β)∙(∑σ^(m-i)^ S_i~(X) + γ))=0
According to 1, for any number of lookup arguments, it can be realized with subset argument:
a. For example, constrain R (x,y, ...) on each row, which means that the R can be regarded as some sets of S, and then verify (x,y, ...) ϵ R.
b. In particular, if R is a function, it is required to run a validity check to the function input, and introducing a range check may be needed.
As well, it can support multiple tables by combining them into a large one and then corresponding to a tag selector, of which the method is similar to the technique of Chapter 4 and Chapter 5 in Plookup.