281 reads

Zero Knowledge Proofs - the Details are Hard

by DrmikeSeptember 12th, 2023

Too Long; Didn't Read

featured image - Zero Knowledge Proofs - the Details are Hard

Computing a proof of knowledge can be done in several ways. One of the more efficient methods is to use elliptic curve pairings. But to make pairings useful, there is a ton of mathematical detail used to make them work.

One of the details is called a “Quadratic Arithmetic Program” or QAP. The QAP is a graph structure with “gates” at vertices and “wires” on edges. Here’s an example:

This is taken from Elliptic Curve Cryptography for Developers. Each gate acts as a multiplier with two wires as inputs and one wire as output. Summations are “free”, they don’t require a gate. So the formula for wire $a_8$ is $a_8 = a_6(a_3 + a_4)$.

As you can see from the diagram, some of the wires are inputs and some are outputs. The idea behind using the QAP for zero knowledge is that several of the inputs are public and the rest are secret. The person who knows the secret can prove they can compute the QAP, and anyone who has access to the public data can verify the prover knows the secret.

The method for the proof is where the quadratic part of the arithmetic comes from. Every gate has a right hand input and a left hand input. Two functions are defined for every possible wire, a right hand function and a left hand function. The $a_j$’s in the diagram are coefficients to the QAP and they are used as the coefficients for the left and right functions. The sum of the coefficients times the right function is multiplied with the sum of the coefficients times the left function. The result is set equal to the sum of output wire coefficients times the output function plus all the cross terms. It’s easier to see this in math:

The main idea here is that most of the right and left functions are zero except where there are inputs to a gate. Since there is a function for every wire, all the output functions associated with a pure input wire are also zero. Since every gate has an output, there is one output function associated with each gate.

The last term in the equation is from all the cross terms which are not part of the QAP diagram. If you know all the coefficients, you can compute everything. The idea behind zero knowledge is that you don’t know all the coefficients. But you can verify that someone else does know if you are given a list of cross terms.

All these equations are computed over a finite field, typically a large prime number which is the order of the base point on an elliptic curve. The functions are called Lagrange interpolants, which require a prime number associated with each gate. The diagram shows $r_j$ values next to each gate which are these prime numbers.

Here’s the Lagrange interpolant for gate 0:

The value $x$ is any integer, but will be taken modulo the finite field size. For cryptographic numbers these will be in the range of $2^128$ to $2^256$ depending on the level of security. Values of $x$ which match the other gate’s prime numbers will result in zero. The value of $x = r_0$ will give one.

The right hand and left hand functions are created using the Lagrange interpolants of each gate. So let’s look at gate 3 which has $a_6$ on the left input and $a_3 + a_4$ on the right as well as $a_8$ on the output. Since $a_6$ is multiplied with $v_6(x)$, the function $v_6(x)$ uses the $l_3(x)$ interpolant function. Similarly, the $w_3(x)$ and $w_4(x)$ functions will also have $l_3(x)$ as one of their functions. From the graph we see that $a_3$ is also on the right side of the $r_1$ gate, so $w_3(x) = l_1(x) + l_3(x)$ .

If you are not completely confused at this point, you are not paying attention! The most confusing part for me when I first tried to write code to implement a QAP was how to create the right and left functions from each interpolant. There is an interpolant for each gate, and there are input as well as output wires to each gate. Each gate represents an equation in the QAP.

The right (or left) functions are associated with each wire. In this example, each wire goes to at most two gates, so each input function will have one or two Lagrange interpolants. In this particular example, half the functions are zero because the coefficients are not used at any gate. In larger QAP’s, there will be many more zero functions. That reduces the number of cross terms dramatically.

The output function is where the quadratic portion comes in. The multiplication of $a_6 v_6(x)$ with $(a_3 w_3(x) + a_4 w_4(x)) gives $a_6 l_3(x) ( a_3(l_1(x) + l_3(x)) + a_4 l_3(x)). When $x = r_3$, the function $l_1(r_3) = 0) and we get the formula for gate 3 with $a_6(a_3 + a_4) * l_3^2(x)$. The output will be correct as long as $a_8 * y_8(x)$ is the same result. We know the coefficients match, but the function for $y_8(x)$ has to be the square of the interpolant for gate 3. Setting $y_8(x) = l_3^2(x)$ then gives us the right answer when $x = r_3$.

The whole point of a zero knowledge proof is to compute things without knowledge of the coefficients. So when creating the proof, the value of $x$ is chosen to be far from the gate values. In the above example, the term $a_6 a_1 l_3(x) l_1(x)$ is then present in the calculation. This will be one of the terms in the $h(x) t(x)$ factor in the QAP equation.

The QAP formula are factors used to multiply points on an elliptic curve. The secret components are used by the prover along with random values to ensure security. The prover creates a common reference string of points associated with each record of inputs. They also compute a proof value which is a set of points on an elliptic curve.

The verifier knows a few of the inputs to each record. They also have access to the common reference string of points created by the prover. The construction of the zero knowledge proof allows the verifier to use the inputs they know along with the common reference string points to create a set of points which can be paired over an elliptic curve. If the pairing computations combine to match the proof output pairing created by the prover, all is well. If any bit is wrong, the entire computation will not match, and the verification will fail.

It’s obvious I’ve left off a lot of detail in the last few paragraphs. In fact, this article barely scratches the surface of what’s involved in creating code for a zero knowledge proof. So I’d like to point you to the book Elliptic Curve Cryptography for Developers which I’m in the process of writing. If you use the code mlrosing2 before 2023 Sept. 13 you get 45% off AND you can give me feedback to help improve the book. I’d really appreciate your input.