Let G be a cyclic group, P_i ∈ G is the element in the group, a_i ∈ Z is the scalar, and the Multi-Scalar Multiplication (MSM) is the algorithm for calculating ∑_{i=1}^{n} a_{i} P_{i} , the sum of multiple scalar multiplications. As the group operation is more complicated than the addition and multiplication of elements in the finite field, the MSM algorithm is employed to reduce the number of times of group operation as much as possible. Usually, G is a cyclic group defined on the elliptic curve of y^2=x^3+ax+b on the finite field of F_p, with the order |G| being b ≈ 256 bits, and 0 ≤ a_i< 2^b. If the group operation is conducted by the plain algorithm of fast exponentiation, the number of group operations needed for every a_i.P_i is 1.5bn times on average, and for all a_i.P_i is 1.5b times in total. Next, we will introduce some optimization methods corresponding to larger n.
We can split the b bit scalar into c width windows: reduce the scalar to 2^c base system.
Here, Q_j=∑_{i=1}^{n} a_{i j} P_{i}, so we reduce the original algorithm issue of b-bit MSM into a smaller c-bit issue. We can put the same scalars from ∑_{i=1}^{n} a_{i j} P_{i} into a bucket and write them in another form:
For example, when c = 3, it is calculated as:
When calculating ∑kS_k, set the partial sum T_l=S_l+⋯+S_k, then
However, the operation results of T_l can be obtained through the recurrence sequence T_l=T_l+1+S_l. It means that only one group operation is needed for the operation for each T_l. Under the MSM algorithm with cc-bit scalar, all S_k requires n−2^c+1 times of group operation. All T_k requires 2^c−1 times and the sum of these T_k requires 2^c−2 times of group operation. Therefore, each window with the value of c requires n+2^c−2 times of group operations. The operation of ∑_{j=0}^{b / c-1} Q_{j} 2^{j c} only requires the normal method of fast exponentiation by (b/c−1)(c+1)=b − c + b/c−1 times of group operation.
In summary, we can conclude that the total number of group operations required by the method of the Windowing Technique of a width of c is
Set c = logn, then the number of group operations is 2bn/logn. When n≈10^5 , the number of group operations reduces to 1/12 of its original value.
For the cyclic group G on the elliptic curve y^2=x^3+ax+b on the finite field F_p, if the following group endomorphism of φ can be found: there exists α,β ∈ F_p, satisfying φ(x,y)=(αx,βy) holds for all the points on G, then it is easy to prove that such an endomorphism is a multiplication map, namely there is a λ making φ(P)=λP hold for all the points of P on G, which means that whenever we know the coordinates of one point, we can change it to the coordinates of another point simply by multiplying a number in F_p to both the values of abscissa and ordinate. The algorithm can be further optimized based on this important property.
When λ = −1, α = 1, β = −1. The reverse value of one point can be obtained only by taking the opposite number from the ordinate. Based on this feature, in the plain fast exponentiation algorithm, the scalar can be written into the non-adjacent form (NAF), namely ∑e_{i}.2^{i}, e_{i} ∈ {−1,0,1} and any two adjacent e_i cannot be non-zero at the same time. Compared to the b-bit scalars, the group operation number in the fast exponentiation algorithm can be reduced from the original average of 3/2⋅b times to 4/3⋅b times. This technique can also be used to optimize the Windowing Technique. After writing a_i to a_i=∑_{j=0}^{b / c-1} a_{i, j} 2^{j c}, 0 <= a_{i, j}<2^{c}
conduct the following operation:
The result of a_i=∑_{j=0}^{b / c-1} a’_{i, j}^2^{j c} and -2^{c-1} <= a’_{i, j}^ <= 2^{c-1} can be obtained. Since the complexity of addition and subtraction in group operation of an elliptic curve is the same, we can put these terms into a bucket according to the absolute value of scalars. Therefore, the number of groups can be reduced from the original 2c to 2^c , and the overall number of group operations required is reduced from (1) to:
If the parameters of the elliptic curve are special, for example, the BLS curve can be written as y^2=x^3+b, and p ≡ 1( mod3 ), take third-order element α from F_p^x, then there is a corresponding λ that holds λ(x,y)=(αx,y)
Algorithm 1: Signed Digits
In addition, λ3 ≡ 1(mod |G| ). Then the multiplication operation can be optimized as follows:
From [1], we can find a small enough scalar |m1|, |m2|<√G to make the above equation true, which gives us a way to reduce the multiplication of bb bits into the multiplication of two b/2 bits. Use it into the Windowing Technique, and the number of group operations is reduced to
In summary, both the above two optimization methods can reduce the number of group operations by 5.5% when n=10^5.
[1] Francesco Sica, Mathieu Ciet, and Jean-Jacques Quisquater. Analysis of the gallant-lambert-vanstone method based on efficient endomorphisms: Elliptic and hyperelliptic curves. In International Workshop on Selected Areas in Cryptography, pages 21–36. Springer, 2002.