Modern data availability schemes rely on sophisticated mathematical techniques to efficiently encode, distribute, and verify large volumes of data. In Ethereum’s PeerDAS (EIP-7594) design, this process is powered by polynomial mathematics and the Fast Fourier Transform (FFT). This article provides a clear and intuitive explanation of how raw blob data—initially represented as fixed-size byte chunks—is converted into field elements, interpreted as evaluations of a polynomial, and then extended into parity data using FFT-based polynomial interpolation. We explore how roots of unity define the evaluation domain, how polynomials are reconstructed and extended in the frequency domain, and how these mechanisms enable efficient data availability sampling and recovery. By connecting abstract mathematical concepts with concrete implementation details from the EIP-7594 specification, this article demystifies how blob encoding works under the hood and clarifies the critical role of polynomials, finite fields, and FFTs in modern data availability systems. KZG Commitments and Proofs KZG Commitments and Proofs Before diving into how blob data is encoded, it is important to understand KZG commitments, which form the cryptographic backbone of the system. KZG commitments A KZG commitment is a way to “lock in” a polynomial so that its contents cannot be changed later, while still allowing anyone to verify specific evaluations of that polynomial. You can think of it like sealing a document in a tamper-proof envelope: once sealed, the contents are fixed, but you can still prove that a particular statement about the contents is true without revealing everything inside. KZG commitment In practice, a polynomial (representing blob data) is committed to using cryptographic pairings. Later, a prover can generate a KZG proof showing that the polynomial evaluates to a specific value at a given point. The verifier can check this proof efficiently without needing the entire polynomial. This property is essential for data availability, because it allows light clients to verify small pieces of large blobs. KZG proof Erasure Coding Erasure Coding Erasure coding is a method used to protect data against loss. Instead of storing a file as-is, the data is first split into pieces, and then extra pieces (parity data) are generated in such a way that the original data can still be recovered even if some pieces are missing. split into pieces extra pieces (parity data) In the context of Ethereum blobs, erasure coding allows the network to recover missing parts of a blob even if only a subset of the data is available. This is essential for data availability, because not every node needs to store or download the entire blob. How Lagrange Interpolation Fits In How Lagrange Interpolation Fits In Lagrange interpolation is the mathematical tool that makes this possible. It provides a way to reconstruct the unique polynomial that passes through a given set of points. In practice: The blob’s field elements define known points on the polynomial. Lagrange interpolation reconstructs the polynomial from those points. FFT is then used to efficiently evaluate or extend that polynomial. The blob’s field elements define known points on the polynomial. Lagrange interpolation reconstructs the polynomial from those points. FFT is then used to efficiently evaluate or extend that polynomial. Together, these steps allow Ethereum to: Encode data efficiently Recover missing pieces Verify correctness cryptographically Encode data efficiently Recover missing pieces Verify correctness cryptographically Fast Fourier Transform (FFT) Fast Fourier Transform (FFT) The world of algorithms can be thought of in two broad groups. Some algorithms are extremely practical and show up everywhere, like graph algorithms such as BFS or DFS. These are tools we rely on constantly because they solve real problems efficiently. Then there is another group of algorithms that are beautiful not just because they are useful, but because of how elegant and clever they are. These algorithms may not always be used directly in everyday coding, but they inspire deep understanding and creativity. One such example is the Fast Fourier Transform, or FFT. The FFT is one of the most important algorithms ever created. It plays a key role in many modern technologies such as signal processing, image compression, wireless communication, and even blockchain systems. What makes FFT special is not just its usefulness, but the clever mathematical ideas behind it. At first, it may look complicated, especially because it involves concepts like polynomials and frequency transformations. However, at its core, FFT is simply a smart way to evaluate and manipulate polynomials much faster than the naive approach. By breaking a big problem into smaller ones and reusing computations, FFT turns a slow process into a very fast one. This combination of beauty, efficiency, and deep mathematical insight is why FFT is considered one of the greatest algorithms ever created. Let’s start with a simple idea. Suppose we have two polynomials and we want to multiply them. The most straightforward way to do this is the method everyone learns in school: multiply every term of the first polynomial with every term of the second one and then add everything together. While this works, it becomes very slow when the polynomials get large, because the number of operations grows very quickly (quadratically). This is why we look for a smarter way to do polynomial multiplication. To understand a better approach, we first need to think about how a polynomial can be represented. The most common way is the coefficient form, where each number in a list represents the coefficient of a power of x. For example, a polynomial like3 + 2x + 5x² is represented as the list [3, 2, 5]. This representation is simple, but multiplying two such lists directly is expensive, because every coefficient in one polynomial must be multiplied with every coefficient in the other. coefficient form 3 + 2x + 5x² [3, 2, 5] Now comes an important idea: a polynomial does not have to be represented only by its coefficients. There is another equally valid way to represent a polynomial — by its values at specific points. For example, if we know the value of a polynomial at several different x-values, we can uniquely reconstruct the original polynomial. In fact, a polynomial of degree d is completely determined by its values at d + 1 distinct points. This is a fundamental idea from algebra. values at specific points d d + 1 This leads to a powerful insight: instead of multiplying polynomials using their coefficients, we can convert them into their value representations. Once both polynomials are expressed as values at the same set of points, multiplying them becomes extremely easy — we just multiply the corresponding values point-by-point. After that, we convert the result back into coefficient form. This reduces a difficult multiplication problem into simpler steps. The challenge, however, is choosing the right points at which to evaluate the polynomial. If we pick them randomly, the conversion process becomes slow again. This is where the Fast Fourier Transform (FFT) comes in. FFT cleverly chooses special points called roots of unity. These points have special mathematical properties that allow the conversion between coefficient form and value form to be done extremely efficiently - in O(n log n) time instead of O(n²). roots of unity O(n log n) O(n²) In simple terms, the FFT takes a polynomial written in coefficient form and evaluates it at carefully chosen points (the roots of unity). These points are chosen so that the work can be split recursively into smaller and smaller pieces, making the computation fast. Once the values are computed, multiplying two polynomials becomes a matter of multiplying corresponding values. After that, an inverse FFT is used to convert the result back into coefficient form. This idea — switching between coefficient form and value form using FFT — is the core trick behind fast polynomial multiplication. It is also the same idea used in many real-world systems, including signal processing, image compression, and modern blockchain data availability schemes. What makes it beautiful is that a deep mathematical insight leads directly to a huge practical speedup, turning an otherwise slow algorithm into one of the most powerful tools in computing. To make this idea work recursively, we need points where squaring them preserves the same structure. This leads us to a special set of numbers called roots of unity. These are complex numbers that, when raised to a certain power, equal 1. For example, the 4th roots of unity are 1, −1, i, and −i. They are evenly spaced around the unit circle and have beautiful symmetry properties. roots of unity When we evaluate a polynomial at these roots of unity, something powerful happens. We can split the polynomial into its even and odd parts, evaluate each part separately, and then combine the results efficiently. This divide-and-conquer process cuts the work in half at every step, leading to a total running time of O(n log n) instead of O(n²). O(n log n) O(n²) Suppose we have a polynomial P(x) = a0 + a1·x + a2·x² + a3·x³ + ... + an·xⁿ P(x) = a0 + a1·x + a2·x² + a3·x³ + ... + an·xⁿ At first glance, this looks like one big expression. But we can rewrite it in a clever way by grouping together the even-power terms and the odd-power terms. That means we rewrite the polynomial as: P(x) = (a0 + a2·x^2 + a4·x^4 + …) + x · (a1 + a3·x^2 + a5·x^4 + …) P(x) = (a0 + a2·x^2 + a4·x^4 + …) + x · (a1 + a3·x^2 + a5·x^4 + …) Now something important happens. The part inside the first parentheses is a polynomial in x², and the part inside the second parentheses is also a polynomial in x².So we can rewrite the whole expression as: x² x² P(x) = P_even(x^2) + x * P_odd(x^2) P(x) = P_even(x^2) + x * P_odd(x^2) Instead of evaluating one big polynomial of degree n, we now only need to evaluate two smaller polynomials of size n/2, and then combine their results. n two smaller polynomials of size n/2 Why this helps Why this helps Now imagine we want to evaluate the polynomial at some value x. x We compute: P_even(x^2) P_odd(x^2) P_even(x^2) P_odd(x^2) Then we combine them: P(x) = P_even(x^2) + x * P_odd(x^2) P(x) = P_even(x^2) + x * P_odd(x^2) This already cuts the work in half.But the real magic happens when we evaluate the polynomial atspecial values of x. special values of x Why roots of unity matter Why roots of unity matter Suppose we choose values of xxx such that: x^2 = y and (-x)^2 = y x^2 = y and (-x)^2 = y This means both x and −x map to the same y. x −x y Now look what happens: P(X) = P_even(y) + x * P_odd(y) P(-X) = P_even(y) - x * P_odd(y) P(X) = P_even(y) + x * P_odd(y) P(-X) = P_even(y) - x * P_odd(y) So by computing just one pair of values P_even(y) and P_odd(y), we immediately get two outputs: one of x and one for -x just one pair of values P_even(y) P_odd(y) x -x Why roots of unity are perfect for FFT Why roots of unity are perfect for FFT To understand why roots of unity are used in the Fast Fourier Transform, we first need to understand what they actually are. A root of unity is a number that becomes 1 when raised to a certain power. For example, if we say “the 4th roots of unity,” we mean all numbers that satisfy: root of unity 1 when raised to a certain power x^4 = 1 x^4 = 1 There are exactly four such numbers. These are: 1, -1, i, -i 1, -1, i, -i These values lie evenly spaced on a circle in the complex plane. That even spacing is the key property that makes them so powerful. What does ω (omega) mean? What does ω (omega) mean? In FFT, we usually pick one special root of unity, called ω (omega). one special root of unity ω (omega) If we want n roots of unity, we define: n ω = e^(2πi / n) ω = e^(2πi / n) This means: ω⁰ = 1 ω¹ = e^(2πi / n) ω² = e^(4πi / n) ... ωⁿ = 1 again ω⁰ = 1 ω¹ = e^(2πi / n) ω² = e^(4πi / n) ... ωⁿ = 1 again So the sequence repeats perfectly after n steps. n These values form a cycle, evenly spaced around a circle. This regular structure is exactly what FFT needs. cycle Why roots of unity are special for FFT Why roots of unity are special for FFT FFT works by repeatedly breaking a problem into smaller subproblems. For this to work efficiently, the evaluation points must have a special property: When you square a root of unity, you still get another root of unity. smaller subproblems If ω is an 8th root of unity, then ω² is a 4th root of unity, and ω⁴ is a 2nd root of unity. If ω is an 8th root of unity, then ω² is a 4th root of unity, and ω⁴ is a 2nd root of unity. This means: • At the top level, you evaluate the polynomial at 8 points• At the next level, those same points naturally group into 4• Then into 2• Then into 1 This is exactly what the recursive FFT needs. Why this matters for splitting the polynomial Why this matters for splitting the polynomial When we split a polynomial into even and odd parts: P(x) = P_even(x²) + x · P_odd(x²) P(x) = P_even(x²) + x · P_odd(x²) we must evaluate both parts at the same squared input. same Roots of unity make this possible because: If x is a root of unity, then x² is also a root of unity. The structure stays consistent at every recursive step. If x is a root of unity, then x² is also a root of unity. The structure stays consistent at every recursive step. This is why FFT works so efficiently — every recursive level uses the same mathematical structure, just with fewer points. same mathematical structure Why not use random numbers? Why not use random numbers? If we picked random x values: Squaring them would not give us another useful set of evaluation points. The recursive structure would break. The algorithm would fall back to slow O(n²) behavior. Squaring them would not give us another useful set of evaluation points. The recursive structure would break. The algorithm would fall back to slow O(n²) behavior. Roots of unity are special because they preserve structure under squaring, which is exactly what FFT needs to reduce the problem size again and again. preserve structure under squaring The diagram illustrates how the Fast Fourier Transform (FFT) evaluates a polynomial efficiently by breaking it into smaller pieces using symmetry in roots of unity. We start with a polynomial P(x) = [p_0, p_1, …., p_(n-1)] and instead of evaluating it directly at many points, we evaluate it at special values called roots of unity, written as ω^0,ω^1,…,ω^n−1 where ω=e^(2πi/n). These roots are evenly spaced points on the unit circle and have special algebraic properties that make recursive computation possible. P(x) = [p_0, p_1, …., p_(n-1)] ω^0,ω^1,…,ω^n−1 ω=e^(2πi/n) The key idea shown in the diagram is splitting the polynomial into two smaller polynomials: one containing the even-indexed coefficients and the other containing the odd-indexed coefficients. This gives us two new polynomials, P_e(x) and P_o(x). The original polynomial can then be written as P(x) = P_e(x^2)+ x * P_o(x^2). This transformation is powerful because instead of evaluating one large polynomial, we now evaluate two smaller ones at squared input values. P_e(x) and P_o(x) P(x) = P_e(x^2)+ x * P_o(x^2) The FFT uses this idea recursively. The even and odd polynomials are evaluated at the squared roots of unity, which themselves form a smaller set of roots of unity. This means the same trick can be applied again and again until the problem becomes trivial. At each level, the results are combined using the formulas P(w^j) = P_e(w^2j) + w^j * P_o(w^2j) & P(-w^j) = P_e(w^2j) - w^j * P_o(w^2j) P(w^j) = P_e(w^2j) + w^j * P_o(w^2j) & P(-w^j) = P_e(w^2j) - w^j * P_o(w^2j) This structure is what allows the FFT to reduce a problem of size n into two problems of size n/2, leading to a total runtime of O(n log n). The diagram visually captures this recursive splitting and recombination process, showing how a polynomial is efficiently evaluated at all roots of unity by repeatedly dividing the work into smaller, manageable pieces. n n/2 all its powers 𝜔^0 , 𝜔^1 , 𝜔^2 , … , 𝜔^(𝑛 − 1) are distinct points evenly spaced around the unit circle in the complex plane. So when we say x_j = w^j we are simply saying “the j-th evaluation point is the j-th root of unity”. Why does −ω^j = ω^(j+n/2) ? This comes directly from how roots of unity are arranged on the circle. Since ω = e^(2πi/n), then: ω^(n/2) = e^(πi) = -1 now multiply both side by ω^j: ω^(j+n/2) = ω^j * (-1) = -ω^j So geometrically, multiplying by ω^(n/2) rotates a point halfway around the circle — exactly to its opposite side. That is why: −ω^j = ω^(j+n/2) x_j = w^j the j-th evaluation point is the j-th root of unity −ω^j = ω^(j+n/2) ω = e^(2πi/n) ω^(n/2) = e^(πi) = -1 ω^j ω^(j+n/2) = ω^j * (-1) = -ω^j ω^(n/2) −ω^j = ω^(j+n/2) When we write y_e[j] = P_e(ω^2j), we are simply giving a name to a value we have already computed so that we don’t have to recompute it again. Think of it as storing a result in a table for later use. The polynomial P_e(x) represents the “even part” of the original polynomial, and evaluating it at x = ω^(2j) is something we do repeatedly when building the full FFT result. Instead of recalculating P_e(ω^2j) every time it appears, we compute it once, store it in an array (called y_e), and then reuse it wherever needed. This is exactly the same idea as dynamic programming: compute a smaller subproblem once, save the answer, and reuse it to build larger results efficiently. y_e[j] = P_e(ω^2j) name to a value we have already computed P_e(x) x = ω^(2j) P_e(ω^2j) y_e The same logic applies to y_o[j] = P_o(ω^(2j)). Here, P_o is the polynomial made from the odd-indexed coefficients, and we evaluate it at the same squared roots of unity. Once we have both y_e[j] and y_o[j], we can combine them to compute the full polynomial values using the formulas y_o[j] = P_o(ω^(2j)) P_o y_e[j] y_o[j] P(ω^j) = y_e[j] + ω^j * y_o[j] & P(ω^(j+n/2)) = y_e[j] − ω^j * y_o[j] P(ω^j) = y_e[j] + ω^j * y_o[j] & P(ω^(j+n/2)) = y_e[j] − ω^j * y_o[j] So the reason we “replace” P_e(ω^2j) with y_e[j] is not a mathematical trick, but a practical one: it avoids repeated work. We compute each smaller polynomial value once, store it, and reuse it efficiently. This reuse of previously computed results is exactly what makes FFT fast, reducing the complexity from quadratic time to O(n*logn) P_e(ω^2j) y_e[j] FFT algorithm uses a clever divide-and-conquer strategy based on mathematical symmetry. The process begins with a polynomial written as a list of coefficients. If the polynomial has only one coefficient, then its value is already known and no further work is needed. This is the base case. If the polynomial has more than one coefficient, it is split into two smaller polynomials: one formed from the coefficients at even positions, and the other from the coefficients at odd positions. This separation allows the problem to be reduced in size while preserving all the information needed to reconstruct the final result. Each of these smaller polynomials is then processed recursively in the same way. The algorithm evaluates them at special points called roots of unity. These points have a special mathematical property that allows values computed for one part of the polynomial to be reused for another part, which is what makes the algorithm fast. Because of this symmetry, two values can be computed at once from a single pair of results, dramatically reducing the total amount of work. To make this recursive process work cleanly, the number of coefficients must be a power of two. If the polynomial does not naturally have a length that is a power of two, we simply add extra zero coefficients at the end. This does not change the polynomial’s value but ensures the recursion can split evenly at every step. This padding step is not a mathematical trick—it is a practical requirement that allows the algorithm to work efficiently and correctly. As the recursion unfolds, the algorithm combines results using simple additions and multiplications with roots of unity. Each level builds on the previous one until all values of the polynomial at the chosen points are computed. By the end, the polynomial has been evaluated at all required points using far fewer operations than a naive approach. One important detail that appears at the very end of the FFT process is the reordering of the output values. When the recursive FFT finishes, the values are not yet in the natural order you would expect (like P(ω⁰), P(ω¹), P(ω²), …). Instead, they appear in a special scrambled order called bit-reversed order. This happens because of how the recursion splits the problem. At each level, the algorithm separates even-indexed and odd-indexed elements. This repeated splitting corresponds to reading the index numbers in binary and reversing their bits. So the position where a value ends up depends on the binary representation of its index, reversed. Let’s see an example how for an equation value’s at root of unity are been calculated using FFT for polynomial below: P(x) = 1 + 5x + 7x² + x³ + 2x⁴ + 3x⁵ P(x) = 1 + 5x + 7x² + x³ + 2x⁴ + 3x⁵ The algorithm splits the polynomial into even- and odd-indexed coefficients. Since our polynomial has only six coefficients, we first pad it with zeros to reach the nearest power of two. This ensures the algorithm can work correctly. The splitting process is then applied recursively, dividing the polynomial into smaller parts until each subproblem contains only a single value. FFT Recursive Decomposition Tree (Coefficient Split) Level 0 (n = 8) ------------------------------------------------- [ 1, 5, 7, 1, 2, 3, 0, 0 ] | | | | v v Level 1 (n = 4) ---------------------------------------------------- Even indices: Odd indices: [ 1, 7, 2, 0 ] [ 5, 1, 3, 0 ] | | | | | | | | v v v v Level 2 (n = 2) ---------------------------------------------------- Even indices: Odd indices: Even indices: Odd indices: [1, 2] [7, 0] [5, 3] [1, 0] | | | | | | | | | | | | | | | | v v v v v v v v Level 3 (n = 1) ← Base Case -------------------------------- Even Odd Even Odd Even Odd Even Odd [1] [2] [7] [0] [5] [3] [1] [0] Level 0 (n = 8) ------------------------------------------------- [ 1, 5, 7, 1, 2, 3, 0, 0 ] | | | | v v Level 1 (n = 4) ---------------------------------------------------- Even indices: Odd indices: [ 1, 7, 2, 0 ] [ 5, 1, 3, 0 ] | | | | | | | | v v v v Level 2 (n = 2) ---------------------------------------------------- Even indices: Odd indices: Even indices: Odd indices: [1, 2] [7, 0] [5, 3] [1, 0] | | | | | | | | | | | | | | | | v v v v v v v v Level 3 (n = 1) ← Base Case -------------------------------- Even Odd Even Odd Even Odd Even Odd [1] [2] [7] [0] [5] [3] [1] [0] let’s do calculation steps by step from base case where recursion call ends real calculation starts on Tree (Coefficient Split) Level 3 (n = 1) ← Base Case -------------------------------- Even Odd Even Odd Even Odd Even Odd [1] [2] [7] [0] [5] [3] [1] [0] Level 2 (n = 2) ---------------------------------------------------- Even indices: Odd indices: Even indices: Odd indices: [1, 2] [7, 0] [5, 3] [1, 0] Y = [3, -1] Y = [7, 7] Y = [8, 2] Y = [1, 1] we now know that at level 2, n=2 ω^j = e^(2πij/n) j expression value 0 ω^0 = e^0 1 1 ω^1 = e^πi -1 [1, 2]: y_e = [1] & y_o = [2] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 1 + 1*2 = 3 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 1 - 1*2 = -1 Y = [3, -1] [7, 0]: y_e = [7] & y_o = [0] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 7 + 1*0 = 7 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 7 - 1*0 = 7 Y = [7, 7] [5, 3]: y_e = [5] & y_o = [3] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 5 + 1*3 = 8 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 5 - 1*3 = 2 Y = [8, 2] [1, 0]: y_e = [1] & y_o = [0] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 1 + 1*0 = 1 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 1 - 1*0 = 1 Y = [1, 1] Level 1 (n = 4) ---------------------------------------------------- Even indices: Odd indices: [ 1, 7, 2, 0 ] [ 5, 1, 3, 0 ] Y = [10, -1+7i, -4, -1-7i] Y = [9, 2+i, 7, 2-i] we now know that at level 1, n=4 ω^j = e^(2πij/n) j expression value 0 ω^0 = e^0 1 1 ω^1 = e^πi/2 i 2 ω^2 = e^πi -1 3 ω^3 = e^3πi/2 -i [1, 7, 2, 0]: y_e = [3, -1] & y_o = [7,7] n/2 - 1 = 4/2 -1 = 1 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 3 + 1*7 = 10 Y[0+4/2] = Y[2] = y_e[0] - ω^0 * y_o[0] = 3 - 1*7 = -4 at j = 1 Y[1] = y_e[1] + ω^1 * y_o[1] = -1 + i*7 = -1+7i Y[1+4/2] = Y[3] = y_e[1] - ω^1 * y_o[1] = -1 - i*7 = -1-7i Y = [10, -1+7i, -4, -1-7i] [5, 1, 3, 0]: y_e = [8, 2] & y_o = [1, 1] n/2 - 1 = 4/2 -1 = 1 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 8 + 1*1 = 9 Y[0+4/2] = Y[2] = y_e[0] - ω^0 * y_o[0] = 8 - 1*1 = 7 at j = 1 Y[1] = y_e[1] + ω^1 * y_o[1] = 2 + i*1 = 2+i Y[1+4/2] = Y[3] = y_e[1] - ω^1 * y_o[1] = 2 - i*1 = 2-i Y = [9, 2+i, 7, 2-i] Level 0 (n = 8) ------------------------------------------------- [ 1, 5, 7, 1, 2, 3, 0, 0 ] Y = [19, -0.293 + 9.12*i,-4+7i, -1.707-4.879*i, 1, -1.707 + 4.879*i, -4-7i, -0.293 - 9.121*i] we now know that' at level 0, n=8 ω^j = e^(2πij/n) j expression value 0 ω^0 = e^0 1 1 ω^1 = e^πi/4 (√2/2) + i(√2/2) 2 ω^2 = e^πi/2 i 3 ω^3 = e^3πi/4 -(√2/2) + i(√2/2) 4 ω^4 = e^πi -1 5 ω^5 = e^5πi/4 −(√2/2) − i(√2/2) 6 ω^6 = e^3πi/2 -i 7 ω^7 = e^7πi/4 (√2/2) − i(√2/2) [1, 5, 7, 1, 2, 3, 0, 0]: y_e = [10, -1+7i, -4, -1-7i] & y_o = [9, 2+i, 7, 2-i] n/2 - 1 = 8/2 -1 = 3 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 10 + 1*9 = 19 Y[0+8/2] = Y[4] = y_e[0] - ω^0 * y_o[0] = 10 - 1*9 = 1 at j = 1 Y[1] = y_e[1] + ω^1 * y_o[1] = (-1+7i) + ((√2/2) + i(√2/2))*(2+i) = (-1+(1/√2) + (3/√2+7)i Y[1+8/2] = Y[5] = y_e[1] - ω^1 * y_o[1] = (-1+7i) - ((√2/2) + i(√2/2))*(2+i) = -(1+(1/√2)) + (7-(3/√2))i at j = 2 Y[2] = y_e[2] + ω^2 * y_o[2] = -4 + i*7 = -4+7i Y[2+8/2] = Y[6] = y_e[2] - ω^2 * y_o[2] = -4 - i*7 = -4-7i at j = 3 Y[3] = y_e[3] + ω^3 * y_o[3] = (-1-7i) + (-(√2/2) + i(√2/2))*(2-i) = (-1-(1/√2)) + ((3/√2)-7)i Y[3+8/2] = Y[7] = y_e[3] - ω^3 * y_o[3] = (-1-7i) - (-(√2/2) + i(√2/2))*(2-i) = (-1+(1/√2)) - ((3/√2)+7)i Y = [19, (-1+(1/√2) + (3/√2+7)i, -4+7i, (-1-(1/√2)) + ((3/√2)-7)i, 1, -(1+(1/√2)) + (7-(3/√2))i, -4-7i, (-1+(1/√2)) - ((3/√2)+7)i ] Level 3 (n = 1) ← Base Case -------------------------------- Even Odd Even Odd Even Odd Even Odd [1] [2] [7] [0] [5] [3] [1] [0] Level 2 (n = 2) ---------------------------------------------------- Even indices: Odd indices: Even indices: Odd indices: [1, 2] [7, 0] [5, 3] [1, 0] Y = [3, -1] Y = [7, 7] Y = [8, 2] Y = [1, 1] we now know that at level 2, n=2 ω^j = e^(2πij/n) j expression value 0 ω^0 = e^0 1 1 ω^1 = e^πi -1 [1, 2]: y_e = [1] & y_o = [2] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 1 + 1*2 = 3 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 1 - 1*2 = -1 Y = [3, -1] [7, 0]: y_e = [7] & y_o = [0] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 7 + 1*0 = 7 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 7 - 1*0 = 7 Y = [7, 7] [5, 3]: y_e = [5] & y_o = [3] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 5 + 1*3 = 8 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 5 - 1*3 = 2 Y = [8, 2] [1, 0]: y_e = [1] & y_o = [0] n/2 - 1 = 2/2 -1 = 0 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 1 + 1*0 = 1 Y[0+2/2] = Y[1] = y_e[[0] - ω^0 * y_o[0] = 1 - 1*0 = 1 Y = [1, 1] Level 1 (n = 4) ---------------------------------------------------- Even indices: Odd indices: [ 1, 7, 2, 0 ] [ 5, 1, 3, 0 ] Y = [10, -1+7i, -4, -1-7i] Y = [9, 2+i, 7, 2-i] we now know that at level 1, n=4 ω^j = e^(2πij/n) j expression value 0 ω^0 = e^0 1 1 ω^1 = e^πi/2 i 2 ω^2 = e^πi -1 3 ω^3 = e^3πi/2 -i [1, 7, 2, 0]: y_e = [3, -1] & y_o = [7,7] n/2 - 1 = 4/2 -1 = 1 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 3 + 1*7 = 10 Y[0+4/2] = Y[2] = y_e[0] - ω^0 * y_o[0] = 3 - 1*7 = -4 at j = 1 Y[1] = y_e[1] + ω^1 * y_o[1] = -1 + i*7 = -1+7i Y[1+4/2] = Y[3] = y_e[1] - ω^1 * y_o[1] = -1 - i*7 = -1-7i Y = [10, -1+7i, -4, -1-7i] [5, 1, 3, 0]: y_e = [8, 2] & y_o = [1, 1] n/2 - 1 = 4/2 -1 = 1 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 8 + 1*1 = 9 Y[0+4/2] = Y[2] = y_e[0] - ω^0 * y_o[0] = 8 - 1*1 = 7 at j = 1 Y[1] = y_e[1] + ω^1 * y_o[1] = 2 + i*1 = 2+i Y[1+4/2] = Y[3] = y_e[1] - ω^1 * y_o[1] = 2 - i*1 = 2-i Y = [9, 2+i, 7, 2-i] Level 0 (n = 8) ------------------------------------------------- [ 1, 5, 7, 1, 2, 3, 0, 0 ] Y = [19, -0.293 + 9.12*i,-4+7i, -1.707-4.879*i, 1, -1.707 + 4.879*i, -4-7i, -0.293 - 9.121*i] we now know that' at level 0, n=8 ω^j = e^(2πij/n) j expression value 0 ω^0 = e^0 1 1 ω^1 = e^πi/4 (√2/2) + i(√2/2) 2 ω^2 = e^πi/2 i 3 ω^3 = e^3πi/4 -(√2/2) + i(√2/2) 4 ω^4 = e^πi -1 5 ω^5 = e^5πi/4 −(√2/2) − i(√2/2) 6 ω^6 = e^3πi/2 -i 7 ω^7 = e^7πi/4 (√2/2) − i(√2/2) [1, 5, 7, 1, 2, 3, 0, 0]: y_e = [10, -1+7i, -4, -1-7i] & y_o = [9, 2+i, 7, 2-i] n/2 - 1 = 8/2 -1 = 3 from: j=0 to n/2-1 at j = 0 Y[0] = y_e[0] + ω^0 * y_o[0] = 10 + 1*9 = 19 Y[0+8/2] = Y[4] = y_e[0] - ω^0 * y_o[0] = 10 - 1*9 = 1 at j = 1 Y[1] = y_e[1] + ω^1 * y_o[1] = (-1+7i) + ((√2/2) + i(√2/2))*(2+i) = (-1+(1/√2) + (3/√2+7)i Y[1+8/2] = Y[5] = y_e[1] - ω^1 * y_o[1] = (-1+7i) - ((√2/2) + i(√2/2))*(2+i) = -(1+(1/√2)) + (7-(3/√2))i at j = 2 Y[2] = y_e[2] + ω^2 * y_o[2] = -4 + i*7 = -4+7i Y[2+8/2] = Y[6] = y_e[2] - ω^2 * y_o[2] = -4 - i*7 = -4-7i at j = 3 Y[3] = y_e[3] + ω^3 * y_o[3] = (-1-7i) + (-(√2/2) + i(√2/2))*(2-i) = (-1-(1/√2)) + ((3/√2)-7)i Y[3+8/2] = Y[7] = y_e[3] - ω^3 * y_o[3] = (-1-7i) - (-(√2/2) + i(√2/2))*(2-i) = (-1+(1/√2)) - ((3/√2)+7)i Y = [19, (-1+(1/√2) + (3/√2+7)i, -4+7i, (-1-(1/√2)) + ((3/√2)-7)i, 1, -(1+(1/√2)) + (7-(3/√2))i, -4-7i, (-1+(1/√2)) - ((3/√2)+7)i ] The Fast Fourier Transform (FFT) is used when we want to compute the values of a polynomial at special points called roots of unity. As we saw earlier, roots of unity are numbers of the form ω^j=e^(2πij/n), and they have very useful symmetry properties. Instead of evaluating a polynomial separately at each point (which would be slow), FFT cleverly reuses intermediate results by splitting the polynomial into even and odd parts and evaluating them recursively at these roots of unity. This allows us to efficiently compute all polynomial values P(ω0),P(ω1),…,P(ωn−1)P(\omega^0), P(\omega^1), \dots, P(\omega^{n-1})P(ω0),P(ω1),…,P(ωn−1) in O(n log n) time. values of a polynomial at special points called roots of unity O(n log n) Once we have these values at the roots of unity, we are working in what is called the value representation of the polynomial. This representation is extremely useful for tasks like polynomial multiplication, erasure coding, and blob extension, because operations such as multiplication become simple point-by-point operations. value representation The Inverse FFT (IFFT) does the opposite job. Instead of starting with coefficients and producing values, it starts with the values at the roots of unity and reconstructs the original polynomial coefficients. Mathematically, this is possible because a polynomial of degree n−1 is uniquely determined by its values at n distinct points. The inverse FFT uses the same recursive structure as FFT, but it evaluates using the inverse roots of unity (that is, ω^−1) and applies a final normalization factor of 1/n. Apart from these small changes, the algorithm is structurally identical to FFT. Inverse FFT (IFFT) values at the roots of unity polynomial coefficients inverse roots of unity So in simple terms: FFT converts coefficients → values at roots of unity. Inverse FFT converts values at roots of unity → coefficients. FFT converts coefficients → values at roots of unity. FFT coefficients → values at roots of unity Inverse FFT converts values at roots of unity → coefficients. Inverse FFT values at roots of unity → coefficients This pair of transformations is what makes FFT so powerful. In systems like blob encoding and data availability, FFT is used to extend data by evaluating polynomials at more points (creating parity), while inverse FFT is used when we need to recover or reconstruct the original polynomial data from those evaluations. The FFT and the inverse FFT use the same core algorithm and the same recursive structure. The only real differences are what the input represents, which roots of unity are used, and what the output means. the same core algorithm and the same recursive structure what the input represents which roots of unity are used what the output means In the FFT, the input list represents the polynomial coefficients, ordered by degree (degree 0, degree 1, degree 2, and so on). The algorithm evaluates this polynomial at special points called the roots of unity, where ω^j =e ^(2πij/n)The output of FFT is a list of polynomial values at these roots of unity. FFT polynomial coefficients roots of unity ω^j =e ^(2πij/n) polynomial values In the inverse FFT, the structure of the algorithm is exactly the same, but the meaning of the input and output changes. Now, the input list represents the values of the polynomial at the roots of unity, instead of coefficients. The algorithm then reconstructs the original polynomial coefficients. inverse FFT values of the polynomial at the roots of unity original polynomial coefficients The only mathematical change is the root of unity that is used. In FFT we useω^j =e ^(2πij/n)while in inverse FFT we use ω^j = 1/n*(e^(-2πij/n))This introduces the required normalization factor 1/n and reverses the direction of the transformation. ω^j =e ^(2πij/n) ω^j = 1/n*(e^(-2πij/n)) 1/n Apart from these differences, everything else remains the same: the recursive splitting into even and odd parts, the combine step, and the overall flow of the algorithm. FFT produces values from coefficients, and inverse FFT produces coefficients from values, using essentially the same algorithm. everything else remains the same values from coefficients coefficients from values Now that we are familiar with the most important concepts and methods, we can move forward.Next, we will look at how blob data is converted into columns and respective proof of cell in the data availability scheme. Peer DAS Peer DAS func ComputeCellsAndKZGProofs(blob *Blob) ([]Cell, []Proof, error) { var ckzgBlob ckzg4844.Blob copy(ckzgBlob[:], blob[:]) ckzgCells, ckzgProofs, err := ckzg4844.ComputeCellsAndKZGProofs(&ckzgBlob) if err != nil { return nil, nil, err } if len(ckzgCells) != len(ckzgProofs) { return nil, nil, errors.New("mismatched cells and proofs length") } cells := make([]Cell, len(ckzgCells)) proofs := make([]Proof, len(ckzgProofs)) for i := range ckzgCells { copy(cells[i][:], ckzgCells[i][:]) copy(proofs[i][:], ckzgProofs[i][:]) } return cells, proofs, nil } func ComputeCellsAndKZGProofs(blob *Blob) ([]Cell, []Proof, error) { var ckzgBlob ckzg4844.Blob copy(ckzgBlob[:], blob[:]) ckzgCells, ckzgProofs, err := ckzg4844.ComputeCellsAndKZGProofs(&ckzgBlob) if err != nil { return nil, nil, err } if len(ckzgCells) != len(ckzgProofs) { return nil, nil, errors.New("mismatched cells and proofs length") } cells := make([]Cell, len(ckzgCells)) proofs := make([]Proof, len(ckzgProofs)) for i := range ckzgCells { copy(cells[i][:], ckzgCells[i][:]) copy(proofs[i][:], ckzgProofs[i][:]) } return cells, proofs, nil } This function, ComputeCellsAndKZGProofs, is a thin wrapper used in Prysm (located in beacon-chain/blockchain/kzg/kzg.go) to convert a blob into cells and their corresponding KZG proofs using the underlying ckzg4844 library. A Blob in Prysm is simply a fixed-size array of bytes (BytesPerBlob = 131,072 bytes (128 KB)), which represents the raw blob data exactly as it appears in the protocol. Since the ckzg4844 library defines its own Blob type, the function first copies the Prysm Blob byte-for-byte into a ckzg4844.Blob. This copy does not change the data; it only adapts it to the type expected by the cryptographic library. The function then calls ckzg4844.ComputeCellsAndKZGProofs, which performs all the heavy work: interpreting the blob data as field elements, constructing the polynomial, extending it, and producing the blob’s cells along with their KZG proofs. After the call, the function checks that the number of cells matches the number of proofs (since each cell must have exactly one proof), copies the results back into Prysm’s own Cell and Proof types, and finally returns them. In short, this function acts as a clean and safe bridge between Prysm’s data types and the low-level KZG/FFT-based blob processing logic implemented in ckzg4844. ComputeCellsAndKZGProofs beacon-chain/blockchain/kzg/kzg.go cells KZG proofs ckzg4844 Blob BytesPerBlob = 131,072 bytes (128 KB) ckzg4844 Blob Blob ckzg4844.Blob ckzg4844.ComputeCellsAndKZGProofs Cell Proof ckzg4844 Now we move on to the c-kzg-4844 library, which is used by all Ethereum clients for KZG and blob-related operations in the PeerDAS implementation. Next, we will look inside ckzg4844.ComputeCellsAndKZGProofs to understand what happens under the hood and how the core logic actually works. c-kzg-4844 c-kzg-4844 ckzg4844.ComputeCellsAndKZGProofs under the hood func ComputeCellsAndKZGProofs(blob *Blob) ([CellsPerExtBlob]Cell, [CellsPerExtBlob]KZGProof, error) { if !loaded { panic("trusted setup isn't loaded") } if blob == nil { return [CellsPerExtBlob]Cell{}, [CellsPerExtBlob]KZGProof{}, ErrBadArgs } cells := [CellsPerExtBlob]Cell{} // CellsPerExtBlob = 128 proofs := [CellsPerExtBlob]KZGProof{} // CellsPerExtBlob = 128 ret := C.compute_cells_and_kzg_proofs( (*C.Cell)(unsafe.Pointer(&cells)), (*C.KZGProof)(unsafe.Pointer(&proofs)), (*C.Blob)(unsafe.Pointer(blob)), &settings) if ret != C.C_KZG_OK { return [CellsPerExtBlob]Cell{}, [CellsPerExtBlob]KZGProof{}, makeErrorFromRet(ret) } return cells, proofs, nil } func ComputeCellsAndKZGProofs(blob *Blob) ([CellsPerExtBlob]Cell, [CellsPerExtBlob]KZGProof, error) { if !loaded { panic("trusted setup isn't loaded") } if blob == nil { return [CellsPerExtBlob]Cell{}, [CellsPerExtBlob]KZGProof{}, ErrBadArgs } cells := [CellsPerExtBlob]Cell{} // CellsPerExtBlob = 128 proofs := [CellsPerExtBlob]KZGProof{} // CellsPerExtBlob = 128 ret := C.compute_cells_and_kzg_proofs( (*C.Cell)(unsafe.Pointer(&cells)), (*C.KZGProof)(unsafe.Pointer(&proofs)), (*C.Blob)(unsafe.Pointer(blob)), &settings) if ret != C.C_KZG_OK { return [CellsPerExtBlob]Cell{}, [CellsPerExtBlob]KZGProof{}, makeErrorFromRet(ret) } return cells, proofs, nil } This ComputeCellsAndKZGProofs function in c-kzg-4844 is a Go wrapper around the core C implementation that actually performs blob processing. First, it checks that the trusted setup has been loaded and that the input blob is not nil, since both are required for KZG operations. It then allocates fixed-size arrays for the output cells and their corresponding KZG proofs. After that, the function calls the C function compute_cells_and_kzg_proofs, passing pointers to the blob, the output arrays, and the trusted setup settings. All the real work—converting the blob into field elements, building and extending the polynomial using FFT, and generating the cells and proofs—happens inside this C function. Once the call returns, the Go wrapper checks for errors and either returns the computed cells and proofs or an error. In short, this function safely bridges Go code to the low-level C logic that implements the full blob-to-cells and KZG proof generation process. ComputeCellsAndKZGProofs c-kzg-4844 nil compute_cells_and_kzg_proofs C_KZG_RET compute_cells_and_kzg_proofs( Cell *cells, KZGProof *proofs, const Blob *blob, const KZGSettings *s ) { C_KZG_RET ret; fr_t *poly_monomial = NULL; fr_t *poly_lagrange = NULL; fr_t *data_fr = NULL; g1_t *proofs_g1 = NULL; /* If both of these are null, something is wrong */ if (cells == NULL && proofs == NULL) { return C_KZG_BADARGS; } /* Allocate space fr-form arrays */ ret = new_fr_array(&poly_monomial, FIELD_ELEMENTS_PER_EXT_BLOB); // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 if (ret != C_KZG_OK) goto out; ret = new_fr_array(&poly_lagrange, FIELD_ELEMENTS_PER_EXT_BLOB); // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 if (ret != C_KZG_OK) goto out; /* * Convert the blob to a polynomial in lagrange form. Note that only the first 4096 fields of * the polynomial will be set. The upper 4096 fields will remain zero. The extra space is * required because the polynomial will be evaluated to the extended domain (8192 roots of * unity). */ ret = blob_to_polynomial(poly_lagrange, blob); if (ret != C_KZG_OK) goto out; /* We need the polynomial to be in monomial form */ // FIELD_ELEMENTS_PER_BLOB = 4096 ret = poly_lagrange_to_monomial(poly_monomial, poly_lagrange, FIELD_ELEMENTS_PER_BLOB, s); if (ret != C_KZG_OK) goto out; /* Ensure that only the first FIELD_ELEMENTS_PER_BLOB elements can be non-zero */ // FIELD_ELEMENTS_PER_BLOB = 4096 & FIELD_ELEMENTS_PER_EXT_BLOB = 8192 for (size_t i = FIELD_ELEMENTS_PER_BLOB; i < FIELD_ELEMENTS_PER_EXT_BLOB; i++) { assert(fr_equal(&poly_monomial[i], &FR_ZERO)); } if (cells != NULL) { /* Allocate space for our data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 ret = new_fr_array(&data_fr, FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Get the data points via forward transformation */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 ret = fr_fft(data_fr, poly_monomial, FIELD_ELEMENTS_PER_EXT_BLOB, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 ret = bit_reversal_permutation(data_fr, sizeof(fr_t), FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the cells to byte-form */ // CELLS_PER_EXT_BLOB = 128 for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { // FILED_ELEMENTS_PER_CELL = 64 for (size_t j = 0; j < FIELD_ELEMENTS_PER_CELL; j++) { size_t index = i * FIELD_ELEMENTS_PER_CELL + j; size_t offset = j * BYTES_PER_FIELD_ELEMENT; bytes_from_bls_field((Bytes32 *)&cells[i].bytes[offset], &data_fr[index]); } } } if (proofs != NULL) { /* Allocate space for our proofs in g1-form */ ret = new_g1_array(&proofs_g1, CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Compute the proofs, only uses the first half of the polynomial */ ret = compute_fk20_cell_proofs(proofs_g1, poly_monomial, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the proofs */ ret = bit_reversal_permutation(proofs_g1, sizeof(g1_t), CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the proofs to byte-form */ for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { bytes_from_g1(&proofs[i], &proofs_g1[i]); } } out: c_kzg_free(poly_monomial); c_kzg_free(poly_lagrange); c_kzg_free(data_fr); c_kzg_free(proofs_g1); return ret; } C_KZG_RET compute_cells_and_kzg_proofs( Cell *cells, KZGProof *proofs, const Blob *blob, const KZGSettings *s ) { C_KZG_RET ret; fr_t *poly_monomial = NULL; fr_t *poly_lagrange = NULL; fr_t *data_fr = NULL; g1_t *proofs_g1 = NULL; /* If both of these are null, something is wrong */ if (cells == NULL && proofs == NULL) { return C_KZG_BADARGS; } /* Allocate space fr-form arrays */ ret = new_fr_array(&poly_monomial, FIELD_ELEMENTS_PER_EXT_BLOB); // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 if (ret != C_KZG_OK) goto out; ret = new_fr_array(&poly_lagrange, FIELD_ELEMENTS_PER_EXT_BLOB); // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 if (ret != C_KZG_OK) goto out; /* * Convert the blob to a polynomial in lagrange form. Note that only the first 4096 fields of * the polynomial will be set. The upper 4096 fields will remain zero. The extra space is * required because the polynomial will be evaluated to the extended domain (8192 roots of * unity). */ ret = blob_to_polynomial(poly_lagrange, blob); if (ret != C_KZG_OK) goto out; /* We need the polynomial to be in monomial form */ // FIELD_ELEMENTS_PER_BLOB = 4096 ret = poly_lagrange_to_monomial(poly_monomial, poly_lagrange, FIELD_ELEMENTS_PER_BLOB, s); if (ret != C_KZG_OK) goto out; /* Ensure that only the first FIELD_ELEMENTS_PER_BLOB elements can be non-zero */ // FIELD_ELEMENTS_PER_BLOB = 4096 & FIELD_ELEMENTS_PER_EXT_BLOB = 8192 for (size_t i = FIELD_ELEMENTS_PER_BLOB; i < FIELD_ELEMENTS_PER_EXT_BLOB; i++) { assert(fr_equal(&poly_monomial[i], &FR_ZERO)); } if (cells != NULL) { /* Allocate space for our data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 ret = new_fr_array(&data_fr, FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Get the data points via forward transformation */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 ret = fr_fft(data_fr, poly_monomial, FIELD_ELEMENTS_PER_EXT_BLOB, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 ret = bit_reversal_permutation(data_fr, sizeof(fr_t), FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the cells to byte-form */ // CELLS_PER_EXT_BLOB = 128 for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { // FILED_ELEMENTS_PER_CELL = 64 for (size_t j = 0; j < FIELD_ELEMENTS_PER_CELL; j++) { size_t index = i * FIELD_ELEMENTS_PER_CELL + j; size_t offset = j * BYTES_PER_FIELD_ELEMENT; bytes_from_bls_field((Bytes32 *)&cells[i].bytes[offset], &data_fr[index]); } } } if (proofs != NULL) { /* Allocate space for our proofs in g1-form */ ret = new_g1_array(&proofs_g1, CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Compute the proofs, only uses the first half of the polynomial */ ret = compute_fk20_cell_proofs(proofs_g1, poly_monomial, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the proofs */ ret = bit_reversal_permutation(proofs_g1, sizeof(g1_t), CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the proofs to byte-form */ for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { bytes_from_g1(&proofs[i], &proofs_g1[i]); } } out: c_kzg_free(poly_monomial); c_kzg_free(poly_lagrange); c_kzg_free(data_fr); c_kzg_free(proofs_g1); return ret; } This compute_cells_and_kzg_proofs function is the core engine that turns a raw blob into extended blob cells and their KZG proofs, and it does so step by step using polynomial math and FFTs. The function starts by checking basic validity: at least one of cells or proofs must be requested, otherwise the call makes no sense. It then allocates memory for two polynomial representations over the BLS12-381 scalar field: one in Lagrange (value) form and one in monomial (coefficient) form, each sized for the extended domain of 8192 field elements. compute_cells_and_kzg_proofs core engine extended blob cells KZG proofs cells proofs Lagrange (value) form monomial (coefficient) form extended domain Next, blob_to_polynomial interprets the blob’s raw bytes as 4096 field elements, where each 32-byte chunk of the blob is converted into a BLS12-381 field element. These are treated as polynomial values at fixed roots of unity, meaning conceptually we set P(ω⁰), P(ω¹), …, P(ω⁴⁰⁹⁵) = blob field elements, while the remaining 4096 slots are left as zero so the polynomial can later be evaluated over the full 8192-point domain. At this stage, the polynomial exists only in Lagrange form, meaning “values at known x-points,” not coefficients. blob_to_polynomial 4096 field elements Lagrange form Throughout this process, all arithmetic is done inside the BLS12-381 scalar field. BLS12-381 is a standardized cryptographic curve that defines a very large prime-number field that everyone in Ethereum agrees on. When the blob’s 32-byte chunks are processed, each chunk is interpreted as a number and validated so it fits inside this field. These numbers are called field elements, and they are the basic values used for polynomial math, FFTs, and KZG proofs. Using the BLS12-381 field ensures that all polynomial operations are well-defined, secure, and compatible with the cryptographic commitments and proofs used by the protocol. Throughout this process, all arithmetic is done inside the BLS12-381 scalar field. BLS12-381 is a standardized cryptographic curve that defines a very large prime-number field that everyone in Ethereum agrees on. When the blob’s 32-byte chunks are processed, each chunk is interpreted as a number and validated so it fits inside this field. These numbers are called field elements, and they are the basic values used for polynomial math, FFTs, and KZG proofs. Using the BLS12-381 field ensures that all polynomial operations are well-defined, secure, and compatible with the cryptographic commitments and proofs used by the protocol. BLS12-381 scalar field field elements The function then calls poly_lagrange_to_monomial, which performs an inverse FFT to reconstruct the actual polynomial coefficients from those 4096 values. This gives a degree-≤4095 polynomial in monomial form. The code explicitly checks that all coefficients beyond index 4095 are zero, ensuring the polynomial degree is correct and no unintended data leaked into the extended region. poly_lagrange_to_monomial inverse FFT First, we look at the inner logic of blob_to_polynomial and poly_lagrange_to_monomial, along with their most important internal steps. After understanding these, we move on to the remaining part of compute_cells_and_kzg_proofs, after the zero-check that ensures all coefficients from index 4096 to 8192 are zero. blob_to_polynomial poly_lagrange_to_monomial compute_cells_and_kzg_proofs C_KZG_RET blob_to_polynomial(fr_t *p, const Blob *blob) { C_KZG_RET ret; // FIELD_ELEMENTS_PER_BLOB = 4096 for (size_t i = 0; i < FIELD_ELEMENTS_PER_BLOB; i++) { // BYTES_PER_FIELD_ELEMENT = 32 ret = bytes_to_bls_field(&p[i], (const Bytes32 *)&blob->bytes[i * BYTES_PER_FIELD_ELEMENT]); if (ret != C_KZG_OK) return ret; } return C_KZG_OK; } C_KZG_RET bytes_to_bls_field(fr_t *out, const Bytes32 *b) { blst_scalar tmp; blst_scalar_from_bendian(&tmp, b->bytes); if (!blst_scalar_fr_check(&tmp)) return C_KZG_BADARGS; blst_fr_from_scalar(out, &tmp); return C_KZG_OK; } C_KZG_RET blob_to_polynomial(fr_t *p, const Blob *blob) { C_KZG_RET ret; // FIELD_ELEMENTS_PER_BLOB = 4096 for (size_t i = 0; i < FIELD_ELEMENTS_PER_BLOB; i++) { // BYTES_PER_FIELD_ELEMENT = 32 ret = bytes_to_bls_field(&p[i], (const Bytes32 *)&blob->bytes[i * BYTES_PER_FIELD_ELEMENT]); if (ret != C_KZG_OK) return ret; } return C_KZG_OK; } C_KZG_RET bytes_to_bls_field(fr_t *out, const Bytes32 *b) { blst_scalar tmp; blst_scalar_from_bendian(&tmp, b->bytes); if (!blst_scalar_fr_check(&tmp)) return C_KZG_BADARGS; blst_fr_from_scalar(out, &tmp); return C_KZG_OK; } These two functions work together to convert raw blob data into polynomial values that can be used for cryptographic and FFT-based operations. The blob_to_polynomial function takes the blob, which is just a large array of bytes, and interprets it as 4096 chunks of 32 bytes each. Each 32-byte chunk is treated as one value of the polynomial in Lagrange (value) form. The function loops over the blob, and for each chunk, it calls bytes_to_bls_field to convert those raw bytes into a valid element of the BLS12-381 scalar field. The result is stored directly in the polynomial array p, so after the loop finishes, p[0] through p[4095] contain the blob data expressed as field elements. blob_to_polynomial 4096 chunks of 32 bytes each Lagrange (value) form bytes_to_bls_field p p[0] p[4095] The bytes_to_bls_field function performs the actual conversion and validation. It reads the 32 bytes as a big-endian number, checks that this number fits inside the BLS12-381 scalar field (meaning it is within the allowed range), and then converts it into the internal field representation used by the cryptographic library. If the value is out of range, the function returns an error. Together, these functions ensure that the blob’s raw bytes are safely and correctly interpreted as field elements, which can then be treated as polynomial values at fixed roots of unity and used for polynomial reconstruction, FFTs, and KZG proof generation later in the pipeline. bytes_to_bls_field // a inner function call of compute_cells_and_kzg_proofs only C_KZG_RET poly_lagrange_to_monomial( fr_t *monomial_out, const fr_t *lagrange, size_t len, const KZGSettings *s ) { C_KZG_RET ret; fr_t *lagrange_brp = NULL; /* Allocate space for the intermediate BRP poly */ ret = new_fr_array(&lagrange_brp, len); if (ret != C_KZG_OK) goto out; /* Copy the values and perform a bit reverse permutation */ memcpy(lagrange_brp, lagrange, sizeof(fr_t) * len); ret = bit_reversal_permutation(lagrange_brp, sizeof(fr_t), len); if (ret != C_KZG_OK) goto out; /* Perform an inverse FFT on the BRP'd polynomial */ ret = fr_ifft(monomial_out, lagrange_brp, len, s); if (ret != C_KZG_OK) goto out; out: c_kzg_free(lagrange_brp); return ret; } // a inner function call of poly_lagrange_to_monomial only C_KZG_RET bit_reversal_permutation(void *values, size_t size, size_t n) { C_KZG_RET ret; byte *tmp = NULL; byte *v = (byte *)values; /* In these cases, do nothing */ if (n == 0 || n == 1) { ret = C_KZG_OK; goto out; } /* Ensure n is a power of two */ if (!is_power_of_two(n)) { ret = C_KZG_BADARGS; goto out; } /* Scratch space for swapping an entry of the values array */ ret = c_kzg_malloc((void **)&tmp, size); if (ret != C_KZG_OK) goto out; /* Reorder elements */ uint64_t unused_bit_len = 64 - log2_pow2(n); assert(unused_bit_len <= 63); for (size_t i = 0; i < n; i++) { uint64_t r = reverse_bits(i) >> unused_bit_len; if (r > i) { /* Swap the two elements */ memcpy(tmp, v + (i * size), size); memcpy(v + (i * size), v + (r * size), size); memcpy(v + (r * size), tmp, size); } } out: c_kzg_free(tmp); return ret; } // a inner function call of poly_lagrange_to_monomial only C_KZG_RET fr_ifft(fr_t *out, const fr_t *in, size_t n, const KZGSettings *s) { /* Handle zero length input */ if (n == 0) return C_KZG_OK; /* Ensure the length is valid */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 if (n > FIELD_ELEMENTS_PER_EXT_BLOB || !is_power_of_two(n)) { return C_KZG_BADARGS; } // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 size_t stride = FIELD_ELEMENTS_PER_EXT_BLOB / n; fr_fft_fast(out, in, 1, s->reverse_roots_of_unity, stride, n); fr_t inv_n; fr_from_uint64(&inv_n, n); blst_fr_eucl_inverse(&inv_n, &inv_n); for (size_t i = 0; i < n; i++) { blst_fr_mul(&out[i], &out[i], &inv_n); } return C_KZG_OK; } // a inner function call of compute_cells_and_kzg_proofs only C_KZG_RET poly_lagrange_to_monomial( fr_t *monomial_out, const fr_t *lagrange, size_t len, const KZGSettings *s ) { C_KZG_RET ret; fr_t *lagrange_brp = NULL; /* Allocate space for the intermediate BRP poly */ ret = new_fr_array(&lagrange_brp, len); if (ret != C_KZG_OK) goto out; /* Copy the values and perform a bit reverse permutation */ memcpy(lagrange_brp, lagrange, sizeof(fr_t) * len); ret = bit_reversal_permutation(lagrange_brp, sizeof(fr_t), len); if (ret != C_KZG_OK) goto out; /* Perform an inverse FFT on the BRP'd polynomial */ ret = fr_ifft(monomial_out, lagrange_brp, len, s); if (ret != C_KZG_OK) goto out; out: c_kzg_free(lagrange_brp); return ret; } // a inner function call of poly_lagrange_to_monomial only C_KZG_RET bit_reversal_permutation(void *values, size_t size, size_t n) { C_KZG_RET ret; byte *tmp = NULL; byte *v = (byte *)values; /* In these cases, do nothing */ if (n == 0 || n == 1) { ret = C_KZG_OK; goto out; } /* Ensure n is a power of two */ if (!is_power_of_two(n)) { ret = C_KZG_BADARGS; goto out; } /* Scratch space for swapping an entry of the values array */ ret = c_kzg_malloc((void **)&tmp, size); if (ret != C_KZG_OK) goto out; /* Reorder elements */ uint64_t unused_bit_len = 64 - log2_pow2(n); assert(unused_bit_len <= 63); for (size_t i = 0; i < n; i++) { uint64_t r = reverse_bits(i) >> unused_bit_len; if (r > i) { /* Swap the two elements */ memcpy(tmp, v + (i * size), size); memcpy(v + (i * size), v + (r * size), size); memcpy(v + (r * size), tmp, size); } } out: c_kzg_free(tmp); return ret; } // a inner function call of poly_lagrange_to_monomial only C_KZG_RET fr_ifft(fr_t *out, const fr_t *in, size_t n, const KZGSettings *s) { /* Handle zero length input */ if (n == 0) return C_KZG_OK; /* Ensure the length is valid */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 if (n > FIELD_ELEMENTS_PER_EXT_BLOB || !is_power_of_two(n)) { return C_KZG_BADARGS; } // FIELD_ELEMENTS_PER_EXT_BLOB = 8192 size_t stride = FIELD_ELEMENTS_PER_EXT_BLOB / n; fr_fft_fast(out, in, 1, s->reverse_roots_of_unity, stride, n); fr_t inv_n; fr_from_uint64(&inv_n, n); blst_fr_eucl_inverse(&inv_n, &inv_n); for (size_t i = 0; i < n; i++) { blst_fr_mul(&out[i], &out[i], &inv_n); } return C_KZG_OK; } The poly_lagrange_to_monomial function is the main entry point. It takes a polynomial whose values are already known at fixed roots of unity (this is the Lagrange form) and constructs the polynomial coefficients. Because FFT algorithms expect inputs in a very specific order, the function first creates a temporary copy of the input values and applies a bit-reversal permutation. This reordering step does not change the values themselves; it simply rearranges them so the inverse FFT can process them efficiently. Once the values are in the correct order, the function calls fr_ifft, which performs the inverse FFT and produces the polynomial’s coefficients in standard monomial form. poly_lagrange_to_monomial bit-reversal permutation fr_ifft The bit_reversal_permutation function, an internal call made by poly_lagrange_to_monomial handles that reordering step. FFT algorithms process data in stages that naturally follow a bit-reversed index order. This function rearranges the array indices so that index i is swapped with the index obtained by reversing the bits of i. For example, in an array of size 8, index 3 (binary 011) maps to index 6 (binary 110). This step is required so that the FFT logic can operate correctly without extra bookkeeping. The function only works when the array length is a power of two, which is why FFT-based algorithms always require sizes like 2, 4, 8, 4096, or 8192. bit_reversal_permutation internal call made by poly_lagrange_to_monomial i i 3 011 6 110 The fr_ifft function, an internal call made by poly_lagrange_to_monomial performs the actual inverse Fast Fourier Transform over the BLS12-381 scalar field. Conceptually, this step takes the polynomial values at roots of unity and reconstructs the unique polynomial that produced them. Internally, it uses the same FFT machinery as the forward transform, but with inverse roots of unity and a final normalization step. That normalization multiplies every coefficient by 1/n, which is mathematically required to undo the scaling introduced during FFT evaluation. After this step finishes, the output array contains the true polynomial coefficients, ready for further use such as polynomial evaluation, erasure coding, or KZG proof generation. fr_ifft internal call made by poly_lagrange_to_monomial inverse Fast Fourier Transform inverse roots of unity 1/n Now we will look at the remaining part of the compute_cells_and_kzg_proofs function, including the inner logic of the functions it calls and how the final cells and proofs are produced. compute_cells_and_kzg_proofs if (cells != NULL) { /* Allocate space for our data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8196 ret = new_fr_array(&data_fr, FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Get the data points via forward transformation */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8196 ret = fr_fft(data_fr, poly_monomial, FIELD_ELEMENTS_PER_EXT_BLOB, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8196 ret = bit_reversal_permutation(data_fr, sizeof(fr_t), FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the cells to byte-form */ // CELLS_PER_EXT_BLOB = 128 for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { // FIELD_ELEMENTS_PER_CELL = 64 for (size_t j = 0; j < FIELD_ELEMENTS_PER_CELL; j++) { // FIELD_ELEMENTS_PER_CELL = 64 size_t index = i * FIELD_ELEMENTS_PER_CELL + j; // BYTES_PER_FIELD_ELEMENT = 32 size_t offset = j * BYTES_PER_FIELD_ELEMENT; bytes_from_bls_field((Bytes32 *)&cells[i].bytes[offset], &data_fr[index]); } } } if (proofs != NULL) { /* Allocate space for our proofs in g1-form */ // CELLS_PER_EXT_BLOB = 128 ret = new_g1_array(&proofs_g1, CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Compute the proofs, only uses the first half of the polynomial */ ret = compute_fk20_cell_proofs(proofs_g1, poly_monomial, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the proofs */ // CELLS_PER_EXT_BLOB = 128 ret = bit_reversal_permutation(proofs_g1, sizeof(g1_t), CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the proofs to byte-form */ // CELLS_PER_EXT_BLOB = 128 for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { bytes_from_g1(&proofs[i], &proofs_g1[i]); } } out: c_kzg_free(poly_monomial); c_kzg_free(poly_lagrange); c_kzg_free(data_fr); c_kzg_free(proofs_g1); return ret; } if (cells != NULL) { /* Allocate space for our data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8196 ret = new_fr_array(&data_fr, FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Get the data points via forward transformation */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8196 ret = fr_fft(data_fr, poly_monomial, FIELD_ELEMENTS_PER_EXT_BLOB, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the data points */ // FIELD_ELEMENTS_PER_EXT_BLOB = 8196 ret = bit_reversal_permutation(data_fr, sizeof(fr_t), FIELD_ELEMENTS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the cells to byte-form */ // CELLS_PER_EXT_BLOB = 128 for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { // FIELD_ELEMENTS_PER_CELL = 64 for (size_t j = 0; j < FIELD_ELEMENTS_PER_CELL; j++) { // FIELD_ELEMENTS_PER_CELL = 64 size_t index = i * FIELD_ELEMENTS_PER_CELL + j; // BYTES_PER_FIELD_ELEMENT = 32 size_t offset = j * BYTES_PER_FIELD_ELEMENT; bytes_from_bls_field((Bytes32 *)&cells[i].bytes[offset], &data_fr[index]); } } } if (proofs != NULL) { /* Allocate space for our proofs in g1-form */ // CELLS_PER_EXT_BLOB = 128 ret = new_g1_array(&proofs_g1, CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Compute the proofs, only uses the first half of the polynomial */ ret = compute_fk20_cell_proofs(proofs_g1, poly_monomial, s); if (ret != C_KZG_OK) goto out; /* Bit-reverse the proofs */ // CELLS_PER_EXT_BLOB = 128 ret = bit_reversal_permutation(proofs_g1, sizeof(g1_t), CELLS_PER_EXT_BLOB); if (ret != C_KZG_OK) goto out; /* Convert all of the proofs to byte-form */ // CELLS_PER_EXT_BLOB = 128 for (size_t i = 0; i < CELLS_PER_EXT_BLOB; i++) { bytes_from_g1(&proofs[i], &proofs_g1[i]); } } out: c_kzg_free(poly_monomial); c_kzg_free(poly_lagrange); c_kzg_free(data_fr); c_kzg_free(proofs_g1); return ret; } After the polynomial has been reconstructed in monomial (coefficient) form, the function moves on to computing the actual cell data, but only if the caller has requested cells. To do this, it first allocates space for an array of field elements large enough to hold the extended polynomial evaluations (8192 field elements). It then performs a forward FFT (fr_fft) on the polynomial coefficients. This step evaluates the polynomial at 8192 fixed roots of unity, producing both the original data values and the additional parity values required for erasure coding. In other words, this is the exact step where the extended blob is created mathematically. After the FFT, a bit-reversal permutation is applied so the evaluated values are reordered into the index layout expected by the protocol. Finally, these 8192 field elements are grouped into 128 cells, where each cell contains 64 field elements, and each field element is converted back into its 32-byte representation and written into the final cells output. actual cell data extended polynomial evaluations forward FFT fr_fft parity values bit-reversal permutation 128 cells 64 field elements cells Next, if KZG proofs are requested, the function computes them independently of the cell data. It allocates space for one elliptic-curve proof per cell and calls compute_fk20_cell_proofs, which generates a proof for each cell using the polynomial coefficients and the trusted setup. These proofs allow anyone to verify that a given cell truly comes from the committed polynomial, without needing the rest of the blob. In this article, we do not go into the internal mathematical details of how each per-cell proof is constructed, since that logic is complex and handled entirely inside the KZG library. At this level, it is sufficient to understand that one proof is produced per cell, and that each proof securely binds the cell back to the original polynomial commitment. As with the cell data, the proofs are reordered using a bit-reversal permutation to match the protocol’s indexing rules, and then each proof is converted from its internal elliptic-curve representation into its serialized byte form. compute_fk20_cell_proofs not Finally, once all requested work is complete, the function frees all temporary memory buffers used during computation and returns the result. At this point, the original blob has been fully transformed into extended blob cells and their corresponding KZG proofs, ready to be used for data availability sampling, verification, and recovery in PeerDAS. extended blob cells and their corresponding KZG proofs When we run the FFT, we conceptually start with a list of polynomial coefficients where index 0 holds the constant term, index 1 holds the coefficient of x, index 2 holds the coefficient of x2, and so on. However, the FFT algorithm does not process these coefficients in simple left-to-right order. Instead, it repeatedly splits the input into even-indexed and odd-indexed parts, and then recursively splits those parts again. Because of this repeated “even-odd” splitting, the order in which values are visited and combined during the recursion is no longer the natural index order 0,1,2,3,…. For example, with 8 coefficients, the recursion ends up visiting indices in the order 0,4,2,6,1,5,3,7. This ordering looks strange at first, but if you write the indices in binary and reverse their bits, you will see that this is exactly the bit-reversal of the original index order. The FFT math itself is completely correct in this order—the algorithm produces the right values for the polynomial evaluated at the roots of unity—but those values land in positions that match the recursion order, not the final logical order we expect. In other words, the FFT naturally computes the results in “bit-reversed index order” because of how the divide-and-conquer recursion walks through the data. This is why, after the FFT computation finishes, we apply a bit-reversal permutation. That permutation does not change any values; it only rearranges them so that the value corresponding to ω^0 ends up at index 0, the value at ω^1 ends up at index 1, and so on. In simple terms, bit reversal is needed because FFT’s recursive structure produces results in a shuffled order, and this final reordering step puts everything back where it logically belongs.This is also why FFT implementations always require the input size to be a power of two: only then does the even-odd splitting correspond cleanly to bit reversal. Once you recognize that the recursion naturally generates a bit-reversed traversal of the original coefficient indices, the need for the bit-reversal permutation becomes obvious—it is simply correcting the order produced by the recursion, not fixing a mathematical mistake. When we run the FFT, we conceptually start with a list of polynomial coefficients where index 0 holds the constant term, index 1 holds the coefficient of x, index 2 holds the coefficient of x2, and so on. However, the FFT algorithm does not process these coefficients in simple left-to-right order. Instead, it repeatedly splits the input into even-indexed and odd-indexed parts, and then recursively splits those parts again. Because of this repeated “even-odd” splitting, the order in which values are visited and combined during the recursion is no longer the natural index order 0,1,2,3,…. For example, with 8 coefficients, the recursion ends up visiting indices in the order 0,4,2,6,1,5,3,7. This ordering looks strange at first, but if you write the indices in binary and reverse their bits, you will see that this is exactly the bit-reversal of the original index order. The FFT math itself is completely correct in this order—the algorithm produces the right values for the polynomial evaluated at the roots of unity—but those values land in positions that match the recursion order, not the final logical order we expect. In other words, the FFT naturally computes the results in “bit-reversed index order” because of how the divide-and-conquer recursion walks through the data. This is why, after the FFT computation finishes, we apply a bit-reversal permutation. That permutation does not change any values; it only rearranges them so that the value corresponding to ω^0 ends up at index 0, the value at ω^1 ends up at index 1, and so on. In simple terms, bit reversal is needed because FFT’s recursive structure produces results in a shuffled order, and this final reordering step puts everything back where it logically belongs. This is also why FFT implementations always require the input size to be a power of two: only then does the even-odd splitting correspond cleanly to bit reversal. Once you recognize that the recursion naturally generates a bit-reversed traversal of the original coefficient indices, the need for the bit-reversal permutation becomes obvious—it is simply correcting the order produced by the recursion, not fixing a mathematical mistake.