Ignacio Hagopian (@jsign) blog

    Deep dive into Circle-STARKs FFT

    Thanks to Ulrich Haböck for his feedback!

    • Introduction
    • Why a new FFT?
    • Where does FFT fit in Circle-STARKs?
    • The ingredients
    • The group
    • The domain
    • Twin-cosets
    • Standard-position cosets
    • The basis
    • Riemann-Roch space
    • Dimension gap
    • The final basis
    • FFT
    • First step (bivariate → univariate)
    • Solving a.
    • Solving b.
    • Second step (univariate → univariate)
    • Solving a.
    • Solving b.
    • Further and base case
    • IFFT
    • What about FRI?
    • Final words

    Introduction

    If you’re reading this article, you’re probably trying to learn more about Circle-STARKs, a new proving system published ~three months ago. I heard about this breakthrough the day the paper was released. Still, I didn’t read about it until Vitalik wrote Exploring Circle STARKs, which I highly recommend (I also recommend LambdaClass article).

    As a cryptography enthusiast, it’s normal to read a high-level explanation of some breakthrough and be left thinking: That sounds quite magical! I wonder why <insert here a million questions>. If the idea is new, usually you’re on your own! Any (limited) articles that aren’t the paper unavoidably have to omit information; that’s the whole point of being high-level (with time, what you consider “high-level” changes; many would argue that Vitalik’s post isn’t).

    I’m not a mathematician nor cryptographer, but I bit the bullet and jumped to the paper to understand more — in particular, I wanted to understand the FFT in more detail, which is at the core of the breakthrough.

    Usually, I write articles to self-check my understanding and hopefully save time for someone out there trying to understand the same thing. This article is more detailed than Vitalik’s FFT section, providing some hand-holding while working through the paper.

    Even though I’m not a cryptographer and do not spend time reading papers every day, I recommend reading the paper. Considering I could read it (with some effort), I believe this proves the authors did a fantastic job.

    Why a new FFT?

    In cryptography, finite fields are used since they provide enough algebraic structure to build proving systems with a bounded size that is convenient for implementation. The bigger the field size, the more work must be done per operation, since we need multiple registers to represent values (sum, multiply, reduce, etc.).

    A relatively new field is Mersenne31 (M31), a prime field with modulus $2^{31}-1$. The benefit of this field is that you can write fast implementations since:

    • You can represent values in 31-bit values, which is convenient for CPUs/GPUs (32-bit registers).
    • Adding two numbers requires, at most, one extra bit, which is perfect for the same reason.
    • Reducing a non-canonical value from addition or multiplication is very simple since you can decompose it into 32-bit chunks and, with some shifts and additions, have a canonical representation.
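To make the reduction concrete, here’s a small Python sketch (my own illustration, not code from any M31 library) of the shift-and-add trick:

```python
# Sketch of the Mersenne31 reduction trick (illustrative, not optimized).
P = (1 << 31) - 1  # Mersenne31 modulus: 2^31 - 1

def reduce_m31(v: int) -> int:
    """Reduce a value (e.g. a 62-bit product) modulo 2^31 - 1.

    Since 2^31 = 1 (mod P), the high bits can simply be folded
    back into the low bits with a shift and an addition.
    """
    v = (v & P) + (v >> 31)  # fold the high bits once
    v = (v & P) + (v >> 31)  # a second fold fits the result in 32 bits
    return v - P if v >= P else v

# A full 62-bit product of two field elements reduces correctly:
a, b = 2**31 - 2, 2**30 + 12345
assert reduce_m31(a * b) == (a * b) % P
```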

    In proving systems, polynomial interpolations and evaluations are commonly required. If you don’t want to do this naively (and thus, slowly), you usually want to use FFT if the evaluation domain is special enough. Usually, the multiplicative group of the field has a big enough subgroup generated by a primitive root of unity, which satisfies this need.

    This isn’t the case for this finite field, so that’s quite unfortunate. We have a field with good mechanical sympathy, but it’s missing the needed algebraic structure. Circle-STARKs were invented to deal with this challenge. The paper is more generic than targeting M31; it targets any CFFT-friendly prime, a prime $p$ where $p+1$ is divisible by $2^{N+1}$ for some large $N$. M31 is a particular instance of such a prime.

    In STARKs, there’s a particular decoupling between the field you use for arithmetization and the one required for good soundness in the FRI step. For Circle-STARKs, the same strategy is applied — for FRI you use an extension field of the arithmetization field.

    Where does FFT fit in Circle-STARKs?

    Proving systems have many layers of concepts to understand, but at the end of the day, some computation is represented as a constraint system using polynomials. After that transformation, you want to prove the knowledge of some witness that satisfies those constraints. These constraints are checked efficiently by using some fundamental (but powerful) polynomial theorems.

    There are two primary needs while transforming the computation to polynomials:

    • Given the execution witness table values, we want to encode this information in a polynomial. This is doing a polynomial interpolation.
    • We express the constraints as new polynomials created using the interpolated execution witness values and ask them to satisfy some equality within the interpolated domain.

    To prove the last point, some usual theorems/lemmas about polynomials are used, transforming the problem into proving that a polynomial division yields a polynomial and not a rational function, i.e., the quotient is a polynomial with bounded degree. This is where FRI comes in and proves that claim. (In other proving systems that aren’t STARKs, this varies).

    To apply FRI, we must do a “low degree extension,” evaluating the polynomial in a bigger domain. Then, we create the Merkle Tree and apply FRI. You can do this “naively” with some Lagrange basis and polynomial evaluations — but if the interpolation domain satisfies some properties, you can do this much faster with FFT and IFFT.

    In regular STARKs, we work with univariate polynomials with a field from which we can have a proper multiplicative subgroup to apply FFT/IFFT. In Circle-STARKs, all these ideas of interpolation and evaluation are the same — but we work with bivariate “polynomials” (this isn’t true! but more on this later) in a new kind of domain built from a novel group construction.

    Other parts of the Circle-STARK constructions also require some adjustments, but they’re finding analogous ways of doing the same as in regular STARKs. I won’t even claim I understand them in detail (yet!) since I got into the FFT rabbit hole for now!

    STARK (and most other) proving systems are modular, and you can plug and play different constructions if they satisfy the required algebraic properties. I find this modularity very beautiful! — it’s worth taking a minute to appreciate how elegant math is.

    The ingredients

    There are multiple kinds of FFT. In my case, I was only aware of the usual FFT using roots of unity. However, there’s also ECFFT and others — still, you usually have to understand the same ingredients:

    • The group you’re working on.
    • The domain of evaluation.
    • The basis used for interpolation.

    For the usual roots-of-unity FFT, you’d use the field multiplicative group, a proper subgroup, and $[1, x, x^2, \dots]$, respectively. Now, let’s explore these dimensions for the Circle-STARKs FFT.

    The group

    It all starts with the unit circle equation:

    $x^2 + y^2 = 1$

    where the $(x, y)$ coordinates are elements of our field (e.g. M31). For the points on the circle, we define the following group operation:

    $(x_0, y_0) \cdot (x_1, y_1) = (x_0 \cdot x_1 - y_0 \cdot y_1,\ x_0 \cdot y_1 + y_0 \cdot x_1)$

    From this defined group operation, we can note three facts:

    • The neutral element is $(1, 0)$, since $(x, y) \cdot (1, 0) = (x \cdot 1 - y \cdot 0,\ x \cdot 0 + y \cdot 1) = (x, y)$
    • The squaring of a point is $\pi(x, y) = (x, y) \cdot (x, y) = (2x^2 - 1,\ 2xy)$
    • The inverse of an element is $(x, -y)$, since $(x, y) \cdot (x, -y) = (x^2 + y^2,\ xy - xy) = (1, 0)$

    We can appreciate the above in the following diagram:

    image
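These facts can also be checked numerically. Below is a toy Python sketch (my own, not from any Circle-STARK codebase); the rational parametrization in `point(t)` is just a convenient way to generate circle points and is not part of the paper’s construction:

```python
# Sanity-checking the circle group law over Mersenne31 (my own toy sketch).
P = (1 << 31) - 1

def finv(a: int) -> int:
    return pow(a, P - 2, P)  # field inverse via Fermat's little theorem

def point(t: int):
    """Rational parametrization: ((1-t^2)/(1+t^2), 2t/(1+t^2)) is on x^2+y^2=1.
    (Since P = 3 mod 4, 1 + t^2 is never 0 mod P, so this is always defined.)"""
    d = finv((1 + t * t) % P)
    return ((1 - t * t) * d % P, 2 * t * d % P)

def mul(a, b):
    """The circle group operation."""
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % P, (x0 * y1 + y0 * x1) % P)

q = point(12345)
assert (q[0] ** 2 + q[1] ** 2) % P == 1              # q is on the circle
assert mul(q, (1, 0)) == q                           # (1, 0) is neutral
assert mul(q, q) == ((2 * q[0] ** 2 - 1) % P,        # squaring formula
                     2 * q[0] * q[1] % P)
assert mul(q, (q[0], (-q[1]) % P)) == (1, 0)         # (x, -y) is the inverse
```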

    Finally, let’s highlight three facts that we’ll have to remember:

    • If we square $P$ or $-P$ we get the same $x$ coordinate (i.e. $2x^2-1$) and symmetrical results. You can check this yourself by doing the math with the group law.
    • This group is cyclic — there’s an element $g$ from which you can generate all elements in the group. (Section 3.1 in the paper)
    • The number of elements in the group is $p+1$ which, in the case of Mersenne31, means $2^{31}$ elements. (Lemma 1 in the paper).
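The last two facts can be brute-forced on a toy CFFT-friendly prime like $p = 31$, where $p + 1 = 32 = 2^5$ (my own sketch; M31 itself is far too big to enumerate):

```python
# Brute-force check of the "order p + 1" and cyclicity facts on p = 31.
p = 31

points = [(x, y) for x in range(p) for y in range(p)
          if (x * x + y * y) % p == 1]
assert len(points) == p + 1  # the circle group has p + 1 = 32 elements

def mul(a, b):
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % p, (x0 * y1 + y0 * x1) % p)

def order(g):
    e, acc = 1, g
    while acc != (1, 0):
        acc, e = mul(acc, g), e + 1
    return e

# Cyclic: some element generates all 32 points.
assert any(order(g) == p + 1 for g in points)
```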

    The domain

    In regular STARKs, two domains are used:

    • Interpolation domain: is where we encode the witness values into a polynomial.
    • FRI domain: is where we evaluate the polynomial to construct the Merkle Tree.

    For targeting zero-knowledge, both domains shouldn’t overlap. This is done by choosing the FRI domain as a shifted (but larger!) version of the interpolation domain. Here, shifted usually means a coset — if $G$ is the interpolation domain, we take a coset $Q \cdot G$. Using this coset still satisfies the properties we need for the usual FFTs in STARKs. In the case of Circle-STARKs, there’s an analogous concept of Twin cosets, which is another way to have a non-overlapping domain for the Circle-FFT.

    The Circle-STARK paper defines two constructions to achieve the same goal: Twin cosets and Standard position cosets. What’s important to note is that both constructions can be used as domains for Circle-FFT, so after this sub-section, you can imagine working only with Twin cosets or Standard position cosets (which is probably easier).

    Twin-cosets

    A Twin coset is defined as $D = Q \cdot G_{n-1} \cup Q^{-1} \cdot G_{n-1}$, but let’s unpack it a bit more:

    1. Let’s define $G$ as the (cyclic) circle group we explained in the previous section. The sub-index $n$ means that the group has order $2^n$.
    2. $G_{n-1}$ is defined as squaring $G_n$, thus its order is $2^{n-1}$. This is the same as squaring a domain generated by a primitive root of unity.
    3. $Q$ is an element from $G$, which must not be the neutral element or have order two.
    4. $Q \cdot G_{n-1}$ is a left-coset of $G_{n-1}$.
    5. We can prove that the two cosets don’t overlap thanks to the properties we asked for in bullet 3.

    In a way, we’re taking $G_{n-1}$ and shifting it by $Q$ to get half of $D$ and by $Q^{-1}$ to get the other half. But let’s explore some other facts a bit deeper:

    • If we try to check whether both cosets overlap, we’d need some $g_0, g_1 \in G_{n-1}$ such that $Q \cdot g_0 = Q^{-1} \cdot g_1$. If we multiply both sides by $Q$ we have $Q^2 \cdot g_0 = g_1$, so clearly $Q^2 \in G_{n-1}$, which is why it was mentioned that $Q$ can’t be the neutral element or have order two.
    • More interestingly, the inverse of an element from one coset is in the other. Let $g \in G_{n-1}$; then $(Q \cdot g) \cdot (Q^{-1} \cdot g^{-1}) = 1$.

    This last point can be appreciated in the following drawing shown in the paper about Twin cosets examples:

    Figure 1 in paper - Twin Cosets of different sizes

    The paper doesn’t give a visual interpretation of that image (maybe considered trivial?), but from my perspective, it seems pretty clear that the blue and red points are the points of each coset. Additionally, you can appreciate that the inverse of a point is another point with the opposite color.

    Finally, comes a crucial point: what happens if we square $D$? As you can imagine, this is a basic step in any FFT-like algorithm, and we’d imagine it should halve in size and keep its main properties. Let’s try to do that directly:

    • We start with $D = Q \cdot G_{n-1} \cup Q^{-1} \cdot G_{n-1}$; now we square it…
    • $\pi(D) = \pi(Q \cdot G_{n-1}) \cup \pi(Q^{-1} \cdot G_{n-1})$
    • $\pi(D) = \pi(Q) \cdot G_{n-2} \cup \pi(Q^{-1}) \cdot G_{n-2}$

    Looks quite familiar? :) Remember that $Q$ and $Q^{-1}$ have the same $x$ coordinate, and $\pi$ (squaring) preserves the inverse mapping, so we again have a Twin coset of half the size!

    This sounds exactly like what our previous understanding of the common roots-of-unity FFT domain should look like: a set of elements where the squaring function halves the set and keeps the point-inverse relationship, which we can continue to square until we reach a base case.
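Here’s a toy Python sketch (my own) that builds a twin coset over $p = 31$ and checks that squaring it yields a twin coset of half the size:

```python
# Building a twin coset over the toy prime p = 31 and checking that
# squaring it halves it while preserving the twin-coset shape (my sketch).
p = 31

def mul(a, b):
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % p, (x0 * y1 + y0 * x1) % p)

def sq(a):  # the squaring map pi
    return mul(a, a)

def ginv(a):  # group inverse: (x, -y)
    return (a[0], (-a[1]) % p)

def order(g):
    e, acc = 1, g
    while acc != (1, 0):
        acc, e = mul(acc, g), e + 1
    return e

def subgroup(h):
    elems, acc = {(1, 0)}, h
    while acc != (1, 0):
        elems.add(acc)
        acc = mul(acc, h)
    return elems

pts = [(x, y) for x in range(p) for y in range(p) if (x * x + y * y) % p == 1]
Q = next(pt for pt in pts if order(pt) == 32)  # full generator: Q^2 not in G_2

G2 = subgroup(sq(sq(sq(Q))))      # <Q^8>, the order-4 subgroup G_2
D = {mul(Q, h) for h in G2} | {mul(ginv(Q), h) for h in G2}
assert len(D) == 8                # the two halves don't overlap

piD = {sq(d) for d in D}
G1 = subgroup(sq(sq(sq(sq(Q)))))  # <Q^16>, the order-2 subgroup G_1
Q2 = sq(Q)
assert piD == {mul(Q2, h) for h in G1} | {mul(ginv(Q2), h) for h in G1}
assert len(piD) == 4              # a twin coset again, half the size
```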

    Why do we have to do this weird union of cosets instead of using a coset directly? The paper mentions the following:

    Although breaking with the group structure, twin-cosets are natural evaluation domains for the FFT. They are used to work around the non-smooth behaviour of the circle FFT under rotation, and are the typical extrapolation target in our application.

    I believe this is claiming that if you take any coset (for any CFFT-friendly prime $p$), then by squaring the coset you won’t preserve the FFT property we need.

    Standard-position cosets

    If the CFFT-friendly prime $p$ we’re using satisfies that $p+1$ is divisible by a big $2^n$, we can use another construct for the domain called the Standard position coset: $D = Q \cdot G_n$ with $Q$ in $G_{n+1}$ (but not in $G_n$).

    This construction is simpler since we don’t have to deal with unions. In the case of using M31, we have $p+1 = 2^{31}$, so we could build a proper domain of size up to $2^{30}$. For other primes, you might have $p+1 = 2^n \cdot k$ with a small $n$, so its size might fall short for our needs.

    Twin cosets don’t depend on any properties of $p$, so they’re the more general construction. As mentioned earlier, you can think of our domains as Twin cosets or Standard position cosets — both satisfy what we need for the FFT domain, so which one you use is (mostly) irrelevant.

    The basis

    Let’s take a step back and make a face-to-face comparison between usual STARKs and Circle-STARKs in relevant dimensions for an FFT:

    |              | Domain                                      | Interpolation coefficients | Encoding target                                      |
    | ------------ | ------------------------------------------- | -------------------------- | ---------------------------------------------------- |
    | Usual STARK  | Roots of unity in a multiplicative subgroup | Field element              | Univariate polynomials with basis $1, x, x^2, \dots$ |
    | Circle-STARK | Twin-Cosets / Standard position cosets      | Field element              | Bivariate “polynomial” (Riemann-Roch space)          |

    Riemann-Roch space

    In the previous section, we discussed what domain we use for Circle-STARKs. But what is this Riemann-Roch space concept? Vitalik’s article briefly mentions it, so here I’ll dive a bit deeper. Remember, I’m not a mathematician and was on my own unpacking all this, so hopefully there aren’t any mistakes (if there are, please ping me!).

    In usual STARKs, we “encode” the columns of the witness table into polynomials by doing an interpolation. This interpolation outputs the coefficients of a univariate polynomial where, on each element of the (root of unity) domain, the evaluation is equal to the witness value at a specific point in the execution trace. In a way, you can see this as encoding the witness table into something else — in this case, an univariate polynomial.

    In the case of Circle-STARKs, we encode these values not in a univariate polynomial but in a bivariate polynomial—not any bivariate polynomial, but ones in a (Riemann-Roch) space.

    Before continuing down the rabbit hole, it’s worth saying that the idea is to encode the witness values into “somewhere” that has appropriate properties to run FRI. In the case of usual STARKs, we talk about Reed-Solomon codes; this space has the required properties. In the case of Circle-STARKs, the idea is that this Riemann-Roch space is isomorphic to a Reed-Solomon code and thus also has the required properties.

    If we call this space $L_N(F)$, where:

    • $F$ is the (extension) field that we’re working on.
    • Its elements are bivariate polynomials modulo $x^2+y^2-1$.
    • They have degree at most $N/2$: each term $x^a \cdot y^b$ satisfies $a+b \leq N/2$.

    Since we’re considering bivariate polynomials modulo $x^2+y^2-1$, you can take any bivariate polynomial and substitute $y^2 = 1 - x^2$ (i.e. $y^2 = 1 \cdot (x^2 + y^2 - 1) + 1 - x^2$). Thus, polynomials in this space always have degree at most $1$ in $y$. Moreover, we can define a canonical form for a polynomial in this space as:

    $f(x, y) = f_0(x) + y \cdot f_1(x)$

    This should be obvious since we said the highest degree of $y$ is $1$. I find this canonical form handy for imagining what these polynomials look like, and also:

    • It already looks interesting that you can see $f(x, y)$ as the sum of two terms involving univariate polynomials depending only on $x$. It might not be obvious why that’s useful, but this is no coincidence.
    • If you know about FRI, this should look quite familiar.
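A tiny sketch (my own illustration) of the reduction to canonical form, representing polynomials as dictionaries from $(\deg_x, \deg_y)$ pairs to coefficients:

```python
# Reducing a bivariate polynomial modulo x^2 + y^2 - 1 into canonical
# f0(x) + y*f1(x) form via the substitution y^2 -> 1 - x^2 (my own sketch).
P = 31  # any field modulus works; a toy prime keeps the numbers small

def reduce_circle(poly):
    """poly: dict mapping (deg_x, deg_y) -> coefficient.
    Returns an equivalent dict where every y-degree is 0 or 1."""
    out, todo = {}, dict(poly)
    while todo:
        (a, b), c = todo.popitem()
        if b >= 2:  # replace one y^2 factor by 1 - x^2
            todo[(a, b - 2)] = (todo.get((a, b - 2), 0) + c) % P
            todo[(a + 2, b - 2)] = (todo.get((a + 2, b - 2), 0) - c) % P
        elif c % P:
            out[(a, b)] = (out.get((a, b), 0) + c) % P
    return out

# y^3 reduces to y - x^2*y, i.e. f0(x) = 0 and f1(x) = 1 - x^2:
assert reduce_circle({(0, 3): 1}) == {(0, 1): 1, (2, 1): P - 1}
```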

    Dimension gap

    What about this degree $\leq N/2$ requirement? The paper has a proven proposition that claims this space has dimension $N+1$. This means that with these degree-$\leq N/2$ bivariate polynomials, we can encode things with $N+1$ degrees of freedom.

    Remember that we can see elements of the Riemann-Roch space in a canonical form like $f(x, y) = f_0(x) + y \cdot f_1(x)$, which means that we could describe them in the following basis:

    $[1, x, x^2, \dots, x^{N/2}, y, y \cdot x, y \cdot x^2, \dots, y \cdot x^{N/2-1}]$

    Despite this already looking like a weird basis compared with the usual roots-of-unity FFT one $1, x, x^2, \dots$, it should make sense, given the mentioned canonical form.

    Note that $y \cdot x^{N/2-1}$ has $N/2-1$ as the maximum degree of $x$ since we have to respect the $N/2$ degree bound. If we count the number of elements in this basis, we get $(1+N/2)+(1+N/2-1)=N+1$.

    But remember, we want to use this space as the target for our FFT. The FFT “output” dimension is $N$, but the target space has one extra dimension. To solve this, a subspace $L'_N(F)$ is used as the “real” target space — since it’s a subspace, it satisfies the same properties as the full space but correctly fits the FFT output requirements.

    The final basis

    To understand how $L'_N(F)$ works, I’ll assume we’re using Standard position cosets for the domain. If we use Twin cosets, the construction changes slightly, but it’s the same idea.

    Let's first define $v_n(x)$ as follows:

    • $v_1(x) = x$
    • $v_n = v_{n-1} \circ \pi$. Recall $\pi$ was our squaring operation $\pi(x, y) = (2x^2 - 1,\ 2xy)$, but here we only refer to the $x$ coordinate since $v_n$ is univariate.
    • If we expand the first values:
      • $v_1(x) = x$
      • $v_2(x) = 2x^2 - 1$
      • $v_3(x) = 8x^4 - 8x^2 + 1$
      • …

    These $v_n$ aren’t arbitrary polynomials, but vanishing polynomials. Note that since they’re iterated compositions of $\pi$, each polynomial vanishes on a domain of a different size: if you square and square a given domain, at some point you’ll reduce it to points whose $x$-coordinate (for standard position cosets) is $0$. For this article, we can ignore the implications of this, but it is relevant for the overall Circle-STARK construction.
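Here’s a toy check (my own illustration over $p = 31$, relying on the vanishing behaviour described above) that $v_3$ vanishes on a size-8 standard position coset:

```python
# Checking that v_3 vanishes on a size-8 standard position coset over the
# toy prime p = 31 (my own illustration of the vanishing behaviour).
p = 31

def mul(a, b):
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % p, (x0 * y1 + y0 * x1) % p)

def order(g):
    e, acc = 1, g
    while acc != (1, 0):
        acc, e = mul(acc, g), e + 1
    return e

def v(n, x):
    """v_n(x): apply pi_x(x) = 2x^2 - 1 to the argument n - 1 times."""
    for _ in range(n - 1):
        x = (2 * x * x - 1) % p
    return x

# The iterated definition matches the expanded formulas above:
assert all(v(2, x) == (2 * x**2 - 1) % p for x in range(p))
assert all(v(3, x) == (8 * x**4 - 8 * x**2 + 1) % p for x in range(p))

# Build D = Q*G_3 (a standard position coset of size 8) with Q of order 16.
pts = [(x, y) for x in range(p) for y in range(p) if (x * x + y * y) % p == 1]
Q = next(pt for pt in pts if order(pt) == 16)
D, acc = [], Q
for _ in range(8):
    D.append(acc)
    acc = mul(acc, mul(Q, Q))  # step by Q^2, a generator of G_3

assert all(v(3, x) == 0 for (x, y) in D)  # v_3 vanishes on the whole domain
```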

    Let’s jump directly to the definition of our FFT basis. Remember that our basis is just a vector to which our FFT coefficients map. Each element of our basis $B_N$ (which generates $L'_N$) is defined as:

    $b_j = y^{j_0} \cdot v_1(x)^{j_1} \cdot \ldots \cdot v_{n-1}(x)^{j_{n-1}}$

    where $j = j_0 + j_1 \cdot 2 + \ldots + j_{n-1} \cdot 2^{n-1}$.

    At first sight, this definition can be confusing and complex, but let’s unpack it:

    • $j$ is the index of the element in the basis, i.e. $b_j$ is the $j$-th element of the basis.
    • While moving through the index, you’re enabling/disabling terms in the definition, e.g. notice that odd $j$ will have a $y$ term, and even ones won’t.
    • Let’s expand it to understand it better:
      • $1$
      • $y$
      • $v_1(x)$
      • $y \cdot v_1(x)$
      • $v_2(x)$
      • $y \cdot v_2(x)$
      • $v_1(x) \cdot v_2(x)$
      • $y \cdot v_1(x) \cdot v_2(x)$
      • …
    • Note that we can only have $y$ with degree $0$ or $1$.
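The index-to-term mapping can be sketched in a few lines of Python (my own illustrative helper):

```python
# Unpacking b_j from the binary digits of j (my own illustrative helper).
def basis_term(j: int) -> str:
    """Render b_j = y^{j_0} * v_1(x)^{j_1} * v_2(x)^{j_2} * ... as a string."""
    parts = []
    if j & 1:                 # bit j_0 toggles the y factor
        parts.append("y")
    k, bits = 1, j >> 1
    while bits:
        if bits & 1:          # bit j_k toggles the v_k(x) factor
            parts.append(f"v{k}(x)")
        k, bits = k + 1, bits >> 1
    return "·".join(parts) if parts else "1"

# The first 8 elements match the expansion above:
assert [basis_term(j) for j in range(8)] == [
    "1", "y", "v1(x)", "y·v1(x)",
    "v2(x)", "y·v2(x)", "v1(x)·v2(x)", "y·v1(x)·v2(x)",
]
```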

    If we want to be strict, we have to prove this is a proper basis — each element of the basis should be linearly independent of the others. The formal proof can be found in the paper — but the way I think about it is noticing that each newly introduced element has degree equal to “the next” power of two. If you try doing linear combinations up to element $k$, you can’t produce an element of the degree of element $k+1$.

    Let’s build more intuition on why this basis is useful:

    • If you look at the expanded first 8 terms above, you can notice that half of them have a $y$ term and the other half don’t.
    • You can appreciate how using this basis already hints that this is a subspace of our space, since it can also be described in the canonical form $f(x, y) = f_0(x) + y \cdot f_1(x)$.
    • This means $f_0$ and $f_1$ are constructed with the same basis but without the $y$ term.
    • The basis of size $N/2$ can be constructed by “squaring” the basis of size $N$. This is because our $v_n$ definition is built from compositions of $\pi$ — this should ring a bell for FFT-like requirements!

    Note that the subspace definition implies its dimension is $N$, solving the original problem that motivated this construction.

    FFT

    Now that we have a reasonable understanding of the domain and its basis, we can finally describe the FFT algorithm. Before doing so, let’s review what we are trying to achieve.

    The FFT algorithm allows the efficient calculation of the coefficients of a polynomial from its evaluations in a defined domain. The inverse FFT (IFFT) does the inverse operation: given the coefficients in our basis, it efficiently calculates the evaluations in the domain.

    As mentioned, it’s important to always keep in mind that when we say the polynomial coefficients, we aren’t talking about the coefficients in the usual basis of polynomials like $a \cdot x^2 + b \cdot x + 3$, since that assumes a basis of $[1, x, x^2]$. For this algorithm, we’re calculating the coefficients of the polynomials described in the basis $B_N$ described before: $[1, y, v_1(x), y \cdot v_1(x), \dots]$. Note that the length of the basis depends on the length of the evaluation vector we have as input.

    The paper explains the FFT algorithm separately from the proof of why it works — I’ll merge both explanations since I find that more useful. The following image describes, at a high level, the mechanics of how it works:

    image

    Please return to this image while we explore how it works below. Let’s jump into it!

    First step (bivariate → univariate)

    Recall that each polynomial in our Riemann-Roch (sub!)space can be described in its canonical form:

    $f(x, y) = f_0(x) + y \cdot f_1(x)$

    where $f(x, y)$ can be described in “coefficient form” in the basis:

    $B_N = [1, y, v_1(x), y \cdot v_1(x), \dots]$

    Given our decomposition, we can see that both $f_0$ and $f_1$ can be described under the basis:

    $U = [1, v_1(x), v_2(x), v_1(x) \cdot v_2(x), \dots]$

    since we removed the $y$ term. In other words, $f_1$ has the terms involving $y$ and $f_0$ has the terms without $y$. Since $y$ in $B_N$ has degree $0$ or $1$, both $f_0$ and $f_1$ end up expressed over the $y$-free (even-indexed) elements of our original $B_N$. This is useful since, after this first decomposition, we can continue applying FFT to univariate “polynomials” (i.e. only depending on $x$). This new basis $U$, only depending on $x$, is half the size of the original! This is why, in the diagram above (in green), I refer to “+-”, since we’re “collapsing” symmetrical points.

    Let’s assume that we could apply the FFT algorithm we’re describing to $f_0$ and $f_1$. That is, given their evaluations, it returns the coefficient form. If we can then reconstruct the coefficients of $f$, we’re done.

    The questions now are, given the evaluations of $f$:

    a. How do we calculate the evaluations of $f_0$ and $f_1$?
    b. After we use the result of a. and run FFT on $f_0$ and $f_1$, we have their coefficients — how do we use them to calculate the coefficients of $f$?

    Solving a.

    For a., we note the following relationship:

    $f_0(x) = \frac{f(x, y) + f(x, -y)}{2}$

    $f_1(x) = \frac{f(x, y) - f(x, -y)}{2 \cdot y}$

    The first equation cancels the $y$ terms and divides by $2$, so we’re only left with $x$ terms. If you know how the usual FFT works, this should look very familiar. The second equation is similar but cancels the terms without $y$, and then divides by $2 \cdot y$ to remove the doubling and the $y$ term. With this, we can see how $f_0$ and $f_1$ are the right parts to write $f$ in the canonical form.

    These equations are a symbolic relationship, but remember that our input for the algorithm is the evaluations of $f$. To get the evaluations of $f_0$, we evaluate the above $f_0$ equation at each point in the domain, say $(d_x, d_y)$:

    $f_0(d_x) = \frac{f(d_x, d_y) + f(d_x, -d_y)}{2}$

    Recall that, by the way our domain works, both $(d_x, d_y)$ and $(d_x, -d_y)$ are part of the domain (points symmetric about the $x$-axis), so both evaluations are present in the evaluation form of $f$ and we can calculate this value. We do the same for all $d$ values in the domain to get the evaluation form of $f_0$, and the same with the other equation to get the evaluation form of $f_1$. With this, we solved point a.
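Here’s the split on concrete numbers (my own sketch, with a toy prime and a made-up example $f_0(x) = x + 1$, $f_1(x) = 2x$):

```python
# The first-step split on concrete numbers (my own sketch; the concrete
# f0 and f1 below are made up for illustration).
P = 31

def finv(a):
    return pow(a, P - 2, P)

def f(x, y):  # f(x, y) = f0(x) + y*f1(x), with f0(x) = x + 1, f1(x) = 2x
    return ((x + 1) + y * (2 * x)) % P

dx, dy = 2, 7  # any pair of symmetric evaluations (dx, ±dy) works here
f0 = (f(dx, dy) + f(dx, -dy)) * finv(2) % P
f1 = (f(dx, dy) - f(dx, -dy)) * finv(2 * dy) % P

assert f0 == (dx + 1) % P  # recovers f0(dx)
assert f1 == (2 * dx) % P  # recovers f1(dx)
```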

    Solving b.

    Now that we have the evaluations, we magically apply this same FFT to $f_0$ and $f_1$ (we’ll explain this next, but let’s assume it works), and we get the “coefficient form” of $f_0$ and $f_1$.

    Reconstructing the coefficient form of $f$ is relatively easy. Recall that $f_0$ maps to the terms of $B_N$ that don’t have the $y$ term (the even indexes in $B_N$), and $f_1$ maps to the odd indexes of $B_N$ since we had $y \cdot f_1(x)$ — so we interleave the coefficients of $f_0$ with the coefficients of $f_1$ and we’re done.

    OK — for now, the algorithm works if we know that, given the evaluations of $f_0$ (or $f_1$), we can get their coefficient form in $U$ (defined before). Let’s prove that next.

    Second step (univariate → univariate)

    The problem has now been transformed from bivariate to univariate. The basis is again $B_N$ (with $N$ half of the previous step), but we got rid of the $y$ terms — those were removed in the first step. The basis now looks like $[1, v_1(x), v_2(x), v_1(x) \cdot v_2(x), \dots]$. Recall that $v_1(x) = x$, thus $B_N = [1, x, v_2(x), x \cdot v_2(x), \dots]$.

    We now apply the same idea of function decomposition and express:

    $f(x) = f_0(x) + x \cdot f_1(x)$

    But not so fast — this isn’t the relationship the Circle-STARK FFT uses, but rather:

    $f(x) = f_0(\pi(x)) + x \cdot f_1(\pi(x))$

    Let’s refresh why this is correct:

    • From The final basis section, we know that $v_n = v_{n-1} \circ \pi$.
    • Our $B_N$ basis can be split into terms that do and don’t have an $x$ term.

    In summary, $f_0$ and $f_1$ are functions that can be expressed on a “shifted” basis $B_{N/2}$. This is because when we do the function composition $f_0(\pi(x))$, we’re re-basing each term of the basis. This is the same idea as in the usual FFT when we do $f_0(x^2)$ — the only difference is that instead of “literal” squaring we do “circle squaring” on the $x$ axis, i.e. $\pi$. The basis was constructed intentionally so that when we square it, we get half of the basis for size $2 \cdot N$.
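The re-basing fact can be checked numerically with the explicit formulas for $v_1$, $v_2$, $v_3$ (my own sketch over a toy prime):

```python
# Numerically checking the re-basing fact v_k(pi_x(x)) = v_{k+1}(x)
# with the explicit formulas, over a toy prime (my own sketch).
P = 31

def pi_x(x):  # x-coordinate of the squaring map
    return (2 * x * x - 1) % P

def v1(x): return x % P
def v2(x): return (2 * x**2 - 1) % P
def v3(x): return (8 * x**4 - 8 * x**2 + 1) % P

for x in range(P):
    assert v1(pi_x(x)) == v2(x)  # composing with pi shifts each basis element
    assert v2(pi_x(x)) == v3(x)  # one slot down: the size-N basis "squares"
                                 # into the size-N/2 one
```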

    To provide some extra visuals, here’s a diagram that can also help understanding:

    image

    This diagram compresses the important ideas — use it to assist your understanding. The colors are used intentionally!

    As in the first step section before, we ask the same questions — given the evaluations of $f$:

    a. How do we calculate the evaluations of $f_0$ and $f_1$?
    b. After we use the result of a. and run FFT, we have the coefficients of $f_0$ and $f_1$ — how do we calculate the coefficients of $f$?

    Solving a.

    The logic here is analogous to the bivariate case — note the following facts:

    $f_0(\pi(x)) = \frac{f(x) + f(-x)}{2}$

    $f_1(\pi(x)) = \frac{f(x) - f(-x)}{2 \cdot x}$

    In short, by doing the same right-hand-side “trick”, we’re left with one or the other half of the terms of the polynomial. Note that $f_0(\pi(x))$ and $f_1(\pi(x))$ are used since $f_0$ and $f_1$ are polynomials expressed in the basis $B_{N/2}$ — the right side of each equation is one of these polynomials, but “squared”. Note again that the new basis for $f_0$ and $f_1$ is half the size.

    Solving b.

    If we magically apply the FFT algorithm to $f_0$ and $f_1$ on the new basis $B_{N/2}$, then we have the coefficient form of both polynomials. With this, we can reconstruct the coefficient form of $f$ by building the $N$-length vector that interleaves the $f_0$ and $f_1$ coefficient vectors of size $N/2$.

    Again, this is analogous to the coefficient forms of $f_0$ and $f_1$ for $f(x) = f_0(x^2) + x \cdot f_1(x^2)$ — you might find it more intuitive to think about it that way.

    Further and base case

    We have a univariate case again, so we apply “Step 2” repeatedly, halving the basis on each step. At some point, we’ll reach a base case with a polynomial with a single evaluation. In this case, calculating the coefficient form is trivial since it is a constant function that always returns that evaluation.

    We’re done!
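Putting both steps together, here’s a minimal Circle-FFT over the toy prime $p = 31$ (my own dict-based toy code; using dictionaries for the evaluations sidesteps the domain-ordering details a real implementation would handle explicitly):

```python
# A minimal Circle-FFT over the toy prime p = 31, following the two steps
# above (my own sketch, not code from the paper or any real implementation).
p = 31

def finv(a):
    return pow(a, p - 2, p)

def mul(a, b):
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % p, (x0 * y1 + y0 * x1) % p)

def interleave(ev, od):
    out = []
    for c0, c1 in zip(ev, od):
        out += [c0, c1]
    return out

def uni_cfft(evals):
    """evals: dict {x: f(x)} on a domain closed under x -> -x."""
    if len(evals) == 1:
        return [next(iter(evals.values()))]
    f0, f1 = {}, {}
    for x, fx in evals.items():
        t = (2 * x * x - 1) % p          # pi_x(x)
        if t in f0:
            continue                     # the pair -x was already handled
        fmx = evals[(-x) % p]
        f0[t] = (fx + fmx) * finv(2) % p
        f1[t] = (fx - fmx) * finv(2 * x % p) % p
    return interleave(uni_cfft(f0), uni_cfft(f1))

def cfft(evals):
    """evals: dict {(x, y): f(x, y)} on a standard position coset."""
    f0, f1 = {}, {}
    for (x, y), fxy in evals.items():
        if x in f0:
            continue                     # the pair (x, -y) was already handled
        fxmy = evals[(x, (-y) % p)]
        f0[x] = (fxy + fxmy) * finv(2) % p
        f1[x] = (fxy - fxmy) * finv(2 * y % p) % p
    return interleave(uni_cfft(f0), uni_cfft(f1))

def eval_basis(coeffs, x, y):
    """Evaluate sum_j coeffs[j] * b_j(x, y) directly, for cross-checking."""
    acc = 0
    for j, c in enumerate(coeffs):
        term = c * (y if j & 1 else 1) % p   # bit j_0 is the y factor
        vk, bits = x, j >> 1                 # v_1(x) = x, then v_{k+1} = 2*v_k^2 - 1
        while bits:
            if bits & 1:
                term = term * vk % p
            vk, bits = (2 * vk * vk - 1) % p, bits >> 1
        acc = (acc + term) % p
    return acc

def order(g):
    e, acc = 1, g
    while acc != (1, 0):
        acc, e = mul(acc, g), e + 1
    return e

# Build a size-8 standard position coset D = Q*G_3, with Q of order 16.
pts = [(x, y) for x in range(p) for y in range(p) if (x * x + y * y) % p == 1]
Q = next(pt for pt in pts if order(pt) == 16)
D, acc = [], Q
for _ in range(8):
    D.append(acc)
    acc = mul(acc, mul(Q, Q))            # step by Q^2, a generator of G_3

coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
evals = {pt: eval_basis(coeffs, *pt) for pt in D}
assert cfft(evals) == coeffs             # the FFT recovers the b_j coefficients
```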

    IFFT

    The IFFT algorithm does the inverse operation — given the coefficient form of a polynomial in our BNB_NBN​ basis, it returns the polynomial evaluations in the domain. The paper doesn’t go through this in detail but provides some hints and leaves the reader to figure out the details. I think that’s fair since it is not hard to figure out the inverse if you understand how FFT works.

    Remember that in the bivariate and univariate steps in the FFT, we always ended up asking two essential questions, but now we make the “inverse” twist:

    a. Given the coefficient form of $f$, how do we calculate the $f_0$ and $f_1$ coefficients?
    b. Given the evaluations of $f_0$ and $f_1$ in their (half-sized) domain, how do we reconstruct the evaluations of $f$?

    For a., it’s quite simple: $f_0$ and $f_1$ correspond to the even and odd indexes in $f$. So given the coefficients of $f$, it’s easy (i.e. indexing) to get the $f_0$ and $f_1$ coefficients in their halved basis.

    Now that we have each part's coefficients, we can call IFFT for these polynomials on a basis with half the size, which will return the evaluation form of each. To solve b. we refer to the already explained relationship:

    $f(x) = f_0(\pi(x)) + x \cdot f_1(\pi(x))$

    For each $d$ in the domain, we have already calculated $f_0(\pi(d))$ and $f_1(\pi(d))$. Note that for $f(d)$ and $f(-d)$ we use the same $f_0 \circ \pi$ and $f_1 \circ \pi$ results, and only the $x$ value in the equation changes. This is because squaring $d$ and $-d$ yields the same result, and the $B_{N/2}$ used for $f_0$ and $f_1$ was the basis for the IFFT call.

    Note that the above explains the univariate case. For the final bivariate case (since this is a recursive algorithm), the idea is the same, but the $y$ term is introduced when calculating the reconstructed evaluations.
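Here’s the matching inverse transform, in the same dict-based toy style as the FFT sketch (again my own code over $p = 31$, cross-checked against direct evaluation in the $b_j$ basis):

```python
# A minimal inverse Circle-FFT over the toy prime p = 31 (my own sketch).
p = 31

def mul(a, b):
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % p, (x0 * y1 + y0 * x1) % p)

def uni_icfft(coeffs, domain):
    """domain: set of x values closed under x -> -x (or a single point)."""
    if len(coeffs) == 1:
        return {x: coeffs[0] for x in domain}        # constant polynomial
    sub = {(2 * x * x - 1) % p for x in domain}      # pi_x(domain), half size
    e0 = uni_icfft(coeffs[0::2], sub)                # even indexes -> f0
    e1 = uni_icfft(coeffs[1::2], sub)                # odd indexes  -> f1
    out = {}
    for x in domain:
        t = (2 * x * x - 1) % p
        out[x] = (e0[t] + x * e1[t]) % p             # f(x) = f0(pi(x)) + x*f1(pi(x))
    return out

def icfft(coeffs, domain):
    """domain: set of (x, y) points of a standard position coset."""
    xs = {x for (x, _) in domain}
    e0 = uni_icfft(coeffs[0::2], xs)
    e1 = uni_icfft(coeffs[1::2], xs)
    return {(x, y): (e0[x] + y * e1[x]) % p          # the y term comes back here
            for (x, y) in domain}

def eval_basis(coeffs, x, y):
    acc = 0
    for j, c in enumerate(coeffs):
        term = c * (y if j & 1 else 1) % p
        vk, bits = x, j >> 1
        while bits:
            if bits & 1:
                term = term * vk % p
            vk, bits = (2 * vk * vk - 1) % p, bits >> 1
        acc = (acc + term) % p
    return acc

def order(g):
    e, acc = 1, g
    while acc != (1, 0):
        acc, e = mul(acc, g), e + 1
    return e

pts = [(x, y) for x in range(p) for y in range(p) if (x * x + y * y) % p == 1]
Q = next(pt for pt in pts if order(pt) == 16)
D, acc = [], Q
for _ in range(8):
    D.append(acc)
    acc = mul(acc, mul(Q, Q))

coeffs = [7, 2, 0, 5, 1, 8, 3, 4]
evals = icfft(coeffs, set(D))
assert all(evals[pt] == eval_basis(coeffs, *pt) for pt in D)
```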

    What about FRI?

    Note that FRI is very similar to FFT, but in the function decomposition, we do a random linear combination of both parts, so if you understood the FFT/IFFT algorithm, then you’ve already done the heavy lifting!

    Final words

    The idea of the Circle-STARK FFT is analogous to the usual roots-of-unity FFT:

    • The concept of “squaring” isn’t $x^2$ but $\pi$ on a coset of a circle group.
    • The “polynomials” we’re interpolating or evaluating live in a “space” with some properties required for soundness.
    • The basis isn’t the usual $[1, x, x^2, \dots]$ but another construction using $v_n$ which, if you compose it with $\pi$, gives “the same effect” as squaring $[1, x, x^2, \dots]$ (i.e. $f(x^2)$).

    The paper does a fantastic job explaining all this. It has a more formal style, as it should, since it’s the “official” explanation and must contain the most detailed and structured version of it.

    I hope you’ve found this alternative (hand-holding) style explanation helpful!