Thanks to Ulrich Haböck for his feedback!
- Introduction
- Why a new FFT?
- Where does FFT fit in Circle-STARKs?
- The ingredients
- The group
- The domain
- Twin-cosets
- Standard-position cosets
- The basis
- Riemann-Roch space
- Dimension gap
- The final basis
- FFT
- First step (bivariate → univariate)
- Solving a.
- Solving b.
- Second step (univariate → univariate)
- Solving a.
- Solving b.
- Further and base case
- IFFT
- What about FRI?
- Final words
Introduction
If you’re reading this article, you’re probably trying to learn more about Circle-STARKs, a new proving system published ~three months ago. I heard about this breakthrough the day the paper was released. Still, I didn’t read about it until Vitalik wrote Exploring Circle STARKs, which I highly recommend (I also recommend the LambdaClass article).
As a cryptography enthusiast, it’s normal to read a high-level explanation of some breakthrough and be left thinking: That sounds quite magical! I wonder why <insert here a million questions>. If the idea is new, you’re usually on your own! The (limited number of) articles that aren’t the paper unavoidably have to leave information out; that’s the whole point of being high-level (with time, what you consider “high-level” changes; many would argue that Vitalik’s post isn’t).
I’m neither a mathematician nor a cryptographer, but I bit the bullet and jumped into the paper to understand more — in particular, I wanted to understand the FFT in more detail, which is at the core of the breakthrough.
Usually, I write articles to self-check my understanding and hopefully save time for someone out there trying to understand the same thing. This article is more detailed than Vitalik’s FFT section and provides some hand-holding for working through the paper.
Even though I’m not a cryptographer and do not spend time reading papers every day, I recommend reading the paper. Considering I could read it (with some effort), I believe this proves the authors did a fantastic job.
Why a new FFT?
In cryptography, we use finite fields, which provide enough algebraic structure to build proving systems while having a bounded size that is convenient for implementation. The bigger the field, the more work must be done per operation, since we need multiple registers to represent values (to add, multiply, reduce, etc.).
A relatively new field is Mersenne31 (M31), a prime field with modulus $p = 2^{31} - 1$. The benefit of this field is that you can write fast implementations since:
- You can represent values in 31-bit values, which is convenient for CPUs/GPUs (32-bit registers).
- Adding two numbers requires, at most, one extra bit, which is perfect for the same reason.
- Reducing a non-canonical value from an addition or multiplication is very simple, since you can decompose it into 31-bit chunks and, with some shifts and additions, obtain a canonical representation (see the sketch below).
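To make this concrete, here’s a minimal Python sketch (my own, not from any particular library; `reduce_m31` is a hypothetical helper name) of that shift-and-add reduction:

```python
P = (1 << 31) - 1  # Mersenne31 modulus: 2^31 - 1

def reduce_m31(z: int) -> int:
    """Reduce z modulo 2^31 - 1 using only shifts and adds.

    Works for any z up to ~2^62 (e.g., the product of two canonical values),
    because 2^31 = 1 (mod p): folding the high bits onto the low 31 bits
    preserves the value mod p.
    """
    z = (z & P) + (z >> 31)    # first fold
    z = (z & P) + (z >> 31)    # second fold absorbs the remaining carry
    return 0 if z == P else z  # map the non-canonical value p to 0

assert reduce_m31(123_456_789 * 987_654_321) == (123_456_789 * 987_654_321) % P
```

Real implementations do this on 32-bit registers without big integers, but the idea is the same.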
In proving systems, polynomial interpolations and evaluations are commonly required. If you don’t want to do this naively (and thus, slowly), you usually want to use FFT if the evaluation domain is special enough. Usually, the multiplicative group of the field has a big enough subgroup generated by a primitive root of unity, which satisfies this need.
This isn’t the case for this finite field, which is quite unfortunate: the multiplicative group of M31 has order $p - 1 = 2^{31} - 2 = 2 \cdot (2^{30} - 1)$, so its biggest power-of-two subgroup has size two. We have a field with good mechanical sympathy, but we’re missing the needed algebraic structure. Circle-STARKs were invented to deal with this challenge. The paper is more generic than targeting M31: it targets any CFFT-friendly prime, a prime $p$ where $p + 1$ is divisible by some large power of two $2^{n+1}$. M31 is a particular instance of such a prime, since $p + 1 = 2^{31}$.
In STARKs, there’s a particular decoupling between the field you use for arithmetization and the one required for good soundness in the FRI step. For Circle-STARKs, the same strategy is applied — for FRI you use an extension field of the arithmetization field.
Where does FFT fit in Circle-STARKs?
Proving systems have many layers of concepts to understand, but at the end of the day, some computation is represented as a constraint system using polynomials. After that transformation, you want to prove the knowledge of some witness that satisfies those constraints. These constraints are checked efficiently by using some fundamental (but powerful) polynomial theorems.
There are two primary needs while transforming the computation to polynomials:
- Given the execution witness table values, we want to encode this information in a polynomial. This is a polynomial interpolation.
- We express the constraints as new polynomials created using the interpolated execution witness values and ask them to satisfy some equality within the interpolated domain.
To prove the last point, some usual theorems/lemmas about polynomials are used, transforming the problem into proving that a certain polynomial division doesn’t leave a proper rational function, i.e., the quotient is a polynomial with a bounded degree. This is where FRI comes in and proves that claim. (In other proving systems that aren’t STARKs, this varies.)
To apply FRI, we must do a “low degree extension,” evaluating the polynomial in a bigger domain. Then, we create the Merkle Tree and apply FRI. You can do this “naively” with some Lagrange basis and polynomial evaluations — but if the interpolation domain satisfies some properties, you can do this much faster with FFT and IFFT.
In regular STARKs, we work with univariate polynomials over a field whose multiplicative group has a proper subgroup suitable for applying the FFT/IFFT. In Circle-STARKs, all these ideas of interpolation and evaluation are the same — but we work with bivariate “polynomials” (this isn’t quite true! more on this later) in a new kind of domain built from a novel group construction.
Other parts of the Circle-STARK constructions also require some adjustments, but they’re finding analogous ways of doing the same as in regular STARKs. I won’t even claim I understand them in detail (yet!) since I got into the FFT rabbit hole for now!
STARK (and most) proving systems are modular, and you can plug and play different constructions if they satisfy the required algebraic properties. I find this modularity very beautiful! — it’s worth taking a minute to appreciate how elegant math is.
The ingredients
There are multiple kinds of FFT. In my case, I was only aware of the usual FFT using roots of unity. However, there’s also ECFFT and others — still, you usually have to understand the same ingredients:
- The group you’re working on.
- The domain of evaluation.
- The basis used for interpolation.
For the usual roots-of-unity FFT, you’d use the field’s multiplicative group, a proper subgroup generated by a primitive root of unity, and the monomial basis $1, x, x^2, \ldots$, respectively. Now, let’s explore these dimensions for the Circle-STARK FFT.
The group
It all starts with the unit circle equation:

$$x^2 + y^2 = 1$$

where the coordinates $(x, y)$ are elements of our field (e.g., M31). For the points on the circle, we define the following group operation:

$$(x_0, y_0) \cdot (x_1, y_1) := (x_0 \cdot x_1 - y_0 \cdot y_1, \; x_0 \cdot y_1 + y_0 \cdot x_1)$$
From this defined group operation, we can note three facts:
- The neutral element is $(1, 0)$, since $(x, y) \cdot (1, 0) = (x \cdot 1 - y \cdot 0, \; x \cdot 0 + y \cdot 1) = (x, y)$.
- The squaring of a point is $\pi(x, y) = (x^2 - y^2, \; 2xy) = (2x^2 - 1, \; 2xy)$, using $y^2 = 1 - x^2$.
- The inverse of an element is $J(x, y) = (x, -y)$, since $(x, y) \cdot (x, -y) = (x^2 + y^2, \; -xy + xy) = (1, 0)$.
We can appreciate the above in the following diagram:
Finally, let’s highlight three facts that we’ll have to remember:
- If we square $P$ or its inverse $P^{-1}$, we get the same $x$ coordinate (i.e., $\pi_x(P) = \pi_x(P^{-1})$) and symmetrical $y$ results. You can check this yourself by doing the math with the group law.
- This group is cyclic — there’s an element from which you can generate all elements in the group. (Section 3.1 in the paper)
- The number of elements in the group is $p + 1$ which, in the case of Mersenne31, means $2^{31}$ elements. (Lemma 1 in the paper.)
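Since the group law is so simple, it’s easy to check these facts concretely. Here’s a minimal Python sketch (my own, not from the paper), using the point $(0, 1)$ as an example:

```python
P = (1 << 31) - 1  # Mersenne31

def add(a, b):
    """Group law: (x0, y0)·(x1, y1) = (x0·x1 - y0·y1, x0·y1 + y0·x1)."""
    (x0, y0), (x1, y1) = a, b
    return ((x0 * x1 - y0 * y1) % P, (x0 * y1 + y0 * x1) % P)

def pi(a):
    """Squaring map: pi(x, y) = (2x^2 - 1, 2xy)."""
    x, y = a
    return ((2 * x * x - 1) % P, 2 * x * y % P)

def inv(a):
    """Inverse map: J(x, y) = (x, -y)."""
    x, y = a
    return (x, -y % P)

O = (1, 0)                          # the neutral element
pt = (0, 1)                         # a point on the circle: 0^2 + 1^2 = 1
assert add(pt, O) == pt             # (1, 0) is neutral
assert add(pt, inv(pt)) == O        # P · P^{-1} = (1, 0)
assert pi(pt) == add(pt, pt)        # squaring is the group law with itself
assert pi(pt)[0] == pi(inv(pt))[0]  # pi(P) and pi(P^{-1}) share the x coordinate
```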
The domain
In regular STARKs, two domains are used:
- Interpolation domain: where we encode the witness values into a polynomial.
- FRI domain: where we evaluate the polynomial to construct the Merkle Tree.
For targeting zero-knowledge, both domains shouldn’t overlap. This is done by choosing the FRI domain as a shifted (but larger!) version of the interpolation domain. Here shifted usually means a coset — if $H$ is the interpolation domain, we take a coset $w \cdot H'$ of a bigger subgroup $H' \supseteq H$. Using this coset still satisfies the properties we need for the usual FFTs in STARKs. In the case of Circle-STARKs, there’s an analogous concept of twin-cosets, which is another way to have a non-overlapping domain for the Circle-FFT.
The Circle-STARK paper defines two constructions to achieve the same goal: Twin cosets and Standard position cosets. What’s important to note is that both constructions can be used as domains for Circle-FFT, so after this sub-section, you can imagine working only with Twin cosets or Standard position cosets (which is probably easier).
Twin-cosets
A twin-coset is defined as $D = Q \cdot G_{n-1} \,\cup\, Q^{-1} \cdot G_{n-1}$, but let’s unpack it a bit more:
- Let’s define $G_n$ as a (cyclic) subgroup of the circle group we explained in the previous section. The sub-index $n$ means that the group has order $2^n$.
- $G_{n-1}$ is defined by squaring $G_n$ (i.e., $G_{n-1} = \pi(G_n)$); thus, its order is $2^{n-1}$. This is the same as squaring a domain generated by a primitive root of unity.
- $Q$ is an element of the circle group whose coset in the quotient group $C(\mathbb{F}_p)/G_{n-1}$ must not be the neutral element or have order two.
- $Q \cdot G_{n-1}$ is a left coset of $G_{n-1}$ (and so is $Q^{-1} \cdot G_{n-1}$).
- We can prove that the union doesn’t overlap thanks to the property we asked for in bullet 3.
In a way, we’re taking $G_{n-1}$ and shifting it by $Q$ to get half of $D$, and by $Q^{-1}$ to get the other half. But let’s try to explore some other facts a bit deeper:
- If we try to check whether both cosets overlap, we’d have to find $g_1, g_2 \in G_{n-1}$ such that $Q \cdot g_1 = Q^{-1} \cdot g_2$. If we multiply both sides by $Q \cdot g_1^{-1}$, we get $Q^2 = g_2 \cdot g_1^{-1} \in G_{n-1}$, so clearly we need $Q^2 \notin G_{n-1}$, which is why it was mentioned that $Q$’s coset can’t be the neutral element or have order two (those are exactly the cases where $Q^2 \in G_{n-1}$).
- More interestingly, the inverse of an element from one coset is in the other. Let $P = Q \cdot g$ with $g \in G_{n-1}$; then $P^{-1} = Q^{-1} \cdot g^{-1} \in Q^{-1} \cdot G_{n-1}$.
This last point can be appreciated in the following drawing shown in the paper with examples of twin-cosets:
The paper doesn’t give a visual interpretation of that image (maybe considered trivial?), but from my perspective, it seems pretty clear that the blue and red points are the points of each coset. Additionally, you can appreciate that the inverse of a point is another point with the opposite color.
Finally comes a crucial point: what happens if we square $D$? As you can imagine, this is a basic step in any FFT-like algorithm, and we’d expect the domain to halve in size while keeping its main properties. Let’s try to do that directly:
- We start with $D = Q \cdot G_{n-1} \,\cup\, Q^{-1} \cdot G_{n-1}$; now we square it:

$$\pi(D) = Q^2 \cdot \pi(G_{n-1}) \,\cup\, Q^{-2} \cdot \pi(G_{n-1}) = Q^2 \cdot G_{n-2} \,\cup\, Q^{-2} \cdot G_{n-2}$$

Looks quite familiar? :) Remember that $P$ and $P^{-1}$ have the same $x$ coordinate, and $\pi$ (squaring) preserves the inverse relationship, so we again have a twin-coset of half the size!
This sounds exactly like what, from our previous understanding, a common roots-of-unity FFT domain should look like: a set of elements where the squaring function halves the set while keeping the point-inverse relationship, and which we can keep squaring until we reach a base case.
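Here’s a small Python demo (my own sketch, reusing `P`, `add`, `pi`, and `inv` from above) that builds a twin-coset of size 8 and checks that squaring it yields a twin-coset of size 4:

```python
def point(t):
    """Map t to a circle point via ((1 - t^2)/(1 + t^2), 2t/(1 + t^2))."""
    d = pow(1 + t * t, P - 2, P)         # field inverse of 1 + t^2
    return ((1 - t * t) * d % P, 2 * t * d % P)

def log2_order(a):
    """Every point has order 2^k; count squarings until we hit (1, 0)."""
    k = 0
    while a != (1, 0):
        a, k = pi(a), k + 1
    return k

# Find a generator of the full circle group (order 2^31).
g = next(point(t) for t in range(2, 100) if log2_order(point(t)) == 31)

n = 3                                    # we build a twin-coset of size 2^n = 8
h = g
for _ in range(31 - (n - 1)):
    h = pi(h)                            # h now generates G_{n-1} (order 4)
G_nm1, e = [], (1, 0)
for _ in range(2 ** (n - 1)):
    G_nm1.append(e)
    e = add(e, h)

Q = g                                    # Q^2 has order 2^30, so Q^2 is not in G_{n-1}
D = {add(Q, u) for u in G_nm1} | {add(inv(Q), u) for u in G_nm1}
assert len(D) == 2 ** n                  # the two cosets don't overlap

D2 = {pi(pt) for pt in D}                # square the whole domain...
assert len(D2) == 2 ** (n - 1)           # ...and it halves!
Q2, half = pi(Q), {pi(u) for u in G_nm1} # pi(G_{n-1}) = G_{n-2}
assert D2 == {add(Q2, u) for u in half} | {add(inv(Q2), u) for u in half}
```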
Why do we have to do this weird union of cosets instead of using a coset directly? The paper mentions the following:
Although breaking with the group structure, twin-cosets are natural evaluation domains for the FFT. They are used to work around the non-smooth behaviour of the circle FFT under rotation, and are the typical extrapolation target in our application.
I believe this is claiming that if you take an arbitrary coset (for any CFFT-friendly prime $p$), then by squaring the coset you won’t, in general, preserve the FFT property we need.
Standard-position cosets
If the CFFT-friendly prime we’re using satisfies that $p + 1$ is divisible by a big $2^{n+1}$, we can use another construction for the domain called the standard position coset: $D = Q \cdot G_n$, with $Q$ an element of order $2^{n+1}$ (i.e., $Q \in G_{n+1} \setminus G_n$).
This construction is simpler since we don’t have to deal with unions. In the case of M31, we have $p + 1 = 2^{31}$, so we could build a proper domain of size up to $2^{30}$. For other primes, you might have $p + 1 = 2^k \cdot m$ with a small $k$, so its size might fall short for our needs.
Twin-cosets don’t depend on this property of $p + 1$, so they’re a more general construction. As mentioned earlier, you can think of our domains as twin-cosets or standard position cosets — both satisfy what we need for the FFT domain, so which one you use is (mostly) irrelevant.
The basis
Let’s take a step back and make a face-to-face comparison between usual STARKs and Circle-STARKs in relevant dimensions for an FFT:
| | Domain | Interpolation coefficients | Encoding target |
| --- | --- | --- | --- |
| Usual STARK | Roots of unity in a multiplicative subgroup | Field elements | Univariate polynomials with basis $1, x, x^2, \ldots$ |
| Circle-STARK | Twin-cosets / standard position cosets | Field elements | Bivariate “polynomials” (Riemann-Roch space) |
Riemann-Roch space
In the previous section, we discussed which domain we use for Circle-STARKs. But what is this Riemann-Roch space concept? Vitalik’s article briefly mentions it, so here I’ll dive a bit deeper. Remember, I’m not a mathematician and was on my own unpacking all this, so hopefully there aren’t any mistakes (if you find one, please ping me!).
In usual STARKs, we “encode” the columns of the witness table into polynomials by doing an interpolation. This interpolation outputs the coefficients of a univariate polynomial where, on each element of the (root of unity) domain, the evaluation is equal to the witness value at a specific point in the execution trace. In a way, you can see this as encoding the witness table into something else — in this case, an univariate polynomial.
In the case of Circle-STARKs, we encode these values not in a univariate polynomial but in a bivariate polynomial — not any bivariate polynomial, but one living in a particular (Riemann-Roch) space.
Before continuing down the rabbit hole, it’s worth saying that the idea is to encode the witness values into “somewhere” that has appropriate properties to run FRI. In the case of usual STARKs, we talk about Reed-Solomon codes; this space has the required properties. In the case of Circle-STARKs, the idea is that this Riemann-Roch space is isomorphic to a Reed-Solomon code and thus also has the required properties.
Let’s call this space $\mathcal{L}_N(F)$, in which:
- $F$ is the (extension) field that we’re working in.
- The elements are bivariate polynomials modulo $x^2 + y^2 - 1$.
- The elements have total degree at most $N/2$. This means that each term $c \cdot x^i \cdot y^j$ satisfies $i + j \le N/2$.
Since we’re considering bivariate polynomials modulo $x^2 + y^2 - 1$, you can take any bivariate polynomial and substitute $y^2$ (i.e., $y^2 \equiv 1 - x^2$). Thus, polynomials in this space always have degree at most one in $y$. Moreover, we can define a canonical form for a polynomial in this space as:

$$p(x, y) = p_0(x) + y \cdot p_1(x)$$
This should be obvious since we said the highest degree of $y$ is one. I find this canonical form handy for imagining what these polynomials look like, and also:
- It already looks interesting that you can see $p(x, y)$ as the sum of two terms involving univariate polynomials depending only on $x$. It might not be obvious why that’s useful, but this is no coincidence.
- If you know about FRI, this should look quite familiar.
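As a toy example (mine, not from the paper), let’s reduce $p(x, y) = y^3 + x$ to canonical form using $y^2 \equiv 1 - x^2$:

$$p(x, y) = y \cdot y^2 + x \equiv y \cdot (1 - x^2) + x = \underbrace{x}_{p_0(x)} + y \cdot \underbrace{(1 - x^2)}_{p_1(x)}$$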
Dimension gap
What about this degree requirement? The paper has a proven proposition claiming this space has dimension $N + 1$. This means that with these degree-$\le N/2$ bivariate polynomials, we can encode stuff with $N + 1$ degrees of freedom.
Remember that we can see elements of the Riemann-Roch space in a canonical form like $p_0(x) + y \cdot p_1(x)$, which means that we can describe them in the following basis:

$$1, x, x^2, \ldots, x^{N/2}, \; y, \; y \cdot x, \; y \cdot x^2, \ldots, y \cdot x^{N/2 - 1}$$

Despite this already looking like a weird basis compared with the usual roots-of-unity FFT one ($1, x, x^2, \ldots$), it should make sense given the mentioned canonical form.
Note that $p_1(x)$ has $N/2 - 1$ as its maximum $x$ degree, since the $y$ factor already contributes one to the total-degree bound. If we count the number of elements in this basis, we get $(N/2 + 1) + N/2 = N + 1$.
But remember, we want to use this space as the target of our FFT. The FFT “output” dimension is $N$ (we have $N$ evaluations), but the target space has one extra dimension. To solve this, a subspace $\mathcal{L}'_N(F)$ of dimension $N$ is used as the “real” target space — since it’s a subspace, it satisfies the same properties as the full space but correctly fits the FFT output requirements.
The final basis
To understand how the basis works, I’ll assume we’re using standard position cosets for the domain. If we use twin-cosets, the construction changes slightly, but it’s the same idea.
Let's first define the polynomials $v_k(x)$ as follows:
- $v_1(x) := x$ and $v_{k+1}(x) := \pi_x(v_k(x))$. Recall $\pi$ was our squaring operation $\pi(x, y) = (2x^2 - 1, 2xy)$, but here we only use its $x$ coordinate, $\pi_x(x) = 2x^2 - 1$, since $v_k$ is univariate.
- If we expand the first values:
  - $v_1(x) = x$
  - $v_2(x) = 2x^2 - 1$
  - $v_3(x) = 2(2x^2 - 1)^2 - 1$
  - …
These aren’t arbitrary polynomials but vanishing polynomials. Note that since they’re iterated applications of $\pi_x$, each polynomial zeroes out on a domain of a different size, i.e., if you square and square a given domain, at some point you’ll collapse it to a single $x$ coordinate, which (for standard position cosets) will be $0$. For this article, we can ignore the implications of this, but it is relevant for the overall Circle-STARK construction.
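We can check the vanishing behaviour numerically. Here’s a sketch (my own, reusing `P`, `add`, `pi`, `point`, and `log2_order` from the earlier sketches) that builds a standard position coset of size 8 and verifies that $v_3$ vanishes on all of its $x$ coordinates:

```python
g = next(point(t) for t in range(2, 100) if log2_order(point(t)) == 31)

n = 3
Q = g
for _ in range(31 - (n + 1)):
    Q = pi(Q)                            # Q now has order 2^(n+1) = 16

D, R = [], Q
Q2 = pi(Q)                               # Q^2 generates G_n
for _ in range(2 ** n):
    D.append(R)                          # D = Q·G_n = {Q, Q^3, Q^5, ...}
    R = add(R, Q2)

def v(k, x):
    """v_k(x): apply pi_x(x) = 2x^2 - 1 exactly (k - 1) times."""
    for _ in range(k - 1):
        x = (2 * x * x - 1) % P
    return x

assert all(v(n, x) == 0 for (x, _) in D)  # v_n vanishes on the whole domain
```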
Let’s jump directly to the definition of our FFT basis. Remember that the basis is just a set of polynomials onto which our FFT coefficients map. Each element of our basis (which generates $\mathcal{L}'_N(F)$) is defined as:

$$b_j^{(n)}(x, y) := y^{j_0} \cdot v_1(x)^{j_1} \cdot v_2(x)^{j_2} \cdots v_{n-1}(x)^{j_{n-1}}$$

where $(j_0, j_1, \ldots, j_{n-1})$ is the binary decomposition of the index $j = j_0 + 2 \cdot j_1 + \ldots + 2^{n-1} \cdot j_{n-1}$.
At first sight, this definition can be confusing and complex, but let’s unpack it:
- $j$ is the index of the element in the basis, i.e., $b_j$ is the $j$-th element of the basis.
- While moving through the indexes, the bits of $j$ enable/disable factors in the definition, e.g., notice that odd $j$ (i.e., $j_0 = 1$) will have a $y$ factor, and even ones won’t.
- Let’s expand the case $N = 8$ (i.e., $n = 3$) to understand it better:
  - $b_0 = 1$, $b_1 = y$, $b_2 = x$, $b_3 = y \cdot x$
  - $b_4 = 2x^2 - 1$, $b_5 = y \cdot (2x^2 - 1)$
  - $b_6 = x \cdot (2x^2 - 1)$, $b_7 = y \cdot x \cdot (2x^2 - 1)$
- Note that $y$ can only appear with degree $0$ or $1$.
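To make the indexing concrete, here’s a small Python sketch (my own; `eval_basis` is a hypothetical helper) that evaluates $b_j^{(n)}$ at a point by walking the bits of $j$:

```python
def eval_basis(j, n, x, y):
    """Evaluate b_j(x, y) = y^{j_0} · v_1(x)^{j_1} · ... · v_{n-1}(x)^{j_{n-1}}."""
    acc = y if j & 1 else 1             # bit j_0 turns the y factor on/off
    v_k = x                             # v_1(x) = x
    for k in range(1, n):
        if (j >> k) & 1:
            acc = acc * v_k % P
        v_k = (2 * v_k * v_k - 1) % P   # v_{k+1}(x) = pi_x(v_k(x))
    return acc % P

# b_5 = y · v_2(x) = y · (2x^2 - 1), matching the expansion above:
x, y = 123, 456
assert eval_basis(5, 3, x, y) == y * (2 * x * x - 1) % P
```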
If we want to be strict, we have to prove this is a proper basis — each element of the basis should be linearly independent from the others. The formal proof can be found in the paper — but the way I think about it is noticing that each newly introduced $v_k$ makes the $x$-degree jump to “the next” power of two (in general, the $x$-degree of $b_j$ is $\lfloor j/2 \rfloor$). If you try doing linear combinations of the elements up to element $j$, you can’t produce the degree of a later element that introduces a new $v_k$.
Let’s build more intuition on why this basis is useful:
- If you look at the expanded first 8 terms above, you can notice that half of them have a $y$ factor and the other half don’t.
- You can appreciate how this basis already hints that $\mathcal{L}'_N(F)$ is a subspace of our space, since its elements can also be described in the canonical form $p_0(x) + y \cdot p_1(x)$.
- This means $p_0$ and $p_1$ are constructed with the same basis elements, just without the $y$ factor.
- The ($y$-free half of the) basis for size $N$ can be constructed by “squaring” the basis for size $N/2$. This is because our definition is built from iterated $\pi_x$ (i.e., $v_k \circ \pi_x = v_{k+1}$) — this should ring a bell for FFT-like requirements!
Note that the subspace definition implies its dimension is $N$ (we have exactly $N$ basis elements $b_0, \ldots, b_{N-1}$), solving the original problem that motivated this construction.
FFT
Now that we have a reasonable understanding of the domain and its basis, we can finally describe the FFT algorithm. Before doing so, let’s review what we are trying to achieve.
The FFT algorithm allows the efficient calculation of the coefficients for a polynomial from its evaluations in a defined domain. The inverse FFT (IFFT) does the inverse operation. Given the coefficients on our basis, we can efficiently calculate the evaluation in the domain.
As mentioned, it’s important that you always keep in mind that when we say the polynomial coefficients, we aren’t talking about coefficients in the usual form $a_0 + a_1 \cdot x + a_2 \cdot x^2 + \ldots$, since that assumes the basis $1, x, x^2, \ldots$ For this algorithm, we’re calculating the coefficients of the polynomial in the basis described before: $b_0, b_1, \ldots, b_{N-1}$. Note that the length of the basis depends on the length of the evaluation vector we have as input.
The paper explains the FFT algorithm separately from the proof of why it works — I’ll merge both explanations since I find that more useful. The following image describes, at a high level, the mechanics of how it works:
Please return to this image while we explore how it works below. Let’s jump into it!
First step (bivariate → univariate)
Recall that each polynomial in our Riemann-Roch (sub!)space can be described in its canonical form:

$$f(x, y) = f_0(x) + y \cdot f_1(x)$$

where $f$ can be described in “coefficient form” in the basis:

$$1, \; y, \; v_1(x), \; y \cdot v_1(x), \; v_2(x), \; y \cdot v_2(x), \; \ldots$$

Given our decomposition, we can see that both $f_0$ and $f_1$ can be described under the basis:

$$1, \; v_1(x), \; v_2(x), \; v_1(x) \cdot v_2(x), \; \ldots$$

since we removed the $y$ factor. In other words, $f_1$ has the terms of $f$ involving $y$, and $f_0$ has the terms without it. Since $y$ in $b_j$ has degree $0$ or $1$, $f_0$ and $f_1$ depend only on the even- and odd-indexed elements of our original basis, respectively. This is useful because, after this first decomposition, we can continue applying the FFT to univariate “polynomials” (i.e., ones depending only on $x$). This new basis, depending only on $x$, is half the size of the original! This is why, in the diagram above (in green), I refer to “+-”: we’re “collapsing” symmetrical points.
Let’s assume that we can apply the FFT algorithm we’re describing to $f_0$ and $f_1$: that is, given their evaluations, it returns their coefficient forms. If we can then reconstruct the coefficients of $f$, we’re done.
The question now is, given the evaluations of $f$:
- a. How do we calculate the evaluations of $f_0$ and $f_1$?
- b. After we use the result of a. and run the FFT on $f_0$ and $f_1$, we have their coefficients — how do we use those to calculate the coefficients of $f$?
Solving a.
For a., we note the following relationships:

$$f_0(x) = \frac{f(x, y) + f(x, -y)}{2}, \qquad f_1(x) = \frac{f(x, y) - f(x, -y)}{2y}$$

The first equation cancels the $y \cdot f_1(x)$ terms and divides by two, so we are only left with $f_0(x)$. If you know how the usual FFT works, this should look very familiar. The second equation is similar, but cancels the terms without $y$ and then divides by $2y$ to remove the doubling and the $y$ factor. With this, we can see how $f_0$ and $f_1$ are exactly the right parts to write $f$ in the canonical form.
These equations are a symbolic relationship, but remember that the input of our algorithm is the evaluations of $f$. To get the evaluations of $f_0$, we evaluate the above equation at each point of the domain, say $(x, y)$:

$$f_0(x) = \frac{f(x, y) + f(x, -y)}{2}$$

Recall that, by the way our domain works, both $(x, y)$ and $(x, -y)$ are part of it (symmetric points in $y$), so both evaluations are available in the evaluation form of $f$, and we can calculate this value. We do the same for all values of the domain to get the evaluation form of $f_0$, and the same with the other equation to get the evaluation form of $f_1$. With this, we solved point a.
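In code, this first split is just a couple of lines. A sketch (my own): assume `evals` maps every domain point `(x, y)` to $f(x, y)$, with `P` as before and `INV2 = pow(2, P - 2, P)`:

```python
f0, f1 = {}, {}
for (x, y), fxy in evals.items():
    if x in f0:
        continue                        # the pair {(x, y), (x, -y)} is done
    f_inv = evals[(x, -y % P)]          # evaluation at the inverse point
    f0[x] = (fxy + f_inv) * INV2 % P                   # (f(x,y) + f(x,-y)) / 2
    f1[x] = (fxy - f_inv) * pow(2 * y, P - 2, P) % P   # (f(x,y) - f(x,-y)) / 2y
```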
Solving b.
Now that we have the evaluations, we magically use this same FFT for $f_0$ and $f_1$ (we’ll explain this next, but let’s assume it works), and we get the “coefficient form” of $f_0$ and $f_1$.
Reconstructing the coefficient form of $f$ is relatively easy. Recall that $f_0$ holds the terms of $f$ without the $y$ factor, i.e., the even indexes of the $b_j$ basis, and $f_1$ maps to the odd indexes, since we had $f(x, y) = f_0(x) + y \cdot f_1(x)$ — so we interleave the coefficients of $f_0$ with the coefficients of $f_1$, and we’re done.
OK, for now the algorithm works if we know that, given the evaluations of $f_0$ (or $f_1$) on the halved ($x$-only) domain, we can get their coefficient form. Let’s prove that next.
Second step (univariate → univariate)
The problem has now been transformed from bivariate to univariate. The basis is now half the size of the previous step’s, since we got rid of the $y$ terms — those were removed in the first step. It now looks like:

$$1, \; v_1(x), \; v_2(x), \; v_1(x) \cdot v_2(x), \; \ldots$$

Recall that $v_{k+1}(x) = \pi_x(v_k(x))$; thus, for example, $v_2(x) = 2x^2 - 1$.
We now apply the same idea of function decomposition and express:

$$f(x) = f_0(x^2) + x \cdot f_1(x^2)$$

But not so fast — this isn’t the relationship the Circle-STARK FFT uses, but rather:

$$f(x) = f_0(\pi_x(x)) + x \cdot f_1(\pi_x(x)), \quad \text{with } \pi_x(x) = 2x^2 - 1$$

Let’s refresh why this is correct:
- From the The domain section, we know that squaring acts on the $x$ coordinate as $\pi_x(x) = 2x^2 - 1$ and halves the domain.
- Our basis can be split into the terms that have a $v_1(x) = x$ factor and the ones that don’t.
In summary, $f_0$ and $f_1$ are functions that can be expressed in a “shifted” basis $1, v_1(x), v_2(x), \ldots$ This is because, when we do the function composition with $\pi_x$, we’re re-basing each element of the basis ($v_k \circ \pi_x = v_{k+1}$). This is the same idea as the usual FFT when we do $f(x) = f_0(x^2) + x \cdot f_1(x^2)$ — the only difference is that instead of “literal” squaring, we do “circle squaring” on the $x$ axis, i.e., $\pi_x(x) = 2x^2 - 1$. The basis was constructed intentionally so that when we compose it with $\pi_x$, we get half of the basis for size $N$.
To provide some extra visuals, here’s a diagram that can also help with understanding:
This diagram compresses the important ideas — use it to assist your understanding. The colors are used intentionally!
As in the first step section before, we ask the same questions — given the evaluations of $f$:
- a. How do we calculate the evaluations of $f_0$ and $f_1$?
- b. After we use the result of a. and run the FFT, we have the coefficients of $f_0$ and $f_1$ — how do we calculate the coefficients of $f$?
Solving a.
The logic here is analogous to the bivariate case — note the following facts:

$$f_0(\pi_x(x)) = \frac{f(x) + f(-x)}{2}, \qquad f_1(\pi_x(x)) = \frac{f(x) - f(-x)}{2x}$$

In short, by doing the same right-hand-side “trick”, we’re left with one or the other half of the terms of the polynomial. Note that $x$ and $-x$ are used since $\pi_x(x) = \pi_x(-x)$, and both points live in our domain. $f_0$ and $f_1$ are polynomials expressed in the basis $1, v_1(x), v_2(x), \ldots$ — the right-hand sides of the equations are these polynomials, but over the “squared” domain. Note again that the new basis for $f_0$ and $f_1$ is half the size.
Solving b.
If we magically apply the FFT algorithm to $f_0$ and $f_1$ on the new (half-sized) basis, we then have the coefficient form of both polynomials. With this, we can reconstruct the coefficient form of $f$ by building the length-$N$ vector that interleaves the $f_0$ and $f_1$ coefficient vectors of size $N/2$.
Again, this is analogous to reconstructing the coefficient form of $f$ from $f_0$ and $f_1$ in the bivariate case — you might find it more intuitive to think about it that way.
Further and base case
We have a univariate case again, so we apply “Step 2” repeatedly, halving the basis at each step. At some point, we’ll reach the base case: a polynomial with a single evaluation. There, calculating the coefficient form is trivial, since it’s a constant function that always returns that evaluation.
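Putting the whole recursion together, here’s a compact Python sketch (my own code, not from the paper; it reuses `P`, `add`, `pi`, `point`, `log2_order`, and `eval_basis` from the earlier sketches), including a sanity check on a standard position coset of size 8:

```python
import random

INV2 = pow(2, P - 2, P)

def cfft1(evals):
    """Univariate steps: {x: f(x)} -> coefficients in the basis 1, v_1, v_2, ..."""
    if len(evals) == 1:
        return list(evals.values())          # base case: a constant
    f0, f1 = {}, {}
    for x, fx in evals.items():
        xp = (2 * x * x - 1) % P             # pi_x(x), equal for x and -x
        if xp in f0:
            continue                         # pair {x, -x} already handled
        fnx = evals[-x % P]
        f0[xp] = (fx + fnx) * INV2 % P
        f1[xp] = (fx - fnx) * pow(2 * x, P - 2, P) % P
    c0, c1 = cfft1(f0), cfft1(f1)
    out = [0] * (2 * len(c0))
    out[0::2], out[1::2] = c0, c1            # interleave even/odd coefficients
    return out

def cfft(evals):
    """First (bivariate) step: {(x, y): f(x, y)} -> coefficients in the b_j basis."""
    f0, f1 = {}, {}
    for (x, y), fxy in evals.items():
        if x in f0:
            continue
        finv = evals[(x, -y % P)]
        f0[x] = (fxy + finv) * INV2 % P
        f1[x] = (fxy - finv) * pow(2 * y, P - 2, P) % P
    c0, c1 = cfft1(f0), cfft1(f1)
    out = [0] * (2 * len(c0))
    out[0::2], out[1::2] = c0, c1
    return out

# Sanity check on a standard position coset of size 8 (n = 3):
g = next(point(t) for t in range(2, 100) if log2_order(point(t)) == 31)
Q = g
for _ in range(31 - 4):
    Q = pi(Q)                                # Q has order 2^(n+1) = 16
D, R, Q2 = [], Q, pi(Q)
for _ in range(8):
    D.append(R)
    R = add(R, Q2)                           # D = {Q, Q^3, Q^5, ...}

coeffs = [random.randrange(P) for _ in range(8)]
f = lambda x, y: sum(c * eval_basis(j, 3, x, y) for j, c in enumerate(coeffs)) % P
assert cfft({(x, y): f(x, y) for (x, y) in D}) == coeffs
```

Each level does a linear number of field operations and halves the problem, so the total work is the familiar O(N log N).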
We’re done!
IFFT
The IFFT algorithm does the inverse operation: given the coefficient form of a polynomial in our basis, it returns the polynomial’s evaluations over the domain. The paper doesn’t go through this in detail, but provides some hints and leaves the reader to figure out the details. I think that’s fair, since it’s not hard to figure out the inverse direction if you understand how the FFT works.
Remember that in the bivariate and univariate steps in the FFT, we always ended up asking two essential questions, but now we make the “inverse” twist:
- a. Given the coefficient form of $f$, how do we calculate the $f_0$ and $f_1$ coefficients?
- b. Given the evaluations of $f_0$ and $f_1$ in their (half-sized) domain, how do we reconstruct the evaluations of $f$?
For a., it’s quite simple: the coefficients of $f_0$ and $f_1$ correspond to the even and odd indexes of $f$’s coefficients. So, given $f$’s coefficients, it’s easy (i.e., just indexing) to get the $f_0$ and $f_1$ coefficients in their halved basis.
Now that we have each part's coefficients, we can call the IFFT for these polynomials on a basis of half the size, which returns the evaluation form of each. To solve b., we refer to the already-explained relationship:

$$f(x) = f_0(\pi_x(x)) + x \cdot f_1(\pi_x(x))$$
For each $x$ in the domain, we already have $f_0(\pi_x(x))$ and $f_1(\pi_x(x))$ calculated. Note that for $x$ and $-x$ we use the same $f_0$ and $f_1$ results, and only the $x$ factor in the equation changes. This is because squaring $x$ and $-x$ yields the same result, and the squared domain $\pi_x(\text{domain})$ was the one used for the IFFT calls.
Note that the above explains the univariate case. For the final bivariate case (since this is a recursive algorithm), the idea is the same, but the $y$ term is introduced when calculating the reconstructed evaluations: $f(x, y) = f_0(x) + y \cdot f_1(x)$.
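And a matching IFFT sketch under the same assumptions (again my own code, reusing `P`, plus `cfft`, `D`, and `coeffs` from the previous sketch for a round-trip check):

```python
def icfft1(coeffs, xs):
    """Univariate IFFT: coefficients + x-domain (closed under negation) -> {x: f(x)}."""
    if len(coeffs) == 1:
        return {xs[0]: coeffs[0]}            # base case: a constant function
    xps = list({(2 * x * x - 1) % P for x in xs})  # halved domain pi_x(xs)
    e0 = icfft1(coeffs[0::2], xps)           # even coefficients -> f_0
    e1 = icfft1(coeffs[1::2], xps)           # odd coefficients  -> f_1
    # f(x) = f0(pi_x(x)) + x · f1(pi_x(x)); x and -x reuse the same e0/e1 values.
    return {x: (e0[(2 * x * x - 1) % P] + x * e1[(2 * x * x - 1) % P]) % P
            for x in xs}

def icfft(coeffs, D):
    """Bivariate IFFT: coefficients + circle domain -> {(x, y): f(x, y)}."""
    xs = list({x for (x, _) in D})
    e0 = icfft1(coeffs[0::2], xs)
    e1 = icfft1(coeffs[1::2], xs)
    return {(x, y): (e0[x] + y * e1[x]) % P for (x, y) in D}  # f = f0 + y·f1

assert cfft(icfft(coeffs, D)) == coeffs      # FFT and IFFT are inverses
```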
What about FRI?
Note that FRI is very similar to the FFT, but in the function decomposition we do a random linear combination of both parts, so if you understood the FFT/IFFT algorithms, you’ve already done the heavy lifting!
Final words
The idea of the Circle-STARK FFT is analogous to the usual roots-of-unity FFT:
- The concept of “squaring” isn’t $x \mapsto x^2$ but the squaring map $\pi$ applied to cosets of a circle group.
- The “polynomials” we’re interpolating or evaluating live in a space with the properties required for soundness.
- The basis isn’t the usual $1, x, x^2, \ldots$ but another construction using the $v_k$ polynomials, which, if you compose them with $\pi_x$, give “the same effect” as squaring (i.e., $v_k \circ \pi_x = v_{k+1}$).
The paper does a fantastic job explaining all this. It has a more formal style, as it should: it’s the “official” explanation, so it must be the most detailed and structured one.
I hope you’ve found this alternative (hand-holding) style explanation helpful!