$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} \def\Zp{\mathbb{Z}_p} \def\randget{\stackrel{\$}{\gets}} \def\lagr{L} $$
This article assumes familiarity with Shamir secret sharing^{1}, a technique that allows a dealer to “split up” a secret $s$ amongst $n$ players such that any subset of size $\ge t$ can reconstruct $s$ yet no subset of size $<t$ learns anything about the secret.
Recall that a secret ${\color{green}s}\in \Zp$ is $t$-out-of-$n$ secret-shared as follows:
The dealer encodes $s$ as the 0th coefficient in a random degree-$(t-1)$ polynomial $\color{green}{f(X)}$: \begin{align} f(X) &= s + \sum_{k=1}^{t-1} f_k X^k,\ \text{where each}\ f_k\randget \Zp \end{align}
The dealer gives each player $i\in [n]$ their share $s_i$ of $s$:
\begin{align}
\color{green}{s_i} &= f(i)\\
&= s + \sum_{k=1}^{t-1} f_k \cdot i^k
\end{align}
The shares $[s_1, s_2, \ldots, s_n]$ define the $t$-out-of-$n$ sharing of $s$.
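As a concrete illustration, here is a minimal Python sketch of the dealing step. (The prime $p$, the parameters, and all function names are my own toy choices, not from any particular library.)

```python
import random

p = 2**61 - 1   # a prime, so we can work in the field Z_p
t, n = 3, 5     # a 3-out-of-5 sharing

def deal(s, t, n):
    """Sample a random degree-(t-1) polynomial f with f(0) = s
    and return the shares [f(1), f(2), ..., f(n)]."""
    coeffs = [s] + [random.randrange(p) for _ in range(t - 1)]
    def f(x):
        y = 0
        for c in reversed(coeffs):   # Horner evaluation mod p
            y = (y * x + c) % p
        return y
    return [f(i) for i in range(1, n + 1)]

shares = deal(1234, t, n)   # shares[i-1] is player i's share s_i = f(i)
```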
Recall the definition of a Lagrange polynomial w.r.t. a set of evaluation points $T$:
\begin{align}
\forall i\in T,
\color{green}{\lagr_i^T(X)} &= \prod_{k\in T, k\ne i} \frac{X - k}{i - k}
\end{align}
The relevant properties of $\lagr_i^T(X)$ are that:
\begin{align}
\lagr_i^T(i) &= 1,\ \forall i \in T\\
\lagr_i^T(j) &= 0,\ \forall i, j\in T, i\ne j
\end{align}
Any subset $T\subseteq[n]$ of $t$ or more players can reconstruct $s$ by combining their shares as follows:
\begin{align}
\sum_{i\in T} \lagr_i^T(0) s_i &= \sum_{i\in T}\lagr_i^T(0) f(i) = f(0) = s\\
\end{align}
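In the same toy Python setup (hypothetical names; $p$ is a prime modulus), reconstruction from any $t$ or more shares might look like this:

```python
p = 2**61 - 1   # toy prime field (an assumption of this sketch)

def lagrange_at_zero(i, T):
    """Compute L_i^T(0) = prod_{k in T, k != i} (0 - k) / (i - k) mod p."""
    num, den = 1, 1
    for k in T:
        if k != i:
            num = num * (-k) % p
            den = den * (i - k) % p
    return num * pow(den, -1, p) % p   # modular inverse via Python 3.8+ pow

def reconstruct(shares):
    """shares: dict mapping player index i to s_i = f(i), with |shares| >= t."""
    T = list(shares)
    return sum(lagrange_at_zero(i, T) * s_i for i, s_i in shares.items()) % p
```

For example, with $f(X) = 42 + 5X + 7X^2$, the shares $f(1)=54$, $f(3)=120$, $f(5)=242$ reconstruct $s = 42$.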
Suppose the old players, who have a $t$-out-of-$n$ sharing of $s$, want to reshare $s$ with a set of $\color{green}{n'}$ new players such that any $\color{green}{t'}$ of them can reconstruct $s$.
In other words, they want to $t'$-out-of-$n'$ reshare $s$.
Importantly, they want to do this without leaking $s$ or any info about the current $t$-out-of-$n$ sharing of $s$. A technique for this, whose origins are (likely?) in the BGW paper^{2}, is described by Cachin et al.^{3} and involves four steps:
Each old player $i$ first “shares their share” with the $n'$ new players: i.e., they randomly sample a degree-$(t'-1)$ polynomial $\color{green}{r_i(X)}$ that shares their $s_i$:
\begin{align}
\color{green}{r_i(X)} &= s_i + \sum_{k=1}^{t'-1} {\color{green}r_{i,k}} X^k,\ \text{where each}\ r_{i,k}\randget \Zp
\end{align}
Let ${\color{green}z_{i,j}}$ denote the share of $s_i$ for new player $j\in[n']$. \begin{align} {\color{green}z_{i,j}} = r_i(j) \end{align} Then, each old player $i$ sends $z_{i,j}$ to each new player $j\in [n']$.
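In the running toy Python sketch (same hypothetical prime field and names), this “share your share” step looks just like dealing, but with $s_i$ in place of $s$ and the new threshold $t'$:

```python
import random

p = 2**61 - 1   # toy prime field (assumption of this sketch)

def reshare(s_i, t_new, n_new):
    """Old player i samples r_i(X) of degree t'-1 with r_i(0) = s_i and
    returns [r_i(1), ..., r_i(n')], where z_{i,j} = r_i(j) goes to new player j."""
    coeffs = [s_i] + [random.randrange(p) for _ in range(t_new - 1)]
    def r(x):
        y = 0
        for c in reversed(coeffs):   # Horner evaluation mod p
            y = (y * x + c) % p
        return y
    return [r(j) for j in range(1, n_new + 1)]
```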
And voilà: SUCH A BEAUTIFUL, SIMPLE PROTOCOL for secret resharing.
It’s easy to see why this works if we reason about the underlying polynomial defined by the new players’ shares $z_j$.
Specifically, letting $H$ denote the set of old players whose reshares the new players accepted (where $|H|\ge t$), the new shares define a degree-$(t'-1)$ polynomial $r(X)$ with $r(0) = s$:
\begin{align}
r(X) &= \sum_{i\in H} \lagr_i^H(0) r_i(X)\\
&= \sum_{i\in H} \lagr_i^H(0) \left(s_i + \sum_{k=1}^{t'-1} r_{i,k} \cdot X^k\right)\\
&= \left(\sum_{i\in H} \lagr_i^H(0) f(i)\right) + \left(\sum_{i\in H}\lagr_i^H(0) \left(\sum_{k=1}^{t'-1} r_{i,k} \cdot X^k\right)\right)\\
&= s + \sum_{i\in H}\lagr_i^H(0) \left(\sum_{k=1}^{t'-1} r_{i,k} \cdot X^k\right)\\
&\stackrel{\mathsf{def}}{=} s + \sum_{k=1}^{t'-1} {\color{green}r_k} X^k
\end{align}
In other words, $[s, r_1, r_2,\ldots,r_{t'-1}]$ are the coefficients of the polynomial obtained from the linear combination of the $r_i(X)$’s by the Lagrange coefficients $\lagr_i^H(0)$.
In more detail:
\begin{align}
r(X) &= s + \left(\begin{matrix}
&\lagr_{i_1}^H(0) \left(\sum_{k=1}^{t'-1} r_{i_1,k} \cdot X^k\right) + {}\\
&\lagr_{i_2}^H(0) \left(\sum_{k=1}^{t'-1} r_{i_2,k} \cdot X^k\right) + {}\\
&\ldots\\
&\lagr_{i_{|H|}}^H(0) \left(\sum_{k=1}^{t'-1} r_{i_{|H|},k} \cdot X^k\right)\\
\end{matrix}\right)
\end{align}
Let ${\color{green}c_{i_j, k}} \stackrel{\mathsf{def}}{=} \lagr_{i_j}^H(0) \cdot r_{i_j, k}$.
Then, we can rewrite the above as:
\begin{align}
r(X) &= s + \left(\begin{matrix}
&\sum_{k=1}^{t'-1} c_{i_1,k} \cdot X^k + {}\\
&\sum_{k=1}^{t'-1} c_{i_2,k} \cdot X^k + {}\\
&\ldots\\
&\sum_{k=1}^{t'-1} c_{i_{|H|},k} \cdot X^k\\
\end{matrix}\right)
\end{align}
Let ${\color{green}r_k}\stackrel{\mathsf{def}}{=} \sum_{i_j \in H} c_{i_j, k}$.
Then, we can rewrite the above as:
\begin{align}
r(X) &\stackrel{\mathsf{def}}{=} s + \sum_{k=1}^{t'-1} r_k X^k
\end{align}
And, as we saw in Equation \ref{eq:newshare} above, any new player $j\in[n']$ can get their share of $r(X)$ via:
\begin{align}
z_j
&= \sum_{i\in H} \lagr_i^H(0) r_i(j)\\
&= r(j)
\end{align}
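Here is an end-to-end check of the resharing in the running toy Python sketch, with fixed (rather than random) resharing polynomials so the numbers are easy to follow. All names and parameters are my own:

```python
p = 2**61 - 1   # toy prime field (assumption of this sketch)

def lagrange_at_zero(i, T):
    """L_i^T(0) = prod_{k in T, k != i} (0 - k) / (i - k) mod p."""
    num, den = 1, 1
    for k in T:
        if k != i:
            num = num * (-k) % p
            den = den * (i - k) % p
    return num * pow(den, -1, p) % p

def combine(contribs):
    """Lagrange-combine at 0: works both for a new player computing z_j from the
    z_{i,j}'s (i in H) and for new players reconstructing s from their z_j's."""
    T = list(contribs)
    return sum(lagrange_at_zero(i, T) * v for i, v in contribs.items()) % p

# 2-out-of-3 sharing of s = 99 via f(X) = 99 + 4X; old players H = {1, 2}
# hold s_1 = 103 and s_2 = 107. They reshare with t' = 2 using the (fixed)
# polynomials r_1(X) = 103 + X and r_2(X) = 107 + 3X.
z = {1: {1: 104, 2: 105},    # z[i][j] = r_i(j)
     2: {1: 110, 2: 113}}

z_1 = combine({1: z[1][1], 2: z[2][1]})   # new player 1's share r(1)
z_2 = combine({1: z[1][2], 2: z[2][2]})   # new player 2's share r(2)
s = combine({1: z_1, 2: z_2})             # any t' = 2 new players recover s
```

Here $r(X) = 2\cdot r_1(X) - r_2(X) = 99 - X$, so the new shares are $r(1) = 98$ and $r(2) = 97$, which indeed recombine to $s = 99$.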
Big thanks to Benny Pinkas for pointing me to the BGW paper^{2} and for pointing out subtleties in what it means for an old player to correctly share their share.
How to Share a Secret, by Shamir, Adi, in Commun. ACM, 1979, [URL]
Completeness Theorems for Non-cryptographic Fault-tolerant Distributed Computation, by Ben-Or, Michael and Goldwasser, Shafi and Wigderson, Avi, in Proceedings of the Twentieth Annual ACM Symposium on Theory of Computing, 1988, [URL]
Asynchronous Verifiable Secret Sharing and Proactive Cryptosystems, by Cachin, Christian and Kursawe, Klaus and Lysyanskaya, Anna and Strobl, Reto, in Proceedings of the 9th ACM Conference on Computer and Communications Security, 2002, [URL]
This step is non-trivial and is where most protocols work hard to achieve efficiency. For example, see ^{3}. Publicly-verifiable secret sharing (PVSS) on a public bulletin board such as a blockchain is a simple (albeit naive) way of achieving this: there will be $n$ PVSS transcripts, one for each reshared $s_i$, and everyone can agree on the set $H$ of valid transcripts.
I once ran into a video where Neil deGrasse Tyson, in relation to a debate with folks who didn’t “believe”^{1} in global warming or evolution, said the following:
“The good thing about science is that it’s true whether or not you believe in it” – Neil deGrasse Tyson
You can see the short video below:
The way to conceptualize, and popularize^{2}, science is not as being “true” or “false.”
Science is a process that we engage in to discover truths. And it often leads us astray^{3}$^,$^{4}, which is why the idea of “science being true” is at best misleading and at worst dangerous.
Science is not a belief in any sense of the word. Although many religions might be based on belief, science is an entirely different beast.
Science requires not belief, but experimentation, theory postulation and theory falsification (or refinement).
Those are fancy words, but here’s an example everyone can understand:
Question: How might we find out if there is a universal acceleration that falling objects have? Is it $5\ m/s^2$? Is it $10\ m/s^2$?
Poor answer: We could just blindly accept whatever answer Neil deGrasse Tyson gives us. After all, “science is true whether you believe it or not,” no?
Better answer: No. That is a kind of “scientism” akin to religious belief. Instead, we could engage in the scientific process:
Richard Feynman, a physicist you might know^{6} from his work on quantum electrodynamics (QED)^{7}, explains very beautifully:
What is inherent in science is not “truth” but “uncertainty.”
In one of his famous physics lectures, Feynman beautifully elucidates the leap of faith scientists must make when going from a concrete scientific experiment to a general law that predicts beyond what the experiment tested for.
I include an excerpt below, but I find the video to be much more convincing (and entertaining) to watch:
In the video above, Feynman genuinely asks the audience:
Why are we able to extend our laws to regions that we’re not sure?
How is it possible?
Why are we so confident??
He then explains that extending such general laws is the thing we must do if we are to learn anything new beyond the outcome of the concrete experiment. The price we pay, of course, is that we (scientists) stand to be proven wrong in the future. In other words, the nature of scientific laws or theories is that they are uncertain, subject to being fully falsified or partially refined.
It’s not a mistake to say that [the law] is true in a region where you haven’t looked yet.
If you will not say that it’s true in a region that you haven’t looked yet, [then] you don’t know anything!
[In other words,] If the only laws that you find are those which you just finished observing, then… you can’t make any predictions!
[But] the only utility of the science is to go on and to try and make guesses.
[…]
So what we do is always to stick our neck out!
And that of course means that the science is uncertain!
[…]
We always must make statements about the regions that we haven’t seen, or [else] there’s no use in the whole business.
Feynman reiterates this point to make it stick:
We do not know all the conditions that we need for an experiment!
[…]
So in order to have any utility at all to the science…
In order not simply to describe an experiment that’s just been done, we have to propose laws beyond their range.
And there’s nothing wrong with that. That’s the success; that’s the point!
And, uhh, that makes the science uncertain.
If you thought before that science is certain, well, that’s just an error on your part.
Please don’t be fooled into thinking “science is true” (whatever that’s supposed to mean). Such claims are nonsensical.
First, they are nonsensical because science is a process. Processes cannot be “true” or “false”; they are just a way of doing things. Second, because scientific theories are (mostly) falsifiable, which is a fancy way of saying there is room to prove them wrong (i.e., they might actually be false!).
If one cannot rely on science “to be true”, what should one do instead?
The first option is to take a scientific theory and try to falsify it. For example, Einstein picked Newton’s laws of motion, showed they cannot properly describe motion for objects moving close to the speed of light, and generalized them. And that’s how we got the theory of relativity and, apparently, GPS on our phones.
Note: I look at this as falsifying Newton’s theories, which were simply not going to work accurately in Einstein’s extreme conditions, while others might look at it as refining them (since Newton’s laws of motion still approximate things very well).
The second option is to devise an experiment, make some observations, generalize them into a theory, and see if your theory holds water by checking if it can predict anything useful. Then, you can go back to the first option and try to falsify your theory.
In other words, engage in science as opposed to “believing in science.”
If you must “believe in science [the scientific process]”, realize you are simply deferring to the authority of other scientists and the soundness of the scientific peer review process.
Yet other scientists are people like you and me.
And we make mistakes, are inherently ignorant or have perverse incentives^{8}.
I recognize that engaging in the scientific process is a very high bar to meet. I also recognize it’s not clear how to reach consensus faster on scientific theories of high importance such as anthropogenic climate change. But I do know that preaching “science is true; believe science” is a surefire way of moving from scientific (falsifiable) territory into religious (unfalsifiable) territory. This would defeat the original goals of science: to ensure we find out when we are wrong and, as a result, get a bit closer to the truth.
The most widely-accepted philosophy of science comes from Karl Popper, who argued that a theory or statement is scientific only if it can, in principle, be empirically falsified (a.k.a., disproven). In other words, if a theory does not admit any tests that could potentially show it to be false, then the theory is not scientific.
But not all philosophers of science agree with Popper. Some argue that scientific theories must not only be falsifiable but also empirically-confirmed to some degree. Others argue that scientific theories can be unfalsifiable in practice as long as they are part of broader theoretical frameworks that are testable as a whole. Or, as long as these theories could become falsifiable with future, better technology (e.g., string theory). In this last case, indirect evidence, logical coherence, and the theory’s ability to explain and predict phenomena is what gets the theory to be viewed as scientific.
In general, celebrating the scientific method does not mean we should dispense with all unfalsifiable theories.
A contrived-but-simple example: your friend tells you “There exist pink cars with yellow stripes!” That is an unfalsifiable theory. Should you ignore everything coming out of their mouth next?
Admittedly, it would be very difficult to convince your friend such cars don’t exist: you’d have to show them that all cars in the universe don’t match this description, which would (likely) take infinite time. Yet the theory can be shown to be true by a trivial observation of such a car. And indeed, I know a person with such a car (so I know the theory is true)!^{9}
The point is, we make unfalsifiable statements (theories) like this all the time and we get along just fine.
Interestingly, religious theories (e.g., “there is a God”) fall into the same category of unfalsifiable but potentially-true theories. In fact, people who claim to have had direct experiences of the divine are firmly-convinced of their veracity. I no longer believe these theories should be thrown away (i.e., I recognize the limitations of my own form of consciousness). Instead, I think that the alleged consequences of such theories should be acted upon carefully since, if they are false, there is no way to find out due to their unfalsifiable nature. (In layman terms, don’t burn or kill people because they don’t believe in the same things you do.)
Here’s a more palatable unfalsifiable theory, though harder to formalize. My wise mother once proposed this theory to me when I was being a smartass about how bad unfalsifiable theories can be. She said “Okay, how about the theory that love exists in the world. Are you gonna throw away that theory too because it’s unfalsifiable?”. (If you don’t know this theory to be true, then you have other problems and I wish I could give you a warm hug.)
Science is not simply “true”. It is a process that you could engage in. It is not an axiom that you take for granted. It is not an authority that you defer to.
If you don’t have time to engage in science, you can trust-but-verify. However, there is a risk you’ll be deceived by:
Sentences like these should be critically inspected.
Rule of thumb: mentally remove the word “science” (and its derivatives) from sentences. See if those sentences still sound convincing. If they do not, something is being left out and you could investigate. More and more, the word “science” is used authoritatively without so much as a citation to the relevant scientific work(s).
I do admire deGrasse Tyson’s efforts to bring the scientific process to the masses. And I’m sure he meant well in (mis)stating that “science is true” (e.g., perhaps he meant to convey, in an entertaining way, his own confidence in the scientific process).
Nonetheless, the conflation of “truth” with “science” and the implication that one should “believe” science worries me. I, for one, find this borderline-dangerous in our increasingly-polarized society, which is more and more filled with divisive beliefs. Adding science to this list of beliefs would not serve anyone.
I hope this blog post clearly articulated that science is a (fallible) process and must not be blindly trusted.
I write “believe” in quotes because I find the usage of the term “believe” to be over-simplifying when it comes to how one should engage with complex scientific theories like the theory of anthropogenic climate change. In other words, to simply have to pick between “Do you or do you not believe in global warming?” is a disastrous way of getting any clarity on the causes of global warming. A better way might be to ask someone “What evidence is there for anthropogenic climate change and have you taken a close look at it?”. (PS: This blog post is not about the climate change issue.) ↩
I do want to recognize Neil deGrasse Tyson’s amazing efforts in popularizing science by speaking about it in a particularly entertaining way. In a way, that’s likely the problem behind his “science is true” claim: he did not carefully balance between entertainment and actual education. ↩
For example, in some less enlightened parts of the world, the “scientific” belief used to be that Earth was in the center of our solar system and that the Sun and stars revolved around it. Galileo Galilei was put under house arrest by the Roman Catholic church for the “heresy” of giving evidence that this theory may be wrong. ↩
Another falsified theory was the “Aether Theory.” Aether was believed to be a medium that filled space and enabled the propagation of light. The famous Michelson-Morley experiment in 1887 failed to detect aether, leading to the eventual acceptance of Einstein’s theory of relativity, which does not require the existence of aether. ↩
This principle of air resistance was famously demonstrated, albeit on the Moon, by astronaut David Scott during the Apollo 15 mission. He dropped a feather and a hammer side by side on the Moon. Due to the absence of air resistance on the Moon, they hit the lunar surface at the same time, providing a visual demonstration of this principle. ↩
You might also know Feynman from his hilarious autobiographical books such as “Surely you’re joking, Mr. Feynman! (Adventures of a Curious Character)“. ↩
Feynman, Julian Schwinger and Shin’ichirō Tomonaga were jointly awarded the Nobel Prize in 1965 for their contributions to QED. ↩
Several books can be (and probably have been) written on the perverse incentives in academia, in scientific peer-review, science funding, etc. ↩
I do not. Just trying to make a point. ↩
Non-membership proofs in Merkle trees are surprisingly elusive to many people. The problem statement is very simple:
Suppose you have a server who wants to authenticate elements of a set $S$ to a client without ever sending the whole set to this client.
(For simplicity, let’s assume this is a set of numbers.)
Specifically, the server first computes a succinct authentication digest of the set, denoted by $d$, and sends $d$ to the client.
Then, the server is able to prove either membership or non-membership of an element in the set by sending a succinct proof to the client which the client can efficiently verify with respect to the digest $d$.
Design a Merkle tree-based solution for this problem.
The most popular solution to this problem seems to be to build a Merkle tree whose leaves are sorted. This, unfortunately, is a rather sub-optimal solution, both from a security and a complexity point of view.
In this blog post, I hope to dispel the myth of the effectiveness of this sorted-leaves Merkle tree scheme.
Recall that, if we only need to prove membership, it is very easy to solve the problem by building a Merkle tree over all elements in the set and letting the digest be the Merkle root.
For example, here’s how this would look for a particular choice of set $S$. (Original slides here.)
Then, a membership proof would be a Merkle sibling path to the proved element’s leaf (i.e., the nodes in yellow):
The client can easily verify the proof by computing the hashes along the path from the leaf to the root, and checking that it obtains the same root hash as it has stored locally:
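As a reference point, here is a minimal Python sketch of such a membership-only Merkle tree (power-of-two leaf count, domain-separated hashing; all helper names are my own, not from any library):

```python
import hashlib

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def leaf_hash(x: int) -> bytes:
    return H(b"leaf" + str(x).encode())   # domain-separate leaves from internal nodes

def build(leaves):
    """Merkle tree over 2^k elements; returns all levels, root = levels[-1][0]."""
    levels = [[leaf_hash(x) for x in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([H(b"node" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, idx):
    """Sibling path (bottom-up) for the leaf at position idx."""
    path = []
    for lvl in levels[:-1]:
        path.append(lvl[idx ^ 1])
        idx //= 2
    return path

def verify(root, x, idx, path):
    """Hash up from the leaf, using idx's bits to order each sibling pair."""
    h = leaf_hash(x)
    for sib in path:
        h = H(b"node" + sib + h) if idx % 2 else H(b"node" + h + sib)
        idx //= 2
    return h == root
```

For example, with the leaves laid out as `[2, 3, 6, 7, 9, 11, 13, 14]`, `verify(root, 9, 4, prove(levels, 4))` succeeds because the recomputed path hashes match the stored root.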
It seems that many people believe sorting the leaves is the right approach to enable non-membership proofs.
This blog post will argue, from three different perspectives, why this is a sub-optimal choice.
Okay! Let’s say that, instead of the solution from above, the server does indeed first sort the set $S$ as $[2, 3, 6, 7, 9, 11, 13, 14]$ and then computes the Merkle tree:
Clearly, the server can still prove membership as before: just give a Merkle sibling path to the proved element’s leaf.
But, now, it is also possible to prove non-membership of an element.
For example, we can prove non-membership of $8$ by showing that (a) both $7$ and $9$ are in the tree and (b) they are adjacent. This implies there’s no room where $8$ could fit. Therefore, $8$ cannot be in the tree:
In other words, the two membership proofs for the adjacent leaves of $7$ and $9$ constitute a non-membership proof for $8$, which would have to be placed between them (but cannot be, since “there’s no room”).
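Concretely, the client-side check might look as follows. This is a sketch over a toy Merkle tree (all helper names are hypothetical, and the helpers are included so the snippet stands alone):

```python
import hashlib

def H(b): return hashlib.sha256(b).digest()
def leaf_hash(x): return H(b"leaf" + str(x).encode())

def build(leaves):
    levels = [[leaf_hash(x) for x in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([H(b"node" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, idx):
    path = []
    for lvl in levels[:-1]:
        path.append(lvl[idx ^ 1])
        idx //= 2
    return path

def verify(root, x, idx, path):
    h = leaf_hash(x)
    for sib in path:
        h = H(b"node" + sib + h) if idx % 2 else H(b"node" + h + sib)
        idx //= 2
    return h == root

def verify_non_membership(root, e, left, i, path_left, right, path_right):
    """Accept iff left < e < right, the two leaves sit at ADJACENT positions
    i and i+1, and both membership proofs verify. NOTE: this is sound only if
    the digest was honestly computed over sorted leaves (see the attack below)."""
    return (left < e < right
            and verify(root, left, i, path_left)
            and verify(root, right, i + 1, path_right))
```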
Can you spot the security issue? It’s a bit subtle and many people miss it…
Here it is: this scheme is secure only if the server correctly computes the Merkle tree over the sorted leaves.
Otherwise, if the server is malicious, it can re-order the leaves and pretend that an element $e$ is both in the set and not in the set.
For example, the malicious server could compute the tree as follows:
Note that the malicious server left 7 adjacent to 9, so that it can still give what appears to be a valid non-membership proof for 8:
At the same time, note that the malicious server inserted a leaf for 8 somewhere else. As a result, the server can still give what appears to be a valid membership proof for 8:
This, of course, is very bad: the server was able to prove two inconsistent statements about the membership of 8 in the digested set. Put differently, it clearly cannot be that 8 is both in $S$ and not in $S$ at the same time. Therefore, the sorted-leaves Merkle tree is insecure when the server cannot be trusted to produce correct digests (and we’ll define security below).
In other words, this type of attack completely ruins security: it makes any proof meaningless to the client (e.g., proof that $8\in S$), since it could easily be followed by a contradicting proof (e.g., a contradicting proof that $8\notin S$).
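The attack is easy to reproduce concretely. In the toy sketch below (hypothetical helper names; an honest server would sort the leaves), the malicious server keeps $7$ and $9$ adjacent but sneaks a leaf for $8$ in place of $11$. Both a membership proof and a non-membership proof for $8$ then verify against the same root:

```python
import hashlib

def H(b): return hashlib.sha256(b).digest()
def leaf_hash(x): return H(b"leaf" + str(x).encode())

def build(leaves):
    levels = [[leaf_hash(x) for x in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([H(b"node" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, idx):
    path = []
    for lvl in levels[:-1]:
        path.append(lvl[idx ^ 1])
        idx //= 2
    return path

def verify(root, x, idx, path):
    h = leaf_hash(x)
    for sib in path:
        h = H(b"node" + sib + h) if idx % 2 else H(b"node" + h + sib)
        idx //= 2
    return h == root

def verify_non_membership(root, e, left, i, pl, right, pr):
    return left < e < right and verify(root, left, i, pl) and verify(root, right, i + 1, pr)

# Malicious leaf order: 7 and 9 stay adjacent, but 8 also appears as a leaf.
evil = [2, 3, 6, 7, 9, 8, 13, 14]
lv = build(evil)
root = lv[-1][0]

member = verify(root, 8, 5, prove(lv, 5))                                         # "8 is in S"
non_member = verify_non_membership(root, 8, 7, 3, prove(lv, 3), 9, prove(lv, 4))  # "8 is not in S"
```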
Not all hope is lost. In some settings, it can be reasonable to assume the digest (i.e., Merkle root) was produced correctly.
For example, in distributed consensus settings (a.k.a., in “blockchains”), there is no single server that dictates what the Merkle root of the data is. Instead, all $n = 3f+1$ servers try to compute the same correct root and vote on it. Servers who deviate from the correct root are ignored and consensus is reached on the correct one by a subset of $2f+1$ honest servers.
Therefore, in this setting, it is okay to rely on the sorted-leaves Merkle tree construction. (I’ll still argue here and here why you shouldn’t, but from different perspectives.)
Other harmless settings include single-client data outsourcing, where a client sorts & Merkle hashes his own data correctly, and transfers everything but the Merkle root to a malicious server.
Since the client has computed the correct root on his own, the client can rely on the server’s (non)membership proofs.
One thing worth emphasizing is that ad-hoc fixes to the problem of a potentially-incorrect digest are not worth it, especially since one can get a construction that needs no fixing from, e.g., a Merkle trie. Specifically, it is not worth it to require the server to prove that it correctly sorted the leaves (e.g., via a SNARK). Also, it is not worth it to rely on fraud proofs when one can have provably-correct behavior all the time. Lastly, it is not worth it to probabilistically audit the data structure to see if you can find two incorrectly-sorted leaves. None of these approaches are worth it because there exist more secure Merkle tree constructions like Merkle tries. Plus, these constructions are easier to update and have smaller proof sizes!
We can formalize the setting in which authenticated set constructions (like the sorted-leaves Merkle tree) are secure.
Specifically, we can define a notion of weak (non)membership soundness that captures the idea that the malicious server must compute the digest correctly:
An authenticated set scheme has weak (non)membership soundness if for all (polynomial-time) adversaries $A$, the probability that $A$ outputs a set $S$, an element $e$, and two proofs $\pi$ & $\pi’$ such that, letting $d$ be the (correct) digest of $S$, $\pi$ verifies as a valid membership proof for $e$ (w.r.t. $d$) while $\pi’$ also verifies as a valid non-membership proof for $e$ (w.r.t. $d$), is negligible in the security parameter of the scheme.
Notice that the adversary outputs a set of elements from which the correct digest $d$ is computed.
In fact, there is a long line of academic literature on 2-party and 3-party authenticated data structures that relies on this type of weaker soundness definition (see Papamanthou’s PhD thesis^{1} for a survey).
Unfortunately, many applications today inherently rely on untrusted publishers who can compute malicious digests of their data.
For example, in key transparency logs such as Certificate Transparency (CT), log servers can present any digest to new clients joining the system. Therefore, in this setting, authenticated data structures (whether sets or not), must satisfy a stronger notion of security which allows the adversary to construct the digest maliciously.
In fact, such a stronger notion simply requires that the adversary output the digest $d$ directly, which gives the adversary freedom to construct an incorrect one as in our attack above:
An authenticated set scheme has strong (non)membership soundness if for all (polynomial-time) adversaries $A$, the probability that $A$ outputs a digest $d$, an element $e$, and two proofs $\pi$ & $\pi’$ such that $\pi$ verifies as a valid membership proof for $e$ (w.r.t. $d$) while $\pi’$ also verifies as a valid non-membership proof for $e$ (w.r.t. $d$), is negligible in the security parameter of the scheme.
The moral of the story is to pick a Merkle construction that has this stronger notion of security, unless you are sure that your setting allows for the weaker notion and you stand to benefit from relaxing the security (e.g., perhaps because you get a faster construction). A good example of this is the KZG-based authenticated dictionary from Ethereum Research^{2} which has weak soundness (as would be defined for dictionaries), but that’s okay since their consensus setting can accommodate it.
This one is much easier to explain.
Imagine you want to add a new element to your sorted-leaves Merkle tree of size 8.
What if it is smaller than everything else and has to be inserted as the first leaf of the tree?
Then, you would have to completely rehash the entire tree to incorporate this new leaf! This would take $O(n)$ work in a tree of $n$ leaves.
The same problem arises if you’d like to remove the first leaf.
To deal with the slowness of insertions, one can take an amortized approach and maintain a forest of sorted-leaves Merkle trees, where (1) new leaves are appended to the right of the forest as their own size-1 trees and (2) trees of the same size $2^i$ for any $i \ge 0$ get “merged” together by merge-sorting their leaves and rehashing. One can show this approach has $O(\log{n})$ amortized insertion cost. However, such amortized approaches still suffer from $O(n)$ worst-case times and must be de-amortized to bring the worst-case cost down to the amortized cost^{3}.
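The carry-merge is exactly binary addition over tree sizes. Here is a sketch in Python (my own toy formulation: sorted leaf-lists stand in for the Merkle trees, and the rehashing after each merge is omitted):

```python
def merge(a, b):
    """Merge-sort the leaves of two equal-size trees."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def insert_amortized(forest, x):
    """forest: list of sorted leaf-lists with distinct power-of-two sizes,
    largest first. Append x as a size-1 tree, then merge equal-size trees
    like a binary carry. O(log n) amortized, but a single insert can still
    touch all n leaves in the worst case."""
    new = [x]
    while forest and len(forest[-1]) == len(new):
        new = merge(forest.pop(), new)
    forest.append(new)
```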
On the other hand, dealing with deletions can be easier. Specifically, if you do not care about wasted space, then deletions can be done faster by simply marking the leaf as “removed” and trying to garbage-collect as many empty subtrees as you can. Nonetheless, in the worst case, the storage complexity of an $n$-leaf Merkle tree after $O(n)$ deletes remains $O(n)$ (e.g., imagine deleting every even-numbered leaf).
The other problem with the sorted leaves construction is that two Merkle paths must be given as a non-membership proof.
In the best case, this can be exactly $\log{n}-1$ hashes, but in the worst case this can be as much as $2\log{n}-2$ hashes (e.g., when one proof is in the left subtree and the other is in the right subtree).
This is not so great if proof size is a concern. It is also not so great when the Merkle tree is stored on disk since it can double the proof reading I/O cost.
Furthermore, actually achieving the best-case proof size complexity in an implementation can be tricky: the developer must efficiently batch the fetching of the two Merkle proofs from disk or memory, taking care never to fetch the same sibling hash twice (or waste I/O).
…and you want to maintain strong (non)membership soundness, then there is a simple way to fix your construction.
All you have to do is store, inside each internal node of your tree, the minimum and maximum element in that node’s subtree.
Now, a Merkle proof, whether for membership or not, has to additionally reveal the minimums and maximums along the proven path. Importantly, when hashing up to verify the Merkle proof, the verifier must ensure the revealed leaf and all revealed min/max pairs are consistent and hashed correctly as part of the verification.
This will of course further increase the proof size of your construction. It will also increase the complexity of implementing the verification procedure, since the min/max ranges have to be incorporated into the hashing and one must check that, for all revealed ranges in the proof, a parent’s range encompasses its children’s ranges.
Feel free to consider this approach. You could try reproducing the attack from above. You’ll see that while you can present one proof, you’ll have difficulty presenting the other because you will not be able to forge the authenticated min/max ranges. Thus, this construction has strong (non)membership soundness.
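To make the node format concrete, here is a sketch (the encoding is my own, not a standard): each node carries $(\min, \max, \text{hash})$, and the hash binds the children’s hashes together with the node’s own range, so a mis-ordered pair of subtrees cannot even be hashed into range-consistent nodes:

```python
import hashlib

def H(b): return hashlib.sha256(b).digest()

def leaf(x):
    """A leaf's range is the singleton [x, x]."""
    return (x, x, H(b"leaf" + str(x).encode()))

def parent(l, r):
    """Bind the subtree range (min, max) into the hash. The order check
    l.max < r.min is what a verifier re-derives from the revealed min/max
    pairs when checking a proof."""
    if not l[1] < r[0]:
        raise ValueError("children out of order")
    mn, mx = l[0], r[1]
    h = H(b"node" + str(mn).encode() + b"|" + str(mx).encode() + b"|" + l[2] + r[2])
    return (mn, mx, h)

def build(sorted_leaves):
    level = [leaf(x) for x in sorted_leaves]
    while len(level) > 1:
        level = [parent(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]   # (min, max, root hash)
```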
This deserves its own post, but here are the key reasons you should probably use a Merkle trie:
There are of course some disadvantages too, but I find them negligible:
In fact, some folks argue that the best trie implementation is via critbit trees^{5}. Unfortunately, I do not know enough about their benefits, especially when Merkleized, but this is probably very much worth exploring.
Kocher^{6} proves non-revocation of certificates via a sorted-leaves-like approach. His approach Merkleizes a list of sorted, non-revoked certificate ID ranges. Specifically, each leaf is a pair $(a, c)$ that says $a$ has been revoked but all certificates $b$ such that $a < b < c$ have not been revoked.
Thus, one can prove non-revocation of $b$ by revealing the leaf $(a, c)$ that encompasses the non-revoked ID $b \in (a, c)$. One can also prove revocation of $a$ by revealing the leaf $(a, c)$.
A depiction of the sorted-leaves-like approach from Kocher’s original paper^{6}. The set of elements being authenticated here (i.e., revoked certificates) is $S = \{5, 12, 13, 15, 20, 50, 99\}$.
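For the revoked set $S = \{5, 12, 13, 15, 20, 50, 99\}$ above, the leaf $(5, 12)$ certifies both that $5$ is revoked and that any $b$ with $5 < b < 12$ is not. The per-leaf checks are trivial to state; a sketch with hypothetical names:

```python
def proves_non_revoked(leaf_pair, b):
    """A revealed leaf (a, c) proves certificate b is NOT revoked iff a < b < c."""
    a, c = leaf_pair
    return a < b < c

def proves_revoked(leaf_pair, b):
    """A revealed leaf (a, c) proves certificate a IS revoked iff b == a."""
    a, _ = leaf_pair
    return b == a
```

(Of course, the leaf must also come with a valid Merkle path, and the same mis-ordering caveat applies.)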
Of course, Kocher’s approach is vulnerable to the same mis-ordering attack we discussed above. (Furthermore, it also suffers from inefficiency of updates.)
Indeed, Buldas et al.^{7} point out the mis-ordering attack and solve the problem by Merkleizing a binary search tree (BST) instead, which they baptize an authenticated search tree. However, as far as I could tell, the paper does not describe how to efficiently update such authenticated search trees while keeping them balanced (i.e., solve problem 2).
A depiction of the authenticated (binary) search tree approach from Buldas et al.’s original paper^{7}. The set of elements being authenticated here is $S = \{10, 12, 30, 40, 42, 56, 70, 80\}$.
Fortunately, a few years earlier, Naor and Nissim^{8} had proposed an authenticated 2-3 tree construction which did solve the problem of efficient updates, addressing all problems highlighted in this post. Surprisingly, Naor and Nissim did not point out the mis-ordering attack on Kocher’s work, only the inefficiency of updating it. Also surprisingly, there are no pictures of trees in their paper :(
I still find Merkle tries much easier to implement, but I never tried implementing a 2-3 tree.
Hopefully, this post gave you enough context on the problems of this popular sorted-leaves Merkle tree construction.
This leaves me wondering: are there any advantages to sorted-leaves Merkle trees?
The only advantage I see is that MHTs with sorted leaves are easy to describe: just sort the leaves, Merkleize them and prove non-membership of an element $e$ by revealing the two paths to the adjacent leaves that exclude $e$.
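To make the construction concrete, here is a minimal sketch in Python (all helper names are mine, SHA-256 as the hash, and the leaf count padded to a power of two with an extra element 104). A non-membership proof for $e$ reveals two adjacent leaves $a < e < b$ together with their Merkle paths; note that a verifier must also check adjacency and ordering, or the mis-ordering attack discussed above applies.

```python
# Sketch of a sorted-leaves Merkle tree with adjacency-based non-membership
# proofs. Toy code: n must be a power of two; helper names are hypothetical.
import hashlib

def h(b):
    return hashlib.sha256(b).digest()

def leaf(x):
    return h(b"leaf" + x.to_bytes(8, "big"))

def build_tree(leaves):
    """Return all levels of the tree, bottom-up."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def path(levels, idx):
    """Sibling hashes from leaf idx up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[idx ^ 1])
        idx //= 2
    return proof

def verify_path(root, leaf_hash, idx, proof):
    cur = leaf_hash
    for sib in proof:
        cur = h(cur + sib) if idx % 2 == 0 else h(sib + cur)
        idx //= 2
    return cur == root

S = sorted([5, 12, 13, 15, 20, 50, 99, 104])   # sorted set (padded to 8 leaves)
levels = build_tree([leaf(x) for x in S])
root = levels[-1][0]

# Non-membership of e = 17: reveal the adjacent leaves S[3] = 15 and S[4] = 20.
e, i = 17, 3
ok = (S[i] < e < S[i + 1]                       # e falls strictly between them
      and verify_path(root, leaf(S[i]), i, path(levels, i))
      and verify_path(root, leaf(S[i + 1]), i + 1, path(levels, i + 1)))
assert ok
```

The adjacency check above is exactly what is only sound if the verifier can trust that the leaves were sorted in the first place.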
However, just because they are easy to describe does not mean they are easy to understand.
At least, from the questions and answers I see online, and from conversations with researchers and other engineers, their security caveats are not well understood.
First, my own answer on StackExchange makes an unfortunate use of the “sorted Merkle tree” terminology to refer to either a binary search tree^{9}, a trie, or a Sparse Merkle tree (SMT), which actually all have strong (non)membership soundness. Even worse, tries and SMTs are not really sorted, since data is typically hashed before being mapped into the trie.
Another StackExchange answer seems to perpetuate the myth that all you need for non-membership security is to sort the leaves, without paying attention to the weak (non)membership soundness guarantees of such a construction.
The answer quotes this post, where a sorted-leaves Merkle tree solution is described to solve a non-membership problem like the one in the intro. Unfortunately, the answer discards the nuance of the quoted post: there, the original author realized that the leaves could be incorrectly sorted and so resorted to fraud proofs to catch such misbehaviour; i.e., if someone detects a mis-ordered tree, they can easily prove it with two Merkle paths to the out-of-order leaves.
Yet a much easier and cheaper solution would have been to use an authenticated set with strong (non)membership soundness as defined above (e.g., a Merkle trie). This would have simplified the higher-level protocol, since it would have removed the need for fraud proofs, which are clearly less desirable when one can have provably-correct behavior all the time.
Oh well, we live and learn. Don’t sort your Merkle tree’s leaves, okay? Use a Merkle trie.
And, if you somehow find a reason to sort your leaves, please let me know what were the advantages of doing it. Don’t forget to compare to more secure solutions such as Merkle tries, which have strong (non)membership soundness.
Cryptography for Efficiency: New Directions in Authenticated Data Structures, by Charalampos Papamanthou, 2011, [URL] ↩
Multi-layer hashmaps for state storage, by Dankrad Feist, 2020, [URL] ↩
Static-to-dynamic transformations, by Jeff Erickson, 2015, [URL] ↩
CONIKS: Bringing Key Transparency to End Users, by Marcela S. Melara and Aaron Blankstein and Joseph Bonneau and Edward W. Felten and Michael J. Freedman, in {24th USENIX Security Symposium (USENIX Security 15)}, 2015, [URL] ↩
Shoutout to Alnoki, the cofounder of Econia Labs who brought crit-bit trees to my attention. ↩
On certificate revocation and validation, by Kocher, Paul C., in Financial Cryptography, 1998 ↩
Accountable certificate management using undeniable attestations, by Ahto Buldas and Peeter Laud and Helger Lipmaa, in ACM CCS’00, 2000, [URL] ↩ ↩^{2} ↩^{3}
Certificate Revocation and Certificate Update, by Moni Naor and Kobbi Nissim, in 7th USENIX Security Symposium (USENIX Security 98), 1998, [URL] ↩
Note that a binary-search tree (BST) is a tree where all left descendants of a node are smaller than that node & all right descendants of a node are greater than that node. Importantly, trees with sorted leaves are not conceptualized as binary search trees, since their data is stored in the leaves, not in the internal nodes. ↩
For more details, see this post on Decentralized Thoughts. ↩
$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} $$
Short Randomizable Signatures, by Pointcheval, David and Sanders, Olivier, in CT-RSA 2016, 2016 ↩
tl;dr: Pairings, or bilinear maps, are a very powerful mathematical tool for cryptography. Pairings gave us our most succinct zero-knowledge proofs^{1}$^,$^{2}$^,$^{3}, our most efficient threshold signatures^{4}, our first usable identity-based encryption (IBE)^{5} scheme, and many other efficient cryptosystems^{6}. In this post, I’ll teach you a little about the properties of pairings, their cryptographic applications and their fascinating history. In fact, by the end of this post, some of you might want to spend a year or two in jail.
Twitter correction: The original tweet announcing this blog post stated that “SNARKs would not be possible without [pairings]”, with the highlighted S meant to emphasize the “succinctness” of such SNARKs. However, thanks to several folks on Twitter, I realized this is not exactly true and depends on what one means by “succinct.” Specifically, “succinct” SNARKs, in the polylogarithmic proof size sense defined by Gentry and Wichs^{7}, exist from a plethora of assumptions, including discrete log^{8} or random oracles^{9}. Furthermore, “succinct” SNARKs, in the sense of $O(1)$ group elements proof size, exist from RSA assumptions too^{10}. What pairings do give us, currently, are SNARKs with the smallest, concrete proof sizes (i.e., in # of bytes).
$$ \def\idt{\mathbb{1}_{\Gr_T}} \def\msk{\mathsf{msk}} \def\dsk{\mathsf{dsk}} \def\mpk{\mathsf{mpk}} $$
A pairing, also known as a bilinear map, is a function $e : \Gr_1 \times \Gr_2 \rightarrow \Gr_T$ between three groups $\Gr_1, \Gr_2$ and $\Gr_T$ of prime order $p$, with generators $g_1, g_2$ and $g_T$, respectively (i.e., $\Gr_1 = \langle g_1 \rangle$, $\Gr_2 = \langle g_2 \rangle$ and $\Gr_T = \langle g_T \rangle$).
When $\Gr_1 = \Gr_2$, the pairing is called symmetric. Otherwise, it is asymmetric.
Most importantly, a pairing has two useful properties for cryptography: bilinearity and non-degeneracy.
Bilinearity requires that, for all $u\in\Gr_1$, $v\in\Gr_2$, and $a,b\in\Zp$:
\[e(u^a, v^b) = e(u, v)^{ab}\]
For cryptographic purposes, this is the coolest property. For example, this is what enables useful applications like tripartite Diffie-Hellman.
Non-degeneracy requires that:
\[e(g_1, g_2) \ne \idt\]
Why this property? Because, without it, one could trivially define a degenerate bilinear map that returns $\idt$ on every input: such a map would satisfy bilinearity, yet would be completely useless.
Efficiency requires that there exists a polynomial-time algorithm in the size of a group element (i.e., in $\lambda = \log_2{p}$) that can be used to evaluate the pairing $e$ on any inputs.
Why require efficiency? Without it, one can define a pairing that satisfies both bilinearity and non-degeneracy but takes exponential time to evaluate. For example, let $r$ be a random element in $\Gr_T$.
First, the pairing is defined so that $e(g_1, g_2) = r$.
This way, the pairing satisfies non-degeneracy.
Second, given $(u,v)\in \Gr_1 \times \Gr_2$, an algorithm could spend exponential time $O(2^\lambda)$ to compute the discrete logs $x = \log_{g_1}{(u)}$ and $y = \log_{g_2}{(v)}$ and return $e(u, v) = e(g_1^x, g_2^y) = r^{xy}$.
This way, the pairing satisfies bilinearity because:
\begin{align}
e(u^a, v^b)
&= e\left((g_1^x)^a, (g_2^y)^b\right)\\
&= e\left(g_1^{(ax)}, g_2^{(by)}\right)\\
&= r^{(ax)\cdot (by)}\\
&= \left(r^{xy}\right)^{ab}\\
&= e(u, v)^{ab}
\end{align}
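This exponential-time construction is small enough to run. Below is a sketch with toy parameters I picked for illustration: a symmetric setting where $\Gr_1 = \Gr_2 = \Gr_T$ is the order-11 subgroup of $\mathbb{Z}_{23}^*$ generated by $g = 2$, and $r = g^2 = 4$.

```python
# Toy "pairing" from the text: e(u, v) = r^(xy), where x = log_g(u) and
# y = log_g(v) are computed by brute force. The brute-force discrete logs
# take exponential time, which is exactly why this degenerate construction
# fails the efficiency requirement.
import random

q = 23   # field modulus
p = 11   # group order (p divides q - 1)
g = 2    # generator of the order-11 subgroup of Z_23^*
r = 4    # a random non-identity element of G_T (here, r = g^2)

def dlog(h):
    """Exponential-time discrete log: find x with g^x = h (mod q)."""
    acc = 1
    for x in range(p):
        if acc == h:
            return x
        acc = acc * g % q
    raise ValueError("not in subgroup")

def e(u, v):
    return pow(r, dlog(u) * dlog(v), q)

# Non-degeneracy: e(g, g) = r != 1.
assert e(g, g) == r != 1

# Bilinearity: e(u^a, v^b) = e(u, v)^(a*b).
u, v = pow(g, 3, q), pow(g, 7, q)
a, b = random.randrange(1, p), random.randrange(1, p)
assert e(pow(u, a, q), pow(v, b, q)) == pow(e(u, v), a * b, q)
```

The same toy pairing can be used to sanity-check the tricks and protocols that follow, even though it is useless as real cryptography.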
This is my limited historical understanding of pairings, mostly informed by Dan Boneh’s account in this video and by my own research into the relevant literature. Please email me if you are aware of more history and I can try to incorporate it.
The history of (cryptographic) pairings begins with a mathematician named André Weil^{11} who, during World War II, is sent to jail for refusing to serve in the French army. There, Weil, “managed to convince the liberal-minded prison director to put [him] in an individual cell where [he] was allowed to keep [..] a pen, ink, and paper.”
Weil used his newly-acquired tools to define a pairing across two elliptic curve groups. However, what was very odd at that time was that Weil put in a lot of effort to make sure his definition of a pairing was computable. And this extra effort is what made today’s pairing-based cryptography possible^{12}.
Funnily, Weil’s time in jail was so productive that he began wondering if he should spend a few months there every year. Even better, Weil contemplated if he should recommend to the relevant authorities that every mathematician spend some amount of time in jail. Weil writes:
I’m beginning to think that nothing is more conducive to the abstract sciences than prison.
[…]
My mathematics work is proceeding beyond my wildest hopes, and I am even a bit worried - if it’s only in prison that I work so well, will I have to arrange to spend two or three months locked up every year?
In the meantime, I am contemplating writing a report to the proper authorities, as follows: “To the Director of Scientific Research: Having recently been in a position to discover through personal experience the considerable advantages afforded to pure and disinterested research by a stay in the establishments of the Penitentiary System, I take the liberty of, etc. etc.”
You can read all of this and more in his fascinating autobiography, written from his perspective as a mathematician^{13}.
Weil’s work was the foundation. But three more developments were needed for pairing-based cryptography to rise.
In 1985, Victor Miller writes up a manuscript showing that Weil’s pairing, which actually involves evaluating a polynomial of exponential degree, can in fact be computed efficiently in polynomial time^{14}.
In December 1984, Miller gave a talk at IBM about elliptic curve cryptography where he claimed that elliptic curve discrete logs were more difficult to compute than ordinary discrete logs over finite fields^{15}. Miller was challenged by Manuel Blum, who was in the audience, to back up this claim by giving a reduction: i.e., showing that an algorithm $B$ for solving discrete log on elliptic curves can be efficiently turned into another algorithm $A$ for solving discrete logs in finite fields. Such a reduction would imply the problem solved by $B$ (i.e., computing elliptic curve discrete logs) is at least as hard as, if not harder than, $A$’s problem (i.e., computing finite field discrete logs).
Miller set out to find a reduction by thinking about the only thing that related an elliptic curve group and a finite field: the Weil pairing. Funnily, what he quickly realized was that, although the Weil pairing gives a reduction, it’s in the opposite direction: i.e., it turned out an algorithm $A$ for discrete log in finite fields could be efficiently turned into an algorithm $B$ for discrete logs in elliptic curves with the help of the Weil pairing. This “unwanted” reduction is easy to see. Since $e(g^a, g) = e(g,g)^a$, solving discrete log on the elliptic curve element $g^a\in \Gr$ is just a matter of solving it on $e(g,g)^a\in \Gr_T$, which is actually a multiplicative subgroup of a finite field (see the inner details of pairings).
This almost showed the opposite of what Miller sought to prove, potentially destroying elliptic curve cryptography, but fortunately the degree of the extension field that the Weil pairing mapped into was too large, making this “unwanted” reduction inefficient and thus not a reduction after all.
This whole affair got Miller interested in seeing if the Weil pairing could be computed efficiently, which led to the discovery of his algorithm. Interestingly, he submitted this manuscript to FOCS, a top theoretical computer science conference, but the paper got rejected and would not be published until much later in the Journal of Cryptology (according to Miller)^{16}.
In 1991, Menezes, Vanstone and Okamoto^{17} leverage Miller’s efficient algorithm for evaluating the Weil pairing to break the discrete logarithm assumption on certain elliptic curves in sub-exponential time. This was quite amazing since, at that time, no sub-exponential time algorithms were known for elliptic curves.
Their attack, called the MOV attack, mapped an elliptic curve discrete logarithm challenge $g^a\in \Gr$ into a target group as $e(g^a, g)=e(g,g)^a \in \Gr_T$ using the pairing. Since the target group was a subgroup of a finite field $\mathbb{F}_{q^k}$, this allowed the use of faster, sub-exponential time algorithms for computing the discrete log on $e(g,g)^a$.
So far, pairings only seemed useful for cryptanalysis. No one knew how to use them for building (instead of breaking) cryptography.
This changed in 2000, when Joux^{18} used pairings to implement a 1-round key-exchange protocol between three parties, or tripartite Diffie-Hellman. Previously, such 1-round protocols were only known between two parties while three parties required 2 rounds.
From there, an abundance of new, efficient cryptography started pouring in:
An interesting pattern to notice here is how pairings evolved from a cryptanalytic tool used to break cryptosystems, to a constructive tool used to build cryptosystems. Interestingly, the same pattern also arose in the development of lattice-based cryptography.
There are a few tricks cryptographers often use when dealing with pairings in their proofs of correctness or security of a cryptosystem.
The most obvious trick, “multiplying in the exponent”, comes from the bilinearity property.
\begin{align} e(u^a, v^b) = e(u, v)^{ab} \end{align}
Bilinearity also implies the following trick: \begin{align} e(u, v^b) = e(u, v)^b \end{align} Or, alternatively: \begin{align} e(u^a, v) = e(u, v)^a \end{align}
Another trick, which is just an analogous way of defining bilinearity, is: \begin{align} e(u, v\cdot w) = e(u, v)\cdot e(u, w) \end{align}
Why does this work? Let $y,z$ denote the discrete logs (w.r.t. $g_2$) of $v$ and $w$, respectively.
Then, we have:
\begin{align}
e(u, v\cdot w)
&= e(u, g_2^y \cdot g_2^z)\\
&= e(u, g_2^{y + z})\\
&= e(u, g_2)^{y + z}\\
&= e(u, g_2)^y \cdot e(u, g_2)^z\\
&= e(u, g_2^y) \cdot e(u, g_2^z)\\
&= e(u, v)\cdot e(u, w)
\end{align}
Or, alternatively: \begin{align} e(u, v / w) = \frac{e(u, v)}{e(u, w)} \end{align}
This protocol was introduced by Joux in 2000^{18} and assumes a symmetric pairing: i.e., where \(\Gr_1 = \Gr_2 = \langle g\rangle \stackrel{\mathsf{def}}{=} \Gr\).
We have three parties Alice, Bob and Charles with secret keys $a, b$, and $c$ (respectively). They send each other their public keys $g^a, g^b, g^c$ and agree on a shared secret key $k = e(g, g)^{abc}$.^{20}
How?
Consider Alice’s point of view. She gets $g^b$ and $g^c$ from Bob and Charles. First, she can use her secret $a$ to compute $g^{ab}$. Second, she can use the pairing to compute $e(g^{ab}, g^c) = e(g, g)^{abc} = k$.
By symmetry, all other players can do the same and agree on the same $k$.
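The three symmetric computations can be checked with the exponential-time toy pairing from the efficiency discussion earlier (toy parameters of my choosing: the order-11 subgroup of $\mathbb{Z}_{23}^*$; a real deployment would use a pairing-friendly elliptic curve).

```python
# Toy run of Joux's one-round tripartite Diffie-Hellman, over the
# brute-force "discrete-log pairing" e(g^x, g^y) = r^(x*y).
import random

q, p, g, r = 23, 11, 2, 4   # <g> has prime order p inside Z_q^*

def dlog(h):
    """Brute-force discrete log base g (exponential time; toy only)."""
    acc = 1
    for x in range(p):
        if acc == h:
            return x
        acc = acc * g % q
    raise ValueError("not in subgroup")

def e(u, v):
    return pow(r, dlog(u) * dlog(v), q)

# Secret keys and broadcast public keys of Alice, Bob and Charles.
a, b, c = (random.randrange(1, p) for _ in range(3))
A, B, C = pow(g, a, q), pow(g, b, q), pow(g, c, q)

# Alice computes g^{ab} from Bob's public key, then pairs it with Charles'.
k_alice = e(pow(B, a, q), C)
# By symmetry, so do Bob and Charles.
k_bob = e(pow(C, b, q), A)
k_charles = e(pow(A, c, q), B)

assert k_alice == k_bob == k_charles == pow(e(g, g), a * b * c, q)
```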
The protocol can be generalized to asymmetric pairings too, where $\Gr_1 \neq \Gr_2$.
Boneh, Lynn and Shacham give a very short signature scheme from pairings^{4}, which works as follows:
Notice that correctly-computed signatures will always verify since:
\begin{align}
e(\sigma, g_2) \stackrel{?}{=} e(H(m), \pk) \Leftrightarrow\\
e(H(m)^s, g_2) \stackrel{?}{=} e(H(m), g_2^s) \Leftrightarrow\\
e(H(m), g_2)^s \stackrel{?}{=} e(H(m), g_2)^s \Leftrightarrow\\
e(H(m), g_2) = e(H(m), g_2)
\end{align}
See the BLS paper^{4} for how to prove that no attacker can forge BLS signatures given access to $\pk$ and a signing oracle.
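The verification equation above can also be exercised with the toy discrete-log pairing from earlier. One big caveat: below, I hash a message to a *known* exponent of $g$, which makes forgeries trivial and is therefore insecure; real BLS hashes directly onto the curve. This sketch only demonstrates that correctly-computed signatures verify.

```python
# Toy check of the BLS equation e(sigma, g2) = e(H(m), pk), using a
# symmetric brute-force pairing over the order-11 subgroup of Z_23^*.
# WARNING: H below hashes to a known exponent -- insecure, illustration only.
import hashlib, random

q, p, g, r = 23, 11, 2, 4

def dlog(h):
    acc = 1
    for x in range(p):
        if acc == h:
            return x
        acc = acc * g % q
    raise ValueError("not in subgroup")

def e(u, v):
    return pow(r, dlog(u) * dlog(v), q)

def H(m):  # INSECURE toy hash into <g>; good only for checking the equation
    return pow(g, 1 + int.from_bytes(hashlib.sha256(m).digest(), "big") % (p - 1), q)

s = random.randrange(1, p)   # secret key
pk = pow(g, s, q)            # public key g^s

def sign(m):
    return pow(H(m), s, q)   # sigma = H(m)^s

def verify(m, sig):
    return e(sig, g) == e(H(m), pk)

assert verify(b"hello", sign(b"hello"))
# A tampered signature (multiplied by g) shifts the exponent and fails.
assert not verify(b"hello", sign(b"hello") * g % q)
```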
BLS signatures are quite amazing:
If you find yourself confused between the various notions of multi-signatures, aggregate signatures and threshold signatures, see my slides.
In an IBE scheme, one can encrypt directly to a user-friendly email address (or a phone number), instead of a cumbersome public key which is difficult to remember or type-in correctly.
Boneh and Franklin give a very efficient IBE scheme from pairings^{5}.
For IBE to work, a trusted third-party (TTP) called a private key generator (PKG) must be introduced, who will issue secret keys to users based on their email addresses. This PKG has a master secret key (MSK) $\msk = s \in \Zp$ with an associated master public key (MPK) $\mpk = g_2^s$, where $\langle g_2 \rangle = \Gr_2$.
The $\mpk$ is made public and can be used to encrypt a message to any user given their email address. Crucially, the PKG must keep the $\msk$ secret. Otherwise, an attacker who steals it can derive any user’s secret key and decrypt everyone’s messages.
As you can tell, the PKG is a central point of failure: theft of the $\msk$ compromises everyone’s secrecy. To mitigate this, the PKG can be decentralized into multiple authorities such that a threshold number of authorities must be compromised in order to steal the $\msk$.
Let $H_1 : \{0,1\}^* \rightarrow \Gr_1^*$ and $H_T : \Gr_T \rightarrow \{0,1\}^n$ be two hash functions modeled as random oracles.
To encrypt an $n$-bit message $m$ to a user with email address $id$, one computes:
\begin{align}
g_{id} &= e(H_1(id), \mpk) \in \Gr_T\\
r &\randget \Zp\\
\label{eq:ibe-ctxt}
c &= \left(g_2^r, m \xor H_T\left(\left(g_{id}\right)^r\right)\right) \in \Gr_2\times \{0,1\}^n
\end{align}
To decrypt, the user with email address $id$ must first obtain their decryption secret key $\dsk_{id}$ from the PKG. For this, we assume the PKG has a way of authenticating the user before handing them their secret key. For example, this could be done via email.
The PKG computes the user’s decryption secret key as: \begin{align} \dsk_{id} = H_1(id)^s \in \Gr_1 \end{align}
Now that the user has their decryption secret key, they can decrypt the ciphertext $c = (u, v)$ from Equation $\ref{eq:ibe-ctxt}$ as: \begin{align} m &= v \xor H_T(e(\dsk_{id}, u)) \end{align}
You can see why correctly-encrypted ciphertexts will decrypt successfully, since:
\begin{align}
v \xor H_T(e(\dsk_{id}, u))
&= \left(m \xor H_T\left((g_{id})^r \right)\right) \xor H_T\left(e(H_1(id)^s, g_2^r) \right)\\
&= \left(m \xor H_T\left(e(H_1(id), \mpk )^r \right)\right) \xor H_T\left(e(H_1(id), g_2 )^{rs}\right)\\
&= m \xor \left(H_T\left(e(H_1(id), g_2^s)^r \right) \xor H_T\left(e(H_1(id), g_2 )^{rs}\right)\right)\\
&= m \xor \left(H_T\left(e(H_1(id), g_2 )^{rs}\right) \xor H_T\left(e(H_1(id), g_2 )^{rs}\right)\right)\\
&= m
\end{align}
To see why this scheme is secure under chosen-plaintext attacks, refer to the original paper^{5}.
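The full encrypt/keygen/decrypt cycle can also be traced end-to-end with the toy discrete-log pairing (symmetric, order-11 subgroup of $\mathbb{Z}_{23}^*$; all parameters and helper names are my own toy choices). As with the BLS sketch, $H_1$ hashes to a known exponent, which is insecure and useful only for checking correctness.

```python
# Toy run of Boneh-Franklin-style IBE over the brute-force toy pairing.
# WARNING: H1 hashes to a known exponent -- insecure, illustration only.
import hashlib, random

q, p, g, r = 23, 11, 2, 4

def dlog(h):
    acc = 1
    for x in range(p):
        if acc == h:
            return x
        acc = acc * g % q
    raise ValueError("not in subgroup")

def e(u, v):
    return pow(r, dlog(u) * dlog(v), q)

def H1(ident):   # toy hash into the group (insecure: exponent is known)
    return pow(g, 1 + int.from_bytes(hashlib.sha256(ident).digest(), "big") % (p - 1), q)

def HT(gt, n):   # hash a G_T element to an n-byte mask
    return hashlib.sha256(str(gt).encode()).digest()[:n]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

s = random.randrange(1, p)   # the PKG's master secret key
mpk = pow(g, s, q)           # master public key g^s

def encrypt(ident, m):
    g_id = e(H1(ident), mpk)
    rr = random.randrange(1, p)
    return pow(g, rr, q), xor(m, HT(pow(g_id, rr, q), len(m)))

def keygen(ident):           # run by the PKG for an authenticated user
    return pow(H1(ident), s, q)

def decrypt(dsk, ct):
    u, v = ct
    return xor(v, HT(e(dsk, u), len(v)))

m = b"hi"
ct = encrypt(b"alice@example.com", m)
assert decrypt(keygen(b"alice@example.com"), ct) == m
```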
Mostly, I have no idea. How come? Well, I never really needed to know. And that’s the beauty of pairings: one can use them in a black-box fashion, with zero awareness of their internals.
Still, let’s take a small peek inside this black box. Let us consider the popular pairing-friendly BLS12-381 curve^{22}, from the family of BLS curves characterized by Barreto, Lynn and Scott^{23}.
Public service announcement: Some of you might’ve heard about Boneh-Lynn-Shacham (BLS) signatures. Please know that this is a different BLS than the BLS in Barreto-Lynn-Scott curves. Confusingly, both acronyms do share one author, Ben Lynn. (In case this was not confusing enough, wait until you have to work with BLS signatures over BLS12-381 curves.)
For BLS12-381, the three groups $\Gr_1, \Gr_2, \Gr_T$ involved are:
How does the pairing map across these three groups work? Well, the pairing $e(\cdot,\cdot)$ expands to something like: \begin{align} \label{eq:pairing-def} e(u, v) = f_{p, u}(v)^{(q^k - 1)/p} \end{align} It’s useful to know that computing a pairing consists of two steps:
For more on the internals, see other resources^{24}$^,$^{25}$^,$^{26}.
This section discusses various implementation-level details that practitioners can leverage to speed up their implementations.
The pairing over BLS12-381 is asymmetric: i.e., $\Gr_1 \ne \Gr_2$ are two different groups (of the same order $p$). However, there also exist symmetric pairings where $\Gr_1 = \Gr_2$ are the same group.
Unfortunately, “such symmetric pairings only exist on supersingular curves, which places a heavy restriction on either or both of the underlying efficiency and security of the protocol”^{27}. In other words, such supersingular curves are not as efficient at the same security level as the curves used in asymmetric pairings.
Therefore, practitioners today, as far as I am aware, exclusively rely on asymmetric pairings due to their higher efficiency when the security level is kept the same.
I will give a few key performance numbers for the BLS12-381 curve implemented in Filecoin’s blstrs Rust wrapper around the popular blst library.
These microbenchmarks were run on a 10-core 2021 Apple M1 Max using cargo bench.
As explained in Eq. \ref{eq:pairing-def}, a pairing involves two steps:
Therefore, a pairing takes around 486 microseconds (i.e., the sum of the two).
The $\Gr_T$ microbenchmarks were done by slightly modifying the blstrs benchmarking code here.
(See the HTML comments of this page for those modifications.)
Note: These benchmarks pick the exponentiated base randomly and do not perform any precomputation on it, which would speed up these times by $2\times$-$4\times$.
This is a well-known optimization that I’m including for completeness.
Specifically, many libraries allow you to compute a product $\prod_{1 \le i \le k} \left(g_i\right)^{x_i}$ of $k$ exponentiations much faster than individually computing the $k$ exponentiations and then multiplying the results.
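To give a flavor of where the savings come from, here is a sketch (over $\mathbb{Z}_q^*$ with a modulus I picked for concreteness) of the simplest such idea: interleave the square-and-multiply loops so that all $k$ exponentiations share a single chain of squarings. Real libraries use far more sophisticated algorithms, such as Pippenger’s bucket method.

```python
# Interleaved (Straus-style) multiexponentiation: one shared squaring per
# bit across all k exponentiations, instead of k separate squaring chains.
import random

q = (1 << 127) - 1   # a Mersenne prime, just for a concrete modulus

def multiexp(bases, exps):
    nbits = max(x.bit_length() for x in exps)
    acc = 1
    for i in reversed(range(nbits)):
        acc = acc * acc % q                  # one shared squaring per bit
        for g, x in zip(bases, exps):
            if (x >> i) & 1:
                acc = acc * g % q
    return acc

bases = [random.randrange(2, q) for _ in range(8)]
exps = [random.randrange(1, q) for _ in range(8)]

# Same result as computing the 8 exponentiations separately.
naive = 1
for g, x in zip(bases, exps):
    naive = naive * pow(g, x, q) % q
assert multiexp(bases, exps) == naive
```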
For example, blstrs seems to be incredibly fast in this regard:
When designing a pairing-based cryptographic protocol, you will want to carefully pick what to use $\Gr_1$ for and what to use $\Gr_2$ for.
For example, in BLS signatures, if you want small signatures, then you would compute the signature $\sigma = H(m)^s \in \Gr_1$ and settle for a slightly-larger public key in $\Gr_2$. On the other hand, if you wanted to minimize public key size, then you would let it be in $\Gr_1$ while taking slightly longer to compute the signature in $\Gr_2$.
Other things will also influence how you use $\Gr_1$ and $\Gr_2$, such as the existence of an isomorphism $\phi : \Gr_2 \rightarrow \Gr_1$ or the ability to hash uniformly into these groups. In fact, the existence of such an isomorphism is what distinguishes between two types of asymmetric pairings: Type 2 and Type 3 (see Galbraith et al.^{25} for more information on the different types of pairings).
When compared to an elliptic curve that does not admit pairings, pairing-friendly elliptic curves are around two times slower.
For example, the popular prime-order elliptic curve group Ristretto255 offers:
If you recall how a pairing actually works (see Eq. $\ref{eq:pairing-def}$), you’ll notice the following optimization:
Whenever we have to compute the product of $n$ pairings, we can first compute the $n$ Miller loops and then do a single final exponentiation instead of $n$.
This drastically reduces the pairing computation time in many applications.
\begin{align}
\prod_i e(u_i, v_i)
&= \prod_i \left(f_{p, u_i}(v_i)^{(q^k - 1)/p}\right)\\
&= \left(\prod_i f_{p, u_i}(v_i)\right)^{(q^k - 1)/p}
\end{align}
This blog post was supposed to be just a short summary of the three properties of pairings: bilinearity, non-degeneracy and efficiency.
Unfortunately, I felt compelled to discuss their fascinating history. And I couldn’t let you walk away without seeing a few powerful cryptographic applications of pairings.
After that, I realized practitioners who implement pairing-based cryptosystems might benefit from knowing a little about their internal workings, since some of these details can be leveraged to speed up implementations.
I would like to thank Dan Boneh for helping me clarify and contextualize the history around Weil, as well as for his 2015 Simons talk, which inspired me to do a little more research and write this historical account.
Big thanks to:
PS: Twitter threads are a pain to search through, so if I missed acknowledging your contribution, please kindly let me know.
Quadratic Span Programs and Succinct NIZKs without PCPs, by Rosario Gennaro and Craig Gentry and Bryan Parno and Mariana Raykova, in Cryptology ePrint Archive, Paper 2012/215, 2012, [URL] ↩ ↩^{2}
Pinocchio: Nearly Practical Verifiable Computation, by Bryan Parno and Craig Gentry and Jon Howell and Mariana Raykova, in Cryptology ePrint Archive, Paper 2013/279, 2013, [URL] ↩
On the Size of Pairing-Based Non-interactive Arguments, by Groth, Jens, in Advances in Cryptology – EUROCRYPT 2016, 2016 ↩
Short Signatures from the Weil Pairing, by Boneh, Dan and Lynn, Ben and Shacham, Hovav, in Advances in Cryptology — ASIACRYPT 2001, 2001 ↩ ↩^{2} ↩^{3} ↩^{4}
Identity-Based Encryption from the Weil Pairing, by Boneh, Dan and Franklin, Matt, in Advances in Cryptology — CRYPTO 2001, 2001 ↩ ↩^{2} ↩^{3} ↩^{4}
Constant-Size Commitments to Polynomials and Their Applications, by Kate, Aniket and Zaverucha, Gregory M. and Goldberg, Ian, in ASIACRYPT’10, 2010 ↩
Separating Succinct Non-Interactive Arguments From All Falsifiable Assumptions, by Craig Gentry and Daniel Wichs, in Cryptology ePrint Archive, Report 2010/610, 2010, [URL] ↩ ↩^{2}
Efficient Zero-Knowledge Arguments for Arithmetic Circuits in the Discrete Log Setting, by Jonathan Bootle and Andrea Cerulli and Pyrros Chaidos and Jens Groth and Christophe Petit, in Cryptology ePrint Archive, Report 2016/263, 2016, [URL] ↩
Computationally-Sound Proofs, by Silvio Micali, in Logic Colloquium ‘95: Proceedings of the Annual European Summer Meeting of the Association of Symbolic Logic, 1998, [URL] ↩
Subvector Commitments with Application to Succinct Arguments, by Russell W.F. Lai and Giulio Malavolta, in Cryptology ePrint Archive, Report 2018/705, 2018, [URL] ↩ ↩^{2}
André Weil — Wikipedia, The Free Encyclopedia, by Wikipedia contributors, 2022, [URL] ↩
Thanks to Dan Boneh, who contrasted Weil’s definition with a different one by Shimura from his classic book on modular forms. While Shimura’s definition makes it much easier to prove all the properties of the pairing, it defines a pairing of order $n$ as a sum of $n$ points of order $n^2$. This makes it hopelessly non-computable. Weil’s definition, on the other hand, involves an evaluation of a very concrete function – there are no exponential-sized sums – but requires much more work to prove all its pairing properties. ↩
The Apprenticeship of a Mathematician, by Weil, Andre, 1992, [URL] ↩
Short Programs for functions on Curves, by Victor S. Miller, 1986, [URL] ↩
Miller tells this story himself in a talk he gave at Microsoft Research on October 10th, 2010. ↩
I am unable to find any trace of Miller’s published work on this beyond the manuscript Boneh published in^{14}. Any pointers would be appreciated. ↩
Reducing Elliptic Curve Logarithms to Logarithms in a Finite Field, by Menezes, Alfred and Vanstone, Scott and Okamoto, Tatsuaki, in ACM STOC, 1991, [URL] ↩
A One Round Protocol for Tripartite Diffie–Hellman, by Joux, Antoine, in Algorithmic Number Theory, 2000 ↩ ↩^{2} ↩^{3}
Evaluating 2-DNF Formulas on Ciphertexts, by Boneh, Dan and Goh, Eu-Jin and Nissim, Kobbi, in Theory of Cryptography, 2005 ↩
Typically, there will be some key-derivation function $\mathsf{KDF}$ used to derive the key as $k = \mathsf{KDF}(e(g,g)^{abc})$. ↩
Towards Scalable Threshold Cryptosystems, by Alin Tomescu and Robert Chen and Yiming Zheng and Ittai Abraham and Benny Pinkas and Guy Golan Gueta and Srinivas Devadas, in IEEE S\&P’20, 2020 ↩
BLS12-381 For The Rest Of Us, by Ben Edgington, 2022, [URL] ↩
Constructing Elliptic Curves with Prescribed Embedding Degrees, by Paulo S. L. M. Barreto and Ben Lynn and Michael Scott, in Cryptology ePrint Archive, Paper 2002/088, 2002, [URL] ↩
Pairings for cryptographers, by Steven D. Galbraith and Kenneth G. Paterson and Nigel P. Smart, in Discrete Applied Mathematics, 2008, [URL] ↩ ↩^{2}
An Introduction to Pairing-Based Cryptography, by Alfred Menezes, 2005, [URL] ↩
Subgroup security in pairing-based cryptography, by Paulo S. L. M. Barreto and Craig Costello and Rafael Misoczki and Michael Naehrig and Geovandro C. C. F. Pereira and Gustavo Zanon, in Cryptology ePrint Archive, Paper 2015/247, 2015, [URL] ↩
Pairing-Friendly Elliptic Curves of Prime Order, by Barreto, Paulo S. L. M. and Naehrig, Michael, in Selected Areas in Cryptography, 2006 ↩
Efficient Identity-Based Encryption over NTRU Lattices, by Léo Ducas and Vadim Lyubashevsky and Thomas Prest, in Cryptology ePrint Archive, Paper 2014/794, 2014, [URL] ↩
$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} $$
Practically, the only (somewhat-fast) accumulators without trusted setup (and constant-sized proofs) are RSA accumulators^{1} instantiated with great care^{2} over class groups^{3}.
Theoretically 😄, if you relax your definition of “accumulators” by:
…then, naturally you can use a Merkle prefix tree (a.k.a., a Merkle trie) to represent a set and obtain an accumulator.
Another approach is to either use (1) a binary search tree or (2) a tree with sorted leaves, where each internal node stores the minimum and the maximum element in its subtree^{4}.
Similarly, you can also use the rather beautiful Utreexo construction^{5}, which is also Merkle-based but does not support non-membership proofs.
Even more theoretically 😆, assuming you don’t care about performance at all, you might use a lattice-based accumulator^{6}$^,$^{7}$^,$^{8}$^,$^{9}. Some of them do not need a trusted setup, such as the construction of Papamanthou et al.^{6}.
Even better, the recent lattice-based vector commitments by Peikert et al.^{10} can be turned into an accumulator. (Interestingly, I think accumulator proof sizes here could be made “almost” $O(\log_k{n})$-sized, for arbitrary $k$, if one used their lattice-based Verkle construction which, AFAICT, requires a trusted setup.)
One last theoretical idea is to generate an RSA group with a modulus $N$ of unknown factorization using the “RSA UFO” technique by Sander^{11}. Unfortunately, such $N$ are too large and kill performance. Specifically, instead of the typical 2048 or 4096 bits, RSA UFO $N$’s are hundreds of thousands of bits (or larger?). Improving this would be a great avenue for future work.
One-Way Accumulators: A Decentralized Alternative to Digital Signatures, by Benaloh, Josh and de Mare, Michael, in EUROCRYPT ‘93, 1994 ↩ ↩^{2}
A note on the low order assumption in class group of an imaginary quadratic number fields, by Karim Belabas and Thorsten Kleinjung and Antonio Sanso and Benjamin Wesolowski, in Cryptology ePrint Archive, Report 2020/1310, 2020, [URL] ↩
Secure Accumulators from Euclidean Rings without Trusted Setup, by Lipmaa, Helger, in Applied Cryptography and Network Security, 2012 ↩
Accountable certificate management using undeniable attestations, by Ahto Buldas and Peeter Laud and Helger Lipmaa, in ACM CCS’00, 2000, [URL] ↩
Utreexo: A dynamic hash-based accumulator optimized for the Bitcoin UTXO set, by Thaddeus Dryja, 2019, [URL] ↩
Streaming Authenticated Data Structures, by Papamanthou, Charalampos and Shi, Elaine and Tamassia, Roberto and Yi, Ke, in EUROCRYPT 2013, 2013 ↩ ↩^{2}
Compact Accumulator using Lattices, by Mahabir Prasad Jhanwar and Reihaneh Safavi-Naini, in Cryptology ePrint Archive, Report 2014/1015, 2014, [URL] ↩
Zero-Knowledge Arguments for Lattice-Based Accumulators: Logarithmic-Size Ring Signatures and Group Signatures Without Trapdoors, by Benoît Libert and San Ling and Khoa Nguyen and Huaxiong Wang, in EUROCRYPT (2), 2016, [URL] ↩
Lattice-Based Group Signatures: Achieving Full Dynamicity with Ease, by San Ling and Khoa Nguyen and Huaxiong Wang and Yanhong Xu, in Cryptology ePrint Archive, Report 2017/353, 2017, [URL] ↩
Vector and Functional Commitments from Lattices, by Chris Peikert and Zachary Pepin and Chad Sharp, in Cryptology ePrint Archive, Report 2021/1254, 2021, [URL] ↩
Efficient Accumulators without Trapdoor Extended Abstract, by Sander, Tomas, in Information and Communication Security, 1999 ↩
Recall that a degree-$d$ polynomial $\phi(X)$ can be represented as its vector of coefficients: \begin{align} \phi &= [\phi_0, \phi_1, \phi_2, \dots, \phi_d] \end{align}
Given $n$ pairs $(x_i, y_i)_{i\in[n]}$, one can compute or interpolate a degree $\le n-1$ polynomial $\phi(X)$ such that: \(\phi(x_i)=y_i,\forall i\in[n]\)
Specifically, the Lagrange interpolation formula says that: \begin{align} \label{eq:lagrange-formula} \phi(X) &= \sum_{i\in[n]} y_i \cdot \lagr_i(X),\ \text{where}\ \lagr_i(X) = \prod_{j\in[n],j\ne i} \frac{X-x_j}{x_i-x_j} \end{align}
This formula is intimidating at first, but there’s a very simple intuition behind it. The key idea is that $\lagr_i(X)$ is defined so that it has two properties:
You can actually convince yourself that $\lagr_i(X)$ has these properties by plugging in $x_i$ and $x_j$ to see what happens.
Important: The $\lagr_i(X)$ polynomials depend only on the set of $x_i$’s (and thus on $n$), not on the $y_i$’s! Specifically, each $\lagr_i(X)$ has degree $n-1$ and has a root at each $x_j$ with $j\ne i$! In this sense, a better notation for them would be $\lagr_i^{[x_i, n]}(X)$ or $\lagr_i^{[n]}(X)$ to indicate this dependence.
Consider the following example with $n=3$ pairs of points. Then, by the Lagrange formula, we have:
\[\phi(X) = y_1 \lagr_1(X) + y_2 \lagr_2(X) + y_3 \lagr_3(X)\]

Next, by applying the two key properties of $\lagr_i(X)$ from above, you can easily check that $\phi(x_i) = y_i,\forall i\in[3]$:
\begin{align}
\phi(x_1) &= y_1 \lagr_1(x_1) + y_2 \lagr_2(x_1) + y_3 \lagr_3(x_1) = y_1 \cdot 1 + y_2 \cdot 0 + y_3 \cdot 0 = y_1\\
\phi(x_2) &= y_1 \lagr_1(x_2) + y_2 \lagr_2(x_2) + y_3 \lagr_3(x_2) = y_1 \cdot 0 + y_2 \cdot 1 + y_3 \cdot 0 = y_2\\
\phi(x_3) &= y_1 \lagr_1(x_3) + y_2 \lagr_2(x_3) + y_3 \lagr_3(x_3) = y_1 \cdot 0 + y_2 \cdot 0 + y_3 \cdot 1 = y_3
\end{align}
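To see these two properties concretely, here is a small Python sketch (using exact rational arithmetic and made-up points) that checks $\lagr_i(x_j)$ is 1 when $i=j$ and 0 otherwise, and that the interpolated $\phi(X)$ passes through every point:

```python
from fractions import Fraction

def lagr(i, xs, X):
    """Evaluate the Lagrange basis polynomial ℓ_i at the point X."""
    num, den = Fraction(1), Fraction(1)
    for j, xj in enumerate(xs):
        if j != i:
            num *= X - xj
            den *= xs[i] - xj
    return num / den

xs, ys = [2, 5, 7], [4, 1, 6]   # three arbitrary pairs (x_i, y_i)

# Property 1: ℓ_i(x_i) = 1; Property 2: ℓ_i(x_j) = 0 for j != i
for i in range(3):
    for j in range(3):
        assert lagr(i, xs, Fraction(xs[j])) == (1 if i == j else 0)

# Hence the interpolated φ(X) = Σ y_i · ℓ_i(X) passes through every point
phi = lambda X: sum(y * lagr(i, xs, X) for i, y in enumerate(ys))
assert [phi(Fraction(x)) for x in xs] == ys
```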
An important detail is that the degree of the interpolated $\phi(X)$ is $\le n-1$ and not necessarily exactly equal to $n-1$. To see this, consider interpolating the polynomial $\phi(X)$ such that $\phi(i) = i$ for all $i\in [n]$. In other words, $x_i = y_i = i$.
The astute reader might notice that the polynomial $\phi(X) = X$ satisfies our constraints.
But is this what Lagrange interpolation will return?
After all, the interpolated $\phi(X)$ is a sum of degree-$(n-1)$ polynomials $\lagr_i(X)$, so how could it have degree 1?
Well, it turns out it can, because the higher-degree terms cancel out.
To see this, take a simple example, with $n=3$:
\begin{align}
\phi(X) &=\sum_{i\in [3]} i \cdot \lagr_i(X) = \sum_{i\in [3]} i \cdot \prod_{j\in[3]\setminus\{i\}} \frac{X - j}{i - j}\\
&= 1\cdot \frac{X-2}{1-2}\frac{X-3}{1-3} + 2\cdot \frac{X-1}{2-1}\frac{X-3}{2-3} + 3\cdot\frac{X-1}{3-1}\frac{X-2}{3-2}\\
&= \frac{X-2}{-1}\frac{X-3}{-2} + 2\cdot \frac{X-1}{1}\frac{X-3}{-1} + 3\cdot \frac{X-1}{2}\frac{X-2}{1}\\
&= \frac{1}{2}(X-2)(X-3) - 2(X-1)(X-3) + \frac{3}{2}(X-1)(X-2)\\
&= \frac{1}{2}[(X-2)(X-3) + 3(X-1)(X-2)] - 2(X-1)(X-3)\\
&= \frac{1}{2}[(X-2)(4X-6)] - 2(X-1)(X-3)\\
&= (X-2)(2X-3) - 2(X-1)(X-3)\\
&= (2X^2 - 4X - 3X + 6) - 2(X^2 - 4X +3)\\
&= (2X^2 - 7X + 6) - 2X^2 + 8X - 6\\
&= X
\end{align}
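The same cancellation can be checked mechanically. Below is a short Python sketch of naive $O(n^2)$ Lagrange interpolation over the rationals; interpolating the pairs $(i, i)$ indeed returns the degree-1 polynomial $X$:

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def interpolate(xs, ys):
    """Naive Lagrange interpolation; returns coefficients, lowest degree first."""
    n = len(xs)
    phi = [Fraction(0)] * n
    for i in range(n):
        # Build ℓ_i(X) = Π_{j≠i} (X - x_j)/(x_i - x_j) as a coefficient list
        li = [Fraction(1)]
        for j in range(n):
            if j != i:
                li = poly_mul(li, [Fraction(-xs[j], xs[i] - xs[j]),
                                   Fraction(1, xs[i] - xs[j])])
        for k in range(n):
            phi[k] += ys[i] * li[k]
    return phi

# Interpolating (1,1), (2,2), (3,3): the X^2 terms cancel, leaving φ(X) = X
assert interpolate([1, 2, 3], [1, 2, 3]) == [0, 1, 0]
```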
If done naively, interpolating $\phi(X)$ using the Lagrange formula in Equation \ref{eq:lagrange-formula} will take $O(n^2)$ time.
However, there are known techniques for computing $\phi(X)$ in $O(n\log^2{n})$ time. We described part of these techniques in a previous blog post, but for the full techniques please refer to the “Modern Computer Algebra” book^{1}.
Fast polynomial evaluation and interpolation, by von zur Gathen, Joachim and Gerhard, Jurgen, in Modern Computer Algebra, 2013 ↩
$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} $$
First, read:
Notation:
Other works that give related techniques to compute proofs fast in KZG-like polynomial commitments are:
Let $f(X)$ be a polynomial with coefficients $f_i$:
\begin{align*}
f(X) &= f_m X^m + f_{m-1} X^{m-1} + \cdots + f_1 X + f_0\\
&= \sum_{i\in[0,m]} f_i X^i
\end{align*}
Recall that a KZG evaluation proof $\pi_i$ for $f(\omega^i)$ is a KZG commitment to a quotient polynomial $Q_i(X) = \frac{f(X) - f(\omega^i)}{X-\omega^i}$: \begin{align*} \pi_i = g^{Q_i(\tau)} = g^{\frac{f(\tau) - f(\omega^i)}{\tau-\omega^i}} \end{align*}
Computing such a proof takes $O(m)$ time!
But what if we want to compute all $\pi_i, i\in[0, n)$?
If done naively, this would take $O(nm)$ time, which is too expensive!
Lower bound? Is this $O(nm)$ time complexity inherent? After all, to compute $\pi_i$ don’t we first need to compute $Q_i(X)$, which takes $O(m)$ time? As you’ll see below, the answer is “no!”
Fortunately, Feist and Khovratovich observe that the $Q_i$’s are algebraically-related and so are their $\pi_i$ KZG commitments! As a result, they observe that computing all $\pi_i$’s does not require computing all $Q_i$’s.
Below, we explain how their faster, $O(n\log{n})$-time technique works!
An important caveat is that their technique relies on the evaluation points being the powers $\omega^0,\dots,\omega^{n-1}$ of a primitive $n$th root of unity $\omega$.
To understand how the $\pi_i$’s relate to one another, let us look at the coefficients of $Q_i(X)$.
We can show that when dividing $f$ (of degree $m$) by $(X-\omega^i)$ we obtain a quotient polynomial with coefficients $t_0, t_1, \dots, t_{m-1}$ such that:
\begin{align}
\label{eq:div-coeffs-1}
t_{m-1} &= f_m\\
\label{eq:div-coeffs-2}
t_{j} &= f_{j+1} + \omega^i \cdot t_{j+1}, \forall j \in [0, m-1)
\end{align}
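This recurrence is just synthetic (Horner-style) division. The Python sketch below, with a made-up polynomial and evaluation point, computes the $t_j$’s via the recurrence and verifies the division identity $(X - u)\cdot Q(X) + f(u) = f(X)$:

```python
from fractions import Fraction

def quotient_coeffs(f, u):
    """Coefficients t of the quotient (f(X) - f(u)) / (X - u), with f given
    lowest degree first, via t_{m-1} = f_m and t_j = f_{j+1} + u * t_{j+1}."""
    m = len(f) - 1
    t = [Fraction(0)] * m
    t[m - 1] = f[m]
    for j in range(m - 2, -1, -1):
        t[j] = f[j + 1] + u * t[j + 1]
    return t

f = [Fraction(c) for c in [7, 3, 0, 5, 2]]   # f(X) = 2X^4 + 5X^3 + 3X + 7
u = Fraction(4)                               # u stands in for ω^i
t = quotient_coeffs(f, u)

# Check the division identity: the coefficients of (X - u)·Q(X) + f(u)
# must match f's coefficients exactly
f_u = sum(c * u**k for k, c in enumerate(f))
recon = [-u * t[0] + f_u] + [t[k - 1] - u * t[k] for k in range(1, len(t))] + [t[-1]]
assert recon == f
```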
Note that the $t_j$’s are a function of $f_m, f_{m-1},\dots, f_1$, but not of $f_0$!
Indeed, the quotient obtained when dividing $f(X) = f_3 X^3 + f_2 X^2 + \dots + f_0$ by $X-\omega^i$ exactly matches Equations \ref{eq:div-coeffs-1} and \ref{eq:div-coeffs-2} above:
Specifically, the quotient's coefficients, as expected, are:

\begin{align} t_2 &= \color{green}{f_3}\\ t_1 &= f_2 + \omega^i t_2\\ &= \color{blue}{f_2 + \omega^i f_3}\\ t_0 &= f_1 + \omega^i t_1 = f_1 + \omega^i \cdot (f_2 + \omega^i f_3)\\ &= \color{pink}{f_1 + \omega^i f_2 + \omega^{2i} f_3} \end{align}

Next, let us expand Equations \ref{eq:div-coeffs-1} and \ref{eq:div-coeffs-2} above and get a better sense of the relationship between KZG quotient polynomials:
\begin{align}
\color{green}{t_{m-1}} &= \underline{f_m}\\
\color{blue}{t_{m-2}} &= f_{m-1} + \omega^i \cdot \color{green}{t_{m-1}} =\nonumber\\
&= \underline{f_{m-1} + \omega^i \cdot f_m}\\
\color{red}{t_{m-3}} &= f_{m-2} + \omega^i \cdot \color{blue}{t_{m-2}}\nonumber\\
&= f_{m-2} + \omega^i \cdot (f_{m-1} + \omega^i \cdot f_m)\nonumber\\
&= \underline{f_{m-2} + \omega^i \cdot f_{m-1} + \omega^{2i} \cdot f_m}\\
t_{m-4} &= f_{m-3} + \omega^i \cdot \color{red}{t_{m-3}}\nonumber\\
&= f_{m-3} + \omega^i \cdot (f_{m-2} + \omega^i \cdot f_{m-1} + \omega^{2i} \cdot f_m)\nonumber\\
&= \underline{f_{m-3} + \omega^i \cdot f_{m-2} + \omega^{2i} \cdot f_{m-1} + \omega^{3i} \cdot f_m}\\
&\hspace{.55em}\vdots\nonumber\\
% t_{j} &= f_{j+1} + \omega^i \cdot t_{j+1}, \forall j \in [0, m-1)\\
% & \vdots\\
t_1 &= \underline{f_2 + \omega^i \cdot f_3 + \omega^{2i} \cdot f_4 + \dots + \omega^{(m-2)i} \cdot f_m}\\
t_0 &= \underline{f_1 + \omega^i \cdot f_2 + \omega^{2i} \cdot f_3 + \dots + \omega^{(m-1)i} \cdot f_m}
\end{align}
As you can see above, the quotient polynomial $Q_i(X) = \sum_{j=0}^{m-1} t_j X^j$ obtained when dividing $f(X)$ by $X-\omega^i$ is:
\begin{align}
Q_i(X) &= f_m \cdot X^{m-1} + {}\\
&+ \left(f_{m-1} + \omega^i \cdot f_m\right) \cdot X^{m-2} + {}\nonumber\\
&+ \left(f_{m-2} + \omega^i \cdot f_{m-1} + \omega^{2i} \cdot f_m\right) \cdot X^{m-3} + {}\nonumber\\
&+ \left(f_{m-3} + \omega^i \cdot f_{m-2} + \omega^{2i} \cdot f_{m-1} + \omega^{3i} \cdot f_m\right) \cdot X^{m-4} + {}\nonumber\\
&+ \dots + {}\nonumber\\
&+ \left(f_2 + \omega^i \cdot f_3 + \omega^{2i} \cdot f_4 + \dots + \omega^{(m-2)i} \cdot f_m\right) \cdot X + {}\nonumber\\
&+ \left(f_1 + \omega^i \cdot f_2 + \omega^{2i} \cdot f_3 + \dots + \omega^{(m-1)i} \cdot f_m\right)\nonumber
\end{align}
Factoring out the roots of unity, we can rearrange this as follows:
\begin{align}
\label{eq:HX}
Q_i(X) &= \left(f_m X^{m-1} + f_{m-1} X^{m-2} + \dots + f_1\right) (\omega^i)^0 + {}\\
&+ \left(f_m X^{m-2} + f_{m-1} X^{m-3} + \dots + f_2\right) (\omega^i)^1 + {}\nonumber\\
&+ \left(f_m X^{m-3} + f_{m-1} X^{m-4} + \dots + f_3\right) (\omega^i)^2 + {}\nonumber\\
&+ \dots + {}\nonumber\\
&+ \left(f_m X + f_{m-1}\right) (\omega^i)^{m-2} + {}\nonumber\\
&+ \left(f_m \right) (\omega^i)^{m-1}\nonumber
\end{align}
Naming the polynomials above $H_j(X)$, we can rewrite this as:
\begin{align}
Q_i(X) &\bydef H_1(X) (\omega^i)^0 + {}\\
&+ H_2(X) (\omega^i)^1 + {}\nonumber\\
&+ \dots + {}\nonumber\\
&+ H_m(X) (\omega^i)^{m-1}\nonumber
\end{align}
More succinctly, the quotient polynomial is:
\begin{align}
\label{eq:Qi-poly}
Q_i(X) &= \sum_{k=0}^{m-1} H_{k+1}(X) \cdot (\omega^i)^k
\end{align}
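As a sanity check of this decomposition, the following Python sketch (with a made-up polynomial, in a toy scalar field $\F_{17}$ where $\omega = 4$ has order 4) verifies that the quotient computed via the synthetic-division recurrence agrees with $\sum_{k} H_{k+1}(X)\cdot(\omega^i)^k$ at every point of the field:

```python
q, w, i = 17, 4, 3               # toy field F_17; ω = 4 has order 4 mod 17
f = [5, 1, 12, 7]                # f(X) = 7X^3 + 12X^2 + X + 5, coeffs lowest-first
m = len(f) - 1
u = pow(w, i, q)                 # the evaluation point ω^i

# Quotient coefficients via the recurrence t_j = f_{j+1} + ω^i · t_{j+1}
t = [0] * m
t[m - 1] = f[m]
for j in range(m - 2, -1, -1):
    t[j] = (f[j + 1] + u * t[j + 1]) % q

# H_{k+1}(X) has (lowest-first) coefficients [f_{k+1}, ..., f_m]; check that
# Q_i(x) = Σ_k H_{k+1}(x) · (ω^i)^k holds at every point x of the field
for x in range(q):
    Qx = sum(t[j] * pow(x, j, q) for j in range(m)) % q
    rhs = sum(pow(u, k, q) * sum(f[k + 1 + d] * pow(x, d, q) for d in range(m - k))
              for k in range(m)) % q
    assert Qx == rhs
```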
Note: At this point, it is not helpful to write down a closed form formula for $H_j(X)$, but we’ll return to it later.
Next, let: \begin{align} \label{eq:hj} h_j = g^{H_j(\tau)},\forall j\in[m] \end{align} …denote a KZG commitment to $H_j(X)$. (We are ignoring for now the actual closed-form formula for the $H_j$’s.)
Recall that \(\pi_i=g^{Q_i(\tau)}\) denotes a KZG evaluation proof for $f(\omega^i)$.
Therefore, applying Equation \ref{eq:Qi-poly} to $\pi_i$’s expression, we get: \begin{align} \label{eq:pi-dft-like} \pi_i = \prod_{j=0}^{m-1} \left(h_{j+1}\right)^{(\omega^i)^j}, \forall i\in[0,n) \end{align}
But a close look at Equation \ref{eq:pi-dft-like} reveals it is actually a Discrete Fourier Transform (DFT) on the $h_j$’s! Specifically, we can rewrite it as: \begin{align} \label{eq:pi-dft} [ \pi_0, \pi_1, \dots, \pi_{n-1} ] = \mathsf{DFT}_{\Gr}(h_1, h_2, \dots, h_m, h_{m+1},\dots, h_n) \end{align} Here, the extra $h_{m+1},\dots,h_n$ (if any, when $n > m$) are just commitments to the zero polynomial: i.e., they are the identity element in $\Gr$. (Also, $\mathsf{DFT}_{\Gr}$ denotes a DFT on group elements, where field multiplications are replaced by group exponentiations.)
Time complexity: Ignoring the time to compute the $h_j$ commitments, which we have not discussed yet, note that the DFT above would only take $O(n\log{n})$ time!
This (almost) summarizes the Feist-Khovratovich (FK) technique!
The key idea? KZG quotient polynomial commitments are actually related, if the evaluation points are roots of unity. Specifically, these commitments are the output of a single DFT as per Equation \ref{eq:pi-dft}, which can be computed in quasilinear time!
However, one key challenge remains, which we address next: computing the $h_j$ commitments.
To see how the $h_j$’s can be computed fast too, let’s rewrite them from Equation \ref{eq:HX}.
\begin{align}
H_1(X) &= f_m X^{m-1} + f_{m-1} X^{m-2} + \dots + f_1\\
H_2(X) &= f_m X^{m-2} + f_{m-1} X^{m-3} + \dots + f_2\\
H_3(X) &= f_m X^{m-3} + f_{m-1} X^{m-4} + \dots + f_3\\
&\vdots\\
H_{m-1}(X) &= f_m X + f_{m-1}\\
H_m(X) &= f_m
\end{align}
Key observation: We can express the $H_j(X)$ polynomials as a Toeplitz matrix product between a matrix $\mathbf{F}$ (of $f(X)$’s coefficients) and a column vector $V(X)$ (of the indeterminate variable $X$):
\begin{align}
\begin{bmatrix}
H_1(X)\\
H_2(X)\\
H_3(X)\\
\vdots\\
H_{m-1}(X)\\
H_m(X)\\
\end{bmatrix}
&=
\begin{bmatrix}
f_m & f_{m-1} & f_{m-2} & f_{m-3} & \dots & f_2 & f_1\\
0 & f_m & f_{m-1} & f_{m-2} & \dots & f_3 & f_2\\
0 & 0 & f_m & f_{m-1} & \dots & f_4 & f_3\\
\vdots & & & \ddots & & & \vdots\\
0 & 0 & 0 & 0 & \dots & f_m & f_{m-1}\\
0 & 0 & 0 & 0 & \dots & 0 & f_m
\end{bmatrix}
\cdot
\begin{bmatrix}
X^{m-1}\\
X^{m-2}\\
X^{m-3}\\
\vdots\\
X\\
1
\end{bmatrix}
\\
&\bydef
\mathbf{F} \cdot V(X)
\end{align}
Therefore, the commitments $h_j$ to the $H_j(X)$’s can also be expressed as a Toeplitz matrix product, where “multiplication” is replaced with “exponentiation” and the column vector $V(X)$ is replaced by $V(\tau)$. (Since $\tau$ is secret, the powers $\tau^k$ are only available in the exponent, as the public parameters $g^{\tau^k}$, so each entry of the product is really computed as a product of exponentiations.)
\begin{align}
\begin{bmatrix}
h_1\\
h_2\\
h_3\\
\vdots\\
h_{m-1}\\
h_m\\
\end{bmatrix}
&=
\begin{bmatrix}
f_m & f_{m-1} & f_{m-2} & f_{m-3} & \dots & f_2 & f_1\\
0 & f_m & f_{m-1} & f_{m-2} & \dots & f_3 & f_2\\
0 & 0 & f_m & f_{m-1} & \dots & f_4 & f_3\\
\vdots & & & \ddots & & & \vdots\\
0 & 0 & 0 & 0 & \dots & f_m & f_{m-1}\\
0 & 0 & 0 & 0 & \dots & 0 & f_m
\end{bmatrix}
\cdot
\begin{bmatrix}
\tau^{m-1}\\
\tau^{m-2}\\
\tau^{m-3}\\
\vdots\\
\tau\\
1
\end{bmatrix}
\\
&\bydef
\mathbf{F} \cdot V(\tau)
\end{align}
Fortunately, it is well known that such a matrix product can be computed in $O(m\log{m})$ time (incidentally, also via DFTs).
If you are curious, in a previous blogpost, as well as in a short paper^{3}, we explain in detail how this works.
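Putting the two pieces together, here is a toy end-to-end Python sketch of the FK identity. All parameters are made up for illustration: the “pairing group” is the order-17 subgroup of $\Z_{103}^*$ generated by $g = 64$, the scalar field is $\F_{17}$ where $\omega = 4$ is an $n$th root of unity for $n = 4$, and $\tau = 9$ plays the trusted-setup secret. It computes the $h_j$ commitments, runs a (naive, $O(n^2)$, for clarity) DFT on group elements to get all the $\pi_i$’s, and checks each against a directly-computed $g^{Q_i(\tau)}$:

```python
# Hypothetical toy parameters: scalar field F_q with q = 17; ω = 4 has order
# n = 4 mod 17; the "group" is the order-17 subgroup of Z_103^* generated by
# g = 64; τ = 9 plays the role of the trusted-setup secret.
q, p, g = 17, 103, 64
n, w = 4, 4
tau = 9

f = [5, 1, 12, 7]                # f(X) = 7X^3 + 12X^2 + X + 5, coeffs mod q
m = len(f) - 1

def poly_eval(c, x):
    """Evaluate a (lowest-first) coefficient list at x, mod q."""
    return sum(ck * pow(x, k, q) for k, ck in enumerate(c)) % q

# Step 1: commitments h_{j+1} = g^{H_{j+1}(τ)}, where H_{j+1} has
# (lowest-first) coefficients [f_{j+1}, ..., f_m]
H = [[f[k] for k in range(j + 1, m + 1)] for j in range(m)]
h = [pow(g, poly_eval(Hj, tau), p) for Hj in H]

# Step 2: DFT on group elements (naive O(n^2) version, for clarity):
# π_i = Π_j h_{j+1}^{(ω^i)^j}; the missing h's for j >= m are identities
pis = []
for i in range(n):
    pi = 1
    for j, hj in enumerate(h):
        pi = pi * pow(hj, pow(w, i * j, q), p) % p
    pis.append(pi)

# Cross-check against directly-computed proofs g^{Q_i(τ)}, with Q_i obtained
# by synthetic division of f(X) by (X - ω^i)
for i in range(n):
    u = pow(w, i, q)
    t = [0] * m
    t[m - 1] = f[m]
    for j in range(m - 2, -1, -1):
        t[j] = (f[j + 1] + u * t[j + 1]) % q
    assert pis[i] == pow(g, poly_eval(t, tau), p)
```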
We are all done! To summarize, to compute all proofs $\pi_i$ for $f(\omega^i)$, the Feist-Khovratovich (FK) technique^{1} proceeds as follows:

1. Compute the KZG commitments $h_j = g^{H_j(\tau)}$ via the Toeplitz matrix product above, in $O(m\log{m})$ time.
2. Compute all $n$ proofs $[\pi_0, \pi_1, \dots, \pi_{n-1}] = \mathsf{DFT}_{\Gr}(h_1, h_2, \dots, h_n)$ via a single DFT on group elements, in $O(n\log{n})$ time.
A few things that we could still talk about, but we are out of time:
Fast amortized Kate proofs, by Dankrad Feist and Dmitry Khovratovich, 2020, [URL] ↩ ↩^{2} ↩^{3}
Towards Scalable Threshold Cryptosystems, by Alin Tomescu and Robert Chen and Yiming Zheng and Ittai Abraham and Benny Pinkas and Guy Golan Gueta and Srinivas Devadas, in IEEE S\&P’20, 2020 ↩
How to compute all Pointproofs, by Alin Tomescu, in Cryptology ePrint Archive, Report 2020/1516, 2020, [URL] ↩ ↩^{2}
fflonk: a Fast-Fourier inspired verifier efficient version of PlonK, by Ariel Gabizon and Zachary J. Williamson, in Cryptology ePrint Archive, Report 2021/1167, 2021, [URL] ↩
The multiplicative group of integers modulo $m$ is defined as: \begin{align} \Z_m^* = \{a \in \Z_m\ |\ \gcd(a,m) = 1\} \end{align} But why? This is because Euler’s theorem says that: \begin{align} \gcd(a,m) = 1\Rightarrow a^{\phi(m)} = 1 \pmod m \end{align} This, in turn, implies that every element in $\Z_m^*$ has an inverse, since: \begin{align} a\cdot a^{\phi(m) - 1} &= 1 \pmod m \end{align} Thus, for a prime $p$, all elements in $\Z_p^* = \{1,2,\dots, p-1\}$ have inverses. Specifically, since $\phi(p) = p - 1$, the inverse of $a \in \Z_p^*$ is $a^{p-2}$.
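A quick Python check of this last fact, for the (arbitrarily chosen) small prime $p = 101$:

```python
p = 101                                    # a small prime
for a in range(1, p):
    # Euler/Fermat: a^(p-1) = 1 (mod p), so a^(p-2) is a's inverse mod p
    assert a * pow(a, p - 2, p) % p == 1
```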