Slides: here.
Abstract: In this talk, we survey results & challenges around generating, maintaining and using a shared secret amongst the validators of a proof-of-stake (PoS) blockchain.
In the first part, we discuss techniques for generating a secret. We start with secret sharing in the threshold setting and then in the weighted setting that arises in PoS blockchains. We then introduce publicly-verifiable secret sharing (PVSS), explaining why it could be an ideal primitive to build distributed key generation (DKG) protocols from. Lastly, we discuss the new “silent setup” setting^{1}$^,$^{2}$^,$^{3}$^,$^{4} for bootstrapping threshold cryptosystems without a DKG or any explicit secret sharing (previously known as “ad hoc groups” in the literature).
In the second part, we discuss the threat of collusion attacks in the PoS setting, where validators stand to profit by exposing the shared secret or a function of it (e.g., the plaintext obtained after threshold decryption under the shared secret). We present three different collusion attacks, all of which are detectable but unpunishable. We then give a TEE-based approach that could prevent collusion and call for more research in this direction.
In the third part, we discuss some new techniques used to speed up threshold cryptosystems. We begin by reminding practitioners that Lagrange interpolation in threshold cryptosystems can and should be done via an optimized quasilinear-time algorithm, instead of a quadratic one^{5}. Then, we present new results on threshold cryptosystems that use group elements as secret keys^{6}$^,$^{7}$^,$^{8}. Lastly, we present an exciting new direction on batching threshold cryptosystems so that communication during aggregation is independent of the batch size.
Overall, we highlight important research problems in both the theory and the practice of threshold cryptography.
Threshold Signatures in the Multiverse, by L. Baird and S. Garg and A. Jain and P. Mukherjee and R. Sinha and M. Wang and Y. Zhang, in 2023 IEEE Symposium on Security and Privacy (SP), 2023, [URL] ↩
Threshold Signatures from Inner Product Argument: Succinct, Weighted, and Multi-threshold, by Sourav Das and Philippe Camacho and Zhuolun Xiang and Javier Nieto and Benedikt Bunz and Ling Ren, in Cryptology ePrint Archive, Paper 2023/598, 2023, [URL] ↩
hinTS: Threshold Signatures with Silent Setup, by Sanjam Garg and Abhishek Jain and Pratyay Mukherjee and Rohit Sinha and Mingyuan Wang and Yinuo Zhang, in Cryptology ePrint Archive, Paper 2023/567, 2023, [URL] ↩
Threshold Encryption with Silent Setup, by Sanjam Garg and Dimitris Kolonelos and Guru-Vamsi Policharla and Mingyuan Wang, in Cryptology ePrint Archive, Paper 2024/263, 2024, [URL] ↩
Towards Scalable Threshold Cryptosystems, by Alin Tomescu and Robert Chen and Yiming Zheng and Ittai Abraham and Benny Pinkas and Guy Golan Gueta and Srinivas Devadas, in IEEE S&P’20, 2020 ↩
Ferveo: Threshold Decryption for Mempool Privacy in BFT networks, by Joseph Bebel and Dev Ojha, in Cryptology ePrint Archive, Paper 2022/898, 2022, [URL] ↩
Distributed Randomness using Weighted VRFs, by Sourav Das and Benny Pinkas and Alin Tomescu and Zhuolun Xiang, in Cryptology ePrint Archive, Paper 2024/198, 2024, [URL] ↩
Aggregatable Distributed Key Generation, by Gurkan, Kobi and Jovanovic, Philipp and Maller, Mary and Meiklejohn, Sarah and Stern, Gilad and Tomescu, Alin, in Advances in Cryptology – EUROCRYPT 2021, 2021 ↩
tl;dr: This is a post-mortem write-up on how I failed to use the Bulletproofs IPA to convince a verifier that a multi-exponentiation $\A^\bb = \prod_i (A_i)^{b_i}$ was done correctly. The problem is that the Bulletproof verifier has to “fold” the $\A$ vector by using individual exponentiations, which would be even slower than the verifier naively doing the $\A^\bb$ multiexp.
The protocol below assumes the vectors $\A$ and $\bb$ are both of size $m = 2^k$ for some integer $k \ge 0$.
| $\underline{\prove(\A, \bb)}$ | | $\underline{\ver(V, \A, \bb)\rightarrow \{0,1\}}$ |
|---|---|---|
| | $\fbox{$\textbf{if }m = 1$}$ | |
| | $\xrightarrow{\mbox{$\A = [A_1], \bb = [b_1]$}}$ | |
| | | return 1 iff. $V \equals A_1^{b_1}$ |
| | $\fbox{$\textbf{else}\text{ (i.e., if }m \ge 2\text{)}$}$ | |
| $V_L = (\A_R)^{\bb_L}$, $V_R = (\A_L)^{\bb_R}$ | | |
| | $\xrightarrow{\mbox{$V_L, V_R$}}$ | |
| | $\xleftarrow{\mbox{$x\randget \F$}}$ | |
| $\color{red}\A' = \A_L \circ (\A_R)^x$, $\bb' = \bb_L \circ (\bb_R)^{(x^{-1})}$ | | Computes ${\color{red}\A'},\bb'$ just like the prover; $V' = (V_L)^x \cdot V \cdot (V_R)^{(x^{-1})}$ |
| Recurse on $(\A',\bb')$ | | Recurse on $(V', \A',\bb')$ |
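The folding invariant can be checked numerically. Below is a toy Python sketch over a small order-$q$ subgroup of $\Z_p^*$ (parameters and names are mine, purely illustrative and not secure); it interprets $\circ$ on the scalar vector $\bb$ as componentwise addition in the exponent and asserts that $(\A')^{\bb'} = (V_L)^x \cdot V \cdot (V_R)^{x^{-1}}$ holds at every level of the recursion.

```python
import random

q = 1019              # prime order of the toy subgroup
p = 2 * q + 1         # 2039, also prime, so Z_p^* has an order-q subgroup
g = 4                 # 2^2 generates the order-q subgroup of quadratic residues

def multiexp(A, b):
    """Computes prod_i A_i^{b_i} mod p."""
    acc = 1
    for a_i, b_i in zip(A, b):
        acc = acc * pow(a_i, b_i, p) % p
    return acc

def fold_check(A, b):
    """Runs the recursive folding, asserting the verifier's invariant at each level."""
    V = multiexp(A, b)
    while len(A) > 1:
        m = len(A) // 2
        AL, AR, bL, bR = A[:m], A[m:], b[:m], b[m:]
        VL = multiexp(AR, bL)             # cross terms sent by the prover
        VR = multiexp(AL, bR)
        x = random.randrange(1, q)        # verifier's challenge
        x_inv = pow(x, -1, q)
        A = [aL * pow(aR, x, p) % p for aL, aR in zip(AL, AR)]   # A' = A_L ∘ (A_R)^x
        b = [(bl + x_inv * br) % q for bl, br in zip(bL, bR)]    # b' = b_L + x^{-1}·b_R
        V = pow(VL, x, p) * V * pow(VR, x_inv, p) % p            # V' = V_L^x · V · V_R^{1/x}
        assert multiexp(A, b) == V        # the folding invariant
    return V == pow(A[0], b[0], p)        # base case: V = A_1^{b_1}
```

Note that each fold of $\A$ costs $m/2$ exponentiations for the verifier, which is exactly the inefficiency this post-mortem laments.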
One day, I hope to edit this into a full blog post but, until then:
A 20-minute presentation at zkSummit11 can be found below:
An accompanying tweetstorm can be found below:
What is an @aptos keyless account? 🧵
— Alin Tomescu (@alinush407) June 12, 2024
It's a blockchain account derived from (say) your Google account and an application (wallet, dapp, etc).
It's bound not just to you (e.g., you@gmail.com) but also to the application (e.g., @PetraWallet, or @ThalaLabs, or @VibrantXFinance)
$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} $$
The Schnorr signature scheme was originally introduced by Claus-Peter Schnorr, a German mathematician, in a CRYPTO’89 paper^{1}. In the paper, Schnorr first proposes an identification scheme which he then turns into a signature scheme using the well-known Fiat-Shamir transform^{2}. The original paper describes the signature scheme assuming a specific choice of Abelian group: a prime-order $q$ subgroup of $\Zps$, where $p$ is a prime. Later work naturally observed that any prime-order group suffices (e.g., elliptic curve groups)^{3}.
Schnorr patented his scheme in 1990. This was likely the biggest reason why Bitcoin, and the rest of the cryptocurrency space, (unfortunately?) chose ECDSA as its signature scheme, instead of Schnorr, which is simpler, more efficient and easier to thresholdize into a $t$-out-of-$n$ scheme. In 2010, once the patent expired, Schnorr became more popular.
One advantage of ECDSA over Schnorr I can think of is its public key recovery feature, which Bitcoin leverages in P2PKH mode^{4} to keep TXN signatures smaller. In fact, it seems Bitcoin has leveraged P2PKH since the beginning^{5}.
Preliminaries:
$\mathsf{Schnorr}$.$\mathsf{KeyGen}(1^\lambda) \rightarrow (\sk, \pk)$:

- Pick a secret key $\sk \randget \Zp$
- Set the public key $\pk = g^\sk$
$\mathsf{Schnorr}$.$\mathsf{Sign}(m, \sk) \rightarrow \sigma$:

- Pick a random nonce $r \randget \Zp$ and compute $R = g^r$
- Compute $s = r - H(R, m)\cdot \sk$
- Output $\sigma = (R, s)$
Note: It is also possible to use a $+$ instead of a $-$ when computing $s$. The verification equation can be adjusted to account for it (e.g., see EdDSA below).
$\mathsf{Schnorr}$.$\mathsf{Verify}(m, \pk, \sigma) \rightarrow \{0,1\}$:

- Parse $\sigma$ as $(R, s)$
- Return 1 iff. $R \equals g^s \cdot \pk^{H(R, m)}$
The scheme is correct if signatures created via $\mathsf{Schnorr.Sign}$ verify correctly via $\mathsf{Schnorr.Verify}$.
Let’s see why this holds:
\begin{align}
R &\equals g^s \cdot \pk^{H(R, m)}\\
g^r &\equals g^{r-H(R, m)\cdot \sk} \cdot (g^\sk)^{H(R, m)}\\
g^r &\equals g^{r-H(R, m)\cdot \sk} \cdot g^{H(R, m) \cdot \sk}\\
g^r &\equals g^r
\end{align}
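To make the algebra concrete, the whole scheme fits in a few lines of toy Python. The tiny group, the hash-to-scalar, and all names below are mine, purely for illustration; this is in no way a secure implementation.

```python
import hashlib, random

q = 1019               # toy prime subgroup order (NOT a secure size)
p = 2 * q + 1          # 2039, prime; g generates the order-q subgroup of Z_p^*
g = 4

def H(R, m):
    """Hash-to-scalar: H(R, m) interpreted as an integer mod q."""
    h = hashlib.sha256(f"{R}|{m}".encode()).digest()
    return int.from_bytes(h, "big") % q

def keygen():
    sk = random.randrange(1, q)
    return sk, pow(g, sk, p)          # pk = g^sk

def sign(m, sk):
    r = random.randrange(1, q)        # nonce: MUST be fresh for every signature
    R = pow(g, r, p)
    s = (r - H(R, m) * sk) % q        # s = r - H(R, m)·sk
    return R, s

def verify(m, pk, sig):
    R, s = sig
    return R == pow(g, s, p) * pow(pk, H(R, m), p) % p   # R =? g^s · pk^{H(R, m)}
```

The final check is exactly the correctness derivation above: $g^s \cdot \pk^{H(R,m)} = g^{r - H(R,m)\cdot\sk} \cdot g^{H(R,m)\cdot\sk} = g^r = R$.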
Schnorr signature verification is significantly faster when done in batch, rather than individually via $\mathsf{Schnorr.Verify}$.
Specifically, given $(\sigma_i, m_i, \pk_i)_{i\in [n]}$, one can ensure all signatures verify (i.e., that $\mathsf{Schnorr.Verify}(m_i, \pk_i, \sigma_i) = 1,\forall i\in [n]$) by taking a random linear combination of the verification equations and combining them into one.
More formally, the batch verification algorithm looks like this:
$\mathsf{Schnorr.BatchVerify}((m_i, \pk_i, \sigma_i)_{i\in[n]}) \rightarrow \{0,1\}$:

- Parse each $\sigma_i$ as $(R_i, s_i)$
- Pick random scalars $z_i \randget \Zp, \forall i\in[n]$
- Return 1 iff. $\prod_{i\in[n]} R_i^{-z_i} \cdot g^{s_i\cdot z_i} \cdot \pk_i^{H(R_i, m_i)\cdot z_i} \equals 1$
The speed-up comes from using multi-exponentiation algorithms such as Bos-Coster or BDL+12^{6} in the last check.
Note: When the public keys are the same (i.e., $\pk_i = \pk, \forall i\in[n]$), then the size of the multi-exponentiation can be reduced, which further speeds up the verification: \begin{align} \left(\prod_{i \in [n]} R_i^{-z_i} g^{s_i \cdot z_i}\right) \pk^{\sum_{i\in[n]} H(R_i, m_i)\cdot z_i} \equals 1 \end{align}
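A toy Python sketch of batch verification follows (same illustrative, insecure group style as above; all names are mine). The random per-signature scalars $z_i$ realize the random linear combination, and the final accumulated product is where a real implementation would apply a multi-exponentiation algorithm.

```python
import hashlib, random

q, p, g = 1019, 2039, 4   # toy order-q subgroup of Z_p^* (NOT secure)

def H(R, m):
    return int.from_bytes(hashlib.sha256(f"{R}|{m}".encode()).digest(), "big") % q

def sign(m, sk):
    r = random.randrange(1, q)
    R = pow(g, r, p)
    return R, (r - H(R, m) * sk) % q

def batch_verify(batch):
    """batch = [(m_i, pk_i, (R_i, s_i)), ...]; checks all equations in one product."""
    acc = 1
    for m, pk, (R, s) in batch:
        z = random.randrange(1, q)   # random combiner z_i
        # multiply in the z_i-th power of: R_i^{-1} · g^{s_i} · pk_i^{H(R_i, m_i)} = 1
        acc = (acc
               * pow(R, (-z) % q, p)
               * pow(g, s * z % q, p)
               * pow(pk, H(R, m) * z % q, p)) % p
    return acc == 1
```

A signature batch with even one invalid signature leaves a non-identity factor in the product, so (except with negligible probability over the $z_i$'s) the check fails.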
Implementing batch verification so that it returns the same results as individual verification may be trickier than it appears over non-prime order groups like Edwards 25519^{7}$^,$^{8}.
In this formulation, the signature includes the hash $e = H(R, m)$ instead of $R$.
This may have advantages if the hash can be made smaller. The original Schnorr paper^{1} claims $\lambda$-bit hashes (as opposed to $2\lambda$) are sufficient for $\lambda$-bit security, but not sure if that has changed.
On the other hand, a disadvantage is that this formulation does not allow for more efficient batch verification.
$\mathsf{Schnorr}’$.$\mathsf{Sign}(m, \sk) \rightarrow \sigma$:
$\mathsf{Schnorr}’$.$\mathsf{Verify}(m, \pk, \sigma) \rightarrow \{0,1\}$:
EdDSA is a Schnorr-based signature scheme designed for groups $\Gr$ of non-prime order $p = h\cdot q$, where $q\approx 2^{2\lambda}$ and $h=8$ (but can be generalized to $h=2^c$, for any $c$^{9}). EdDSA makes a few modifications for security: (1) the nonce $r$ is generated pseudorandomly from the SK and the message $m$, (2) the challenge hash additionally covers the public key, and (3) EdDSA has been designed so that the SK can be safely reused in Diffie-Hellman (DH) key exchange protocols like X25519.
EdDSA uses multiple hash functions:
These are typically instantiated from a single hash function $H : \{0,1\}^* \rightarrow \{0,1\}^{4\lambda}$ via proper domain separation.
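One generic way to carve several hash functions out of a single $H$ is prefix-based domain separation, sketched below. (The prefixes here are mine and purely illustrative; real EdDSA instantiations fix their own separation conventions.)

```python
import hashlib

def H(data):
    """A single 4λ-bit hash function (512 bits for λ = 128)."""
    return hashlib.sha512(data).digest()

def make_hash(domain):
    """Derives a domain-separated hash from H: length-prefixing the domain tag
    guarantees inputs to H can never collide across different domains."""
    tag = len(domain).to_bytes(1, "big") + domain
    return lambda data: H(tag + data)

# Three independent-looking hash functions, all built from the same H:
H_key   = make_hash(b"eddsa-key-expansion")
H_nonce = make_hash(b"eddsa-nonce")
H_chal  = make_hash(b"eddsa-challenge")
```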
$\mathsf{EdDSA}$.$\mathsf{KeyGen}(1^\lambda) \rightarrow (\sk, \pk)$:
What is up with this weird generation of the secret key $a$? tl;dr is that it allows for “the same [Ed25519] secrets [to] also be used safely with X25519 if you also need to do a key-exchange.”^{10}
$\mathsf{EdDSA}$.$\mathsf{Sign}(m, \sk) \rightarrow \sigma$:
As per [BDL+12]^{6} the inclusion of $\pk$ in $H_3$ is “an inexpensive way to alleviate concerns that several public keys could be attacked simultaneously”. Another yet-to-be explored advantage is that it prevents an adversary who is given a target signature $\sigma$ from finding a message $m$ and a public key $\pk$ for which it verifies. For example, this is possible in plain Schnorr where, given any $\sigma$, the adversary can pick any message $m$ it wants and compute the PK as $\pk = (g^s / R)^{1/H(R, m)}$. (In the future, I hope to expand on why such a security notion may be useful.)
$\mathsf{EdDSA}$.$\mathsf{Verify}(m, \pk, \sigma) \rightarrow \{0,1\}$:
An alternative version of the verification function multiplies by the cofactor $h$ in the exponent: $g^{h\cdot s} \equals R^h \cdot \pk^{h\cdot H(R, \pk, m)}$. The subtleties of this are discussed by Henry de Valence^{7}.
Ed25519 is just EdDSA over the Edwards 25519 curve with $\lambda=128$ and an appropriate choice of hash function. This is stated in the EdDSA paper^{6}:
Our recommended curve for EdDSA is a twisted Edwards curve birationally equivalent to the curve Curve25519 […] We use the name Ed25519 for EdDSA with this particular choice of curve.
Typically, the most common flavor of Ed25519 is Ed25519-SHA-512 which uses SHA2-512 as its hash function.
Surprisingly, implementing Schnorr signatures can be quite tricky. Previous work explores the many subtleties in depth^{7}$^,$^{8}$^,$^{9}. Instead of rehashing their explanations, I will summarize three main pitfalls to watch out for. (Unfortunately, Ed25519 only handles the first one.)
This is the most important pitfall to avoid in Schnorr signatures:
Pitfall: If an implementation produces two signatures that reuse the same $r$, then the secret key can be extracted. Therefore, it is crucial for security that $r$ be sampled randomly.
Recommendation: As we discuss later, picking $r$ pseudorandomly based on the message and the secret key obviates this problem.
We showcase the attack below.
Suppose an implementation produces two signatures $\sigma_1 = (R, s_1)$ and $\sigma_2 = (R, s_2)$ on messages $m_1 \ne m_2$, respectively, that reuse the same $r$.
Specifically:
\begin{align}
R &= g^r\\
s_1 &= r + H(R, m_1) \cdot \sk\\
s_2 &= r + H(R, m_2) \cdot \sk
\end{align}
Then, an attacker can extract $\sk$ as follows:
\begin{align}
\frac{s_1 - s_2}{H(R, m_1) - H(R, m_2)}
&= \frac{H(R, m_1)\sk - H(R, m_2)\sk}{H(R, m_1) - H(R, m_2)}\\
&= \frac{\sk(H(R, m_1) - H(R, m_2))}{H(R, m_1) - H(R, m_2)}\\
&= \sk
\end{align}
For this attack to work, the denominator above must not be zero, which happens with overwhelming probability when $m_1\ne m_2$ and $H$ is collision-resistant. This attack works even when using the alternative $(e, s)$ formulation of Schnorr signatures, described later.
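The extraction is easy to reproduce. The toy Python sketch below (illustrative small group, and the $s = r + H(R,m)\cdot\sk$ variant to match the equations above) recovers $\sk$ from two same-nonce signatures.

```python
import hashlib, random

q, p, g = 1019, 2039, 4   # toy order-q subgroup of Z_p^* (NOT secure)

def H(R, m):
    return int.from_bytes(hashlib.sha256(f"{R}|{m}".encode()).digest(), "big") % q

random.seed(0)
sk = random.randrange(1, q)

# A buggy signer reuses the nonce r across two different messages:
r = random.randrange(1, q)
R = pow(g, r, p)
m1, m2 = "pay Alice 1 coin", "pay Bob 2 coins"
while H(R, m2) == H(R, m1):   # ensure distinct challenges (toy modulus is tiny)
    m2 += "!"
s1 = (r + H(R, m1) * sk) % q
s2 = (r + H(R, m2) * sk) % q

# The attacker sees only (R, s1, m1) and (R, s2, m2), and solves for sk:
num = (s1 - s2) % q                       # = (H(R,m1) - H(R,m2))·sk
den = (H(R, m1) - H(R, m2)) % q
recovered_sk = num * pow(den, -1, q) % q
assert recovered_sk == sk                 # full key recovery
```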
Pitfall: The description above and, in fact, most academic descriptions, do not distinguish between a group element and its serialization into bytes. (Same applies to field elements.) Yet developers who implement Schnorr must understand the caveats of (de)serializing these elements to (1) avoid issues such as signature malleability and (2) maintain compatibility with other libraries.
For example, consider the code that deserializes the $s\in \Zp$ component of the Schnorr signature. Typically, naively-written code will not check that the positive integer encoded in the bytes is $< p$. As a result, such code will accept two different byte representations of the same $s$. This could allow one valid Schnorr signature $\sigma$ on $m$ to be mauled by an attacker into a different-but-still-valid signature $\sigma’$ on $m$.
Such malleability attacks might not seem like a big deal: after all, there was already a valid $\sigma$ on $m$, what do we care if someone can create a new $\sigma’$ that’s also valid? Fair enough, but many applications often (incorrectly) assume that a message only has one, unique, valid signature. In the past, such attacks may have been used to drain money from (poorly-implemented) cryptocurrency exchanges^{11}.
Recommendation: Developers need to ensure that each group (or field) element has a single / unique / canonical serialized representation into bytes and that deserialization only accepts this canonical representation. Ristretto255^{12} is a recently-proposed elliptic curve group that offers canonical (de)serialization.
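The toy Python sketch below (same illustrative group style as earlier; all names are mine) shows the failure mode: a deserializer that skips the range check accepts a second byte-encoding $s + q$ of the same scalar, yielding a mauled-but-still-valid signature.

```python
import hashlib, random

q, p, g = 1019, 2039, 4   # toy order-q subgroup of Z_p^* (NOT secure)

def H(R, m):
    return int.from_bytes(hashlib.sha256(f"{R}|{m}".encode()).digest(), "big") % q

def verify(m, pk, R, s):
    return R == pow(g, s, p) * pow(pk, H(R, m), p) % p

def naive_deserialize_scalar(buf):
    return int.from_bytes(buf, "big")        # BUG: no range check

def canonical_deserialize_scalar(buf):
    s = int.from_bytes(buf, "big")
    if s >= q:
        raise ValueError("non-canonical scalar encoding")
    return s

random.seed(1)
sk = random.randrange(1, q); pk = pow(g, sk, p)
r = random.randrange(1, q);  R = pow(g, r, p)
s = (r - H(R, "m") * sk) % q

sigma  = s.to_bytes(2, "big")                # canonical encoding of s
sigma2 = (s + q).to_bytes(2, "big")          # different bytes, same scalar mod q

assert sigma != sigma2
# The naive code accepts the mauled signature, since g has order q:
assert verify("m", pk, R, naive_deserialize_scalar(sigma2))
```

The canonical deserializer rejects `sigma2` outright, restoring the one-signature-one-encoding property.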
Pitfall: The description above and, in fact, most academic descriptions, make a crucial assumption: that $\Gr$ is a prime-order group.
Yet, Ed25519, which is the most popular implementation of Schnorr, does not use prime-order groups. Instead, it uses composite order groups where the order is $h\cdot q$ where $q$ is prime and $h = 8$ is the so-called cofactor. This actually creates subtle issues when batch-verifying Schnorr signatures, for example, where signatures that verify individually will not verify as part of a batch^{7}.
Recommendation: If you have the freedom in your application, you should avoid implementing Schnorr over non-prime order groups (i.e., avoid Ed25519) and adopt Schnorr variants like Schnorrkel^{13} which use prime-order groups.
By now, you should be pretty well-versed in Schnorr signatures and a few of their properties: nonce reuse attacks, batch verification, alternative forms, etc. There is so much more to say about them. Perhaps this article will grow over time.
Efficient Identification and Signatures for Smart Cards, by Schnorr, C. P., in Advances in Cryptology — CRYPTO’ 89 Proceedings, 1990 ↩ ↩^{2} ↩^{3}
How To Prove Yourself: Practical Solutions to Identification and Signature Problems, by Fiat, Amos and Shamir, Adi, in Advances in Cryptology — CRYPTO’ 86, 1987 ↩
Not sure what the earliest work is that uses Schnorr signatures over, say, elliptic curves. ↩
How did pay-to-pubkey hash come about? What is its history? ↩
High-speed high-security signatures, by Bernstein, Daniel J. and Duif, Niels and Lange, Tanja and Schwabe, Peter and Yang, Bo-Yin, in Journal of Cryptographic Engineering, 2012, [URL] ↩ ↩^{2} ↩^{3}
It’s 255:19AM. Do you know what your validation criteria are?, Henry de Valence ↩ ↩^{2} ↩^{3} ↩^{4}
Taming the many EdDSAs, by Konstantinos Chalkias and François Garillot and Valeria Nikolaenko, in Cryptology ePrint Archive, Report 2020/1244, 2020, [URL] ↩ ↩^{2}
The Provable Security of Ed25519: Theory and Practice, by Jacqueline Brendel and Cas Cremers and Dennis Jackson and Mang Zhao, in Cryptology ePrint Archive, Report 2020/823, 2020, [URL] ↩ ↩^{2}
An Explainer On Ed25519 Clamping, Jake Craige ↩
Bitcoin Transaction Malleability and MtGox, by Decker, Christian and Wattenhofer, Roger, in Computer Security - ESORICS 2014, 2014 ↩
$$ \def\ak{\mathsf{ak}} $$
Why do we care about threshold verifiable unpredictable functions (VUFs)? Because the most efficient designs for distributed randomness beacons^{1} rely on them! Yet, all practical threshold VUFs require an expensive DKG phase between the beacon nodes (e.g., BLS^{2}). This DKG must be run (1) when the beacon is first deployed and (2) when nodes enter or leave the distributed deployment (e.g., via a DKG-like secret resharing scheme).
Today, several beacon protocols based on threshold VUFs incur this expensive DKG phase (e.g., Aptos Roll^{3}, drand^{4}, Flow^{5}, Sui^{6}). This makes it intriguing (and potentially-useful) to consider the possibility of a silent setup threshold VUF: a scheme that avoids the need for a DKG!
In this blog post, we show such a scheme exists, if efficient multilinear maps exist! In fact, our proposed scheme is slightly stronger: it is a silent-setup multiverse VUF, or a SMURF, which works in a recently-proposed, more general multiverse setting^{7} that implicitly captures the threshold setting too.
We modify the BLS^{2} multisignature scheme to be a threshold VUF as follows.
Each player $i$ has a secret key $\sk_i$ and a verification key $\vk_i = g^{\sk_i}$, generated locally, as usual. Player $i$ produces a signature share on message $m$ as $\sigma_i = H(m)^{\sk_i} \in \Gr$, as usual.
What is different then? To aggregate a $t$-out-of-$n$ threshold VUF, we will plug the $t$ signature shares as the first inputs to an $n$-multilinear map $e$ and the remaining $n-t$ public keys as the last inputs.
For example, a 2-out-of-3 threshold VUF $\sigma$ over a message $m$ would be aggregated from BLS signature shares $(\sigma_1,\sigma_2)$ as:
\begin{align}
\label{eq:example-agg}
\sigma
&= e_3(\sigma_1, \sigma_2, \pk_3)\\
&= e_3(H(m)^{\sk_1}, H(m)^{\sk_2}, g^{\sk_3})\\
&= e_3(H(m), H(m), g)^{\sk_1 \cdot \sk_2 \cdot \sk_3}
\end{align}
To verify such a threshold VUF, the proof would consist of the actual signature shares $H(m)^{\sk_1}$ and $H(m)^{\sk_2}$. Verification would involve two steps. First, validate each signature share using the multilinear map (or via a DLEQ $\Sigma$-protocol proof): \begin{align} \label{eq:example-ver} e_3(\sigma_i, g,g) \equals e_3(H(m), \pk_i, g),\forall i\in\{1,2\} \end{align} Second, re-aggregate the signature shares and check they yield the same signature $\sigma$ from Equation \ref{eq:example-agg}.
We fully describe this strawman scheme below and improve it with succinctness later via an argument of knowledge (AoK) of signature shares that satisfy Equation \ref{eq:example-ver} and, when aggregated via Equation \ref{eq:example-agg}, yield the threshold VUF.
Note: We assume symmetric multilinear maps. Generalizing this to asymmetric ones would be interesting. (Naively, porting the construction over does not work due to the apparent necessity of defining multiple hash functions $H_i$ into each one of the asymmetric groups $\Gr_i$, which would break the uniqueness of the scheme.)
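Since no practical multilinear maps exist, the aggregation and share-verification identities above can only be sanity-checked symbolically. The Python sketch below does so in a transparent, completely insecure simulation that represents every group element by its discrete log relative to $g$ (so $e_3$ becomes multiplication of exponents); everything here is illustrative only.

```python
import random

q = 2**61 - 1   # a Mersenne prime; all exponents of the toy "group" live in Z_q

# Transparent simulation: g^a is represented by the exponent a itself, so the
# 3-linear map e3(g^a, g^b, g^c) = gT^{a·b·c} is just multiplication mod q.
# (Insecure by construction: discrete logs are in the clear.)
def e3(a, b, c):
    return a * b * c % q

random.seed(0)
h = random.randrange(1, q)                  # H(m) = g^h for a random exponent h
sk1, sk2, sk3 = (random.randrange(1, q) for _ in range(3))
pk1, pk2, pk3 = sk1, sk2, sk3               # pk_i = g^{sk_i}: same exponent here

sig1 = h * sk1 % q                          # σ_1 = H(m)^{sk_1}
sig2 = h * sk2 % q                          # σ_2 = H(m)^{sk_2}

# Share verification: e3(σ_i, g, g) =? e3(H(m), pk_i, g)   (g has exponent 1)
assert e3(sig1, 1, 1) == e3(h, pk1, 1)
assert e3(sig2, 1, 1) == e3(h, pk2, 1)

# Aggregation: e3(σ_1, σ_2, pk_3) =? e3(H(m), H(m), g)^{sk_1·sk_2·sk_3}
agg  = e3(sig1, sig2, pk3)
base = e3(h, h, 1)                          # target-group exponent h^2
rhs  = base * (sk1 * sk2 * sk3 % q) % q     # raising gT^{h^2} to sk_1·sk_2·sk_3
assert agg == rhs
```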
Recently, there has been increased interest in silent-setup threshold signatures: i.e., threshold signatures that avoid DKGs^{8}$^,$^{7}$^,$^{9}$^,$^{10}. However, silent-setup unique threshold signatures have not received much attention. (Recall that: unique threshold signature = threshold VUFs.)
Part of the reason may be that a silent-setup threshold VUF is actually a strong primitive: it implies $n$-party non-interactive key exchange (NIKE). Specifically, a $1$-out-of-$n$ silent-setup threshold VUF $=$ an $n$-party NIKE^{11}. Furthermore, a $t$-out-of-$n$ silent-setup threshold VUF = an $(n-t+1)$-party NIKE, because $t-1$ of the SKs can be exposed, which yields a $1$-out-of-$(n-t+1)$ silent VUF.
Note: While the strawman silent-setup threshold signature construction from [BGJ+23]^{7} (see a screenshot here) does satisfy uniqueness, it lacks silent setup: it still requires a DKG-like protocol for exposing evaluations on a degree-$(n-1)$ polynomial.
When we write $(a_i)_{i\in T}$ as an input to an algorithm, it is shorthand for $\left(T, (i, a_i)_{i\in T}\right)$.
A VUF^{12} is a unique signature scheme. Note that uniqueness is a stronger property than deterministic signing. It means that an adversary cannot create two different signatures $\sigma_1 \ne \sigma_2$ such that they are both accepted as valid signatures on a message $m$ under some public key $\pk$. For example, BLS is a unique signature scheme but Ed25519 is not! While Ed25519 supports deterministic signing, it is not a unique scheme because a verifier will gladly accept many different signatures for the same message $m$.
Important: As of 2024, practical multilinear map constructions do not exist^{13}$^,$^{14}. In other words, the schemes in this blog post are purely-theoretical and serve only to indicate the feasibility of a SMURF.
An $n$-multilinear map $e : \Gr^n \rightarrow \Gr_T$ has the following properties:
An accumulator scheme is used to compute a short, collision-resistant digest of a set $S$.
$\mathsf{Acc.Commit}(S) \rightarrow d$. Computes the digest $d$ of the set $S$.
An accumulator scheme is collision-resistant if there is no polynomial time adversary that can output two different sets which have the same digest. As an example, simply sorting the set in a canonical order and hashing the result using a collision-resistant hash function yields an accumulator scheme.
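The sort-and-hash accumulator mentioned above fits in a few lines of Python (names are mine; collision-resistance reduces to that of SHA-256, and length-prefixing each element keeps the serialization unambiguous, e.g., distinguishing $\{ab\}$ from $\{a, b\}$):

```python
import hashlib

def acc_commit(S):
    """Digest of a set of byte strings: canonically sort, length-prefix each
    element (so the concatenation is unambiguous), and hash the result."""
    data = b"".join(len(x).to_bytes(8, "big") + x for x in sorted(S))
    return hashlib.sha256(data).hexdigest()
```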
An AoK for a relation $\mathcal{R}$ consists of two algorithms:
$\mathsf{AoK.Prove}_\mathcal{R}(x; w) \rightarrow \pi$. Generates a proof $\pi$ that $R(x; w) = 1$, given a public statement $x$ and a witness $w$. The proof $\pi$ is succinct (i.e., much smaller than $w$), but might still leak information about $w$.
$\mathsf{AoK.Verify}_\mathcal{R}(x; \pi)\rightarrow \{0,1\}$. Verifies the proof $\pi$ that the prover knows a witness $w$ such that $R(x; w) = 1$.
To avoid mistakes, we formally define a SMURF as a tuple of algorithms (with correctness and security definitions in the appendix):
$\mathsf{SMURF.KeyGen}(1^\lambda) \rightarrow (\sk_i, \vk_i)$. Locally generates a player’s key pair: their secret key and corresponding verification key.
Ideally, $\vk_i$ should be succinct (i.e., its size should be independent of the maximum number of players $n$).
$\mathsf{SMURF.AggPubkey}(t, (\vk_j)_{j\in [n]}) \rightarrow (\pk, \ak)$. Takes the verification keys of $n$ players (generated locally via $\mathsf{SMURF.KeyGen}$) and a threshold $t$. Aggregates them into:
The necessity of an aggregation key (AK) $\ak$ in this definition is artificial: an ideal definition would not require it. However, because our succinct SMURF construction requires the VKs of all players during aggregation, we rely on an AK to pass in this information.
$\mathsf{SMURF.ShareSign}(\sk_i, m) \rightarrow \sigma_i$. Computes a signature share $\sigma_i$ over $m$ under $\sk_i$.
$\mathsf{SMURF.ShareVer}(\vk_i, m, \sigma_i) \rightarrow \{0,1\}$. Verifies the signature share $\sigma_i$ on the message $m$ from the player with VK $\vk_i$.
$\mathsf{SMURF.AggSig}(\ak, m, (\sigma_i)_{i\in T}) \rightarrow \sigma$. Aggregates the signature shares $\sigma_i$ from a subset $T$ of players into a signature $\sigma$ that can be verified via $\mathsf{SMURF.Verify}$.
$\mathsf{SMURF.Verify}(\pk, m, \sigma) \rightarrow \{0,1\}$. Verifies that $\sigma$ is a valid signature on $m$ under $\pk$.
$\mathsf{SMURF.Derive}(\pk, m, \sigma) \rightarrow y$. Derives a unique output $y$ from the signature $\sigma$.
$\mathsf{SMURF.Eval}(t, (\vk_j)_{j\in[n]}, m) \rightarrow y$. Returns the unique output $y$ on message $m$ given a threshold $t$ and the VKs of the $n$ players.
Note: We intentionally defined $\mathsf{SMURF.Eval}$ to take the VKs rather than the SKs as input. This allows us to define unpredictability in the silent setup setting, where it will not be possible to extract the SKs under which the adversary’s prediction was made (unlike in the DKG setting), since VKs can be adversarial. Lastly, even though $\mathsf{SMURF.Eval}$ is not a polynomial-time algorithm, this is not a problem since it is only used for the security definition.
Here, we describe the scheme from above in more detail.
Let $g$ be the generator for $\Gr$, which admits an $n$-multilinear map^{15} $e$, as defined above. We construct a non-succinct SMURF as follows:
$\mathsf{SMURF_1.KeyGen}(1^\lambda) \rightarrow (\sk_i, \vk_i)$:
$\mathsf{SMURF_1.AggPubkey}(t, (\vk_j)_{j\in [n]}) \rightarrow (\pk, \ak)$.
$\mathsf{SMURF_1.ShareSign}(\sk_i, m) \rightarrow \sigma_i$:
$\mathsf{SMURF_1.ShareVer}(\vk_i, m, \sigma_i) \rightarrow \{0,1\}$:
$\mathsf{SMURF_1.AggSig}(\ak, m, (\sigma_i)_{i\in T}) \rightarrow \sigma$:
Just a naive aggregation here. Will fix this in the succinct construction below.
$\mathsf{SMURF_1.Verify}(\pk, m, \sigma) \rightarrow \{0,1\}$:
Also a naive verification here, which we fix in the succinct construction below.
$\mathsf{SMURF_1.Derive}(\pk, m, \sigma) \rightarrow y$.
$\mathsf{SMURF_1.Eval}(t, (\vk_j)_{j\in[n]}, m) \rightarrow y$.
Recall that a polynomial time $\mathsf{SMURF.Eval}$ is not necessary, since we only use this algorithm to define security.
From a theoretical standpoint, the $\mathsf{SMURF_1}$ construction above is not succinct:
Fortunately, all of these can be addressed with a succinct argument of knowledge (AoK), often known as a SNARK. (Note that a zkSNARK is not necessary; we do not need zero-knowledge here.)
Specifically, instead of (1) doing all the signature share verification inside $\mathsf{SMURF_1.Verify}$ and (2) doing all the aggregation work inside $\mathsf{SMURF_1.Derive}$, we will give an AoK of having done this work when aggregating in $\mathsf{SMURF_2.AggSig}$.
We denote the resulting scheme as $\mathsf{SMURF_2}$. It has the same $\mathsf{KeyGen}$, $\mathsf{ShareSign}$, $\mathsf{ShareVer}$ and $\mathsf{Eval}$ algorithms as $\mathsf{SMURF_1}$, except for:
$\mathsf{SMURF_2.AggPubkey}(t, (\vk_j)_{j\in [n]}) \rightarrow (\pk, \ak)$.
We make the PK succinct by converting it to an accumulator over the VKs. The aggregation key (AK) still needs to maintain all the individual VKs. It is an interesting open question whether the AK can be made succinct (and thus eliminated).
$\mathsf{SMURF_2.AggSig}(\ak, m, (\sigma_i)_{i\in T}) \rightarrow \sigma$:
The new $\mathsf{AggSig}$ proves knowledge of $t$ valid signature shares, against the VKs accumulated in the PK, such that these shares aggregate into a unique output $y$. More formally, the proof argues knowledge of $\sigma_i$’s and $\vk_i$’s such that $\mathcal{R}(d, m, t, y; T, (\sigma_i)_{i\in T}, (\vk_i)_{i\in [n]}) = 1$. We describe the relation $\mathcal{R}$ in detail below.
$\mathsf{SMURF_2.Verify}(\pk, m, \sigma) \rightarrow \{0,1\}$:
$\mathsf{SMURF_2.Derive}(\pk, m, \sigma) \rightarrow y$.
That’s it! This (theoretical) construction now achieves succinctness.
We assume the $\mathsf{AoK}$ scheme is succinct and lacks a trusted setup. However, this $\mathsf{SMURF_2}$ scheme should still be interesting even if the $\mathsf{AoK}$ scheme requires a trusted setup. After all, the trusted setup would only need to be redone to support a higher $n$ and would be reusable for any number of players $n_0 < n$.
$\mathcal{R}(d, m, t, y; T, (\sigma_i)_{i\in T}, (\vk_i)_{i\in [n]}) = 1$ iff.:
If symmetric multilinear maps exist, then there exist SMURFs! Unfortunately, if efficient SMURFs exist (or even their weaker, threshold variant), then efficient $n$-party non-interactive key exchange (NIKE) exists^{11}. (This explains why our two SMURF constructions are very similar to an $n$-party NIKE based on multilinear maps^{16}.)
Future work:
Acknowledgements: Thanks to Valeria Nikolaenko, Joe Bonneau, Rex Fernando, Benny Pinkas, Dan Boneh and Trisha Datta for reading, providing feedback and brainstorming together!
Big thanks to Guru Vamsi Policharla for pointing out that $t$-out-of-$n$ silent setup VUFs also imply $(n-t+1)$-NIKE.
Correctness: $\forall$ number of players $n$, $\forall$ thresholds $t\le n$, where $(\sk_j, \vk_j) \gets \mathsf{SMURF.KeyGen}(1^\lambda),\forall j\in[n]$ and $(\pk,\ak) \gets \mathsf{SMURF.AggPubkey}(t, (\vk_j)_{j\in[n]})$, for any subset $T\subseteq[n]$, where $|T| \ge t$, $\forall$ messages $m$, $\sigma_i \gets \mathsf{SMURF.ShareSign}(\sk_i, m),\forall i\in T$, $\sigma \gets \mathsf{SMURF.AggSig}(\ak, m, (\sigma_i)_{i\in T})$ we have:
\begin{align*}
\forall i\in T,\mathsf{SMURF.ShareVer}(\vk_i, m, \sigma_i) &= 1 \wedge {}\\
\mathsf{SMURF.Verify}(\pk, m, \sigma) &= 1 \wedge {}\\
\mathsf{SMURF.Derive}(\pk, m, \sigma) &= \mathsf{SMURF.Eval}(t, (\vk_j)_{j\in [n]}, m)
\end{align*}
Note: This implies that, for a correctly-generated $t$-out-of-$n$ threshold PK from $n$ VKs, if a signature passes verification via $\mathsf{SMURF.Verify}$, then calling $\mathsf{SMURF.Derive}$ on it would yield the same result as calling $\mathsf{SMURF.Eval}$ on the same message, the same $n$ VKs and the same threshold $t$.
Uniqueness:
For all polynomial time adversaries $\Adv$, for any number of players $n=\poly(\lambda)$, for any threshold $t\le n$, we have:
\begin{align*}
\Pr\begin{bmatrix}
(\pk, m, (\sigma_i)_{i\in[2]}) \gets \Adv(1^\lambda),\\
(y_i \gets \mathsf{SMURF.Derive}(\pk, m, \sigma_i))_{i\in[2]}
:\\
y_1 \ne y_2 \wedge \forall i\in [2], \mathsf{SMURF.Verify}(\pk, m, \sigma_i) = 1
\end{bmatrix} = \negl(\lambda)
\end{align*}
Uniqueness says that an adversary is not able to produce two signatures for the same message that both verify against a threshold PK yet derive different outputs. This particular definition is rather strong, since it allows the adversary to produce the PK adversarially.
In order to define security of a SMURF, we will need to define an oracle that the adversary can query as he attempts to break the scheme. The oracle will allow the adversary to:
This helps formally model the power of the adversary in the multiverse setting in which the SMURF is supposed to remain secure.
How does the oracle work? First, the oracle $\mathcal{O}$ maintains some state:
Then, the oracle handles the following requests from the adversary $\Adv$:
$\mathcal{O}.\mathsf{KeyGen}() \rightarrow (i, \pk)$:
This generates a new player numbered $i$, adds it to the list $L$ and marks it as honest in $H$.
$\mathcal{O}.\mathsf{CorruptPlayer}(i) \rightarrow \sk$:
This checks if player $i$ actually exists and, if so, corrupts it by revealing their SK. The player is marked as malicious by adding it to $M$ and removing it from $H$.
$\mathcal{O}.\mathsf{ShareSign}(i, m) \rightarrow \sigma_i$:
This returns a signature share on $m$ from player $i$, assuming $i$ is honest. The oracle tracks the set of signature queries by adding $i$ to $Q_m$.
Now that we’ve defined the oracle, we can meaningfully define security as follows.
Unforgeability: For all polynomial time adversaries $\Adv$ with oracle access to $\mathcal{O}$:
\begin{align*}
\Pr\begin{bmatrix}
(t, (\vk_i)_{i\in[n]}, m, \sigma) \gets \Adv^\mathcal{O}(1^\lambda),\\
(\pk,\cdot) \gets \mathsf{SMURF.AggPubkey}(t, (\vk_i)_{i\in[n]})
:\\
\mathsf{SMURF.Verify}(\pk, m, \sigma) = 1 \wedge
|Q_m| < t
\end{bmatrix} = \negl(\lambda)
\end{align*}
Unpredictability: For all polynomial time adversaries $\Adv$ with oracle access to $\mathcal{O}$:
\begin{align*}
\Pr\begin{bmatrix}
(t, (\vk_i)_{i\in[n]}, m, y) \gets \Adv^\mathcal{O}(1^\lambda),\\
(\pk,\cdot) \gets \mathsf{SMURF.AggPubkey}(t, (\vk_i)_{i\in[n]})
:\\
\mathsf{SMURF.Eval}(t, (\vk_i)_{i\in[n]}, m) = y \wedge |Q_m| < t
\end{bmatrix} = \negl(\lambda)
\end{align*}
Unpredictability must be defined to prevent trivial instantiations. Otherwise, for example, the signature $\sigma$ could be any non-unique multiverse signature (e.g., BLS multisignatures with proofs-of-possession) while $\mathsf{SMURF.Derive}$ could always uniquely set the VUF output to $y=\bot$. Such a trivial scheme is excluded by our unpredictability definition.
Although verifiable delay functions (VDFs) also give rise to efficient distributed randomness beacons, we do not know of VDF-based beacons that are responsive: i.e., that produce beacon values as fast as the network speed. ↩
Short Signatures from the Weil Pairing, by Boneh, Dan and Lynn, Ben and Shacham, Hovav, in Advances in Cryptology — ASIACRYPT 2001, 2001 ↩ ↩^{2}
Roll with Move: Secure, instant randomness on Aptos, by Alin Tomescu and Zhuolun Xiang, 2024, URL ↩
Threshold Signatures in the Multiverse, by L. Baird and S. Garg and A. Jain and P. Mukherjee and R. Sinha and M. Wang and Y. Zhang, in 2023 IEEE Symposium on Security and Privacy (SP), 2023, [URL] ↩ ↩^{2} ↩^{3}
Threshold Signatures from Inner Product Argument: Succinct, Weighted, and Multi-threshold, by Sourav Das and Philippe Camacho and Zhuolun Xiang and Javier Nieto and Benedikt Bunz and Ling Ren, in Cryptology ePrint Archive, Paper 2023/598, 2023, [URL] ↩
hinTS: Threshold Signatures with Silent Setup, by Sanjam Garg and Abhishek Jain and Pratyay Mukherjee and Rohit Sinha and Mingyuan Wang and Yinuo Zhang, in Cryptology ePrint Archive, Paper 2023/567, 2023, [URL] ↩
Decentralized Threshold Signatures for Blockchains with Non-Interactive and Transparent Setup, by Kwangsu Lee, in Cryptology ePrint Archive, Paper 2023/1206, 2023, [URL] ↩
Big thanks to Guru Vamsi Policharla for this observation during the 3rand workshop! ↩ ↩^{2}
Verifiable random functions, by S. Micali and M. Rabin and S. Vadhan, in 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039), 1999 ↩
Indistinguishability Obfuscation from Well-Founded Assumptions, by Aayush Jain and Huijia Lin and Amit Sahai, in Cryptology ePrint Archive, Paper 2020/1003, 2020, [URL] ↩
Multilinear Maps from Obfuscation, by Martin R. Albrecht and Pooya Farshim and Shuai Han and Dennis Hofheinz and Enrique Larraia and Kenneth G. Paterson, in Cryptology ePrint Archive, Paper 2015/780, 2015, [URL] ↩
A multilinear map of size $n' > n$ inputs would also work by forcing the last $n' - n$ inputs to be some predetermined values from a common-reference string. ↩
Applications of Multilinear Forms to Cryptography, by Dan Boneh and Alice Silverberg, in Cryptology ePrint Archive, Paper 2002/080, 2002, [URL] ↩
We assume familiarity with:
The Baird et al. strawman^{1} follows a very simple idea.
Each player $i\in[n]$ locally picks their secret key $\sk_i$ and computes their public key as $\pk_i = g^{\sk_i}$. Then, the SKs of a set of $n$ players can be used to define a degree-$(n-1)$ polynomial $f(X)$ as follows: \begin{align} f(i) &= \sk_i,\forall i \in[n] \end{align} To create a $t$-out-of-$n$ threshold signature scheme, the players collaborate (in an MPC/DKG-like fashion) to publicly reveal $n-t$ evaluations of this polynomial. This effectively reduces the degree of the polynomial to $t-1$.
Open question: Can this protocol for publicly-revealing the $n-t$ evaluations be instantiated any more efficiently than a DKG?
Specifically, the players use (some) DKG-like protocol to reveal: \begin{align} \mathsf{evals} = \left(f(-1), f(-2),\ldots,f(-(n-t))\right) \end{align} The secret key of the resulting $t$-out-of-$n$ threshold signature scheme is defined as: \begin{align} \sk = f(0) \end{align} The associated PK consists of the publicly-revealed evaluations and, of course, $g^{f(0)}$: \begin{align} \pk = (\mathsf{evals}, g^\sk) = \left(\mathsf{evals}, g^{f(0)}\right) = \left(\mathsf{evals}, \prod_{i\in[n]} \pk_i\right) \end{align} To assemble a threshold signature on a message $m$, each player $i$ reveals their signature share $H(m)^{\sk_i}$. Then, any aggregator who has $\pk$ and $t$ signature shares, can interpolate the unique threshold signature $H(m)^{f(0)}$ from (1) the signature shares and (2) the publicly-revealed evaluations in $\pk$.
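Since the actual shares are $H(m)^{\sk_i}$ and interpolation happens in the exponent, the arithmetic behind this reconstruction can be sanity-checked over field elements directly. The Python sketch below (the toy modulus and all names are mine) samples a random degree-$(n-1)$ polynomial and sets $\sk_i = f(i)$, which is distributionally equivalent to players picking their SKs first:

```python
import random

p = 2**61 - 1  # toy prime field modulus

def lagrange_at(points, x):
    """Lagrange coefficients at x for the given evaluation points (mod p)."""
    coeffs = {}
    for i in points:
        num, den = 1, 1
        for k in points:
            if k != i:
                num = num * ((x - k) % p) % p
                den = den * ((i - k) % p) % p
        coeffs[i] = num * pow(den, p - 2, p) % p  # Fermat inverse of den
    return coeffs

n, t = 5, 3
# Sample a random degree-(n-1) polynomial f and set sk_i = f(i).
f = [random.randrange(p) for _ in range(n)]
def f_eval(x):
    return sum(c * pow(x, k, p) for k, c in enumerate(f)) % p
sks = {i: f_eval(i) for i in range(1, n + 1)}

# DKG-like step: publicly reveal n - t evaluations at negative points.
evals = {-j: f_eval((-j) % p) for j in range(1, n - t + 1)}

# Any t signers, together with the public evaluations, give n points on a
# degree-(n-1) polynomial, enough to interpolate the secret key f(0).
signers = [1, 3, 5]
pts = {i: sks[i] for i in signers}
pts.update({(-j) % p: evals[-j] for j in range(1, n - t + 1)})
lam = lagrange_at(list(pts), 0)
sk0 = sum(lam[i] * y for i, y in pts.items()) % p
assert sk0 == f_eval(0)  # matches the threshold secret key f(0)
```

In the real scheme, the same Lagrange coefficients are applied to the signature shares $H(m)^{\sk_i}$ and the (exponentiated) public evaluations, yielding $H(m)^{f(0)}$.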
We give more details below.
Below, we formally give the Baird et al. strawman^{1}.
$\mathsf{Sig}$.$\mathsf{KeyGen}(1^\lambda) \rightarrow (\sk, \pk)$:
$\mathsf{Sig}$.$\mathsf{DistKeyGen}(t, (\sk_i, \pk_i)_{i\in[n]}) \rightarrow (\sk, \pk)$:
The $\mathsf{Sig.DistKeyGen}$ algorithm is run by the players in an MPC fashion such that it outputs the $\pk$ of the threshold signature scheme yet no one learns the $\sk$.
$\mathsf{Sig}$.$\mathsf{ShareSign}(\sk_i, m) \rightarrow \sigma_i$:
$\mathsf{Sig}$.$\mathsf{ShareVer}(\pk_i, m, \sigma_i) \rightarrow \{0,1\}$:
$\mathsf{Sig}$.$\mathsf{Aggregate}(\pk, m, (\sigma_i)_{i\in T}) \rightarrow \sigma$:
$\mathsf{Sig}$.$\mathsf{Verify}(\pk, m, \sigma) \rightarrow \{0,1\}$:
This is a very nice scheme, but it has a few problems:
$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} $$
This article assumes familiarity with Shamir secret sharing[^Shamir79], a technique that allows a dealer to “split up” a secret $s$ amongst $n$ players such that any subset of size $\ge t$ can reconstruct $s$ yet no subset of size $<t$ learns anything about the secret.
Recall that a secret ${\color{green}s}\in \Zp$ is $t$-out-of-$n$ secret-shared as follows:
The dealer encodes $s$ as the 0th coefficient in a random degree-$(t-1)$ polynomial $\color{green}{f(X)}$: \begin{align} f(X) &= s + \sum_{k=1}^{t-1} f_k X^k,\ \text{where each}\ f_k\randget \Zp \end{align}
The dealer gives each player $i\in [n]$, their share $s_i$ of $s$ as:
\begin{align}
\color{green}{s_i} &= f(i)\\
&= s + \sum_{k=1}^{t-1} f_k \cdot i^k
\end{align}
The shares $[s_1, s_2, \ldots, s_n]$ define the $t$-out-of-$n$ sharing of $s$.
Recall the definition of a Lagrange polynomial w.r.t. a set of evaluation points $T$.
\begin{align}
\forall i\in T,
\color{green}{\lagr_i^T(X)} &= \prod_{k\in T, k\ne i} \frac{X - k}{i - k}
\end{align}
The relevant properties of $\lagr_i^T(X)$ are that:
\begin{align}
\lagr_i^T(i) &= 1,\forall i \in T\\
\lagr_i^T(j) &= 0,\forall i, j\in T, i\ne j
\end{align}
Any subset $T\subseteq[n]$ of $t$ or more players can reconstruct $s$ by combining their shares as follows:
\begin{align}
\sum_{i\in T} \lagr_i^T(0) s_i &= \sum_{i\in T}\lagr_i^T(0) f(i) = f(0) = s\\
\end{align}
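The sharing and reconstruction above can be checked with a short Python sketch (the toy modulus and function names are mine):

```python
import random

p = 2**61 - 1  # toy prime field modulus

def share(s, t, n):
    """Split secret s into n Shamir shares with reconstruction threshold t."""
    coeffs = [s] + [random.randrange(p) for _ in range(t - 1)]  # f(0) = s
    def f(x):
        return sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
    return {i: f(i) for i in range(1, n + 1)}

def reconstruct(shares):
    """Interpolate f(0) from >= t shares via Lagrange coefficients at 0."""
    s = 0
    pts = list(shares)
    for i in pts:
        lam = 1
        for k in pts:
            if k != i:
                # multiply by (0 - k) / (i - k) mod p
                lam = lam * (-k % p) % p * pow((i - k) % p, p - 2, p) % p
        s = (s + lam * shares[i]) % p
    return s

secret = 42
shares = share(secret, t=3, n=5)
subset = {i: shares[i] for i in [2, 4, 5]}  # any 3 of the 5 shares suffice
assert reconstruct(subset) == secret
```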
Suppose the old players, who have a $t$-out-of-$n$ sharing of $s$, want to reshare $s$ with a set of $\color{green}{n'}$ new players such that any $\color{green}{t'}$ of them can reconstruct $s$.
In other words, they want to $t'$-out-of-$n'$ reshare $s$.
Importantly, they want to do this without leaking $s$ or any info about the current $t$-out-of-$n$ sharing of $s$. A technique for this, whose origins are (likely?) in the BGW paper^{1}, is described by Cachin et al.^{2} and involves four steps:
Each old player $i$ first “shares their share” with the new $n'$ players: i.e., randomly sample a degree-$(t'-1)$ polynomial $\color{green}{r_i(X)}$ that shares their $s_i$:
\begin{align}
\color{green}{r_i(X)} &= s_i + \sum_{k=1}^{t'-1} {\color{green}r_{i,k}} X^k,\ \text{where each}\ r_{i,k}\randget \Zp
\end{align}
Let ${\color{green}z_{i,j}}$ denote the share of $s_i$ for player $j\in[n']$. \begin{align} {\color{green}z_{i,j}} = r_i(j) \end{align} Then, each old player $i$ will send $z_{i,j}$ to each new player $j\in [n']$.
And voilà: SUCH A BEAUTIFUL, SIMPLE PROTOCOL for secret resharing.
It’s easy to see why if we reason about the underlying polynomial defined by the new players’ shares $z_j$.
Specifically, the new shares lie on a degree-$(t'-1)$ polynomial $r(X)$ with $r(0) = s$:
\begin{align}
r(X) &= \sum_{i\in H} \lagr_i^H(0) r_i(X)\\
&= \sum_{i\in H} \lagr_i^H(0) \left(s_i + \sum_{k=1}^{t'-1} r_{i,k} \cdot X^k\right)\\
&= \left(\sum_{i\in H} \lagr_i^H(0) f(i)\right) + \left(\sum_{i\in H}\lagr_i^H(0) \left(\sum_{k=1}^{t'-1} r_{i,k} \cdot X^k\right)\right)\\
&= s + \sum_{i\in H}\lagr_i^H(0) \left(\sum_{k=1}^{t'-1} r_{i,k} \cdot X^k\right)\\
&\stackrel{\mathsf{def}}{=} s + \sum_{k=1}^{t'-1} {\color{green}r_k} X^k
\end{align}
In other words, $[s, r_1, r_2,\ldots,r_{t'-1}]$ are the coefficients of the polynomial obtained from the linear combination of the $r_i(X)$’s by the Lagrange coefficients $\lagr_i^H(0)$.
In more detail:
\begin{align}
r(X) &= s + \left(\begin{matrix}
&\lagr_{i_1}^H(0) \left(\sum_{k=1}^{t'-1} r_{i_1,k} \cdot X^k\right) + {}\\
&\lagr_{i_2}^H(0) \left(\sum_{k=1}^{t'-1} r_{i_2,k} \cdot X^k\right) + {}\\
&\ldots\\
&\lagr_{i_{|H|}}^H(0) \left(\sum_{k=1}^{t'-1} r_{i_{|H|},k} \cdot X^k\right)\\
\end{matrix}\right)
\end{align}
Let ${\color{green}c_{i_j, k}} \stackrel{\mathsf{def}}{=} \lagr_{i_j}^H(0) \cdot r_{i_j, k}$.
Then, we can rewrite the above as:
\begin{align}
r(X) &= s + \left(\begin{matrix}
&\sum_{k=1}^{t'-1} c_{i_1,k} \cdot X^k + {}\\
&\sum_{k=1}^{t'-1} c_{i_2,k} \cdot X^k + {}\\
&\ldots\\
&\sum_{k=1}^{t'-1} c_{i_{|H|},k} \cdot X^k\\
\end{matrix}\right)
\end{align}
Let ${\color{green}r_k}\stackrel{\mathsf{def}}{=} \sum_{i_j \in H} c_{i_j, k}$.
Then, we can rewrite the above as:
\begin{align}
r(X) &\stackrel{\mathsf{def}}{=} s + \sum_{k=1}^{t'-1} r_k X^k
\end{align}
And, as we saw in Equation \ref{eq:newshare} above, any new player $j\in[n']$ can get their share of $r(X)$ via:
\begin{align}
z_j
&= \sum_{i\in H} \lagr_i^H(0) r_i(j)\\
&= r(j)
\end{align}
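The whole resharing protocol, and the claim that any $t'$ new players recover $s$, can be checked end-to-end with a Python sketch (toy modulus and names are mine):

```python
import random

p = 2**61 - 1  # toy prime field modulus

def lagrange0(points):
    """Lagrange coefficients at 0 for the given evaluation points (mod p)."""
    lam = {}
    for i in points:
        num, den = 1, 1
        for k in points:
            if k != i:
                num = num * (-k % p) % p
                den = den * ((i - k) % p) % p
        lam[i] = num * pow(den, p - 2, p) % p
    return lam

def share(s, t, n):
    """t-out-of-n Shamir sharing of s."""
    coeffs = [s] + [random.randrange(p) for _ in range(t - 1)]
    return {i: sum(c * pow(i, k, p) for k, c in enumerate(coeffs)) % p
            for i in range(1, n + 1)}

s = 1234
t, n = 3, 5    # old t-out-of-n sharing
t2, n2 = 4, 7  # target t'-out-of-n' sharing

old = share(s, t, n)
H = [1, 2, 4]          # the t old players that reshare
lam = lagrange0(H)

# Each old player i in H "shares their share"; new player j's share is the
# Lagrange-weighted sum of the sub-shares z_{i,j} = r_i(j).
sub = {i: share(old[i], t2, n2) for i in H}
new = {j: sum(lam[i] * sub[i][j] for i in H) % p for j in range(1, n2 + 1)}

# Any t' new players can now reconstruct s.
T2 = [2, 3, 5, 7]
lam2 = lagrange0(T2)
assert sum(lam2[j] * new[j] for j in T2) % p == s
```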
Big thanks to Benny Pinkas for pointing me to the BGW paper^{1} and for pointing out subtleties in what it means for an old player to correctly share their share.
Completeness Theorems for Non-cryptographic Fault-tolerant Distributed Computation, by Ben-Or, Michael and Goldwasser, Shafi and Wigderson, Avi, in Proceedings of the Twentieth Annual ACM Symposium on Theory of Computing, 1988, [URL] ↩ ↩^{2}
Asynchronous Verifiable Secret Sharing and Proactive Cryptosystems, by Cachin, Christian and Kursawe, Klaus and Lysyanskaya, Anna and Strobl, Reto, in Proceedings of the 9th ACM Conference on Computer and Communications Security, 2002, [URL] ↩
This step is non-trivial and is where most protocols work hard to achieve efficiency. For example, see ^{2}. Publicly-verifiable secret sharing (PVSS) on a public bulletin board such as a blockchain is a simple (albeit naive) way of achieving this: there will be $n$ PVSS transcripts, one for each reshared $s_i$, and everyone can agree on the set $H$ of valid transcripts. ↩ ↩^{2}
I once ran into a video where Neil deGrasse Tyson, in relation to a debate with folks who didn’t “believe”^{1} in global warming nor in evolution, said the following:
“The good thing about science is that it’s true whether or not you believe in it” – Neil deGrasse Tyson
You can see the short video below:
The way to conceptualize, and popularize^{2}, science is not as being “true” or “false.”
Science is a process that we engage in to discover truths. And it often leads us astray^{3}$^,$^{4}, which is why the idea of “science being true” is at best misleading and at worst dangerous.
Science is not a belief in any sense of the word. Although many religions might be based on belief, science is an entirely different beast.
Science requires not belief, but experimentation, theory postulation and theory falsification (or refinement).
Those are fancy words, but here’s an example everyone can understand:
Question: How might we find out if there is a universal acceleration that falling objects have? Is it $5\ m/s^2$? Is it $10\ m/s^2$?
Poor answer: We could just blindly accept whatever answer Neil DeGrasse Tyson gives us. After all “science is true whether you believe it or not,” no?
Better answer: No. That is a kind of “scientism” akin to religious belief. Instead, we could engage in the scientific process:
Richard Feynman, a physicist you might know^{6} from his work on quantum electrodynamics (QED)^{7}, explains very beautifully:
What is inherent in science is not “truth” but “uncertainty.”
In one of his famous physics lectures, Feynman beautifully elucidates the leap of faith scientists must make when going from a concrete scientific experiment to a general law that predicts beyond what the experiment tested for.
I include an excerpt below, but I find the video to be much more convincing (and entertaining) to watch:
In the video above, Feynman genuinely asks the audience:
Why are we able to extend our laws to regions that we’re not sure?
How is it possible?
Why are we so confident??
He then explains that extending such general laws is the thing we must do if we are to learn anything new beyond the outcome of the concrete experiment. The price we pay, of course, is we (scientists) stand to be proven wrong in the future. In other words, the nature of scientific laws or theories is they are uncertain, subject to be fully-falsified or partially-refined.
It’s not a mistake to say that [the law] is true in a region where you haven’t looked yet.
If you will not say that it’s true in a region that you haven’t looked yet, [then] you don’t know anything!
[In other words,] If the only laws that you find are those which you just finished observing, then… you can’t make any predictions!
[But] the only utility of the science is to go on and to try and make guesses.
[…]
So what we do is always to stick our neck out!
And that of course means that the science is uncertain!
[…]
We always must make statements about the regions that we haven’t seen, or [else] there’s no use in the whole business.
Feynman reiterates on this point to make it stick:
We do not know all the conditions that we need for an experiment!
[…]
So in order to have any utility at all to the science…
In order not simply to describe an experiment that’s just been done, we have to propose laws beyond their range.
And there’s nothing wrong with that. That’s the success; that’s the point!
And, uhh, that makes the science uncertain.
If you thought before that science is certain, well, that’s just an error on your part.
Please don’t be fooled into thinking “science is true” (whatever that’s supposed to mean). Such claims are nonsensical.
First, they are nonsensical, because science is a process. Processes cannot be “true” or “false”; they are just a way of doing things. Second, because scientific theories are (mostly) falsifiable, which is a fancy way of saying there is room to prove them wrong. (i.e., they might actually be false!)
If one cannot rely on science “to be true”, what should one do instead?
The first option is to take a scientific theory and try to falsify it. For example, Einstein picked Newton’s laws of motion, showed they cannot properly describe motion for objects moving close to the speed of light and generalized them. And that’s how we got the theory of relativity and, apparently, GPS on our phones.
Note: I look at this as falsifying Newton’s theories, which were simply not going to work accurately in Einstein’s extreme conditions, while others might look at it as refining them (since Newton’s laws of motion still approximate things very well).
The second option is to devise an experiment, make some observations, generalize them into a theory, and see if your theory holds water by checking if it can predict anything useful. Then, you can go back to the first option and try to falsify your theory.
In other words, engage in science as opposed to “believing in science.”
If you must “believe in science [the scientific process]”, realize you are simply deferring to the authority of other scientists and the soundness of the scientific peer review process.
Yet other scientists are people like you and me.
And we make mistakes, are inherently ignorant or have perverse incentives^{8}.
I recognize that engaging in the scientific process is a very high bar to meet. I also recognize it’s not clear how to reach consensus faster on scientific theories of high importance such as anthropogenic climate change. But I do know that preaching “science is true; believe science” is a steadfast way of moving from scientific (falsifiable) territory into religious (unfalsifiable) territory. This would defeat the original goals of science: to ensure we find out when we are wrong and, as a result, get a bit closer to the truth.
The most widely-accepted philosophy of science comes from Karl Popper, who argued that a theory or statement is scientific only if it can, in principle, be empirically disproven, a.k.a. falsified.
In simpler words, if a theory does not admit any tests that could prove it false, then that theory is not scientific.
But not all philosophers of science agree with Popper. Some argue that scientific theories must be both falsifiable and empirically-confirmed (to some degree).
Others argue that scientific theories can be unfalsifiable “in practice” as long as they either:
Nonetheless, the power of scientific (falsifiable) theories and of the scientific method does not mean we should dispense with all unfalsifiable theories! That would be anti-human, as I’ll argue below.
A contrived-but-simple example: your friend tells you “There exist pink cars with yellow stripes!” That is an unfalsifiable theory; it would be impossible to disprove. You’d have to somehow show your friend that all cars in the universe don’t match this description, which would take infinite time. Yet, despite being unfalsifiable (and thus unscientific), the theory can be shown to be true by a trivial observation of such a car. And indeed, I know a person with such a car (so I know the theory to be true)!^{9}
So, should you dismiss every unfalsifiable (non-scientific) theory coming out of people’s mouths? No, because we all make such unfalsifiable statements all the time and we get along just fine: e.g., “I had a delicious almond croissant this morning.”
Another example: religious theories (e.g., “there is a God”) fall into the same category of unfalsifiable but potentially-true theories. In fact, people who claim to have had direct experiences of the divine are firmly-convinced of their veracity. Much like your friend above who saw a pink car with yellow stripes that you did not see.
This is not to say that religious theories should be thrown away (i.e., I recognize the limitations of my own consciousness). But it is to say that one should be very careful how they act on such theories. After all, if the religious theory is false, there is no way to find out; it’s unfalsifiable!
Pro tip: In layman terms, don’t burn or kill people because they don’t hold the same unfalsifiable beliefs as you.
I’ll leave you with a last example. My wise mother once threw this at me when I was being a smartass about how bad unfalsifiable theories can be. She said “Okay, how about the theory that love exists in the world. Are you gonna throw away that theory too because it’s unfalsifiable?”. (If you don’t know this theory to be true, then you have other problems and I wish I could give you a warm hug.)
Science is not simply “true”. It is a process that you could engage in. It is not an axiom that you take for granted. It is not an authority that you defer to.
If you don’t have time to engage in science, you can trust-but-verify. However, there is a risk you’ll be deceived by:
Sentences like these should be critically inspected.
Rule of thumb: Mentally remove the word “science” (and its derivatives) from sentences. See if those sentences still sound convincing. If they do not, something is being left out and you could investigate. More and more, the word “science” is used authoritatively without so much as a citation to the relevant scientific work(s).
I do admire deGrasse Tyson’s efforts to bring the scientific process to the masses. And I’m sure he meant well in (mis)stating that “science is true” (e.g., perhaps he meant to convey, in an entertaining way, his own confidence in the scientific process).
Nonetheless, the conflation of “truth” with “science” and the implication that one should “believe science” worries me. I, for one, find this borderline-dangerous in our increasingly-polarized society, which is more and more filled with separating beliefs. Adding science to this list of beliefs would not serve anyone.
I hope this blog post clearly articulated that science is a (fallible) process and must not be blindly trusted.
I write “believe” in quotes because I find the usage of the term “believe” to be over-simplifying when it comes to how one should engage with complex scientific theories like the theory of anthropogenic climate change. In other words, to simply have to pick between “Do you or do you not believe in global warming?” is an unproductive way of getting any clarity on the causes of global warming. A better way might be to ask someone “What evidence is there for anthropogenic climate change and have you taken a close look at it?”. (PS: This blog post is not about the climate change issue.) ↩
I do want to recognize Neil deGrasse Tyson’s amazing efforts in popularizing science by speaking about it in a particularly entertaining way. In a way, that’s likely the problem behind his “science is true” claim: he did not carefully balance between entertainment and actual education. ↩
For example, in some less enlightened parts of the world, the “scientific” belief used to be that Earth was in the center of our solar system and that the Sun and stars revolved around it. Galileo Galilei was put into house arrest by the Roman Catholic church for the “heresy” of giving evidence that this theory may be wrong. ↩
Another falsified theory was the “Aether Theory.” Aether was believed to be a medium that filled space and enabled the propagation of light. The famous Michelson-Morley experiment in 1887 failed to detect aether, leading to the eventual acceptance of Einstein’s theory of relativity, which does not require the existence of aether. ↩
This principle of air resistance was famously demonstrated, albeit on the Moon, by astronaut David Scott during the Apollo 15 mission. He dropped a feather and a hammer side by side on the Moon. Due to the absence of air resistance on the Moon, they hit the lunar surface at the same time, providing a visual demonstration of this principle. ↩
You might also know Feynman from his hilarious autobiographical books such as “Surely you’re joking, Mr. Feynman! (Adventures of a Curious Character)“. ↩
Feynman, Julian Schwinger and Shin’ichirō Tomonaga were jointly awarded the Nobel Prize in 1965 for their contributions to QED. ↩
Several books can be (and probably have been) written on the perverse incentives in academia, in scientific peer-review, science funding, etc. ↩
I do not. Just trying to make a point. ↩
$$ \def\Adv{\mathcal{A}} \def\Badv{\mathcal{B}} \def\vect#1{\mathbf{#1}} $$
Non-membership proofs in Merkle trees are surprisingly elusive to many people. The problem statement is very simple:
Suppose you have a server who wants to authenticate elements of a set $S$ to a client without ever sending the whole set to this client.
(For simplicity, let’s assume this is a set of numbers.)
Specifically, the server first computes a succinct authentication digest of the set, denoted by $d$, and sends $d$ to the client.
Then, the server is able to prove either membership or non-membership of an element in the set by sending a succinct proof to the client which the client can efficiently verify with respect to the digest $d$.
Design a Merkle tree-based solution for this problem.
The most popular solution to this problem seems to be to build a Merkle tree whose leaves are sorted. This, unfortunately, is a rather sub-optimal solution, both from a security and a complexity point of view.
In this blog post, I hope to dispel the myth of the effectiveness of this sorted-leaves Merkle tree scheme.
Recall that, if we only need to prove membership, it is very easy to solve the problem by building a Merkle tree over all elements in the set and letting the digest be the Merkle root.
For example, here’s how this would look for a particular choice of set $S$. (Original slides here.)
Then, a membership proof would be a Merkle sibling path to the proved element’s leaf (i.e., the nodes in yellow):
The client can easily verify the proof by computing the hashes along the path from the leaf to the root, and checking that it obtains the same root hash as it has stored locally:
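This membership-proof flow can be sketched in a few lines of Python (the hash encoding, helper names, and the power-of-two leaf count are my own choices):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build(leaves):
    """Merkle tree over byte-string leaves (a power of two of them), bottom-up."""
    level = [H(b"leaf:" + x) for x in leaves]  # domain-separate the leaves
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, idx):
    """Membership proof: the sibling hash at each level, plus its side."""
    path = []
    for level in levels[:-1]:
        path.append((level[idx ^ 1], (idx ^ 1) < idx))  # (hash, sibling_is_left)
        idx //= 2
    return path

def verify(root, leaf, path):
    """Hash from the leaf up along the sibling path, compare to the root."""
    h = H(b"leaf:" + leaf)
    for sib, sib_is_left in path:
        h = H(sib + h) if sib_is_left else H(h + sib)
    return h == root

S = [b"2", b"13", b"6", b"11", b"9", b"7", b"14", b"3"]  # the set, unsorted
levels = build(S)
root = levels[-1][0]  # the digest d sent to the client
proof = prove(levels, S.index(b"9"))
assert verify(root, b"9", proof)  # membership of 9 verifies against d
```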
It seems that many people believe sorting the leaves is the right approach to enable non-membership proofs.
This blog post will argue, from three different perspectives, why this is a sub-optimal choice.
Okay! Let say that, instead of the solution from above, the server does indeed first sort the set $S$ as $[2, 3, 6, 7, 9, 11, 13, 14]$ and then computes the Merkle tree:
Clearly, the server can still prove membership as before: just give a Merkle sibling path to the proved element’s leaf.
But, now, it is also possible to prove non-membership of an element.
For example, we can prove non-membership of $8$ by showing that (a) both $7$ and $9$ are in the tree and (b) that they are adjacent. This implies there’s no room where 8 could fit. Therefore, 8 cannot be in the tree:
In other words, the two membership proofs for the adjacent leaves of $7$ and $9$ constitute a non-membership proof for 8, which would have to be placed between them (but cannot be since “there’s no room”).
Can you spot the security issue? It’s a bit subtle and many people miss it…
Here it is: this scheme is secure only if the server correctly computes the Merkle tree over the sorted leaves.
Otherwise, if the server is malicious, it can re-order the leaves and pretend that an element $e$ is both in the set and not in the set.
For example, the malicious server could compute the tree as follows:
Note that the malicious server left 7 adjacent to 9, so that it can still give what appears to be a valid non-membership proof for 8:
At the same time, note that the malicious server inserted a leaf for 8 somewhere else. As a result, the server can still give what appears to be a valid membership proof for 8:
This, of course, is very bad: the server was able to prove two inconsistent statements about the membership of 8 in the digested set. Put differently, it clearly cannot be that 8 is both in $S$ and not in $S$ at the same time. Therefore, the sorted-leaves Merkle tree is insecure when the server cannot be trusted to produce correct digests (and we’ll define security below).
In other words, this type of attack completely ruins security: it makes any proof meaningless to the client (e.g., proof that $8\in S$), since it could easily be followed by a contradicting proof (e.g., a contradicting proof that $8\notin S$).
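The attack can be reproduced with a short Python sketch (the helper names are mine; the `verify_nonmembership` check mirrors the adjacency argument above and, crucially, implicitly trusts that the leaves were sorted):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def int_leaf(x: int) -> bytes:
    return str(x).encode()

def build(leaves):
    """Merkle tree over the leaves (a power of two of them), bottom-up."""
    level = [H(b"leaf:" + x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, idx):
    """Sibling path: (sibling_hash, sibling_is_left) at each level."""
    path = []
    for level in levels[:-1]:
        path.append((level[idx ^ 1], (idx ^ 1) < idx))
        idx //= 2
    return path

def verify(root, leaf, path):
    h = H(b"leaf:" + leaf)
    for sib, sib_is_left in path:
        h = H(sib + h) if sib_is_left else H(h + sib)
    return h == root

def path_index(path):
    """Recover the proved leaf's index from the is-left-sibling bits."""
    return sum(1 << d for d, (_, left) in enumerate(path) if left)

def verify_nonmembership(root, e, lo, hi, proof_lo, proof_hi):
    """Accept if lo < e < hi and lo, hi verify at *adjacent* leaf positions.
    This check is only sound if the server actually sorted the leaves!"""
    return (lo < e < hi
            and path_index(proof_hi) == path_index(proof_lo) + 1
            and verify(root, int_leaf(lo), proof_lo)
            and verify(root, int_leaf(hi), proof_hi))

# Malicious server: keeps 7 and 9 adjacent, but hides a leaf for 8 elsewhere.
leaves = [int_leaf(x) for x in [2, 3, 6, 7, 9, 8, 13, 14]]
levels = build(leaves)
root = levels[-1][0]

# A seemingly-valid non-membership proof for 8 (via adjacent 7 and 9)...
assert verify_nonmembership(root, 8, 7, 9, prove(levels, 3), prove(levels, 4))
# ...and, simultaneously, a valid membership proof for 8. Contradiction!
assert verify(root, int_leaf(8), prove(levels, 5))
```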
Not all hope is lost. In some settings, it can be reasonable to assume the digest (i.e., Merkle root) was produced correctly.
For example, in distributed consensus settings (a.k.a., in “blockchains”), there is no single server that dictates what the Merkle root of the data is. Instead, all $n = 3f+1$ servers try to compute the same correct root and vote on it. Servers who deviate from the correct root are ignored and consensus is reached on the correct one by a subset of $2f+1$ honest servers.
Therefore, in this setting, it is okay to rely on the sorted-leaves Merkle tree construction. (I’ll still argue here and here why you shouldn’t, but from different perspectives.)
Other harmless settings include single-client data outsourcing, where a client sorts & Merkle hashes his own data correctly, and transfers everything but the Merkle root to a malicious server.
Since the client has computed the correct root on his own, the client can rely on the server’s (non)membership proofs.
One thing worth emphasizing is that ad-hoc fixes to the problem of a potentially-incorrect digest are not worth it, especially since one can get a construction that needs no fixing from, e.g., a Merkle trie. Specifically, it is not worth it to require the server to prove that it correctly sorted the leaves (e.g., via a SNARK). Also, it is not worth it to rely on fraud proofs when one can have provably-correct behavior all the time. Lastly, it is not worth it to probabilistically audit the data structure to see if you can find two incorrectly-sorted leaves. None of these approaches are worth it because there exist more secure Merkle tree constructions like Merkle tries. Plus, these constructions are easier to update and have smaller proof sizes!
We can formalize the setting in which authenticated set constructions (like the sorted-leaves Merkle tree) are secure.
Specifically, we can define a notion of weak (non)membership soundness that captures the idea that the malicious server must compute the digest correctly:
An authenticated set scheme has weak (non)membership soundness if for all (polynomial-time) adversaries $A$, the probability that $A$ outputs a set $S$, an element $e$, and two proofs $\pi$ & $\pi’$ such that, letting $d$ be the (correct) digest of $S$, $\pi$ verifies as a valid membership proof for $e$ (w.r.t. $d$) while $\pi’$ also verifies as a valid non-membership proof for $e$ (w.r.t. $d$), is negligible in the security parameter of the scheme.
Notice that the adversary outputs a set of elements from which the correct digest $d$ is computed.
In fact, there is a long line of academic literature on 2-party and 3-party authenticated data structures that rely on this type of weaker soundness definitions (see Papamanthou’s PhD thesis^{1} for a survey).
Unfortunately, many applications today inherently rely on untrusted publishers who can compute malicious digests of their data.
For example, in key transparency logs such as Certificate Transparency (CT), log servers can present any digest to new clients joining the system. Therefore, in this setting, authenticated data structures (whether sets or not), must satisfy a stronger notion of security which allows the adversary to construct the digest maliciously.
In fact, such a stronger notion simply requires that the adversary output the digest $d$ directly, which gives the adversary freedom to construct an incorrect one as in our attack above:
An authenticated set scheme has strong (non)membership soundness if for all (polynomial-time) adversaries $A$, the probability that $A$ outputs a digest $d$, an element $e$, and two proofs $\pi$ & $\pi’$ such that $\pi$ verifies as a valid membership proof for $e$ (w.r.t. $d$) while $\pi’$ also verifies as a valid non-membership proof for $e$ (w.r.t. $d$), is negligible in the security parameter of the scheme.
The moral of the story is to pick a Merkle construction that has this stronger notion of security, unless you are sure that your setting allows for the weaker notion and you stand to benefit from relaxing the security (e.g., perhaps because you get a faster construction). A good example of this is the KZG-based authenticated dictionary from Ethereum Research^{2} which has weak soundness (as would be defined for dictionaries), but that’s okay since their consensus setting can accommodate it.
This one is much easier to explain.
Imagine you want to add a new element in your sorted-leaves Merkle tree of size 8.
What if it is smaller than everything else and has to be inserted as the first leaf of the tree?
Then, you would have to completely rehash the entire tree to incorporate this new leaf! This would take $O(n)$ work in a tree of $n$ leaves.
The same problem arises if you’d like to remove the first leaf.
To deal with the slowness of insertions, one can take an amortized approach and maintain a forest of sorted-leaves Merkle trees, where (1) new leaves are appended to the right of the forest as their own size-1 trees and (2) trees of the same size $2^i$ for any $i \ge 0$ get “merged” together by merge-sorting their leaves and rehashing. One can show this approach has $O(\log{n})$ amortized insertion cost. However, such amortized approaches still suffer from $O(n)$ worst-case times and must be de-amortized to bring the worst-case cost down to the amortized cost^{3}.
On the other hand, dealing with deletions can be easier. Specifically, if you do not care about wasted space, then deletions can be done faster by simply marking the leaf as “removed” and trying to garbage-collect as many empty subtrees as you can. Nonetheless, in the worst case, the storage complexity of an $n$-leaf Merkle tree after $O(n)$ deletes remains $O(n)$ (e.g., imagine deleting every even-numbered leaf).
The other problem with the sorted leaves construction is that two Merkle paths must be given as a non-membership proof.
In the best case, this can be just $\log{n}-1$ hashes, but in the worst case it can be as many as $2\log{n}-2$ hashes (e.g., when one proof is in the left subtree and the other is in the right subtree).
This is not so great if proof size is a concern. It is also not so great when the Merkle tree is stored on disk since it can double the proof reading I/O cost.
Furthermore, actually achieving the best-case proof size complexity in an implementation can be tricky: the developer must efficiently batch the fetching of the two Merkle proofs from disk or memory, taking care never to fetch the same sibling hash twice (or waste I/O).
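For concreteness, here is a minimal Python sketch of the two-path non-membership proof on a sorted-leaves tree (helper names are mine; a real implementation would add $\pm\infty$ sentinel leaves for the boundary cases the assertions below punt on):

```python
import hashlib
from bisect import bisect_left


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """All levels of a Merkle tree; the number of leaves must be a power of two."""
    levels = [[H(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def auth_path(levels, idx):
    """Sibling hashes from leaf idx up to (but excluding) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[idx ^ 1])  # sibling at this level
        idx //= 2
    return path


def prove_non_membership(sorted_leaves, e):
    """Reveal the two adjacent leaves bracketing e, each with its own path."""
    levels = build_tree(sorted_leaves)
    i = bisect_left(sorted_leaves, e)
    assert 0 < i < len(sorted_leaves), "boundary cases need sentinel leaves"
    assert sorted_leaves[i] != e, "e is a member"
    l, r = i - 1, i
    return ((l, sorted_leaves[l], auth_path(levels, l)),
            (r, sorted_leaves[r], auth_path(levels, r)))
```

The verifier must check both paths against the digest, check that the two leaf indices are adjacent, and check that the left leaf $<$ $e$ $<$ the right leaf. The two paths share all siblings above their lowest common ancestor, which is exactly the deduplication opportunity (and implementation pitfall) mentioned above.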
…and you want to maintain strong (non)membership soundness, then there is a simple way to fix your construction.
All you have to do is store, inside each internal node of your tree, the minimum and maximum element in that node’s subtree.
Now, a Merkle proof, whether of membership or non-membership, must additionally reveal the min/max pairs along the proven path. Importantly, when hashing up to verify the Merkle proof, the verifier must ensure that the revealed leaf and all revealed min/max pairs are consistent and are hashed correctly as part of the verification.
This will of course further increase the proof size of your construction. It will also increase the complexity of implementing the verification procedure, since the min/max ranges have to be incorporated into the hashing and one must check that, for all revealed ranges in the proof, a parent's range encompasses its children's ranges.
Feel free to convince yourself by reproducing the attack from above: you'll see that while you can present one of the two proofs, you'll have difficulty presenting the other, because you will not be able to forge the authenticated min/max ranges. Thus, this construction has strong (non)membership soundness.
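Here is one way the min/max augmentation could be sketched in Python (the hashing layout and helper names are my own choices, not a standard): each internal hash binds its subtree's range, and verification rejects any sibling whose claimed range lies on the wrong side of the path.

```python
import hashlib


def H(*parts: bytes) -> bytes:
    """Hash with length-prefixed parts, to avoid concatenation ambiguity."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(2, "big") + p)
    return h.digest()


def build(leaves):
    """Sorted leaves -> (hash, min, max); internal hashes bind the subtree range."""
    if len(leaves) == 1:
        v = leaves[0]
        return (H(b"leaf", v), v, v)
    mid = len(leaves) // 2
    lh, llo, lhi = build(leaves[:mid])
    rh, rlo, rhi = build(leaves[mid:])
    return (H(b"node", llo, rhi, lh, rh), llo, rhi)


def verify_membership(root, leaf, path):
    """path: (direction, sib_hash, sib_min, sib_max) tuples, leaf to root;
    direction "L" means the sibling is the *left* child."""
    h, lo, hi = H(b"leaf", leaf), leaf, leaf
    for direction, sh, smin, smax in path:
        if direction == "L":
            if not smax < lo:      # left sibling must lie strictly below us
                return False
            h, lo = H(b"node", smin, hi, sh, h), smin
        else:
            if not hi < smin:      # right sibling must lie strictly above us
                return False
            h, hi = H(b"node", lo, smax, h, sh), smax
    return h == root
```

The last line of the loop is where a mis-ordering attack dies: an adversary cannot present a sibling on the "wrong" side without forging a range that the parent's hash does not bind.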
This deserves its own post, but here are the key reasons you should probably use a Merkle trie:
There are of course some disadvantages too, but I find them negligible:
In fact, some folks argue that the best trie implementation is via critbit trees^{5}. Unfortunately, I do not know enough about their benefits, especially when Merkleized, but this is probably very much worth exploring.
Kocher^{6} proves non-revocation of certificates via a sorted-leaves-like approach. His approach Merkleizes a sorted list of certificate-ID ranges. Specifically, each leaf is a pair $(a, c)$ that says $a$ has been revoked but all certificates $b$ such that $a < b < c$ have not been revoked.
Thus, one can prove non-revocation of $b$ by revealing the leaf $(a, c)$ that encompasses the non-revoked ID $b \in (a, c)$. One can also prove revocation of $a$ by revealing the leaf $(a, c)$.
A depiction of the sorted-leaves-like approach from Kocher’s original paper^{7}. The set of elements being authenticated here (i.e., revoked certificates) is $S = \{5, 12, 13, 15, 20, 50, 99\}$.
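The lookup in Kocher's range-leaf encoding can be sketched as follows (the helper name is hypothetical), using the example set $S = \{5, 12, 13, 15, 20, 50, 99\}$ from the figure:

```python
from bisect import bisect_right


def find_range_leaf(ranges, b):
    """ranges: sorted list of (a, c) leaves, meaning a is revoked and every ID
    strictly between a and c is not. Returns the leaf covering b, if any
    (a real deployment would add sentinel ranges below the smallest and
    above the largest revoked ID)."""
    starts = [a for a, _ in ranges]
    i = bisect_right(starts, b) - 1  # rightmost leaf with a <= b
    if i < 0:
        return None
    a, c = ranges[i]
    if b == a:
        return ("revoked", (a, c))
    if a < b < c:
        return ("not-revoked", (a, c))
    return None
```

The prover then reveals the found leaf together with its single Merkle path, so both revocation and non-revocation proofs cost one path instead of two.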
Of course, Kocher’s approach is vulnerable to the same mis-ordering attack we discussed above. (Furthermore, it also suffers from inefficiency of updates.)
Indeed, Buldas et al.^{7} point out the mis-ordering attack and solve the problem by Merkleizing a binary search tree (BST) instead, which they baptize as an authenticated search tree. However, as far as I could tell, the paper does not describe how to efficiently update such authenticated search trees while keeping them balanced (i.e., solve problem 2).
A depiction of the authenticated (binary) search tree approach from Buldas et al.’s original paper^{7}. The set of elements being authenticated here is $S = \{10, 12, 30, 40, 42, 56, 70, 80\}$.
Fortunately, a few years earlier, Naor and Nissim^{8} had proposed an authenticated 2-3 tree construction which did solve the problem of efficient updates, addressing all problems highlighted in this post. Surprisingly, Naor and Nissim did not point out the mis-ordering attack on Kocher’s work, only the inefficiency of updating it. Also surprisingly, there are no pictures of trees in their paper :(
I still find Merkle tries much easier to implement, but I never tried implementing a 2-3 tree.
Hopefully, this post gave you enough context on the problems of this popular sorted-leaves Merkle tree construction.
This leaves me wondering: are there any advantages to sorted-leaves Merkle trees?
The only advantage I see is that MHTs with sorted leaves are easy to describe: just sort the leaves, Merkleize them and prove non-membership of an element $e$ by revealing the two paths to the adjacent leaves that exclude $e$.
However, just because they are easy to describe does not mean they are easy to understand.
At least, from the questions and answers I see online, and from conversations with researchers and other engineers, their security caveats are not well understood.
First, my own answer on StackExchange makes an unfortunate use of the “sorted Merkle tree” terminology to refer to either a binary search tree^{9}, a trie, or a Sparse Merkle tree (SMT), which actually all have strong (non)membership soundness. Even worse, tries and SMTs are not really sorted, since data is typically hashed before being mapped into the trie.
Another StackExchange answer seems to perpetuate the myth that all you need for non-membership security is to sort the leaves, without paying attention to the weak (non)membership soundness guarantees of such a construction.
The answer quotes this post, where a sorted-leaves Merkle tree solution is described to solve a non-membership problem like the one in the intro. Unfortunately, the answer discards the nuance of the quoted post: there, the original author realized that the leaves could be incorrectly sorted and resorted to fraud proofs to catch such misbehaviour; i.e., if someone detects a mis-ordered tree, they can easily prove it with two Merkle paths to the out-of-order leaves.
Yet a much easier and cheaper solution would have been to use an authenticated set with strong (non)membership soundness as defined above (e.g., a Merkle trie). This would have simplified the higher-level protocol, since it would have removed the need for fraud proofs, which are clearly less desirable when one can have provably-correct behavior all the time.
Oh well, we live and learn. Don’t sort your Merkle tree’s leaves, okay? Use a Merkle trie.
And, if you somehow find a reason to sort your leaves, please let me know what the advantages were. Don't forget to compare to more secure solutions such as Merkle tries, which have strong (non)membership soundness.
Cryptography for Efficiency: New Directions in Authenticated Data Structures, by Charalampos Papamanthou, 2011, [URL] ↩
Multi-layer hashmaps for state storage, by Dankrad Feist, 2020, [URL] ↩
Static-to-dynamic transformations, by Jeff Erickson, 2015, [URL] ↩
CONIKS: Bringing Key Transparency to End Users, by Marcela S. Melara and Aaron Blankstein and Joseph Bonneau and Edward W. Felten and Michael J. Freedman, in 24th USENIX Security Symposium (USENIX Security 15), 2015, [URL] ↩
Shoutout to Alnoki, the cofounder of Econia Labs who brought crit-bit trees to my attention. ↩
On certificate revocation and validation, by Kocher, Paul C., in Financial Cryptography, 1998 ↩
Accountable certificate management using undeniable attestations, by Ahto Buldas and Peeter Laud and Helger Lipmaa, in ACM CCS’00, 2000, [URL] ↩ ↩^{2} ↩^{3}
Certificate Revocation and Certificate Update, by Moni Naor and Kobbi Nissim, in 7th USENIX Security Symposium (USENIX Security 98), 1998, [URL] ↩
Note that a binary-search tree (BST) is a tree where all left descendants of a node are smaller than that node & all right descendants of a node are greater than that node. Importantly, trees with sorted leaves are not conceptualized as binary search trees, since their data is stored in the leaves, not in the internal nodes. ↩
For more details, see this post on Decentralized Thoughts