tl;dr: A work-in-progress weighted PVSS for field elements using chunked ElGamal encryption and DeKART range proofs.
Related work
A recent lattice-based PVSS1 scheme reduces the overhead of elliptic curve scalar multiplications during dealing and verification by replacing the encryption scheme with a lattice-based one. Unfortunately, it introduces too much lattice arithmetic, resulting in an overall slower scheme than Chunky. On the other hand, it claims a much smaller 300 KiB transcript for $n=1024$ players2. Furthermore, [GHL21]1 has post-quantum privacy (though not post-quantum soundness, due to its use of Bulletproofs), which Chunky does not. (Assuming it is not used for bootstrapping DL cryptosystems, though.)
Groth’s NI-DKG3, henceforth Groth21, uses chunked ElGamal encryption with Schnorr-style NIZK proofs for correct sharing and correct chunking. In particular, it uses a novel relaxed ZK range proof via rejection sampling.
It has ??x faster dealing and ??x slower verification when benchmarked over BLS12-381. Worst-case decryption time is very slow, though.
When the goal is merely to PVSS a secret over any field $\F$, our comparison against Groth21 is unfair, since Groth21 can use faster curves: e.g., secp256k1 or Curve25519. But if the goal is to PVSS a secret over a pairing-friendly curve (e.g., the BLS12-381 curve), this comparison is fair and shows what Groth21 loses by staying away from pairings.
Groth21’s advantage is its versatility: it does not require pairings, which means it can be used in more settings. We note that our Chunky2 variant does not necessarily require pairings either, except for its use of the univariate DeKART range proof. In the future, we believe a sumcheck-based, multivariate variant of DeKART4 could be instantiated to avoid the use of pairings.
Golden PVSS5 is a novel design based on exponent VRFs (eVRFs). It features the smallest transcript sizes by elegantly avoiding the typical pitfalls: chunking, hidden-order groups, lattices. However, its reliance on general-purpose ZKPs and its currently-naive, non-batched, PLONK-based implementation make it very slow in practice. While this should be addressable with a better combination of eVRF and zkSNARK schemes, it is difficult to predict what speedup it would give.
There is also a very exciting line of work on class-group-based PVSS6$^,$7. These schemes can avoid chunking by relying on additively-homomorphic encryption schemes for field elements with efficient decryption8 (unlike ElGamal). Not surprisingly, the cgVSS transcript is 2.2x smaller than Chunky consistently across all $(t, n)$ threshold configurations. Furthermore, dealing time is roughly the same as Chunky. On the other hand, cgVSS transcript verification is 6-8x slower. In practice, the 2x smaller transcript may make up for the slower verification time in certain settings (e.g., DKGs).
The original cgVSS paper6 seems to over-estimate Groth21’s execution times.
Preliminaries
We assume familiarity with:
- PVSS, as an abstract cryptographic primitive.
- In particular, the notion of a PVSS transcript will be used a lot.
- Digital signatures
- i.e., sign a message $m$ as $\sigma \gets \sig.\sign(\sk, m)$ and verify via $\sig.\verify(\pk, \sigma, m)\equals 1$
- ElGamal encryption
- Batched range proofs (e.g., DeKART)
- ZKSoKs (i.e., $\Sigma$-protocols that implicitly sign over a message by feeding it into the Fiat-Shamir transform).
- The SCRAPE low-degree test
All of these will be described in more detail in the subsections below.
Notation
Pairing-friendly groups notation
- Let $\GGen(1^\lambda) \rightarrow \mathcal{G}$ denote a probabilistic polynomial-time algorithm that outputs bilinear groups $\mathcal{G} \bydef (\Gr_1, \Gr_2, \Gr_T)$ of prime order $p\approx 2^{2\lambda}$, denoted additively, such that:
- $\one{1}$ generates $\Gr_1$
- $\two{1}$ generates $\Gr_2$
- $\three{1}$ generates $\Gr_T$
- We use $\one{a}\bydef a\cdot \one{1}$ and $\two{b}\bydef b\cdot \two{1}$ and $\three{c}\bydef c\cdot \three{1}$ to denote multiplying a scalar by the group generator
- Let $\F$ denote the scalar field of order $p$ associated with the bilinear groups
- We often use capital letters like $G$ or $H$ to denote group elements in $\Gr_1$
- We often use $\widetilde{G}$ or $\widetilde{H}$ letters to denote group elements in $\Gr_2$
Time-complexity notation
- For time complexities, we use:
- $\Fadd{n}$ for $n$ field additions in $\F$
- $\Fmul{n}$ for $n$ field multiplications in $\F$
- $\Gadd{\Gr}{n}$ for $n$ additions in $\Gr$
- $\Gmul{\Gr}{n}$ for $n$ individual scalar multiplications in $\Gr$
- $\fmsm{\Gr}{n}$ for a size-$n$ MSM in $\Gr$ where the group element bases are known ahead of time (i.e., fixed-base)
- when the scalars are always from a set $S$, then we use $\fmsmSmall{\Gr}{n}{S}$
- $\vmsm{\Gr}{n}$ for a size-$n$ MSM in $\Gr$ where the group element bases are not known ahead of time (i.e., variable-base)
- when the scalars are always from a set $S$, then we use $\vmsmSmall{\Gr}{n}{S}$
- $\pairing$ for one pairing
- $\multipair{n}$ for a size-$n$ multipairing
Other notation
- $[n]\bydef \{1,2,\ldots,n\}$
- $[n) \bydef \{0,1,2,\ldots,n-1\}$
ElGamal encryption
Assuming familiarity with ElGamal encryption:
$E.\mathsf{KeyGen}_H()\rightarrow (\mathsf{dk},\mathsf{ek})$
Generate the key-pair:
\begin{align}
\dk &\randget\F\\
\ek &\gets \dk \cdot H
\end{align}
$E.\mathsf{Enc}_{G,H}\left(\mathsf{ek}, v; r\right) \rightarrow \left(C, R\right)$
Compute:
\begin{align}
C &\gets v \cdot G + r \cdot \ek\\
R &\gets r \cdot H
\end{align}
$E.\mathsf{Dec}_{G}\left(\mathsf{dk}, (C, R)\right) \rightarrow v$
\begin{align}
v &\gets \log_G\left(C - \dk\cdot R\right)\\
&= \log_G\left((v \cdot G + r \cdot \ek) - \dk\cdot (r \cdot H)\right)\\
&= \log_G\left(v \cdot G + r \cdot \ek - (r \cdot \dk) \cdot H\right)\\
&= \log_G\left(v \cdot G + r \cdot \ek - r \cdot \ek\right)\\
&= \log_G\left(v \cdot G\right) = v\\
\end{align}
Decryption will only work for sufficiently “small” values $v$ such that computing discrete logarithms on $v \cdot G$ is feasible (e.g., anywhere from 32 bits to 64 bits).
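To make this concrete, here is a toy Python sketch of the scheme above, written multiplicatively over the integers mod a prime $p$ (so $v \cdot G$ becomes $G^v \bmod p$). The modulus, the bases $G$ and $H$, and the brute-force discrete-log bound are all illustrative toy choices, not production parameters:

```python
import random

# Toy ElGamal over the integers mod a prime p, written multiplicatively:
# "v * G" in the text corresponds to pow(G, v, p) here. Illustrative only.
p = 2**61 - 1          # Mersenne prime (toy choice)
G, H = 3, 7            # toy bases standing in for the group generators

def keygen():
    dk = random.randrange(1, p - 1)
    return dk, pow(H, dk, p)                  # ek = dk * H

def enc(ek, v, r):
    C = pow(G, v, p) * pow(ek, r, p) % p      # C = v*G + r*ek
    R = pow(H, r, p)                          # R = r*H
    return C, R

def dec(dk, C, R, max_v=2**16):
    # Strip the mask to get v*G, then brute-force the small discrete log.
    M = C * pow(pow(R, dk, p), -1, p) % p     # M = C - dk*R = v*G
    acc = 1
    for v in range(max_v):
        if acc == M:
            return v
        acc = acc * G % p
    raise ValueError("v too large to decrypt")

dk, ek = keygen()
C, R = enc(ek, 12345, random.randrange(1, p - 1))
assert dec(dk, C, R) == 12345
```

In practice the brute-force loop would be replaced by a faster small-range discrete-log algorithm (e.g., baby-step giant-step), which is exactly why the scheme below chunks shares into small pieces.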
Univariate DeKART batched ZK range proofs
Assuming familiarity with batched ZK range proofs9. In particular, we will use univariate DeKART as our range proof scheme, formalized below.
$\dekart_b.\setup(N; \mathcal{G})\rightarrow (\mathsf{prk},\mathsf{ck},\mathsf{vk})$
Sets up the ZK range proof to prove batches of size $\le N$, returning a proving key, a commitment key and a verification key. (See implementation here.)
$\dekart_b.\commit(\ck,z_1,\ldots,z_N; \rho)\rightarrow C$
Returns a commitment $C$ to a vector of $N$ values using randomness $\rho$. (See implementation here.)
$\dekart_b.\prove(\prk, C, \ell; z_1,\ldots,z_N, \rho)\rightarrow \pi$
Returns a ZK proof $\pi$ that the $N$ values committed in $C$ are all in $[0, b^\ell)$. (See implementation here.)
$\dekart_b.\verify(\vk, C, \ell; \pi)\rightarrow \{0,1\}$
Verifies that the $N$ values committed in $C$ are all in $[0, b^\ell)$. (See implementation here.)
Zero-knowledge Signatures of Knowledge (ZKSoKs)
Assuming familiarity with ZKSoKs10, which typically consist of two algorithms:
$\sok.\prove(\mathcal{R}, m, \stmt; \witn) \rightarrow \pi$
Returns a ZK proof of knowledge of $\witn$ s.t. $\mathcal{R}(\stmt;\witn) = 1$ and signs the message $m$ in the process.
$\sok.\verify(\mathcal{R}, m, \stmt; \pi) \rightarrow \{0,1\}$
Verifies a ZK proof of knowledge of some $\witn$ s.t. $\mathcal{R}(\stmt;\witn) = 1$ and that the message $m$ was signed.
The $\mathcal{R}_\mathsf{e2k}$ ElGamal-to-KZG NP relation
One of the key ingredients in our PVSS will be a ZK proof of knowledge of share chunks such that they are both ElGamal-encrypted and KZG-committed.
This is captured via the NP relation below:
\begin{align}
\label{rel:e2k}
\term{\Retk}\left(\begin{array}{l}
\stmt = \left(G, H, \ck, \{\ek_i\}_i,\{C_{i,j,k}\}_{i,j,k}, \{R_{j,k}\}_{j,k}, C\right),\\
\witn = \left(\{s_{i,j,k}\}_{i,j,k}, \{r_{j,k}\}_{j,k}, \rho\right)
\end{array}\right) = 1\Leftrightarrow\\
\Leftrightarrow\left\{\begin{array}{rl}
(C_{i,j,k}, R_{j,k}) &= E.\enc_{G,H}(\ek_i, s_{i,j,k}; r_{j,k})\\
C& = \dekart_2.\commit(\ck, \{s_{i,j,k}\}_{i,j,k}; \rho)\\
\end{array}\right.
\end{align}
where the $s_{i,j,k}$’s will be “flattened” as a vector (in a specific order) before being input to $\dekart_2.\commit(\cdot)$.
We will explain how this flattening works later in the $\pvssDeal$ algorithm.
$\mathsf{SCRAPE}.\mathsf{LowDegreeTest}(\mathsf{evals}, t, n) \rightarrow \{0,1\}$
Checks that a vector of group-element commitments to polynomial evaluations actually determine a degree-$\le t$ polynomial. It exploits the fact that Shamir shares form a Reed-Solomon codeword, so their inner product with any dual codeword must be zero11.
Input: A list of $(n+1)$ evaluation commitments $\mathsf{evals} = \{(x_i, V_i)\}_{i \in [0, n]}$, where $V_i = a(x_i) \cdot \widetilde{G}$ for some polynomial $a(X)$ if the dealer is honest, and $t$ is the maximum allowed degree.
Step 1: Sample a random degree-$d$ polynomial $f(X) \in \F[X]$, where $d = n - t - 1$ (over the $n+1$ evaluation points, the dual of the degree-$\le t$ code corresponds to polynomials of degree $\le n - t - 1$, scaled by the $\ell_i$'s below):
\begin{align}
f_0, \ldots, f_d &\randget \F\\
f(X) &:= \sum_{j=0}^{d} f_j X^j
\end{align}
Step 2: Compute Lagrange-like coefficients $\ell_i$ for each evaluation point:
\begin{align}
\ell_0 &:= 1 / \prod_{j \in [n]} (x_0 - x_j)\\
\forall i \in [n],\quad \ell_i &:= 1 / \left((x_i - x_0) \cdot \prod_{j \neq i, j \in [n]} (x_i - x_j)\right)
\end{align}
Step 3: Check that the inner product with the random dual codeword is zero: \begin{align} \textbf{assert}\ 0 \equals \ell_0 \cdot f(x_0) \cdot V_0 + \sum_{i \in [n]} \ell_i \cdot f(x_i) \cdot V_i \end{align}
Typically, the evaluation domain $(x_1, \ldots, x_n)$ is FFT-friendly, so the $f(x_i)$’s are computable in $\Fmul{O(n\log{n})}$. $f(x_0)$ may add another $\Fmul{n}$ operations, but since typically $x_0 = 0$, we have $f(x_0) = f_0$.
Why it works: The vector $(\ell_0 \cdot f(x_0), \ell_1 \cdot f(x_1), \ldots, \ell_n \cdot f(x_n))$ is a random codeword from the dual of the Reed-Solomon code $C = \{(a(x_0), \ldots, a(x_n)) : \deg(a) \le t\}$. If the committed values actually lie on a degree-$\le t$ polynomial, the inner product is zero. If not, it is nonzero with probability $\ge 1 - 1/p$.
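Since the check is linear in the $V_i$'s, its math can be exercised over plain field elements instead of group-element commitments. Below is a toy Python sketch of the test, using a random dual polynomial of degree $n - t - 1$ (the dual code's degree bound for $n+1$ points); the prime and parameters are illustrative:

```python
import random

# Toy SCRAPE low-degree test over a prime field. The real test runs "in the
# exponent" on commitments V_i; the check is linear, so the field version
# below exercises the same math. Points: x_0 = 0 plus x_1, ..., x_n.
p = 2**31 - 1

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):                # Horner evaluation
        acc = (acc * x + c) % p
    return acc

def low_degree_test(evals, t):
    # evals = [(x_i, v_i)] with n+1 entries; accept iff they lie on a
    # polynomial of degree <= t (false accept w.p. 1/p over the random f).
    n = len(evals) - 1
    d = n - t - 1
    f = [random.randrange(p) for _ in range(d + 1)]   # random dual polynomial
    xs = [x for x, _ in evals]
    total = 0
    for i, (x, v) in enumerate(evals):
        denom = 1
        for j, xj in enumerate(xs):
            if j != i:
                denom = denom * (x - xj) % p
        li = pow(denom, -1, p)                # Lagrange-like coefficient
        total = (total + li * poly_eval(f, x) % p * v) % p
    return total == 0

n, t = 10, 3
a = [random.randrange(p) for _ in range(t + 1)]       # honest: degree <= t
pts = [(x, poly_eval(a, x)) for x in range(n + 1)]    # x_0 = 0, ..., x_n = n
assert low_degree_test(pts, t)
bad = pts[:5] + [(5, (pts[5][1] + 1) % p)] + pts[6:]  # corrupt one share
assert not low_degree_test(bad, t)
```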
Building a DKG from a PVSS
Our goal is to get a weighted DKG12 for field elements amongst the validators of a proof-of-stake blockchain, such that the DKG (final, shared) secret $\term{z}$ is only reconstructable by a fraction $> \term{\threshQ}$ of the stake (e.g., $\threshQ = 0.5$ or 50%).
How? Each validator $i$ will “contribute” to $z$ by picking their own secret $\term{z_i} \in \F$ and dealing it to the other validators via $\term{\pvssDeal}$ in a non-malleable fashion such that only a $> \emph{\threshQ}$ fraction of the stake can reconstruct $z_i$. The DKG secret will be set to $z \bydef \sum_{i\in Q} z_i$, where $\term{Q}$ is the eligible (sub)set of validators who correctly dealt their $z_i$.
Crucially, $Q$ must be large “enough”: i.e., it must have “enough” validators to guarantee that no malicious subset of them can learn (or can bias the choice of) $z$. For example, we could assume only 33% of the stake13 is malicious and require that $Q$ have more stake than that. We denote the stake of the validators in $Q$ by $\term{\norm{Q}}$.
The DKG is parameterized by $\norm{Q}$ and by $\threshQ$. Since, typically in a DKG, the same set of validators will deal a secret amongst themselves, $\norm{Q}$ and $\threshQ$ are typically set to the same value. Otherwise, if $\norm{Q} < \threshQ$, then the validators in $Q$ could reconstruct the secret even though they do not have $> \threshQ$ of the stake, which defeats the point. Alternatively, if $\norm{Q} > \threshQ$, then the protocol would require more validators to contribute than needed for secrecy, since a fraction $\threshQ < \norm{Q}$ of the stake can already reconstruct.
First, to publicly prove that $\norm{Q}$ is large “enough”, each validator will digitally sign their dealt PVSS transcript in a domain-separated fashion (part of the domain separator will be the current consensus epoch number). Without such authentication, $Q$ could be filled with transcripts from one malicious validator impersonating the other ones. Therefore, that malicious validator would have full knowledge of the final DKG secret $z$. ($\Rightarrow$ Not good.)
Implication: The DKG protocol needs to be carefully crafted to sign the PVSS transcripts. If done right, the validators’ public keys used to sign blockchain consensus messages can be safely reused as signing public keys for the transcript. (If done right.)
Second, we require that PVSS transcripts obtained from $\pvssDeal$ be non-malleable. To see why this is necessary, consider the following scenario:
- two validators $i$ and $j$ have enough stake to form an eligible subset $Q = \{i,j\}$ with $\norm{Q} > \threshQ$
- $j$ by itself does not have enough stake
- $i$ deals $z_i \in \F$ and signs the transcript
- $j$ removes $i$’s signature and mauls $i$’s transcript to deal $-z_i + r$ for some $r\randget\F$ it knows
- $j$ signs this mauled transcript
- $\Rightarrow j$ would have full knowledge of the final DKG secret $z = z_i + (- z_i + r) = r$.
Implication: The PVSS transcript will include a zero-knowledge signature of knowledge (ZKSoK) of the dealt secret $z_i$. This way, the dealt secret cannot be mauled without rendering the transcript invalid. Importantly, the ZKSoK signature will include the signing public key of the dealer. This way, validator $j$ cannot bias the final DKG secret $z$ by appropriating validator $i$’s transcript as their own (i.e., by stripping validator $i$’s signature from the transcript, adding their own signature and leaving the dealt secret $z_i$ untouched).
Chunky: A weighted, non-malleable PVSS
Notation:
- Let $\term{n}$ denote the number of players
- Note: In our setting, the PoS validators will act as the players
- Let $\term{\maxTotalWeight}$ denote the maximum total weight $\Leftrightarrow$ maximum # of shares that we will ever want to deal in the PVSS
- Let $\term{\ell}$ denote the chunk bit-size (e.g., $\ell=32$ for 32-bit chunks)
- Let $\term{m} = \ceil{\log_2{\sizeof{\F}} / \ell}$ denote the number of chunks per share
- Let $\term{B}\bydef 2^\ell$ denote the maximum value of a chunk (e.g., $B=2^{32}$ for 32-bit chunks)
The algorithms below describe Chunky, a weighted PVSS where only subsets of players with combined weight $> \threshWeight$ can reconstruct the shared secret.
$\mathsf{PVSS}.\mathsf{Setup}(\ell, W_\mathsf{max}; \mathcal{G}, \widetilde{G}) \rightarrow \mathsf{pp}$
Recall that $\emph{\maxTotalWeight}$ is the max. total weight, $\emph{\ell}$ is the # of bits per chunk and $\emph{m}$ is the number of chunks a share is split into.
$\term{\widetilde{G}}\in\Gr_2$ will be the base used to commit to the shares in $\pvssDeal$.
Step 1: Set up the ElGamal encryption: \begin{align} \term{G},\term{H} &\randget \Gr_1 \end{align}
Step 2: Set up the ZK range proof to batch prove that $\le \maxTotalWeight\cdot m$ share chunks are all $\ell$-bit wide:
\begin{align} (\prk,\ck,\vk) \gets \dekart_2.\setup(\maxTotalWeight\cdot m; \mathcal{G}) \end{align}
Note that DeKART assumes that the field $\F$ admits a $2^\kappa$-th primitive root of unity where $2^\kappa$ is the smallest power of two $\ge \maxTotalWeight\cdot m + 1$. (The ZK range proof needs FFTs of size $\maxTotalWeight\cdot m$.)
Return the public parameters: \begin{align} \pp \gets (\ell, \maxTotalWeight, G, \widetilde{G}, H, \prk,\ck,\vk) \end{align}
$\mathsf{PVSS}.\mathsf{Deal}_\mathsf{pp}\left(a_0, t_W, \{w_i, \mathsf{ek}_i\}_{i\in [n]}, \mathsf{ssid}\right) \rightarrow \mathsf{trs}$
$a_0$ is the dealt secret. $w_i$’s are the weights of each player, including the dealer’s (i.e., $w_\pid$). $\ssid$ is a session identifier, which will be set to the consensus epoch number in which the DKG is taking place and calls this PVSS deal algorithm.
Parse public parameters: \begin{align} (\ell, \maxTotalWeight, G, \widetilde{G}, H, \prk,\ck,\vk)\parse\pp \end{align}
Compute the total weight and assert that the public parameters can accommodate it:
\begin{align}
\label{eq:W}
\term{W} &\gets \sum_{i\in[n]} w_i\\
\textbf{assert}\ W &\le \maxTotalWeight
\end{align}
Find a primitive $2^\kappa$-th root of unity $\term{\omega} \in \F$, where $2^\kappa$ is the smallest power of two $\ge W$, so that we can efficiently compute FFTs of size $W$.
Step 1: Pick the degree-$\threshWeight$ random secret sharing polynomial and compute the $j$th share of player $i$:
\begin{align}
\term{a_1,\ldots,a_{\threshWeight}} &\randget \F\\
\term{f(X)} &\bydef \emph{a_0} + a_1 X + a_2 X^2 + \ldots + a_{\threshWeight} X^\threshWeight\\
\label{eq:eval}
\term{s_{i, j}} &\gets f\left(\term{\chi_{i,j}}\right),\forall i\in[n],\forall j\in[w_i]
\end{align}
Note: Assuming the evaluation points $\emph{\{\chi_{i,j}\}}$ are wisely set to the first $W$ powers of $\omega$ (i.e., $\{\omega^{i’}\}_{i’\in [0,W)}$), the $s_{i,j}$’s are quickly computable in $\Fmul{O(W\log{W})}$ via an FFT.
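As a sanity check on Step 1, the toy Python sketch below samples a degree-$t$ polynomial, evaluates it at $W$ generic integer points (rather than roots of unity, so no FFT speedup), and recovers $a_0 = f(0)$ from $t+1$ shares via Lagrange interpolation at zero; the prime and parameters are illustrative:

```python
import random

# Toy Shamir sharing: any t+1 of the W shares recover the secret a_0 = f(0).
p = 2**31 - 1                                 # toy prime field
t, W = 3, 8
coeffs = [random.randrange(p) for _ in range(t + 1)]   # a_0, ..., a_t

def f(x):
    acc = 0
    for c in reversed(coeffs):                # Horner evaluation
        acc = (acc * x + c) % p
    return acc

xs = list(range(1, W + 1))                    # stand-in evaluation points
shares = {x: f(x) for x in xs}

subset = xs[:t + 1]                           # any t+1 shares suffice
secret = 0
for x in subset:                              # Lagrange interpolation at 0
    num, den = 1, 1
    for y in subset:
        if y != x:
            num = num * (-y) % p
            den = den * (x - y) % p
    secret = (secret + shares[x] * num % p * pow(den, -1, p)) % p
assert secret == coeffs[0]
```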
Step 2: Commit to the shares, $\forall i\in[n],\forall j\in[w_i]$:
\begin{align}
\label{eq:share-commitments}
\term{\widetilde{V}_{i,j}} &\gets s_{i,j} \cdot \widetilde{G} \in \Gr_2\\
\label{eq:dealt-pubkey}
\term{\widetilde{V}_0} &\gets a_0 \cdot \widetilde{G}
\end{align}
Step 3: Split each share $s_{i,j}$ into $\emph{m}\bydef \ceil{\log_2{\sizeof{\F}} / \ell}$ chunks $\term{s_{i,j,k}}$, of $\ell$ bits each, such that:
\begin{align}
s_{i,j}
&= \sum_{k\in[m]} (2^\ell)^{k-1} \cdot \emph{s_{i,j,k}}\\
&\bydef \sum_{k\in[m]} \emph{B}^{k-1} \cdot s_{i,j,k}\\
\end{align}
Note: Each $s_{i,j,k} \in [0, B)$, where $B = 2^\ell$.
Step 4: $\forall i \in[n], j\in[w_i], k\in[m]$, encrypt the $k$th chunk of the $j$th share of player $i$:
\begin{align}
\term{r_{j,k}} &\randget \F\ \text{s.t.}\ \sum_{k\in[m]} B^{k-1}\cdot r_{j,k} = 0\\
\label{eq:share-ciphertexts}
\term{(C_{i,j,k}, R_{j,k})} &\gets E.\enc_{G,H}(\ek_i, s_{i,j,k}; r_{j,k})\\
&\bydef \left(\begin{array}{l}
s_{i,j,k} \cdot G + r_{j,k}\cdot \ek_i\\
r_{j,k}\cdot H\end{array}\right)
\end{align}
Observation 1: The randomness has been correlated such that:
\begin{align}
\label{eq:correlated}
\sum_{k\in[m]} B^{k-1} \cdot C_{i,j,k}
&= \sum_{k\in[m]} B^{k-1} \cdot (s_{i,j,k} \cdot G + r_{j,k}\cdot \ek_i)\\
&= \underbrace{\sum_{k\in[m]} (B^{k-1} \cdot s_{i,j,k})}_{s_{i,j}} \cdot G + \underbrace{\sum_{k\in [m]} (B^{k-1} \cdot r_{j,k})}_{0}\cdot \ek_i\\
&= s_{i,j} \cdot G + 0 \cdot \ek_i = s_{i,j} \cdot G
\end{align}
Observation 2: Different players $i$ will safely re-use the same $r_{j,k}$ randomness.
Observation 3: $\sizeof{\{R_{j,k}\}_{j,k}} = m\cdot \max_{i\in[n]}{(w_i)}$
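The correlated randomness in Step 4 can be sampled by drawing the first $m-1$ values freely and solving for the last one. A toy Python sketch (0-based $k$; the modulus $q$ is an illustrative stand-in for $\sizeof{\F}$):

```python
import random

# Sample r_1, ..., r_m with sum_k B^(k-1) * r_k = 0 (mod q), as in Step 4.
# 0-based index below; q is a toy prime standing in for the scalar field.
q = 2**31 - 1
ell, m = 8, 4
B = 1 << ell

r = [random.randrange(q) for _ in range(m - 1)]       # free choices
acc = sum(pow(B, k, q) * r[k] for k in range(m - 1)) % q
r.append(-acc * pow(pow(B, m - 1, q), -1, q) % q)     # solve for the last one

assert sum(pow(B, k, q) * r[k] for k in range(m)) % q == 0
```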
The cumulative weight up to (but excluding) $i$ is $\term{W_i}$ such that $\emph{W_1} = 0$ and $\emph{W_i} = \sum_{i’\in [1, i)} w_{i’}$. (Note that $W \bydef W_{n+1}$.) This notion helps us “flatten” all the share chunks $s_{i,j,k}$ into an array $\{z_{i’}\}_{i’\in [W \cdot m]}$, where $z_{i’} \bydef s_{i,j,k}$ with $i’\gets \left(\emph{W_i} + (j-1)\right)\cdot m + k \bydef \term{\idx(i,j,k)}$ (see appendix for how the indexing was derived).
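A small Python sketch of this indexing (helper names are ours, not the implementation's), checking that $\idx(i,j,k)$ maps the $(i,j,k)$ triples bijectively onto $[W\cdot m]$ for some toy weights:

```python
# Sketch of the share-chunk flattening (1-based j, k as in the text;
# W_i is the cumulative weight before player i). Toy weights below.
def idx(W_i, j, k, m):
    return (W_i + (j - 1)) * m + k            # lands in [1, W*m]

weights = [2, 1, 3]                           # w_1, ..., w_n
m = 4                                         # chunks per share
W = sum(weights)
cum = [0]                                     # W_1 = 0, W_i = sum of earlier w
for w in weights[:-1]:
    cum.append(cum[-1] + w)

# Every (i, j, k) triple maps to a distinct index in [1, W*m]:
seen = {idx(cum[i], j, k, m)
        for i in range(len(weights))
        for j in range(1, weights[i] + 1)
        for k in range(1, m + 1)}
assert seen == set(range(1, W * m + 1))
```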
Step 5: Prove that the share chunks are correctly encrypted and are all $\ell$-bit long.
First, “flatten” all the shares into a vector. $\forall i\in[n], j\in[w_i],\forall k\in[m]$: \begin{align} \term{z_{i’}} \gets s_{i,j, k},\ \text{where}\ i’ &\bydef \emph{\idx}(i,j,k)\in[W\cdot m] \end{align}
Second, KZG commit to the share chunks and prove they are all in range:
\begin{align}
\rho &\randget \F\\
\term{C} &\gets \dekart_2.\commit(\ck, z_1, \ldots, z_{W \cdot m}; \rho)\\
\term{\piRange} &\gets \dekart_2.\prove(\prk, C, \ell; z_1, \ldots, z_{W\cdot m}, \rho)
\end{align}
Step 6: Compute a signature of knowledge of the dealt secret key $a_0$ over the session ID:
\begin{align}
\term{\ctx} &\gets (\threshWeight, \{w_i\}_i, \ssid)\\
\term{\piSok} &\gets \sok.\prove\left(\begin{array}{l}
\Retk, \emph{\ctx},\\
\underbrace{G, H, \ck, \{\ek_i\}_i,\{C_{i,j,k}\}_{i,j,k}, \{R_{j,k}\}_{j,k}, C}_{\stmt},\\
\underbrace{\{s_{i,j,k}\}_{i,j,k}, \{r_{j,k}\}_{j,k}, \rho}_{\witn}
\end{array}\right)
\end{align}
Return the transcript:
\begin{align}
\label{eq:proof}
\term{\pi} &\gets \left(C, \piRange, \piSok\right)\\
\label{eq:trs}
\trs &\gets \left(\widetilde{V}_0, \{\widetilde{V}_{i,j}\}_{i,j\in[w_i]}, \{C_{i,j,k}\}_{i,j\in[w_i],k}, \{R_{j,k}\}_{j\in[\max_i{w_i}],k}, \emph{\pi}\right)
\end{align}
$\mathsf{PVSS}.\mathsf{Verify}_\mathsf{pp}\left(\mathsf{trs}, t_W, \{w_i, \mathsf{ek}_i\}_{i\in[n]}, \mathsf{ssid}\right) \rightarrow \{0,1\}$
Parse public parameters: \begin{align} (\ell, \cdot, G, \widetilde{G}, H, \cdot,\ck,\vk)\parse\pp \end{align}
Parse the transcript: \begin{align} \left(\widetilde{V}_0, \{\widetilde{V}_{i,j}\}_{i,j\in[w_i]}, \{C_{i,j,k}\}_{i,j\in[w_i],k}, \{R_{j,k}\}_{j\in[\max_i{w_i}],k}, \left(C, \piRange, \piSok\right)\right)\parse\trs \end{align}
Let the total weight $W$ be defined as before in Eq. \ref{eq:W}.
Step 1: Verify that the committed shares encode a degree-$\threshWeight$ polynomial via the randomized SCRAPE LDT11: \begin{align} \textbf{assert}\ &\scrape.\lowdegreetest(\{(0, \widetilde{V}_0)\} \cup \{(\chi_{i,j}, \widetilde{V}_{i,j})\}_{i,j}, \threshWeight, W) \equals 1 \end{align}
Note: Recall that the $\emph{\chi_{i,j}}$’s are the roots of unity used to evaluate the secret-sharing polynomial $f(X)$ during dealing (see Eq. \ref{eq:eval}).
Note: We may need to feed the size of the evaluation domain into SCRAPE to enable the super-efficient algorithm.
Step 2: Check that ciphertexts encrypt the committed shares:
\begin{align}
\term{\beta_{i,j}} &\randget\{0,1\}^\lambda\\
\label{eq:multi-pairing-check}
\textbf{assert}\
&\pair{\sum_{i\in[n],j\in[w_i],k\in[m]} (B^{k-1}\cdot\emph{\beta_{i,j}})\cdot C_{i,j,k}}{\widetilde{G}}
\equals
\pair{G}{\sum_{i\in[n],j\in[w_i]} \emph{\beta_{i,j}}\cdot \widetilde{V}_{i,j}}
\end{align}
Q: But how was this derived? A: Click to expand and understand...
First, recall from Eq. \ref{eq:correlated} that the randomness has been correlated such that $\sum_k C_{i,j,k} = s_{i,j}\cdot G$.
Second, observe that, using a pairing, we can check that the share chunked in the $C_{i,j,k}$’s is the same as the one committed in $\widetilde{V}_{i,j}$:
\begin{align}
\pair{\sum_{k\in[m]} B^{k-1}\cdot C_{i,j,k}}{\widetilde{G}} &\equals \pair{G}{\widetilde{V}_{i,j}}
\end{align}
Third, observe that we can batch all these pairing checks into one by taking linear combination of the verification equations using random $\beta_{i,j}$’s:
\begin{align}
\sum_{i,j}\beta_{i,j}\cdot\pair{\sum_{k\in[m]} B^{k-1}\cdot C_{i,j,k}}{\widetilde{G}} &\equals \sum_{i,j} \beta_{i,j}\cdot \pair{G}{\widetilde{V}_{i,j}}\\
\end{align}
Moving the sum inside the pairing by leveraging the bilinearity gives exactly Eq. \ref{eq:multi-pairing-check}.
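The batched equation can also be sanity-checked “in the exponent”: writing every group element as its discrete log, $\pair{a\cdot G}{b\cdot \widetilde{G}}$ behaves like the product $a\cdot b$, so Eq. \ref{eq:multi-pairing-check} reduces to the scalar identity verified below. All parameters are toy values:

```python
import random

# Toy check of the batched pairing equation, with every group element
# replaced by its discrete log. 0-based chunk index k (text uses 1-based).
q = 2**31 - 1                 # toy prime "scalar field" order
ell, m = 8, 4                 # chunk bit-size and chunks per share
B = 1 << ell
weights = [2, 3]              # w_i for n = 2 players
ek = [random.randrange(q) for _ in weights]   # ElGamal keys, as exponents

# Shares (as field elements) and their l-bit chunks:
s = {(i, j): random.randrange(q)
     for i, w in enumerate(weights) for j in range(w)}
def chunks_of(x):
    return [(x >> (ell * k)) & (B - 1) for k in range(m)]

# Correlated randomness per (j, k): sum_k B^k * r_{j,k} = 0 (mod q).
r = {}
for j in range(max(weights)):
    rs = [random.randrange(q) for _ in range(m - 1)]
    acc = sum(pow(B, k, q) * rs[k] for k in range(m - 1)) % q
    rs.append(-acc * pow(pow(B, m - 1, q), -1, q) % q)
    for k in range(m):
        r[(j, k)] = rs[k]

# First ciphertext components, as exponents: c = s_chunk + r * ek.
c = {(i, j, k): (chunks_of(s[(i, j)])[k] + r[(j, k)] * ek[i]) % q
     for (i, j) in s for k in range(m)}

beta = {(i, j): random.randrange(q) for (i, j) in s}   # batching scalars
lhs = sum(pow(B, k, q) * beta[(i, j)] * c[(i, j, k)]
          for (i, j, k) in c) % q
rhs = sum(beta[(i, j)] * s[(i, j)] for (i, j) in s) % q
assert lhs == rhs
```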
Step 3: Verify the range proof: \begin{align} \textbf{assert}\ \dekart_2.\verify(\vk, C, \ell; \piRange) \equals 1 \end{align}
Step 4: Verify the SoK:
\begin{align}
\term{\ctx} &\gets (\threshWeight, \{w_i\}_i, \ssid)\\
\textbf{assert}\ &\sok.\verify\left(\begin{array}{l}
\Retk, \emph{\ctx},\\
\underbrace{G, H, \ck, \{\ek_i\}_i,\{C_{i,j,k}\}_{i,j,k}, \{R_{j,k}\}_{j,k}, C}_{\stmt};\\
\piSok
\end{array}\right) \equals 1
\end{align}
$\mathsf{PVSS}.\mathsf{Decrypt}_\mathsf{pp}\left(\mathsf{trs}, \mathsf{dk}, i, w_i\right) \rightarrow \{s_{i,j}\}_{j\in[w_i]} \subset \F$
$i\in[n]$ is the ID of the player who is decrypting their share(s) from the transcript. Recall that $\emph{m}\bydef \ceil{\log_2{\sizeof{\F}} / \ell}$ is the number of chunks per share.
Parse public parameters: \begin{align} (\ell, \cdot, G, \cdot, \cdot, \cdot,\cdot,\cdot)\parse\pp \end{align}
Parse the transcript: \begin{align} \left(\cdot, \cdot, \{C_{i,j,k}\}_{i,j\in[w_i],k}, \{R_{j,k}\}_{j\in[\max_i{w_i}],k},\cdot\right)\parse\trs \end{align}
Step 1: Decrypt all of player $i$’s share chunks $\{s_{i,j,k}\}_{i,j\in[w_i],k\in[m]}$: \begin{align} s_{i,j,k}\gets E.\dec_{G}\left(\dk_i, (C_{i,j,k}, R_{j,k})\right) \end{align}
Step 2: Assemble the chunks back into shares: \begin{align} s_{i,j}\gets \sum_{k\in[m]} (2^\ell)^{k-1} \cdot s_{i,j,k} \end{align}
Weighted DKG protocol
Below, we give a high-level sketch of our $\threshWeight$-out-of-$\{w_i\}_{i\in[n]}$ weighted DKG with contributions from $> \emph{\threshQ}$ fraction of the stake.
But first, we have to slightly augment our notion of a non-malleable PVSS, denoted by $\pvss$, into a signed, subaggregatable and non-malleable PVSS, denoted by $\term{\ssPvss}$. This will make building a DKG protocol much easier.
First, recall from before that validators must sign their PVSS transcripts in the DKG protocol. Thus, the $\term{\ssPvss.\deal}$ and $\term{\ssPvss.\verify}$ algorithms will differ slightly:
- dealing now takes a signing secret key $\term{\sk}$ as input and additionally returns a signature $\term{\sigma}$
- verification now takes a signing pubkey $\term{\pk}$ and the signature $\sigma$ as input
Second, we introduce a useful notion of an aggregatable PVSS subtranscript $\term{\subtrs}$ which excludes the non-aggregatable components of the PVSS transcript $\emph{\trs}$ from Eq. \ref{eq:trs} (i.e., the proof $\pi$ from Eq. \ref{eq:proof}).
Third, we define a new $\term{\ssPvssSubtranscript}$ algorithm which returns such a $\subtrs$. In Chunky’s case, this will consist of only:
- The dealt pubkey $\widetilde{V}_0$ as defined in Eq. \ref{eq:dealt-pubkey}
- The share commitments (i.e., all share commitments $\widetilde{V}_{i,j}$ as defined in Eq. \ref{eq:share-commitments})
- The share chunk ciphertexts (i.e., all share ciphertexts $(C_{i,j,k}, R_{j,k})$ as defined in Eq. \ref{eq:share-ciphertexts})
Fourth, and last, we will also define a $\term{\ssPvssSubaggregate}$ algorithm which takes several subtranscripts $\{\subtrs_i\}_i$ and aggregates them into a single $\subtrs$. This way, two subtranscripts $\subtrs_1$ and $\subtrs_2$ dealing secrets $z_1$ and $z_2$, respectively, can be succinctly combined into a $\subtrs$ dealing $z_1 + z_2$ (such that $\sizeof{\subtrs} = \sizeof{\subtrs_i}, \forall i\in\{1,2\}$).
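Subaggregation relies only on the commitments and ciphertexts being additively homomorphic. A minimal Python sketch of that homomorphism, with commitments modeled multiplicatively ($z\cdot \widetilde{G}$ becomes $g^z \bmod p$; toy parameters): summing two dealers’ $\widetilde{V}_0$’s yields a commitment to $z_1 + z_2$.

```python
# Toy additive homomorphism behind Subaggregate: "adding" two commitments
# multiplies them in a multiplicative group, so the aggregate commits to
# the sum of the dealt secrets. Illustrative parameters only.
p = 2**61 - 1                      # toy prime modulus
g = 3                              # toy base standing in for G-tilde
z1, z2 = 1111, 2222                # two dealers' secrets
V1, V2 = pow(g, z1, p), pow(g, z2, p)
V = V1 * V2 % p                    # componentwise "sum" of commitments
assert V == pow(g, z1 + z2, p)     # the aggregate deals z1 + z2
```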
We detail the new algorithms for this signed, subaggregatable, non-malleable PVSS below. (Note that the $\setup$ and $\decrypt$ algorithms remain the same.)
$\mathsf{ssPVSS}.\mathsf{Deal}_\mathsf{pp}\left(\mathsf{sk}, a_0, t_W, \{w_i, \mathsf{ek}_i\}_{i\in [n]}, \mathsf{ssid}\right) \rightarrow (\mathsf{trs},\sigma)$
Deal a normal PVSS transcript via $\pvssDeal$ but also sign over it and over the session ID:
\begin{align}
\trs &\gets \pvssDeal(a_0, \threshWeight, \{w_i,\ek_i\}_{i\in[n]}, \ssid)\\
(\widetilde{V}_0,\cdot,\cdot,\cdot,\cdot)&\parse \trs\\
\sigma &\gets \sig.\sign(\sk, (\widetilde{V}_0, \ssid))
\end{align}
$\mathsf{ssPVSS}.\mathsf{Verify}_\mathsf{pp}\left(\pk, \mathsf{trs}, \sigma, t_W, \{w_i, \mathsf{ek}_i\}_{i\in[n]}, \mathsf{ssid}\right) \rightarrow \{0,1\}$
Do a normal PVSS transcript verification via $\pvssVerify$ but also verify the signature over it and the session ID:
\begin{align}
\textbf{assert}\ \pvssVerify(\trs, \threshWeight, \{w_i,\ek_i\}_{i\in[n]}, \ssid) &\equals 1\\
(\widetilde{V}_0,\cdot,\cdot,\cdot,\cdot) &\parse \trs\\
\textbf{assert}\ \sig.\verify(\pk, \sigma, (\widetilde{V}_0, \ssid)) &\equals 1
\end{align}
$\mathsf{ssPVSS}.\mathsf{Subtranscript}\left(\mathsf{trs}\right) \rightarrow \mathsf{subtrs}$
Parse the transcript as defined in Eq. \ref{eq:trs}: \begin{align} \left(\widetilde{V}_0, \{\widetilde{V}_{i,j}\}_{i,j\in[w_i]}, \{C_{i,j,k}\}_{i,j\in[w_i],k}, \{R_{j,k}\}_{j\in[\max_i{w_i}],k}, \cdot \right)\parse\trs \end{align}
Return the aggregatable subtranscript:
\begin{align}
\label{eq:subtrs}
\subtrs &\gets \left(\widetilde{V}_0, \{\widetilde{V}_{i,j}\}_{i,j\in[w_i]}, \{C_{i,j,k}\}_{i,j\in[w_i],k}, \{R_{j,k}\}_{j\in[\max_i{w_i}],k}\right)\\
\end{align}
$\mathsf{ssPVSS}.\mathsf{Subaggregate}_\mathsf{pp}\left(\{\mathsf{subtrs}_{i’}\}_{i’}\right) \rightarrow \mathsf{subtrs}$
Parse public parameters: \begin{align} (\ell, \cdot, \cdot, \cdot, \cdot, \cdot,\cdot,\cdot)\parse\pp \end{align}
Parse all the aggregatable subtranscripts, for all $i’$:
\begin{align}
\left(\widetilde{V}^{(i’)}_0, \{\widetilde{V}^{(i’)}_{i,j}\}_{i,j\in[w_i]}, \{C^{(i’)}_{i,j,k}\}_{i,j\in[w_i],k}, \{R^{(i’)}_{j,k}\}_{j\in[\max_i{w_i}],k}\right)\parse \subtrs_{i’}\\
\end{align}
Recall that $\emph{n}$ denotes the number of players that a transcript deals to and recall that $\emph{m} = \ceil{\log_2{\sizeof{\F}} / \ell}$ denotes the number of chunks per share.
Aggregate:
\begin{align}
\term{\widetilde{V}_0} &\gets \sum_{i’} \widetilde{V}^{(i’)}_0\\
\forall i\in[n],j\in[w_i], \term{\widetilde{V}_{i,j}} &\gets \sum_{i’} \widetilde{V}^{(i’)}_{i,j}\\
\forall i\in[n],j\in[w_i],k\in[m], \term{C_{i,j,k}} &\gets \sum_{i’} C^{(i’)}_{i,j,k}\\
\forall j\in[\max_i{w_i}],k\in[m], \term{R_{j,k}} &\gets \sum_{i’} R^{(i’)}_{j,k}\\
\end{align}
Return the aggregated subtranscript:
\begin{align}
\subtrs &\gets \left(\widetilde{V}_0, \{\widetilde{V}_{i,j}\}_{i,j\in[w_i]}, \{C_{i,j,k}\}_{i,j\in[w_i],k}, \{R_{j,k}\}_{j\in[\max_i{w_i}],k}\right)\\
\end{align}
DKG overview
A DKG will occur within the context of a consensus epoch $\term{\epoch}$. All validators know each other’s public keys. Specifically, every validator $i’\in[n]$ has a signing pubkey $\term{\pk_{i’}}$ (with signing secret key $\term{\sk_{i’}}$) and an encryption key $\ek_{i’}$14.
Dealing phase: Each validator $i’\in[n]$ picks a random secret $\term{z_{i’}}\in\F$ and computes a PVSS transcript that deals it:
\begin{align}
\emph{z_{i’}} &\randget \F\\
\term{\ssid_{i’}} &\gets (i’, \emph{\pk_{i’}}, \emph{\epoch})\\
\term{\trs_{i’}, \sigma_{i’}} &\gets \ssPvssDeal(\emph{\sk_{i’}}, z_{i’}, \threshWeight, \{w_i,\ek_i\}_{i\in[n]}, \emph{\ssid_{i’}})\\
\end{align}
Our current $\ssPvssDeal$ Rust implementation in aptos-dkg returns a chunky::Transcript struct that contains both the actual transcript $\trs_{i’}$ and its signature $\sigma_{i’}$.
Then, each validator $i’$ (best-effort) disseminates $(\trs_{i’}, \sigma_{i’})$ to all other validators. Eventually, each validator $i’$ will have its own view of a set $\term{Q_{i’}}$ of validators who correctly dealt a (single) transcript, as well as the actual signed transcripts themselves.
Agreement phase: In this phase, validators will agree on an aggregated subtranscript $\term{\subtrs}$ obtained from a “large-enough” eligible set $\emph{Q}$ of honest validators.
More formally, the agreed-upon $(Q,\subtrs)$ will have the following three properties:
\begin{align}
&\norm{Q} > \threshQ\\
\label{eq:trs-verifies}
&\forall j’ \in Q, \exists (\term{\trs_{j’},\sigma_{j’}}),\ \text{s.t.}\ \ssPvssVerify(\pk_{j’}, \emph{\trs_{j’}, \sigma_{j’}}, \threshWeight, \{w_i,\ek_i\}_{i\in[n]}, (\underbrace{j’, \pk_{j’}, \epoch}_{\emph{\ssid_{j’}}})) \goddamnequals 1\\
\label{eq:subtrs-aggr}
&\emph{\subtrs} \goddamnequals \ssPvssSubaggregate(\{\ssPvssSubtranscript(\trs_{j’})\}_{j’ \in Q})
\end{align}
Agreement on $Q$ could be reached inefficiently by running a Byzantine agreement phase for each transcript: i.e., validator $i’$ proposes its $(\trs_{i’}, \sigma_{i’})$ and if it collects “enough” attestations (e.g., signatures from a fraction $> \term{\threshS}$ of the stake, say, 33%15) on it, then $i’$ is accumulated in the set $Q$ so far. The downside of this approach is high latency: it requires one Byzantine agreement per contributing validator. For Aptos, specifically, it would also require sending too many validator TXNs.
Proposal sub-phase: To reach agreement on $(Q,\subtrs)$ efficiently, one of the validators (e.g., the consensus leader) sends a final DKG subtranscript proposal $(Q, h)$, where $h \gets H(\subtrs)$ and $H(\cdot)$ is a collision-resistant hash function.
Every validator $i’$ will attest to (i.e., sign) this proposal if they can verify that the hashed subtranscript in $h$ was actually aggregated from some set $\{\trs_{j’}\}_{j’\in Q}$ of transcripts that all passed verification as per Eq. \ref{eq:trs-verifies}.
More formally, validator $i’$ will attest to the $(Q, h)$ proposal via a signature $\term{\alpha_{i’}}\bydef \sig.\sign(\sk_{i’}, (Q, h))$, if and only if:
- $\norm{Q} > \threshQ$
- $\forall j’\in Q$, validator $i’$ eventually16 receives a single17 $(\trs_{j’},\sigma_{j’})$ s.t. $\ssPvssVerify(\pk_{j’}, \trs_{j’}, \sigma_{j’}, \threshWeight, \{w_i,\ek_i\}_{i\in[n]}, (j’, \pk_{j’}, \epoch)) \goddamnequals 1$
- $h \equals H(\ssPvssSubaggregate(\{\ssPvssSubtranscript(\trs_{j’})\}_{j’\in Q}))$
Commit sub-phase: If the $(Q, h)$ proposal gathers “enough” attestations (i.e., $> \threshS$), the proposing validator sends a(n Aptos validator) TXN with $(Q, \subtrs, \{\alpha_{j’}\}_{j’\in \term{S}})$ to the chain, where $\emph{S}$ is the set of validators who attested with $\norm{S} > \threshS$.
(Note that this TXN includes the $\subtrs$ corresponding to the hash in the proposal $(Q,h)$.)
This TXN will be succinct as it only contains:
- The aggregated subtranscript $\subtrs$
- Note: Assuming elliptic curves over 256-bit base fields (e.g., BN254), $\sizeof{\subtrs} \bydef \underbrace{64}_{\widetilde{V}_0} + \underbrace{64 \cdot W}_{\widetilde{V}_{i,j}\text{'s}} + \underbrace{32 \cdot W\cdot m}_{C_{i,j,k}\text{'s}} + 32\cdot \underbrace{\max_i{w_i}\cdot m}_{R_{j,k}\text{'s}}$ as per Eq. \ref{eq:subtrs}
- e.g., for total weight $W = 254$, $m=8$ chunks and $\max_i{w_i} = 5$, the size will be $64 + 64 \cdot 254 + 32 \cdot 254 \cdot 8 + 32 \cdot 5 \cdot 8 =$ 82,624 bytes $=$ 80.6875 KiB
- If we increase $\max_i{w_i}$ to 7, we get $64 + 64 \cdot 254 + 32 \cdot 254 \cdot 8 + 32 \cdot \emph{7} \cdot 8 =$ 83,136 bytes $=$ 81.1875 KiB
- Attestations $\alpha_{j’}$’s from at most all $n$ validators.
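The byte counts above can be double-checked mechanically. A quick sketch of the size formula (the helper name `subtrs_size` is ours):

```rust
// Sanity-check of the subtranscript size formula (sizes in bytes):
//   |subtrs| = 64 + 64·W + 32·W·m + 32·max_i(w_i)·m
fn subtrs_size(w_total: u64, m: u64, max_w: u64) -> u64 {
    64                      // Ṽ_0
        + 64 * w_total      // the Ṽ_{i,j}'s
        + 32 * w_total * m  // the C_{i,j,k}'s
        + 32 * max_w * m    // the R_{j,k}'s
}

fn main() {
    assert_eq!(subtrs_size(254, 8, 5), 82_624); // = 80.6875 KiB
    assert_eq!(subtrs_size(254, 8, 7), 83_136); // = 81.1875 KiB
}
```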
Once this TXN gets included on-chain it is sent to execution, where all (honest) validators will:
- check that the attestations in $(Q,\subtrs, \{\alpha_{j’}\}_{j’\in S})$ are valid; i.e.,:
- $h \gets H(\subtrs)$
- $\textbf{assert}\ \norm{S} > \threshS$
- $\forall j’\in S, \textbf{assert}\ \sig.\verify(\pk_{j’}, \alpha_{j’}, (Q, h)) \equals 1$
- this implies that $\norm{Q} > \threshQ$…
- …and that Eqs. \ref{eq:trs-verifies} and \ref{eq:subtrs-aggr} hold
- install the subtranscript on-chain, declaring the DKG complete
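The execution-phase logic above boils down to a size check on $S$ plus signature checks on $(Q, h)$. A minimal sketch, where `DefaultHasher` and `verify_sig` are hypothetical stand-ins for the real collision-resistant hash and signature-verification APIs:

```rust
// Sketch of the execution-phase checks for a (Q, subtrs, {α_{j'}}) TXN.
// Hedged: `DefaultHasher` stands in for H(·) and `verify_sig` for
// `sig.verify(pk_{j'}, α_{j'}, (Q, h))`; names are illustrative, not Aptos APIs.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn check_dkg_txn(
    q: &[u32],         // the proposed eligible set Q
    subtrs: &[u8],     // the aggregated subtranscript from the TXN
    attesters: &[u32], // the set S of validators who signed (Q, h)
    thresh_s: usize,   // the attestation threshold
    verify_sig: impl Fn(u32, u64) -> bool, // did j' validly sign (Q, h)?
) -> bool {
    // h ← H(subtrs); the signed message is (Q, h)
    let mut hasher = DefaultHasher::new();
    q.hash(&mut hasher);
    subtrs.hash(&mut hasher);
    let msg = hasher.finish();
    // assert |S| > threshS, then ∀ j' ∈ S check the attestation α_{j'};
    // together, the attestations transitively imply the properties of Q
    attesters.len() > thresh_s && attesters.iter().all(|&j| verify_sig(j, msg))
}

fn main() {
    // with a permissive mock verifier, only the |S| > threshS check can fail
    assert!(check_dkg_txn(&[1, 2, 3], b"subtrs", &[1, 2, 3, 4], 3, |_, _| true));
    assert!(!check_dkg_txn(&[1, 2, 3], b"subtrs", &[1, 2], 3, |_, _| true));
}
```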
Now:
- The final public key whose corresponding secret key is secret-shared is $\widetilde{V}_0$ from $\subtrs$
- The share commitments $\widetilde{V}_{i,j}$’s in $\subtrs$ can be made public
- e.g., if the DKG is for bootstrapping a weighted threshold BLS signature scheme, then $\widetilde{V}_{i,j}\bydef s_{i,j}\cdot G$ will act as the verification key for the BLS signature share $H(m)^{s_{i,j}}$
- Each player can use $\pvssDecrypt$ to obtain their shares from $\subtrs$ 20
Benchmarks
Aptos mainnet
Single-threaded numbers from my Apple Macbook Pro M4 Max for the setup we expect to use on Aptos mainnet:
| Scheme | $\ell$ | Setup | Transcript size | Deal (ms) | Serialize (ms) | Sub-aggregate (ms) | Verify (ms) | Decrypt-share (ms) |
|---|---|---|---|---|---|---|---|---|
| Chunky | 32 | 129-out-of-219 / 136 players | 259.24 KiB | 373.30 | 0.24 | 1.29 | 63.05 | 10.73 |
| Chunky2 | 32 | 129-out-of-219 / 136 players | 279.78 KiB | 401.96 (0.93x) | 0.27 (0.89x) | 1.35 (0.96x) | 72.45 (0.87x) | 11.09 (0.97x) |
These numbers can be reproduced by cloning aptos-core and doing:
git clone https://github.com/aptos-labs/aptos-core
cd aptos-core/crates/aptos-crypto/benches/
./run-pvss-benches.sh
Full benchmarks
Parallelization: Chunky and Groth21 run single-threaded.
But Golden sets gnark and gnark-crypto to default to NbTasks = runtime.NumCPU()*2 for MSMs and FFTs.
So, each PLONK proof uses all logical cores.
The $n-1$ proofs in a dealing are still generated serially in a for-loop; it is the MSM/FFT work inside each proof that is parallelized.
Our comparison is overly-generous to Golden, as a result.
| Scheme | Curve | Library | $\ell$ | $t$ | $n$ | Transcript size | Deal (ms) | Verify (ms) | Decrypt share (ms) |
|---|---|---|---|---|---|---|---|---|---|
| Chunky | BLS12-381 | arkworks v0.5.0 | 32 | 6 | 8 | 13.37 KiB | 29.83 | 7.87 | 2.96 |
| Groth21 | BLS12-381 | blstrs v0.7.1 | 8 | 6 | 8 | 20.15 KiB (1.51x) | 27.8 (1.07x) | 12.4 (1.57x) | ≈ 12,424 (≈ 4,200x) |
| Golden | BN254 + BJJ | gnark v0.14.0 | – | 6 | 8 | 5.29 KiB (2.53x) | 8,060 (270x) | 10.25 (1.30x) | 0.31 (9.55x) |
| [GHL21e]1 | Curve25519 | cpp-lwevss | – | 6 | 8 | 178.13 KiB (13.3x) | 5,553 (186x) | 466 (59.2x) | 0.48 (6.21x) |
| cgVSS6 | BLS12-381 + CL | cgdkg_artifact | – | 6 | 8 | 5.16 KiB (2.59x) | 50.25 (1.68x) | 60.18 (7.65x) | 10.48 (3.54x) |
| Chunky | BLS12-381 | arkworks v0.5.0 | 32 | 11 | 16 | 22.56 KiB | 49.19 | 11.24 | 3.03 |
| Groth21 | BLS12-381 | blstrs v0.7.1 | 8 | 11 | 16 | 34.27 KiB (1.52x) | 49.9 (~1.00x) | 20.7 (1.84x) | ≈ 17,686 (≈ 5,800x) |
| Golden | BN254 + BJJ | gnark v0.14.0 | – | 11 | 16 | 10.50 KiB (2.15x) | 17,594 (358x) | 21.31 (1.90x) | 0.30 (10.1x) |
| [GHL21e]1 | Curve25519 | cpp-lwevss | – | 11 | 16 | 180.94 KiB (8.02x) | 5,591 (114x) | 551 (49.0x) | 0.49 (6.20x) |
| cgVSS6 | BLS12-381 + CL | cgdkg_artifact | – | 11 | 16 | 9.51 KiB (2.37x) | 64.54 (1.31x) | 72.40 (6.44x) | 10.24 (3.38x) |
| Chunky | BLS12-381 | arkworks v0.5.0 | 32 | 22 | 32 | 40.94 KiB | 84.70 | 16.79 | 3.05 |
| Groth21 | BLS12-381 | blstrs v0.7.1 | 8 | 22 | 32 | 62.52 KiB (1.53x) | 92.1 (1.09x) | 34.9 (2.08x) | ≈ 24,722 (≈ 8,100x) |
| Golden | BN254 + BJJ | gnark v0.14.0 | – | 22 | 32 | 20.97 KiB (1.95x) | 37,458 (442x) | 50.56 (3.01x) | 0.30 (10.2x) |
| [GHL21e]1 | Curve25519 | cpp-lwevss | – | 22 | 32 | 183.13 KiB (4.47x) | 5,836 (68.9x) | 481 (28.6x) | 0.50 (6.06x) |
| cgVSS6 | BLS12-381 + CL | cgdkg_artifact | – | 22 | 32 | 18.26 KiB (2.24x) | 85.74 (~1.01x) | 98.76 (5.88x) | 10.32 (3.38x) |
| Chunky | BLS12-381 | arkworks v0.5.0 | 32 | 43 | 64 | 77.69 KiB | 150.63 | 26.17 | 3.03 |
| Groth21 | BLS12-381 | blstrs v0.7.1 | 8 | 43 | 64 | 119.02 KiB (1.53x) | 181.4 (1.20x) | 68.0 (2.60x) | ≈ 26,463 (≈ 8,700x) |
| Golden | BN254 + BJJ | gnark v0.14.0 | – | 43 | 64 | 41.88 KiB (1.85x) | 73,967 (491x) | 145.74 (5.57x) | 0.29 (10.4x) |
| [GHL21e]1 | Curve25519 | cpp-lwevss | – | 43 | 64 | 187.50 KiB (2.41x) | 6,039 (40.1x) | 483 (18.5x) | 0.52 (5.88x) |
| cgVSS6 | BLS12-381 + CL | cgdkg_artifact | – | 43 | 64 | 35.66 KiB (2.18x) | 137.22 (1.10x) | 151.54 (5.79x) | 10.33 (3.41x) |
| Chunky | BLS12-381 | arkworks v0.5.0 | 32 | 85 | 128 | 151.19 KiB | 281.35 | 44.24 | 3.03 |
| Groth21 | BLS12-381 | blstrs v0.7.1 | 8 | 85 | 128 | 232.02 KiB (1.53x) | 355.7 (1.26x) | 121.0 (2.74x) | ≈ 37,445 (≈ 12,400x) |
| Golden | BN254 + BJJ | gnark v0.14.0 | – | 85 | 128 | 83.69 KiB (1.81x) | 151,163 (537x) | 518.78 (11.7x) | 0.30 (10.1x) |
| [GHL21e]1 | Curve25519 | cpp-lwevss | – | 85 | 128 | 192.63 KiB (1.27x) | 6,093 (21.7x) | 490 (11.1x) | 0.54 (5.66x) |
| cgVSS6 | BLS12-381 + CL | cgdkg_artifact | – | 85 | 128 | 70.52 KiB (2.14x) | 273.84 (1.03x) | 273.01 (6.17x) | 10.46 (3.45x) |
| Chunky | BLS12-381 | arkworks v0.5.0 | 32 | 169 | 256 | 298.19 KiB | 521.81 | 76.97 | 3.02 |
| Groth21 | BLS12-381 | blstrs v0.7.1 | 8 | 169 | 256 | 458.02 KiB (1.54x) | 688.4 (1.32x) | 232.4 (3.02x) | ≈ 54,549 (≈ 18,100x) |
| Golden | BN254 + BJJ | gnark v0.14.0 | – | 169 | 256 | 167.32 KiB (1.78x) | 308,051 (590x) | 1,854.27 (24.1x) | 0.30 (10.1x) |
| [GHL21e]1 | Curve25519 | cpp-lwevss | – | 169 | 256 | 201.81 KiB (1.48x) | 6,899 (13.2x) | 509 (6.61x) | 0.53 (5.73x) |
| cgVSS6 | BLS12-381 + CL | cgdkg_artifact | – | 169 | 256 | 140.23 KiB (2.13x) | 664.30 (1.27x) | 580.35 (7.54x) | 10.75 (3.56x) |
To reproduce the Chunky numbers, clone aptos-core and run:
cd aptos-core/crates/aptos-crypto/benches/
# TODO: modify the code to test the desired thresholds
./run-pvss-benches.sh
Golden notes
To reproduce the Golden numbers, clone alinush/fy and run:
cd fy/
# Transcript size only (one row per (t, n)):
go test ./golden/ -run TestPrintTranscriptSize -v -timeout 2h
# Full benchmark (transcript size + deal/verify/serialize/decrypt-share):
go test ./golden/ -run TestPrintBenchmarks -v -timeout 2h
# Custom (t, n) pairs, comma-separated as "t:n":
GOLDEN_SIZES=6:8,11:16 go test ./golden/ -run TestPrintBenchmarks -v
Why is Golden dealing so slow?
Golden is the only scheme in the table that relies on a SNARK: for every recipient, the dealer produces a gnark PLONK proof attesting that an eVRF-derived pad was computed correctly. Each PLONK proof costs ~1.2 s on our machine, and a dealing contains $n-1$ of them, which is why Deal scales as $\approx 1{,}200(n-1)$ ms. Verification is much cheaper ($\approx 7$ ms/proof) since PLONK verify is fast, and per-recipient share decryption is just one Diffie–Hellman operation plus a scalar subtraction.
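As a sanity check on the measured Deal column for Golden, the $\approx 1{,}200(n-1)$ ms model above predicts all six data points to within a few percent. A rough sketch (`golden_deal_model_ms` is our name for the model, and the measured values are taken from the table):

```rust
// Check that Golden's Deal column tracks the ≈1,200·(n−1) ms model implied
// by the ~1.2 s/proof PLONK cost and the n−1 proofs per dealing.
fn golden_deal_model_ms(n: u32) -> f64 {
    1_200.0 * (n as f64 - 1.0)
}

fn main() {
    // (n, measured Deal time in ms, from the benchmark table)
    let measured = [(8u32, 8_060.0f64), (16, 17_594.0), (32, 37_458.0),
                    (64, 73_967.0), (128, 151_163.0), (256, 308_051.0)];
    for (n, ms) in measured {
        let rel_err = (ms - golden_deal_model_ms(n)).abs() / ms;
        assert!(rel_err < 0.05, "model off by more than 5% at n={}", n);
    }
}
```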
Groth21 notes
To reproduce the Groth21 numbers, clone our e2e-vss fork and run:
cd groth21-rs/
./benches/run-pvss-benches.sh --features chunks-8bit
The --features chunks-8bit flag switches the code to 8-bit chunks (m=32, B=2^8). Drop it to use the default 16-bit chunks (m=16, B=2^16).
The “Decrypt share” benchmark builds the batched BSGS table before the benchmarks run.
How we picked the baseline
Groth21 has two known implementations:
- DFINITY’s production Rust implementation, which includes forward-secure binary tree encryption (BTE), epoch-based key updates, and custom-optimized BLS12-381 primitives (windowed Pippenger MSMs, precomputed `mul2` tables).
- Sourav Das’s `e2e-vss` implementation, which implements the same Groth21 PVSS protocol but with plain (non-forward-secure) ElGamal encryption and uses the much faster `blstrs` crate for BLS12-381 arithmetic.
Both implementations use identical NIZK proofs (correct sharing and correct chunking, from Sections 6.4 and 6.5 of the paper) and the same chunking parameters:
- `NUM_CHUNKS = 16` (i.e., $\emph{m}$)
- `CHUNK_SIZE = 65536` (i.e., $\emph{B}$)
- `NUM_ZK_REPETITIONS = 32` (i.e., $\ell$ in their paper, but baptized as $\term{\tau}$ here)
The two implementations differ only in the encryption layer (forward-secure BTE vs. plain ElGamal) and the underlying curve library.
We chose to benchmark Sourav’s implementation, after verifying it is the fastest of the two. To confirm this, we modified DFINITY’s codebase to:
- remove the forward-secure BTE layer from encryption, replacing it with vanilla ElGamal, as in `e2e-vss`
- remove the pairing-based ciphertext integrity checks that are specific to BTE
- switch the sharing proof from $\mathbb{G}_2$ polynomial coefficient commitments to $\mathbb{G}_1$ commitments, matching `e2e-vss`’s approach, for an apples-to-apples comparison
We then benchmarked both implementations on the same machine, calling the raw cryptographic functions directly (bypassing serialization overhead) with the same parameters ($n \in \{64, 128, 256, 512, 1024\}$).
The result: Sourav’s implementation was consistently 1.4-1.5x faster for both dealing and verification.
I suspect this is because blstrs binds to blst’s hand-tuned C and assembly for BLS12-381, which is difficult to beat.
We further improved Sourav’s implementation by upgrading blstrs from v0.6.1 to v0.7.1 and blst from v0.3.11 to v0.3.16, which yielded an additional ~20% faster dealing and ~2x faster verification.
Note that this is overly-generous towards Groth21, since blstrs is faster than arkworks, which we use in Chunky.
Why is Groth21’s worst-case decrypt-share so expensive?
Groth21’s chunking proof is approximate: it does not guarantee that each encrypted chunk $s_{i,j}$ lies in $[0, B)$ where $B = 2^\ell = 2^8 = 256$. Instead, for each $i,j$ it guarantees that there exists a small multiplicative factor $\term{\Delta_{i,j}} \in [1, \term{E})$ such that $\emph{\Delta_{i,j}} \cdot s_{i,j}$ lies in the signed range $[1-\term{Z},\,\emph{Z}-1]$, where: \[\emph{Z} = 2 \tau n m (B-1)(E-1)\]
- $\emph{E} = 2^{\lceil \lambda / \term{\tau} \rceil} = 2^8 = 256$
- $\lambda = 256$ is the security level
- $\emph{\tau} = 32$ is the number of parallel ZK repetitions in the chunking proof
So, decrypting 1 chunk may need up to $E-1 = 255$ baby-step giant-step (BSGS) invocations, each for a range of size $2Z-1$. Since a full share has $m = 32$ chunks, this yields $\term{k} = m \cdot (E-1) = 8{,}160$ BSGS invocations to decrypt 1 share. If done naively, this would take $O(k\sqrt{2Z-1})$ time. But, if we use a batched BSGS variant with larger tables, we can reduce this to $O(\sqrt{k (2Z-1)})$ time, a $\sqrt{k} \approx 90\times$ improvement.
Plugging in the actual $(n, m, B, E, \tau)$ from the benchmark:
| $n$ | $Z$ | $\log_2 (2Z-1)$ | $\sqrt{k}$ | $\ceil{\sqrt{2Z-1}}$ | batched BSGS table # entries $\approx\sqrt{k \cdot (2Z-1)}$) |
|---|---|---|---|---|---|
| 8 | $2 \cdot 32 \cdot 8 \cdot 32 \cdot 255 \cdot 255 \approx 1.07 \cdot 10^9$ | 31.0 | $\sqrt{8{,}160}$ | 46,161 | $\approx 4.17 \cdot 10^6$ |
| 256 | $2 \cdot 32 \cdot 256 \cdot 32 \cdot 255 \cdot 255 \approx 3.41 \cdot 10^{10}$ | 36.0 | same | 261,121 | $\approx 2.36 \cdot 10^7$ |
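The table entries can be recomputed from the formulas above. A sketch (`bsgs_params` is our helper name, hard-coding the benchmark’s $\tau = 32$, $m = 32$, $B = E = 256$):

```rust
// Recompute the worst-case decryption parameters from (n, m, B, E, τ):
//   Z = 2·τ·n·m·(B−1)·(E−1),  k = m·(E−1) BSGS invocations per share,
//   batched BSGS table ≈ √(k·(2Z−1)) entries.
fn bsgs_params(n: u64) -> (u64, u64, f64) {
    let (tau, m, b, e) = (32u64, 32u64, 256u64, 256u64);
    let z = 2 * tau * n * m * (b - 1) * (e - 1);
    let k = m * (e - 1); // 8,160 for m = 32, E = 256
    let table = ((k * (2 * z - 1)) as f64).sqrt();
    (z, k, table)
}

fn main() {
    let (z, k, table) = bsgs_params(8);
    assert_eq!(z, 1_065_369_600); // ≈ 1.07 · 10^9
    assert_eq!(k, 8_160);
    assert!((table - 4.17e6).abs() / 4.17e6 < 0.01); // ≈ 4.17 · 10^6 entries

    let (z, _, table) = bsgs_params(256);
    assert_eq!(z, 34_091_827_200); // ≈ 3.41 · 10^10
    assert!((table - 2.36e7).abs() / 2.36e7 < 0.01); // ≈ 2.36 · 10^7 entries
}
```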
The worst-case numbers above only ever materialize for a fully adversarial dealer who evades detection by the chunking proof at every chunk and every receiver simultaneously.
How much work does the dealer need to do to trigger a worst case?
[GHL21e] notes
To reproduce, clone our alinush/cpp-lwevss fork, which rewrites the reference main.cpp to benchmark a fresh PVSS deal instead of the re-sharing scenario.
Specifically, it drops the n−1 dummy “previous dealings”, the t corresponding decryptions, and the three re-share/one-time-setup proofs (commit(sk), proveDecryption, proveKeyGen).
Only proveEncryption, proveSmallness, and proveReShare (the Shamir parity-check) remain, which is what a fresh PVSS dealer actually needs.
# 1. Install dependencies
brew install gmp ntl libsodium cmake # macOS
# Debian/Ubuntu: apt install libgmp-dev libntl-dev libsodium-dev cmake
# 2. Clone the fork and build
git clone https://github.com/alinush/cpp-lwevss
cd cpp-lwevss
mkdir -p build && cd build && cmake .. && make -j lwe-pvss-main && cd ..
# 3. Run sequentially (NOT in parallel: memory-bandwidth contention causes noise)
for tn in "6 8" "11 16" "22 32" "43 64" "85 128" "169 256"; do
set -- $tn
./build/lwe-pvss-main $2 $1
done
We also include the original numbers from the [GHL21e] paper1 and compare them to Chunky, making sure to be generous and use a higher threshold $t$ for Chunky.
| Scheme | $\ell$ | Setup | Transcript size | Deal (ms) | Verify (ms) | Decrypt-share (ms) |
|---|---|---|---|---|---|---|
| Chunky | 32 | 683-out-of-1024 / 1024 players | 1.15 MiB | 1,972.80 | 252.06 | 3.12 |
| [GHL21e]1 | N/A | 512-out-of-1024 / 1024 players | $\approx$300 KiB (3.93x) | 34,000 (17.24x) | 20,000 (79.35x) | 1.4 (2.23x) |
The C++ implementation uses a faster elliptic curve (Curve25519) compared to us (pairing-friendly BLS12-381).
However, the actual implementation is naive: it (likely inefficiently) re-implements Bulletproofs from scratch using libsodium and does not leverage MSMs.
Nonetheless, the paper stresses that “only about 25-30% of the prover time and about 15% of the verifier time was spent performing scalar-point multiplications on the curve”, suggesting that MSM/Bulletproof speed-ups would not make a huge difference.
I attempted optimizing their implementation, but it was fairly non-trivial: improvements there would warrant a separate publication.
cgVSS notes
The share-correctness proof uses polynomial commitments in $\Gr_1$ of BLS12-381; only the encryption layer and the associated ZK proof live in the class group.
To reproduce, clone the alinush/cgdkg_artifact repo, which is a small modification of mine that:
- adds the PVSS abstraction in `classgroup/src/pvss.rs`
- adds PVSS benchmarks in `benches/benchmarks_pvss.rs`
- TODO: swaps MIRACL for `blstrs`, which is faster
…and run:
# macOS/arm64 may need: -L /opt/homebrew/lib -L /opt/homebrew/opt/openssl@3/lib
cargo bench --bench benchmarks_pvss
Future work
- It would be intriguing to optimize a Groth16-based approach for PVSS, like the one by Nicholas Gailly here.
Acknowledgements
The weighted PVSS in this blog post has been co-designed with Rex Fernando and Wicher Malten at Aptos Labs. The weighted DKG built on top of the PVSS has been co-designed with Daniel Xiang and Balaji Arun. Thanks to Ittai Abraham for helping me think through the DKG protocol from the lens of validated Byzantine agreement. Thanks to Wicher Malten for the initial write-up of Chunky 2, which I later modified.
Appendix: The $i’\gets \mathsf{idx}(i,j,k)$ indexing
It may be easiest to understand the $\idx(i,j,k) = (W_i + (j-1))\cdot m + k$ formula by considering an example.
Say the number of chunks per share is $m = 3$ and that we have $n=4$ players with weights $[ w_1, w_2, w_3, w_4 ] = [2, 1, 3, 2]$
Then, the cumulative weights will be $[ W_1, W_2, W_3, W_4 ] = [ 0, 2, 3, 6 ]$
“Flattening out” the shares, we’d get:
Player 1:
s_{1,1,1}, s_{1,1,2}, s_{1,1,3},
1 2 3
s_{1,2,1}, s_{1,2,2}, s_{1,2,3},
4 5 6
Player 2:
s_{2,1,1}, s_{2,1,2}, s_{2,1,3},
7 8 9
Player 3:
s_{3,1,1}, s_{3,1,2}, s_{3,1,3},
10 11 12
s_{3,2,1}, s_{3,2,2}, s_{3,2,3},
13 14 15
s_{3,3,1}, s_{3,3,2}, s_{3,3,3},
16 17 18
Player 4:
s_{4,1,1}, s_{4,1,2}, s_{4,1,3},
19 20 21
s_{4,2,1}, s_{4,2,2}, s_{4,2,3},
22 23 24
Observations:
- Player $i$’s share chunks start at index $W_i\cdot m + 1$.
- To get to the chunks of the $j$th share (of player $i$) add $(j-1)\cdot m$ to that.
- To get to the $k$th chunk (of the $j$th share of player i) add $k-1$ to that.
So:
\begin{align}
\idx(i,j,k)
&= (W_i \cdot m + 1) + (j-1) \cdot m + (k-1)\\
&= W_i \cdot m + (j-1) \cdot m + k\\
&= (W_i + (j-1)) \cdot m + k
\end{align}
For example, when $i = 3, j = 3, k = 2$, we get:
\begin{align}
(W_i + (j-1)) \cdot m + k
&= (W_3 + (3-1)) \cdot 3 + 2\\
&= (3 + 2) \cdot 3 + 2\\
&= 5 \cdot 3 + 2 = 17
\end{align}
as expected for $s_{3,3,2}$.
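The observations above can be folded into a few lines of code. A sketch (with our hypothetical helper `idx`) that recomputes the example:

```rust
// The idx(i, j, k) = (W_i + (j−1))·m + k formula, checked against the
// flattened example above (m = 3 chunks, weights [2, 1, 3, 2]).
fn idx(cum_weights: &[u64], m: u64, i: usize, j: u64, k: u64) -> u64 {
    (cum_weights[i - 1] + (j - 1)) * m + k // i, j, k are 1-based
}

fn main() {
    let weights = [2u64, 1, 3, 2];
    // W_i = w_1 + … + w_{i−1}, so W = [0, 2, 3, 6]
    let mut cum = vec![0u64];
    for w in &weights[..weights.len() - 1] {
        cum.push(cum.last().unwrap() + w);
    }
    assert_eq!(cum, [0, 2, 3, 6]);
    assert_eq!(idx(&cum, 3, 1, 1, 1), 1);  // s_{1,1,1} is the 1st chunk
    assert_eq!(idx(&cum, 3, 3, 3, 2), 17); // s_{3,3,2}, as in the example
    assert_eq!(idx(&cum, 3, 4, 2, 3), 24); // s_{4,2,3} is the last chunk
}
```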
Appendix: Chunky 2
We present a modified version of Chunky with a 13% faster verifier, henceforth called Chunky 2.
To avoid redundancy, we describe only the modifications we made, rather than restating the entire algorithm from scratch.
The key idea is to modify the ElGamal-to-KZG relation $\emph{\Retk}$ from Eq. $\ref{rel:e2k}$ to also prove that the $\widetilde{V}_{i,j}$ share commitments from Eq. $\ref{eq:share-commitments}$ are computed correctly. This will speed up Step 2 in $\pvss.\verify$, which checks that what’s encrypted in the $C_{i,j,k}$’s from Eq. \ref{eq:share-ciphertexts} is what’s committed in the $\widetilde{V}_{i,j}$’s.
The modified $\Retknew$ relation follows below, with changes $\bluedashedbox{\text{highlighted in blue}}$:
\begin{align}
\term{\Retknew}\left(\begin{array}{l}
\stmt = \left(G, H, \ck, \{\ek_i\}_i,\{C_{i,j,k}\}_{i,j,k}, \{R_{j,k}\}_{j,k}, C, \bluedashedbox{\{\widetilde{V}_{i,j}\}_{i,j}} \right),\\
\witn = \left(\{s_{i,j,k}\}_{i,j,k}, \{r_{j,k}\}_{j,k}, \rho\right)
\end{array}\right) = 1\Leftrightarrow\\
\Leftrightarrow\left\{\begin{array}{rl}
(C_{i,j,k}, R_{j,k}) &= E.\enc_{G,H}(\ek_i, s_{i,j,k}; r_{j,k})\\
C& = \dekart_2.\commit(\ck, \{s_{i,j,k}\}_{i,j,k}; \rho)\\
\bluedashedbox{\widetilde{V}_{i,j}} & = \bluedashedbox{\left( \sum_{k \in [m]} B^{k-1} s_{i,j,k} \right) \cdot \widetilde{G}}
\end{array}\right.
\end{align}
This modification moves almost all verification work in Step 2 of $\pvss.\verify$ into the dealing algorithm. Furthermore, it reduces total computation across the dealing and verification algorithms. We summarize below:
| Scheme | Proving work | Verification work | Transcript size change |
|---|---|---|---|
| Chunky | 0 | $\vmsmOne{W\cdot m} + \vmsmTwo{W} + \multipair{2}$ | 0 |
| Chunky 2 | $\GmulTwo{W}$ | $\vmsmTwo{2W+1}$ | ${} + W |\Gr_2|$ |
The $\Sigma$-protocol verifier extra work will be of the form $\psi(\mathbf{\sigma}) \equals \mathbf{A} + e\cdot [\widetilde{V}_{i,j}]_{i,j}$ and can be done in a size-$(2W+1)$ MSM because the group elements in $\psi(\mathbf{\sigma})$ will all have the same base $\widetilde{G}$.
Then, we modify Step 6 of the $\pvss.\deal$ algorithm to prove this new relation:
\begin{align}
\bluedashedbox{\piSoknew} &\gets \sok.\prove\left(\begin{array}{l}
\bluedashedbox{\Retknew}, \ctx,\\
G, H, \ck, \{\ek_i\}_i,\{C_{i,j,k}\}_{i,j,k}, \{R_{j,k}\}_{j,k}, C, \bluedashedbox{\{\widetilde{V}_{i,j}\}_{i,j}},\\
\{s_{i,j,k}\}_{i,j,k}, \{r_{j,k}\}_{j,k}, \rho
\end{array}\right)
\end{align}
Then, we modify Step 4 of the $\pvss.\verify$ algorithm to verify the proof from above:
\begin{align}
\textbf{assert}\ &\sok.\verify\left(\begin{array}{l}
\bluedashedbox{\Retknew}, \ctx,\\
G, H, \ck, \{\ek_i\}_i,\{C_{i,j,k}\}_{i,j,k}, \{R_{j,k}\}_{j,k}, C, \bluedashedbox{\{\widetilde{V}_{i,j}\}_{i,j}};\\
\bluedashedbox{\piSoknew}
\end{array}\right) \equals 1
\end{align}
Lastly, we remove Step 2 of the $\pvss.\verify$ algorithm, since the check is now performed above.
References
For cited works, see below 👇👇
1. Practical Non-interactive Publicly Verifiable Secret Sharing with Thousands of Parties, by Craig Gentry, Shai Halevi and Vadim Lyubashevsky, in Cryptology ePrint Archive, Report 2021/1397, 2021, [URL]
2. The 300 KiB proof size is just mentioned in passing in the introduction. It is unclear whether they actually measured it correctly: is this the size of the publicly-verifiable transcript that includes all encryptions and proofs for all users?
3. Non-interactive distributed key generation and key resharing, by Jens Groth, in Cryptology ePrint Archive, Report 2021/339, 2021, [URL]
4. DekartProof: Efficient Vector Range Proofs and Their Applications, by Dan Boneh, Trisha Datta, Rex Fernando, Kamilla Nazirkhanova and Alin Tomescu, in Cryptology ePrint Archive, Paper 2025/1159, 2025, [URL]
5. Golden: Lightweight Non-Interactive Distributed Key Generation, by Benedikt Bünz, Kevin Choi and Chelsea Komlo, in Cryptology ePrint Archive, Paper 2025/1924, 2025, [URL]
6. Non-interactive VSS using Class Groups and Application to DKG, by Aniket Kate, Easwar Vivek Mangipudi, Pratyay Mukherjee, Hamza Saleem and Sri Aravinda Krishnan Thyagarajan, in Cryptology ePrint Archive, Paper 2023/451, 2023, [URL]
7. Publicly Verifiable Secret Sharing over Class Groups and Applications to DKG and YOSO, by Ignacio Cascudo and Bernardo David, in Cryptology ePrint Archive, Paper 2023/1651, 2023, [URL]
8. Linearly Homomorphic Encryption from DDH, by Guilhem Castagnos and Fabien Laguillaumie, in Cryptology ePrint Archive, Report 2015/047, 2015, [URL]
9. Bulletproofs: Short Proofs for Confidential Transactions and More, by B. Bünz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille and G. Maxwell, in 2018 IEEE Symposium on Security and Privacy (SP), 2018
10. On Signatures of Knowledge, by Melissa Chase and Anna Lysyanskaya, in Cryptology ePrint Archive, Report 2006/184, 2006, [URL]
11. SCRAPE: Scalable Randomness Attested by Public Entities, by Ignacio Cascudo and Bernardo David, in Applied Cryptography and Network Security, 2017
12. Distributed Randomness using Weighted VRFs, by Sourav Das, Benny Pinkas, Alin Tomescu and Zhuolun Xiang, in Cryptology ePrint Archive, Paper 2024/198, 2024, [URL]
13. In an abundance of caution, in Aptos, we require that $Q$ contains $>$ 66% of the stake.
14. Recall that in Aptos, we will safely reuse the validator signing keys as encryption keys.
15. This can be viewed through the lens of collecting $f+1$ attestations in validated Byzantine agreement (VABA).
16. This may require that each validator $i’$ poll other validators for the transcripts in the proposed set $Q$ that $i’$ is missing.
17. If $i’$ receives two transcripts signed by the same validator $j’$, then that constitutes equivocation and would be provable misbehavior. So $i’$ should (or may?) not attest to $Q$ since it includes a malicious player $j’$.
18. Short Signatures from the Weil Pairing, by Dan Boneh, Ben Lynn and Hovav Shacham, in Advances in Cryptology — ASIACRYPT 2001, 2001
19. Constructing Elliptic Curves with Prescribed Embedding Degrees, by Paulo S. L. M. Barreto, Ben Lynn and Michael Scott, in Cryptology ePrint Archive, Paper 2002/088, 2002, [URL]
20. Technically, they have to add a dummy proof to the subtranscript, obtaining a proper transcript, which they can now feed in to $\pvssDecrypt$ in a type-safe way.