tl;dr: Pairings, or bilinear maps, are a very powerful mathematical tool for cryptography. Pairings gave us our most succinct zeroknowledge proofs^{1}$^,$^{2}$^,$^{3}, our most efficient threshold signatures^{4}, our first usable identitybased encryption (IBE)^{5} scheme, and many other efficient cryptosystems^{6}. In this post, I’ll teach you a little about the properties of pairings, their cryptographic applications and their fascinating history. In fact, by the end of this post, some of you might want to spend a year or two in jail.
Twitter correction: The original tweet announcing this blog post stated that “SNARKs would not be possible without [pairings]”, with the highlighted S meant to emphasize the “succinctness” of such SNARKs. However, thanks to several folks on Twitter, I realized this is not exactly true and depends on what one means by “succinct.” Specifically, “succinct” SNARKs, in the polylogarithmic proof size sense defined by Gentry and Wichs^{7}, exist from a plethora of assumptions, including discrete log^{8} or random oracles^{9}. Furthermore, “succinct” SNARKs, in the sense of $O(1)$ group elements proof size, exist from RSA assumptions too^{10}. What pairings do give us, currently, are SNARKs with the smallest, concrete proof sizes (i.e., in # of bytes).
$$ \def\idt{\mathbb{1}_{\Gr_T}} \def\msk{\mathsf{msk}} \def\dsk{\mathsf{dsk}} \def\mpk{\mathsf{mpk}} $$
Preliminaries
 You are familiar with cyclic groups of prime order (e.g., elliptic curves)
 Let \(\idt\) denote the identity element of the group $\Gr_T$
 Let $x \randget S$ denote randomly sampling an element $x$ from a set $S$
 Recall that $\langle g \rangle = \Gr$ denotes $g$ being a generator of the group $\Gr$
Definition of a pairing
A pairing, also known as a bilinear map, is a function $e : \Gr_1 \times \Gr_2 \rightarrow \Gr_T$ between three groups $\Gr_1, \Gr_2$ and $\Gr_T$ of prime order $p$, with generators $g_1 = \langle \Gr_1 \rangle, g_2 = \langle \Gr_2 \rangle$ and $g_T = \langle \Gr_T \rangle$, respectively.
When $\Gr_1 = \Gr_2$, the pairing is called symmetric. Otherwise, it is asymmetric.
Most importantly, a pairing has two useful properties for cryptography: bilinearity and nondegeneracy.
Bilinearity
Bilinearity requires that, for all $u\in\Gr_1$, $v\in\Gr_2$, and $a,b\in\Zp$:
\[e(u^a, v^b) = e(u, v)^{ab}\]For cryptography purposes, this is the coolest property. For example, this is what enables useful applications like tripartite DiffieHellman.
Nondegeneracy
Nondegeneracy requires that:
\[e(g_1, g_2) \ne \idt\]Why this property? We want nondegeneracy because, without it, it is simple (and useless) to define a (degenerate) bilinear map that, for every input, returns $\idt$. Such a map would satisfy bilinearity, but would be completely useless.
Efficiency
Efficiency requires that there exists a polynomialtime algorithm in the size of a group element (i.e.; in $\lambda = \log_2{p}$) that can be used to evaluate the pairing $e$ on any inputs.
Why this requirement? It precludes trivialbutcomputationallyintractable pairings. (Click to expand.)
For example, let $r$ be a random element in $\Gr_T$.
First, the pairing is defined so that $e(g_1, g_2) = r$.
This way, the pairing satisfies nondegeneracy.
Second, given $(u,v)\in \Gr_1 \times \Gr_2$, an algorithm could spend exponential time $O(2^\lambda)$ to compute the discrete logs $x = \log_{g_1}{(u)}$ and $y = \log_{g_2}{(v)}$ and return $e(u, v) = e(g_1^x, g_2^y) = r^{xy}$.
This way, the pairing satisfies bilinearity because:
\begin{align}
e(u^a, v^b)
&= e\left((g_1^x)^a, (g_2^y)^b\right)\\\
&= e\left(g_1^{(ax)}, g_2^{(by)}\right)\\\
&= r^{(ax)\cdot (by)}\\\
&= \left(r^{xy}\right)^{ab}\\\
&= e(u, v)^{ab}
\end{align}
History
This is my limited historical understanding of pairings, mostly informed from Dan Boneh’s account in this video and from my own research into the relevant literature. Please email me if you are aware of more history and I can try to incorporate it.
A mathematician in prison
The history of (cryptographic) pairings begins with a mathematician named André Weil^{11} who, during World War II, is sent to jail for refusing to serve in the French army. There, Weil, “managed to convince the liberalminded prison director to put [him] in an individual cell where [he] was allowed to keep [..] a pen, ink, and paper.”
Weil used his newlyacquired tools to define a pairing across two elliptic curve groups. However, what was very odd at that time was that Weil put in a lot of effort to make sure his definition of a pairing is computable. And this extra effort is what made today’s pairingbased cryptography possible^{12}.
Go to prison, not to university?
Funnily, Weil’s time in jail was so productive that he began wondering if he should spend a few months there every year. Even better, Weil contemplated if he should recommend to the relevant authorities that every mathematician spend some amount of time in jail. Weil writes:
I’m beginning to think that nothing is more conducive to the abstract sciences than prison.
[…]
My mathematics work is proceeding beyond my wildest hopes, and I am even a bit worried  if it’s only in prison that I work so well, will I have to arrange to spend two or three months locked up every year?
In the meantime, I am contemplating writing a report to the proper authorities, as follows: “To the Director of Scientific Research: Having recently been in a position to discover through personal experience the considerable advantages afforded to pure and disinterested research by a stay in the establishments of the Penitentiary System, I take the liberty of, etc. etc.”
You can read all of this and more in his fascinating autobiography, written from his perspective as a mathematician^{13}.
From breaking cryptography to building cryptography
Weil’s work was the foundation. But three more developments were needed for pairingbased cryptography to rise.
First development: Miller’s algorithm
In 1985, Victor Miller writes up a manuscript showing that Weil’s pairing, which actually involves evaluating a polynomial of exponential degree, can in fact be computed efficiently in polynomial time^{14}.
In December 1984, Miller gave a talk at IBM about elliptic curve cryptography where he claimed that elliptic curve discrete logs were more difficult to compute than ordinary discrete logs over finite fields^{15}. Miller was challenged by Manuel Blum, who was in the audience, to back up this claim by giving a reduction: i.e., showing that an algorithm $B$ for solving discrete log on elliptic curves can be efficiently turned into another algorithm $A$ for solving discrete logs in finite fields. Such a reduction would imply the problem solved by $B$ (i.e., computing elliptic curve discrete logs) is at least as hard, if not harder, than $A$’s problem (i.e., computing finite field discrete logs).
Miller set out to find a reduction by thinking about the only thing that related an elliptic curve group and a finite field: the Weil pairing. Funnily, what he quickly realized was that, although the Weil pairing gives a reduction, it’s in the opposite direction: i.e., it turned out an algorithm $A$ for discrete log in finite fields could be efficiently turned into an algorithm $B$ for discrete logs in elliptic curves with the help of the Weil pairing. This “unwanted” reduction is easy to see. Since $e(g^a, g) = e(g,g)^a$, solving discrete log on the elliptic curve element $g_a\in \Gr$ is just a matter of solving it on $e(g,g)^a\in \Gr_T$, which is actually a multiplicative subgroup of a finite field (see the inner details of pairings).
This almost showed the opposite of what Miller sought to prove, potentially destroying elliptic curve cryptography, but fortunately the degree of the extension field that the Weil pairing mapped into was too large, making this “unwanted” reduction inefficient and thus not a reduction after all.
This whole affair got Miller interested in seeing if the Weil pairing could be computed efficiently, which led to the discovery of his algorithm. Interestingly, he submitted this manuscript to FOCS, a top theoretical computer science conference, but the paper got rejected and would not be published until much later in the Journal of Cryptology (according to Miller)^{16}.
Second development: MOV attack
In 1991, Menezes, Vanstone and Okamoto^{17} leverage Miller’s efficient algorithm for evaluating the Weil pairing to break the discrete logarithm assumption on certain elliptic curves in subexponential time. This was quite amazing since, at that time, no subexponential time algorithms were known for elliptic curves.
Their attack, called the MOV attack, mapped an elliptic curve discrete logarithm challenge $g^a\in \Gr$ into a target group as $e(g^a, g)=e(g,g)^a \in \Gr_T$ using the pairing. Since the target group was a subgroup of a finite field $\mathbb{F}_q^{k}$, this allowed the use of faster, subexponential time algorithms for computing the discrete log on $e(g,g)^a$.
Third development: Joux’s tripartite DiffieHellman
So far, pairings only seemed useful for cryptanalysis. No one knew how to use them for building (instead of breaking) cryptography.
This changed in 2000, when Joux^{18} used pairings to implement a 1round keyexchange protocol between three parties, or tripartite DiffieHellman. Previously, such 1round protocols were only known between two parties while three parties required 2 rounds.
From there, an abundance of new, efficient cryptography started pouring over:
 BLS (short) signatures^{4}
 identitybased encryption^{5}
 additivelyhomomorphic encryption with support for one multiplication^{19}
 succinct zeroknowledge proofs^{1}
An interesting pattern to notice here is how pairings evolved from a cryptanalytic tool used to break cryptosystems, to a constructive tool used to build cryptosystems. Interestingly, the same pattern also arose in the development of latticebased cryptography.
Arithmetic tricks with pairings
There are a few tricks cryptographers often use when dealing with pairings in their proofs of correctness or security of a cryptosystem.
The most obvious trick, “multiplying in the exponent”, comes from the bilinearity property.
\begin{align} e(u^a, v^b) = e(u, v)^{ab} \end{align}
Bilinearity also implies the following trick: \begin{align} e(u, v^b) = e(u, v)^b \end{align} Or, alternatively: \begin{align} e(u^a, v) = e(u, v)^a \end{align}
Another trick, which is just an analogous way of defining bilinearity, is: \begin{align} e(u, v\cdot w) = e(u, v)\cdot e(u, w) \end{align}
Why does this work? Let $y,z$ denote the discrete logs (w.r.t. $g_2$) of $v$ and $w$, respectively.
Then, we have:
\begin{align}
e(u, v\cdot w)
&= e(u, g_2^y \cdot g_2^z)\\
&= e(u, g_2^{y + z})\\
&= e(u, g_2)^{y + z}\\
&= e(u, g_2)^y \cdot e(u, g_2)^z\\
&= e(u, g_2^y) \cdot e(u, g_2^z)\\
&= e(u, v)\cdot e(u, w)
\end{align}
Or, alternatively: \begin{align} e(u, v / w) = \frac{e(u, v)}{e(u, w)} \end{align}
Applications of pairings
Tripartite DiffieHellman
This protocol was introduced by Joux in 2000^{18} and assumes a symmetric pairing: i.e., where \(\Gr_1 = \Gr_2 = \langle g\rangle \stackrel{\mathsf{def}}{=} \Gr\).
We have three parties Alice, Bob and Charles with secret keys $a, b$, and $c$ (respectively). They send each other their public keys $g^a, g^b, g^c$ and agree on a shared secret key $k = e(g, g)^{abc}$.^{20}
How?
Consider Alice’s point of view. She gets $g^b$ and $g^c$ from Bob and Charles. First, she can use her secret $a$ to compute $g^{ab}$. Second, she can use the pairing to compute $e(g^{ab}, g^c) = e(g, g)^{abc} = k$.
By symmetry, all other players can do the same and agree on the same $k$.
The protocol can be generalized to asymmetric pairings too, where $\Gr_1 \neq \Gr_2$.
BLS signatures
Boneh, Lynn and Shacham give a very short signature scheme from pairings^{4}, which works as follows:
 Assume $\Gr_2 = \langle g_2 \rangle$ and that there exists a hash function $H : \{0,1\}^* \rightarrow \Gr_1$ modeled as a random oracle.
 The secret key is $s \in \Zp$ while the public key is $\pk = g_2^s \in \Gr_2$.
 To sign a message $m$, the signer computes $\sigma = H(m)^s\in \Gr_1$.
 To verify a signature $\sigma$ on $m$ under public key $\pk$, one checks if $e(\sigma, g_2) \stackrel{?}{=} e(H(m), \pk)$
Notice that correctlycomputed signatures will always verify since:
\begin{align}
e(\sigma, g_2) \stackrel{?}{=} e(H(m), \pk) \Leftrightarrow\\
e(H(m)^s, g_2) \stackrel{?}{=} e(H(m), g_2^s) \Leftrightarrow\\
e(H(m), g_2)^s \stackrel{?}{=} e(H(m), g_2)^s \Leftrightarrow\\
e(H(m), g_2) = e(H(m), g_2)
\end{align}
See the BLS paper^{4} for how to prove that no attacker can forge BLS signatures given access to $\pk$ and a signing oracle.
Cool properties of BLS signatures
BLS signatures are quite amazing:
 They are one of the simplest schemes to implement, given access to an ellipticcurve library.
 One can aggregate many signatures from different public keys on the same message $m$ into a single multisignature that continues to verify using just 2 pairings.
 One can even aggregate such signatures across different messages into a single aggregate signature.
 However, such aggregate signatures take $n+1$ pairings to verify.
 One can easily and efficiently^{21} build a threshold BLS signature scheme, where any subset of $\ge t$ out of $n$ signers can collaborate to sign a message $m$ but no less than $t$ can ever produce a valid signature.
 Even better, BLS threshold signatures are deterministic and give rise to threshold verifiable random functions (VRFs) which are useful for generating randomness on chain.
 One can define veryefficient blindvariants of BLS signatures, where the signer can sign a message $m$ without learning the message $m$.
 BLS signatures are very efficient in practice.
 As far as I remember, they are the most efficient scheme for (1) multisignatures, (2) aggregate signatures and (3) threshold signatures
 For singlesigner BLS, they are slower than Schnorr signatures over nonpairingfriendly curves
If you find yourself confused between the various notions of multisignatures, aggregate signatures and threshold signatures, see my slides.
Identitybased encryption (IBE)
In an IBE scheme, one can encrypt directly to a userfriendly email address (or a phone number), instead of a cumbersome public key which is difficult to remember or typein correctly.
Boneh and Franklin give a very efficient IBE scheme from pairings^{5}.
For IBE to work, a trusted thirdparty (TTP) called a private key generate (PKG) must be introduced, who will issue secret keys to users based on their email addresses. This PKG has a master secret key (MSK) $\msk \in \Zp$ with an associated master public key (MPK) $\mpk = g_2^s$, where $\langle g_2 \rangle = \Gr_2$.
The $\mpk$ is made public and can be used to encrypt a message to any user given their email address. Crucially, the PKG must keep the $\msk$ secret. Otherwise, an attacker who steals it can derive any user’s secret key and decrypt everyone’s messages.
As you can tell the PKG is a central point of failure: theft of the $\msk$ compromises everyone’s secrecy. To mitigate against this, the PKG can be decentralized into multiple authorities such that a threshold number of authorities must be compromised in order to steal the $\msk$.
Let $H_1 : \{0,1\}^* \rightarrow \Gr_1^*$ and $H_T : \Gr_T \rightarrow \{0,1\}^n$ be two hash functions modeled as random oracles.
To encrypt an $n$bit message $m$ to a user with email address $id$, one computes:
\begin{align}
g_{id} &= e(H_1(id), \mpk) \in \Gr_T\\
r &\randget \Zp\\
\label{eq:ibectxt}
c &= \left(g_2^r, m \xor H_T\left(\left(g_{id}\right)^r\right)\right) \in \Gr_2\times \{0,1\}^n
\end{align}
To decrypt, the user with email address $id$ must first obtain their decryption secret key $\dsk_{id}$ from the PKG. For this, we assume the PKG has a way of authenticating the user, before handing them their secret key. For example this could be done via email.
The PKG computes the user’s decryption secret key as: \begin{align} \dsk_{id} = H_1(id)^s \in \Gr_1 \end{align}
Now that the user has their decryption secret key, they can decrypt the ciphertext $c = (u, v)$ from Equation $\ref{eq:ibectxt}$ as: \begin{align} m &= v \xor H_T(e(\dsk_{id}, u)) \end{align}
You can see why correctlyencrypted ciphertexts will decrypt successfully, since:
\begin{align}
v \xor H_T(e(\dsk_{id}, u))
&= \left(m \xor H_T\left((g_{id})^r \right)\right) \xor H_T\left(e(H_1(id)^s, g_2^r) \right)\\
&= \left(m \xor H_T\left(e(H_1(id), \mpk )^r \right)\right) \xor H_T\left(e(H_1(id), g_2 )^{rs}\right)\\
&= m \xor \left(H_T\left(e(H_1(id), g_2^s)^r \right) \xor H_T\left(e(H_1(id), g_2 )^{rs}\right)\right)\\
&= m \xor \left(H_T\left(e(H_1(id), g_2 )^{rs}\right) \xor H_T\left(e(H_1(id), g_2 )^{rs}\right)\right)\\
&= m
\end{align}
To see why this scheme is secure under chosenplaintext attacks, refer to the original paper^{5}.
How do pairings actually work?
Mostly, I have no idea. How come? Well, I never really needed to know. And that’s the beauty of pairings: one can use them in a blackbox fashion, with zero awareness of their internals.
Still, let’s take a small peek inside this black box. Let us consider the popular pairingfriendly BLS12381 curve^{22}, from the family of BLS curves characterized by Barreto, Lynn and Scott^{23}.
Public service announcement: Some of you might’ve heard about BonehLynnShacham (BLS) signatures. Please know that this is a different BLS than the BLS in BarrettoLynnScott curves. Confusingly, both acronyms do share one author, Ben Lynn. (In case this was not confusing enough, wait until you have to work with BLS signatures over BLS12381 curves.)
For BLS12381, the three groups $\Gr_1, \Gr_2, \Gr_T$ involved are:
 The group $\Gr_1$ is a subgroup of an elliptic curve $E(\F_q) = \left\{(x, y) \in (\F_q)^2\ \vert\ y^2 = x^3 + 4 \right\}$ where $\vert\Gr_1\vert = p$
 The group $\Gr_2$ is a subgroup of a different elliptic curve $E’(\F_{q^2}) = \left\{(x, y) \in (\F_{q^2})^2\ \vert\ y^2 = x^3 + 4(1+i) \right\}$ where $i$ is the square root of $1$ and $\vert\Gr_2\vert = p$.
 The group $\Gr_T$ are all the $p$th roots of unity in $\F_{q^{k}}$, where $k=12$ is called the embedding degree
How does the pairing map across these three groups work? Well, the pairing $e(\cdot,\cdot)$ expands to something like: \begin{align} \label{eq:pairingdef} e(u, v) = f_{p, u}(v)^{(q^k  1)/p} \end{align} It’s useful to know that computing a pairing consists of two steps:
 Evaluating the base $f_{p, u}(v)$, also known as a Miller loop, in honor of Victor Miller’s work
 Raising this base to the constant exponent $(q^k  1)/p$, also known as a final exponentiation.
 This step is a few times more expensive than the first
For more on the internals, see other resources^{24}$^,$^{25}$^,$^{26}.
Implementing pairingbased crypto
This section discusses various implementationlevel details that practitioners can leverage to speed up their implementations.
Use asymmetric pairings!
The pairing over BLS12381 is asymmetric: i.e., $\Gr_1 \ne \Gr_2$ are two different groups (of the same order $p$). However, there also exist symmetric pairings where $\Gr_1 = \Gr_2$ are the same group.
Unfortunately, “such symmetric pairings only exist on supersingular curves, which places a heavy restriction on either or both of the underlying efficiency and security of the protocol”^{27}. In other words, such supersingular curves are not as efficient at the same security level as the curves used in asymmetric pairings.
Therefore, practitioners today, as far as I am aware, exclusively rely on asymmetric pairings due to their higher efficiency when the security level is kept the same.
BLS12381 performance
I will give a few key performance numbers for the BLS12381 curve implemented in Filecoin’s (blstrs Rust wrapper around the popular blst library.
These microbenchmarks were run on a 10core 2021 Apple M1 Max using cargo bench
.
Pairing computation times
As explained in Eq. \ref{eq:pairingdef}, a pairing involves two steps:
 a Miller loop computation
 210 microseconds
 a final exponentiation
 276 microseconds
Therefore, a pairing takes around 486 microseconds (i.e., the sum of the two).
Exponentiation times
The $\Gr_T$ microbenchmarks were done by slightlymodifying the blstrs
benchmarking code here.
(See the HTML comments of this page for those modifications.)
 $\Gr_1$ exponentiations are the fastest
 72 microseconds
 $\Gr_2$ exponentiations are around 2$\times$ slower
 136 microseconds
 $\Gr_T$ exponentiations are around 3.5$\times$ slower than in $\Gr_2$
 500 microseconds
Note: These benchmarks pick the exponentiated base randomly and do not perform any precomputation on it, which would speed up these times by $2\times$$4\times$.
Multiexponentiations
This is a wellknown optimization that I’m including for completeness.
Specifically, many libraries allow you to compute a product $\prod_{0 < i < k} \left(g_i\right)^{x_i}$ of $k$ exponentiations much faster than individually computing the $k$ exponentiations and aggregating their product.
For example, blstrs seems to be incredibly fast in this regard:
 a size256 multiexponentiation in $\Gr_1$
 takes 760 microseconds in total, or 3 microseconds per exponentiation!
 done naively, it would take 18.5 milliseconds in total, which is $24\times$ longer
 a size256 multiexponentiation in $\Gr_2$
 takes 1.88 milliseconds in total, or 7.33 microseconds per exponentiation!
 done naively, it would take 35.3 milliseconds, which is $18.8\times$ longer
Group element sizes
 $\Gr_1$ group elements are the smallest
 e.g., 48 bytes for BLS12381 or 32 bytes for BN254 curves^{28}
 $\Gr_2$ group elements are 2$\times$ larger
 e.g., 96 bytes on BLS12381
 $\Gr_T$ elements are 12$\times$ larger
 In general, for a pairingfriendly curve with embedding degree $k$, they are $k$ times larger
Other operations
 $\Gr_1$ multiplications (recall we are using multiplicative notation for groups, not additive notation)
 Normal: 565 nanoseconds (when both points are in projective $(X, Y)$ coordinates)
 Mixed: 438 nanoseconds (when first point is in projective coordinates, second is in affine $(X, Y, Z)$ coordinates)
 Faster, because saves one projectivetoaffine conversion
 $\Gr_2$ multiplications
 Normal: 1,484 nanoseconds
 Mixed: 1,095 nanoseconds
 Hashing to $\Gr_1$ takes around 50 microseconds (not accounting for the extra time required to hash down larger messages using SHA2256)
Switching between $\Gr_1$ and $\Gr_2$
When designing a pairingbased cryptographic protocol, you will want to carefully pick what to use $\Gr_1$ and what to use $\Gr_2$ for.
For example, in BLS signatures, if you want small signatures, then you would compute the signature $\sigma = H(m)^s \in \Gr_1$ and settle for a slightlylarger public key be in $\Gr_2$. On the other hand, if you wanted to minimize public key size, then you would let it be in $\Gr_1$ while taking slightly longer to compute the signature in $\Gr_2$.
Other things will also influence how you use $\Gr_1$ and $\Gr_2$, such as the existence of an isomorphism $\phi : \Gr_2 \rightarrow \Gr_1$ or the ability to hash uniformly into these groups. In fact, the existence of such an isomorphism separates between two types of asymmetric pairings: Type 2 and Type 3 (see Galbraith et al.^{25} for more information on the different types of pairings)
Comparison to nonpairingfriendly elliptic curves
When compared to an elliptic curve that does not admit pairings, pairingfriendly elliptic curves are around two times slower.
For example, the popular primeorder elliptic curve group Ristretto255 offers:
 $\approx 2\times$ faster exponentiations of 40 microseconds
 which can be sped up to 10 microseconds, using precomputation when the exponentiation base is fixed
 32 byte group element sizes
Multipairings
If you recall how a pairing actually works (see Eq. $\ref{eq:pairingdef}$), you’ll notice the following optimization:
Whenever, we have to compute the product of $n$ pairings, we can first compute the $n$ Miller loops and do a single final exponentiation instead of $n$.
This drastically reduces the pairing computation time in many applications.
\begin{align}
\prod_i e(u_i, v_i)
&= \prod_i \left(f_{p, u_i}(v_i)^{(q^k  1)/p}\right)\\
&= \left(\prod_i f_{p, u_i}(v_i)\right)^{(q^k  1)/p}
\end{align}
Conclusion
This blog post was supposed to be just a short summary of the three properties of pairings: bilinearity, nondegeneracy and efficiency.
Unfortunately, I felt compelled to discuss their fascinating history. And I couldn’t let you walk away without seeing a few powerful cryptographic applications of pairings.
After that, I realized practitioners who implement pairingbased cryptosystems might benefit from knowing a little about their internal workings, since some of these details can be leveraged to speed up implementations.
Acknowledgements
I would like to thank Dan Boneh for helping me clarify and contextualize the history around Weil, as well as for his 2015 Simons talk, which inspired me to do a little more research and write this historical account.
Big thanks to:
 Lúcás Meier, Pratyush Mishra, Ariel Gabizon and Dario Fiore for their enlightening points on what “succinct” (S) stands for in SNARKs^{7} and for reminding me that SNARKs with $O(1)$ group elements proof size exist from RSA assumptions^{10}.
 Sergey Vasilyev for pointing out typos in the BLS12381 elliptic curve definitions.
 @BlakeMScurr for pointing out an incorrect reference to Joux’s work^{18}.
 Conrado Guovea for pointing me to Victor Miller’s account of how he developed his algorithm for evaluating the Weil pairing (discussed here).
 Chris Peikert for pointing out that there are plentyfast IBE schemes out there that do not rely on pairings^{29}.
PS: Twitter threads are a pain to search through, so if I missed acknowledging your contribution, please kindly let me know.

Quadratic Span Programs and Succinct NIZKs without PCPs, by Rosario Gennaro and Craig Gentry and Bryan Parno and Mariana Raykova, in Cryptology ePrint Archive, Paper 2012/215, 2012, [URL] ↩ ↩^{2}

Pinocchio: Nearly Practical Verifiable Computation, by Bryan Parno and Craig Gentry and Jon Howell and Mariana Raykova, in Cryptology ePrint Archive, Paper 2013/279, 2013, [URL] ↩

On the Size of PairingBased Noninteractive Arguments, by Groth, Jens, in Advances in Cryptology – EUROCRYPT 2016, 2016 ↩

Short Signatures from the Weil Pairing, by Boneh, Dan and Lynn, Ben and Shacham, Hovav, in Advances in Cryptology — ASIACRYPT 2001, 2001 ↩ ↩^{2} ↩^{3} ↩^{4}

IdentityBased Encryption from the Weil Pairing, by Boneh, Dan and Franklin, Matt, in Advances in Cryptology — CRYPTO 2001, 2001 ↩ ↩^{2} ↩^{3} ↩^{4}

ConstantSize Commitments to Polynomials and Their Applications, by Kate, Aniket and Zaverucha, Gregory M. and Goldberg, Ian, in ASIACRYPT’10, 2010 ↩

Separating Succinct NonInteractive Arguments From All Falsifiable Assumptions, by Craig Gentry and Daniel Wichs, in Cryptology ePrint Archive, Report 2010/610, 2010, [URL] ↩ ↩^{2}

Efficient ZeroKnowledge Arguments for Arithmetic Circuits in the Discrete Log Setting, by Jonathan Bootle and Andrea Cerulli and Pyrros Chaidos and Jens Groth and Christophe Petit, in Cryptology ePrint Archive, Report 2016/263, 2016, [URL] ↩

ComputationallySound Proofs, by Silvio Micali, in Logic Colloquium ‘95: Proceedings of the Annual European Summer Meeting of the Association of Symbolic Logic, 1998, [URL] ↩

Subvector Commitments with Application to Succinct Arguments, by Russell W.F. Lai and Giulio Malavolta, in Cryptology ePrint Archive, Report 2018/705, 2018, [URL] ↩ ↩^{2}

André Weil — Wikipedia, The Free Encyclopedia, by Wikipedia contributors, 2022, [URL] ↩

Thanks to Dan Boneh, who contrasted Weil’s definition with a different one by Shimura from his classic book on modular forms. While Shimura’s definition makes it much easier to prove all the properties of the pairing, it defines a pairing of order $n$ as a sum of $n$ points of order $n^2$. This makes it hopelessly noncomputable. Weil’s definition, on the other hand, involves an evaluation of a very concrete function – there are no exponentialsized sums – but requires much more work to prove all its pairing properties. ↩

The Apprenticeship of a Mathematician, by Weil, Andre, 1992, [URL] ↩

Short Programs for functions on Curves, by Victor S. Miller, 1986, [URL] ↩

Miller tells this story himself in a talk he gave at Microsoft Research on October 10th, 2010. ↩

I am unable to find any trace of Miller’s published work on this beyond the manuscript Boneh published in^{14}. Any pointers would be appreciated. ↩

Reducing Elliptic Curve Logarithms to Logarithms in a Finite Field, by Menezes, Alfred and Vanstone, Scott and Okamoto, Tatsuaki, in ACM STOC, 1991, [URL] ↩

A One Round Protocol for Tripartite Diffie–Hellman, by Joux, Antoine, in Algorithmic Number Theory, 2000 ↩ ↩^{2} ↩^{3}

Evaluating 2DNF Formulas on Ciphertexts, by Boneh, Dan and Goh, EuJin and Nissim, Kobbi, in Theory of Cryptography, 2005 ↩

Typically, there will be some keyderivation function $\mathsf{KDF}$ used to derive the key as $k = \mathsf{KDF}(e(g,g)^{abc})$. ↩

Towards Scalable Threshold Cryptosystems, by Alin Tomescu and Robert Chen and Yiming Zheng and Ittai Abraham and Benny Pinkas and Guy Golan Gueta and Srinivas Devadas, in IEEE S\&P’20, 2020 ↩

BLS12381 For The Rest Of Us, by Ben Edgington, 2022, [URL] ↩

Constructing Elliptic Curves with Prescribed Embedding Degrees, by Paulo S. L. M. Barreto and Ben Lynn and Michael Scott, in Cryptology ePrint Archive, Paper 2002/088, 2002, [URL] ↩

Pairings for cryptographers, by Steven D. Galbraith and Kenneth G. Paterson and Nigel P. Smart, in Discrete Applied Mathematics, 2008, [URL] ↩ ↩^{2}

An Introduction to PairingBased Cryptography, by Alfred Menezes, 2005, [URL] ↩

Subgroup security in pairingbased cryptography, by Paulo S. L. M. Barreto and Craig Costello and Rafael Misoczki and Michael Naehrig and Geovandro C. C. F. Pereira and Gustavo Zanon, in Cryptology ePrint Archive, Paper 2015/247, 2015, [URL] ↩

PairingFriendly Elliptic Curves of Prime Order, by Barreto, Paulo S. L. M. and Naehrig, Michael, in Selected Areas in Cryptography, 2006 ↩

Efficient IdentityBased Encryption over NTRU Lattices, by Léo Ducas and Vadim Lyubashevsky and Thomas Prest, in Cryptology ePrint Archive, Paper 2014/794, 2014, [URL] ↩