tl;dr: Everything I wanted to know but was afraid to ask about circom
.
Notes
As put by Palladino1:
The main challenges [in] Circom is understanding when you are writing constraint-generation code, and when you are writing witness-generation code, and tracking what values are known at compile-time versus only known at runtime. […] Keeping in mind the difference between constraint and witness generation times can guard you against underconstrained computation bugs.
Terminology
- witness generation: computation of hints a.k.a. “witness data” that aids in satisfying the constraints encoded in a circom template
- can be sped up with parallel keyword
- variables
- circom docs say: “variables are identifiers that hold non-signal data and are mutable.”
- but, a more accurate mental model, AFAICT, is that they are compile-time symbolic expressions (e.g., often they are used to symbolically-store a linear combination of signals at compile time; see the Num2Bits example)
- functions: reusable pieces of code used only during witness generation
- signals
- input signals; i.e.,:
- part of the NP relation’s statement or witness
- you cannot do
s <== expression
forsignal input s
; only forsignal s
.
- output signals
- intermediate signals (computed via witness generation)
- input signals; i.e.,:
- template: a family of circuits, parameterized by one or more values known at compile-time (think C++ templates)
- template parameters
- components
- assignment
- constraints
var
versus signal
Give some examples here. Num2Bits is one.
Witness generation versus constraints
Find a more natural circuit that showcases this difference. Not IsZero()
.
As explained by Palladino1:
When writing a circuit you’re writing two different programs, that belong to two different programming paradigms, in a single one.
The most important thing to know: when (not) to use <--
.
Tags
Very useful for safety. See nice article by Hess2.
Buses
Very useful for readable code. See here.
Assert
The assert
keyword is particularly “interesting.”
I think I’m gonna stay away from it.
Quoting from here:
// if all values are known at compile-time the assertion is checked then
// otherwise, the assertion creates a constraint.
assert(a <= b);
assert(a * a == b);
Non-trivial examples
IsZero()
template
This is a template that outputs 1 when the input signal is 0 and outputs 0 otherwise.
It should be the most trivial thing to implement, no?
Yet it turns out that, while checking whether something is zero is trivial (i.e., something === 0
), assigning a signal with the truth value of something === 0
is not as trivial…
(This is because the only operations in circom
are equality checks, addition and multiplication.)
This example will also illustrate the difference between witness generation versus constraining.
1
2
3
4
5
6
7
8
9
10
template IsZero() {
signal input in;
signal output out;
signal inv;
inv <-- in != 0 ? 1/in : 0;
out <== -in * inv + 1;
in * out === 0;
}
How does it work? Here’s a simple analysis:
- When
in
is 0, then:out <== -in * inv + 1
(line 8) impliesout <== 0 * inv + 1
.- Which implies
out <== 1
, ensuring the desired behaviour. - Q: But then
inv
does not even matter for this case? Which is why it is assigned 0? But why 0? It could be assigned anything it seems.
- When
in
is not 0, then:in * out === 0
(line 9) impliesout === 0
.- As a result,
out <== -in * inv + 1
implies0 <== -in * inv + 1
- Which implies
in * inv === 1
- Which can be satisfied, since the witness generation can compute an inverse via the
inv <-- in != 0 ? 1/in : 0
line.
Observation: I find the explanation in circom
’s docs a bit confusing, perhaps even in a dangerous way.
It says: “If in
is 0, then […]. Otherwise, in * inv
is always 1, then out
is 0.”
This could be interpreted to suggest that circom
enforces that inv
be correctly computed as the inverse of in
. Which is not the case.
It’s not that circom guarantees in * inv
will be always 1; it’s that when in
is not zero, the only way to satisfy line 8 and 9’s constraints is to correctly compute the inverse inv
via the witness generation on line 6.
In a way, this rather-elegant IsZero()
template perfectly illustrates how non-intuitive circom
can be when it comes to efficiently-programming ZK relations.
Num2Bits()
template
Let’s also consider the Num2Bits()
example from circom’s docs, which returns the binary representation of its input, as n
bits (although edited with more sane variable names and annotated with comments).
This example illustrates:
- template parameters (e.g.,
n
) - variables versus signals
- witness generation versus constraining
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
template Num2Bits(n) {
signal input in;
signal output out[n];
// will be incrementally updated to eventually store the symbolic expression:
// out[0] * 1 + out[1] * 2 + out[2] * 2^2 + ... + out[n-1] * (2^(n-1))
var acc = 0;
var pow2 = 1; // stores increasing powers of two: 2^0, 2^1, 2^2, ...
for (var i = 0; i < n; i++) {
out[i] <-- (in >> i) & 1; // extracts in's ith bit
out[i] * (out[i] - 1) === 0; // constrains out[i] to be a bit
// appends a "+ out[i] * 2^i" term to the symbolic expression in acc
acc += out[i] * pow2;
// sets this to 2^{i+1}
pow2 = pow2 + pow2;
}
// since acc stores the symbolic expression from above, this constraint
// ensures that the bits in out[i] are indeed the binary representation of in
acc === in;
}
It was actually ChatGPT who helped me understand that acc
merely stores a symbolic expression that is derivable at compile-time (see here), because the circom docs, as far as I could tell, do not do a good job at clarifying the distinction between circom signals and circom variables.
So: “thx chat!”
Conclusion
circom
is an amazing tools for implementing ZK relations that (obviously) makes the developer’s job 10,000x easier than writing R1CS constraints directly.
I am still trying to understand the many pitfalls developers can fall into when writing circom
code, most importantly, under-constraining bugs.
Looking forward to looking at NoirLang to see if it makes it easier to distinguish between constraints and witness generation?
References and resources
For cited works and other resources3$^,$4$^,$5$^,$6$^,$7, see below 👇👇
-
A beginner’s intro to coding zero-knowledge proofs, by Santiago Palladino, May 4th 2023 ↩ ↩2
-
Tag, you’re it: Signal tagging in Circom, by Tjaden Hess, January 2nd, 2024 ↩
-
Breaking down TornadoCash, a live explanation of TorandoCash’s
circom
implementation ↩ -
Ecne, “verify that R1CS equations uniquely determine outputs given inputs (i.e. that the constraints are sound)” ↩
-
CIRCOM: A Robust and Scalable Language for Building Complex Zero-Knowledge Circuits, by Jose L. Muñoz-Tapia and Marta Belles and Miguel Isabel and Albert Rubio and Jordi Baylina, 2022, [URL] ↩