Caulk: Lookup Arguments in Sublinear Time Arantxa Zapico∗1, Vitalik Buterin2, Dmitry Khovratovich2, Mary Maller2, Anca Nitulescu3, and Mark Simkin2 1 Universitat Pompeu Fabra† 2 Ethereum Foundation‡ 3 Protocol Labs§ Abstract We present position-hiding linkability for vector commitment schemes: one can prove in zero knowledge that one or m values that comprise commitment cm all belong to the vector of size N committed to in C. Our construction Caulk can be used for membership proofs and lookup arguments and outperforms all existing alternatives in prover time by orders of magnitude. For both single- and multi-membership proofs Caulk beats SNARKed Merkle proofs by the factor of 100 even if the latter instantiated with Poseidon hash. Asymptotically our prover needs O(m2 + m log N ) time to prove a batch of m openings, whereas proof size is O(1) and verifier time is O(log(log N )). As a lookup argument, Caulk is the first scheme with prover time sublinear in the table size, assuming O(N log N ) preprocessing time and O(N ) storage. It can be used as a subprimitive in verifiable computation schemes in order to drastically decrease the lookup overhead. Our scheme comes with a reference implementation and benchmarks. 1 Introduction A vector commitment is a basic cryptographic scheme, which lies at the foundation of numerous constructions and protocols. In a nutshell, a vector commitment is a compact data structure that “contains” a (potentially very big) number of elements and allows proving that a specific element has been committed to it. A natural requirement is that a proof is succinct and unforgeable. A Merkle tree is a well known example of vector commitment. The rise of privacy-preserving applications makes it vital to make proofs zero-knowledge i.e. hiding the element that is asserted to be in the commitment while still establishing a certain relationship, or link, to that element. Thus a vector commitment to v = (v1, . . . , vN ) is linkable if it permits proving that you know a secret si mathematically linked to vi. The simplest example is the proof of authorization where a party proves the knowledge of a secret key beyond one of public key in the set. A more elaborate example is a proof of coin ownership in private cryptocurrencies: with coins stored as hashes of secret k and value v in a list or a tree, prove that you can spend v by showing its k in zero knowledge. A third example is a lookup argument in verifiable computation: prove that intermediate values a1, a2, . . . , ak are all contained in a certain table (e.g., a table of all 16-bit numbers for the purpose of overflow checks in financial or mathematical computations). Applications also include membership proofs, ring signatures, anonymous credentials and other schemes. So far all these examples have been handled by working but not so efficient mechanisms, which limit scalability and adoption. The first version of the Zcash cryptocurrency [ 30] used a SHA-2-based Merkle tree to store the coins and the Groth16 [ 20] SNARK to prove the coin ownership. Relatively heavy machinery of Groth16 and the unfit of SHA-2 to prime-field circuits made the resulting prover time of 40 ∗This work was done while Arantxa Zapico was an intern at the Ethereum Foundation. †arantxa.zapico@upf.edu ‡{v buterin, mary.maller, mark.simkin}@ethereum.org, khovratovich@gmail.com §anca@protocol.ai 1 seconds barely usable. Even the most recent developments of algebraic hashes [1 , 19 ] reduce prover time by the order of magnitude only. Another application of concern, lookup tables, so far has required the generic construction of Plookup [ 17 ], and, the last but not least, makes the prover be at least as big as the table itself no matter how many values they look up. 1.1 Our Contributions In this paper we present a novel construction to add position-hiding linkability for vector commitments, named Caulk, which solves both problems with unprecedented efficiency. We construct a proof of membership with concrete efficiency of the factor of 100x over Poseidon Merkle trees, with the same asymptotic of O(log N ) for N -sized commitments. Our construction naturally extends to proof of subset memberships, thus leading the way to more efficient lookup arguments. We have removed the bottleneck of big tables by achieving the yet impossible O(m log N + m2) cost for m-subvector lookups. The Verifier is succinct as it requires only O(log (log N )) scalar operations as well as constant number of pairings to verify a constant-size proof. We envision the widespread deployment of our construction both in generic lookup-equipped proof systems [17, 27] and specific applications with membership proofs. The prover benefits are even more extreme when compared with Merkle-SNARKs that use SHA-2. Caulk also has constant proof sizes and O(log (log N )) verifier time. It achieves statistical zero-knowledge and soundness in the algebraic group model, and requires a universal setup and O(N ) storage. 1.2 Paper Structure We start with a technical overview of Caulk in Section 2. We present two instances: one for single-element proof of commitment (the m = 1 case), and the other for proving an m-subset: all values committed to cm are elements of the vector committed in C. Related work is discussed in Section 3. Section 4 provides a self-complete description of the tools we use, particularly the KZG commitment and precomputation techniques, and can be skipped by a knowledgeable reader. In Section 5 we identify our constructions as special cases of a more general family of protocols that add what we define as position-hiding linkability to vector commitment schemes. This primitive asserts that all (hidden) entries committed to cm are also (publicly) committed in C. Position-hiding refers to the fact that no information about which elements were taken to construct cm should be leaked. We formalize its definition as well as the security notions to be held. In Section 6 we formally describe Caulk for the case m = 1 and prove it is statistically zero-knowledge and sound in the algebraic group model. As an important building block we also introduce a construction that demonstrates that a pedersen commitment contains a root of unity. In Section 7 we extend Caulk even further to m-subset proofs, with some values possibly repeating. In this scenario Caulk can be seen as a lookup table, and is thus a prover efficient alternative to schemes such as Plookup [17]. We discuss various optimizations in Section 8. Caulk comes with an open source reference implementation in Rust using arkworks library 1. In Section 9 we compare its efficiency with some rival schemes. 2 Caulk in a nutshell We present two constructions, one for the case m = 1, and another for m > 1. The starting point for both of them is a public vector ~c encoded as a polynomial C(X) = ∑N i=1 ciλi(X), where {λi(X)}N i=1 are the Lagrange interpolation polynomials corresponding to some set of roots of unity H = {1, ω, . . . , ωN −1}, with ωN = 1. Both prover and verifier have access to a KZG commitment to C(X), i.e., to a G1-element C = ∑N i=1 ci[λi(x)]1 where x is secret. 1https://github.com/caulk-crypto/caulk 2 2.1 Construction for m = 1 Our idea is for the prover to demonstrate, in zero-knowledge, that a pedersen commitment cm opens to ci and that they know a verifying KZG proof [Qi]1 such that e(C − [ci]1, [1]2) = e([Qi]1, [a(x − ωi)]2), for some blinding factor a. Our prover time is unaffected by the computation of the non-hiding KZG proof [Qi]1 because we can compute all [Qi]1s in advance using N log N group operations as in [ 28 , 14 ]. As a result our prover does require linear storage. First the prover demonstrates knowledge of ci and r such that cm = [ci + hr]1 for unknown h given to them as [h]1 in the setup. We use standard arguments of knowledge for pedersen commitments in order to prove well formation of cm. The challenge is then to prove well formation of a commitment to [a(x − ωi)]2 and prove that ωi is a root of unity i.e. that ωN i = 1. In order to avoid working with unnecessarily big polynomials, we introduce a new subgroup of roots of unity Vn = {1, . . . , σn−1} with n = log (N ) + 6 and σn = 1. We create a polynomial f (X) of degree n such that, using its first 4 coefficients, the prover can convince the verifier that for z(X) of degree 1, z(X) = aX + b = a(X + b a ), f (σ4) = a b . The other coefficients of f (X) are constructed so f (σ4+i) = ( a b )2i , and the last one used to show that ( a b )N = ( b a )N = 1. By iteratively demonstrating that f (σ4+i+1) = f (σ4+i) × f (σ4+i) we can compute the powers of a b up to 2log N = N while performing only O(log(N )) computations. 2.2 Construction for m > 1 We extend our results to m > 1 so that we prove that cm is a commitment to ~a = (ci1 , ci2 , . . . , cim ), where cij is an element in ~c for all j = 1, . . . , m. We commit to ~a as a polynomial φ(X) = ∑ j cij μj (X) where {μj (X)}m j=1 are Lagrange interpolation polynomials over Vm = {ν, . . . , νm−1, νm = 1}. Let I = {i1, . . . , im} ⊂ [N ], where each i ∈ I is included only once; that is, the set of all index i such that ci is an element of ~a, without repetitions. Tomescu et al. observed that multiple KZG evaluation proofs can be naturally aggregated into a single proof [ 28 ]. Denoting τi(X) as the lagrange polynomials that evaluates to 1 at ωi and 0 at all others ωk they show that C(X) − ∑ i∈I ciτi(X) = ∏ i∈I (X − ωi)H(X) if and only if C(ωi) = ci. (1) and that [H(x)]1 can be efficiently computed assuming that the non-hiding KZG proof [Qi]1 have been precomputed (as in the case m = 1). Given access to individual proofs Qi(X), the prover can perform linear combinations with them and compute H(X) in O(|I|) = O(m) time. We discuss this aggregation in more detail in Section 4.6.1. Now the prover generates a hiding commitments to ~cI = (ci)i∈I and zI (X) = ∏ i∈I (X − ωi), and outputs them together with a commitment to the polynomial H(X) such that (1) holds. Concretely we have polynomial equation C(X) − CI (X) = zI (X)H(X) (2) asserted against commitments C, [CI ], [zI ], [H]. It remains to prove that (i) zI (X) has the right form, (ii) [CI ] is a commitment to the same values as cm = ∑m j cij μj (X) but in a different basis: τi vs μj . For the first statement we introduce an auxiliary polynomial u(X) = ∑m j=1 ωij μj (X). We prove that u(X)’s coefficients are N th roots of unity by providing a proof that uj (X) = uj−1(X)uj−1(X) for j = 1, . . . , m, when evaluated at elements in Vm, and showing that u0(X) = u(X) and un(X) = 1. Then it remains to show that zI (X) vanishes at coefficients of u(X) i.e. zI (u(X)) vanishes at Vm. This is done by providing H2(X) such that zI (u(X)) = zVm (X)H2(X). Note that the argument holds also when u(X) has repeated coefficients. The second statement is proven by asserting the polynomial equation CI (u(X)) − φ(X) = m∏ j=1 (X − νj )H(X), 3 Scheme Trusted Params |srs| Proof size Prover work Verifier work Merkle trees + zkSNARKs Updatable m log(N ) 13G1, 8F ̃O(m log(N )) 2P RSA accumulators Yes O(1) 2 G O(log(m)) m exp Caulk single opening (Sec. 6) Updatable O(N ) 6G1, 2G2, 4F ̃O(log(N )) 4P Caulk lookup (Sec. 7) Updatable O(N ) 14G1, 1G2, 4F ̃O(m2 + m log(N )) 4P Table 1: Cost comparison of our scheme with alternative proofs for membership and lookups. N is the size of the table and m the size of the set to be opened. We consider that Merkle trees + zk-SNARKs are implemented using Marlin [13 ] and note that these numbers are different with other SNARKs. Note that the asymptotic prover work for the Merkle trees + zkSNARKs hides the large constants involved in arithmetising hash functions. The RSA accumulator asymptotics hides large constants: for example G denotes a hidden order group that has larger size than G1, G2. thus linking an input φ(X) in the known basis {μj (X)}m j=1 to CI (X) in the unknown basis {τi(X)}i∈I . In our protocol, the procedure above is performed by also including blinders to hide the positions and the values taken to construct zI (X), ~cI and ~a. 3 Related Work Merkle-SNARK Zcash protocol [30 ] proposed a SNARK over a circuit described a Merkle tree opening for the anonymous proof of coin ownership. It remains a very popular approach for various set membership proof protocols [ 29, 31 ]. The prover costs are logarithmic in the number of tree leafs, but the concrete efficiency varies depending on the hash function that comprises the tree [ 1, 19]. Regular hash functions such as SHA-2 are known to be very slow, whereas algebraic alternatives are rather novel and some applications are reluctant to use them. Pairing Based Camenisch et al.[ 10 ] describe a vector commitment that only requires constant prover and verifier costs. However the commitments themselves are computed by a trusted third party and have linear size because the prover requires access to g 1 x−ci for all ci in the vector and x secret. Discrete-Log Based In the discrete-logarithm setting a series of works have looked into achieving logarithmic sized zero-knowledge membership proof [ 3, 21 , 8 , 9]. These have the advantage that there is no trusted setup or pairings. The prover and verifier costs are asymptotically dominated by a linear number of field operations. For modest sized vectors this can be practical because the number of more computationally intensive group operations is logarithmic. RSA Accumulators Camenisch and Lysyanskaya [ 11 ] design a proof of knowledge protocol for linking a commitment over a prime ordered from to an RSA accumulator. There are no a-priori bounds on the size of the vector. This approach is used by Zerocoin [ 25 ] which is a privacy preserving payments system (the predecessor to Zerocash [5]). Benarroch et al. [ 6] improve on this result by allowing the use of prime ordered groups of of “standard” size, e.g., 256 bits, whereas [11 ] needs a much larger group. RSA based schemes have constant sized public parameters. Benarroch et al. [6] and later Campanelli et al. [ 12 ] enjoy a prover time that is over twice as fast as zk-SNARKs over Merkle trees with Pedersen hashes and their implementation has not yet been heavily optimised. The verifier time is also asymptotically and concretely efficient. However they require either a trusted RSA modulus or class groups and the proof size is relatively large (about 2 to 5 KBytes). 4 Preliminaries A bilinear group gk is a tuple gk = (q, G1, G2, GT , e, P1, P2) where G1, G2 and GT are groups of prime order q, the elements P1, P2 are generators of G1, G2 respectively. We also consider ˆP1 another generator of G1 and will denote it as [h]1, where h is unknown and hP1 = ˆP1. e : G1 × G2 → GT is an efficiently computable, non-degenerate bilinear map, and there is an efficiently computable isomorphism between G1 and G2. Elements in Gγ , are denoted implicitly as [a]γ = aPγ , where γ ∈ {1, 2, T } and PT = e(P1, P2). With this notation, e([a]1, [b]2) = [ab]T . When informally introducing techniques, we may use [a] to refer to element a given in either G1 or G2. 4 4.1 Lagrange Polynomials and Roots of Unity We use ω to denote a root of unity such that ωN = 1, and define H = {1, ω, . . . , ωN −1}. Also, we let λi(X) denote the ith lagrange polynomial, i.e., λi(X) = ∏ s6=i−1 X−ωs ωi−1−ωs and zH (X) = ∏N −1 i=0 (X − ωi) = XN − 1 the vanishing polynomial of H. We will additionally consider smaller groups of roots of unity in Sections 6, 7 and 7.1, that will be introduced accordingly. 4.2 Security parameters and algorithms Let λ ∈ N denote the security parameter and 1λ its unary representation. A function negl : N → R is called negligible if for all c > 0, there exists k0 such that negl(k) < 1 kc for all k > k0. For a non-empty set S, let x $ ←− S denote sampling an element of S uniformly at random and assigning it to x. Let PPT denote probabilistic polynomial-time. Algorithms are randomized unless explicitly noted otherwise. Let y ← A(x; r) denote running algorithm A on input x and randomness r and assigning its output to y. Let y $ ←− A(x) denote y ← A(x; r) for a uniformly random r. 4.3 Games Code-based games are used in security definitions [4]. A game Gamesec A (λ), played with respect to a security notion sec and adversary A, has a main procedure whose output is the output of the game. 4.4 Cryptographic Assumptions The security of our protocols holds in the Algebraic Group Model (AGM) of Fuchsbauer et al.( [ 15 ]), using the dlog, qDHE and qSDH assumptions [ 18 , 7 ]. In the AGM adversaries are restricted to be algebraic, namely, when an adversary A gets some group elements as input and outputs another group element, it can provide some algebraic representation of the latter in terms of the former. Definition 4.1 (Algebraic Adversary). Let G be a cyclic group of order p. We say that a PPT adversary A is algebraic if there exists an efficient extractor EA that, given the inputs ([x1], . . . , [xm]) of A, outputs a representation z = (z1, . . . , zm)> ∈ Fm, where F is the finite field of p elements, for every group element [y] in the output of A such that: Advalg G,A(λ) = [ [y] ← A([x1], . . . , [xm]), z ← EA([y], [x1], . . . , [xm]), and [y] 6 = ∑m j=1 zj [xj ] ] = negl(λ). 4.5 The KZG Polynomial Commitment Scheme Our constructions heavily rely on the KZG polynomial commitment scheme (Def. A.3) that we describe below, as well as its adaptation for vector commitment that we explain in the next section. For efficiency, we slightly modify the polynomial commitment in order to add degree checks to the original protocol, without incurring in extra proof elements or pairings. The polynomial commitment introduced by Kate, Zaverucha and Goldberg in [22] is a tuple of algorithms (KZG.Setup, KZG.Commit, KZG.Open, KZG.Verify) such that: • srsKZG ← KZG.Setup(parKZG, d): On input the system parameters and a degree bound d, it outputs a structured reference string srsKZG = ({[xi]1,2}d i=1 ). • C ← KZG.Commit(srsKZG, p(X)): It outputs C = [p(x)]1. • (s, πKZG) ← KZG.Open(srsKZG, p(X), α): Prover computes q(X) = p(X) − p(α) X − α , sets s = p(α), [Q]1 = [q(x)xd−deg +2]1, and outputs (s, πKZG = [Q]1). • 1/0 ← KZG.Verify(srsKZG, C, deg, α, s, πKZG ): Verifier accepts if and only if e(C − s, [xd−deg +2]2) = e([Q]1, [x − α]2). 5 Security. It has been proven in [22 , 13 , 16 ] that the original KZG protocol, i.e., where [Q]1 = [q(x)]1 and the pairing equation is e(C − s, [1]2) = e([Q]1, [x − α]2), is a polynomial commitment scheme that satisfies completeness, evaluation blinding and extractability as in Def. A.3 in the AGM, under the dlog assumption. What is more, Marlin presents an alternative version of KZG with degree checks that does not require additional powers in G2. For our construction, we claim that adding xd−deg +2 to the pairing and element [Q]1 does not affect completeness or extractability. We also argue that under the AGM, no PPT adversary A can break soundness by providing a commitment to a polynomial p(X) such that deg (p) > deg . Indeed, if that is the case, deg (Q) = d + 1 for Q(X) the algebraic representation of [Q]1, which will imply an attack to the d-DHE assumption, as the srs only contains powers [xi]1 up to d. 4.6 KZG as Vector Commitment Scheme There is a natural isomorphism between vectors of size m and polynomials of degree m − 1; where we can represent ~v = (v1, . . . , vm) ∈ Fm as V (X) = ∑m j=1 vj Bj (X), where B = {Bj (X)}m j=1 is a basis of the space of polynomials of degree up to m − 1, and vice versa. This fact implies as well a natural relation between polynomial and vector commitments (Def. A.2), where in particular, the former implies the latter. What is more, when the basis B chosen to encode the vector consists of Lagrange polynomials we have vector commitments with easy individual position openings: evaluating V (X) in the i − 1th interpolation point returns vi. In this work we will use the protocol by Kate et al. for both cases, polynomial and vector commitments. For the latter, we will not only consider individual openings but also subset openings. In particular, let H = {1, ω, . . . , ωN −1} be a set of roots of unity and {λi(X)}N i=1 its corresponding Lagrange interpolation set, with vanishing polynomial zH (X). That is, λi(ωi−1) = 1 and λi(ωj ) = 0 for all j 6 = i − 1. We have that for some polynomial H(X), V (X) − s = (X − ωi)H(X) if and only if V (ωi) = vi+1 = s. For a polynomial VI (X) = ∑ i∈I siτi(X) where si+1 are claimed values for vi+1 and {τi(X)}i∈I the Lagrange interpolation polynomials of the set {ωi}i∈I , V (X) − VI (X) = ∏ i∈I (X − ωi)H(X) iff V (ωi) = vi+1 = si+1 for all i ∈ I. 4.6.1 Multiple Openings A KZG proof of opening can naturally be extended to open one polynomial in many points. Indeed, let p(X) be a polynomial, ~α ∈ Fm a vector of opening points and ~s such that si = p(αi) for all i = 1, . . . , m. Define C~α(X) as the unique polynomial of degree m − 1 such that C~α(αi) = si for all i ∈ [m]. We have that p(αi) = si for all i = 1, . . . , m if and only if there exists q(X) such that p(X) − C~α(X) = m∏ i=1 (X − αi)q(X) We can thus redefine the KZG prover and verifier the following way: • (s, πKZG) ← KZG.Open(srsKZG, p(X), deg, ~α): Prover computes {τi(X)}m i=1 the interpolation La- grange polynomials for the set {αi}m i=1, zα(X) = ∏m i=1(X−αi) and define C~α(X) = ∑m i=1 p(αi)τi(X). Then, it computes q(X) = p(X) − C~α(X) zα(X) , sets si = p(αi), [Q]1 = [q(x)]1, and outputs (~s, πKZG = [Q]1). • 1/0 ← KZG.Verify(srsKZG, C, deg, ~α, ~s, πKZG ): The verifier computes {τi(X)}m i=1, C~α = [C~α(x)]1, [zα(x)]2 and verifies p(X) − C~α(X) = q(X)zα(X) by making the pairing check e(C − C~α, [1]2) = e([Q]1, [zα(x)]2), and outputs 1 if and only if the equation is satisfied and deg(p) ≤ d. 6 4.6.2 KZG for Bivariate Polynomials For the protocol in Section 7.1 we will use bivariate polynomials, or polynomials of higher degree. What this mean is that, if we have a bivariate polynomial P (X, Y ) with degree up to d1 − 1 in X and d2 − 1 in Y then we require a universal setup with d1d2 powers. We work with a version of KZG that uses a univariate setup because these are already available for multiple different curves (i.e. we do not need a specialist setup just for our protocol and can work with prior KZG setups). We observe that, by using the KZG open algorithm, we can commit to P (X, Y ) as [P (xd2 , x)]1. We must open P (X, Y ) in two steps. First we partially open P (X, Y ) at some point X = α to a commitment [P (α, x)]1. The partial proof is given by a commitment [wα(xd2 , x)] to a partial witness wα(X, Y ) = P (X, Y ) − P (α, Y ) X − α We then fully evaluate P (α, X) at Y = β via a standard KZG proof with a degree bound of d2 − 1 on [P (α, x)]1. 4.6.3 Subset openings By applying the ideas above to vector commitments, we have that the KZG scheme can be applied to prove openings to subsets rather than positions. Consider the subset of H {ωi}i∈I = {ωij }m j=1 with I ⊂ [N ] such that |I| = m and its vanishing polynomial zI (X) = ∏ i∈I (X − ωi). We can prove that a commitment C = [C(x)]1 to vector ~v is such that C(ωij ) = sj , i.e., vij +1 = sj for j = 1, . . . , m. The proof is given by Q = [q(x)]1 for q(X) = C(X) − ∑m j=1 sj τj (X) zI (X) . where τj (X) = m∏ k=1,k6 =j X − ωik ωij − ωik is the langrange polynomial evaluating to 0 at ωik for k 6 = j and 1 at ωij . The verifier checks the equation C(X) − m∑ j=1 sj τj (X) = q(X)zI (X) By making the pairing check e(C − [ m∑ j=1 sj τj (x)]1, [1]2) = e(Q, [zI (x)]2) We will additionally use a result by Tomescu et al. [ 28] that allows the prover to compute Q in time O(m log2(m)) given it already has stored proofs Q1, . . . , Qm that c(ωij ) = sj . Indeed the prover sets q(X) = m∑ j=1 qj (X) ∏m k=1,k6 =j (ωij − ωik ) and Q = m∑ j=1 Qj ∏m k=1,k6 =j (ωij − ωik ) This is a correct computation of Q because C(X) − m∑ j=1 sj τj (X) = C(X)(1 − m∑ j=1 τj (X)) + m∑ j=1 (c(X) − sj )τj (X)
j=1 (c(X) − sj )τj (X) 7 where we are using that (1 − ∑m j=1 τj (X)) = 0 for any set of lagrange polynomials. Further m∑ j=1 (C(X) − sj )τj (X) = m∑ j=1 (C(X) − sj ) m∏ k=1,k6 =j (X − ωik ) (ωij − ωik )
m∑ j=1 (C(X) − sj ) (X − ωij ) ∏m k=1,k6 =j (ωij − ωik ) m∏ k=1 (X − ωik )
m∑ j=1 qj (X) ∏m k=1,k6 =j (ωij − ωik ) m∏ k=1 (X − ωik ) The provers asymptotic time is thus dominated by the computation of ∏m k=1,k6 =j 1 (ωij −ωik ) . Remark 1. We remark that precomputing all the proofs Q1, . . . , Qn that c(ωij ) = sj can be achieved in time O(n log n) using techniques by Feist and Khovratovich [ 14 ]. The overview of this technique by Tomescu et al. ([28], Section 3.4.4, “Computing All ui’s Fast”) is explained well. 4.7 Proof of Opening of a Pedersen Commitment Pedersen commitment schemes are a particular case of vector commitments. We will consider them for committing to single values in a zero knowledge way. Thus, the srs will additionally output [h]1 for some secret h and the commitment to some element s is computed as v[1]1 + r[h]1 = [v + hr], for some randomly sampled h ∈ F . We suggest a standard Fiat-Shamired Sigma protocol [ 24] to demonstrate knowledge of v, r such that cm = [v + hr]1 for some v, r: Rped = {(cm; (v, r)) : cm = [v + hr]1} The proof consists of R = [s1 + hs2]1, t1 = s1 + vc and t2 = s2 + rc, where c = H(cm, R) and s1, s2 are elements chosen by the verifier. At the end, the verifier checks that R + c · cm = [t1 + ht2]1. 5 Position-Hiding Linkable Vector Commitments We introduce the concept of position-hiding linkable vector commitment schemes. Informally, two vector commitment schemes VC1 and VC2 are position-hiding linkable if a prover is able to convince a verifier that for a given commitments C corresponding to VC1 and cm corresponding to VC2, it is true that all the elements in the vector committed in cm are also elements of the vector committed in C. Basicallly, position-hiding linkability allows the prover to extract or isolate in zero-knowledge elements from some public set or table, and later prove further attributes on them. This new primitive should satisfy three security notions: completeness, as usual; linkability, that captures the fact that if the proof verifies then there is no element committed in cm that is not also committed in C; and position-hiding, which holds only if no information about the set of elements in C that have been used to construct cm is leaked. Definition 5.1 (Position-Hiding Linkability for Vector Commitments). Two vector commitment schemes VC1 and VC2 are position-hiding linkable if there exist algorithms (Setuplink, Provelink, Verifylink, Simulatelink ) that behave as follows, • Setuplink(1λ, d1, d2) : takes as input the security parameter, bounds on the length of vectors in VC1 and VC2, and outputs common parameters srs that include srs1 = VC1.srs and srs2 = VC2.srs as well as trapdoor x, including the corresponding trapdoors x1 and x2. • Provelink(srs, r, r′, ~v, ~vI ) : on input the srs, commitment randomness r to vector ~v ∈ FN and commit- ment randomness r′ to ~a ∈ Fm, outputs a proof π that there exists some I ⊂ [N ] such that for all j = 1, . . . , m, aj = vi for some i ∈ I. • Verifylink(srs, C, C′, π) : On input the srs, commitments C and C′, and proof π, accepts or rejects. 8 • Simulatelink(x1, x2, C, C′) : On input the trapdoors x1, x2 and commitments C and C′, outputs a simulated proof πsim, and satisfy the following properties: Completeness: For all N, t with t ≤ N , all ~v ∈ FN , and all ~a ∈ Fm it holds that: Pr Verifylink(srs, C, C′, π) = 1 (srs, x) ← Setuplink(1λ, N, t); C ← Commit(srs1, ~v, r); C′ ← Commit(srs2, ~a, r′); π ← Provelink(srs, r, r′, ~v, ~a) = 1. Linkability For all N, m with m ≤ N , and all PPT adversaries, there exists an extractor XA such that: Pr Verifylink(srs, C, C′, π) = 1 ∧ |~v| = N ∧(∃ j ∈ [m] s.t. aj 6 = ci ∀i ∈ [N ] ∨ VC2.Commit(srs2, ~a, r′) 6 = C′) (srs, x) ← Setuplink(1λ, N, m); ~v ← A(srs); C ← VC1.Commit(srs1, ~v); (π, C′) ← A(srs1, srs2, C); (~a, r′) ← XA(C′, π) = negl(λ). Position-Hiding For all N, m with m ≤ N , for all ~v and ~a, all PPT adversaries A, there exists a PPT algorithm Simulatelink such that: A(srs, C, C′, π) = 1 (srs, x) ← Setuplink(1λ, N, m) C ← Commit(srs1, ~v, r) C′ ← Commit(srs2, ~a, r′) π ← Provelink(srs, r, r′, ~v, ~a) ≈c A(srs, C, C′, πsim) = 1 (srs, x) ← Setuplink(1λ, N, m) C ← Commit(srs1, ~v, r) C′ ← Commit(srs2, ~a, r′) πsim ← Simulatelink(x, C, C′) In the next sections, we introduce position-hiding linkability for KZG commitments of arbitrary size and pedersen commitments for single elements (Section 6), as well as for two KZG commitments (Section 7). 6 Linking Vectors with Elements In this section we present a method to link a commitment C to a vector ~c ∈ FN computed as C = [C(x)]1 with C(X) = ∑N i=1 ciλi(X), to a pedersen commitment cm. By this we mean a method for a prover to convince a verifier that there exists an i such that C opens to v at some N th rooth of unity ωi and cm = [v + hr]1. We will consider two groups of roots of unity: • H = {1, ω, . . . , ωN −1} of size N with ωN = 1, Lagrange interpolation polynomials {λi(X)}N i=1 where λi(ωi−1) = 1 and λi(ωj ) = 0 if j 6 = i − 1, and vanishing polynomial zH (X). • Vn = {1, ν, . . . , νn−1} of size n = log (N ) + 6 with νn = 1, Lagrange interpolation polynomials {ρs(X)}n s=1 and vanishing polynomial zVn (X). Our construction can be divided into three main components. The first one is a proof of knowledge for the element v committed in cm, that is a proof for relation Rped as defined in Section 4.7. The second is a modified protocol for computing blinded versions of KZG openings for statements C(ωi) = v that does not reveal the coordinate i + 1 or the evaluation v, which we describe below. The high-level idea here is to re-randomize a regular KZG opening with an additional blinding factor. Our third component then proves that the re-randomized vanishing polynomial used for the KZG opening is well-formed, i.e., a NIZK argument (as in Def. A.1) for the relation Runity = {(srs, [z]2; (a, i)) : [z]2 = [a(x − ωi)]2 ∧ (ωi)N = 1} . 9 6.1 Our Blinded Evaluation Construction Our prover takes (r′ = ⊥, ~c) and (r, v) as input, where the first tuple represents the vector inside the (deterministic) KZG commitment and the second tuple represents the randomness and value for the pedersen commitment. Let C(X) = ∑N i=1 ciλi(X) be the polynomial encoding vector ~c. In a regular KZG opening for position i + 1, the prover would compute q(X) = C(X)−v x−ωi and reveal Q = [q(x)]1. Instead, our prover computes a special kind of obfuscated commitment to ωi by selecting a random a and committing to z(X) = aX − b = a(X − ωi) where ωi = b a , i.e. the commitment contains [z]2 = [z(x)]2. Note that this is simply the KZG cofactor X − ωi multiplied by a blinding factor a. The blinding factor is necessary, because the set {ωi}m−1 i=0 is polynomial sized, so revealing [x − ωi]1 would allow the verifier to do a brute force search to find the index. The prover then computes [T ]1 = [T (x)]1 and [S]2 = [S(x)]2, where T (X) = q(X) a + hs and S(X) = −r − sz(X) and s is a uniformly random value chosen by the prover. T (X) is the KZG quotient polynomial q(X) divided by a (the blinding factor above) to compensate for z(X) having that blinding factor. The additional term [hs]1 mixed in to fully blind the evaluation [ q(X) a ]1 and preserve zero-knowledge. [S]2 is a term that compensates for the h terms in both [T ]1 and cm. In the pairing equation that checks these points, [S]2 will be paired with h to ensure that it can only cancel out terms containing h and cannot make incorrect quotient polynomials appear correct. We also provide two proofs of knowledge πped and πunity as described in Section 4.7 and Section 6.2 respectively. The proof πped is for v, r such that cm = [v + hr]1. The proof πunity is for a, b such that [z]2 = [ax − b]2 and aN = bN . The verifier checks the pairing equation e(C · cm−1, [1]2) = e([T ]1, [z]2) + e([h]1, [S]2). This equation checks that, for the polynomials C(X), T (X), z(X), S(X) encoded in C, [T ]1, [z]2, and [S]2 respectively, it holds that C(X) − v − hr = T (X)z(X) + hS(X). Now with T (X) = q(X) a + sh, z(X) = a(X − ωi), and S(X) = −r − sz(X), this is C(X) − v − hr = ( q(X) a + sh ) z(X) − hr − hsz(X) ⇐⇒ C(X) − v = ( q(X) a ) z(X). The full description of our protocol is given in Figure 1. Theorem 1. Let Rped and Runity be relations for which zero-knowledge argument of knowledge systems are given. The construction in Figure 1 implies position-hiding linkability for the commitment schemes corresponding to C and cm in the algebraic group model under the q-SDH and dlog assumptions. The proof is in Appendix B. 6.2 Correct computation of z(X) The purpose of this section is provide a zero-knowledge proof of knowledge for relation Runity, i.e. that the prover knows a, b such that [z]2 = [ax − b]2 and aN = bN . This proof is used as a subprotocol in Fig. 1’s construction of a linkable vector commitment in Section 6. In order to prove that a b is inside the evaluation domain (i.e. is a root of unity) in zero-knowledge we add another polynomial f (X) of degree n = log (N ) + 6. The polynomial f (X) essentially recovers a b from [z]2 and then includes its powers 2i until i = log (N ). It will be enough then to prove that (i) f (X) is correctly formed with respect to [z]2, (ii) it does indeed contain all 2-powers of a b , and (iii) the coefficient corresponding to ( a b )2log(N ) = ( a b )N equals 1. The core of our construction is the following lemma, that we prove in Appendix C: Lemma 1. Let z(X) be a polynomial of degree 1, n = log (N ) + 6 and σ such that σn = 1. If there exists a polynomial f (X) ∈ F[X] such that 10
