I hope this article helps you understand how Tornado Cash works on a technical level. I’ll assume you are already familiar with Smart contracts, merkle trees, and have a basic understanding of Zero Knowledge Proofs (ZKP) or Circom circuits.
A Tornado Cash system lets an address A deposit some constant amount of ETH (say 1 ETH), and lets another address B withdraw it, breaking the link between these two addresses. On a high level, this is how it works:
Before a deposit,
Agenerates some secret data, hashes it and sends it to Tornado Cash’s contractCalong with 1 ETH.Cadds this hash of secret data into its data structure.Ashares the secret data withB.Buses ZKP to prove toCthat it knows the secret data, and sends the proof toC. The secret data is kept private fromCand any observer using ZKP.Cverifies the proof, and if correct, transfers 1 ETH toB, Otherwise it rejects the proof as invalid and no ETH is transferred.
The rest of the focus of this article is to understand the setup that makes this entire thing possible, i.e.:
Bcan withdraw fromConly if it proves that it knows a secret corresponding to some deposit.Once a secret is used to withdraw, it cannot be used again. Otherwise, it would be possible to drain all the ETH deposited in
C. You may recognize this as the double spending problem.
So, let’s dive right into it…
We’ll first forgo the double spending issue and focus on how someone knowing a valid secret data can withdraw from C without revealing the secret. Tornado Cash system has 2 components. The first is a smart contract deployed on-chain. It’s responsible for handling deposits and withdraws. The second is a ZK-SNARK circuit written in Circom. When this circuit is compiled, it also outputs a smart contract verifier that can verify the validity of the proof.
Before depositing 1 ETH to C, A generates some random bytes called secret. It remains private to A. Now, to make a deposit, A performs these steps:
Computes
h = Hash(secret).Hash()is a hashing function, andsecretis called a pre-image ofhin the context ofHash(). Tornado cash usesMiMCSponge()hash function provided by Circom library.Executes a transaction calling
C.deposit(h)and also sends 1 ETH along with it.
Now that C.deposit(h) has been called, we’ll see what happens in that function. Internally, C maintains a merkle tree where each leaf corresponds to a successful deposit, or is otherwise empty which will be replaced in future with a successful deposit. The tree has a fixed size already determined at the time of contract deployment. So the number of deposits the contract can have is already pre-determined. We can index each leaf if you imagine them to be in an array leaf. If the number of total leaves is n, leaf[0] is the first leaf, and leaf[n-1] is the last leaf.
Censures that 1 ETH has been sent to it, otherwise it reverts.Let’s say i deposits have already happened, then
leaf[i]is assigned the value ofhand the merkle root is re-computed. It’s the same as insertinghinto the merkle tree at i-th leaf.A
Depositevent is emitted logging the indexi,hand the newroot.
Note that, we have only revealed h = Hash(secret). Since we have used a hash function, it’s not possible to compute the pre-image secret from h, if sufficient randomness was used to generate secret.
The rest of the article depends on some knowledge of ZK-SNARK or Circom circuits. If you have written a Circom circuit before, you can skip this section.
A circuit represents a computation on given public and private inputs, and the constraints among these inputs, the intermediate values and the final output. One party supplies the public and private inputs to the circuit and generates a proof(a sequence of bytes). Hence, this party is called the prover.
Another party when supplied with a proof and all the public inputs, verifies the validity of this proof. Thus, the private inputs are not revealed and the verifier can be certain the computation was performed correctly. The proof will be only determined as valid if the performed computation satisfies the constraints defined in the circuit.
Let’s continue with Tornado Cash now…
Now B has to withdraw 1 ETH. For this, A reveals secret only to B. Now B performs the following actions:
Since
Bnow knowssecret, it computesh = Hash(secret).Bobserves all theDepositevents emitted so far, and reconstructs the merkle pathMfromhto its currentroot.
Tornado Cash provides a Circom circuit takes in the following as input:
secret: private inputroot: the public inputmerkle path from
htoroot: private inputrecipientaddress : public input.Bpasses its addresses for this input. It’ll be clarified later on why it’s required.
The circuit takes the hash of secret, verifies that it’s equal to h, then verifies that the merkle path is indeed a path from h to root. As a result, the circuit generates a proof, which is a sequence of bytes.
Basically, B is proving that: B knows a secret such that its hash is part of this merkle tree. This proof is a blob of bytes, and reveals nothing beyond the validity of the statement above, thanks to ZK-SNARKs.
Bpasses the proof and all the public inputs to the circuit by executing a transactionC.withdraw(proof, root, recipient).
C.withdraw(proof, root, recipient) has been called by B. These actions follow:
Cchecks ifrootis one of the recent roots of the merkle tree. If not, the transaction reverts. The reason it’s not restricted to the current root is because the merkle root might have changed between the time whenBpicked the root and whenCexecutes this transaction, due to new deposits.Now,
Cinvokes the verifier to verify the validity of the proof usingproof, root, recipient. Ifproofis indeed valid,Csends 1 ETH torecipient, which isBin this case.
Now, we can understand why the circuit needs recipient as an input. If recipient was not required by a circuit for proof generation, anyone could frontrun this transaction by passing the same proof and root, and withdraw 1 ETH to their address instead.
At this stage, let’s introspect on what we have achieved so far and what’s remaining.
With our current system, we have broken the link between the depositor A, and withdrawer B. We haven’t posted any information on-chain which can link them. There are only two on-chain actions. A calls C.deposit(h), and B calls C.withdraw(proof, root, recipient). root is publicly known, and recipient is just the address which receives the withdrawn funds. ZK-SNARK ensures that there’s nothing revealed by proof.
However, we still need to fix the double spending problem mentioned earlier. B can call C.withdraw() with the same arguments, and it will be able to withdraw 1 more ETH. This can be repeated until it drains all the funds in C. To prevent this, we need to ensure that secret cannot be used more than once…
Somehow we have to indicate that secret has been used after the first withdraw. We know that computing a pre-image of a hash is an impossible task. We just leverage this fact and introduce a new parameter called nullifier. For a long time, I was intimidated by this name, however there’s nothing fancy here. It’s just some minor modifications to the circuit. Here’s how it works:
In addition to
secret, we now have another private inputnullifier. Where before, the depositorAgeneratedsecretand insertedHash(secret)into the merkle tree, it now generates(secret, nullifier)pair and insertsHash(secret, nullifier)(hash of the concatenation).The ZK-circuit now takes two more inputs:
nullifier: private inputnullifierHash: public input. It is meant to be equal toHash(nullifier).
The circuit now verifies that
nullifierHash == Hash(nullifier)andh == Hash(secret, nullifier), and then verifies the merkle path fromhtoroot. If everything checks out, it generates aproof.Since a new public input
nullifierHashis added to the circuit, the withdrawerB, now has to pass that in toC.withdraw(). So the call now becomesC.withdraw(proof, root, nullifierHash, recipient).Cnow maintains an additional mappingused.withdraw()now first ensures thatused[nullifierHash]is false, otherwise it reverts. It means the corresponding deposit has been withdrawn. We’ll explain it later on why that’s the case.Now the usual proof verification happens, and if its valid, 1 ETH is transferred to
recipient, andused[nullifierHash]is set to true.
If you observe the on-chain activities using the same approach we used before, you can convince yourself that there is still no way to link A and B. However, this system now prevents the double spending problem. But how?
The withdrawal will be successful only for someone who knows secret and nullifier. Since, nullifierHash is now marked in the contract as used after the first withdrawal, it becomes impossible to withdraw using the same nullifier again. We managed to prevent the double spending problem without revealing secret!
I have presented a simplified version of Tornado Cash. If you look at its current code at Github, you’ll notice one more feature related to relayer. This enables an address to withdraw even if it doesn’t have any ETH to pay for gas. Hence, it’s a useful feature if you want to move to a new address anonymously.
https://github.com/tornadocash/tornado-core/blob/master/circuits/withdraw.circom#L33-L35
That’s the end 🌪. Hope it was useful, and you enjoyed reading it! 🫂
