🌪Tornado Cash 101🌪

I hope this article helps you understand how Tornado Cash works on a technical level. I’ll assume you are already familiar with Smart contracts, merkle trees, and have a basic understanding of Zero Knowledge Proofs (ZKP) or Circom circuits.

A Tornado Cash system lets an address A deposit some constant amount of ETH (say 1 ETH), and lets another address B withdraw it, breaking the link between these two addresses. On a high level, this is how it works:

Before a deposit, A generates some secret data, hashes it and sends it to Tornado Cash’s contract C along with 1 ETH.
C adds this hash of secret data into its data structure.
A shares the secret data with B.
B uses ZKP to prove to C that it knows the secret data, and sends the proof to C. The secret data is kept private from C and any observer using ZKP.
C verifies the proof, and if correct, transfers 1 ETH to B, Otherwise it rejects the proof as invalid and no ETH is transferred.

The rest of the focus of this article is to understand the setup that makes this entire thing possible, i.e.:

B can withdraw from C only if it proves that it knows a secret corresponding to some deposit.
Once a secret is used to withdraw, it cannot be used again. Otherwise, it would be possible to drain all the ETH deposited in C. You may recognize this as the double spending problem.

So, let’s dive right into it…

Secret, Hashes, Merkle tree

We’ll first forgo the double spending issue and focus on how someone knowing a valid secret data can withdraw from C without revealing the secret. Tornado Cash system has 2 components. The first is a smart contract deployed on-chain. It’s responsible for handling deposits and withdraws. The second is a ZK-SNARK circuit written in Circom. When this circuit is compiled, it also outputs a smart contract verifier that can verify the validity of the proof.

Depositor’s actions

Before depositing 1 ETH to C, A generates some random bytes called secret. It remains private to A. Now, to make a deposit, A performs these steps:

Computes h = Hash(secret). Hash() is a hashing function, and secret is called a pre-image of h in the context of Hash(). Tornado cash uses MiMCSponge() hash function provided by Circom library.
Executes a transaction calling C.deposit(h) and also sends 1 ETH along with it.

On-chain actions on deposit

Now that C.deposit(h) has been called, we’ll see what happens in that function. Internally, C maintains a merkle tree where each leaf corresponds to a successful deposit, or is otherwise empty which will be replaced in future with a successful deposit. The tree has a fixed size already determined at the time of contract deployment. So the number of deposits the contract can have is already pre-determined. We can index each leaf if you imagine them to be in an array leaf. If the number of total leaves is n, leaf[0] is the first leaf, and leaf[n-1] is the last leaf.

C ensures that 1 ETH has been sent to it, otherwise it reverts.
Let’s say i deposits have already happened, then leaf[i] is assigned the value of h and the merkle root is re-computed. It’s the same as inserting h into the merkle tree at i-th leaf.
A Deposit event is emitted logging the index i, h and the new root.

Note that, we have only revealed h = Hash(secret). Since we have used a hash function, it’s not possible to compute the pre-image secret from h, if sufficient randomness was used to generate secret.

ZK-SNARK circuits - a short detour

The rest of the article depends on some knowledge of ZK-SNARK or Circom circuits. If you have written a Circom circuit before, you can skip this section.

A circuit represents a computation on given public and private inputs, and the constraints among these inputs, the intermediate values and the final output. One party supplies the public and private inputs to the circuit and generates a proof(a sequence of bytes). Hence, this party is called the prover.

Another party when supplied with a proof and all the public inputs, verifies the validity of this proof. Thus, the private inputs are not revealed and the verifier can be certain the computation was performed correctly. The proof will be only determined as valid if the performed computation satisfies the constraints defined in the circuit.

Let’s continue with Tornado Cash now…

Withdrawer’s actions

Now B has to withdraw 1 ETH. For this, A reveals secret only to B. Now B performs the following actions:

Since B now knows secret, it computes h = Hash(secret). B observes all the Deposit events emitted so far, and reconstructs the merkle path M from h to its current root.

Tornado Cash provides a Circom circuit takes in the following as input:

secret : private input
root : the public input
merkle path from h to root : private input
recipient address : public input. B passes its addresses for this input. It’ll be clarified later on why it’s required.

The circuit takes the hash of secret, verifies that it’s equal to h, then verifies that the merkle path is indeed a path from h to root. As a result, the circuit generates a proof, which is a sequence of bytes.

Basically, B is proving that: B knows a secret such that its hash is part of this merkle tree. This proof is a blob of bytes, and reveals nothing beyond the validity of the statement above, thanks to ZK-SNARKs.

B passes the proof and all the public inputs to the circuit by executing a transaction C.withdraw(proof, root, recipient).

On-chain actions on withdraw

C.withdraw(proof, root, recipient) has been called by B. These actions follow:

C checks if root is one of the recent roots of the merkle tree. If not, the transaction reverts. The reason it’s not restricted to the current root is because the merkle root might have changed between the time when B picked the root and when C executes this transaction, due to new deposits.
Now, C invokes the verifier to verify the validity of the proof using proof, root, recipient. If proof is indeed valid, C sends 1 ETH to recipient, which is B in this case.

Now, we can understand why the circuit needs recipient as an input. If recipient was not required by a circuit for proof generation, anyone could frontrun this transaction by passing the same proof and root, and withdraw 1 ETH to their address instead.

At this stage, let’s introspect on what we have achieved so far and what’s remaining.

With our current system, we have broken the link between the depositor A, and withdrawer B. We haven’t posted any information on-chain which can link them. There are only two on-chain actions. A calls C.deposit(h), and B calls C.withdraw(proof, root, recipient). root is publicly known, and recipient is just the address which receives the withdrawn funds. ZK-SNARK ensures that there’s nothing revealed by proof.

However, we still need to fix the double spending problem mentioned earlier. B can call C.withdraw() with the same arguments, and it will be able to withdraw 1 more ETH. This can be repeated until it drains all the funds in C. To prevent this, we need to ensure that secret cannot be used more than once…

Nullifier (just a fancy word, I promise)

Somehow we have to indicate that secret has been used after the first withdraw. We know that computing a pre-image of a hash is an impossible task. We just leverage this fact and introduce a new parameter called nullifier. For a long time, I was intimidated by this name, however there’s nothing fancy here. It’s just some minor modifications to the circuit. Here’s how it works:

In addition to secret, we now have another private input nullifier. Where before, the depositor A generated secret and inserted Hash(secret) into the merkle tree, it now generates (secret, nullifier) pair and inserts Hash(secret, nullifier) (hash of the concatenation).
The ZK-circuit now takes two more inputs:
- nullifier : private input
- nullifierHash : public input. It is meant to be equal to Hash(nullifier).
The circuit now verifies that nullifierHash == Hash(nullifier) and h == Hash(secret, nullifier), and then verifies the merkle path from h to root. If everything checks out, it generates a proof.
Since a new public input nullifierHash is added to the circuit, the withdrawer B, now has to pass that in to C.withdraw(). So the call now becomes C.withdraw(proof, root, nullifierHash, recipient).
C now maintains an additional mapping used. withdraw() now first ensures that used[nullifierHash] is false, otherwise it reverts. It means the corresponding deposit has been withdrawn. We’ll explain it later on why that’s the case.
Now the usual proof verification happens, and if its valid, 1 ETH is transferred to recipient, and used[nullifierHash] is set to true.

If you observe the on-chain activities using the same approach we used before, you can convince yourself that there is still no way to link A and B. However, this system now prevents the double spending problem. But how?

The withdrawal will be successful only for someone who knows secret and nullifier. Since, nullifierHash is now marked in the contract as used after the first withdrawal, it becomes impossible to withdraw using the same nullifier again. We managed to prevent the double spending problem without revealing secret!

I have presented a simplified version of Tornado Cash. If you look at its current code at Github, you’ll notice one more feature related to relayer. This enables an address to withdraw even if it doesn’t have any ETH to pay for gas. Hence, it’s a useful feature if you want to move to a new address anonymously.

https://github.com/tornadocash/tornado-core/blob/master/circuits/withdraw.circom#L33-L35

That’s the end 🌪. Hope it was useful, and you enjoyed reading it! 🫂

blockdev