Semaphore is a zero-knowledge protocol that lets a user prove membership in a group and send a signal without revealing their identity. It has many uses, such as anonymous voting and anonymizing financial transactions, and OpenAI CEO Sam Altman's WorldCoin project uses Semaphore to ensure the anonymity of its users.
The basic logic of Semaphore is very similar to the basic logic of Tornado Cash, but it is much more general and complex. Therefore, before we delve into how Semaphore works, it is worth familiarizing ourselves with Tornado Cash. I have written a full article on the topic, so in this article, I will only briefly discuss the system. For those interested in the details, please read my previous article.
Tornado Cash is a coin mixer designed to make tracking money transfers impossible. To use the system, the user generates a key pair. Using one of the keys, they can deposit a certain amount into the Tornado Cash smart contract, which can then be withdrawn using the other key. The connection between the two keys is proven using a zero-knowledge proof, so no one besides the user can determine which deposit key corresponds to which withdrawal key, making the money transfers untraceable.
The following figure illustrates how the keys are generated:
The user generates a nullifier and a secret, from which the two keys are derived. The key used for the deposit is the commitment hash, created by hashing the secret and the nullifier together. The key used for withdrawal is the nullifier hash. When the user makes a deposit, the smart contract stores the commitment hash in a Merkle tree (I wrote about Merkle trees in more detail in my previous article).
When withdrawing the amount, the user provides the nullifier hash and a zero-knowledge proof. The proof shows that the commitment hash can be derived from the nullifier using the secret, and that the commitment hash is included in the Merkle tree (a Merkle proof). The smart contract stores the nullifier hash, ensuring that it can only be used once, so a deposited amount cannot be withdrawn multiple times. This is a brief summary of how the system works. For a more detailed description, please read my previous article.
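The bookkeeping described above can be sketched in a few lines. This is only a rough model under stated assumptions: SHA-256 stands in for the hash functions of the real system, `MixerModel` is a hypothetical in-memory substitute for the smart contract, and the note is passed to `withdraw` directly, whereas in Tornado Cash a zero-knowledge proof takes its place so that the contract never learns which deposit is being withdrawn.

```typescript
import { createHash, randomBytes } from "node:crypto";
import { strict as assert } from "node:assert";

// Stand-in hash: SHA-256 over the concatenated inputs (an assumption for this sketch).
function hashHex(...parts: Buffer[]): string {
  return createHash("sha256").update(Buffer.concat(parts)).digest("hex");
}

// The two private values the user generates.
interface Note {
  nullifier: Buffer;
  secret: Buffer;
}

function createNote(): Note {
  return { nullifier: randomBytes(31), secret: randomBytes(31) };
}

// Deposit key: hash of the nullifier and the secret.
const commitmentOf = (n: Note): string => hashHex(n.nullifier, n.secret);
// Withdrawal key: hash of the nullifier alone.
const nullifierHashOf = (n: Note): string => hashHex(n.nullifier);

// Hypothetical in-memory model of the contract's bookkeeping.
class MixerModel {
  private commitments = new Set<string>();
  private spentNullifierHashes = new Set<string>();

  deposit(commitment: string): void {
    this.commitments.add(commitment);
  }

  // The real contract receives a zero-knowledge proof instead of the note itself.
  withdraw(note: Note): boolean {
    const nullifierHash = nullifierHashOf(note);
    if (this.spentNullifierHashes.has(nullifierHash)) return false; // already withdrawn
    if (!this.commitments.has(commitmentOf(note))) return false; // unknown deposit
    this.spentNullifierHashes.add(nullifierHash);
    return true;
  }
}

const note = createNote();
const mixer = new MixerModel();
mixer.deposit(commitmentOf(note));
assert.equal(mixer.withdraw(note), true); // first withdrawal succeeds
assert.equal(mixer.withdraw(note), false); // the nullifier hash is already stored
```

The second call fails only because the nullifier hash was recorded, which is exactly the role the stored nullifier hash plays in preventing double withdrawals.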
The logic of Semaphore is very similar but much more general and complex, as can be seen from the following diagram. This is the schematic diagram of the circuit associated with the system:
Here, the user generates two private variables: the Identity Trapdoor and the Identity Nullifier. These two private variables can be seen as a private key. By hashing these two private variables, the Identity Commitment is created, which can also be seen as a form of a public key. This public key can be used for registration, for example, in the case of anonymous voting.
Registration follows the same process as in Tornado Cash: the Identity Commitment is added to a Merkle tree, so when we want to prove membership in the group, we can do so with a Merkle proof. In the circuit, the Merkle proof is defined by the Siblings and Path Indices parameters.
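To make the roles of the siblings and path indices concrete, here is a minimal sketch of a Merkle proof over a small binary tree. It assumes SHA-256 as the node hash and plain strings as placeholder commitments; the function names are my own and are not part of the Semaphore library or circuit.

```typescript
import { createHash } from "node:crypto";
import { strict as assert } from "node:assert";

// Stand-in two-to-one hash (assumption: SHA-256 instead of a circuit-friendly hash).
function hashPair(left: string, right: string): string {
  return createHash("sha256").update(left + right).digest("hex");
}

interface MerkleProof {
  siblings: string[]; // sibling node at each level, leaf level first
  pathIndices: number[]; // 0 = current node is the left child, 1 = right child
}

// Build every level of a full binary tree; leaves.length must be a power of two.
function buildLevels(leaves: string[]): string[][] {
  const levels: string[][] = [leaves];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next: string[] = [];
    for (let i = 0; i < prev.length; i += 2) next.push(hashPair(prev[i], prev[i + 1]));
    levels.push(next);
  }
  return levels;
}

// Collect the sibling and the left/right position at each level on the way up.
function createProof(levels: string[][], leafIndex: number): MerkleProof {
  const siblings: string[] = [];
  const pathIndices: number[] = [];
  let index = leafIndex;
  for (let level = 0; level < levels.length - 1; level++) {
    pathIndices.push(index % 2);
    siblings.push(levels[level][index ^ 1]); // neighbour under the same parent
    index = Math.floor(index / 2);
  }
  return { siblings, pathIndices };
}

// Recompute the root from a leaf and its proof.
function computeRoot(leaf: string, proof: MerkleProof): string {
  let node = leaf;
  for (let i = 0; i < proof.siblings.length; i++) {
    node =
      proof.pathIndices[i] === 0
        ? hashPair(node, proof.siblings[i])
        : hashPair(proof.siblings[i], node);
  }
  return node;
}

const leaves = ["c0", "c1", "c2", "c3"]; // placeholder identity commitments
const levels = buildLevels(leaves);
const root = levels[levels.length - 1][0];
const proof = createProof(levels, 2);

assert.deepEqual(proof.pathIndices, [0, 1]); // left child first, then right child
assert.equal(computeRoot("c2", proof), root); // member: the roots match
assert.notEqual(computeRoot("cX", proof), root); // non-member: the roots differ
```

In Semaphore the same recomputation happens inside the circuit, so the leaf, siblings, and path indices stay private and only the root is public.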
The figure shows that the nullifier hash is not simply generated by hashing the nullifier, as in Tornado Cash, but by hashing the Identity Nullifier together with an External Nullifier. This is beneficial because, unlike in Tornado Cash, the same identity can be reused multiple times. For example, in an anonymous voting system, a user only needs to register once and can then participate in multiple votes. Each vote has its own unique external nullifier, so the user produces a different public nullifier hash for each vote. These hashes are specific to the particular vote and cannot be linked to each other.
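A small sketch may help here. It assumes SHA-256 in place of the circuit's hash function and uses made-up values for the identity nullifier and the vote identifiers; `nullifierHashOf` and `acceptSignal` are hypothetical helper names.

```typescript
import { createHash } from "node:crypto";
import { strict as assert } from "node:assert";

// Stand-in for the circuit's hash (assumption: SHA-256 over the joined inputs).
function nullifierHashOf(externalNullifier: string, identityNullifier: string): string {
  return createHash("sha256")
    .update(`${externalNullifier}|${identityNullifier}`)
    .digest("hex");
}

const identityNullifier = "123456789"; // private, placeholder value

// One external nullifier per vote (hypothetical identifiers).
const voteA = "1212";
const voteB = "1313";

const nullifierHashA = nullifierHashOf(voteA, identityNullifier);
const nullifierHashB = nullifierHashOf(voteB, identityNullifier);

// Different votes yield different public nullifier hashes for the same identity...
assert.notEqual(nullifierHashA, nullifierHashB);
// ...while signaling twice in the same vote reproduces the same value.
assert.equal(nullifierHashOf(voteA, identityNullifier), nullifierHashA);

// A verifier can therefore reject duplicates with a simple set.
const usedNullifierHashes = new Set<string>();
function acceptSignal(nullifierHash: string): boolean {
  if (usedNullifierHashes.has(nullifierHash)) return false;
  usedNullifierHashes.add(nullifierHash);
  return true;
}

assert.equal(acceptSignal(nullifierHashA), true); // first vote in A
assert.equal(acceptSignal(nullifierHashA), false); // second vote in A is rejected
assert.equal(acceptSignal(nullifierHashB), true); // vote in B is still allowed
```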
The last parameter is the Signal Hash. The signal is freely chosen data that can be linked to the zero-knowledge proof. In the case of a vote, for example, it could be the vote itself. If we consider the Identity Trapdoor and Identity Nullifier as a private key, and the Identity Commitment as a public key, then the zero-knowledge proof can be seen as a form of digital signature, as it can only be generated with knowledge of the private data. In this case, the signal is the digitally signed content.
After the theory, let's take a look at what this looks like in practice (the full code is available in this GitHub repository).
The following code snippet shows how to use the Semaphore library off-chain:
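// Semaphore v3 packages (assumed from the API used in this example)
import { Identity } from "@semaphore-protocol/identity"
import { Group } from "@semaphore-protocol/group"
import { generateProof, verifyProof } from "@semaphore-protocol/proof"

const groupId = 1
const merkleTreeDepth = 20
const externalNullifier = 1212
const signal = 1
// random trapdoor and nullifier; these must be kept secret
const identity = new Identity()
// local Merkle tree holding the identity commitments of the members
const group = new Group(groupId, merkleTreeDepth)
group.addMember(identity.commitment)
// zkeyFilePath and wasmFilePath point to the files for the chosen tree depth
const fullProof = await generateProof(identity, group, externalNullifier, signal, {
    zkeyFilePath: zkeyFilePath,
    wasmFilePath: wasmFilePath
})
assert(await verifyProof(fullProof, merkleTreeDepth))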
First, we create an Identity object to generate the random identity nullifier and trapdoor. As mentioned before, these variables are similar to a private key, so we must store them securely, which is not a trivial task. That is why there is the option to initialize the Identity with a fixed value instead of random variables. A typical solution is to sign a constant string with a private key (for example, using MetaMask) and use the signature as the source for the Identity. The nullifier and trapdoor are then not stored at all, but regenerated whenever they are needed. By doing this, our Semaphore identity provides the same level of security as the private key, as the trapdoor and nullifier can only be derived with knowledge of the private key. For example, if our private key is stored on a hardware wallet, then the nullifier and trapdoor can only be generated with the help of the hardware wallet.
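The idea of deriving the identity from a signature can be illustrated as follows. This is a sketch under assumptions, not the library's actual derivation: the labels, the use of SHA-256, and the `deriveSecrets` helper are my own, and the signature is a placeholder string rather than a real wallet signature. It also only works if the wallet returns the same signature every time it signs the same message.

```typescript
import { createHash } from "node:crypto";
import { strict as assert } from "node:assert";

interface IdentitySecrets {
  trapdoor: string;
  nullifier: string;
}

// Derive both secrets from one signature by hashing it with distinct labels.
// The labels and SHA-256 are illustrative assumptions, not the library's scheme.
function deriveSecrets(signature: string): IdentitySecrets {
  const derive = (label: string): string =>
    createHash("sha256").update(`${label}:${signature}`).digest("hex");
  return { trapdoor: derive("trapdoor"), nullifier: derive("nullifier") };
}

// Placeholder for the signature of a constant message.
const signature = "0xsignature-of-a-constant-message";

const first = deriveSecrets(signature);
const second = deriveSecrets(signature);

assert.deepEqual(first, second); // same signature, same identity, nothing to store
assert.notEqual(first.trapdoor, first.nullifier); // the two secrets are independent values
```

Because the derivation is deterministic, losing the derived values is harmless as long as the signing key is still available.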
After creating the Identity, we create the Group. The Group class manages the Merkle tree. The constructor has two parameters: a unique groupId and the depth of the Merkle tree. The depth of the Merkle tree determines the maximum number of members in the group. In the example, the depth is 20, which means that the group can store up to 2^20 identity commitments. We can add identity commitments to the group using the addMember method.
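The relationship between depth and capacity is simple arithmetic: a binary tree of depth d has 2^d leaves. A quick check, with `groupCapacity` as a hypothetical helper name and 16, 20, and 32 as sample depths:

```typescript
import { strict as assert } from "node:assert";

// Number of leaves in a binary Merkle tree of the given depth.
function groupCapacity(depth: number): number {
  return Math.pow(2, depth);
}

assert.equal(groupCapacity(16), 65536);
assert.equal(groupCapacity(20), 1048576); // the depth used in the example
assert.equal(groupCapacity(32), 4294967296);
```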
The zero-knowledge proof is generated with the generateProof function. Its first four parameters are identity, which contains the private variables; group, which is needed to generate the Merkle proof; externalNullifier; and signal. The fifth parameter passes the proving key and a wasm file containing the compiled circuit, which is used to compute the witness. These can be downloaded from here. Separate files are provided for each Merkle tree depth, from 16 to 32. The proving key and wasm file can also be generated from the Semaphore circuit. For more information on this topic, please read my article.
The zero-knowledge proof can be easily validated with the verifyProof function, which has two parameters: the proof and the depth of the Merkle tree. The return value is a boolean that indicates whether the proof is valid.
Things may be a little more complicated if the management of the Merkle tree and the validation of the proof are done on the blockchain (on-chain). This is demonstrated by the following code snippet:
const { semaphore }: { semaphore: Semaphore } = await run(
    "deploy:semaphore", { logs: false }
)
const groupId = 1
const merkleTreeDepth = 20
const externalNullifier = 1212
const signal = 1
let identity = new Identity()
await semaphore["createGroup(uint256,uint256,address)"](
    groupId, merkleTreeDepth, ADMIN.address
)
await semaphore.addMember(groupId, identity.commitment)
// generate proof from events
const group = new Group(groupId, merkleTreeDepth)
const events = await semaphore.queryFilter(
    semaphore.filters["MemberAdded(uint256,uint256,uint256,uint256)"](groupId)
)
for (let event of events) {
    group.addMember(event.args.identityCommitment.toBigInt())
}
assert.equal((await semaphore.getMerkleTreeRoot(groupId)).toBigInt(), group.root)
const fullProof = await generateProof(identity, group, externalNullifier, signal, {
    zkeyFilePath: zkeyFilePath,
    wasmFilePath: wasmFilePath
})
// ---
const transaction = await semaphore.verifyProof(
    groupId,
    fullProof.merkleTreeRoot,
    fullProof.signal,
    fullProof.nullifierHash,
    fullProof.externalNullifier,
    fullProof.proof
)
await expect(transaction)
    .to.emit(semaphore, "ProofVerified")
    .withArgs(
        groupId,
        fullProof.merkleTreeRoot,
        fullProof.nullifierHash,
        fullProof.externalNullifier,
        fullProof.signal
    )
We deploy the Semaphore smart contract in the first line. This is a general-purpose contract on which we can create many different groups, so a single deployed contract can serve all applications without the need for a separate contract for each one.
The Group is created using the createGroup function of the smart contract. Its three parameters are the unique groupId, the depth of the Merkle tree, and the Ethereum address of the group administrator.
We can add a new element to the Merkle tree using the addMember method of the contract, just like in the off-chain solution.
Generating a Merkle proof is more complex than in the off-chain case, as here we first need to query the blockchain to find out which members are in the group. This can be done by querying the MemberAdded events. From these events, we can rebuild the Merkle tree locally using the Group class, check that its root matches the root stored in the contract, and then generate the zero-knowledge proof just as in the off-chain example.
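One detail worth keeping in mind when rebuilding the tree is that the Merkle root depends on the order of the leaves, so the commitments must be inserted in the same order in which they were added on-chain. The sketch below illustrates this with simplified stand-ins: `MemberAddedLog` is a hypothetical shape for a decoded event, SHA-256 replaces the tree's hash function, and empty leaves are padded with a placeholder "0". Sorting by block number and log index is a defensive step for cases where the logs are gathered from several queries or sources.

```typescript
import { createHash } from "node:crypto";
import { strict as assert } from "node:assert";

// Hypothetical shape of a decoded MemberAdded log; the field names are assumptions.
interface MemberAddedLog {
  blockNumber: number;
  logIndex: number;
  identityCommitment: string;
}

function hashPair(left: string, right: string): string {
  return createHash("sha256").update(left + right).digest("hex");
}

// Root of a fixed-depth tree whose unused leaves hold a placeholder zero value.
function merkleRoot(leaves: string[], depth: number): string {
  const size = Math.pow(2, depth);
  if (leaves.length > size) throw new Error("group is full");
  let level = leaves.slice();
  while (level.length < size) level.push("0");
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) next.push(hashPair(level[i], level[i + 1]));
    level = next;
  }
  return level[0];
}

// Restore the on-chain insertion order before rebuilding the tree.
function orderedCommitments(logs: MemberAddedLog[]): string[] {
  return logs
    .slice()
    .sort((a, b) => a.blockNumber - b.blockNumber || a.logIndex - b.logIndex)
    .map((log) => log.identityCommitment);
}

const logs: MemberAddedLog[] = [
  { blockNumber: 12, logIndex: 0, identityCommitment: "c2" },
  { blockNumber: 10, logIndex: 3, identityCommitment: "c0" },
  { blockNumber: 10, logIndex: 7, identityCommitment: "c1" },
];

const members = orderedCommitments(logs);
assert.deepEqual(members, ["c0", "c1", "c2"]);

const depth = 2;
// The same members in a different order give a different root.
assert.notEqual(merkleRoot(members, depth), merkleRoot(["c2", "c0", "c1"], depth));
```

This is also why comparing the local root with the value returned by getMerkleTreeRoot is a useful sanity check before generating the proof.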
The proof can be verified by calling the verifyProof method of the smart contract. If the proof is invalid, the method reverts with an InvalidProof error, and if it is valid, it emits a ProofVerified event. If verifyProof is called from another smart contract to validate the proof, it is sufficient to call it without checking a return value, as the execution will stop if the proof is invalid.
In a nutshell, this is what I wanted to write about the Semaphore protocol. I hope this article will assist in understanding its function, and that many of you will be able to utilize it in your future projects.