
Zero Knowledge Proof based Gradient Aggregation for Federated Learning: Methodology


Too Long; Didn't Read

Traditional FL solutions rely on the trust assumption of the centralized aggregator, which forms cohorts of clients in a fair and honest manner. However, a malicious aggregator, in reality, could abandon and replace the client’s training models, or launch Sybil attacks to insert fake clients. Such malicious behaviors give the aggregator more power to control clients in the FL setting and determine the final training results. In this work, we introduce zkFL, which leverages zero-knowledge proofs (ZKPs) to tackle the issue of a malicious aggregator during the training model aggregation process.

This paper is available on arxiv under CC BY 4.0 DEED license.

Authors:

(1) Zhipeng Wang, Department of Computing, Imperial College London;

(2) Nanqing Dong, Department of Computer Science, University of Oxford;

(3) Jiahao Sun, Data Science Institute, Imperial College London;

(4) William Knottenbelt, Department of Computing, Imperial College London.

TABLE OF LINKS

Abstract & Introduction

Related Work & Preliminaries

Methodology

Theoretical and Empirical Analysis

Results

Conclusion & References

Methodology

zkFL

As shown in Fig. 1, our zkFL system works as follows:


  1. Setup: n clients and 1 aggregator generate their private/public key pairs and set up communication channels. Each client knows the public keys of the other n − 1 clients, and this setup can be achieved by using a public key infrastructure (PKI).

  2. Local Training, Encrypting, and Signing: During each round, the n clients train their models locally to compute the local model updates w1, w2, . . . , wn. Each client encrypts their update as a Pedersen commitment Enc(wi) = g^{wi} · h^{si}, where g and h are public parameters and si is a random number generated by the client. The client signs the encrypted update with their private key to generate a signature sigi, and then sends the tuple (wi, si, Enc(wi), sigi) of local model update, random number, encrypted local model update, and signature to the aggregator.

  3. Global Aggregation and ZKP Generation: The aggregator aggregates the received local model updates w1, w2, . . . , wn to generate the aggregated global model update w = Σ_{i=1}^{n} wi. The aggregator also computes the encrypted global model update Enc(w) = Π_{i=1}^{n} Enc(wi) and signs it with its private key to generate the signature sig. The aggregator then leverages zk-SNARK to issue a proof π for the following statement and witness (the commitment and aggregation arithmetic is sketched in code after this list):


    where the corresponding circuit C(statement, witness) outputs 0 if and only if:


  4. Global Model Transmission and Proof Broadcast: The aggregator transfers the aggregated global model update w, its encryption Enc(w), and the proof π to the n clients.

  5. Verification: Upon receiving the proof π and the encrypted global model update Enc(w) from the aggregator, the clients verify whether π is valid. Once the verification passes, the clients start their local training based on the aggregated global model update w.
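
To make the commitment and aggregation arithmetic above concrete, the following is a minimal sketch, not the authors' implementation, of the relation the zk-SNARK proof attests to: each client commits to its update with a Pedersen commitment, and the aggregated commitment Enc(w) = Π_{i=1}^{n} Enc(wi) opens to the aggregated update w = Σ_{i=1}^{n} wi under the randomness s = Σ_{i=1}^{n} si. The group parameters, the toy prime modulus, and the scalar stand-ins for model updates are illustrative assumptions; a real deployment would use a standard elliptic-curve group, vector-valued updates, and an actual zk-SNARK prover and verifier rather than the in-the-clear assertion shown here.

```python
# Minimal sketch of the Pedersen-commitment arithmetic behind zkFL.
# Assumptions (not from the paper): a toy multiplicative group mod a Mersenne
# prime, scalar "model updates" instead of weight vectors, and no zk-SNARK;
# the final relation is checked in the clear purely for illustration.
import secrets

# Toy public parameters. A real system would use a standard elliptic-curve
# group and generators g, h with no known discrete-log relation.
P = 2**127 - 1          # prime modulus of the toy group
Q = P - 1               # exponents are taken mod the group order
g, h = 5, 7             # public generators

def commit(w: int, s: int) -> int:
    """Pedersen commitment Enc(w) = g^w * h^s (mod P)."""
    return (pow(g, w, P) * pow(h, s, P)) % P

# --- Clients: local updates w_i, blinding factors s_i, commitments Enc(w_i) ---
n = 4
w_list = [secrets.randbelow(1000) for _ in range(n)]   # stand-in model updates
s_list = [secrets.randbelow(Q) for _ in range(n)]      # per-client randomness
enc_list = [commit(wi, si) for wi, si in zip(w_list, s_list)]

# --- Aggregator: aggregate the updates and the commitments ---
w = sum(w_list)                      # w = sum_i w_i
s = sum(s_list) % Q                  # s = sum_i s_i
enc_w = 1
for c in enc_list:                   # Enc(w) = prod_i Enc(w_i)  (mod P)
    enc_w = (enc_w * c) % P

# --- The relation the zk-SNARK circuit would prove in zero knowledge ---
assert enc_w == commit(w, s), "aggregated commitment does not open to (w, s)"
print("Enc(w) = prod_i Enc(w_i) opens to the aggregated update w =", w)
```

The check works because Pedersen commitments are additively homomorphic: multiplying the per-client commitments yields a commitment to the sum of the committed updates, which is exactly the relation the clients need the aggregator to prove.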

Blockchain-based zkFL


To decrease the computational burden on the clients, we incorporate blockchain technology into our zkFL system. In this approach, the verification of the proofs generated by the aggregator is entrusted to blockchain miners. As illustrated in Fig. 2, the blockchain-based zkFL operates as follows:

  1. Setup: N clients and 1 aggregator generate their private/public key pairs, which correspond to their on-chain addresses.

  2. Selection: For each round, n clients are selected from the N clients via Verifiable Random Functions [24,3]. The n selected clients’ public keys are broadcasted to the underlying P2P network of the blockchain, which will be received and verified by the miners.

  3. Local Training, Encrypting, and Signing: The n selected clients train their models locally to compute the local model updates w1, w2, . . . , wn. Each client encrypts their update as a Pedersen commitment Enc(wi) = g^{wi} · h^{si}, where g and h are public parameters and si is a random number generated by the client. The client signs the encrypted update with their private key to generate a signature sigi, and then sends the tuple (wi, si, Enc(wi), sigi) of local model update, random number, encrypted local model update, and signature to the aggregator.

  4. Global Aggregation and ZKP Generation: The aggregator aggregates the received local model updates w1, w2, . . . , wn to generate the aggregated global model update w = Σ_{i=1}^{n} wi. The aggregator also computes the encrypted global model update Enc(w) = Π_{i=1}^{n} Enc(wi) and signs it with its private key to generate the signature sig. The aggregator then leverages zk-SNARK to issue a proof π for the following statement and witness:


where the corresponding circuit C(statement, witness) outputs 0 if and only if:

  5. Global Model Transmission and Proof Broadcast: The aggregator transfers the aggregated global model update w and its encryption Enc(w) to the n clients, and broadcasts the proof π and the encrypted global model update Enc(w) to the miners over the P2P network.
  6. On-Chain Verification: Upon receiving the proof π and the encrypted global model update Enc(w) from the aggregator, the miners verify π and append Enc(w) to the blockchain if π is valid.
  7. On-Chain Reading: When the next round starts, the newly selected n clients read the blockchain to check whether Enc(w) has been appended on-chain. Once the check passes, the clients start their local training based on the aggregated global model update w. (A simplified mock of the selection and on-chain flow is sketched below.)
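
As a complement, the sketch below mocks the blockchain-specific flow under simplifying assumptions: client selection hashes a round seed with each public key as a crude stand-in for the Verifiable Random Function mentioned above, the chain is an in-memory list, and proof verification is a placeholder callback rather than a zk-SNARK verifier. All class and function names here are hypothetical; the sketch only illustrates how miners gate the appending of Enc(w) on a valid proof and how the next round's clients read it back.

```python
# Mock of the blockchain-specific zkFL steps. Assumptions (not from the paper):
# hash-based selection instead of a real VRF, an in-memory "chain", and a
# placeholder proof-verification callback instead of a zk-SNARK verifier.
import hashlib
from typing import Callable, List

def select_clients(round_seed: bytes, pubkeys: List[bytes], n: int) -> List[bytes]:
    """Pick n of the N registered clients by ranking hash(round_seed || pubkey).
    A crude stand-in for the VRF-based selection described in the paper."""
    ranked = sorted(pubkeys, key=lambda pk: hashlib.sha256(round_seed + pk).digest())
    return ranked[:n]

class MockChain:
    """In-memory stand-in for the blockchain that miners append Enc(w) to."""
    def __init__(self, verify_proof: Callable[[bytes, int], bool]):
        self.blocks: List[int] = []
        self.verify_proof = verify_proof   # zk-SNARK verifier in a real system

    def submit(self, enc_w: int, proof: bytes) -> bool:
        # On-chain verification: miners append Enc(w) only if the proof checks.
        if self.verify_proof(proof, enc_w):
            self.blocks.append(enc_w)
            return True
        return False

    def latest(self) -> int:
        # On-chain reading: the next round's clients fetch the last appended Enc(w).
        return self.blocks[-1]

# Usage with a dummy verifier that accepts a fixed byte string (illustration only).
chain = MockChain(verify_proof=lambda proof, enc_w: proof == b"valid-proof")
selected = select_clients(b"round-1", [bytes([i]) for i in range(10)], n=4)
assert chain.submit(enc_w=12345, proof=b"valid-proof")
assert chain.latest() == 12345
```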