This paper is available on arxiv under CC 4.0 license.
Authors:
(1) Tariq Bontekoe, University of Groningen;
(2) Dimka Karastoyanova, University of Groningen;
(3) Fatih Turkmen, University of Groningen.
Zero-knowledge proofs and verifiable computing
Privacy-preserving computations
Requirements: an application’s perspective
Verifiable, privacy-preserving computing
Open challenges and future directions
An increasing number of papers discusses the combination of PETs with verifiable computing in settings where both privacy and correctness should be guaranteed. In this section we gather, classify and compare existing works that add (public) verifiability to (decentralized) privacypreserving computations. Moreover, we answer the following question RQ3: How do existing solutions compare with regards to the requirements/properties of Section 5?
To obtain a comprehensive list of current works, we first determined a small set of recent, relevant articles on (publicly) verifiable/auditable privacy-preserving computations. Specifically, we determined at least one very recent ‘seed’ article for different underlying (privacy-preserving) computation techniques, by querying the Google Scholar and Scopus databases. The most recent, at the time of writing, matching verifiable MPC paper [72] was found with the query: (public verifiability OR public auditability) AND (MPC OR SMPC OR multi-party computation OR secure multi-party computation). Two relevant, recent verifiable HE papers [73], [74] were found using: (verifiable OR public verifiability OR public auditability) AND (homomorphic encryption OR fully homomorphic encryption). The final seed paper [54] on DLT based privacy-preserving computations was, up front, known to be recent and relevant, hence no specific query was used.
Subsequently, we checked all papers in the reference list of these ‘seed’ articles to the second degree. Next to this, all articles that referenced our ‘seed’ articles were also checked. After filtering the articles on their relevance, based on title, abstract, and a quick scan, we found 32 relevant works, presenting a novel or improved solution for verifiable privacy-preserving computations.
Table 1 shows an overview of all papers and the constructions used to combine privacy and verifiability, classified according to the underlying computation mechanism that was employed. In the text below, we discuss the solutions in more detail per verifiability mechanism used: ZKPs, MACs, and TEE, and other approaches.
Zero-knowledge proofs. ZKPs are used in MPC, HE, and DLT approaches. Most MPC based solutions rely on some form of LSSS, e.g., Shamir’s secret sharing (SSS) [31] or SPDZ(-like) schemes [32]. Approaches using PHE (e.g., CDN [97]) or GCs were observed less often.
MPC. The majority of verifiable MPC solutions uses commitments to secret shares of input/output values and intermediate computations in combination with NIZKs to guarantee correctness. We distinguish three types of NIZKs that are used with varying properties. The first set of works [40], [72], [82], [84], uses Σ-protocols and the FS heuristic to obtain NIZKs from standard assumptions. Two of the SPDZ(-like) protocols ([40], [82]) additionally rely on MACs similar to those used in the original SPDZ protocol. A downside of Σ-protocols is the large proof size and significant effort required on the verifier’s side for larger computations. Hence, recent works [76], [83] often apply the more efficient compressed Σ-protocol theory [27], or Σ-bullets [26], which provides similar security guarantees.
As an even more efficient approach, other works [6], [77], [78], [79] use distributed, often adaptive, zk-SNARKs to assure correctness. [81] presents an MPC based solution relying on crowd verifiable ZKPs. Meaning that a prover tries convince a group of verifiers (‘crowd’), where each verifier only provides a bounded amount of randomness, thus requiring a mostly honest crowd.
HE. Only in recent years, the first solutions using ZKPs to create verifiable HE have appeared. [88] presents the first zk-SNARK suitable for the BV scheme [98]. It does however require specific setup parameters for the FHE scheme that fit the group orders of the zk-SNARK, leading to decreased efficiency of FHE operations.
[95] improves upon this work by not restricting the HE parameter space. However, both solutions are limited to ‘simpler’ FHE schemes, i.e., without complex ciphertext maintenance operations, since these are not supported by the homomorphic hashing scheme used. [86] proposes a method that does not rely on homomorphic hashes, but rather uses Rinocchio: a zk-SNARK that efficiently operates on rings.
[73, Sec. 5.1] compares Rinocchio against three popular efficient ZKPs and show a decrease in prover time by a factor ∼1.5–25000. An open question for Rinocchio is how to perform ciphertext maintenance operations. This generally necessitates operations not supported by rings or switching between different rings, which is not supported by Rinocchio.
DLT. Some schemes do not rely on a specific PET for privacy-preserving computations. Instead, they combine DLT with ZKP to achieve a similar result. This especially concerns the use of zk-SNARKs, due to their small, often constant, proof size and efficient verification. While most applications focus purely on financial transactions, e.g., Zerocash [7], we also identify works that focus on smart contracts. [26] proposes Σ-bullets to create a privacypreserving payment system in smart contract platforms. The provided functionality does, however, not directly generalize to non-payment related computations. [53] proposes the Hawk framework for verifying generic smart contract functions whilst preserving privacy of input and output using zk-SNARKs. DPC, as proposed in [54], can be seen as an evolution of Hawk that also provides function privacy using composable zk-SNARKs. We note that all of these works only verify local computations executed by a single party. However, due to publishing results and proofs on a blockchain, the entire system could be used for verifiable, privacy-preserving computations with multiple parties.
MACs. Message authentication codes are predominantly used in HE approaches, but also appear for MPC.
MPC. While MACs are often used to achieve active security in MPC, e.g., SPDZ [32], it is not directly clear how to obtain public verifiability, due to often having to rely on a secret key for verification. [5] uses MACs to design a scheme where n clients verifiably outsource a computation to m workers. However, only clients are able to verify the computation, and at least one worker needs to act honestly to guarantee correctness and data privacy.
HE. We see three different approaches for combining HE ciphertexts and MACs, agreeing with [73]: (1) Encryptthen-MAC; (2) Encrypt-and-MAC; (3) MAC-then-Encrypt.
The Encrypt-then-MAC approach is used in [87]. The authors propose to first homomorphically hash BGV [99] ciphertexts, and then add a homomorphic MAC computed from that hash. However, the presented solution is rather limited, and only supports quadratic circuits. Due to the lack of other works using this approach, it is unclear whether it can be generalized to general arithmetic circuits.
Encrypt-and-MAC approaches have a MAC and ciphertext that are independent of one another, both are computed directly from the plaintext. The MAC therefore needs to be both hiding and support the ciphertext operations required for the FHE scheme. [94] introduces a privacypreserving homomorphic MAC that supports multiplication and addition operations. This primitive is subsequently used to construct an efficient Encrypt-and-MAC construction for HE. However, it is unclear how the MAC supports other types of operations that are required for most FHE schemes.
We found the most occurring approach of the three to be MAC-then-Encrypt, where the MAC is computed from the plaintext and concatenated to the plaintext before encryption. [91] is the first to introduce a fully homomorphic MAC scheme that can be used to verify HE computations over boolean circuits. While the MACs of this scheme are succinct, using this approach to verify HE computations is no more efficient than running the computation itself. An improvement upon this scheme, with more efficient verification of HE computations on arithmetic circuits, is proposed in [92]. The approach is however limited to arithmetic circuits of bounded depth. [93], a more recent work using the same approach, does not have any of these limitations and can thus be used to verify HE evaluations of any arithmetic circuit.
None of the above schemes are publicly verifiable, due to requiring access to the full secret key to verify the MAC.
Trusted execution environment. A TEE is a collective name for (mostly) hardware-based solutions that protect data privacy while enabling the use of that data in computations. TEEs require specific hardware, e.g., Intel SGX or AMD SEV, that compute on encrypted data, by only decrypting it inside a secure part of the CPU. Some TEEs also provide attestation mechanisms to verify the code running inside.
[96] proposes Ekiden, a blockchain solution where smart contracts over private data are executed off-chain. Specialized compute nodes with TEEs execute the smart contract and publish a remote attestation to correct execution. The consensus nodes on their turn should have the ability to verify these attestations to update the smart contract results on-chain.
[90] presents a construction for FHE in TEE that makes use of TEE attestation as well. Clients can send their encrypted data to a TEE in the cloud, and only need to attest that it runs exactly the right computations. TEEs naturally support more computations than other approaches, however the supported operations are more limited and significantly slower than when they were executed by regular CPUs. Optimizing FHE computations for TEEs can lead to notable performance improvements [73].
For both approaches it is unclear how to achieve public verifiability, i.e., how attestation can be (efficiently) performed by external parties. Moreover, while TEEs are often relatively efficient compared to other methods, they do require trust in the specialized hardware that is used.
Other. Some works did not fit any of the above categories. Nonetheless, the approaches are worth mentioning.
[75] outsources all computations to a trusted third party (TTP). This TTP produces NIZK proofs to guarantee correctness and is trusted to preserve input privacy. This solution is relatively efficient, however requiring a TTP is unrealistic in many real-world use cases.
[80] introduces a pairing-based instantiation of functiondependent commitments, leading to efficient verification of correctness. However, the proposed construction only supports linear functions and it is unclear whether this can be extended to non-linear functions.
[85] presents an efficient, constant round, publicly verifiable MPC protocol that only makes black-box use of cryptographic primitives. Their solution is based on a BMRtype garbled circuit [37] and requires black-box access to a statistically secure OT scheme and a circular 2-correlation robust hash function. Notably, correctness and public verifiability only hold when there is at least one honest party.
[8] constructs an MPC protocol using an SWHE scheme supporting a single multiplication. Verifiable key switching, using NIZKs, is adopted to support computations with an arbitrary number of multiplications.
[74] presents an approach for verifiable FHE, they call ‘blind hashing’, where the hash of the plaintext is appended to the plaintext before encryption, i.e., this is highly similar to MAC-then-Encrypt. The hash is subsequently verified by also computing the hash of the decrypted result and ensuring it equals the hash that was included in the ciphertext. We note that this preliminary publication lacks security proof or reasoning, and only explains how to perform a single matrix multiplication. It is unclear how this approach extends to other operations, such as needed for ciphertext maintenance and more complex circuits.
Finally, [89] proposes to use Yao’s GC [35] for one-time verifiable computations between a client and server. Subsequently, they propose to add FHE on top to transform this one-time approach into a reusable one, allowing multiple evaluations of the same function on different input data. It should be noted that a different computation needs a new setup, and computing the setup is rather inefficient.
An overview of the asymptotic complexities of each scheme is provided in Table 2. In case the authors did not mention the complexity of their solution in the article, and we could not find it in related works, we marked it as ‘N/A’.
Specifically, we describe the computation ( 1 ) and communication ( 2 ) complexity of the verifiable computation. Whenever possible, we split the costs between those used to create the function result and those used to create the proof or tag needed for verification. If a solution makes use of a generic building block, rather than specifying the exact scheme used, we describe the type of building block used, e.g., LSSS, HE, or zk-SNARK. Since verification ( 3 ) is executed apart from the computation in the case of public verifiability, we list its complexity in a separate column.
Finally, we list the asymptotic number of commitments used by each solution, if they are used and if their complexity is not yet included in the other columns. Commitments lead to additional computation and communication costs, depending on the specific commitment scheme used.
In short, we observe that MPC approaches have a higher communication and a lower computation cost than HE based approaches. Moreover, we see that both MAC and zk-SNARK based approaches have efficient verification, especially when compared to other approaches. MACs do often require more communication than zk-SNARKs, but are more efficient computationally.
Based on our discussion in Section 5 we noted several important aspects that determine the suitability of a scheme for several domains. Table 3 provides an overview of the relevant aspects regarding security and usability. Next to this, we list whether the original work has evaluated their solution in a practical scenario, and whether they provide a publicly available implementation. Below, we compare the different solution approaches by discussing the most striking observations and challenges for future work.
Data privacy 4 . There is an observable difference between MPC- and HE-based solutions regarding the assumptions for which they provide data privacy. This is largely due to the fact that MPC distributes data over multiple parties, whereas HE-based solutions generally send all their data to a central computation server.
In MPC based solutions, privacy depends on the scheme used. While some schemes guarantee privacy when only one of the parties is honest, others require an honest majority.
For HE, data privacy of each ciphertext is guaranteed, and only parties holding the secret key can view the private data. However, in many use cases with distributed data or decentralized computations, inputs must be encrypted under a secret-shared secret key. In that case, the same observations as for MPC-based solutions hold.
Correctness and public verifiability 5 & 6 . While almost all schemes provide unconditional correctness, the same cannot be said for public verifiability. Most HE solutions do not offer public verifiability, in contrast to MPCbased solutions. The cause for this is most likely due to the fact that most HE solutions rely on MACs.
Next to this, the combination of ZKPs and HE schemes is currently too inefficient and constrained.
Security assumptions 7 & 8 . We determine three main groups of security assumptions. The first group relies on TEE for both privacy and correctness. While these schemes do not require any non-standard cryptographic assumptions and are also very efficient, they do require access to and trust in very specific hardware. Due to recent attacks on CPU hardware, e.g., Spectre and Meltdown, parties may be reluctant to use this.
The second group uses zk-SNARKs to achieve public verifiability. These schemes are known to be highly efficient, but have to rely on non-standard cryptographic assumptions, such as knowledge-of-exponent type assumptions, in return. Moreover, many of these schemes rely on a trusted setup to guarantee security, which may be undesirable in many cases.
The last group consists of constructions relying mostly on Σ-protocols and MACs. Such constructions are often less efficient, but they compensate this by relying only on standard cryptographic assumptions, without requiring specific hardware.
Regarding post-quantum security ( 8 ) we note that most HE schemes themselves are by default post-quantum secure and thus many of the MAC-based constructions discussed above are likely to derive this property, even though this is not directly discussed. However, most MPC based constructions are not post-quantum secure. Only [72] makes use of mostly post-quantum secure primitives. Even though it is not fully post-quantum secure, with some small adaptions it likely will be.