Decentralized artificial intelligence (AI) is one of the most promising trends in the AI space. The hype around decentralized AI has increased lately with the rise in popularity of blockchain technologies. While the value proposition of decentralized AI systems is very clear from a conceptual standpoint, their implementation is full of challenges. Arguably, the biggest challenges of implementing decentralized AI architectures are in the areas of security and privacy.
The foundation of decentralized AI systems is an environment in which different parties such as data providers, data scientists and consumers collaborate to create, train and execute AI models without the need for a centralized authority. That type of infrastructure requires not only establishing unbiased trust between the parties but also solving a few security challenges. Let’s take a very simple scenario of a company that wants to create a series of AI models to detect patterns in its sales data. In a decentralized model, the company will publish a series of datasets to a group of data scientists who will collaborate to create different machine learning models. During that process, the data scientists will interact with other parties that will train and regularize the models. Enforcing the privacy of the data as well as the security of the communications between the different parties is essential to enable the creation of AI models in a decentralized manner.
Traditional cryptography techniques such as symmetric or asymmetric encryption can be useful in decentralized AI scenarios, but they fall short of enabling many of the key requirements of these types of systems. For starters, those techniques require implicit trust between the different parties to exchange the keys used to secure the communication, which is not something you can rely on in decentralized AI architectures. Additionally, any party that holds a decryption key can decrypt the data and access sensitive information. To mitigate those challenges, decentralized AI systems have started to embrace some of the most advanced cryptography techniques coming out of academic research that, when applied correctly, feel a bit like magic 😉 Specifically, there are three security methods that are becoming omnipresent in decentralized AI architectures: homomorphic encryption, GAN cryptography and secure multi-party computations.
Homomorphic encryption can be considered one of the greatest breakthroughs in the cryptography space in the last decade. Let’s illustrate homomorphic encryption using a simple example. Imagine that you own a jewelry design shop in which you use precious metals to create new jewelry. In that environment, you are concerned that some of your designers might steal some of the precious materials they use for their work. To avoid relying on subjective trust, you create a little locked box in which the designers can manipulate the materials and create the new jewelry but can’t take anything out. Homomorphic encryption plays the role of that box for data: other parties can work on the contents while they stay locked.
Mathematically, homomorphism is defined as “a mapping of a mathematical set (such as a group, ring, or vector space) into or onto another set or itself in such a way that the result obtained by applying the operations to elements of the first set is mapped onto the result obtained by applying the corresponding operations to their respective images in the second set”. Homomorphic encryption allows specific types of computations to be carried out on ciphertext, producing an encrypted result that, when decrypted, matches the result of performing the same operations on the plaintext. Specifically, there are two types of homomorphic encryption algorithms:
· Partially Homomorphic Encryption (PHE): Given the encryption of two data primitives a and b, E(a) and E(b), PHE can compute E(a+b) OR E(ab), but not both, without knowing a, b or the private key. PHE is the most common homomorphic encryption technique as it’s considerably less expensive than fully homomorphic encryption models. There are different types of PHE algorithms, including unpadded RSA and ElGamal (both multiplicative) and Paillier (additive).
· Fully Homomorphic Encryption (FHE): Given the encryption of two data primitives a and b, E(a) and E(b), FHE can compute both E(a+b) AND E(ab). FHE was considered impossible until IBM researcher Craig Gentry published his doctoral thesis in 2009, in which he showed a method to construct FHE systems using lattice-based cryptography.
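To make the "E(a+b) OR E(ab)" distinction concrete, here is a minimal sketch of both flavors of PHE in plain Python. It is a toy under loud assumptions: the primes are tiny, the function names (paillier_keygen, paillier_encrypt, paillier_decrypt) are my own rather than any library's API, and nothing here is secure enough for real data.

```python
import math
import secrets

def paillier_keygen(p, q):
    """Toy Paillier key generation from two small primes (NOT secure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (needs Python 3.8+)
    return n, (lam, mu)

def paillier_encrypt(n, m):
    """Encrypt m < n with generator g = n + 1 and fresh randomness r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def paillier_decrypt(n, priv, c):
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

# Additive PHE (Paillier): multiplying ciphertexts adds the plaintexts.
n, priv = paillier_keygen(293, 433)
a, b = 15, 27
c_sum = (paillier_encrypt(n, a) * paillier_encrypt(n, b)) % (n * n)
print(paillier_decrypt(n, priv, c_sum))  # 42, i.e. a + b

# Multiplicative PHE (unpadded RSA): multiplying ciphertexts multiplies
# the plaintexts. Textbook toy key: n = 61 * 53, e = 17, d = 2753.
rsa_n, rsa_e, rsa_d = 3233, 17, 2753
c_prod = (pow(7, rsa_e, rsa_n) * pow(6, rsa_e, rsa_n)) % rsa_n
print(pow(c_prod, rsa_d, rsa_n))  # 42, i.e. 7 * 6
```

In both cases the party doing the multiplication never sees a, b or a private key. Neither scheme can do both operations, and that is exactly the gap FHE closes.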
In a decentralized AI environment, homomorphic encryption allows parties to perform computations on encrypted datasets without having to decrypt the data. Unfortunately, most homomorphic encryption implementations are too expensive to be adopted in mainstream solutions. By expensive I don’t mean hours; I am referring to using an AWS EC2 instance for a year to run some basic computations on homomorphically encrypted data.
Adversarial neural cryptography, or GAN cryptography, is an emerging AI method that uses generative adversarial networks (GANs) to secure communication between different parties. GAN cryptography was pioneered by Google in a 2016 research paper titled “Learning to Protect Communications with Adversarial Neural Cryptography”. The paper proposes a method in which neural networks can dynamically discover new forms of encryption and decryption to protect a communication channel from adversaries trying to break the security scheme.
The setup for the GAN cryptography scenario involved three parties: Alice, Bob, and Eve. Typically, Alice and Bob wish to communicate securely, and Eve wishes to eavesdrop on their communications. Thus, the desired security property is secrecy (not integrity), and the adversary is a “passive attacker” that can intercept communications but that is otherwise quite limited.
In this scenario, Alice wishes to send a single confidential message P to Bob. The message P is an input to Alice. When Alice processes this input, it produces an output C. (“P” stands for “plaintext” and “C” stands for “ciphertext”.) Both Bob and Eve receive C, process it, and attempt to recover P. Let’s represent those computations by PBob and PEve, respectively. Alice and Bob have an advantage over Eve: they share a secret key K. That secret key K is used as an additional input to Alice and Bob.
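The data flow between the three parties can be sketched in a few lines of Python. To be clear about the assumptions: in the paper Alice, Bob and Eve are neural networks that learn their behavior, whereas here I hand-write a simple XOR as a stand-in so that only the interfaces (who receives which inputs) are illustrated. The function names and the sample message are hypothetical.

```python
import secrets

def alice(p: bytes, k: bytes) -> bytes:
    """Alice: (P, K) -> C. XOR is a stand-in for a learned encoder."""
    return bytes(x ^ y for x, y in zip(p, k))

def bob(c: bytes, k: bytes) -> bytes:
    """Bob: (C, K) -> PBob. He holds the shared key K."""
    return bytes(x ^ y for x, y in zip(c, k))

def eve(c: bytes) -> bytes:
    """Eve: C -> PEve. She has no key; this naive guess returns C itself."""
    return c

def bit_errors(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length messages differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

P = b"sales data"
K = secrets.token_bytes(len(P))  # secret shared by Alice and Bob only
C = alice(P, K)

P_bob = bob(C, K)
P_eve = eve(C)

print(bit_errors(P, P_bob))  # 0: Bob recovers P exactly
print(bit_errors(P, P_eve))  # varies with the random key
```

The only structural difference between Bob and Eve is the extra input K, which is the whole advantage the text describes.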
Informally, the objectives of the participants are as follows. Eve’s goal is simple: to reconstruct P accurately (in other words, to minimize the error between P and PEve). Alice and Bob want to communicate clearly (to minimize the error between P and PBob), but also to hide their communication from Eve.
Using generative adversarial network techniques, Alice and Bob were trained jointly to communicate successfully while learning to defeat Eve. Here is the kicker: Alice and Bob have no predefined notion of the cryptography algorithms they are going to use to accomplish their goal, nor of the techniques Eve will use. Following GAN principles, Alice and Bob are trained to defeat the best version of Eve rather than a fixed Eve.
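The objectives of the three participants can be written down as loss functions. The sketch below is modeled on my reading of the loss described in the paper (an L1 reconstruction error for Eve, and for Alice and Bob a reconstruction error plus a term that pushes Eve toward random guessing). Treat the exact form and the function names as assumptions rather than the paper's reference implementation. Bits are encoded as -1 and +1.

```python
def bit_error(p, p_hat):
    """L1 distance between two -1/+1 bit vectors, scaled so each wrong bit counts 1."""
    return sum(abs(a - b) for a, b in zip(p, p_hat)) / 2

def eve_loss(p, p_eve):
    """Eve simply wants to reconstruct P as accurately as possible."""
    return bit_error(p, p_eve)

def alice_bob_loss(p, p_bob, p_eve):
    """Bob should be accurate AND Eve should do no better than chance.

    The second term is zero when Eve gets half of the N bits wrong (random
    guessing) and grows when she is mostly right OR mostly wrong, because an
    Eve who is always wrong could just flip her output.
    """
    n = len(p)
    eve_penalty = (n / 2 - bit_error(p, p_eve)) ** 2 / (n / 2) ** 2
    return bit_error(p, p_bob) + eve_penalty

P = [1, -1, 1, -1]
print(alice_bob_loss(P, P, [1, -1, -1, 1]))   # 0.0: Bob perfect, Eve at chance
print(alice_bob_loss(P, P, P))                # 1.0: Eve reads everything
print(alice_bob_loss(P, P, [-1, 1, -1, 1]))   # 1.0: Eve wrong on every bit
```

During training the two losses are minimized in alternation, so Alice and Bob are always being scored against the current best Eve rather than a fixed one.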
In decentralized AI systems, GAN cryptography can allow different nodes to dynamically secure datasets and models in ways that can be resilient to sophisticated attacks.
Secure Multi-Party Computations
Secure multi-party computation (sMPC) is another security technique that makes it possible to make assertions about datasets without revealing the datasets themselves. sMPC is the foundation of new blockchain protocols such as Enigma. The idea was first introduced in 1982 by computer scientist Andrew Yao as a solution to the famous millionaires’ problem, and Yao generalized it in 1986.
Consider that we have three parties Alice, Bob and Charlie, with respective inputs x, y and z denoting their salaries. They want to find out the highest of the three salaries, without revealing to each other how much each of them makes. Mathematically, this translates to them computing: F(x,y,z) = max(x,y,z)
If there were some trusted outside party (say, they had a mutual friend Tony who they knew could keep a secret), they could each tell their salary to Tony, and he could compute the maximum and tell that number to all of them. The goal of sMPC is to design a protocol where, by exchanging messages only with each other, Alice, Bob, and Charlie can still learn F(x, y, z) without revealing who makes what and without having to rely on Tony. They should learn no more by engaging in their protocol than they would learn by interacting with an incorruptible, perfectly trustworthy Tony.
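A protocol for max needs a secure comparison step, which is too long to show here, so the sketch below swaps in a simpler function: the three parties compute the sum of their salaries (and so the average) using additive secret sharing. The assumptions are that all parties follow the protocol honestly, that no two of them collude, and that the whole exchange is simulated inside one Python process. The names share and secure_sum and the salary figures are hypothetical.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo a large public number

def share(value, n_parties):
    """Split value into n random-looking shares that add up to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(inputs):
    """Simulate the protocol: nobody ever sees another party's raw input."""
    n = len(inputs)
    # Step 1: party i splits its input and sends share j to party j.
    sent = [share(x, n) for x in inputs]
    # Step 2: party j adds the shares it received and publishes the subtotal.
    subtotals = [sum(sent[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # Step 3: anyone can add the public subtotals to get the final result.
    return sum(subtotals) % MODULUS

salaries = {"Alice": 85000, "Bob": 92000, "Charlie": 78000}
total = secure_sum(list(salaries.values()))
print(total)  # 255000
```

Each individual share and each subtotal looks like a uniformly random number, so the parties learn the total and nothing else, which is what a trustworthy Tony would have told them.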
sMPC is already making inroads in decentralized architectures with technologies such as the Enigma blockchain. In the context of decentralized AI architectures, sMPC will allow different parties to make assertions about datasets that can be used on AI models without revealing the datasets to third parties.
Enabling decentralized AI systems requires a strong emphasis on security and privacy. Homomorphic encryption, GAN cryptography and secure multi-party computations are some of the techniques that are starting to become relevant to enable the first wave of decentralized AI platforms.