Software eats the world, and Machine Learning pushes it at an exponential speed. The rise of software 2.0 drives the tech industry, and our work at frst.

Federated Learning, a step closer towards confidential AI

Federated Learning enables data scientists to create AI without compromising users' confidentiality. This method is set to disrupt the centralized AI paradigm, in which a better algorithm always comes at the cost of collecting more and more personal data. Thus, Federated Learning allows powerful network effects in industries where data cannot be transferred to third parties for confidentiality reasons (health, banking, etc.).

Simplified to the extreme, creating an AI comes down to solving a mathematical function f(x)=y by observing a high number of examples (x), labelled (y). We say an algorithm is being trained when we are building f, and we speak of inference when we use f to predict a result y for a given x.

Training an AI requires collecting a large volume of data. Due to the high amount of computing power needed, data is most often processed in the cloud thanks to dedicated Machine Learning solutions developed by AWS, Microsoft or Google. This is why AI was built on a centralized architecture: data is collected from users' devices and centralized in the cloud, where the training and inference of algorithms take place.

1. What are the main issues of centralized AI?

Centralized AI is by far the most common architecture. However, by separating AI algorithms from users' devices, we are constantly moving colossal volumes of data. These repetitive transfers create serious constraints:

a. Diminished privacy: the obligation to transfer our data and to have it stored on remote servers creates opportunities for hackers to intercept data and use it inappropriately.

b. Incompatibility with many sectors: for confidentiality reasons, several industries are not able to share their data and store it in the cloud (health sector, insurance, banking, military, etc.). These sectors cannot benefit from centralized Artificial Intelligence.

c. Latency problems that slow inference: centralized AI is inappropriate for many use cases where AI needs to interact in real time with the real world (e.g. autonomous cars).

d. High transfer costs, due to the exploding amount of data that needs to be handled (an autonomous car generates 4,000 GB of data to infer every day).

NB: edge computing, which consists in placing the processing power in the devices, at the edge of the network (e.g. a GPU in an autonomous car), removes issues c and d, but still requires regularly collecting data from users to train and improve the model. In other words, training in edge computing is cloud based; it doesn't solve privacy issues and prevents confidential industries from benefiting from AI (problems a and b).

2. The emergence of Federated Learning

A new training method called Federated Learning, developed by Google and used in its Gboard app, could become the basis of a distributed and confidential AI.

How it works:

Let's take as an example a fleet of phones using a Federated AI that recommends new music to its users:

- The algorithm is downloaded from the cloud on every phone. This is the central algorithm, common to all users.
- This algorithm is continuously trained on the song tracks of each user. The central model becomes local and is personalized to the musical preferences of every user.
- The new learnings obtained from the algorithm on the device of each user, called updates, are sent to the cloud through an encrypted channel. This means only new discoveries are sent to the cloud; personal data does not leave devices.
- Updates are aggregated with the central algorithm. The latter integrates the new learnings as if it were directly trained on the data (just like in a centralized architecture).
- This new central model, obtained through Federated Learning, performs as well as the centralized model. It is then distributed again on every phone, where it completes the local model.
- The AI available on each phone accumulates learnings made from all users, all the while staying personalized to each user.
- This phenomenon repeats itself.

How is Federated Learning an improvement to AI?

- Personal data never leaves the user's device; only updates made to the model are transferred. These updates are encrypted, making it impossible for anyone to intercept the data and reverse engineer it.
- The updates are lighter than the original users' data. Consequently, the overall workload needed is lower in Federated Learning than in cloud-based architectures or in edge computing, which makes it cheaper and more convenient.
- The model is located on the user's device, allowing for real-time inferences with no latency problems.

3. Enabling network effects on confidential data

AI creates a winner-takes-all dynamic, responsible for the success of some of the biggest tech companies (Google, Facebook, Amazon, etc.). This mechanism was brilliantly summarized by Matt Turck in his article on data network effects:

"Data network effects occur when your product, generally powered by machine learning, becomes smarter as it gets more data from your users"

By using aggregated updates to train algorithms instead of raw data, Federated Learning empowers sectors where data cannot be transferred to third parties for confidentiality reasons (health sector, banks, insurance companies, etc.) with data network effects.

A startup that uses Federated Learning in a confidential sector will have higher network effects than its competitors, because centralized AI does not allow a company to combine its clients' data, which limits the performance of algorithms.

It's worth diving deeper into this last point:

- A startup in the insurance sector that uses classic learning methods will have to train its algorithms on each one of its clients individually. In other words, each client will have access to an AI trained exclusively on their own data. Yet, we know that algorithms perform better if they are trained on higher volumes of data.
- Through the collection of updates, Federated Learning allows a startup to learn from the data of all of its clients, and thus to create more robust algorithms.
- Higher-performing algorithms mean products of higher quality. A product of higher quality increases demand from new users, which in turn allows a startup to collect more updates, which increases the performance of algorithms even more. And so it goes on.

Owkin, a startup we recently invested in at frst, uses Federated Learning to allow various cancer treatment centers to collaborate. A patient's data never leaves hospitals and remains confidential. Instead, data in each center is aggregated and used to make new discoveries.

Federated Learning offers powerful new tools to treatment centers to make new discoveries without compromising the confidentiality of their patients.

Which sector could benefit from Federated Learning? How could a network of factories benefit from this technology? How could two insurance companies learn from each other using Federated Learning? At frst, we believe that behind these questions lie great opportunities.

If you are working on Federated Learning or developing this new technology, feel free to reach out to me so we can discuss more -> gabriel@frst.vc

Thanks to Gilles Wainrib for his precious advice and for passing on to me his passion for Federated Learning ;)
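For readers who want to see the mechanics, the training round described in section 2 (download the central model, train locally, send back only the updates, aggregate them) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Google's or Owkin's implementation: the model is a simple line f(x) = w*x + b rather than a music recommender, the three "devices" are simulated lists of private examples, the encrypted channel is left out, and the names local_update, federated_round and make_device are hypothetical. The aggregation shown is a weighted average of updates, in the spirit of federated averaging.

```python
import random


def local_update(w, b, data, lr=0.05, epochs=5):
    """Train a copy of the central model on one device's private data.

    Returns only the update (change in w and b), never the data itself.
    """
    w_local, b_local = w, b
    for _ in range(epochs):
        for x, y in data:
            err = (w_local * x + b_local) - y  # prediction error on one example
            w_local -= lr * err * x            # gradient step for the slope
            b_local -= lr * err                # gradient step for the intercept
    return w_local - w, b_local - b


def federated_round(w, b, devices):
    """Aggregate the devices' updates into the central model (weighted average)."""
    total = sum(len(data) for data in devices)
    dw = db = 0.0
    for data in devices:
        uw, ub = local_update(w, b, data)
        weight = len(data) / total  # devices with more examples count more
        dw += weight * uw
        db += weight * ub
    return w + dw, b + db


def make_device(n, rng):
    """Simulate one device's private examples, following y = 3x + 1 plus noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]


rng = random.Random(0)
devices = [make_device(n, rng) for n in (20, 50, 30)]

w, b = 0.0, 0.0  # the central model starts untrained
for _ in range(30):
    w, b = federated_round(w, b, devices)

print(f"central model after 30 rounds: w={w:.2f}, b={b:.2f}")
```

In this sketch the central model ends up close to the line that generated the data, even though the server only ever sees the (w, b) updates. A real deployment would add what the article mentions and this toy omits, such as encryption of the updates in transit.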

