We are excited to announce the initial release of the MC² Project, a collection of open-source tools for computing and collaborating on confidential data. Developed at UC Berkeley’s RISELab, MC² (Multi-Party Collaboration and Coopetition) enables rich analytics and machine learning on encrypted data, ensuring that data remains concealed even when it’s being processed. The data in use remains hidden from the untrusted server running the job, allowing confidential workloads to be offloaded to third parties or cloud providers. This not only protects confidential data from intrusions but also enables secure collaboration: multiple data owners can jointly run analytics or ML on their collective data without explicitly revealing their individual data to anyone else, not even a trusted third party.

Need for confidential computing and secure data sharing

Personal data is becoming more pervasive and privacy concerns continue to grow. As a result, global data protection laws are becoming stricter, and organizations face increasingly high noncompliance risks. At the same time, these organizations are realizing the enormous benefits of being able to share their data with each other: banks can collaborate to detect financial crime, health institutions can collaborate on medical studies, and so on. Driven by these developments, Gartner predicts that, by 2025, “50% of large organizations will adopt privacy-enhancing computation for processing data in untrusted environments and multiparty data analytics use cases.” The goal of the MC² Project is to realize this vision and resolve the tension between expanding cloud adoption, the need for data sharing, and the increasing concern over data privacy.
Use Cases

MC² has already seen industry adoption and interest in applications surrounding finance and telecommunications: Ant Financial and Scotiabank for efforts towards anti-money laundering, fraud detection, or credit risk modeling; Ericsson for predicting hardware faults and performance problems across different mobile network operators.

More generally, industries that have data locked down due to privacy concerns can benefit from MC². Our platform keeps any confidential data, such as SSNs or PHI data, completely hidden during computation with the use of secure enclaves such as Intel SGX.

Key technology: Secure enclaves

What are secure enclaves?

Enclaves provide isolated execution: Secure enclaves are a recent technology that enables the creation of a trusted execution environment (TEE) within an otherwise untrusted machine. Each enclave has access to a restricted portion of the memory; any data or software placed within the enclave is encrypted and isolated from the rest of the system. No other process on the same processor, not even privileged software such as the OS or the hypervisor, can access the encrypted enclave memory. This creates a layer of protection against any intrusion from the operating system itself; when used properly, anyone with root access to a machine running the workload can learn little to no information about what is happening inside the enclave.

Enclaves support remote attestation: Another key feature of secure enclaves is remote attestation. This feature enables users to cryptographically verify that an enclave is running trusted, unmodified code. The MC² Project provides a remote attestation platform for users to attest any non-local compute service from a trusted local client running on their own machine.
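To make the idea of remote attestation more concrete, here is a minimal toy sketch in Python. It is not the MC² Client API or the Intel SGX protocol: the function names (`measure`, `make_quote`, `verify_quote`) are hypothetical, and an HMAC under a shared `HARDWARE_KEY` stands in for the hardware-rooted signature that real enclaves use. The sketch only shows the core check: the client compares a signed hash (a "measurement") of the loaded code against the hash of the code it trusts, bound to a fresh nonce.

```python
import hashlib
import hmac
from typing import Tuple

# ASSUMPTION: toy stand-in for the hardware root of trust. Real enclaves sign
# quotes with a hardware-protected asymmetric key, not a shared secret.
HARDWARE_KEY = b"toy-hardware-key"


def measure(enclave_code: bytes) -> bytes:
    """Return the measurement (SHA-256 hash) of the code loaded in the enclave."""
    return hashlib.sha256(enclave_code).digest()


def make_quote(enclave_code: bytes, nonce: bytes) -> Tuple[bytes, bytes]:
    """Enclave side (hypothetical): report the measurement, authenticated
    together with the client's nonce so old quotes cannot be replayed."""
    measurement = measure(enclave_code)
    tag = hmac.new(HARDWARE_KEY, measurement + nonce, hashlib.sha256).digest()
    return measurement, tag


def verify_quote(expected_code: bytes, nonce: bytes,
                 quote: Tuple[bytes, bytes]) -> bool:
    """Client side (hypothetical): accept only if the quote is authentic and
    the reported measurement matches the code the client expects."""
    measurement, tag = quote
    expected_tag = hmac.new(HARDWARE_KEY, measurement + nonce,
                            hashlib.sha256).digest()
    authentic = hmac.compare_digest(tag, expected_tag)
    matches = hmac.compare_digest(measurement, measure(expected_code))
    return authentic and matches


if __name__ == "__main__":
    trusted_code = b"def job(data): return sum(data)"
    nonce = b"fresh-client-nonce"
    print(verify_quote(trusted_code, nonce, make_quote(trusted_code, nonce)))
    print(verify_quote(trusted_code, nonce, make_quote(b"tampered code", nonce)))
```

In this sketch a quote produced for modified code, or for a different nonce, fails verification, which is the property the client relies on before sending any data to the remote service.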
Enclaves and side-channels: Unfortunately, loading existing software into enclaves could expose the data to certain side-channel attacks, where an attacker can learn additional information about the encrypted data by observing auxiliary information such as data access patterns during the software’s execution. Preventing such leakage is left to the software developer; MC² tackles this problem by fortifying the enclave code and ensuring it is resilient to side-channel leakage via memory access patterns.

Secure enclaves vs. other approaches

Secure enclaves are not the only privacy-enhancing approach for computing on confidential data. Here, we compare them to other popular alternatives:

MC² provides a software stack that powers secure enclaves

In particular, MC² provides a platform that can seamlessly run popular analytics and machine learning frameworks (Apache Spark, XGBoost, etc.) within enclaves securely and efficiently, abstracting away the complexities of writing enclave code from the end user.

One approach to using enclaves is to simply load the entire application (e.g., Apache Spark) into the enclave. However, doing so adversely affects both the security and efficiency of the enclave application. For instance, if the program is memory-intensive, performance will suffer greatly from excessive encryption/decryption and paging. Instead:

MC² partitions the enclave code for security and efficiency: MC² partitions the application so that only the components that need to compute directly on the sensitive data are loaded into the enclave. Other components, such as network communication and task scheduling, are executed outside the enclave. This also benefits security by reducing the trusted computing base, i.e., the amount of code that runs within the enclave and therefore needs to be vetted beforehand.

MC² fortifies enclave execution: MC² fortifies the enclave components using cryptographic techniques to provide stronger security guarantees. This is done in two ways. First, MC² builds in measures to verify the integrity of jobs that have distributed execution. Second, since enclaves are known to be vulnerable to side-channel leakage, MC² makes use of data-oblivious techniques in enclave code to make sure that no side-channel information is leaked via memory access patterns. Data obliviousness ensures that the memory access patterns do not reveal any information about the sensitive data being accessed.

End-to-End Workflow

The MC² Client: The entry point to all compute jobs supported by MC² is the MC² Client. This tool runs in a trusted environment, typically the user’s local machine. Through a command line or Python interface, the client software is responsible for handling remote attestation and submitting jobs to the compute cluster. The client also contains additional features to generate keys needed for the compute service and to start/stop a cluster of machines on Microsoft Azure. (Visit the documentation for concrete details on how all of this can be achieved, or the quickstart for a hands-on demonstration of the workflow.)

The MC² Compute Services: MC² offers several compute services: these include Spark SQL, distributed XGBoost, and secure aggregation for federated learning. All are intended to run in a primary untrusted environment, such as a cluster of machines hosted on a public cloud, that has support for trusted execution environments (hardware enclaves). Data is encrypted in transit using a client key and only ever decrypted inside hardware enclaves, providing the previously mentioned security guarantees for data in use. For all compute services, MC² leverages the Open Enclave SDK, a project intended to provide a consistent API across a variety of enclave architectures.

Research Prototypes

MC² also includes the following exploratory research prototypes (not integrated with the MC² Client) enabling secure computation with novel cryptographic techniques. These works were published at USENIX Security, a top security conference.

Cerebro: A general-purpose Python DSL for learning with secure multiparty computation.
Delphi: Secure inference for deep neural networks.

Conclusion

MC² is a platform for running secure analytics on data that stays encrypted even when in use. By doing so, the project also enables secure collaboration among multiple organizations, where individual data owners can use our platform to jointly analyze their collective data without revealing it to one another.

The MC² Project is actively maintained by Opaque Systems. To learn more about how Opaque can help you take advantage of confidential computing, visit our website at opaque.co. We would love your contributions! Visit our GitHub page to see all the projects under the MC² umbrella.

Also published at https://towardsdatascience.com/secure-collaborative-analytics-and-ml-using-mc%C2%B2-4be376cfaba0
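As a closing illustration of the data-oblivious execution discussed in the fortification section, the following toy Python sketch contrasts a direct array read with an oblivious one. It is only a sketch under stated assumptions and not MC²'s implementation: `TracedArray`, `direct_read`, and `oblivious_read` are hypothetical names, the linear scan is one simple oblivious technique among many, and Python itself gives no hardware-level guarantees. The point is that the oblivious version touches every element in the same order regardless of the secret index, so the recorded access pattern reveals nothing about it.

```python
from typing import List, Sequence


class TracedArray:
    """Hypothetical helper: an integer array that records every index read,
    standing in for the memory accesses an attacker could observe."""

    def __init__(self, values: Sequence[int]) -> None:
        self._values = list(values)
        self.trace: List[int] = []

    def __len__(self) -> int:
        return len(self._values)

    def __getitem__(self, index: int) -> int:
        self.trace.append(index)
        return self._values[index]


def direct_read(arr: TracedArray, secret_index: int) -> int:
    """Non-oblivious read: the single access reveals secret_index."""
    return arr[secret_index]


def oblivious_read(arr: TracedArray, secret_index: int) -> int:
    """Oblivious read by linear scan: every element is accessed, and the
    wanted one is selected with a bit mask instead of a data-dependent
    branch or index."""
    result = 0
    for i in range(len(arr)):
        # mask is -1 (all bits set) when i is the secret index, else 0
        mask = -int(i == secret_index)
        result |= arr[i] & mask
    return result


if __name__ == "__main__":
    a = TracedArray([10, 20, 30, 40])
    b = TracedArray([10, 20, 30, 40])
    print(oblivious_read(a, 2), oblivious_read(b, 0))
    print(a.trace == b.trace)  # same access pattern for different secrets
```

The cost of this simple approach is a full pass over the array per read, which is why oblivious algorithms are usually applied only to the components that handle sensitive data.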
