Adventures of an Enclave (SGX / TEEs)

Written by lsquaredleland | Published 2018/12/13
Tech Story Tags: blockchain | tees | intel-sgx | cryptography | enclave-adventure


So what can SGX and TEEs be used for beyond the obvious cases?

Imagine a magical piece of hardware that no one can see inside, even if they break it open. That is the promise of Intel SGX and TEEs (Trusted Execution Environments).

For decades, cryptographers have been pushing the limits of secure computation, in which the computing party is oblivious to both the underlying data and the result. Examples range from determining whether Alice or Bob supplied the larger number, without revealing either number, to training machine learning models on encrypted data. Some of these problems have been solved, but the solutions are often neither generalised nor efficient.

Let’s discuss how trusted hardware works, its standard use cases and some more unique ones, before discussing various blockchain projects that use this technology and the future of TEEs.

Note: SGX is one implementation of a TEE and currently the most widely used, so I use the two terms fairly interchangeably throughout the post.

Why Private Compute?

Privacy is valued greatly in some instances and brushed aside in others. Here are some example cases where privacy does matter.

  • Secure lotteries where no one can cheat or rig the numbers, where the code is public and individuals can attest that the lottery is running that code.
  • Sharing images for classification by an algorithm when the images are locked down by HIPAA, GDPR and other data privacy controls.

Privacy Implementation

Some of the cryptographic tools currently available for secure computation include fully homomorphic encryption (FHE), secure multiparty computation (sMPC), and zero knowledge proofs (ZKPs). However, these techniques are either too specialised (not generalised), too slow, or too computationally expensive to be practical in a production environment. Systems like SGX aim to provide similar security guarantees while being much faster, cheaper and practical today.

Figure: High-level overview of how SGX works, which is broadly comparable to other TEEs.

Technicalities

This is how SGX works at a high level; specific implementation details are skipped here.

  1. Code runs in a hardware protected enclave / area [1], separate from the OS, which has an associated private key that is kept secret.
  2. The enclave can communicate with the application through special channels.
  3. Remote attestation is used to prove that a specific piece of code ran on a suitable enclave and produced a specific result (“quote”), whose integrity is verified [2].

With this, a developer can send open sourced code to an enclave, and a user can verify via remote attestation that the code running inside the enclave is equivalent to the open sourced code. Users can then inspect the code for any backdoors or unexpected functionality.
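The verification step can be sketched in a few lines of Python. This is a toy model and not the SGX API: the names `measure`, `make_quote` and `verify_quote` are hypothetical, and an HMAC with a shared key stands in for the hardware-backed signature that a real attestation flow would check.

```python
import hashlib
import hmac

# Assumption: stands in for a key protected by the hardware. Real SGX uses
# asymmetric signatures that are checked through an attestation service.
HARDWARE_KEY = b"toy-hardware-key"


def measure(code: bytes) -> str:
    """Hash of the code loaded into the enclave (its 'measurement')."""
    return hashlib.sha256(code).hexdigest()


def make_quote(code: bytes, result: bytes) -> dict:
    """Enclave side: bind the code measurement to the result it produced."""
    measurement = measure(code)
    tag = hmac.new(HARDWARE_KEY, measurement.encode() + result,
                   hashlib.sha256).hexdigest()
    return {"measurement": measurement, "result": result, "tag": tag}


def verify_quote(quote: dict, open_source_code: bytes) -> bool:
    """User side: check the tag, then compare against the published code."""
    expected = hmac.new(HARDWARE_KEY,
                        quote["measurement"].encode() + quote["result"],
                        hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, quote["tag"])
    return tag_ok and quote["measurement"] == measure(open_source_code)


published = b"def lottery(): ..."
quote = make_quote(published, b"winning number: 7")
print(verify_quote(quote, published))                  # code matches
print(verify_quote(quote, b"def lottery(): rigged"))   # code differs
```

The point of the sketch is the binding: the quote ties a result to a hash of the code, so a user who trusts the hardware key only has to audit the published source.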

Common Use Cases

Let’s start with the common and obvious use cases of this technology.

Most Common Use Cases

These use cases already have fairly common cryptographic constructions, but can also be done in TEEs:

  • Determining whether Alice or Bob is wealthier without revealing the actual values (Yao’s Millionaire Problem).
  • Counting votes without revealing the link between vote and caster.
  • Generating a random number.
  • Blind auctions where the individuals running the auction are unable to see the bids and are also forced to reveal all bids at the end.

Provable erasure

Regret that moment you shared some photos with a former lover? How can you prove that they deleted the photos? The same applies to GDPR: how can one prove that the data is no longer in the database, or in the hands of someone who can access it? If the data was stored inside an enclave, a user can attest that either the data has been deleted within the enclave, or the private key associated with the data has been deleted and the data can no longer be decrypted. Phew, as long as they didn’t take a photo of the photo….
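The second option, deleting the key rather than the data, can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ToyEnclave` is a hypothetical class, and the SHA-256 based XOR stream is a toy cipher, not SGX sealing or a production encryption scheme. The "provable" part would come from attesting that the enclave runs exactly this code.

```python
import hashlib
import secrets


class ToyEnclave:
    """Toy stand-in for an enclave that holds a data key (not real SGX)."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # never leaves the "enclave"

    def _xor_keystream(self, data: bytes) -> bytes:
        if self._key is None:
            raise PermissionError("key has been erased")
        stream = b""
        counter = 0
        # Toy keystream: hash(key || counter). Illustration only.
        while len(stream) < len(data):
            block = hashlib.sha256(self._key + counter.to_bytes(8, "big"))
            stream += block.digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    def seal(self, data: bytes) -> bytes:
        return self._xor_keystream(data)

    def unseal(self, blob: bytes) -> bytes:
        return self._xor_keystream(blob)

    def erase_key(self) -> None:
        """After this call, every sealed blob is permanently unreadable."""
        self._key = None


enclave = ToyEnclave()
blob = enclave.seal(b"embarrassing photo")
print(enclave.unseal(blob))
enclave.erase_key()  # copies of `blob` may still exist, but are useless
```

Deleting one small key is easier to reason about than tracking down every copy of the ciphertext, which is why key erasure is the more practical of the two options.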

Key Generation

Shamir’s secret sharing is a cryptographic technique with functionality similar to a multisig wallet: generate m shards, of which n are necessary to initiate a transaction. However, as a precursor it’s necessary to have a public and private key pair, so how can one prove that the private key was destroyed and not copied? Here the provable erasure attribute of TEEs comes in: generate the public and private key pair inside the enclave, create the shards, then provably delete the private key without leaking it.
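For readers who have not seen it, here is a minimal sketch of Shamir’s scheme over a prime field, using the text’s convention of m shards with n needed. The function names `split` and `recover` and the choice of prime are assumptions for illustration; inside an enclave, `secret` would be the freshly generated private key, erased after the shards are handed out.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret: int, m: int, n: int) -> list:
    """Split `secret` into m shards, any n of which recover it."""
    # Random polynomial of degree n - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(n - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, m + 1)]


def recover(shards: list) -> int:
    """Lagrange interpolation of the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shards):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shards):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shards = split(123456789, m=5, n=3)
print(recover(shards[:3]) == 123456789)
```

Any three of the five shards reconstruct the secret, while fewer than three reveal nothing about it.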

Private Searches and Encrypted Databases

Let’s say there exists a database of restaurants that I want to search, and I do not want to reveal my searches to the owner. Mostly I’m afraid that the owner would sell the data to the paparazzi. Currently there is no easy way of enabling private searching of data with standard cryptographic techniques. However, with SGX and TEEs it is possible to search within a dataset that is encrypted by the enclave without revealing to the operator what one searched for. A similar technique can be used for web searches.

Provable Search

We initially wanted private searches, so why would we all of a sudden want provable searches? Don’t traditional databases already provide that in their logs? Yes, but let’s say we gave the NSA some viewing keys to a privacy blockchain. (Viewing keys are used to deanonymize the contents of private transactions in protocols such as Zcash and Monero.) Can we build a system that logs which subset of the keys was used? With TEEs it is possible to have tamper proof logs, which no other cryptographic system can provide.
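One common way to make a log tamper evident is a hash chain, sketched below. The `AccessLog` class and its field names are hypothetical, and a hash chain on its own only detects edits; in the setting described here, the enclave would be what guarantees that every use of a viewing key actually gets appended.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash before the first entry


def _digest(record: dict) -> str:
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


class AccessLog:
    """Append-only log where each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries = []
        self.head = GENESIS

    def append(self, who: str, key_id: str) -> str:
        record = {"who": who, "key_id": key_id, "prev": self.head}
        self.head = _digest(record)
        self.entries.append(record)
        return self.head


def verify(entries: list, head: str) -> bool:
    """Recompute the chain and compare it with the published head hash."""
    prev = GENESIS
    for record in entries:
        if record["prev"] != prev:
            return False
        prev = _digest(record)
    return prev == head


log = AccessLog()
log.append("agency", "viewing-key-17")
log.append("agency", "viewing-key-42")
print(verify(log.entries, log.head))
```

Anyone holding the latest head hash can detect a rewritten or removed entry, because every later hash would change.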

Uncommon Use Cases

This is where the fun begins: how can we use TEEs in ways that are non intuitive? Some of these ideas are from academic papers, others were conjured up and verified by researchers in the field.

Heart Beats

Heart beats are a technique to signal that someone or something is still around (the software equivalent of a dead man’s switch). Let’s say that if I didn’t send a particular message to my mother every month, she would assume that I had gone missing. But let’s make this actionable. In Paralysis Proofs there are three friends who want a 3-of-3 multisig, but they don’t want to lose access to the funds if one friend disappears. Code running inside an SGX enclave requires all three to submit individual messages to authorise a transfer of assets. In the event that a friend goes missing, the other two can send a special message that requires the missing friend to submit a heartbeat within a time frame t, or else the wallet becomes a 2-of-2 multisig. (Maybe that friend is in hiding, who knows.)
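The wallet logic described above can be modelled as a small state machine. This is a sketch of the idea only, not the Paralysis Proofs protocol itself: the class and method names are hypothetical, time is passed in explicitly as `now`, and signatures are replaced by plain signer names.

```python
class ParalysisWallet:
    """Toy model of the enclave logic; names and rules are illustrative."""

    def __init__(self, signers, timeout):
        self.signers = set(signers)
        self.timeout = timeout      # the time frame t from the text
        self.challenge = None       # (accused signer, deadline) or None

    def accuse(self, accusers, accused, now):
        """All other signers claim that `accused` has gone missing."""
        others = self.signers - {accused}
        if accused not in self.signers or set(accusers) != others:
            raise ValueError("every other signer must back the accusation")
        self.challenge = (accused, now + self.timeout)

    def heartbeat(self, signer, now):
        """The accused proves liveness in time, cancelling the challenge."""
        if (self.challenge and self.challenge[0] == signer
                and now <= self.challenge[1]):
            self.challenge = None

    def settle(self, now):
        """After the deadline, drop the silent signer from the wallet."""
        if self.challenge and now > self.challenge[1]:
            self.signers.discard(self.challenge[0])
            self.challenge = None

    def can_spend(self, approvals):
        """Spending needs a message from every current signer."""
        return set(approvals) == self.signers


wallet = ParalysisWallet({"alice", "bob", "carol"}, timeout=10)
wallet.accuse({"alice", "bob"}, "carol", now=0)
wallet.settle(now=11)  # carol never sent a heartbeat
print(wallet.can_spend({"alice", "bob"}))
```

If Carol had sent a heartbeat before the deadline, the challenge would have been cancelled and the wallet would have stayed 3-of-3.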

Centralised Non Custodial Exchanges

It sounds like an oxymoron to an extent, but it’s possible to have a centralised, non custodial exchange that can never steal users’ funds. Maybe non custodial is a misnomer, because the funds are technically associated with the private key of the enclave. Users can attest that the code running in the enclave is equal to the code that the developers have open sourced and that there is no secret back door. Even more exciting, when new code is pushed to the enclave, the developer can implement functionality that allows users to keep using the older code until the new code has been fully audited, etc.

Private Machine Learning

Data is the new oil (hash power probably is these days though) and companies hold their data very close, rarely sharing it unless they’re forced to. Sometimes this is a market failure. For example, if hospitals shared their broken bone x-rays with each other, they could train a superior model than if each used only its own subset of data. This would improve patient outcomes, but cannot be done due to HIPAA and GDPR [3]. Secure compute to the rescue: each hospital sends encrypted images to the enclave, and the enclave decrypts them locally and trains a model. Even more interesting, the model could potentially reside only in the enclave and never be exported. One limitation is that most enclaves have limited computational power, so training models will take some time, but some recent advances, along with GPUs with TEEs, might get around this.

Although medical reasons are the most cited for private machine learning, there’s an implication that’s more far reaching. For self driving cars to become safer and more common, it is essential to collect more data to improve their algorithms and models. But cars collect so much data that it is infeasible to send it all to a centralised data center. If instead the data is verifiably analysed locally and only the output is sent to a central server, the output is orders of magnitude smaller (verifiable federated learning). Verifiability matters because car manufacturers cannot have malicious users sending altered output, and they need to tame fears of mass surveillance.
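To make the data flow concrete, here is a minimal sketch of federated averaging on a one-parameter model. The function names, the toy model y ≈ w·x and the sample data are all assumptions for illustration. The "verifiable" part, attesting that each car really ran `local_update` on genuine data, is what the enclave would add and is not shown.

```python
def local_update(w: float, samples: list, lr: float = 0.01) -> tuple:
    """Runs on the car: one gradient step on local data for y ≈ w * x.

    Only the updated weight and the sample count leave the device.
    """
    grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
    return w - lr * grad, len(samples)


def federated_average(updates: list) -> float:
    """Runs on the server: average of weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Two cars with private samples of the relationship y = 2x.
fleet = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(100):
    w = federated_average([local_update(w, samples) for samples in fleet])
print(round(w, 3))
```

Each round sends a single number per car rather than the raw samples, which is where the bandwidth saving comes from.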

Secret Blockchains and Smart Contracts

Imagine a blockchain whose client / node code is known only by the developer and cannot be deciphered by anyone else. The developer encrypts the byte code and sends it to each enclave; once inside, the byte code is decrypted and run against a suite of formal verification tests to attest that the code’s functionality is what people assume it to be.

Blockchain Consensus

This generally runs counter to the “decentralisation” ethos of the cryptocurrency community, but what if SGX was used for part of the consensus mechanism? In Proof of Luck, SGX’s random number generator is used to elect a consensus leader who will create the next block. The authors cite “low-latency transaction validation, deterministic confirmation time, negligible energy consumption, and equitably distributed mining” as benefits of the system. Proof of Time and Proof of Ownership, along with Proof of Useful Work and Proof of Elapsed Time, are additional consensus mechanisms that have been designed using SGX and are only possible given the properties of the enclave.
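The leader election idea can be sketched as a lottery in which every node draws a random number and the luckiest draw wins. This is a deliberate simplification and should not be read as the protocol from the Proof of Luck paper: `draw_luck` and `elect_leader` are hypothetical names, "highest draw wins" and the tie-break by node name are assumptions, and in a real system the draw would happen inside the enclave and be attested so that nodes cannot simply claim a high number.

```python
import secrets


def draw_luck() -> int:
    """Stand-in for a random draw made inside each node's enclave."""
    return secrets.randbelow(2**64)


def elect_leader(lucks: dict) -> str:
    """Pick the node with the highest draw; ties broken by node name."""
    return max(lucks, key=lambda node: (lucks[node], node))


nodes = ["node-a", "node-b", "node-c"]
lucks = {node: draw_luck() for node in nodes}
print(elect_leader(lucks))
```

No hashing race is needed to pick a winner, which is the intuition behind the "negligible energy consumption" claim quoted above.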

Cheap BFT Compute

Byzantine Fault Tolerant (BFT) compute refers to any compute that is unstoppable, like any BFT consensus mechanism. Ethereum can be seen as one giant BFT computer: it is extremely expensive to run, but any smart contract on the network will be executed given the incentives, and censorship of the network is difficult. How can we have the same censorship resistant compute, but have it be dramatically cheaper? If we have a publicly verifiable VM inside an enclave, we know how code should execute. And because the enclave itself does not know what code is running on top of the VM, we have cheaper private BFT compute. Imagine computer programs that are anonymous and unstoppable…. No one really understands the ramifications of that yet. Think anonymous (not pseudonymous) assassination markets.

And More

This just scratches the surface of the powers of TEEs. Here are some additional use cases that you can read about on your own time: Vote Buying with Dark DAOs, Bitcoin Mixing, Securing Tor, Stealing Bitcoins from other Enclaves and Secure Public Clouds.

Use Case Summary

Here is a list of the use cases explored in this piece. Only some of them are exclusive to TEEs; the rest can be achieved with other techniques but might be less performant. Like anything in cryptography, there are multiple ways to achieve the same goal, but there are inherent trade offs. For the sake of conciseness we will not go deeply into the trade off space between different cryptographic constructions.

Figure: Table of use cases, examples from the tables linked here: ZKP, MPC

Blockchain Projects Using TEE / SGX

Given these unique properties of TEEs, it’s no wonder that some blockchain protocols have adopted this technology, either as part of their core technology or for an ancillary feature. Most of them use it for private compute, but some have a unique spin. All of the projects and companies below are currently being built out and / or have been used in production in the past; this list excludes many academic projects that are more research focused.

Academic Projects Background

Blockchain Companies using SGX

These are some companies that came up when asking around. I have not verified how production ready they are, and the list is ordered alphabetically to avoid bias.

What will you create with SGX and TEEs?

After conducting this survey of uncommon TEE use cases, I’ve realised that these are the use cases that will become the most common. The value of these tools is in the hard problems that cryptography has yet to solve. Comparing two numbers is easy, but everything from verifiable federated learning to entirely anonymous smart contract platforms is the future that we are looking at. And we are just starting to truly understand the use cases and the ramifications.

Further Readings and Research

Arxiv Papers

Attacks and Vulnerabilities

Unfortunately SGX and TEEs are not infallible; like all cryptographic tools, there are trade offs. These issues are prevalent in all forms of existing TEEs, but we will focus on the particular faults of SGX as they are better documented.

  • Side channel attacks (this is a class of attacks): cache-timing attacks, cache attacks, speculative execution attacks (this one is quite infamous) and a different speculative execution attack (some of these have been partially fixed)
  • Licensing issues (what if Intel stops licensing enclaves to specific projects?)
  • Centralised remote attestation (the servers are run by Intel, Proof of Intel)
  • Not anonymous attestations (Intel knows what particular device it is attesting for)
  • Closed Source (hardware design and some of the software components)
  • Proof of Intel (for attestations and manufacturing of the devices; we don’t want a Supermicro event all over again, which turned out to be false, but there have been similar incidents)

Bright Future

However the future is not that bleak. Despite some of the existing pitfalls of Intel’s SGX, there is a new generation of TEEs coming around. This includes the Keystone project, which is working on an open source RISC-V implementation, along with Gradient, which is designed to be sidechannel immune, and SGX 2.0.

Non Blockchain Companies Using SGX / TEEs

Footnotes

[1] Enclaves are supposed to be hardware isolated regions separate from the operating system. However, in SGX that is not the case. Enclaves in the SGX framework are not hardware isolated from the OS. They are basically a memory primitive, in which the memory address space of the code being executed by the enclave, as well as the SP, PC, and some microarchitectural state, are obfuscated from the OS. To be actually robust to sidechannel attacks, the page tables and faults need to be hidden from the OS, and SGX doesn’t do this. SGX’s page table faults are observable by the OS, and the OS can cause faults. Therefore a malicious OS can directly learn a secure container’s memory accesses at page granularity, and any piece of software can perform cache timing attacks, which use the cache tag state as an exfiltration path. These vulnerabilities were demonstrated in 2018 by the Foreshadow attack.

[2] As noted by Christian from Gradient, it is possible to fix enclave sidechannel attacks, but the remote attestations are a weak point. SGX and Apple’s SEP use strongly isolated device keys to do attested boot to an honest state. In their cases the attestation is to Intel or Apple servers, not anyone else. This attestation is not anonymous in the prover stage. Gradient, however, is working on an anonymous attestation scheme, so one can check the integrity of a processor without reliance on a central third party, and without ever exposing identity.

[3] This is more of a social issue: medical groups can get consent from their patients to share data with others, but choose not to do so. Thus even if the tools are created, it is unlikely that many would use them.

Special thanks to the reviewers: Lorenz Breidenbach (IC3 / ETH Zurich), Christian Wentz (Gradient), Martina Long (Primitive Ventures), Dawn Song (UC Berkeley / Oasis Labs), Kevin Britz (Totient Labs)


Published by HackerNoon on 2018/12/13