How to Write Upgradable (Versioned) Smart Contracts in Solidity?

by Vaibhav Saini, December 14th, 2018

A Complete Guide on Understanding and Implementing Upgradable Smart Contracts in Solidity using Libraries

Versioning (Pseudo-Versioning) Smart Contracts

Immutability is a feature that makes the Blockchain great.

But like everything in this world, it also has some drawbacks.

This article focuses on Reusability and Upgradeability of Smart Contracts in today’s Blockchain platforms. We will mainly talk about Ethereum in this article, which is the most widely used smart contract platform today.

But first, go make a cup of coffee ☕. It’s going to be a long one.

We will start by looking at the reasons behind these restrictions of current smart contract platforms. Then we will explore workarounds that model the upgradeability (versioning) behavior we enjoy today on almost all centralized platforms. I call this Pseudo-Versioning.

P.S. If you are new to the EVM (or want to learn about the EVM in depth), consider going through the article below.

Getting Deep Into EVM: How Ethereum Works Backstage. An Ultimate, In-depth Explanation of What EVM is and How EVM Works (hackernoon.com)

Overview

Solidity (and, broadly speaking, the EVM) still has a long way to go in terms of programmer productivity and language expressiveness. If you have worked with Ethereum, then by now you have probably realized that

Solidity is a limited language.

Especially when you come from the lands of Swift and JavaScript, developing in Solidity is definitely a step back in terms of what the language allows the programmer to do and how expressive it is.

This can sometimes piss you off.

Even the Panda got pissed off

But Why Is It Limited?

Solidity, and in general any language that compiles to bytecode intended to be executed in the EVM (which is sandboxed), is limited because:

When executed, your code will run on every node of the network. Once a node receives a new block, it will verify its integrity. In Ethereum this also means verifying that all the computations that happened in that block were performed correctly and that the new state of the contracts is correct.
As a result, even though the EVM is Turing-complete, heavy computations are expensive (or outright impossible under the current gas limit), because every node has to perform them, which slows the network down.
A standard library hasn’t really been developed yet. Arrays and strings are especially painful. I have personally had to implement my own string manipulation library in order to do basic things that we otherwise take for granted.
You cannot get data from the outside world (outside the EVM) unless it comes in via a transaction (an Oracle), and once a contract is deployed it is not upgradable (you can plan for migrations or pure storage contracts, though).

Some of these limitations are needed for the existence of the Ethereum computing platform (you will never be able to store a backup of your Google Photos and perform image recognition purely on-chain, and that is just fine). Other limitations exist only because this is a really young technology (though evolving blazingly fast), and it will keep improving over time.

Ok. But how the F**k do I solve this problem?

While working on a project whose contracts would need changes in the future, I came across the “library”. This is a feature of Solidity that helps us solve this problem, albeit indirectly. Before going into the upgradable contract implementation, let’s see what a library is and what its limitations are.

What are libraries and Why do we Need them?

In Solidity, a library is a different type of contract, that doesn’t have any storage and cannot hold ether. Sometimes it is helpful to think of a library as a singleton in the EVM, a piece of code that can be called from any contract without the need to deploy it again. This solves some big problems like:

Deployment gas costs: Libraries save substantial amounts of gas, because the same code doesn’t have to be deployed over and over, and different contracts can rely on the same already deployed library.
Code repetition in the blockchain: This follows directly from the point above.
Code updates: Without shared libraries, bug fixes and updates have to be deployed independently on each project (or, even worse, Ethereum has to hard fork to fix a contract’s problems). With a shared library, the fix only has to be written and deployed once.

Libraries do sound awesome, right? Unfortunately, they also have some limitations. Below are some important things to know about libraries:

Libraries don’t have storage capabilities.
Libraries can manipulate the storage of other contracts.
Libraries cannot have payable functions.
Libraries cannot have a fallback function.
Libraries don’t have an event log.
Libraries can be used to fire event logs for the contract which uses it.
Libraries aren’t allowed to inherit.
Even though libraries cannot directly inherit, they can be linked with other libraries and use them in the same way a contract would, but with the natural limitations of libraries.
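Two of the points above (a library has no storage of its own but can write to the caller's storage, and events it fires are logged for the calling contract) can be seen in a minimal sketch. The names CounterLib, Counter, Data and Incremented are hypothetical and chosen only for illustration, and the pragma assumes a recent 0.8.x compiler rather than the version this article was written against.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example: all names here are illustrative.
library CounterLib {
    // Emitted from library code, but logged under the calling contract's address.
    event Incremented(uint256 newValue);

    // A struct type only; the library itself declares no state variables.
    struct Data {
        uint256 value;
    }

    // `self` is a storage pointer into the *calling* contract's storage.
    function increment(Data storage self) public {
        self.value += 1;
        emit Incremented(self.value);
    }
}

contract Counter {
    using CounterLib for CounterLib.Data;

    // The state lives in the contract, not in the library.
    CounterLib.Data private data;

    function bump() external {
        // Public library functions are reached through DELEGATECALL,
        // so the write lands in Counter's storage.
        data.increment();
    }

    function current() external view returns (uint256) {
        return data.value;
    }
}
```

Because increment is public rather than internal, Counter has to be linked to a deployed CounterLib before it can be deployed itself.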

These points can sound confusing at first. Don’t Panic. Here is a great resource to get your head around libraries.

Library Driven Development in Solidity. A comprehensive review on how to develop more modular, reusable and elegant smart contract systems on top of the… (medium.com)

But for now, we will only cover the parts we need in order to understand and implement upgradable contracts.

How does a Library work?

A library is a type of contract that doesn’t allow payable functions and cannot have a fallback function (these limitations are enforced at compile time, therefore making it impossible for a library to hold funds). A library is defined with the keyword library (library L{}) in the same way a contract is defined (contract C{}).

Calling a function of a library will use a special instruction (DELEGATECALL), that will cause the calling context to be passed to the library as if it was code running in the contract itself. I really like this angle from the Solidity documentation,

“Libraries can be seen as implicit base contracts of the contracts that use them”
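A minimal sketch of the kind of library and contract pair this section discusses, assuming a library L and a contract C that each expose a function a(). It is written for a recent 0.8.x compiler, so the visibility and view modifiers are additions that older compilers did not require.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library L {
    // Runs in the caller's context, so `this` refers to the calling contract.
    function a() public view returns (address) {
        return address(this);
    }
}

contract C {
    // Forwards to the library; the call is compiled to a DELEGATECALL.
    function a() public view returns (address) {
        return L.a();
    }
}
```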

In the above snippet, when function a() of contract C is called, the address of the contract will be returned and not the library's. This appears to be the same for all msg properties: msg.sender, msg.value, msg.sig, msg.data and msg.gas (which later compiler versions replaced with gasleft()). (The Solidity documentation related to this indicates otherwise, but after doing some testing it looks like the msg context is maintained.)

One thing we can notice here is that it is not clear how contract C and library L are linked. So, let’s look at that.

How are libraries linked?

Unlike explicit base contract inheritance (contract C is B {}), it is not obvious how a contract that depends on a library gets linked with it. In the above case, contract C uses library L in its function a(), but there is no mention of which library address to use, and L won't get compiled into C's bytecode.

Library linking happens at the bytecode level. When contract C is compiled, it leaves a placeholder for the library address, like this: 0073__L_____________________________________630dbe671f (0dbe671f is the function selector for a()). If we were to deploy contract C untouched, the deployment would fail because the bytecode is invalid.

In simple words, library linking means replacing every occurrence of the library placeholder in the contract bytecode with the address of the deployed library on the blockchain. Once the contract is linked to the library, it can be deployed.

Now that we have covered the basics of libraries, let’s see how we can use them to create upgradeable contracts.

Libraries Themselves Are Not Upgradeable

Libraries are not upgradeable, in the same way that contracts are not. As stated in the previous section, the reference to the library is made at the bytecode level rather than at the storage level. Changing the bytecode of a contract is not allowed once it is deployed, so the reference to the library will live as long as the contract does.

You must be asking: then how does one introduce the “upgradable” feature that we have been talking about this whole time?

Finally, How Does It Actually Work?

Here is where a little trick comes in. Let’s see this in detail:

Model of Updatable Contract

Instead of linking the main user-facing contract directly with the address of the deployed library, we link it to a ‘Dispatcher’ contract. At compile and deploy time this is just fine, because the Dispatcher doesn’t implement any of the library’s methods. Since the Dispatcher uses no library code itself, its bytecode (unlike the bytecode of contract C, which we saw above) does not need to contain the library’s address. And because no library address is hardcoded at the bytecode level, we can swap the library for a different one at any time.

But if we are not using any library code in the Dispatcher contract, then how do we execute the library functions?

When a transaction comes in, the main contract (the Token contract) thinks it is making a delegatecall to the library (TokenLib1) it is linked with. But this delegatecall is instead made to the dispatcher (the Dispatcher contract).

Here is where things get interesting. Once the dispatcher catches the delegatecall in its fallback function it figures out what the correct version of the library code is, and redirects the call once again with a delegatecall. Once the library returns, the return will go all the way back to the main contract.
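The forwarding step can be sketched roughly as follows. This is not the code from the implementation linked in this article: the names, the owner check and the constructor arguments are hypothetical, and the sketch assumes a recent 0.8.x compiler. It also swaps in two newer tools: an immutable variable to bake the storage contract's address into the Dispatcher's bytecode, and RETURNDATASIZE (available since the Byzantium fork) to copy back whatever the library returned.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch: holds the address of the current library version.
contract DispatcherStorage {
    address public owner;
    address public lib;

    constructor(address initialLib) {
        owner = msg.sender;
        lib = initialLib;
    }

    // Point the dispatcher at a new library version.
    function replace(address newLib) external {
        require(msg.sender == owner, "not owner");
        lib = newLib;
    }
}

// Hypothetical sketch: the user-facing contract is linked to this address.
contract Dispatcher {
    // Immutables are stored in the runtime bytecode, not in storage,
    // so the Dispatcher keeps an empty storage footprint.
    DispatcherStorage private immutable dispatcherStorage;

    constructor(DispatcherStorage store) {
        dispatcherStorage = store;
    }

    // Catches every call, since the Dispatcher declares no other functions.
    fallback() external {
        address target = dispatcherStorage.lib();
        assembly {
            // Forward the original calldata unchanged.
            calldatacopy(0, 0, calldatasize())
            let ok := delegatecall(gas(), target, 0, calldatasize(), 0, 0)
            // Copy back whatever the library returned (or its revert data).
            returndatacopy(0, 0, returndatasize())
            switch ok
            case 0 { revert(0, returndatasize()) }
            default { return(0, returndatasize()) }
        }
    }
}
```

Upgrading then comes down to deploying a new library version and calling replace on the storage contract; the user-facing contract and the Dispatcher are never redeployed.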

This solution works great, but it has some minor limitations.

Limitations

The dispatcher needs to know the memory size of the return value of each library call. Right now this is solved with a mapping from function signatures to their return type sizes. This was intentionally left out of the drawing for the sake of simplicity.
Given the way delegatecall works at the EVM level, it can only be used safely between contracts that share the same storage footprint. As libraries have no storage, we kept the Dispatcher without storage as well. That’s why there is a separate DispatcherStorage to keep all the data it needs. Also, the address of the DispatcherStorage needs to be hardcoded in the Dispatcher contract’s bytecode.
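The first limitation can be pictured as a small lookup table keyed by function selector. The contract below is only a sketch under assumptions: ReturnSizes, setSize and the owner check are hypothetical names, and the signature string must be given exactly as the library's ABI defines it for the selector to match.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of a selector-to-return-size table.
contract ReturnSizes {
    address public owner = msg.sender;

    // 4-byte function selector => size in bytes of that function's return data.
    mapping(bytes4 => uint32) public sizes;

    // Register the return size for one library function signature.
    function setSize(string memory signature, uint32 size) external {
        require(msg.sender == owner, "not owner");
        // The selector is the first four bytes of the signature's keccak256 hash.
        sizes[bytes4(keccak256(bytes(signature)))] = size;
    }
}
```

A dispatcher that relies on such a table would look up sizes(msg.sig) before forwarding the call, which means every new library function has to be registered before it can return data.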

Note that nothing special is needed for the user-facing contract (the Token contract), except that instead of being linked with a concrete version of the library, it has to be linked with the dispatcher.

Here is the implementation of the solution:

maraoz/solidity-proxy: Solidity implementation of a delegate proxy (github.com)

Happy Pseudo-Versioning!

Sources:

Jorge Izquierdo’s article on Library Driven Development

Simon de la Rouviere’s article on ThrowProxy

Thanks for reading ;)