Photo by on Roman Mager Unsplash Vulnerabilities in smart contracts are threatening projects, developers and investors for a long time. A growing number of teams are putting efforts in this field with various approaches to secure contracts. SECBIT Labs proposes combining the formal proofs with the traditional test and security audit. In this article, we take the ERC20 formal proof in our GitHub repository as an example to show the use of formal proofs on smart contracts. We hope the formal proofs can help to eliminate buggy contracts, and secure all aspects of smart contracts including design logic, implementation, economic system, . blockchain security tokenlibs-with-proofs etc What is Formal Proof Formal proof is a concept under formal verification, which is defined on as Wikipedia In the context of hardware and software systems , formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics . The verification of these systems is done by providing a formal proof on an abstract mathematical model of the system, the correspondence between the mathematical model and the nature of the system being otherwise known by construction. Examples of mathematical objects often used to model systems are: finite state machines , labelled transition systems , Petri nets , vector addition systems , timed automata , hybrid automata , process algebra , formal semantics of programming languages such as operational semantics , denotational semantics , axiomatic semantics and Hoare logic . Essentially, formal verification is to prove properties of a mathematical model. In formal proofs of smart contracts, the model is the smart contracts and the way they are running, and properties are those desired behaviors of smart contracts, such as no integer overflow and underflow can ever happen. good In detail, we formally model the smart contracts by the , prove they agree with the given , and prove the specification can further guarantee certain . All proofs are provided as explicit whose correctness can be checked independently of the proving process. program logic contract specification high-level properties proof objects When talking about any formal proofs, we must keep following prerequisites in mind: All proofs are under certain assumptions; The correctness of proofs relies on the formal logic theory; The final proof result and theorem indeed define the correctness and security of smart contracts. In formal proofs of ERC20 contracts, the first point requires the correctness of fundamental definitions; the second demands a valid proof, which can be constructed and checked by a popular proving assistant Coq; the last one relies on that we state common ERC20 properties correctly. Note: the above three points are prerequisites and the (TCB) of our formal proofs on smart contracts. Every security system depends on certain TCB. The smaller and the stronger the TCB is, the safer the system is. trusted computing base Why Formal Proofs for Smart Contracts Ethereum improved and enhanced smart contracts by introducing a Turing-complete smart contract language, which brings significant flexibility and expressiveness as much as the traditional programming languages. Inevitably, bugs and security issues are introduced as well, causing tremendous financial losses. SECBIT Labs has performed several rounds of scans and analyses on all deployed token contracts on Ethereum, and recorded a total of 4,172 contracts with vulnerabilities and security risks. 101 of them are also recorded by CoinMarketCap, and a full list can be found in repo by SECBIT. awesome-buggy-erc20-tokens After auditing numerous smart contracts, SECBIT Labs has realized that the security issues in smart contracts are much more complex and critical than the traditional software: The trust of smart contracts comes from their immutability, , once deployed no one can modify the code easily. However, it also implies that, if not prepared in advance, it’s difficult to fix bugs easily and quickly, especially when being attacked. Such fragility definitely harms the trust and value of token economy and blockchain projects. i.e. Implementations of many deployed smart contracts are open sourced. Though it can improve the transparency and trust of contracts, it also reduces the cost for attackers to find bugs. The design of smart contract language still has lots of limitations and flaws, making it easier to introduce bugs and security risks. A lack of experienced developers further aggravated the issue. At present, there are two main approaches to address smart contract security issues: test and audit. Though they are necessary for avoiding common vulnerabilities, their capabilities remain limited. Test The security team can generate test cases manually or by automatic test utilities and use them to examine the input combinations and the execution paths of contracts as many as possible. However, it’s difficult to guarantee 100% coverage, so no issue discovered in tests does not imply bug-free contracts. Audit Professional security auditors inspect the implementation and the logic of contracts, and hunt for bugs and risks. The audit mostly relies on the subjective experience of auditors, so the scope and the effectiveness of audit are difficult to measure and evaluate. Formal proof describes the programs by mathematical models, and proves their correctness and security by mathematical proofs. It is based on the strict and rigorous mathematics and logic, and therefore can cover all program behavior and strictly guarantee the correctness and security which are precisely defined. Formal proof has been applied successfully in critical areas including astronautics, high-speed railway, nuclear power and aeronautics. Because of the fragility of smart contracts and the large amount of assets managed by them, smart contracts demand extreme security. In addition, some high-level and complex properties about economics and game theories cannot be validated easily by test and audit. Thus, formal proof is no doubt an efficient way to secure those small yet complex smart contracts. How to Formally Prove Smart Contracts Proving Structure The formal proof of a smart contract involves four parts: source code, specifications, properties, and proof. Structure of Formal Proof Source Code: is the proving object. Specifications: defines the expected behavior of every contract function. Properties: defines the desired or properties when the contract works as a whole, such as the immutability of the total amount of tokens. good Proof: reasons the contract does implement the specification and guarantee the properties. Properties Proved Our proof on ERC20 contract primarily proves the following properties of the contract: : no harmful overflow in transfer() and transferFrom() nat_nooverflow_dsl_nooverflow Lemma nat_nooverflow_dsl_nooverflow: forall (m: state -> a2v) st env msg, (_from = _to / (_from <> _to /\ (m st _to <= MAX_UINT256 - _value)))%nat -> ((from == to) || ((fun st env msg => m st (to st env msg)) <= max_uint256 - value))%dsl st env msg = otrue. : totalSupply equals the sum of all balances after executing any steps in contracts Property_totalSupply_equal_to_sum_balances (* Prop #1: total supply is equal to sum of balances *) Theorem Property_totalSupply_equal_to_sum_balances : forall env0 env msg ml C E C' E', create env0 msg C E -> env_step env0 env -> run env C ml C' E' -> Sum (st_balances (w_st C')) (st_totalSupply (w_st C')). : Token sum remains unchanged after executing transfer() Property_totalSupply_fixed_transfer (* Prop #2: total supply is fixed with transfer *) Theorem Property_totalSupply_fixed_transfer: forall env C C' E' msg to v spec preP evP postP, spec = funcspec_transfer to v (w_a C) env msg -> preP = spec_require spec -> evP = spec_events spec -> postP = spec_trans spec -> preP (w_st C) /\ evP (w_st C) E' /\ postP (w_st C) (w_st C') -> (st_totalSupply (w_st C)) = (st_totalSupply (w_st C')). : Token sum remains unchanged after initialization Property_totalSupply_fixed_after_initialization (* Prop #3: total supply is fixed after initialization *) Theorem Property_totalSupply_fixed_after_initialization: forall env0 env msg C E C' E', create env0 msg C E -> step env C msg C' E' -> (st_totalSupply (w_st C)) = (st_totalSupply (w_st C')). : Token sum remains unchanged after executing transferFrom() Property_totalSupply_fixed_delegate_transfer (* Prop #4: total supply is fixed with delegate transfer *) Theorem Property_totalSupply_fixed_delegate_transfer1: forall env C C' E' from msg to v spec, spec = funcspec_transferFrom_1 from to v (w_a C) env msg -> (spec_require spec) (w_st C) /\ (spec_events spec) (w_st C) E' /\ (spec_trans spec) (w_st C) (w_st C') -> (st_totalSupply (w_st C)) = (st_totalSupply (w_st C')). Theorem Property_totalSupply_fixed_delegate_transfer2: forall env C C' E' from msg to v spec, spec = funcspec_transferFrom_2 from to v (w_a C) env msg -> (spec_require spec) (w_st C) /\ (spec_events spec) (w_st C) E' /\ (spec_trans spec) (w_st C) (w_st C') -> (st_totalSupply (w_st C)) = (st_totalSupply (w_st C')). : Transferring only affects balances of accounts involved, other accounts remain unchanged Property_from_to_balances_change (* Prop #5: only balances of from and to changed by transfer*) Theorem Property_from_to_balances_change_only: forall env C C' E' to addr msg v spec, spec = funcspec_transfer to v (w_a C) env msg -> (spec_require spec) (w_st C) /\ (spec_events spec) (w_st C) E' /\ (spec_trans spec) (w_st C) (w_st C') -> m_sender msg <> to -> m_sender msg <> addr -> to <> addr -> (st_balances (w_st C') to = (st_balances (w_st C) to) + v) /\ (st_balances (w_st C') (m_sender msg) = (st_balances (w_st C) (m_sender msg)) - v) /\ st_balances (w_st C') addr = st_balances (w_st C) addr. Proving Process The proved contract is a standard ERC20 contract, including six public functions: , , , , and . A brief proving process is shown as below. constructor transfer() transferFrom() balanceOf() approve() allowance() 1. Define Specification A specification language is a formal language in computer science used during systems analysis , requirements analysis and systems design to describe a system at a much higher level than a programming language , which is used to produce the executable code for a system. defines the specification of the six public functions in the contract. Spec.v The execution of every function may involve multiple cases and behave differently in each case. We define one specification rule for one case, so a specification usually is composed of one or multiple rules. Take the specification of as an example: it consists of two rules- and , corresponding to two cases that the approved quota is smaller than or equal to . transferFrom() funspec_transferFrom_1 funcspec_transferFrom_2 2**256-1 2**256-1 Using as an example, every specification rule is composed of following parts. funcspec_transferFrom_1 specifies prerequisites that must hold before executing in the first case. spec_require transferFrom Definition funcspec_transferFrom_1 (from: address) (to: address) (value: value) := fun (this: address) (env: env) (msg: message) => (mk_spec (fun S : state => (* require(balances[_from] >= _value); require(_from == _to || balances[_to] <= MAX_UINT256 - _value); require(allowance >= _value); allowance < MAX_UINT256 *) st_allowed S (from, m_sender msg) < MAX_UINT256 ) ... ) st_balances S from >= value /\ ( ) ((from = to) / (from <> to /\ st_balances S to <= MAX_UINT256 - value)) /\ ( ) st_allowed S (from, m_sender msg) >= value /\ ( specifies events that must have occurred after successfully executing the function: spec_events Definition funcspec_transferFrom_1 (from: address) (to: address) (value: value) := fun (this: address) (env: env) (msg: message) => (mk_spec (* require omitted *) (* emit Transfer(_from, _to, _value); *) (* return True; *) (fun S E => E = (ev_Transfer (m_sender msg) from to value) :: (ev_return _ True) :: nil) where, represents the event ; ev_Transfer Transfer is a pseudo event in specification to represent the function return and the return value. ev_return specifies the state transition in a successful function execution, e.g. changes in storage variables: spec_trans Definition funcspec_transferFrom_1 (from: address) (to: address) (value: value) := fun (this: address) (env: env) (msg: message) => (mk_spec (* require omitted events omitted *) ) ( (* State transition: *) (fun S S' : state => (* Unchanged. *) st_totalSupply S' = st_totalSupply S /\ st_name S' = st_name S /\ st_decimals S' = st_decimals S /\ st_symbol S' = st_symbol S /\ (* balances[_from] -= _value; *) st_balances S' = (st_balances S) $+{ from <- -= value } (* balances[_to] += _value; *) $+{ to <- += value } /\ (* allowed[_from][msg.sender] -= _value; *) st_allowed S' = (st_allowed S) $+{ from, m_sender msg <- -= value } ) which states that, corresponding entries in and must be changed as expected, while the other storage variables are not changed. balances allowed Please refer to for the complete definitions of all specification rules. Spec.v 2. Prove Against Specifications Before proving high-level properties upon specifications defined above, we need to ensure that Solidity source code does implement the specifications. First we represent the Solidity contract code in Coq. A domain specification language (DSL) is defined in Coq for this purpose. For example, the ERC20 function can be represented as below in : transfer() DSL.v (* DSL representation of transfer(), generated from solidity *) Definition transfer_dsl : Stmt := (@require(balances[msg.sender] >= value) ; @require((msg.sender == to) || (balances[to] <= max_uint256 - value)) ; @balances[msg.sender] -= value ; @balances[to] += value ; (@emit Transfer(msg.sender, to, value)) ; (@return true) ). Then we prove that every function in DSL agrees with specifications, e.g. we prove that agrees with by the following lemma in : transfer() funcspec_transfer DSL.v Lemma transfer_dsl_sat_spec: forall st env msg this, spec_require (funcspec_transfer _to _value this env msg) st -> forall st0 result, dsl_exec transfer_dsl st0 st env msg this nil = result -> spec_trans (funcspec_transfer _to _value this env msg) st (ret_st result) /\ spec_events (funcspec_transfer _to _value this env msg) (ret_st result) (ret_evts result). Please refer to for the complete DSL definitions, DSL representations, and the specification proofs. DSL.v 3. Define and Prove High-Level Properties Specification above define the expected behavior of each execution of one function, while high-level properties define the expected behavior presented by the contract as a whole when arbitrary message calls are received. For example, the following property defined in states the property that no token will lose regardless of which message calls are handled by the contract: DSL.v (* Prop #1: total supply is equal to sum of balances *) Theorem Property_totalSupply_equal_to_sum_balances : forall env0 env msg ml C E C' E', create env0 msg C E -> env_step env0 env -> run env C ml C' E' -> Sum (st_balances (w_st C')) (st_totalSupply (w_st C')). Please refer to for the complete definitions and proofs of all high-level properties. Spec.v Future Plans We created the project to collect and publish our formal proofs of various token contracts. tokenlibs-with-proofs We are going to prove more common properties to help the community to avoid vulnerabilities and risks. We are going to prove more token contracts with more features, e.g. freezing, upgrading, authority manipulating, adding ownership. We are researching on proving high-level properties with respect to game theories, e.g. fairness, optimistic strategy, Nash equilibrium. We hope this project can be a helpful resource and reference for demonstrating the use of formal proofs in smart contracts. We also hope more smart contract developers and security experts can join to the research and the application of improving smart contract security by formal proofs. If you have any questions, suggestions and ideas, welcome to discuss in our . Gitter channel Special thanks to Yi Tang from ConsenSys China who discussed the topic with us and provided feedback and comment. Related Work SECBIT Labs: awesome-buggy-erc20-tokens. https://github.com/sec-bit/awesome-buggy-erc20-tokens ConsenSys: Tokens. https://github.com/ConsenSys/Tokens ConsenSys: Smart Contract Security Best Practices. https://github.com/ConsenSys/smart-contract-best-practices Michael Burge: DSLs for Ethereum Contracts. https://www.michaelburge.us/2018/05/15/ethereum-chess-engine.html C. A. R. Hoare. An axiomatic basis for computer programming. , 26(1):53–56, Jan. 1983. Communications of the ACM L. Lamport. Proving the correctness of multiprocess programs. SE-3, 2 (March 1977), 125–143. IEEE Transactions on Software Engineering G. Necula. Proof-carrying code. In . pages 106–119, New York, Jan. 1997. Proc.24th ACM symposium on Principles of programming languages (POPL’97) Inria: The Coq Proof Assistant. https://coq.inria.fr/ Nielson, F., Nielson, H.R., Hankin, C.: Principles of program analysis. Springer (2015) Acta Scientiarum Naturalium Universitatis Pekinensis: Logic Based Formal Verification Methods: Progress and Applications http://xbna.pku.edu.cn/html/2016-2-363.htm SECBIT was founded by a group of cryptocurrency-enthusiasts. We are doing research on smart contract security, smart contract formal verification, crypto-protocols, compilation, contract analysis, game theory and crypto-economics.