Authors:
(1) ZHIYUAN WEI, Beijing Institute of Technology, China;
(2) JING SUN, University of Auckland, New Zealand;
(3) ZIJIAN ZHANG, Beijing Institute of Technology, China;
(4) XIANHAO ZHANG, Beijing Institute of Technology, China;
(5) XIAOXUAN YANG, Beijing Institute of Technology, China;
(6) LIEHUANG ZHU, Beijing Institute of Technology, China.
Overview of Smart Contracts and Survey Methodology
Vulnerability in Smart Contracts
Conclusions, Acknowledgement and References
This section presents a methodology for identifying and categorizing vulnerabilities in blockchain smart contracts. To ensure comprehensive coverage of vulnerability types, our investigation is conducted from multiple perspectives. For Ethereum smart contracts, we primarily rely on two key projects: the Decentralized Application Security Project (DASP) [1] and the Smart Contract Weakness Classification (SWC) Registry [2]. DASP provides a ranking of the top 10 smart contract vulnerabilities. However, this list has not been updated since its initial release in 2018. The SWC Registry, on the other hand, provides an implementation of weakness classification and lists 37 vulnerabilities at the time of writing. Although it lacks a concrete classification and ranking, it offers valuable insights into specific vulnerability types.
In addition to these projects, we also draw insights from several related research papers, including [21, 64, 65, 83, 99, 106]. By collecting and examining all identified smart contract vulnerabilities and their respective causes, we aim to establish a comprehensive methodology for categorizing the root causes of vulnerabilities in blockchain smart contracts. This methodology is built upon the well-established Common Weakness Enumeration (CWE) rules and identifies four primary root causes of vulnerabilities: coding standards, data authenticity, access control, and control flow management.
To illustrate our classification framework, Figure 2 provides a visual representation of smart contract vulnerabilities based on their root causes and corresponding secondary causes. The framework encompasses 14 distinct secondary causes, which are associated with 40 specific vulnerabilities found in Ethereum, Hyperledger Fabric (HF), and EOSIO. The figure also shows the status of each vulnerability: whether it has been eliminated, can be mitigated by specific methods, or remains unsolved. Because the naming and definition of vulnerabilities are inconsistent across studies, we have standardized the names of these vulnerabilities to ensure clarity and uniformity in our analysis. We also offer detailed explanations for each vulnerability to promote better comprehension and facilitate future research and analysis.
This type of weakness happens when a smart contract is not developed in accordance with established coding rules and best practices. This issue often arises due to the relative novelty of programming languages used for smart contracts, leading to a shortage of experienced developers in the domain. Furthermore, some developers may lack a comprehensive understanding of the specific coding standards for these programming languages, resulting in mistakes and vulnerabilities within the smart contract code. Improper adherence to coding standards can manifest in various ways:
3.1.1 Syntax Errors (VE1, VE2). These errors occur when the code violates the syntax rules of the programming language, such as spelling and punctuation flaws. Two specific examples of syntax errors are typographical error (VE1) and Right-To-Left-Override control character (VE2). VE1 refers to a typographical error in the code where an incorrect operator is used, and VE2 involves the misuse of the U+202E unicode character. Both of these vulnerabilities can be mitigated by following best practices and employing preventive measures.
3.1.2 Version Issues (VE3, VE4, VE5, VE6, VE7). Version issues in smart contracts can arise due to the rapid progress and updates in smart contract technology, including changes in compiler versions. When developers write code using outdated or deprecated functions, operators, or coding standards in a new compiler version, it can result in unexpected behaviors and potentially exploitable states. This category of vulnerabilities includes five specific flaws: outdated compiler version (VE3), floating pragma (VE4), use of deprecated Solidity functions (VE5), incorrect constructor name (VE6), and uninitialized storage pointer (VE7). To avoid these vulnerabilities, it is essential to stay updated on the latest version of the compiler and adhere to the recommended coding standards and best practices.
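As a rough illustration of how a floating pragma (VE4) might be flagged, the following Python sketch scans Solidity source text for version pragmas that are not pinned to one exact compiler version. The function name `has_floating_pragma` and the regular expressions are assumptions made for this example; they are not taken from any existing analysis tool and do not cover every pragma form.

```python
import re

# A pragma counts as "pinned" here only if it names one exact version,
# e.g. "pragma solidity 0.8.19;" (an optional leading "=" is allowed).
PINNED = re.compile(r"pragma\s+solidity\s+=?\s*\d+\.\d+\.\d+\s*;")

# Matches any Solidity version pragma statement up to its semicolon.
ANY_PRAGMA = re.compile(r"pragma\s+solidity[^;]*;")


def has_floating_pragma(source: str) -> bool:
    """Return True if any version pragma uses a range or caret (not pinned)."""
    pragmas = ANY_PRAGMA.findall(source)
    return any(PINNED.fullmatch(p) is None for p in pragmas)


floating = "pragma solidity ^0.4.24;\ncontract C {}"
pinned = "pragma solidity 0.8.19;\ncontract C {}"
print(has_floating_pragma(floating), has_floating_pragma(pinned))
```

A check of this kind only inspects the pragma line; it says nothing about whether the pinned version itself is outdated (VE3).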
3.1.3 Irrelevant Code (VE8, VE9, VE10). Irrelevant code in smart contracts refers to code that is not essential for the execution or functionality of the contract. While this code may not directly impact the correctness of the contract, it can introduce security vulnerabilities or make them harder to detect. It is not uncommon for programming code to contain unused or shadowing parts. This category of vulnerabilities includes three specific flaws: presence of unused variables (VE8), code with no effects (VE9), and shadowing state variables (VE10). To mitigate these vulnerabilities, it is important for contract writers to thoroughly test the functionality and behavior of the code before deployment.
3.1.4 Visibility (VE11, VE12). Solidity provides visibility labels for functions and variables, namely public, external, private, or internal. Each visibility label determines who can access or call specific functions or variables. In Solidity versions before 0.5.0, a function without an explicit visibility label is treated as public, whereas a state variable without a label defaults to internal. Forgetting to set the appropriate visibility for a function or variable can lead to two vulnerabilities: function default visibility (VE11) and state variable default visibility (VE12). To mitigate these risks, contract writers should explicitly declare a suitable visibility for each function and variable.
This type of weakness occurs when systems fail to properly verify the origin or authenticity of data, which can allow attackers to manipulate or access sensitive information. The resulting security issues fall into two groups: those involving cryptographic signatures and those involving cryptographic data.
3.2.1 Cryptographic Signatures (VE13, VE14, VE15). Cryptographic signatures play a crucial role in validating the authenticity and integrity of data within blockchain systems. In Ethereum (and Bitcoin), the Elliptic Curve Digital Signature Algorithm (ECDSA) is commonly used for cryptographic signature generation and verification. However, there are certain vulnerabilities related to cryptographic signatures that can be exploited by attackers: signature malleability (VE13), missing protection against signature replay attacks (VE14), and lack of proper signature verification (VE15). To mitigate these vulnerabilities, it is of utmost importance to implement robust signature verification mechanisms.
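The replay problem (VE14) can be illustrated with a small Python sketch in which a verifier rejects any signed message whose nonce it has already consumed. This is only a model under stated assumptions: HMAC-SHA256 with a shared key stands in for ECDSA signing and verification, and the `Verifier` class and its nonce scheme are hypothetical names introduced for the example.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    # Stand-in for an ECDSA signature (assumption): a keyed MAC over the message.
    return hmac.new(key, message, hashlib.sha256).digest()


class Verifier:
    """Accepts each (nonce, payload) pair at most once, in nonce order."""

    def __init__(self, key: bytes):
        self.key = key
        self.next_nonce = 0

    def accept(self, payload: bytes, nonce: int, signature: bytes) -> bool:
        # The nonce is part of the signed message, so it cannot be swapped.
        message = nonce.to_bytes(8, "big") + payload
        if not hmac.compare_digest(sign(self.key, message), signature):
            return False  # invalid signature
        if nonce != self.next_nonce:
            return False  # replayed or out-of-order message
        self.next_nonce += 1
        return True


key = b"demo-key"
payload = b"pay 10 to bob"
sig = sign(key, (0).to_bytes(8, "big") + payload)
verifier = Verifier(key)
first = verifier.accept(payload, 0, sig)   # fresh message
second = verifier.accept(payload, 0, sig)  # same message replayed
print(first, second)
```

Without the nonce check, the second call would succeed and the same signed payment could be executed repeatedly.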
3.2.2 Cryptographic Data (VE16, VS1, VS2). This type of vulnerability in smart contracts refers to situations where sensitive data, despite being marked as private, can still be accessed by unauthorized parties. This vulnerability arises due to the inherent transparency of blockchain transactions, which allows the content of transactions to be readable by anyone. Attackers can easily access and acquire the data stored in the contract, leading to significant financial losses for the contract creator and participants. The vulnerabilities associated with cryptographic data include unencrypted private data on-chain (VE16), fake EOS (VS1), and forged notification, fake receipt (VS2). To address these vulnerabilities, it is crucial to prioritize the proper encryption of sensitive data before storing it on-chain.
This type of weakness arises when unauthorized users gain access to a contract and can perform actions that they should not be allowed to. Such vulnerabilities can have significant repercussions for the smart contract ecosystem, including financial losses and other adverse outcomes. Improper access control vulnerabilities manifest in two primary forms: unprotected low-level function and coding issues. Addressing these vulnerabilities is crucial to maintaining the security and integrity of smart contracts.
3.3.1 Unprotected Low-level Function (VE17, VE18, VE19). Users can utilize low-level features such as SELFDESTRUCT, tx.origin, and DELEGATECALL to control contracts. These features provide powerful capabilities but can be easily abused by malicious users if not used with caution. The following are the vulnerabilities associated with unprotected low-level functions:
• Unsafe suicidal (VE17) The SELFDESTRUCT function allows a contract to be removed from the blockchain, returning any remaining Ether to a designated target address. While this can be useful in certain scenarios, it carries risks. If Ether is sent to a contract that has self-destructed, the funds will be permanently lost and cannot be recovered. Consequently, it is imperative to exercise caution when utilizing the SELFDESTRUCT function. Developers should carefully consider the variables and conditions involved before employing this function.
• Authorization through tx.origin (VE18) The tx.origin variable represents the address that initiated a transaction, while msg.sender represents the immediate invoker of a function. This vulnerability arises when one contract calls another: inside the called contract, tx.origin does not represent the calling address but rather the original initiator of the transaction. A contract that authorizes actions by checking tx.origin can therefore be tricked through an intermediate contract, which can result in funds being transferred to the wrong address. To mitigate this vulnerability, it is recommended to use msg.sender instead of tx.origin for authorization checks.
• Unsafe delegatecall (VE19) This vulnerability arises from the DELEGATECALL instruction, which allows third-party code to be executed within the context of a current contract. This vulnerability, also known as delegatecall to untrusted callee, can be exploited by attackers to take control of another contract, especially in proxy contracts where code can be dynamically loaded from different addresses at runtime. If an attacker can manipulate the address used in DELEGATECALL, they may modify storage or execute malicious code, leading to unauthorized actions such as fund theft or contract destruction.
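The difference between tx.origin and msg.sender described for VE18 can be sketched with a toy Python model. It does not reproduce EVM semantics: the `Wallet` class, the `use_tx_origin` flag, and the `phishing_call` helper are hypothetical names that only mimic how the two values diverge when a call passes through an intermediate contract.

```python
from dataclasses import dataclass


@dataclass
class Wallet:
    """Toy wallet contract; the flag selects the vulnerable or the safe check."""
    owner: str
    balance: int = 100
    use_tx_origin: bool = True

    def transfer(self, tx_origin: str, msg_sender: str, amount: int) -> bool:
        # Vulnerable variant: authorize on the transaction initiator.
        # Safe variant: authorize on the immediate caller.
        caller = tx_origin if self.use_tx_origin else msg_sender
        if caller != self.owner or amount > self.balance:
            return False
        self.balance -= amount
        return True


def phishing_call(wallet: Wallet, victim: str, attacker_contract: str) -> bool:
    # The victim calls the attacker's contract, which forwards a call to the
    # wallet. tx.origin is still the victim; msg.sender is the attacker contract.
    return wallet.transfer(tx_origin=victim,
                           msg_sender=attacker_contract,
                           amount=wallet.balance)


vulnerable = Wallet(owner="alice", use_tx_origin=True)
safe = Wallet(owner="alice", use_tx_origin=False)
drained = phishing_call(vulnerable, "alice", "attacker_contract")
blocked = phishing_call(safe, "alice", "attacker_contract")
print(drained, vulnerable.balance, blocked, safe.balance)
```

In this model the wallet that checks the transaction initiator is emptied, while the one that checks the immediate caller rejects the forwarded call.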
3.3.2 Coding Issues (VE20, VE21). Due to unintentionally exposing some functions, malicious parties can withdraw some or all Ether from the contract account. This type of flaw leads to unprotected Ether withdrawal (VE20) and write to arbitrary storage location (VE21). This type of flaw can be avoided by carefully designing the code or structure.
This type of weakness occurs when attackers can exploit the openness of the public blockchain to gain control over the program’s execution in unexpected ways. This weakness can manifest in various forms as follows:
3.4.1 Improper Input (VE22, VE23, VE24). Due to error handling in the EVM, improper input can cause assert violation (VE22), requirement violation (VE23), and wrong address (VE24). Solidity’s assert() and require() guard functions were introduced to improve the readability of contract code. However, both require strong logical conditions, and improper input will cause them to fail. Moreover, a contract address should be 40 hexadecimal characters long. If the address length is incorrect, the contract can still be deployed without any warning from the compiler. Ethereum automatically registers a new address that is owned by nobody, and any Ether sent to this address becomes inaccessible. To mitigate these vulnerabilities, it is important to carefully handle input validation.
3.4.2 Incorrect Calculation (VE25, VH1, VE26, VS3). This type of weakness happens when contracts perform a calculation that generates incorrect results, which may lead to a larger security issue such as arbitrary code execution. Arithmetic overflow/underflow (VE25, VH1) is one of the most common errors in software, and it occurs in both Ethereum and HF. Call-stack overflow (VE26) occurs because the EVM limits the depth of the call stack to a maximum of 1024 nested function calls. If an attacker reaches this limit by repeatedly invoking functions, it can result in a call-stack overflow vulnerability. Once the call stack reaches its maximum depth, subsequent instructions, such as the send instruction, will fail. Asset overflow (VS3) specifically pertains to the EOSIO blockchain. It occurs when there is an overflow in the asset type, which represents token balances and other asset values on the EOSIO platform.
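A minimal off-chain input check for the wrong-address case (VE24) could look like the Python sketch below. The helper name `is_well_formed_address` is an assumption for this example, and the check covers only the length and character set mentioned above; it does not verify a checksum or whether the address is actually in use.

```python
import re

# "0x" prefix followed by exactly 40 hexadecimal characters.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def is_well_formed_address(addr: str) -> bool:
    """Return True only if addr has the expected prefix, length, and alphabet."""
    return ADDRESS_RE.fullmatch(addr) is not None


good = "0x" + "ab" * 20   # 40 hex characters
short = "0x" + "ab" * 19  # 38 hex characters: one byte is missing
print(is_well_formed_address(good), is_well_formed_address(short))
```

Rejecting malformed addresses before a transaction is built avoids sending Ether to an address that nobody controls.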
• Arithmetic overflow/underflow (VE25) This vulnerability, commonly known in software programming, is not specific to smart contracts. It occurs when an arithmetic operation produces a value that exceeds the maximum or minimum range of integer representation. In Ethereum contracts, this vulnerability arises due to the behavior of the EVM’s integer arithmetic and the lack of automatic checks for arithmetical correctness. For instance, if the result of an addition operation surpasses the maximum value representable by a specific integer type, it wraps around to a lower-than-expected value without raising an error or warning. Research by Torres et al. [129] has identified over 42,000 contracts, particularly ERC-20 Token contracts, vulnerable to arithmetic overflow/underflow. To mitigate this issue, it is advisable to employ libraries such as SafeMath [5].
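The wrap-around behavior described above can be reproduced in a few lines of Python by masking results to 256 bits, the width of the EVM's unsigned integer word. The function names are illustrative assumptions; `checked_add` mimics the kind of guard a SafeMath-style library applies and is not that library's actual code.

```python
UINT256_MAX = 2**256 - 1


def wrapping_add(a: int, b: int) -> int:
    # Unchecked addition: the result silently wraps modulo 2**256.
    return (a + b) & UINT256_MAX


def wrapping_sub(a: int, b: int) -> int:
    # Unchecked subtraction: going below zero wraps to a very large value.
    return (a - b) & UINT256_MAX


def checked_add(a: int, b: int) -> int:
    # Guarded addition: fail loudly instead of wrapping.
    result = a + b
    if result > UINT256_MAX:
        raise OverflowError("uint256 addition overflow")
    return result


print(wrapping_add(UINT256_MAX, 1))          # wraps to 0
print(wrapping_sub(0, 1) == UINT256_MAX)     # underflow wraps to the maximum
```

A token balance of 0 that is decremented without a check would, in this model, become the largest representable value, which is why the guarded variant aborts instead.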
3.4.3 Denial of Service (VE27, VE28, VE29). Denial of Service (DoS) vulnerabilities can affect smart contracts and result in exceptions that may lead to undesirable consequences such as contract lock-ups or freezing of funds. DoS attacks can be carried out in multiple ways, with two primary causes: failed calls and gas consumption. One type of DoS vulnerability is DoS with failed call (VE27), where an external call fails, whether accidentally or deliberately. This vulnerability is particularly relevant in payment scenarios where multiple calls are executed within a single transaction. The failure of an external call can disrupt the intended flow of the contract, potentially leading to undesired consequences. The other DoS problems are gas-related. Although gas is a mechanism designed to prevent resource abuse, attackers can exploit it to trigger two further vulnerabilities: insufficient gas griefing (VE28) and DoS with block gas limit (VE29). Mitigating these DoS vulnerabilities requires careful design and consideration of gas usage.
• DoS with block gas limit (VE29) To safeguard the network, each block has a predetermined maximum amount of gas that can be consumed, known as the Block Gas Limit. The gas consumption of a transaction must be less than or equal to the Block Gas Limit; otherwise, the transaction will fail to execute and any changes made during its execution will be rolled back. This ensures that no single transaction monopolizes excessive resources within a block, promoting fair usage and preventing DoS attacks that could overwhelm the network. It is important to note that different EVM instructions have varying gas costs. Some operations, such as ADD, AND, and POP, have relatively low gas costs, while others, like SSTORE, incur higher gas costs. This differentiation encourages efficient and responsible use of gas resources.
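To make the block gas limit concrete, the following Python sketch models a contract that pays every registered recipient in a single transaction. The numbers `BLOCK_GAS_LIMIT` and `GAS_PER_PAYOUT` are illustrative assumptions, not measured values, and the all-or-nothing return value stands in for the rollback described above.

```python
BLOCK_GAS_LIMIT = 30_000_000  # illustrative limit (assumption)
GAS_PER_PAYOUT = 30_000       # illustrative cost of one payout (assumption)


def payout_all(recipients, gas_limit=BLOCK_GAS_LIMIT):
    """Pay everyone in one transaction; return [] if the gas budget is exceeded."""
    gas_used = 0
    paid = []
    for recipient in recipients:
        gas_used += GAS_PER_PAYOUT
        if gas_used > gas_limit:
            # Out of gas: the whole transaction fails and nothing is paid.
            return []
        paid.append(recipient)
    return paid


def max_batch_size(gas_limit=BLOCK_GAS_LIMIT):
    # Largest number of payouts that still fits into one transaction.
    return gas_limit // GAS_PER_PAYOUT


small = ["user"] * 1000
bloated = ["user"] * 1001  # e.g. after an attacker registers extra addresses
print(len(payout_all(small)), len(payout_all(bloated)), max_batch_size())
```

Once the recipient list grows past what one transaction can afford, every attempt to pay all recipients fails; splitting the work into batches of at most `max_batch_size()` entries is one way to keep each transaction within the limit.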
3.4.4 Use of Low-level Function (VE30, VE31, VE32, VE33). Solidity’s low-level functions, such as call, transfer, send, mstore and abi.encodePacked, provide users with control and flexibility when interacting with smart contracts. However, improper use of these low-level functions can introduce unexpected behavior and vulnerabilities into the contract’s program logic. These vulnerabilities include unchecked send (VE30), arbitrary jump with function type variable (VE31), hash collisions (VE32), and message call with hardcoded gas amount (VE33). To mitigate these vulnerabilities, it is crucial to carefully review and validate the usage of low-level functions, handle exceptions appropriately, account for potential changes in gas costs, and implement robust input validation and verification mechanisms.
• Unchecked send (VE30) This vulnerability is also described as unchecked low-level call, unhandled exceptions, or exception disorder. It happens when a call fails accidentally or an attacker forces the call to fail. In some cases, developers may include code to check the success of the call, but they neglect to handle the exceptions properly. As a result, funds intended for transfer may not reach the intended recipient. This vulnerability stems from the inconsistent exception-handling behavior in Solidity, which can lead to unexpected outcomes if not handled correctly.
3.4.5 Improper Behavioral Workflow (VE34, VE35, VE36, VE37). This category refers to vulnerabilities that arise when the expected order or sequence of operations within a smart contract is manipulated by malicious users, leading to unexpected states or undesired behavior. These vulnerabilities include reentrancy (VE34), unexpected Ether balance (VE35), incorrect inheritance order (VE36), and infinite loop (VE37). Unexpected Ether balance (VE35) occurs when malicious users intentionally send funds to a contract in a specific manner to disrupt its intended behavior or cause a denial-of-service condition. By manipulating the contract’s Ether balance, attackers can affect the contract’s functionality and, in extreme cases, render it unusable. Incorrect inheritance order (VE36) is a vulnerability that arises from the improper ordering of contract inheritance. Malicious users can manipulate the inheritance order to achieve unexpected outcomes and potentially exploit vulnerabilities. Infinite loop (VE37) refers to a vulnerability where a contract falls into an infinite loop, leading to non-termination of contract execution.
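The effect of ignoring a failed send can be sketched in Python as follows. This is a toy model: the `send` helper imitates a low-level call that reports failure through its return value rather than by raising, and the `Recipient` and `Payout` classes are hypothetical names introduced only for this example.

```python
class Recipient:
    def __init__(self, accepts: bool):
        self.accepts = accepts
        self.received = 0

    def receive(self, amount: int) -> None:
        if not self.accepts:
            raise RuntimeError("recipient rejected the payment")
        self.received += amount


def send(recipient: Recipient, amount: int) -> bool:
    # Failure is reported as False; it does not propagate to the caller.
    try:
        recipient.receive(amount)
        return True
    except RuntimeError:
        return False


class Payout:
    def __init__(self, owed: dict):
        self.owed = dict(owed)

    def withdraw_unchecked(self, name: str, recipient: Recipient) -> None:
        send(recipient, self.owed[name])  # return value ignored
        self.owed[name] = 0               # debt cleared even if nothing was sent

    def withdraw_checked(self, name: str, recipient: Recipient) -> None:
        amount = self.owed[name]
        self.owed[name] = 0
        if not send(recipient, amount):
            self.owed[name] = amount      # restore state, then signal failure
            raise RuntimeError("send failed; debt kept on the books")


bob = Recipient(accepts=False)
payout = Payout({"bob": 5})
payout.withdraw_unchecked("bob", bob)
print(payout.owed["bob"], bob.received)  # debt erased, nothing received
```

In the unchecked variant the contract's bookkeeping claims the payment happened although the recipient received nothing; the checked variant keeps the debt and surfaces the failure.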
• Reentrancy (VE34) This vulnerability occurs when a contract invokes a function from an external contract, and the called contract has sufficient gas to invoke a callback into the calling contract. This creates a loop where the called contract re-enters the calling contract before the initial invocation is completed. Malicious attackers can exploit this vulnerability to manipulate the execution flow and potentially exploit vulnerabilities present in the contract. It is crucial to carefully review and secure contract interactions to prevent re-entrance attacks and ensure the integrity and security of the smart contract system.
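The re-entry loop described above can be simulated in Python with a callback that plays the role of the external contract. The sketch is a model under stated assumptions, not EVM code: the `Bank` and `Attacker` classes are hypothetical, gas is ignored, and the `safe` flag switches between updating the balance after the external call (vulnerable) and before it (a checks-effects-interactions ordering).

```python
class Bank:
    def __init__(self, safe: bool):
        self.safe = safe
        self.balances = {}
        self.vault = 0

    def deposit(self, who: str, amount: int) -> None:
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who: str, callback) -> None:
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:
            return
        if self.safe:
            self.balances[who] = 0  # effect first
            self.vault -= amount
            callback(amount)        # interaction last
        else:
            self.vault -= amount
            callback(amount)        # interaction before the balance update
            self.balances[who] = 0


class Attacker:
    def __init__(self, bank: Bank, name: str = "attacker"):
        self.bank, self.name, self.stolen = bank, name, 0

    def on_receive(self, amount: int) -> None:
        self.stolen += amount
        # Re-enter the bank before the first withdrawal has finished.
        self.bank.withdraw(self.name, self.on_receive)

    def attack(self) -> None:
        self.bank.withdraw(self.name, self.on_receive)


def run(safe: bool):
    bank = Bank(safe)
    bank.deposit("victim", 90)
    bank.deposit("attacker", 10)
    attacker = Attacker(bank)
    attacker.attack()
    return attacker.stolen, bank.vault


print(run(safe=False))  # attacker drains the whole vault
print(run(safe=True))   # attacker receives only its own deposit
```

With the balance cleared before the external call, the nested withdrawal sees a zero balance and returns immediately, so the loop never starts.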
3.4.6 Consensus Issues (VE38, VE39, VE40, VS4). In the blockchain, the synchronization of subsequent blocks with the majority of the network relies on following a consensus protocol, such as Proof of Work (PoW) or Proof of Stake (PoS). This consensus protocol allows network participants sufficient time to reach an agreement on which transactions should be included in the blocks. However, the synchronization process itself introduces vulnerabilities that can be exploited by attackers. These vulnerabilities include transaction order dependency (TOD, VE38), time manipulation (VE39), and bad randomness (VE40, VS4).
• TOD (VE38) This vulnerability, also known as frontrunning, occurs due to the prioritization mechanism for transactions in blockchain blocks. Miners have the ability to choose which transactions to include in a block and the order in which they are arranged. Since transactions are often prioritized based on gas price, a malicious miner who can see and react to transactions before they are mined may manipulate the transaction order to their advantage. By varying the order of transactions and manipulating the output of the contract, they can cause undesirable outcomes or financial losses for users.
• Time manipulation (VE39) This vulnerability arises when smart contracts rely on the timestamp information from blocks to perform certain functions. In Solidity, the current timestamp can be obtained using block.timestamp or now. However, this timestamp value can be manipulated by miners. If a contract’s functionality is dependent on the timestamp, miners can profit by choosing a suitable timestamp to manipulate the contract’s behavior. This vulnerability is also referred to as block values as a proxy for time.
• Bad randomness (VE40) This vulnerability concerns flaws in the generation of random numbers within smart contracts. Random numbers are often used to make decisions or determine outcomes. If the random number generation process is flawed, malicious actors may be able to predict the outcome of the contract and exploit it. One example of bad randomness is the use of a predictable seed value for the random number generator. If an attacker can guess or determine the seed value, they can predict the generated random numbers and manipulate the contract to their advantage.
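The interaction between time manipulation and bad randomness can be sketched in Python with a lottery whose only source of entropy is the block timestamp. Everything here is an illustrative assumption: `lottery_draw` is a hypothetical contract function, Python's `random.Random` stands in for whatever on-chain derivation a contract might use, and the 50-second window is an arbitrary figure for how far a miner could shift the timestamp.

```python
import random


def lottery_draw(block_timestamp: int, num_players: int) -> int:
    # Predictable seed: the timestamp is public and chosen by the block producer.
    rng = random.Random(block_timestamp)
    return rng.randrange(num_players)


def miner_pick_timestamp(base_ts: int, window: int,
                         num_players: int, miner_index: int):
    """Search the allowed timestamp window for a value that makes the miner win."""
    for ts in range(base_ts, base_ts + window):
        if lottery_draw(ts, num_players) == miner_index:
            return ts
    return None  # no winning timestamp inside the window


# Anyone who knows the timestamp can reproduce the draw exactly.
same = lottery_draw(1_700_000_000, 5) == lottery_draw(1_700_000_000, 5)

# A miner playing as index 2 looks for a favorable timestamp.
chosen = miner_pick_timestamp(1_700_000_000, 50, 5, miner_index=2)
print(same, chosen)
```

Because the draw is a pure function of a value the miner controls, the miner can try candidate timestamps off-chain and publish the block only with one that produces the desired winner.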
Since the last update of the DASP (Decentralized Application Security Project) in 2018, we have created a new list of the top 10 smart contract vulnerabilities that pose significant risks to the security and functionality of contracts. This list is based on the frequency of occurrence among various analysis tools available at https://sites.google.com/view/scanalysis-toollist. Figure 3 provides statistics on the occurrence of 22 vulnerabilities. The top 10 vulnerabilities are as follows: reentrancy (VE34), arithmetic overflow/underflow (VE25), DoS with block gas limit (VE29), unsafe suicidal (VE17), unsafe delegatecall (VE19), unchecked send (VE30), TOD (VE38), time manipulation (VE39), authorization through tx.origin (VE18), and various other vulnerabilities. These vulnerabilities have been identified as the most common and high-risk issues that developers should prioritize when assessing the security of their smart contracts.
Reentrancy attracts significant attention from researchers due to its difficulty in detection and mitigation. The complex and decentralized nature of smart contracts makes it challenging to ensure the atomic execution of functions and proper handling of reentrant calls. Arithmetic overflow/underflow is a common issue in software programs, particularly those written in low-level languages. Gas usage plays a crucial role in the EVM and the smart contract ecosystem, and accurately estimating the required gas for an operation can be difficult. Given that smart contract applications often handle sensitive data and substantial amounts of value, DoS with block gas limit can lead to denial-of-service attacks, data corruption, and financial losses for users. Unsafe suicidal, unsafe delegatecall, and authorization through tx.origin vulnerabilities occur when the access control of a smart contract is flawed. Access control determines which entities can interact with the contract and what actions they can perform. These vulnerabilities can result in security risks and potential financial losses when access control is compromised. Unchecked send occurs when a contract fails to handle exceptions from failed calls appropriately. This vulnerability causes the smart contract to behave unexpectedly and compromises its secure operation. TOD, time manipulation, and bad randomness vulnerabilities are related to consensus issues influenced by the blockchain network. Smart contracts are executed on the blockchain and must follow the same transaction order as the underlying blockchain. This means that even if a smart contract is designed to be resistant to TOD and time manipulation, it can still be vulnerable if the blockchain is not resistant to these issues.
In this section, we performed a comprehensive analysis of the root causes of vulnerabilities in the smart contract domain and introduced a novel classification system to effectively categorize them. Furthermore, we conducted a statistical ranking of the most frequently encountered vulnerabilities based on existing research. These findings offer conclusive and precise answers that effectively address our research question, RQ1, as outlined in Section 2.2.
By gaining a good understanding of these vulnerabilities in smart contracts and their ranking, developers can effectively allocate their time and resources, prioritizing the resolution of the most critical security concerns. Additionally, to fully comprehend the potential damage caused by these vulnerabilities, it is crucial to explore the common types of attacks that can exploit them. By carefully examining the relationship between vulnerabilities and attacks, developers can identify potential attack vectors and proactively implement robust measures to mitigate these risks.
This paper is available on arXiv under CC 4.0 license.
[1] https://dasp.co/
[2] https://swcregistry.io/