Why Code Reviews Still Miss Critical Software Security Weaknesses

2 Background and Related Work

In this section, we provide background on software security and the modern code review process. We then discuss the current practice and remaining challenges in code review for identifying security issues.

2.1 Software Security

Software security refers to the concepts that help software products survive and operate normally under harmful attacks (McGraw, 2004). Software security issues cover a variety of problems in this area. A software vulnerability, one type of security issue, is a "... security flaw, glitch, or weakness found in software code that could be exploited by an attacker ..." (Dempsey et al., 2020). In particular, Munaiah et al. (2017) explained that flaws or faults in software can result in "the violation of the system's security policies". The same study also suggested that these kinds of faults should be prioritized and eliminated as early as possible, as they can have considerable impact in later stages.

2.2 Coding Weaknesses

Software security issues can be caused by a chain of common software development faults (Hoole et al., 2016), so-called coding weaknesses, such as improper data validation, miscalculation, or incorrect memory allocation. Bojanova et al. (2016) introduced the Bugs Framework, which defines the relationship between the causes (coding weaknesses) and the consequences of various software security issues. Each security issue can originate from several causes, be influenced by different attributes, and result in various consequences that could be exploited by an attacker. Bojanova and Galhardo (2023) elaborated on the Bugs Framework by showing several real-world examples of how a sequence of coding weaknesses can cause a security issue. For example, CVE-2021-21834 describes a vulnerability in an MPEG-4 library. An attacker can exploit this vulnerability by crafting a special input that triggers a buffer overflow and causes a denial of service. Although the buffer overflow is part of the vulnerable consequence, the actual cause of the vulnerability is a coding weakness, i.e., improper input validation. Similarly, Tsipenyuk et al. (2005) identified seven groups of faults in source code that can compromise software security. For example, the Path Manipulation error and the Illegal Pointer Value error belong to the group Input Validation and Representation, which can lead to security issues such as cross-site scripting (XSS) or SQL injection.
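To make the distinction between cause and consequence concrete, the following is a minimal Python sketch. It is not the code of the affected MPEG-4 library; it assumes a hypothetical box format with a 4-byte big-endian payload length followed by the payload, and the function names are illustrative only. In a memory-unsafe language, trusting the declared length could lead to an out-of-bounds access. In Python the slice is silently truncated, yet the weakness a reviewer could flag is the same missing validation.

```python
import struct

HEADER_SIZE = 4  # hypothetical format: 4-byte big-endian payload length


def parse_box_unchecked(data: bytes) -> bytes:
    """Coding weakness: the declared length is trusted without validation."""
    (declared_len,) = struct.unpack(">I", data[:HEADER_SIZE])
    # Python slicing silently truncates here; a C-style copy of
    # declared_len bytes could run past the end of the buffer instead.
    return data[HEADER_SIZE:HEADER_SIZE + declared_len]


def parse_box_checked(data: bytes) -> bytes:
    """Validate the attacker-controlled length before using it."""
    if len(data) < HEADER_SIZE:
        raise ValueError("input shorter than header")
    (declared_len,) = struct.unpack(">I", data[:HEADER_SIZE])
    available = len(data) - HEADER_SIZE
    if declared_len > available:
        raise ValueError(
            f"declared length {declared_len} exceeds available {available} bytes"
        )
    return data[HEADER_SIZE:HEADER_SIZE + declared_len]
```

Given a crafted input such as `struct.pack(">I", 1000) + b"abc"`, the unchecked parser returns a payload that does not match the declared length, while the checked parser rejects the input with a `ValueError`.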

2.3 Security Shift-Left

The Shift-Left concept was introduced by Smith (2001) to motivate practitioners to test software products as early, and from as many aspects, as possible in order to reduce the potential defects that may cause unexpected consequences in later stages. Carter (2017) suggested security shift-left, in which security awareness is incorporated into the software development lifecycle at an early stage, allowing synergies to develop between security experts and software developers. Early detection of security issues also means lower cost and effort to fix them (McConnell, 2004), as well as reduced impact on end-users. In practice, practitioners can use various approaches to perform security shift-left. From the organizational perspective, Weir et al. (2022) provided initial guidelines for companies that want to shift security to the left. At the management level, companies can establish security governance, create a centralized software security group consisting of security specialists, and allocate a security satellite team to support security activities across the organization. At the project level, the project team should lead the technical aspects of security matters and request security support from the satellite team when the need arises.

2.4 Modern Code Review

Lightweight code review has been increasingly adopted in the software industry as a replacement for the traditional code inspection activity (Rigby et al., 2012). A developer who composes the code changes (the Author) submits the changes to the code review system. Other developers in the project (the Reviewers) can then freely review the changes and provide feedback. Although code review is used for multiple purposes, such as knowledge transfer or sharing of code ownership, the main objective of the activity remains to manage software defects (Bacchelli and Bird, 2013; Rigby and Bird, 2013; Thongtanunam et al., 2015). Balachandran (2013) presented various factors that affect code reviews, such as the experience and available effort of reviewers. Similarly, Bacchelli and Bird (2013) reported that different levels of source code understanding among developers affect the quality of code review feedback. The outcomes of code review have also been studied. Mäntylä and Lassenius (2009) identified two defect classes, (1) functionality and (2) evolvability, that can be discovered during code reviews of nine industrial software projects. The study of Beller et al. (2014) confirmed a similar notion based on the ConQAT and GROMACS open-source projects. In addition to these two main types of defects, Thongtanunam et al. (2015) found that developers may also raise concerns related to traceability (i.e., the bookkeeping of code changes). Nevertheless, few studies have delved into non-functional yet important defects such as security issues.

2.5 Code Review for Software Security

During the code review process, reviewers may follow the security code review practice (Howard, 2006) to detect various potential security issues in the code. Prior studies have investigated the well-known security issues such as overflow, cross-site scripting, SQL injection, and cross-site request forgery that can be identified during the code review process (Paul et al., 2021b; Bosu et al., 2014; Bosu and Carver, 2013; Edmundson et al., 2013; Alfadel et al., 2023; Di Biase et al., 2016). Edmundson et al. (2013) conducted a controlled study by asking developers to identify the injected vulnerable code in a web application called Anchor CMS. Paul et al. (2021b) and Di Biase et al. (2016) examined code review comments in Chromium projects.

A recent study by Alfadel et al. (2023) examined code review comments in NodeJS packages. In brief, these studies reported similar results: developers can identify certain security issues such as cross-site scripting (XSS), but the effectiveness of security issue identification in code reviews is relatively low. The types of security issues studied in prior works (Paul et al., 2021b; Bosu et al., 2014; Bosu and Carver, 2013; Edmundson et al., 2013; Alfadel et al., 2023; Di Biase et al., 2016) are still limited to the consequences of external threats, i.e., how a threat agent (or an attacker) can exploit the system (e.g., cross-site scripting and SQL injection). To identify such external threats during code reviews, reviewers are required to possess deep security knowledge (Braz and Bacchelli, 2022) and endure a high cognitive load (Gonçalves et al., 2022).

While prior work has focused on well-known security issues that are limited to the consequences of external threats or visible vulnerabilities, little work has investigated the coding weaknesses that potentially introduce security issues and vulnerabilities. Since code reviews focus on identifying coding issues in source code, identifying coding weaknesses that can lead to various exploitable vulnerabilities would be more aligned with code review practices. Therefore, investigating the types of coding weaknesses will help us better understand the benefits and capabilities of code reviews for software security, as well as highlight any deficiencies in current discussions on coding weaknesses within code reviews. Table 1 shows the key differences (i.e., studied taxonomy and methodology) between this work and the related secure code review studies.
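As an illustration of the difference between an externally visible consequence (SQL injection) and the coding weakness a reviewer can spot directly in a change, consider the following minimal Python sketch. It is not taken from any of the studied projects; the `users` table, the helper names, and the sample data are hypothetical, and only the standard `sqlite3` module is used.

```python
import sqlite3


def find_user_concatenated(conn: sqlite3.Connection, name: str) -> list:
    """Coding weakness: untrusted input is spliced into the query text."""
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_parameterized(conn: sqlite3.Connection, name: str) -> list:
    """The input is passed as a bound parameter, never as SQL text."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn
```

With the input `x' OR '1'='1`, the concatenated version returns every row in the table, whereas the parameterized version returns no rows. A reviewer does not need to construct the attack to raise the concern; the string-built query is the coding weakness visible in the change itself.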

2.6 Security Concern Handling Process in Code Review

To better understand the effectiveness of code reviews, prior studies have investigated how raised concerns were addressed during code reviews. Assal and Chiasson (2018) interviewed developers to understand their perception of security throughout the software development lifecycle. While the results show that security can be seen as a formal checkpoint in code reviews, the results did not elaborate on how the developers would respond to the security concerns raised by reviewers. A study by Kononenko et al. (2016) reported that security concerns are one of the aspects that developers consider before making changes. In particular, Han et al. (2021) investigated code-smell-related comments (e.g., violation of coding conventions) in code reviews and reported that 6% of comments were ignored by developers. Similarly, Beller et al. (2014) found that 7%-35% of the code review comments were neglected by the developers in general. Although several studies have investigated aspects of the code review process, little work has studied how developers respond to concerns about coding weaknesses, given that they are related to potential security issues, during code reviews. Referring to Section 2.2, it becomes apparent that coding weaknesses can overshadow both functionality and evolvability defects that reviewers can identify during code reviews (Mäntylä and Lassenius, 2009). An extended understanding of how developers respond to such coding weaknesses could shed light on current secure code review practices and their challenges. Such an understanding would help practitioners develop secure code review policies that address current challenges, allowing the team to execute better secure code reviews and prevent security issues in software products.

This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.