Effectiveness in Generating Specific Vulnerabilities for C Codes


Table of Links

Abstract and I. Introduction

II. Related Work

III. Technical Background

IV. Systematic Security Vulnerability Discovery of Code Generation Models

V. Experiments

VI. Discussion

VII. Conclusion, Acknowledgments, and References


Appendix

A. Details of Code Language Models

B. Finding Security Vulnerabilities in GitHub Copilot

C. Other Baselines Using ChatGPT

D. Effect of Different Number of Few-shot Examples

E. Effectiveness in Generating Specific Vulnerabilities for C Codes

F. Security Vulnerability Results after Fuzzy Code Deduplication

G. Detailed Results of Transferability of the Generated Nonsecure Prompts

H. Details of Generating non-secure prompts Dataset

I. Detailed Results of Evaluating CodeLMs using Non-secure Dataset

J. Effect of Sampling Temperature

K. Effectiveness of the Model Inversion Scheme in Reconstructing the Vulnerable Codes

L. Qualitative Examples Generated by CodeGen and ChatGPT

M. Qualitative Examples Generated by GitHub Copilot

E. Effectiveness in Generating Specific Vulnerabilities for C Codes

Figure 6 shows the percentage of vulnerable C codes generated by CodeGen (Figures 6a, 6b, and 6c) and ChatGPT (Figures 6d, 6e, and 6f) using our three few-shot prompting approaches. We removed duplicates and codes with syntax errors. The x-axis lists the CWEs detected in the sampled codes, and the y-axis lists the CWEs used to generate the non-secure prompts from which the codes were sampled. "Other" refers to detected CWEs that are not listed in Table I and are not considered in our evaluation. Overall, we observe high percentages on the diagonals, which shows the effectiveness of the proposed approaches in finding C codes with the targeted vulnerability. The results also show that CWE-787 (out-of-bounds write), the most dangerous CWE in MITRE's 2022 top-25 list [29], occurs in many scenarios. Furthermore, the results in Figure 6 indicate the effectiveness of our approximation of the inverse of the model in finding the targeted type of security vulnerabilities in C codes.
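To make the layout of such a figure concrete, here is a minimal sketch of how a targeted-versus-detected percentage matrix could be tabulated. It is not the authors' evaluation code. The function name `cwe_percentage_matrix`, the toy samples, and the choice to normalize each row to 100% are assumptions made for illustration only; the source does not state how the percentages in Figure 6 are normalized.

```python
from collections import Counter, defaultdict


def cwe_percentage_matrix(samples, tracked_cwes):
    """Tabulate detected CWEs per targeted CWE as row percentages.

    samples: iterable of (target_cwe, detected_cwe) pairs, one pair per
             vulnerable code sample (duplicates assumed already removed).
    tracked_cwes: CWEs considered in the evaluation; any other detected
                  CWE is counted in an "Other" column.
    Assumption: each row (targeted CWE) is normalized to sum to 100%.
    """
    tracked = set(tracked_cwes)
    counts = defaultdict(Counter)
    for target, detected in samples:
        column = detected if detected in tracked else "Other"
        counts[target][column] += 1

    matrix = {}
    for target, row in counts.items():
        total = sum(row.values())
        matrix[target] = {col: 100.0 * n / total for col, n in row.items()}
    return matrix


# Hypothetical toy data, not results from the paper.
samples = [
    ("CWE-787", "CWE-787"),
    ("CWE-787", "CWE-787"),
    ("CWE-787", "CWE-476"),
    ("CWE-787", "CWE-401"),  # not tracked here, so counted as "Other"
    ("CWE-476", "CWE-476"),
    ("CWE-476", "CWE-787"),
]
tracked = ["CWE-787", "CWE-476"]
matrix = cwe_percentage_matrix(samples, tracked)

for target, row in matrix.items():
    print(target, row)
```

In this layout, a high value in the cell where the targeted and detected CWE coincide corresponds to a high value on the diagonal of the figure.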


Authors:

(1) Hossein Hajipour, CISPA Helmholtz Center for Information Security;

(2) Keno Hassler, CISPA Helmholtz Center for Information Security;

(3) Thorsten Holz, CISPA Helmholtz Center for Information Security;

(4) Lea Schönherr, CISPA Helmholtz Center for Information Security;

(5) Mario Fritz, CISPA Helmholtz Center for Information Security.


This paper is available on arXiv under the CC BY-NC-SA 4.0 DEED license.

