Reading code manually is a time-consuming process. It is also error-prone, as it is easy to miss important details. As developers and penetration testers, we need to find a way to automate this process. SAST is a technique that can help us with this task.
SAST is not a silver bullet. It is usable only when you have access to the source code, as in open-source projects or white-box penetration tests. However, it can help you find some low-hanging fruit and save time.
Static application security testing (SAST) is a subset of static code analysis used to increase the security and reliability of the code. SAST detects outdated dependencies, hardcoded secrets, logical errors that lead to vulnerabilities, and more. SAST also includes checks that affect security only indirectly, such as visual code complexity, code ambiguity, and non-intuitive practices that can lead to vulnerabilities.
SAST tools are usually regex pattern matchers on steroids that look for known vulnerabilities in the code.
For example, a SAST tool might look for the use of eval, exec, or pickle in Python code, because they can be used to execute arbitrary code.
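To make the idea concrete, here is a minimal sketch of such a check, written with Python's standard ast module instead of regexes. It is not how Semgrep or Bandit work internally; the list of flagged calls and the names DANGEROUS_CALLS, call_name, find_dangerous_calls, and SAMPLE are my own assumptions for illustration.

```python
import ast

# Call names treated as dangerous in this sketch (an assumed, incomplete list).
DANGEROUS_CALLS = {"eval", "exec", "pickle.load", "pickle.loads"}


def call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'pickle.loads' for a call node, or ''."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, call name) pairs for flagged calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = call_name(node)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical input: the code is only parsed, never executed.
SAMPLE = '''import pickle
data = pickle.loads(blob)
result = eval(user_input)
note = "eval(x) inside a string is not a call"
'''

if __name__ == "__main__":
    for lineno, name in find_dangerous_calls(SAMPLE):
        print(f"line {lineno}: {name}")
```

Because the check works on the syntax tree, the eval inside the string on the last line is not reported, which a plain text search would flag. On the other hand, aliased imports such as "from pickle import loads" slip through this sketch, so it still has false negatives.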
I would divide my approach to SAST into three categories:
Vulnerability detection: I use tools like Semgrep, Bandit, and Nodejsscan to find attack vectors in the code. You can usually find low-hanging fruit such as unsanitized inputs, bad cryptography, or vulnerable libraries. By the way, the Semgrep PRO version has more rules; it is free for projects with up to 10 developers.
Secret detection: Gitleaks, Trufflehog, or grep can help you find secrets in the code. This is important because secrets can be used to escalate privileges or access sensitive data. Typical examples are database connection strings, API keys, and passwords stored in the code.
Misconfigurations: Tools like Checkov or Trivy can help you find misconfigurations. Misconfigurations are usually in "infrastructure as code" (IaC) files but can also be in the code itself. An example would be a misconfigured docker-compose file that exposes a database to the internet. For scanning Dockerfiles, I recommend a combination of hadolint and grype.
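The docker-compose example from the last category can be sketched as a small check. This is only an illustration of the idea, not how Checkov or Trivy implement their rules. It assumes the compose file has already been parsed into a dictionary (for example with a YAML library), and the port list and the names DATABASE_PORTS, exposed_database_ports, and SAMPLE_COMPOSE are hypothetical.

```python
# Ports of common databases (an assumed, incomplete list for illustration).
DATABASE_PORTS = {"3306", "5432", "6379", "27017"}


def exposed_database_ports(compose: dict) -> list[tuple[str, str]]:
    """Return (service, mapping) pairs that publish a database port on all interfaces."""
    findings = []
    for service, config in compose.get("services", {}).items():
        for mapping in config.get("ports", []):
            if isinstance(mapping, dict):
                continue  # the long port syntax is not handled in this sketch
            parts = str(mapping).split(":")
            # "IP:HOST:CONTAINER" binds to one address; the shorter forms
            # publish the port on every interface (0.0.0.0).
            host_ip = parts[0] if len(parts) == 3 else "0.0.0.0"
            container_port = parts[-1].split("/")[0]
            if host_ip == "0.0.0.0" and container_port in DATABASE_PORTS:
                findings.append((service, str(mapping)))
    return findings


# Hypothetical compose file, already parsed into a dictionary.
SAMPLE_COMPOSE = {
    "services": {
        "db": {"image": "postgres:16", "ports": ["5432:5432"]},
        "cache": {"image": "redis:7", "ports": ["127.0.0.1:6379:6379"]},
        "web": {"image": "nginx", "ports": ["8080:80"]},
    }
}

if __name__ == "__main__":
    for service, mapping in exposed_database_ports(SAMPLE_COMPOSE):
        print(f"{service}: database port published on all interfaces ({mapping})")
```

Only the db service is reported here, because the cache service binds Redis to 127.0.0.1. Whether a published port is really reachable from the internet also depends on the host firewall and network, so treat a finding as a hint to review, not as proof. IPv6 host addresses are not handled.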
SAST tools are not perfect. They will give you false positives and false negatives. However, they save a lot of time compared to randomly probing an application. I have used SAST tools in my penetration testing engagements and competitions. They helped me get started with the codebase and gave me some hints about where to look for vulnerabilities.
I recommend that you start with Hack The Box. This platform offers challenges where you get the source code, find vulnerabilities in it, and then exploit them on a real machine. This is a great way to learn how to use SAST tools in practice.
Or, if you are a developer, you can use SAST tools in your CI/CD pipeline. This way, you can get used to the tools and their output. Simultaneously, you will improve the security of your application.
I have prepared a list of guides to help you start with SAST and secret detection. These guides were written as part of my thesis and are a great starting point for anyone interested in SAST tools.
When detecting secrets, remember: it's not a secret if hackers know it. Even seasoned developers can accidentally push passwords or connection strings into remote source control. Using various tools, this guide offers quick and easy methods to mitigate this risk.
This guide focuses on understanding secret detection using pre-commit hooks and CI/CD pipelines. While the code primarily focuses on GitLab, the last section will also cover GitHub.
This guide efficiently outlines the steps for initiating Static Application Security Testing (SAST) in GitLab. SAST is a process that uses static code analysis to identify potential vulnerabilities.
This cheatsheet introduces helpful tools for scanning Infrastructure as Code (IaC) artifacts and provides examples on how to integrate them into your CI/CD pipeline.
I have created a custom tool in Golang that helps me quickly match regexes against huge codebases.
There is always a tradeoff between precision and variance: the more broadly your patterns match, the more findings you have to review by hand.
If you need more variance and don't mind more manual reviewing, you can try RegFinder, which is like grep but more suited for secret detection (faster in bigger repos, clear output, ignoring some file extensions). Or you can use grep directly. The most valuable part is the regexes in the repo, not the tool you use.
grep -rnE -f regex_dir/general.txt your_app/
Or
./regfinder.elf -d your_app/ -f regex_dir/general.txt
It is straightforward to extend the existing regex patterns. This tool is not well suited for automated pipelines. However, it comes in handy if you need to find a non-standard secret, or in other assessments, such as security reviews, where more manual work is expected.
Leave a comment if you have had good experiences with other tools. Thank you for reading.
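To show what matching a file of regexes against a codebase boils down to, here is a rough Python sketch. It is not RegFinder's implementation, and Python's re syntax differs in places from the extended regexes grep -E uses. The two patterns, the skipped extensions, and the names load_patterns, scan_text, and scan_tree are hypothetical examples, not the contents of general.txt.

```python
import re
from pathlib import Path

# File extensions skipped in this sketch (an assumed list).
SKIPPED_SUFFIXES = {".png", ".jpg", ".zip", ".pdf"}

# Hypothetical pattern file content, one regex per line.
SAMPLE_PATTERNS = r"""AKIA[0-9A-Z]{16}
(?i)(password|passwd|secret)\s*[:=]\s*\S+
"""

# Hypothetical file content to scan.
SAMPLE_FILE = """db_host = "localhost"
password = "hunter2"
aws_key = "AKIAIOSFODNN7EXAMPLE"
"""


def load_patterns(text: str) -> list[re.Pattern]:
    """Compile one regex per non-empty line, like a grep -f pattern file."""
    return [re.compile(line) for line in text.splitlines() if line.strip()]


def scan_text(text: str, patterns: list[re.Pattern]) -> list[tuple[int, str]]:
    """Return (line number, line) pairs where at least one pattern matches."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in patterns):
            findings.append((lineno, line.strip()))
    return findings


def scan_tree(root: str, patterns: list[re.Pattern]) -> list[tuple[str, int, str]]:
    """Scan every file under root, skipping the extensions listed above."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() in SKIPPED_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        findings += [(str(path), n, line) for n, line in scan_text(text, patterns)]
    return findings


if __name__ == "__main__":
    for lineno, line in scan_text(SAMPLE_FILE, load_patterns(SAMPLE_PATTERNS)):
        print(f"{lineno}: {line}")
```

The first pattern is narrow and rarely misfires, while the second matches almost any assignment to something named password or secret. Adding a line to the pattern file is all it takes to widen the search, at the cost of more findings to review by hand.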