
Using Git Hooks for Automated Secrets Detection

by Jean Dubrulle | GitGuardian, April 22nd, 2020

Too Long; Didn't Read

Git hooks are scripts that are triggered by certain actions in the software development process, like committing or pushing. By automatically pointing out issues in code, they spare reviewers from wasting time on mistakes that a machine can easily diagnose. Client-side hooks execute locally on the developers’ workstation, while server-side hooks execute on the centralized version control system. The hooks discussed here for secrets detection are pre-commit, pre-push and pre-receive.



Git hooks are extremely useful for taking as much of the human factor as possible out of the secure development process.

What are git hooks?

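Git hooks are scripts that are triggered by certain actions in the software development process, like committing or pushing. By automatically pointing out issues in code, they allow reviewers not to waste time on mistakes that can be easily diagnosed by a machine.

There are client-side hooks, which execute locally on the developers’ workstation, and server-side hooks, which execute on the centralized version control system.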

If you want to explore git hooks further, here is a curated list of the most useful git hooks on GitHub.

Why implement secret detection in your SDLC?

As a general security principle, where feasible, data should remain safe even if it leaves the devices, systems, infrastructure or networks that are under organization control, or if these are compromised. Assuming a breach helps prevent lateral movement after a hacker gains initial access.

In their everyday work, developers handle a trove of sensitive information that hackers could leverage. They rely on hundreds of secrets like API keys, database connection strings, private keys, or certificates to interconnect payment systems, databases, CRMs, messaging and notification systems, internal services… Too often, these secrets are hardcoded in source code or shared over Slack or email. None of these systems is designed to store or share secrets, nor are internal wikis a good place to expose usernames and passwords.

Indeed, because of the very nature of software development, source code is made to be cloned on different workstations, deployed on multiple servers, distributed to customers, etc. In practice, you never know where your code is going to end up. If it contains secrets, it takes just one of these places to be compromised for all the secrets to be compromised as well. The same reasoning holds for every developer who has access to the source code: it takes one compromised developer account to compromise all the secrets that account can reach.

On top of that, API keys and other secrets that are used to programmatically authenticate or authorize ourselves are unlike traditional usernames and passwords: because they are made to be programmatically used, they aren’t further secured by MFA (most of the time).

Pre-commit, pre-push, pre-receive, post-receive: where to implement secret detection?

Here are some general principles about fitting security in the software development process:

The earlier a security vulnerability is uncovered, the less costly it is to correct. Hardcoded secrets are no exception. If a secret is uncovered only after it has reached the centralized version control server, it must be considered compromised, which requires rotating (revoking and redistributing) the exposed credential. This operation can be complex and can involve multiple stakeholders.

People bend the rules, often in an effort to collaborate better and do their job. Security must not be a blocker. It should allow flexibility and foster information flows, yet enable visibility and control. Security measures will be bypassed, sometimes for the worse. But it is also sometimes good that developers can take responsibility for bypassing them. Take secrets detection: algorithms strike a tradeoff between not raising false alerts (high precision) and not missing keys (high recall). Because secrets detection is probabilistic, even the best algorithms can fail and need human judgement.
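To make the precision and recall tradeoff concrete, here is a small worked example. The counts are hypothetical and chosen only for illustration: they describe two imaginary detector settings run against a codebase that contains 100 real secrets.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for one detector run."""
    flagged = true_positives + false_positives   # everything the detector reported
    real = true_positives + false_negatives      # every secret actually present
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / real if real else 0.0
    return precision, recall


# Hypothetical strict setting: few false alerts, but 40 real secrets are missed.
strict = precision_recall(true_positives=60, false_positives=5, false_negatives=40)

# Hypothetical loose setting: only 5 secrets are missed, but half the alerts are noise.
loose = precision_recall(true_positives=95, false_positives=95, false_negatives=5)

print(f"strict: precision={strict[0]:.2f} recall={strict[1]:.2f}")
print(f"loose:  precision={loose[0]:.2f} recall={loose[1]:.2f}")
```

Neither setting is right in every case, which is why a developer needs a way to overrule the tool when it raises a false alert.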

These principles advocate for the following:

Client-side secrets detection early in the software development process is a nice-to-have: implement pre-commit or pre-push hooks when possible. The advantage of a pre-commit hook is that the secret is never added to the local repository, which comes in handy since removing a secret from your git history can be very tricky. The advantage of a pre-push hook is that you have an Internet connection at that point, allowing you to make API calls for example, which is not necessarily the case when committing.

Server-side secrets detection is a must-have: depending on the size of your organization, enforcing client-side secrets detection might not be an easy task, as this requires access to your developers’ workstations. We’ve heard many times from Application Security professionals that this is not something they felt confident doing. In any case, keep in mind that client-side hooks can (and must, secret detection being probabilistic) be easy to bypass, hence the absolute necessity for server-side checks where the ultimate threat lies.
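As an illustration of the client-side option, the following is a minimal sketch of a pre-commit check written in Python. It is not GitGuardian's detector: the two patterns and the function names are assumptions made for this example, and a real detector would use far more rules along with entropy and context checks.

```python
import re
import subprocess

# Hypothetical rule set, for illustration only.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def find_secrets(diff_text):
    """Return (diff_line_number, rule_name) for each added line that matches a rule."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only look at lines the commit adds; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


def staged_diff():
    """Return the diff of the changes staged for the next commit."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return result.stdout


def main():
    findings = find_secrets(staged_diff())
    for number, name in findings:
        print(f"possible {name} on diff line {number}")
    # A non-zero exit status makes git abort the commit.
    return 1 if findings else 0
```

To try it, the script would be saved as an executable file at `.git/hooks/pre-commit` and would end with `sys.exit(main())` (plus `import sys`). A developer who judges an alert to be a false positive can skip the hook with `git commit --no-verify`, which is exactly why a server-side check is still needed.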

Secret detection has one extremely important peculiarity though: unlike cryptographic weaknesses or SQL injection vulnerabilities, which only express themselves once the code is deployed, any secret that reaches the version control system must be considered compromised and therefore requires immediate attention, even if the code is not ready to be deployed yet.

This implies that implementing secrets detection is not only about scanning the latest version of your master branch before deployment. It is also about scanning through every single commit of your git history, covering every branch, even development ones.
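A rough sketch of what "every commit, every branch" means in practice: `git log --all -p` walks the commits reachable from every ref and prints each commit's patch. The helper names below are made up for this example, and the parsing assumes the `commit <hash>` line format requested in the command.

```python
import re
import subprocess

COMMIT_LINE = re.compile(r"^commit ([0-9a-f]{40})$")


def added_lines_by_commit(log_text):
    """Map each commit hash to the list of lines that commit added."""
    added = {}
    current = None
    for line in log_text.splitlines():
        match = COMMIT_LINE.match(line)
        if match:
            current = match.group(1)
            added[current] = []
        elif current and line.startswith("+") and not line.startswith("+++"):
            # Drop the leading "+" that marks an added line in a patch.
            added[current].append(line[1:])
    return added


def full_history_log():
    """Return the patches of all commits reachable from any ref."""
    result = subprocess.run(
        ["git", "log", "--all", "-p", "--unified=0", "--format=commit %H"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return result.stdout
```

Calling `added_lines_by_commit(full_history_log())` inside a repository gives one entry per commit, so a secret that was added in one commit and deleted in the next still shows up, even though it is absent from the latest version of the files.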

Implementing GitGuardian

GitGuardian comes in the form of a dashboard centralizing policy breaks across your organization’s repositories.

It is natively integrated with your Version Control System in a post-receive fashion. When integrating a repository into your monitored perimeter, secret detection is enforced on every branch, without making any distinction between development and master branches.

At every push, GitGuardian not only scans the current source code, as would be the case if we were looking at other security vulnerabilities such as vulnerable dependencies, but also goes through every incremental change that was made since the last push.
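For readers who run their own server-side hooks, here is a sketch of how the commits contained in a push can be enumerated. This is not a description of GitGuardian's integration. It relies on the fact that git feeds pre-receive and post-receive hooks one line per updated ref on standard input, in the form `<old-value> <new-value> <ref-name>`, with an all-zero value standing for a ref that is being created or deleted. The function names are made up for this example, and the new-branch case assumes a pre-receive hook, where the refs have not been updated yet.

```python
def is_null(object_id):
    """True for the all-zero object ID git uses for 'no commit'."""
    return set(object_id) == {"0"}


def parse_push(stdin_text):
    """Parse hook input into (old, new, ref) tuples, one per updated ref."""
    updates = []
    for line in stdin_text.splitlines():
        if not line.strip():
            continue
        old, new, ref = line.split(" ", 2)
        updates.append((old, new, ref))
    return updates


def rev_list_command(old, new):
    """Build the git command listing the commits a ref update introduces."""
    if is_null(new):
        return None  # the ref is being deleted, nothing new to scan
    if is_null(old):
        # New ref: commits reachable from it but from no existing ref.
        return ["git", "rev-list", new, "--not", "--all"]
    return ["git", "rev-list", f"{old}..{new}"]
```

A hook would call `parse_push(sys.stdin.read())`, run each resulting command, and scan the patch of every commit it returns, so that a secret added and then removed within the same push is still caught.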

We also encourage you to run security checks early and often by using our API. Our API allows you to use our secrets detection as a service in pre-commit or pre-push hooks, or in your CI (although CI builds typically aren’t enforced on all branches and don’t scan every incremental change that was made). This would complement the native integration with your Version Control System.

At GitGuardian, we are always keen to share the technical details of what we do, and all the subtleties we found in our journey to automate secrets detection. We are committed to doing so, even if it is not directly related to the objectives that we are trying to reach as a company.

We do this in the spirit of Open Source, knowing that sharing technical details allows us to get more feedback from developers and Application Security professionals around the world, and ultimately create more value.