Gitleaks For Enterprises

This project aims to solve the problem of integrating Gitleaks at scale, with a new design & workflow that improves integration & usability across enterprises.

The default Gitleaks configuration isn't practical to use across many projects in a team or organization. In this article, we will look at why you need a secret scanning tool in your environment, the challenges of using the default Gitleaks configuration in enterprises/across projects & how we can fix them.

Github: gitleaks-for-enterprise

Introduction

Many recent security breaches trace back to a simple misconfiguration or to leaked secrets such as API keys. Detecting these issues early in the build process prevents them from reaching production. Identifying secrets & sensitive information plays a key role in the shift-left approach to DevOps security.

About Gitleaks

Gitleaks is a SAST tool for detecting and preventing hardcoded secrets like passwords, API keys, and tokens in git repos. Gitleaks is an easy-to-use, all-in-one solution for detecting secrets, past or present, in your code.

Usage

gitleaks detect -c ./gitleaks.toml --source /path/to/repo
$ cat gitleaks.toml
...
[[rules]]
    description = "Rule 1: AWS Access Key"
    regex = '''(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}'''
    tags = ["key", "AWS"]
...
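
To see what this rule actually matches, here is a minimal Python sketch that applies the same regex to a line of text. It only illustrates the pattern and is not how Gitleaks runs rules internally; the helper name find_aws_keys is hypothetical, and the sample value is the placeholder key used in AWS documentation.

```python
import re

# Same pattern as the "AWS Access Key" rule above.
AWS_ACCESS_KEY = re.compile(
    r"(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}"
)


def find_aws_keys(text: str) -> list[str]:
    """Return every substring of `text` that matches the rule's regex."""
    return [match.group(0) for match in AWS_ACCESS_KEY.finditer(text)]


# AKIAIOSFODNN7EXAMPLE is a documentation placeholder, not a real key.
sample = 'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"'
hits = find_aws_keys(sample)
```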

The best part of Gitleaks is that it lets us attach an allowlist to a rule, based on commit ID, file name, matched data, etc. This is helpful when dealing with false positives.

$ cat gitleaks-with-allowlist.toml
...
[[rules]]
    description = "Rule 1: AWS Access Key"
    regex = '''(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}'''
    tags = ["key", "AWS"]
    [rules.allowlist]
    description = "Ignore revoked AWS Key"
    commits = [ "commit-A" ]
    paths = [ '''config.env''' ]
...
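
As a mental model, an allowlist can be thought of as a filter applied to each finding. The Python sketch below is a simplified illustration under one assumption: a finding is suppressed if any one of the allowlist conditions matches. The Finding and Allowlist classes and the is_allowed function are hypothetical names, not part of Gitleaks.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Allowlist:
    commits: list[str] = field(default_factory=list)  # exact commit IDs
    paths: list[str] = field(default_factory=list)    # regexes tested against the file path
    regexes: list[str] = field(default_factory=list)  # regexes tested against the secret


@dataclass
class Finding:
    commit: str
    path: str
    secret: str


def is_allowed(finding: Finding, allowlist: Allowlist) -> bool:
    """Assumption: one matching condition is enough to suppress a finding."""
    if finding.commit in allowlist.commits:
        return True
    if any(re.search(pattern, finding.path) for pattern in allowlist.paths):
        return True
    return any(re.search(pattern, finding.secret) for pattern in allowlist.regexes)


# Mirrors the allowlist in the TOML above.
revoked_key = Allowlist(commits=["commit-A"], paths=[r"config.env"])
```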

There are multiple ways to integrate Gitleaks into your environment like pre-commit hooks, CI pipelines, etc. That's an amazing feature.

Everything looks great. Where is the problem?

Existing architecture & drawbacks

If you want to run Gitleaks on 2-3 projects, it's straightforward. But things get quite challenging & interesting when we want to integrate it into larger organizations with tens/hundreds of projects. Why?

  1. The default structure of gitleaks.toml makes it impossible to use it across multiple projects
  2. The whitelisting of false positives across multiple projects can be a real challenge
  3. Having a separate gitleaks.toml for each project isn't a feasible solution either. Why? Say you have 100 repositories, each with its own gitleaks.toml file. You can add project-specific whitelisting to each file, but when you want to add a new detection rule, it's gonna be a nightmare to add/delete the values across all those repositories.

Gitleaks-Default-Design.drawio.png

The existing structure of gitleaks.toml doesn't give us the flexibility to achieve it. What now?

Upgraded architectural design

We should build a design that's flexible & open to extension with ease. What do we need?

  1. Centralized repository for detection rules & exceptions
  2. All secret detection rules must be in a single file
  3. The exceptions are different for every project, and the conditions under which a rule match counts as an exception vary from project to project. Hence, the allowlist rules for each project should be stored in separate files.
  4. A connector that will combine the detection rules & exception list in a way that gitleaks can understand.
  5. Run gitleaks on the specific repository along with the data gathered above.
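
The connector in point 4 can be sketched in a few lines of Python. This is not the code of run.py in the repository, only a hypothetical illustration of the idea: rules are plain dictionaries (as a TOML parser would return them), and a project's allowlist entries are attached to the base rule with the same id. The function name merge_allowlists and the example rule contents are assumptions made for this sketch.

```python
import copy


def merge_allowlists(base_rules: list[dict], project_rules: list[dict]) -> list[dict]:
    """Return a copy of base_rules with per-project allowlists attached by rule id."""
    overrides = {rule["id"]: rule.get("allowlist", {}) for rule in project_rules}
    merged = []
    for rule in base_rules:
        rule = copy.deepcopy(rule)  # never mutate the shared base rules
        extra = overrides.get(rule.get("id"))
        if extra:
            allow = rule.setdefault("allowlist", {})
            for key, values in extra.items():
                if isinstance(values, list):
                    # Lists (commits, paths, regexes) are appended to.
                    allow.setdefault(key, []).extend(values)
                else:
                    # Scalars (e.g. description) are overwritten.
                    allow[key] = values
        merged.append(rule)
    return merged


# Illustrative rule contents; the real base.toml may differ.
base = [
    {"id": "1", "description": "AWS Access Key"},
    {"id": "8", "description": "Github key"},
]
project = [
    {"id": "8", "allowlist": {"regexes": ["ghp_WtfdNeDljtnHfLaVePtZll6NQBqU6c0jiuSX"]}},
]
merged = merge_allowlists(base, project)
```

Keeping the base rules untouched and returning a merged copy means the same base.toml content can be reused for every project in one run.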

The design looks something like this.

Gitleaks-For-Enterprises-Design.drawio.png

How to configure & use it

  1. Check the gitleaks-for-enterprise repository. The directory structure is as follows - allowlist/$USERNAME/$REPONAME/allowlist.toml Show-Allowlist-Directory-Structure.png
  2. Next step is to clone gitleaks-for-enterprise & generate gitleaks.toml. We have a base.toml file with all the detection rules. The allowlist folder contains exceptions for all projects.
  3. If this is your first time generating gitleaks.toml, this file would be equivalent to base.toml because there's no allowlist.toml for your target project yet.
     git clone https://github.com/rewanthtammana/gitleaks-for-enterprise
     cd gitleaks-for-enterprise
     python3 run.py -a allowlist/rewanthtammana/gitleaks-demo-repo/allowlist.toml > gitleaks.toml
    
    gitleaks-generation.png
  4. For this example, let's run it on a demo repository, gitleaks-demo-repo. Clone this repo locally & run gitleaks on it. There are 6 leaks identified.
     git clone https://github.com/rewanthtammana/gitleaks-demo-repo /tmp/gitleaks-demo-repo
     gitleaks detect -c ./gitleaks.toml --source /tmp/gitleaks-demo-repo
    
    Gitleaks-First-Run.png
  5. Append the -v option to the above gitleaks command to view the details of each finding. I have leaked dummy values for demo purposes.
     gitleaks detect -c ./gitleaks.toml --source /tmp/gitleaks-demo-repo -v
    
    Gitleaks-first-output-analysis.png
  6. Let's assume we have revoked the Github key identified above, ghp_WtfdNeDljtnHfLaVePtZll6NQBqU6c0jiuSX.
  7. After revoking it, you can go to your gitleaks-for-enterprise setup & add the revoked key as an exception.
    1. There are multiple ways to add exceptions based on commit id, value, file name, etc.
  8. In this case, let's take the data as an exception. Here it will be ghp_WtfdNeDljtnHfLaVePtZll6NQBqU6c0jiuSX
  9. The allowlists should be in the below format for ease of organizing & access control, allowlist/$USERNAME/$REPONAME/allowlist.toml
  10. In this case, we have to create a file allowlist/rewanthtammana/gitleaks-demo-repo/allowlist.toml with the following data as an exception.

    # Rule specific white listing
    [[rules]]
        id = "8"
        [rules.allowlist]
            regexes = ['''ghp_WtfdNeDljtnHfLaVePtZll6NQBqU6c0jiuSX''']
    
  11. Now generate a new gitleaks.toml file. This will be different from the base file because we now have an allowlist.toml file whose exceptions get merged in.
    python3 run.py -a allowlist/rewanthtammana/gitleaks-demo-repo/allowlist.toml > gitleaks.toml
    
  12. As we can see, the number of leaks drops from 6 to 4. Also, the Github key we revoked, ghp_WtfdNeDljtnHfLaVePtZll6NQBqU6c0jiuSX, is no longer returned as a finding. Github-key-in-allowlist.png
  13. Similarly, you can add more exceptions specific to your repo. In this case, the repo is rewanthtammana/gitleaks-demo-repo, so we created allowlist/rewanthtammana/gitleaks-demo-repo/allowlist.toml.
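
The allowlist/$USERNAME/$REPONAME/allowlist.toml convention used in the steps above is easy to derive from a repository URL. The helper below is a hypothetical sketch (allowlist_path is not a function from the repository) and assumes URLs of the form https://host/username/reponame.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def allowlist_path(repo_url: str) -> PurePosixPath:
    """Map a repository URL to allowlist/$USERNAME/$REPONAME/allowlist.toml."""
    parts = [part for part in urlparse(repo_url).path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"expected a URL ending in /username/reponame: {repo_url}")
    username, reponame = parts[0], parts[1].removesuffix(".git")
    return PurePosixPath("allowlist") / username / reponame / "allowlist.toml"


demo = allowlist_path("https://github.com/rewanthtammana/gitleaks-demo-repo")
```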

Further Scope

This setup can be easily integrated with CI pipelines to identify sensitive information early, which is a step towards shift-left security.
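
As a rough sketch of what a CI step could look like, the Python wrapper below chains the two commands used earlier in this article: generate gitleaks.toml with run.py, then scan the target repository. It assumes the script runs from a gitleaks-for-enterprise checkout with gitleaks on the PATH; build_commands and run_scan are hypothetical names. It also assumes that gitleaks returns a non-zero exit code when leaks are found, which you should verify against your gitleaks version.

```python
import subprocess


def build_commands(allowlist_file: str, repo_dir: str, config_out: str = "gitleaks.toml"):
    """Return the (generate, scan) argument lists used earlier in this article."""
    generate = ["python3", "run.py", "-a", allowlist_file]
    scan = ["gitleaks", "detect", "-c", config_out, "--source", repo_dir]
    return generate, scan


def run_scan(allowlist_file: str, repo_dir: str, config_out: str = "gitleaks.toml") -> int:
    """Generate the merged config, run the scan, and return gitleaks' exit code."""
    generate, scan = build_commands(allowlist_file, repo_dir, config_out)
    # run.py prints the merged config to stdout, so redirect it into the file.
    with open(config_out, "w") as out:
        subprocess.run(generate, stdout=out, check=True)
    # Assumption: a non-zero code means leaks were found (or the scan failed).
    return subprocess.run(scan).returncode
```

A pipeline step could then call run_scan("allowlist/rewanthtammana/gitleaks-demo-repo/allowlist.toml", "/tmp/gitleaks-demo-repo") and fail the build when the returned code is non-zero.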

The model is designed to be extensible & efficient. This layout can be easily expanded to hundreds/thousands of projects & still provide you the flexibility to have a centralized repository to maintain all the rules & exceptions.

As all the detection rules live in a single base.toml, it's easy for teams to update the rules frequently & use them across multiple projects.

Conclusion

By using this kind of directory structure & framework, any team can have a centralized repository for all their detection rules & exception lists. This makes it easy for developers, DevOps, security & even business teams to use, and it integrates smoothly with CI. Hope this restructuring helps you with gitleaks integration in your enterprise or across multiple projects.