Gitrob: Putting the Open Source in OSINT

Posted January 12, 2015

Developers generally like to share their code, and many of them do so by open sourcing it on GitHub, a social code hosting and collaboration service. Many companies also use GitHub as a convenient place to host both private and public code repositories by creating GitHub organizations that employees can join.

Sometimes employees might publish things that should not be publicly available: things that contain sensitive information, or that could even lead to direct compromise of a system. This can happen by accident or because the employee does not realize how sensitive the information is.

Gitrob is a command line tool that can help organizations and security professionals find such sensitive information. The tool will iterate over all public organization and member repositories and match filenames against a range of patterns for files that typically contain sensitive or dangerous information.

How it works

Looking for sensitive information in GitHub repositories is not new. It has been known for a while that things such as private keys and credentials can be found with GitHub's search functionality. However, Gitrob makes it easier to focus the effort on a specific organization.

The first thing the tool does is to collect all public repositories of the organization itself. It then goes on to collect all the organization members and their public repositories, in order to compile a list of repositories that might be related or have relevance to the organization.

Gitrob collecting repositories from organization members.
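The collection step can be sketched roughly as follows. This is a minimal illustration, not Gitrob's actual code: the `collect_repositories` function and the `fetch` callable are hypothetical, pagination is ignored, and the canned responses stand in for the GitHub REST API so the example runs without network access or a token. The three API paths and the `full_name` and `login` fields are the ones GitHub's REST API uses for organization repositories, organization members and user repositories.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical type: takes an API path and returns the parsed JSON items.
Fetch = Callable[[str], Iterable[Dict]]


def collect_repositories(org: str, fetch: Fetch) -> List[str]:
    """Return full names of the org's public repos plus its members' public repos."""
    # Step 1: the organization's own public repositories.
    names = [repo["full_name"] for repo in fetch(f"/orgs/{org}/repos")]
    # Step 2: every public member, then each member's public repositories.
    for member in fetch(f"/orgs/{org}/members"):
        for repo in fetch(f"/users/{member['login']}/repos"):
            names.append(repo["full_name"])
    return names


# Canned responses standing in for the GitHub API (all data is made up).
FAKE_API = {
    "/orgs/acme/repos": [{"full_name": "acme/website"}],
    "/orgs/acme/members": [{"login": "alice"}, {"login": "bob"}],
    "/users/alice/repos": [{"full_name": "alice/dotfiles"}],
    "/users/bob/repos": [{"full_name": "bob/chatbot"}],
}

repos = collect_repositories("acme", lambda path: FAKE_API.get(path, []))
print(repos)
```

In a real run, `fetch` would be replaced by an authenticated HTTP client that follows pagination, which is where the rate limiting mentioned later comes into play.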

When the list of repositories has been compiled, Gitrob gathers all the filenames in each repository and runs them through a series of observers that flag any file matching a pattern of known sensitive files. This step might take a while if the organization is big or if the members have a lot of public repositories.

Gitrob sifting through collected repositories and flagging interesting files.
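To make the flagging idea concrete, here is a small sketch of matching file paths against a list of patterns. It is an assumption-laden illustration rather than Gitrob's implementation: the signature fields (`part`, `kind`, `pattern`, `caption`), the patterns themselves and the `flag` function are all made up for this example. Note that only the path is inspected, never the file contents.

```python
import posixpath
import re

# Illustrative signatures only; field names and patterns are assumptions,
# not Gitrob's real signature format.
SIGNATURES = [
    {"part": "extension", "kind": "match", "pattern": "pem",
     "caption": "Potential private key"},
    {"part": "extension", "kind": "regex", "pattern": r"kdbx?",
     "caption": "KeePass password database"},
    {"part": "filename", "kind": "regex", "pattern": r"\.?(bash|zsh)_history",
     "caption": "Shell command history file"},
    {"part": "filename", "kind": "regex", "pattern": r"\.(bash_profile|bashrc|zshrc)",
     "caption": "Shell configuration file"},
    {"part": "filename", "kind": "match", "pattern": ".env",
     "caption": "Environment configuration file"},
]


def split_path(path):
    """Break a repository path into the parts a signature can target."""
    filename = posixpath.basename(path)
    extension = posixpath.splitext(filename)[1].lstrip(".")
    return {"path": path, "filename": filename, "extension": extension}


def flag(path):
    """Return the captions of every signature that the path matches."""
    parts = split_path(path)
    captions = []
    for sig in SIGNATURES:
        value = parts[sig["part"]]
        if sig["kind"] == "match":
            hit = value == sig["pattern"]  # exact string comparison
        else:
            hit = re.fullmatch(sig["pattern"], value) is not None
        if hit:
            captions.append(sig["caption"])
    return captions


for path in ["config/deploy/server.pem", "home/.bash_profile", ".env", "README.md"]:
    print(path, "->", flag(path))
```

Keeping the caption next to the pattern is what lets a tool explain why a file was flagged, not just that it was.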

All of the members, repositories and files will be saved to a PostgreSQL database. When everything has been sifted through, it will start a Sinatra web server locally on the machine, which will serve a simple web application to present the collected data for analysis.

Interesting files across all repositories are shown in one list for easy analysis. The quick filter in the top right corner can be used to look for specific files.

Clicking on a file will show its contents with syntax highlighting. It will also show why the file was flagged.

Members of the organization can be viewed in a grid layout. Members with interesting files are easy to spot.

Clicking on a member will show their basic information and public repositories. Repositories with findings are highlighted with an orange background.

All collected repositories can be viewed in a table with their descriptions and website URLs. Repositories with findings are highlighted with an orange background.

All files in a specific repository can be viewed. The quick filter in the top right corner can be used to look for specific files.

Some findings

While developing Gitrob, I tested it against many different organizations belonging to various companies, big and small, both to expose the tool to a lot of real-life data and to notify the companies of any findings before release.

The tool found several interesting things, ranging from minor to bad, all the way to the company-destroying kind of information disclosure. Here are some examples:

Note: I have redacted sensitive and identifying information in the screenshots; I am not interested in embarrassing or exposing anyone. And again, all these findings have been reported.

Found in a .bash_profile file. The employee was thoughtful enough to mask the passwords, but still mapped out a big chunk of infrastructure with his command aliases. It also tells attackers that spear-phishing this employee will likely give them root access to a lot of databases.

Found in a .bash_profile file, the command aliases revealed the existence of a secret black site domain used for the company's tools for everyday operations such as analytics, metrics and continuous integration. A big increase in attack surface.

Command history files can contain a lot of sensitive information, such as passwords, API keys and hostnames.

A developer had open sourced a WordPress website, including a complete database dump with the password hash for his user account. Maybe the same password is used somewhere else?

An .env file for a chat bot contained several credentials. Apart from an attacker being able to spy on their Campfire chat and steal stuff from the data stores, they would also be able to control the temperature somewhere with the Nest credentials.

A company had open sourced their documentation website, a simple Ruby on Rails application. They forgot to remove the application secret token, which can be exploited to achieve remote code execution.

A developer had checked in his KeePass password database containing 174 entries. The data is heavily encrypted, but the master password can be brute-forced. In this case the company was certainly interesting enough for someone to throw a lot of computing power at that task.

Amazon EC2 credentials found in a .zshrc file. Depending on the level of privilege, they can potentially give complete control of the company's infrastructure.

An employee had checked in an Amazon EC2 private key which can potentially give complete control of the company's infrastructure.

The same employee from the last screenshot also checked in his private SSH key, which could potentially grant access to the company's SSH servers. It could potentially also be used to clone private organization repositories.

Installing and setting up Gitrob

Gitrob is written in Ruby and requires version 1.9.3 or above. If you are on an older version, it is very easy to install a newer one with RVM. If you are installing Gitrob on Kali, you are almost good to go; you just need to update Bundler with gem install bundler. It might also be necessary to install a PostgreSQL dependency with apt-get install postgresql-server-dev-9.1 in a terminal.

Gitrob is a Ruby gem, so installation is a simple gem install gitrob in a terminal. This will automatically install all the code dependencies as well.

A PostgreSQL database is also needed for Gitrob to store its data. Installing PostgreSQL is pretty straightforward; here is an installation guide for Mac OS X and one for Ubuntu/Debian based Linux. If you're installing Gitrob on Kali, you already have PostgreSQL installed, but you need to start the server with service postgresql start in a terminal.

When PostgreSQL is installed, it's time to create a user and a database for Gitrob. To do so, type the following commands in a terminal:

sudo su postgres # Not necessary on Mac OS X
createuser -s gitrob --pwprompt
createdb -O gitrob gitrob

The last thing we need is a GitHub access token in order to be able to talk to their API. The easiest way is to create a personal access token. If you plan on using Gitrob extensively or on a very big organization, it might be necessary to turn down the number of threads used and maybe configure Gitrob to use access tokens from you and your colleagues, to avoid running into rate limiting.
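One simple way to spread requests across several access tokens is round-robin rotation. The sketch below illustrates that idea only; the `TokenPool` class is hypothetical and is not necessarily how Gitrob handles multiple tokens. The lock is there because the requests are assumed to come from several worker threads. The tokens shown are placeholders.

```python
import threading
from itertools import cycle


class TokenPool:
    """Hands out access tokens in round-robin order (hypothetical helper)."""

    def __init__(self, tokens):
        if not tokens:
            raise ValueError("at least one access token is required")
        self._tokens = cycle(tokens)
        self._lock = threading.Lock()  # several worker threads may share the pool

    def next_headers(self):
        """Return request headers carrying the next token in the rotation."""
        with self._lock:
            token = next(self._tokens)
        return {"Authorization": f"token {token}"}


# Placeholder tokens; real ones would come from your configuration.
pool = TokenPool(["token-a", "token-b"])
used = [pool.next_headers()["Authorization"] for _ in range(4)]
print(used)
```

Rotation only delays hitting the limit; lowering the thread count is still the more direct way to slow the request rate.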

When everything is ready, simply run gitrob --configure and you will be presented with a configuration wizard that asks you for database connection details and GitHub access tokens. All of this configuration can be changed by running the same command again. The configuration will be saved in ~/.gitrobrc - and yes, Gitrob is looking for this file too so watch out.

Setting up Gitrob with the configuration wizard.

When everything is set up, you can start analyzing organizations by running gitrob -o <orgname> in a terminal. To see options, use gitrob --help.

Why I created Gitrob

I work in the security team at SoundCloud (We're hiring, btw) and one of my recent tasks has been to create a system that continuously watches our GitHub organization for various things that might be a security risk, including looking for potentially sensitive files in repositories. During development, I thought it would be interesting to take parts of this system and open source them as a tool that can be used both defensively and offensively.

If you are responsible for security at a company that uses GitHub for hosting code, Gitrob can be used to periodically check your organization for any sensitive files that might be lingering in repositories.

If you are on the offensive side, like a professional penetration tester, Gitrob can be used in the initial information gathering stage to look for anything that might give you a foothold or increase the target's attack surface. Gitrob can also give you usernames, names, email addresses and names of internal systems that are useful in phishing campaigns and social engineering attacks. If you are lucky, Gitrob can even give you complete pwnage without ever sending a single malicious packet to the target's systems.

Helping out

Gitrob should be considered beta, and there are probably a good number of bugs. Bug reports and suggestions for improvements are welcome!

Another way to help out is to contribute new patterns for sensitive files. If you know of any sensitive files that are not already identified, please submit them in a pull request on GitHub. I am especially interested in sensitive web framework files and configuration files. Have a look at the signatures.json file to see what is already looked for.

Have fun and be responsible!