Developers generally like to share their code, and many of them do so by open sourcing it on GitHub, a social code hosting and collaboration service. Many companies also use GitHub as a convenient place to host both private and public code repositories by creating GitHub organizations that employees can join.
Sometimes employees might publish things that should not be publicly available: things that contain sensitive information, or that could even lead to direct compromise of a system. This can happen by accident or because the employee does not know how sensitive the information is.
Gitrob is a command line tool that can help organizations and security professionals find such sensitive information. The tool will iterate over all public organization and member repositories and match filenames against a range of patterns for files that typically contain sensitive or dangerous information.
Looking for sensitive information in GitHub repositories is not a new thing. It has been known for a while that things such as private keys and credentials can be found with GitHub's search functionality. However, Gitrob makes it easier to focus the effort on a specific organization.
The first thing the tool does is to collect all public repositories of the organization itself. It then goes on to collect all the organization members and their public repositories, in order to compile a list of repositories that might be related or have relevance to the organization.
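The collection step described above can be sketched roughly as follows. This is a minimal illustration in Python, not Gitrob's actual Ruby implementation: the function name collect_repositories and the three lookup callables are hypothetical, and the data is made up. In a real run the callables would wrap GitHub API requests for an organization's repositories, its members, and each member's repositories.

```python
from typing import Callable, Iterable, List


def collect_repositories(
    org: str,
    list_org_repos: Callable[[str], Iterable[str]],
    list_org_members: Callable[[str], Iterable[str]],
    list_user_repos: Callable[[str], Iterable[str]],
) -> List[str]:
    """Return the organization's repos first, then member repos, without duplicates."""
    seen = set()
    ordered = []

    def add(full_name: str) -> None:
        # Keep first-seen order and skip anything already collected.
        if full_name not in seen:
            seen.add(full_name)
            ordered.append(full_name)

    # Step 1: public repositories owned by the organization itself.
    for repo in list_org_repos(org):
        add(repo)

    # Step 2: public repositories owned by each organization member.
    for member in list_org_members(org):
        for repo in list_user_repos(member):
            add(repo)

    return ordered


# Made-up stand-in data (hypothetical; no network access involved).
FAKE_ORG_REPOS = {"acme": ["acme/website", "acme/api"]}
FAKE_MEMBERS = {"acme": ["alice", "bob"]}
FAKE_USER_REPOS = {"alice": ["alice/dotfiles"], "bob": ["bob/scripts", "bob/blog"]}

repos = collect_repositories(
    "acme",
    lambda org: FAKE_ORG_REPOS.get(org, []),
    lambda org: FAKE_MEMBERS.get(org, []),
    lambda user: FAKE_USER_REPOS.get(user, []),
)
print(repos)
```

Passing the lookups in as callables keeps the traversal logic separate from the API client, which makes it easy to try out with canned data like this.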
When the list of repositories has been compiled, it proceeds to gather all the filenames in each repository and runs them through a series of observers that will flag the files if they match any patterns of known sensitive files. This step might take a while if the organization is big or if the members have a lot of public repositories.
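To make the flagging step more concrete, here is a small Python sketch of matching file paths against a list of patterns. The signature entries, their field names (part, type, pattern, caption) and the function names are assumptions made for illustration; they are not taken from Gitrob's source and the real pattern file may look different.

```python
import re
from pathlib import PurePosixPath

# Hypothetical signatures: each one inspects one part of the file path,
# either by exact match or by regular expression.
SIGNATURES = [
    {"part": "filename", "type": "match", "pattern": "id_rsa",
     "caption": "Private SSH key"},
    {"part": "extension", "type": "match", "pattern": "pem",
     "caption": "Potential cryptographic private key"},
    {"part": "filename", "type": "regex", "pattern": r"^\.?(bash_|zsh_)?history$",
     "caption": "Shell command history file"},
    {"part": "path", "type": "regex", "pattern": r"\.?ssh/config$",
     "caption": "SSH configuration file"},
]


def file_parts(path):
    """Split a repository-relative path into the parts signatures can inspect."""
    p = PurePosixPath(path)
    return {
        "path": path,
        "filename": p.name,
        "extension": p.suffix.lstrip("."),  # "" when there is no extension
    }


def flag_file(path, signatures=SIGNATURES):
    """Return the captions of all signatures that the given path matches."""
    parts = file_parts(path)
    findings = []
    for sig in signatures:
        value = parts[sig["part"]]
        if sig["type"] == "match":
            hit = value == sig["pattern"]
        else:
            hit = re.search(sig["pattern"], value) is not None
        if hit:
            findings.append(sig["caption"])
    return findings


for candidate in ["deploy/keys/id_rsa", "certs/server.pem", "src/main.rb"]:
    print(candidate, flag_file(candidate))
```

Only file names and paths are inspected here, never file contents, which matches the filename-based approach described above.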
All of the members, repositories and files will be saved to a PostgreSQL database. When everything has been sifted through, it will start a Sinatra web server locally on the machine, which will serve a simple web application to present the collected data for analysis.
While developing Gitrob, I tested it against many different organizations belonging to various companies, big and small, both to expose the tool to a lot of real-life data and to notify the companies of any findings before release.
The tool found several interesting things, ranging from low-level, to bad, all the way to the company-destroying kind of information disclosure. Here are some examples...
Note: I have redacted sensitive and identifying information in the screenshots; I am not interested in embarrassing or exposing anyone. And again, all these findings have been reported.
Gitrob is written in Ruby and requires version 1.9.3 or above. If you
are on an older version, it is very easy to install a newer version with RVM.
If you are installing Gitrob on Kali, you
are almost good to go; you just need to update Bundler with
gem install bundler.
It might also be necessary to install a PostgreSQL dependency by
installing the postgresql-server-dev-9.1 package from a terminal.
Gitrob is a Ruby gem, so installation is a simple
gem install gitrob
in a terminal. This will automatically install all the code dependencies as well.
A PostgreSQL database is also needed
for Gitrob to store its data. Installing PostgreSQL is pretty straightforward;
here is an installation guide for Mac OS X
and one for Ubuntu/Debian based Linux.
If you're installing Gitrob on Kali, you already have PostgreSQL installed, but
you need to start the server with
service postgresql start in a terminal.
When PostgreSQL is installed, it's time to create a user and a database for Gitrob. To do so, type the following commands in a terminal:
sudo su postgres # Not necessary on Mac OS X
createuser -s gitrob --pwprompt
createdb -O gitrob gitrob
The last thing we need is a GitHub access token in order to be able to talk to the GitHub API. The easiest way is to create a personal access token. If you plan on using Gitrob extensively or on a very big organization, it might be necessary to turn down the number of threads used and maybe configure Gitrob to use access tokens from you and your colleagues, to avoid running into rate limiting.
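As a rough illustration of why several access tokens help with rate limiting, the Python sketch below spreads requests across a pool of tokens in round-robin order, so each token only carries a share of the traffic. The TokenPool class and the token values are hypothetical, and this is an assumption about one reasonable strategy, not a description of how Gitrob itself picks tokens.

```python
from itertools import cycle


class TokenPool:
    """Hands out access tokens in round-robin order (hypothetical helper)."""

    def __init__(self, tokens):
        tokens = list(tokens)
        if not tokens:
            raise ValueError("at least one access token is required")
        self._tokens = cycle(tokens)

    def next_headers(self):
        """Return request headers carrying the next token in the rotation."""
        token = next(self._tokens)
        return {"Authorization": f"token {token}"}


# Placeholder values, not real tokens.
pool = TokenPool(["token-from-me", "token-from-colleague"])

# Three consecutive requests: the third wraps around to the first token.
used = [pool.next_headers()["Authorization"] for _ in range(3)]
print(used)
```

With two tokens, each one is used for roughly half of the requests, so each token's rate limit is consumed about half as fast as it would be on its own.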
When everything is ready, simply run
gitrob --configure and you
will be presented with a configuration wizard that asks you for database connection
details and GitHub access tokens. All of this configuration can be changed by
running the same command again. The configuration will be saved in
- and yes, Gitrob is looking for this file too so watch out.
When everything is set up, you can start analyzing organizations by running
gitrob -o <orgname> in a terminal. To see options, use
I work on the security team at SoundCloud (we're hiring, btw) and one of my recent tasks has been to create a system that continuously watches our GitHub organization for various things that might be a security risk, including looking for potentially sensitive files in repositories. During development, I thought it would be interesting to take parts of this system and open source them as a tool that can be used both defensively and offensively.
If you are responsible for security at a company that uses GitHub for hosting code, Gitrob can be used to periodically check your organization for any sensitive files that might be lingering in repositories.
If you are on the offensive side, like a professional penetration tester, Gitrob can be used in the initial information gathering stage to look for anything that might give you a foothold or increase the target's attack surface. Gitrob can also give you usernames, names, email addresses and names of internal systems that are useful in phishing campaigns and social engineering attacks. If you are lucky, Gitrob can even give you complete pwnage without ever sending a single malicious packet to the target's systems.
Gitrob should be considered beta and there are probably a good number of bugs. Bug reports and suggestions for improvements are welcome!
Another way to help out is to contribute new patterns for sensitive files. If you know of any sensitive files that are not already identified, please submit them in a pull request on GitHub. I am especially interested in sensitive web framework files and configuration files. Have a look at the signatures.json file to see what is already looked for.
Have fun and be responsible!