Disclaimer: this information is for research purposes only. It is intended to help website owners and software engineers better secure their systems.
At the time of our research (late 2017) our crawler discovered 6242/1,000,000 vulnerable git repositories.
That is 0.62% of the most popular websites in the world*.
*We only tested root directories (e.g. example.com/.git). Testing all links and folders would very likely yield many more results.
Git is a revision control system (or version control system), and a git repository is a project whose history it tracks. It is the most popular tool for version control used by software developers.
According to the man git pages: "Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals."
A 'vulnerable' git repository is a git repo which has been left unintentionally public on the internet. There are plenty of cases where git repositories should be left on the internet - an obvious example being open source software. However, many git repositories should not be public and are. The most common mistake developers make is disabling directory access and assuming this means their repository is safe.
What does this mean? Let's explain with an example.
Say you use a git push command (or pull, or similar) to push your code to your live environment. Perhaps you set up an Apache or nginx server with the root web directory in /home/user/www/, so the repository sits in /home/user/www/.git/ along with its contents.
You check that example.com/.git shows a 404 or 403 denied error. All good. This is where most developers leave things.
But what about example.com/.git/config? If the config file downloads, your repository is still public, and most of it could be downloaded by a bad actor for malicious purposes.
In short, developers assume that because their repository cannot be accessed at example.com/.git/ they are safe. However, often the raw files (objects, configuration, commits) are still public.
It doesn't take a great deal of effort to find the latest commit on a branch by viewing the refs/heads/master file. From there, you can usually work backwards through the history to download the entire repo. This process can be automated, and open source tools that do so exist on GitHub.
As a developer, when using git repositories in your website root, please check example.com/.git/config, not just example.com/.git/.
A simple Go program was written to request .git/config files from each domain, testing both http and https versions, for example http://example.com/.git/config and https://example.com/.git/config.
If a domain returned a 200 response, the config file was parsed to check whether it was a real git config file.
As described above, these exposures come from a simple mistake which anyone can make. In separate research, we found vulnerable repositories at multiple conglomerate-sized companies (including a major search engine) with public bug bounty programs. The mistake originates from thinking you are safe once directory listing is disabled.
Often developers commit all kinds of data to a git repository, including database passwords, secret keys, and even SSH keys. If your website is vulnerable, an attacker could potentially access all data stored in your git history.
From there, they could use the data to launch further attacks or gain deeper access into your system.
By far the best way to solve this is to avoid putting your git repository in your website root. Often content management systems and web application frameworks include all code within the web root. This is terrible practice. Your application can nearly always run with only one file (for example index.php) in the website root, by changing the references within.
If this is not possible for you, then you should use a post-receive hook to push your code from a bare repository to the website root. Alternatively, leave it in your website root and ensure you disable not only directory listing, but direct file access as well.
Published 14 March 2018. Article written by Ollie Cox.