You can measure the number of security-related bugs in a project by using a static analysis tool appropriate for its programming language, but a single run cannot show how those bugs evolve over time.
To involve the aspect of time, we can take advantage of the characteristics of version control systems (VCS). To manage large software projects, developers employ systems such as Subversion and Git. These systems provide project contributors with major advantages: automatic backups, sharing across multiple computers, maintenance of different versions, and more. With every new contribution, known as a commit, a VCS moves to a new state, called a revision. Every revision stored in a repository represents the state of every single file at a specific point in time.
To measure the occurrence of security-related bugs over time, together with two colleagues, we combined FindBugs, an effective static analysis tool that operates on Java bytecode and has already been used in research, with Alitheia Core, an extensible platform designed for large-scale software quality evaluation studies. I will not take up your time describing the integration of the two (you can find it in the proceedings of this conference); instead, I will describe how the framework works. First, Alitheia Core checks out a project from its SVN repository. Then, for every revision of the project, the framework creates a build and invokes FindBugs to examine that build and produce an analysis report. A user can choose whether to examine the project alone or the project together with its dependencies. Finally, the framework retrieves the security-related bugs from this report.
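The per-revision loop can be sketched as follows. This is an illustration of the steps described above, not the actual Alitheia Core plug-in code; the repository URL, jar path, and tool flags are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class RevisionAnalysis {
    // Builds the commands for analyzing one revision: check out the revision,
    // build it with Maven, then run FindBugs on the resulting jar.
    // All paths and the report naming scheme are hypothetical.
    static List<String[]> commandsFor(String repoUrl, int rev) {
        List<String[]> cmds = new ArrayList<>();
        cmds.add(new String[] {"svn", "checkout", "-r", String.valueOf(rev), repoUrl, "wc"});
        cmds.add(new String[] {"mvn", "-f", "wc/pom.xml", "package"});
        cmds.add(new String[] {"findbugs", "-textui", "-xml",
                               "-output", "report-" + rev + ".xml",
                               "wc/target/project.jar"});
        return cmds;
    }

    public static void main(String[] args) {
        // In the real framework each command would be executed (e.g. via
        // ProcessBuilder) and the SECURITY-category bug patterns would be
        // filtered out of each report; here we only print the plan for rev 1.
        for (String[] cmd : commandsFor("http://example.org/svn/project", 1)) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

A revision whose build fails can simply be skipped, since no jar is produced for FindBugs to analyze.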
We examined four open-source projects based on the Maven build system: xlite, sxc, javancss, and grepo. We chose Maven-based projects specifically to exploit the advantages of the system, such as resolving and downloading dependencies on the fly and building every revision automatically. In addition, all four projects handle untrusted input, so they could become targets for exploits. Our experiment included two measurements. First, for every revision, we applied FindBugs only to the bytecode of the project itself. For the second measurement, we also included the project's dependencies. The figure below depicts the results for the javancss project; they are representative of all four projects. The red line represents the first measurement and the green line the second.

Bug frequency for the javancss project
The most interesting observation is that security bugs increase as projects evolve. This is important: it shows that bugs should be fixed as they appear, to decrease the effort and cost of security audits after development ends. Another observation is the existence of a domino effect: the use of external libraries introduces new bugs. Thus, we can observe how third-party software affects the evolution of a project while its developers remain unaware of it. A further interesting issue concerns the bugs themselves. As we observed, even after hundreds of revisions, release versions still contain bugs that could be severe for the application. For instance, the last revision of javancss includes a code fragment that creates an SQL statement from a non-constant string. If that string is not checked properly, an SQL injection attack becomes possible.
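To illustrate the bug class in question (not the actual javancss code), here is a minimal sketch of how building SQL from a non-constant string enables injection, with the standard JDBC fix noted in the comments:

```java
public class SqlExample {
    // Unsafe: the statement is assembled by string concatenation, so the
    // input can change the meaning of the query itself.
    static String unsafeQuery(String name) {
        return "SELECT * FROM users WHERE name = '" + name + "'";
    }

    public static void main(String[] args) {
        // A crafted input turns the WHERE clause into a tautology:
        System.out.println(unsafeQuery("x' OR '1'='1"));
        // With JDBC, the fix is a parameterized query:
        //   PreparedStatement ps = conn.prepareStatement(
        //       "SELECT * FROM users WHERE name = ?");
        //   ps.setString(1, name);  // input is bound as data, never parsed as SQL
    }
}
```

FindBugs reports this pattern precisely because it cannot prove that a non-constant string reaching an SQL statement has been sanitized.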
At first glance, you could say that as a project evolves, its lines of code increase, and as a result the security-related bugs should increase too. But this cannot be claimed with confidence, since techniques such as refactoring can shrink a project's code base over time. Hence, it would be interesting to check whether there is a correlation between a project's lines of code and its security bugs.
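Such a check amounts to computing a correlation coefficient over per-revision (LOC, bug count) pairs. A minimal sketch using Pearson's r is shown below; the sample data is purely illustrative, not a measurement from our study.

```java
public class LocBugCorrelation {
    // Pearson correlation coefficient between two equally sized samples.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; syy += y[i] * y[i];
            sxy += x[i] * y[i];
        }
        double cov = sxy - sx * sy / n;      // n * covariance
        double vx = sxx - sx * sx / n;       // n * variance of x
        double vy = syy - sy * sy / n;       // n * variance of y
        return cov / Math.sqrt(vx * vy);
    }

    public static void main(String[] args) {
        // Hypothetical per-revision values: LOC and security-bug counts.
        double[] loc  = {1200, 1500, 1800, 1750, 2100};
        double[] bugs = {3, 4, 6, 6, 8};
        System.out.printf("r = %.3f%n", pearson(loc, bugs));
    }
}
```

A value of r near 1 would support the "more code, more bugs" intuition, while a weak correlation would suggest that bug growth is driven by other factors.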
The main limitation of our approach is that it must automatically build numerous revisions and run FindBugs on the resulting jars. This is inconvenient, since most Maven-based projects are hosted on GitHub. To validate the statistical significance of our results, we should run our framework on more projects. This can be done by mining data from GitHub, and we are currently working in this direction.
The code of our framework can be found here.