Fatal Injection: The Client’s Side

In a previous blog post I discussed a critical class of web attacks known as code injection attacks. In particular, I presented a subset of such attacks whose target entities exist on the server. Here we will talk about the emerging subset of dynamic code injection attacks which, in addition to server-side entities, threaten network-oriented applications hosted on a client machine, such as the browser and messaging applications.


Data Security in the Cloud Environment

It’s true. You don’t have to worry about physical equipment on the cloud. But what about your data? Lately, a number of security concerns have been associated with cloud computing, and one of them involves the protection of a client’s virtual machines (VMs), data, and running applications. For instance, recent research showed that it is possible for software hosted by a cloud-computing provider to acquire data, such as encryption keys, from software hosted on the same cloud.

To examine data security in a cloud environment, two colleagues (Periklis Gkolias and Prof. Diomidis Spinellis) and I performed a series of penetration tests on a number of virtual machines running different operating systems. All VMs were hosted on the Amazon Elastic Compute Cloud (EC2), which is part of the Amazon Web Services (AWS) platform. To perform the penetration tests, we used the Tenable Nessus vulnerability scanner. Our methodology included the following steps: first, we retrieved a list of available Amazon machine images. Then we picked a random image, launched it on the cloud, and retrieved its IP address. After that, we invoked the Nessus scanner, passing the IP address to it as a parameter. When the test was over, we terminated the image.
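A minimal sketch of this loop is shown below; `listAvailableImages`, `launchAndGetIp`, `runNessusScan`, and `terminate` are hypothetical helper methods standing in for the real AWS and Nessus tooling we used, not actual API calls:

```java
import java.util.List;
import java.util.Random;

public class ScanLoopSketch {
    // Hypothetical stand-ins for the AWS and Nessus tooling used in the study.
    static List<String> listAvailableImages() { return List.of("ami-a", "ami-b", "ami-c"); }
    static String launchAndGetIp(String imageId) { return "10.0.0.1"; }
    static int runNessusScan(String ip) { return 0; /* would return the number of findings */ }
    static void terminate(String imageId) { /* would stop the running instance */ }

    public static void main(String[] args) {
        // 1. Retrieve the list of available machine images.
        List<String> images = listAvailableImages();
        // 2. Pick a random image, launch it, and obtain its IP address.
        String image = images.get(new Random().nextInt(images.size()));
        String ip = launchAndGetIp(image);
        // 3. Invoke the scanner with the IP address as a parameter.
        int findings = runNessusScan(ip);
        // 4. Terminate the image once the test is over.
        terminate(image);
        System.out.println(image + ": " + findings + " findings");
    }
}
```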

In total, we examined 70 VMs. The operating systems (OS) running on these images fall into four basic categories: Windows Server (14 images), Ubuntu (26 images), CentOS (9 images), and other Linux distributions (21 images, including Slackware and Arch Linux). Keep in mind that Amazon does not use vanilla distributions of these operating systems, but modified distributions tailored to the requirements of virtual machines.

Our first observation was that 22 VMs (10 Ubuntu images, 8 other Linux images, 3 Windows Server images, and 1 CentOS image) were vulnerable through HTTP methods; each of these VMs had at least three vulnerabilities exploitable through the HTTP protocol. In addition, the virtual machines of the Windows family presented serious problems with MS RDP (Remote Desktop Protocol): all images running Windows except one were vulnerable to attacks targeting this protocol, with at least four defects related to it. Another observation involved virtual machines running obsolete versions of the Apache server. Regardless of the operating system, such images were vulnerable to numerous attacks, including man-in-the-middle, cross-site scripting, and SQL injection; installing the latest version of the Apache server would address these problems.

In general, as you can see in the figure below, the VMs were vulnerable to different types of critical attacks. Most defects found on the VMs could lead to man-in-the-middle and denial-of-service attacks. Such attacks could be avoided by configuring the SSL (Secure Sockets Layer) protocol settings properly: in many cases there were mistakes in the computers’ names, and some certificates had expired. Of the 70 images, only 26 turned out to be secure, namely 8 CentOS VMs, 8 Ubuntu VMs, 8 VMs with other Linux OSs, and 2 VMs with Windows Server.

Similar research by Balduzzi et al. also indicates that if you are thinking of using a third-party Amazon Web Services image, or of publishing your own, you had better think again. If you are interested in reading more about cloud security issues, you can have a look at “A Survey on Security Issues in Service Delivery Models of Cloud Computing” by Subashini et al. and “Who Can You Trust in the Cloud?” by C. Roberts II et al. Finally, you can check out this interesting presentation concerning the various security issues of the Dropbox cloud back-up service.

Drones and the Digital Panopticon

There has been a lot of alarming speculation in the media since February about the potential consequences of the FAA Modernization and Reform Act, which requires that the FAA prepare the national airspace for the introduction, in 2015, of privately owned and operated unmanned aerial vehicles (UAVs). The airspace already hosts UAVs flown by federal, state, and local governments; the Act makes it easier for such agencies to acquire them and permits private entities to get licenses to fly them too. It is designed to get as many drones as possible in the air as quickly as possible.

Much ink has been spilled speculating about the potential effects of a widespread drone presence in this country, mostly focusing on either their ramifications for privacy or on the potential for physical injury they represent. These observations fail to address what makes a sky full of drones so radically unsettling.

Drones are going to be used to gather data, and the data will be integrated into the marketing scan. All drones gather at least the data they need in order to function remotely, and some of them will be able to photograph in staggeringly high resolution, or track up to 65 separate people at once. They won’t all be doing this, obviously, but the FAA’s licensing process doesn’t require drone operators to go into detail about what their vehicles will carry or collect.

We also know that data about people’s movements and behavior is hugely valuable to marketers. It is already collected unobtrusively from us as we move around in the virtual space of the Internet. In an important sense, that space is already patrolled by drones with data collection capabilities, similar to the ones that will soon be operated in the national airspace by private entities. Behavioral information is lucrative. There is every reason to think that the data collected by airborne drones will be just as interesting to the purchasers of bot-collected online behavior data.

Of course, much of our physical-space movement is monitored already, and it is possible to aggregate this information to create an eerily complete picture of a person’s movements, social circle, and preferences. Credit cards, license plate scanners, CCTV cameras, transit passes, and smartphones are all sources of this information.

Over this web of information, drones can add a layer of photographic evidence. The marketing scan of the online drone will merge into the marketing scan of the physical-space drone, and the result will be that we are even more easily identified, tracked, tagged, and followed. Privacy advocates are justly concerned about the erosion of basic notions of privacy by ubiquitous monitoring.

This is a danger separate from safety hazards, because it undermines one of the most basic presumptions of freedom – the absence of arbitrary power. Conceptually, the danger potentially posed by the coming drone squadrons can be separated from privacy concerns, too. The concept of the panopticon (likely familiar to many of you) illustrates the loss of freedom that accompanies arbitrary power, and shows how distinct it is from the lack of contextual integrity that marks an absence of privacy.

The panopticon exemplifies the reality of arbitrary power. The English philosopher Jeremy Bentham designed the Panopticon: a prison in which guards can watch prisoners without the prisoners knowing whether they are being watched. The architectural design features a central guard tower from which a single guard can see every cell in the prison. Bentham reasoned that this architecture would force prisoners to behave at minimal cost, since fewer resources would need to be invested in guarding them.

Nearly two centuries later, the French philosopher Michel Foucault observed that the “panoptic mechanism” exists in the abstract, as a form of social control. A panoptic arrangement exists wherever there is ongoing subjection to a “field of visibility.” Drones do this, literally: they could be watching at any time, but it will be impossible for us to know at any given time whether we are being observed. The constant subjection to this field, coupled with the capacity for this data to be used by the government to punish or by the marketing scan to determine what information we receive, means that our rational self-interest will lead us to self-censor. We are already seeing this play out socially; people have developed strategies like maintaining separate social network identities for personal and business use, or paying cash for transit passes to avoid being traceable via credit card.

Domestic drones taking photographs or video won’t significantly change this dynamic. They will push it further toward an extreme, in which it becomes harder and harder to extricate ourselves from the marketing scan, and in which the marketing scan and the eye of the State merge (because law enforcement will have ready access to privately owned and aggregated data).

My point in writing this is not to challenge anyone to come up with a “solution,” but rather to point out that the negative effects of drone presence are not exemplified by their security vulnerabilities or their tendency to drop out of the sky. Abstract as it might seem, the increased power and intensity of this “field of visibility” is what will affect our lives the most. It will determine the distribution of information through the marketing scan; we will eventually be aware of it for this reason. And as the reality of our observed status sinks in, we will rationally self-monitor in case we’re being recorded. This state of being poses a radical threat to the way we think about freedom.



Measuring the Occurrence of Security-Related Bugs through Software Evolution

You can measure the number of security-related bugs in a project by using an appropriate static analysis tool (depending on the programming language you used), but that alone does not show how the bugs evolve over time.

To involve the aspect of time, we can take advantage of the characteristics of version control systems (VCSs). To manage large software projects, developers employ systems like Subversion and Git. Such systems give project contributors major advantages: automatic backups, sharing across multiple computers, maintenance of different versions, and more. With every new contribution, known as a commit, a VCS moves to a new state called a revision. Every revision stored in a repository represents the state of every single file at a specific point in time.

To measure the occurrence of security-related bugs through time, together with two colleagues, we combined FindBugs, an effective static analysis tool that runs on bytecode and has already been used in research, with Alitheia Core, an extensible platform designed for performing large-scale software quality evaluation studies. I will not take up your time by describing the integration of the two (you can check it in the proceedings of this conference). Instead, I will describe how the framework works. First, Alitheia Core checks out a project from its SVN repository. Then, for every revision of the project, the framework creates a build and invokes FindBugs to examine it and produce an analysis report. A user can select whether to examine the project alone or the project together with its dependencies. Finally, from this report, the framework retrieves the security-related bugs.
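The per-revision loop can be sketched as follows; the helper methods here are hypothetical stand-ins for the Alitheia Core and FindBugs integration, not their real APIs:

```java
import java.util.List;

public class EvolutionSketch {
    // Hypothetical stand-ins: the real framework checks the project out of
    // its SVN repository and runs FindBugs on the compiled bytecode.
    static List<Integer> revisions() { return List.of(1, 2, 3); }
    static String buildRevision(int revision) { return "build-r" + revision + ".jar"; }
    static int countSecurityBugs(String jar, boolean withDependencies) {
        return 0; // would parse the FindBugs report for security-related bug patterns
    }

    public static void main(String[] args) {
        // For every revision: create a build, analyze it, record the bug count.
        for (int rev : revisions()) {
            String jar = buildRevision(rev);
            int projectOnly = countSecurityBugs(jar, false); // project alone
            int withDeps = countSecurityBugs(jar, true);     // project plus dependencies
            System.out.println("r" + rev + ": " + projectOnly + " / " + withDeps);
        }
    }
}
```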

We examined four open-source projects that are based on the Maven build system, namely xlite, sxc, javancss, and grepo. We particularly chose Maven-based projects to exploit the advantages of the system, such as resolving and downloading dependencies on the fly and building every revision automatically. In addition, all four projects deal with untrusted input, so they could become targets for exploits. Our experiment included two measurements: first, for every revision, we applied FindBugs only to the bytecode of the project itself; second, we also included the project’s dependencies. The figure below depicts the results for the javancss project; the results are representative of all four projects. The red line represents the first measurement and the green line the second.

Bug frequency for javancss project

The most interesting observation is that security bugs increase in number as projects evolve. This is quite important, and shows that bugs should be fixed in time to decrease the effort and cost of security audits after the end of the development process. Another observation is the existence of a domino effect: the use of external libraries introduces new bugs. Thus, we can observe how third-party software can affect the evolution of a project while its developers are unaware of it. Another interesting issue regards the bugs themselves. As we observed, even after hundreds of revisions, the release versions contain bugs that could be severe for the application. For instance, the last revision of javancss includes a code fragment that creates an SQL statement from a non-constant string. If this string is not checked properly, an SQL injection attack becomes a real possibility.

At first glance you could say that as a project evolves, the lines of code increase and as a result, the security-related bugs should increase too. But you cannot support this with confidence since there are techniques like refactoring that could decrease the project’s code as time passes. Hence, it could be interesting to check if there is a correlation between the lines of code and the security bugs of a project.

The main limitation of our approach is that it has to automatically build numerous revisions and run FindBugs on the resulting jars. This is not always convenient, especially since most Maven-based projects now live on GitHub rather than in Subversion repositories. To validate the statistical significance of our results, we should also run our framework on more projects. This can be done by mining data from GitHub, and we are currently working in this direction.

The code of our framework can be found here.

Fatal Injection: The Server’s Side

In my first blog post I discussed software security and mentioned some common traps into which programmers regularly fall. One of them involved input validation. For instance, a developer may assume that the user will enter only numeric characters as input, or that the input will never exceed a certain length. Such assumptions can lead to the processing of invalid data that an attacker can introduce into a program to make it execute malicious code. This class of exploits is known as code injection attacks, and it is currently topping the lists of the various security bulletin providers. Code injection attacks are among the most damaging classes of attacks, since they can occur in different layers, such as databases, native code, and applications, and they span a wide range of security and privacy issues, from viewing sensitive information and modifying sensitive data to halting the execution of an entire application.

The set of code injection attacks that involves the insertion of binary code into a target application to alter its execution flow and execute the inserted compiled code can be described as binary code injection attacks. This category includes the infamous buffer-overflow attacks. Such attacks are possible when the bounds of memory areas are not checked and the program can access memory beyond those bounds. Consider the following code segment:

#include <string.h>

/* Copies its argument into a fixed-size buffer without checking its length. */
void vulnerable_method(char *foo)
{
  char tmp[12];
  strcpy(tmp, foo);   /* overflows tmp if foo holds more than 11 characters */
}

int main(int argc, char **argv)
{
  if (argc > 1)
    vulnerable_method(argv[1]);
  return 0;
}

This code takes an argument and copies it into the tmp buffer. But what happens when the argument is longer than 11 characters? By taking advantage of this, malicious users can inject additional data, overwriting existing data in adjacent memory. From there they can take control of the program, or even of the entire host machine. C and C++ are vulnerable to this kind of attack since typical implementations lack a protection scheme against overwriting data in arbitrary parts of memory. In comparison, Java guards against such attacks by preventing access beyond array bounds and throwing a runtime exception instead. To combat a buffer overflow in the above case you could use strncpy instead (strlcpy is even better if you are running on BSD or Solaris), since it takes the destination buffer’s size as a parameter; note, though, that strncpy does not null-terminate the destination when the source is too long, so you must do that yourself.
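To illustrate the contrast, the following small Java program (an illustrative sketch) attempts the same unchecked copy; the out-of-bounds write raises an ArrayIndexOutOfBoundsException instead of corrupting adjacent memory:

```java
public class BoundsCheckDemo {
    // Attempts the same copy as the C example; returns false if Java stops it.
    static boolean copyInto(char[] dst, String src) {
        try {
            for (int i = 0; i < src.length(); i++) {
                dst[i] = src.charAt(i); // every array index is checked at runtime
            }
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false; // the out-of-bounds write is rejected, not performed
        }
    }

    public static void main(String[] args) {
        char[] tmp = new char[12];
        boolean ok = copyInto(tmp, "a string much longer than eleven characters");
        System.out.println("Copy completed: " + ok); // prints "Copy completed: false"
    }
}
```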

Code injection also includes the use of source code, either of Domain-Specific Languages (DSLs) or of dynamic languages. DSL-driven injection attacks constitute an important subset of code injection, as DSLs like SQL and XML play a significant role in the development of applications. For instance, many applications have interfaces through which a user enters input to interact with the application’s underlying relational database management system (RDBMS). This input can become part of an SQL statement and be executed on the target RDBMS. A code injection attack that exploits the vulnerabilities of these interfaces is called an SQL injection attack. One of the most common forms of such an exploit involves taking advantage of incorrectly filtered quotation characters. For instance, on a login page, users wanting to access a vulnerable application have to fill in a simple username-and-password form. The following Java code illustrates the defect by accepting user input without performing any validation:

Connection conn = DriverManager.getConnection(a_url, "a_username", "a_password");
// User input is concatenated directly into the query string.
String sql = "select * from user where username='" + uname + "' and password='" + pass + "'";
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(sql);
if (rs.next()) {
  loggedIn = true;
  out.println("Logged in");
} else {
  out.println("Credentials not recognized");
}

By entering the string anything' OR 'x'='x in the password field, a malicious user could log in to the site without knowing a valid password, since the injected 'x'='x' condition is always true. To avoid a situation like the above, you could use Java’s prepared statements. In a prepared statement, the query is precompiled on the driver that connects the application with the database. From that point on, the parameters are sent to the driver as literal values and not as executable portions of SQL; hence no SQL can be injected through a parameter. XPath and LDAP injection attacks are DSL-driven attacks with a very similar style to SQL injection. Also, notice that contrary to binary code injection, a DSL-driven injection attack is independent of the language used to create the application, so the problem above could occur just as easily in PHP, C++, etc.
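To make the flaw concrete, the sketch below (the names are illustrative, not from a real application) builds the exact string the vulnerable code would send to the database; the parameterized alternative is shown in the comments using the standard JDBC PreparedStatement API:

```java
public class InjectionDemo {
    // Builds the query exactly as the vulnerable code above does.
    static String naiveQuery(String uname, String pass) {
        return "select * from user where username='" + uname
                + "' and password='" + pass + "'";
    }

    public static void main(String[] args) {
        // The attacker breaks out of the string literal with a single quote.
        String attack = naiveQuery("admin", "anything' OR 'x'='x");
        System.out.println(attack);
        // The WHERE clause now ends in OR 'x'='x', which is always true.
        //
        // With a prepared statement the query shape is fixed up front and the
        // values are bound as literals, so the quote cannot alter the query:
        //
        //   PreparedStatement ps = conn.prepareStatement(
        //       "select * from user where username=? and password=?");
        //   ps.setString(1, uname);
        //   ps.setString(2, pass);
        //   ResultSet rs = ps.executeQuery();
    }
}
```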

As you have probably realized, the aforementioned attacks target vulnerable server-side applications. In one of my upcoming posts, we will have the opportunity to discuss injection attacks that involve dynamic languages and target entities on the client side (i.e. the user’s browser).