About XRDS Staff

Ultra-Efficient via Sublinearity

For a long time in the area of design and analysis of algorithms, when we have said that an algorithm is efficient we meant that it runs in time polynomial in the input size n and finding a linear time algorithm have been considered as the most efficient way to solve a problem. It’s been because of this assumption that we need at least to consider and read all the input to solve the problem. This way it seems that we cannot do much better! But nowadays the data sets are growing fast in various areas and applications in a way that it hardly fits in storage and in this case even linear time is prohibitive. To work with this massive amount of data, the traditional notions of an efficient algorithm is not sufficient anymore and we need to design more efficient algorithms and data structures. This encourages researchers to ask whether it is possible to solve the problems using just sublinear amount of resources? what does that mean exactly when we say ‘sublinear resources’?
We can think of sublinear algorithms in the area of big data in three different categories:

Continue reading

More with Less: Deduplication

Deduplication is a critical technology for modern production and research systems. In many domains, such as cloud computing, it is often taken for granted [0]. Deduplication magnifies the amount of data you can store in memory [1], on disk [2], and in transmission across the network [3]. It comes at the cost of more CPU cycles, and potentially more IO operations at origin and destination storage backends. Microsoft [4], IBM [5], EMC [6], Riverbed [7], Oracle [8], NetApp [9], and other companies tout deduplication as a major feature and differentiator across the computing industry. So what exactly is deduplication?

Continue reading

The Scary Reality of Identity Theft

One of the most basic philosophical questions stems from attempting to identify oneself, with the first step of proving you actually exist. René Descartes provides a proof with

Cogito ergo sum

meaning, “I think, therefore I am.” The intuition is that the mere fact of thinking forms a proof that you exist. But who or what are you exactly? What identifies you? How can we definitively prove you are what you claim to be? Who you claim to be? The problem of identity is an incredibly hard one—how do you know a letter in the mail is from the person that signed it? How do you know a text was written by the owner of a certain phone? How do you know an email comes from the person that owns an email address? This is a fundamental problem that faces the fields of computer science and cryptography, and it is incredibly hard to solve.

Continue reading

Welcome Theory Student Bloggers

Several new bloggers will be joining the XRDS student blog over the next few weeks, expanding its scope to theory related stuff and beyond! Welcome, and looking forward to your posts (posts tagged “Theory” will appear on the theory of computing blog aggregator).

The XRDS blog has been up and running for six months now, with posts written by and for CS students on a range of topics (security, HCI, being a post-doc in Paris to name a few…). The motivation behind the blog is similar to the one described here – to help carry the conversation for the student community, a place to share thoughts and get up to date. If you like the idea and would like to support this initiative, please add us to your blogroll.

Tomorrow – internships!

Big Issue on Big Data – Theory Highlights

We’d like to invite you to check out the following theory-oriented articles in our Fall 2012 issue dedicated to Big Data:

This issue of XRDS was edited by a multidisciplinary team of students – Aditya Parameswaran of Stanford, Andrew Cron of Duke and Huy L. Nguyen of Princeton. From the editors’ letter:

“It has been an interesting time for big data with innovations coming simultaneously from theorists, system builders, and scientists or application designers. We hope to provide readers with an idea of the interplay between developments in these three different communities … [that] together drive forward the development of big data analysis.”

Hope you enjoy reading this issue (feel free to let us know what you think in the comments)!