The Evolution of Local Graph Partitioning

As a follow-up to my previous post on the discussion of where theory and experimentation meet in the study of large-scale networks, I would like to discuss in more detail one of the empirically best-performing algorithms which also has a sound theoretical background: spectral partitioning. In this post I will examine the history of the problem, outline some key results, and present some future ideas for the problem. Continue reading

How to Hack a Sketchy e-voting System

The quintessence of an e-voting transaction is to be secure. In the e-voting context, security issues are very subtle. This is because there are features that clash with each other. For example, guaranteeing anonymity makes it harder to track election fraud. In addition, security in e-voting is highly related to the type of the technology used during the process. In distance e-voting, the voter can cast his vote from his personal computer by sending it to a central server via the Internet. The electronic, network-based nature of the latter makes it susceptible to a wide range of attacks. Continue reading

Theory Behind Big Data

As a PhD student who does research on theory and algorithms for massive data analysis, I am interested in exploring current and future challenges in this area, which I’d like to share it here. There are two major points of view when we talk about big data problems:

One is more focused on industry and business aspects of big data, and includes many IT companies who work on analytics. These companies believe that the potential of big data lies in its ability to solve business problems and provide new business opportunities. To get the most from big data investments, they focus on questions which companies would like to answer. They view big data not as a technological problem but as a business solution, and their main goals are to visualize, explore, discover and predict.

Continue reading

On Domain-Specific Language Usage

Many years ago (I will not reveal my age), I began working on my PhD thesis concerning the area of Domain-Specific Languages (DSLs). Research was booming at the time and many research articles stated in their introduction that DSLs are very useful and increase productivity, by reducing lines of code etc. All these claims seemed logical to me, but I always considered them something like urban legends. We all know that they are correct, but cannot easily prove it. Keeping that on the back of my mind, I searched for a way to bring the “legend” down to measurable facts that will provide solid motivation for the importance DSLs in every day programming. I decided to do a simple experiment that measures DSL usage in open source programs. Continue reading