Ultra-Efficient via Sublinearity

For a long time in the area of design and analysis of algorithms, when we have said that an algorithm is efficient we meant that it runs in time polynomial in the input size n and finding a linear time algorithm have been considered as the most efficient way to solve a problem. It’s been because of this assumption that we need at least to consider and read all the input to solve the problem. This way it seems that we cannot do much better! But nowadays the data sets are growing fast in various areas and applications in a way that it hardly fits in storage and in this case even linear time is prohibitive. To work with this massive amount of data, the traditional notions of an efficient algorithm is not sufficient anymore and we need to design more efficient algorithms and data structures. This encourages researchers to ask whether it is possible to solve the problems using just sublinear amount of resources? what does that mean exactly when we say ‘sublinear resources’?
We can think of sublinear algorithms in the area of big data in three different categories:

Continue reading

Laying the Foundation for a Common Ground

This week, the Simons Institute hosted a workshop entitled Unifying Theory and Experiment for Large-Scale Networks. The goal of the workshop is to bring together researchers involved in various large networks problems to discuss both the theoretical models and the empirical process for testing and validating them. Even further, the “unifying” in the title suggests a forum on where the ends of the spectrum may meet.

Continue reading

Algorithms Fit for Compilation?

Most researchers differ in their workflow.  For researchers in the algorithms world (or at least, those I know), the work is in the design. Our hours are spent at the blueprint stage. Algorithms are designed, improved, reformulated, or reapplied in different problems, mostly on paper. But this is unarguably only the first stage in successfully developing a new algorithm. There are still the matters of proving and testing the algorithm, and submitting the result to the public. When are we done drafting our blueprint? How do we package and ship the blueprint to the engineers and construction team?

Let’s address the more straight-forward question first. What is the best way to present an algorithm? How descriptive and specific should it be? Should it be entirely self-contained or, for instance, could we have a pointer to a “… subroutine of choice”? Is implementability more important than readability?

Continue reading

Talks of Talks of Estimators

This week, TCS+ hosted a talk by Greg Valiant via a Google+ hangout.  Valiant gave a talk on his work with brother, Paul, on an efficient estimator for entropy and support size of an unknown probability distribution requiring only O(n/log n) samples, where n is a bound on the support size of the distribution.  This work diverges from the existing literature by demonstrating that the estimate can be obtained with a concrete linear program; an algorithm which outputs a distribution very similar to the unknown distribution with respect to certain statistical properties.

Continue reading

Demystifying the Fourier magic

Over the years I have gotten used to seeing many theorems in theoretical computer science being proved using discrete Fourier analysis. The Walsh-Fourier (Hadamard) transform of Boolean functions proved to be extremely useful in virtually every subfield of theoretical computer science, including PCPs, property testing, pseudorandomness, and communication complexity. As it turns out, many seemingly hard problems can be solved by writing the Walsh-Fourier expansion and using basic theorems of harmonic analysis.

While I have gotten quite comfortable using Fourier analysis when trying to tackle a problem, and even though I developed a pretty good hunch for which cases Fourier analysis should yield nice results – I have to admit that it took me a long time to begin to unravel the Fourier magic; that is, to understand what is so special about the Walsh-Fourier basis that makes it so powerful.

For the rest of this post, I assume that the readers have some proficiency in Fourier analysis. However, in order to make this post (hopefully) a bit more accessible to readers who are new to harmonic analysis, I would like to dedicate the rest of this paragraph to stating out some basic definitions and facts regrading the Fourier transform. Say we have a real function on the hypercube $latex {f:\mathbb{Z}_2^n \rightarrow \mathbb{R}}&fg=000000$. The Walsh-Fourier expansion of $latex {f}&fg=000000$ is defined as follows:

$latex \displaystyle f(x) = \sum_{S \subseteq [n]}\hat{f}(S)\chi_S(x), \ \ \ \ \ (1)&fg=000000$

where the characters $latex {\chi_S: \mathbb{Z}_2^n \rightarrow \mathbb{R}}&fg=000000$ are defined by

$latex \displaystyle \chi_S(x_1, \ldots, x_n) = (-1)^{\sum_{i \in S} x_i} \ \ \ \ \ (2)&fg=000000$

(note that the summation in Eq. (2) is over $latex {\mathbb{Z}_2^n}&fg=000000$, and that the coefficient of each character $latex {\chi_S}&fg=000000$ is $latex {\hat{f}(S) = \langle f, \chi_S \rangle}&fg=000000$). The way I like to think of the Fourier transform is simply as a change of basis for $latex {f}&fg=000000$, wherein (hopefully) the representation of $latex {f}&fg=000000$ is simpler, and hence easier to analyse. It is also interesting to note that if $latex {f}&fg=000000$ is defined on $latex {\{-1,1\}^n}&fg=000000$ instead on $latex {\mathbb{Z}_2^n}&fg=000000$, then each character $latex {\chi_S: \{-1,1\}^n \rightarrow \mathbb{R}}&fg=000000$ is the monomial $latex {\chi_S(x) = \prod_{i \in S} x_i}&fg=000000$, thus the Fourier expansion is in fact the representation of $latex {f}&fg=000000$ as a multilinear polynomial over $latex {\mathbb{R}}&fg=000000$.

So sure, we have a relatively good understanding of how polynomials behave, and as the foregoing discussion shows, the Fourier transform is a natural way of looking at a function as a multilinear polynomial. But why specifically this base? What is so unique about the base of parities? Are there other bases which are just as effective? Here is my point of view, which I learned from Or Meir: Let $latex {\mathcal{F}}&fg=000000$ be the linear space of functions $latex {f: \mathbb{Z}_2^n\rightarrow\mathbb{R}}&fg=000000$. For every $latex {w\in\mathbb{Z}_2^n}&fg=000000$, consider the linear operator $latex {\sigma_w}&fg=000000$ that maps a function $latex {f \in \mathcal{F}}&fg=000000$ to the function $latex {f’ \in \mathcal{F}}&fg=000000$ such that $latex {f'(x) = f(x+w)}&fg=000000$. Such operators are called shift operators. It turns out that in many natural problems that appear in theoretical computer science (but also in functional analysis, Hardy spaces, the theory of abelian varieties, and the theory of symbolic dynamics) there is a fundamental, underlying need to analyze the effects that such operators have on Boolean functions. Now, a cool property of the Fourier basis (namely, the shift theorem) is the fact that it simultaneously diagonalizes all of the shift operators. Details follow.

Since we are dealing with a finite discrete domain (the Boolean hypercube), we can view the functions in $latex {\mathcal{F}}&fg=000000$ as vectors in a subspace (i.e, $latex {f\in \mathcal{F}}&fg=000000$ can be viewed as a vector $latex {v_f \in \mathbb{R}^{2^n}}&fg=000000$, where the entries of $latex {v_f}&fg=000000$ are the values of the truth table of $latex {f}&fg=000000$). Then, the shift operators are linear operations on vectors, hence they can be viewed as matrices. As we mentioned before, we wish to simplify the representation of the functions and operators that we study, in order to make them easier to analyse. The best we can hope for is to diagonalize the matrix of the operator we inspect, and this is exactly what the Fourier basis does: In the Fourier basis, all of the shift operators are diagonal matrices.

More generally, the Fourier basis diagonalizes the convolution operator, which also underlies the structure of many natural problems in the analysis of Boolean functions. To see the power of the aforementioned diagonalization, let’s look at an important corollary of it: the convolution theorem. Given functions $latex {f,g \in \mathcal{F}}&fg=000000$, their convolution $latex {f * g}&fg=000000$ is also a function in $latex {\mathcal{F}}&fg=000000$, defined by

$latex \displaystyle [f*g](x) = \sum_{y \in \mathbb{Z}_2^n}f(y)g(x-y). \ \ \ \ \ (3)&fg=000000$

The convolution theorem for $latex {\mathbb{Z}_2^n}&fg=000000$ states that the Fourier transform of the pointwise multiplication of two functions is equal to the convolution of their Fourier transforms; that is,

$latex \displaystyle \widehat{f \cdot g} = \hat{f} * \hat{g}. \ \ \ \ \ (4)&fg=000000$

The convolution theorem, as well as the other aforementioned properties of the Fourier transform, makes it very robust for the analysis of Boolean functions. Let me provide an example from my own work (joint with Omer Tamuz). We want to test whether a function $latex {f \in \mathcal{F}}&fg=000000$ is Boolean or not (i.e, whether its image is contained in a set of cardinality 2, say $latex {\{-1,1\}}&fg=000000$). A simple, though crucial, observation is that a function $latex {f \in \mathcal{F}}&fg=000000$ is Boolean if and only if the convolution of $latex {f}&fg=000000$ with itself (which sometimes referred to as the autocorrelation of $latex {f}&fg=000000$) is equal to the delta function (i.e., the function that gets 1 at 0, and gets 0 elsewhere). To see this, note that $latex {f}&fg=000000$ is Boolean if and only if $latex {f^2 = 1}&fg=000000$, apply the convolution theorem, and use the fact that the Fourier transform of the constant 1 function is the delta function. Hence, using the Fourier expansion we are able to give a characterization of Boolean functions that is efficiently testable. This is a general theme: whenever a function is “correlated with its shifted-self”, it begs for a Fourier based approach.

Last, I would like to give a brief taste of “heavier” tools in the same spirit: We can generalize the discussion on efficient representations even further. Fourier analysis is a special case of the representation theory of finite groups. This theory deals with the more general space of functions $latex {f:G\rightarrow \mathbb{C}}&fg=000000$ where $latex {G}&fg=000000$ is a finite group. It allows us to find a basis that makes the analysis of shift operators easier, even though for general groups the shift operators aren’t always diagonalizable. Another possible generalization can be done by analyzing tuples of shifts (e.g., $latex {f(x),f(x+w_1),f(x+w_2),f(x+w_1+w_2)}&fg=000000$). In many cases, such analysis cannot be done by standard Fourier analysis, and higher-order Fourier analysis is needed (for more details, see Terry Tao’s excellent survey on this topic).

Open Problems: Returning to the original question at the beginning of the post: are there other bases of functions that will work as well as the Fourier basis? Can we perhaps mimic the success of wavelets in classical harmonic analysis, and use such bases in theoretical computer science?