Demystifying the Fourier magic

Over the years I have gotten used to seeing many theorems in theoretical computer science being proved using discrete Fourier analysis. The Walsh-Fourier (Hadamard) transform of Boolean functions proved to be extremely useful in virtually every subfield of theoretical computer science, including PCPs, property testing, pseudorandomness, and communication complexity. As it turns out, many seemingly hard problems can be solved by writing the Walsh-Fourier expansion and using basic theorems of harmonic analysis.

While I have gotten quite comfortable using Fourier analysis when tackling a problem, and have even developed a pretty good hunch for when Fourier analysis should yield nice results, I have to admit that it took me a long time to begin to unravel the Fourier magic; that is, to understand what is so special about the Walsh-Fourier basis that makes it so powerful.

For the rest of this post, I assume that the readers have some proficiency in Fourier analysis. However, in order to make this post (hopefully) a bit more accessible to readers who are new to harmonic analysis, I would like to dedicate the rest of this section to stating some basic definitions and facts regarding the Fourier transform. Say we have a real function on the hypercube $latex {f:\mathbb{Z}_2^n \rightarrow \mathbb{R}}&fg=000000$. The Walsh-Fourier expansion of $latex {f}&fg=000000$ is defined as follows:

$latex \displaystyle f(x) = \sum_{S \subseteq [n]}\hat{f}(S)\chi_S(x), \ \ \ \ \ (1)&fg=000000$

where the characters $latex {\chi_S: \mathbb{Z}_2^n \rightarrow \mathbb{R}}&fg=000000$ are defined by

$latex \displaystyle \chi_S(x_1, \ldots, x_n) = (-1)^{\sum_{i \in S} x_i} \ \ \ \ \ (2)&fg=000000$

(note that the summation in Eq. (2) is taken in $latex {\mathbb{Z}_2}&fg=000000$, i.e., modulo 2, and that the coefficient of each character $latex {\chi_S}&fg=000000$ is $latex {\hat{f}(S) = \langle f, \chi_S \rangle}&fg=000000$). The way I like to think of the Fourier transform is simply as a change of basis for $latex {f}&fg=000000$, wherein (hopefully) the representation of $latex {f}&fg=000000$ is simpler, and hence easier to analyse. It is also interesting to note that if $latex {f}&fg=000000$ is defined on $latex {\{-1,1\}^n}&fg=000000$ instead of on $latex {\mathbb{Z}_2^n}&fg=000000$, then each character $latex {\chi_S: \{-1,1\}^n \rightarrow \mathbb{R}}&fg=000000$ is the monomial $latex {\chi_S(x) = \prod_{i \in S} x_i}&fg=000000$; thus the Fourier expansion is in fact the representation of $latex {f}&fg=000000$ as a multilinear polynomial over $latex {\mathbb{R}}&fg=000000$.
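To make these definitions concrete, here is a minimal Python sketch (the brute-force computation and the choice of example function are mine, purely for illustration) that computes the Walsh-Fourier coefficients of a small function on the hypercube and verifies that the expansion in Eq. (1) reconstructs it:

```python
import itertools

n = 3

def chi(S, x):
    # Character chi_S(x) = (-1)^{sum_{i in S} x_i}, the exponent taken mod 2 (Eq. (2)).
    return (-1) ** (sum(x[i] for i in S) % 2)

def f(x):
    # Hypothetical example function: majority on 3 bits, written in {-1,1} form.
    return 1 if sum(x) >= 2 else -1

points = list(itertools.product((0, 1), repeat=n))
subsets = [tuple(i for i in range(n) if (mask >> i) & 1) for mask in range(2 ** n)]

# Fourier coefficients: f_hat(S) = <f, chi_S> = 2^{-n} * sum_x f(x) * chi_S(x).
f_hat = {S: sum(f(x) * chi(S, x) for x in points) / 2 ** n for S in subsets}

# Sanity check: the expansion f(x) = sum_S f_hat(S) * chi_S(x) reconstructs f.
for x in points:
    assert abs(f(x) - sum(f_hat[S] * chi(S, x) for S in subsets)) < 1e-9
```

(Here the inner product is the normalized one, $latex {\langle f, g \rangle = 2^{-n}\sum_x f(x)g(x)}&fg=000000$, which is the convention under which the coefficients in Eq. (1) come out right.)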

So sure, we have a relatively good understanding of how polynomials behave, and as the foregoing discussion shows, the Fourier transform is a natural way of looking at a function as a multilinear polynomial. But why specifically this basis? What is so unique about the basis of parities? Are there other bases which are just as effective? Here is my point of view, which I learned from Or Meir: Let $latex {\mathcal{F}}&fg=000000$ be the linear space of functions $latex {f: \mathbb{Z}_2^n\rightarrow\mathbb{R}}&fg=000000$. For every $latex {w\in\mathbb{Z}_2^n}&fg=000000$, consider the linear operator $latex {\sigma_w}&fg=000000$ that maps a function $latex {f \in \mathcal{F}}&fg=000000$ to the function $latex {f' \in \mathcal{F}}&fg=000000$ such that $latex {f'(x) = f(x+w)}&fg=000000$. Such operators are called shift operators. It turns out that in many natural problems that appear in theoretical computer science (but also in functional analysis, Hardy spaces, the theory of abelian varieties, and the theory of symbolic dynamics) there is a fundamental, underlying need to analyse the effects that such operators have on Boolean functions. Now, a cool property of the Fourier basis (namely, the shift theorem) is the fact that it simultaneously diagonalizes all of the shift operators. Details follow.

Since we are dealing with a finite discrete domain (the Boolean hypercube), we can view the functions in $latex {\mathcal{F}}&fg=000000$ as vectors in a finite-dimensional vector space (i.e., $latex {f\in \mathcal{F}}&fg=000000$ can be viewed as a vector $latex {v_f \in \mathbb{R}^{2^n}}&fg=000000$, whose entries are the values of the truth table of $latex {f}&fg=000000$). The shift operators are then linear operators on this space, and hence can be viewed as matrices. As we mentioned before, we wish to simplify the representation of the functions and operators that we study, in order to make them easier to analyse. The best we can hope for is to diagonalize the matrix of the operator we inspect, and this is exactly what the Fourier basis does: in the Fourier basis, all of the shift operators are diagonal matrices.
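To see this concretely, here is a small numerical sketch (my own illustration, not from any particular source) that builds the matrix of each shift operator $latex {\sigma_w}&fg=000000$ and checks that changing to the basis of characters makes it diagonal:

```python
import numpy as np

n = 3
N = 2 ** n  # the points of the cube, encoded as n-bit integers

def chi(S, x):
    # chi_S(x) = (-1)^{|S & x|}, with S and x encoded as bitmasks.
    return -1 if bin(S & x).count("1") % 2 else 1

# H[x, S] = chi_S(x): the columns of H are the characters, i.e. the Fourier basis.
H = np.array([[chi(S, x) for S in range(N)] for x in range(N)])

for w in range(N):
    # Matrix of sigma_w, which maps f to f' with f'(x) = f(x + w) (addition is XOR).
    M = np.array([[1 if y == x ^ w else 0 for y in range(N)] for x in range(N)])
    # Change of basis: H^{-1} M H. Since H H = 2^n I, the inverse of H is H / 2^n.
    D = H @ M @ H / N
    # Every sigma_w becomes diagonal in this basis, with eigenvalues chi_S(w).
    assert np.allclose(D, np.diag([chi(S, w) for S in range(N)]))
```

Note that the same matrix $latex {H}&fg=000000$ works for all $latex {2^n}&fg=000000$ shift operators at once, which is exactly the simultaneous diagonalization mentioned above.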

More generally, the Fourier basis diagonalizes the convolution operator, which also underlies the structure of many natural problems in the analysis of Boolean functions. To see the power of the aforementioned diagonalization, let’s look at an important corollary of it: the convolution theorem. Given functions $latex {f,g \in \mathcal{F}}&fg=000000$, their convolution $latex {f * g}&fg=000000$ is also a function in $latex {\mathcal{F}}&fg=000000$, defined by

$latex \displaystyle [f*g](x) = \sum_{y \in \mathbb{Z}_2^n}f(y)g(x-y). \ \ \ \ \ (3)&fg=000000$

The convolution theorem for $latex {\mathbb{Z}_2^n}&fg=000000$ states that the Fourier transform of the pointwise multiplication of two functions is equal to the convolution of their Fourier transforms; that is,

$latex \displaystyle \widehat{f \cdot g} = \hat{f} * \hat{g}. \ \ \ \ \ (4)&fg=000000$
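With the normalized transform $latex {\hat{f}(S) = 2^{-n}\sum_x f(x)\chi_S(x)}&fg=000000$ and the convolution of Eq. (3) applied on the Fourier side (identifying subsets of $latex {[n]}&fg=000000$ with vectors in $latex {\mathbb{Z}_2^n}&fg=000000$), Eq. (4) holds exactly, and we can check it numerically. The following sketch is my own illustration under these conventions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
N = 2 ** n

def fourier(f):
    # f_hat[S] = 2^{-n} * sum_x f[x] * chi_S(x), with chi_S(x) = (-1)^{|S & x|}.
    return np.array([sum(f[x] * (-1) ** bin(S & x).count("1") for x in range(N))
                     for S in range(N)]) / N

def convolve(u, v):
    # [u * v](S) = sum_T u[T] * v[S - T]; over Z_2^n, subtraction is XOR (Eq. (3)).
    return np.array([sum(u[T] * v[S ^ T] for T in range(N)) for S in range(N)])

f = rng.standard_normal(N)  # two arbitrary real-valued functions on the cube
g = rng.standard_normal(N)

# Convolution theorem (Eq. (4)): the transform of the pointwise product
# equals the convolution of the transforms.
assert np.allclose(fourier(f * g), convolve(fourier(f), fourier(g)))
```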

The convolution theorem, as well as the other aforementioned properties of the Fourier transform, makes it a very powerful tool for the analysis of Boolean functions. Let me provide an example from my own work (joint with Omer Tamuz). We want to test whether a function $latex {f \in \mathcal{F}}&fg=000000$ is Boolean or not (i.e., whether its image is contained in a set of cardinality 2, say $latex {\{-1,1\}}&fg=000000$). A simple, though crucial, observation is that a function $latex {f \in \mathcal{F}}&fg=000000$ is Boolean if and only if the convolution of its Fourier transform with itself (which is sometimes referred to as the autocorrelation of $latex {\hat{f}}&fg=000000$) is equal to the delta function (i.e., the function that takes the value 1 at the empty set, and 0 elsewhere). To see this, note that $latex {f}&fg=000000$ is Boolean if and only if $latex {f^2 = 1}&fg=000000$, apply the convolution theorem (Eq. (4)) to $latex {f \cdot f}&fg=000000$, and use the fact that the Fourier transform of the constant 1 function is the delta function. Hence, using the Fourier expansion we are able to give a characterization of Boolean functions that is efficiently testable. This is a general theme: whenever a function is “correlated with its shifted self”, it begs for a Fourier-based approach.
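To make the Booleanity characterization concrete, here is a sketch continuing the conventions of the previous one (again my own illustrative example, not the actual tester from the paper): the autocorrelation of $latex {\hat{f}}&fg=000000$ is the delta function precisely when $latex {f}&fg=000000$ is Boolean.

```python
import numpy as np

n = 3
N = 2 ** n

def fourier(f):
    # Normalized Walsh-Fourier transform, as in the previous sketch.
    return np.array([sum(f[x] * (-1) ** bin(S & x).count("1") for x in range(N))
                     for S in range(N)]) / N

def autocorrelation(u):
    # u convolved with itself: [u * u](S) = sum_T u[T] * u[S - T].
    return np.array([sum(u[T] * u[S ^ T] for T in range(N)) for S in range(N)])

delta = np.zeros(N)
delta[0] = 1.0  # 1 at the empty set, 0 elsewhere

boolean_f = np.array([1.0 if bin(x).count("1") >= 2 else -1.0 for x in range(N)])
arbitrary_f = np.arange(N, dtype=float)  # not {-1,1}-valued

assert np.allclose(autocorrelation(fourier(boolean_f)), delta)
assert not np.allclose(autocorrelation(fourier(arbitrary_f)), delta)
```

An actual tester would of course only query $latex {f}&fg=000000$ at a few points rather than read its entire truth table; the point here is just the characterization.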

Lastly, I would like to give a brief taste of “heavier” tools in the same spirit. We can generalize the discussion on efficient representations even further: Fourier analysis is a special case of the representation theory of finite groups, which deals with the more general space of functions $latex {f:G\rightarrow \mathbb{C}}&fg=000000$ where $latex {G}&fg=000000$ is a finite group. It allows us to find a basis that makes the analysis of shift operators easier, even though for general (non-abelian) groups the shift operators cannot be simultaneously diagonalized; instead, one obtains a basis in which they are all block-diagonal. Another possible generalization is obtained by analysing tuples of shifts (e.g., $latex {f(x),f(x+w_1),f(x+w_2),f(x+w_1+w_2)}&fg=000000$). In many cases, such analysis cannot be done by standard Fourier analysis, and higher-order Fourier analysis is needed (for more details, see Terry Tao’s excellent survey on this topic).

Open Problems: Returning to the question posed at the beginning of the post: are there other bases of functions that work as well as the Fourier basis? Can we perhaps mimic the success of wavelets in classical harmonic analysis, and use such bases in theoretical computer science?

The Internship Hunt

Around this time of year many students look for summer internships. Some want work experience so they can land a better job after graduation. Others hope to advance their research or improve their academic qualifications. Everyone appreciates the extra cash.

I’m now in my third year as a theory PhD student. Last summer I interned at a financial company, and this summer I will intern at a tech firm. Because I had worked in software for a number of years before grad school, I was not looking for job experience. My motivation (besides the extra cash) was to get a taste of working on research projects in industry. Also, I find that immersing myself in “real-world” problems from time to time adds breadth and perspective to my theory research.

Whatever your motivations for seeking an internship, here are some suggestions that I’ve found useful.

Start early. This is mostly common sense so I won’t belabor the point, but consider this: crunch time at the end of the semester is busy enough without having to prepare for tech interviews, crank out a programming exercise, or schedule on-site visits. What counts as early varies, so ask the places you’re interested in.

Use your network. As pointed out in Lora Oehlberg’s recent blog post, your personal network is the first place to look for job opportunities. My first internship in high school, my first job out of college, and my internship from last summer all came through personal connections. People you know who work at a company you’re interested in can help you through the process and even vouch for you. If you know people who’ve done an internship you’re thinking about, talk to them about what to expect.

Practice. I was nearly eliminated from consideration for my internship last summer because I wasn’t prepared for the first tech interview. A friend of mine in the entertainment software industry told me that whenever he goes on the job market, he expects to flub his first interview from lack of practice. I’m not suggesting you should plan to flub interviews; just know that doing well on a tech interview takes mental readiness.

How do you practice? This is where advice from your connections (see previous point) comes in handy, especially from fellow students who have gone through the process before. I’ve found that working through online programming contests or problems at Project Euler helps get me into game shape.

Be candid and upfront. It’s tempting to be cagey with potential employers about competing offers, availability for the summer, or other constraints. Don’t do this. Last spring I interviewed with two companies and got two offers. I was forthcoming about this during the process, so when I had to turn one company down, I was able to keep the door open for the future. That company offered me a position for this summer, and the recruiter specifically mentioned my candor as a factor.

Do you have ideas for successful internship hunting? Cool internship experiences you want to share? Let us know in the comments.

Welcome Theory Student Bloggers

Several new bloggers will be joining the XRDS student blog over the next few weeks, expanding its scope to theory-related topics and beyond! Welcome, and we look forward to your posts (posts tagged “Theory” will appear on the theory of computing blog aggregator).

The XRDS blog has been up and running for six months now, with posts written by and for CS students on a range of topics (security, HCI, and being a post-doc in Paris, to name a few). The motivation behind the blog is similar to the one described here: to help carry the conversation for the student community, and to provide a place to share thoughts and stay up to date. If you like the idea and would like to support this initiative, please add us to your blogroll.

Tomorrow – internships!

Big Issue on Big Data – Theory Highlights

We’d like to invite you to check out the theory-oriented articles in our Fall 2012 issue dedicated to Big Data.

This issue of XRDS was edited by a multidisciplinary team of students – Aditya Parameswaran of Stanford, Andrew Cron of Duke, and Huy L. Nguyen of Princeton. From the editors’ letter:

“It has been an interesting time for big data with innovations coming simultaneously from theorists, system builders, and scientists or application designers. We hope to provide readers with an idea of the interplay between developments in these three different communities … [that] together drive forward the development of big data analysis.”

Hope you enjoy reading this issue (feel free to let us know what you think in the comments)!