]]>

As a first post in what may (or may not) turn out to be a new sequence titled “questions we should have solved by now,” let’s see how far we can simplify the problem of explicit UTS and still *not* solve it. Specifically, what if our graph has a small diameter? For example, expander graphs have logarithmic diameter, which should make the problem much easier, shouldn’t it? Well, the shortest UTS for expanders is still the $n^{O(\log n)}$-long one based on Nisan’s generators. If logarithmic diameter doesn’t save us, perhaps constant diameter will? How about diameter one? Yes, can we have a UTS for a clique? You may be thinking – the clique on $n$ vertices is a *single* graph, how hard can it be to come up with a UTS for a single graph? It turns out that it is not that easy. The shortest explicit UTS we know for the clique is … wait for it, wait for it … still the $n^{O(\log n)}$-long one based on Nisan’s generators! How come? There is indeed only one clique, but there can be many possible ways to label the edges of the clique, and the UTS should cover the clique for all such labellings.
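To make the role of the labelling concrete, here is a minimal Python sketch. It assumes the usual model in which every vertex numbers its own incident edges (here `0` to `n-2`) and a traversal sequence is just a list of such labels. The function names and the brute-force check are illustrative assumptions, not taken from any paper, and the enumeration is only feasible for tiny cliques.

```python
from itertools import permutations, product


def covers_clique(sequence, labelling, start=0):
    """Follow `sequence` of edge labels from `start`; True if every vertex gets visited.

    labelling[v][i] is the neighbour reached from vertex v along its edge labelled i.
    """
    current = start
    visited = {current}
    for label in sequence:
        current = labelling[current][label]
        visited.add(current)
    return len(visited) == len(labelling)


def all_clique_labellings(n):
    """Enumerate every way each vertex can order its n-1 neighbours."""
    per_vertex = [
        list(permutations([u for u in range(n) if u != v]))
        for v in range(n)
    ]
    return product(*per_vertex)


def is_universal_for_clique(sequence, n):
    """Brute force: the sequence must cover the clique for every labelling and start."""
    return all(
        covers_clique(sequence, labelling, start)
        for labelling in all_clique_labellings(n)
        for start in range(n)
    )


# One adversarial labelling of the triangle: label 0 bounces between vertices 0 and 1.
bad_labelling = [[1, 2], [0, 2], [0, 1]]
stuck = covers_clique([0] * 10, bad_labelling)   # vertex 2 is never reached
tiny_uts = is_universal_for_clique([0, 1, 1], 3)
```

The number of labellings grows as $((n-1)!)^n$, so this check says nothing about how to find short sequences for large $n$; it only illustrates why a single graph still hides many cases.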

Of course, we all know how difficult simple-looking problems can really be, but please share with us your favorite problems that “should have been solved by now.”

]]>

- Moses Charikar joined in Fall 2015.
- Omer Reingold and Mary Wootters joined in Fall 2016.
- And we have three brand-new hires: Tengyu Ma (joint with Statistics), starting in the fall; and Aviad Rubinstein and Li-Yang Tan, both starting in 2018.

(Greg Valiant, who’s been here all of four years, now counts as an old-timer.)

Meanwhile, our MS&E department recently hired Aaron Sidford, who started in Fall 2016 and also holds a courtesy appointment in computer science, and our math department hired Jan Vondrak!

We’ve also started a new theory postdoc program, the Motwani Postdoctoral Fellowship. We have four postdocs starting next year: Clément Canonne, Rad Niazadeh, Alistair Stewart, and Avishay Tal.

Welcome to all the new members of the Stanford theory group!

]]>

Let me first loosely define PRFs: Consider for example a family $\{f_k\}$ of Boolean functions on $n$-bit strings that is efficient – it is easy (polynomial time in $n$) to sample a key $k$ that describes a random function $f_k$ from the family, and given $k$ and $x$ it is easy to evaluate $f_k(x)$ (in fact, the definition only makes sense if we have an ensemble of such families, one for every value of $n$). Such a family is pseudorandom if it is hard to distinguish $f_k$ from a completely random (uniformly chosen) Boolean function on $n$ bits. In other words, an efficient distinguisher that gets black-box (oracle) access to a function cannot tell if it is $f_k$ or a completely random function.

PRFs allow parties to effectively share an exponential number of pseudorandom bits (by sharing the key $k$, the parties get access to the exponentially many outputs of $f_k$). Very powerful. Yet, GGM had to explain why Kolmogorov-complexity-based definitions are different, why PRFs are different from one-way functions and why PRFs are different from pseudorandom-bit generators. I can’t imagine any paper using PRFs today that would have to explain such basic distinctions. But even more interestingly, GGM goes to great lengths to explain why PRFs are different from the following stateful simulation of a black-box random function: Assume that every past query and its answer are recorded. For every new query, check if the query was previously asked and if so answer consistently; otherwise answer with a random bit and record the new pair. While I imagine that such a simulation could be useful in some cases, the possibility of such a simulation takes nothing away from the importance of PRFs. Obvious, right? Well, it clearly wasn’t obvious to an anonymous reviewer when the paper was first submitted (and rejected).
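The contrast between the stateful simulation and a keyed function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LazyRandomFunction` and `keyed_bit` are hypothetical names, and HMAC-SHA256 is used as a stand-in that is commonly modelled as a PRF; it is not the GGM construction.

```python
import hashlib
import hmac
import secrets


class LazyRandomFunction:
    """Stateful simulation of a random Boolean function: remember every answer."""

    def __init__(self):
        self._table = {}  # grows with the number of distinct queries

    def query(self, x: bytes) -> int:
        if x not in self._table:
            # New query: answer with a fresh random bit and record the pair.
            self._table[x] = secrets.randbits(1)
        return self._table[x]


def keyed_bit(key: bytes, x: bytes) -> int:
    """Stateless stand-in for f_k(x): one output bit derived from a short key.

    Assumption: HMAC-SHA256 behaves like a PRF; this is not GGM's tree construction.
    """
    digest = hmac.new(key, x, hashlib.sha256).digest()
    return digest[0] & 1


simulated = LazyRandomFunction()
first = simulated.query(b"0101")
second = simulated.query(b"0101")   # answered consistently from the table

shared_key = secrets.token_bytes(32)
alice_view = keyed_bit(shared_key, b"0101")
bob_view = keyed_bit(shared_key, b"0101")  # same bit, no shared table needed
```

Two parties running the stateful simulation would have to keep their tables synchronized, whereas two parties holding the same short key agree on every output without further communication, which is the point of the distinction drawn above.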

The moral of the story is not that conference committees can be stupid, but rather that like in any other area of life, the work of the pioneers is hard. They need to justify and convince by explaining things that in retrospect seem obvious. And perhaps it is OK. After all, for every GGM there are plenty of papers offering ridiculous new notions. Nevertheless, where would we be without those innovators that keep on breaking new ground for the rest of us to follow?

]]>

Say $A, B, C$ are $n$-sided dice, i.e. each takes values in $\{1, \ldots, n\}$. We say that $A$ beats $B$ if $\Pr[A > B] > \Pr[B > A]$. $A, B, C$ are *intransitive* if $A$ beats $B$, $B$ beats $C$ and $C$ beats $A$.
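A small Python check makes the definition tangible. It assumes that "beats" means that a roll of the first die exceeds a roll of the second more often than the reverse, with all faces equally likely. The three dice below are a hand-built example with faces in $\{1, \ldots, 6\}$ (they are not Efron's original dice), and the function names are illustrative.

```python
from itertools import product


def beats(a, b):
    """True if die `a` wins against die `b` more often than it loses (ties ignored)."""
    wins = sum(1 for x, y in product(a, b) if x > y)
    losses = sum(1 for x, y in product(a, b) if x < y)
    return wins > losses


def is_intransitive(a, b, c):
    """True if a beats b, b beats c, and c beats a."""
    return beats(a, b) and beats(b, c) and beats(c, a)


# Hand-built 6-sided dice with faces in {1, ..., 6}; each has face sum 21.
A = (1, 4, 4, 4, 4, 4)
B = (3, 3, 3, 3, 3, 6)
C = (2, 2, 2, 5, 5, 5)

cycle = is_intransitive(A, B, C)
```

Counting the 36 face pairs by hand, $A$ wins 25 of them against $B$, while $B$ wins 21 against $C$ and $C$ wins 21 against $A$, so the three dice form a cycle.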

It is somewhat counterintuitive that this can happen and I’ve used this paradox while introducing probability to undergraduates. Now the question:

$A, B, C$ are randomly chosen. If $A$ beats $B$ and $B$ beats $C$, what is the likelihood that $A$ beats $C$? Experiments suggest that the probability tends to $1/2$. The goal of the polymath project is to prove this.

The original intransitive dice were invented by Brad Efron (in the Statistics department here at Stanford) and the phenomenon was first noted by Martin Gardner in 1970, but this probability question was discussed only recently, in a paper from 2013.

Since the problem was suggested, the intransitive dice project has quickly gathered steam and is now on to its 5th post. (If you haven’t followed a polymath project before, each post discusses the current state and has several comments where people collaboratively make progress on the questions being studied.) This particular project is one that theoretical computer scientists could easily contribute to and it’s not too late to jump in! At this point, it looks like they are looking for a local central limit theorem for a random walk on with precise bounds on error terms.

Many years ago, I spent some time on the polymath project on the Erdos Discrepancy Problem. It can be exciting, but it is a huge time sink if you get seriously involved. One of the most fun things about participating is getting a glimpse of how different people think and what mental shortcuts they use — this is the sort of stuff you rarely get out of reading papers.

]]>

The celebrated pigeonhole principle says that if we have $m$ disjoint subsets of $[n] = \{1, \ldots, n\}$, each of size $s$, then $m$ (the number of sets) is at most $n/s$.

Suppose that instead of being disjoint, we require that every pair of sets *has intersection at most $1$*. How much can this change the maximum number of sets $m$? Is it still roughly $n/s$, or could the number be much larger? Interestingly, it turns out that there is a sharp phase transition at $s = \sqrt{n}$; this is actually the key fact driving the second author’s recent paper on planted clique lower bounds (see the end of this post for some more details on this). These set systems are known as “combinatorial designs” and are heavily studied.

The phase transition is as follows: if , then (as is the case with disjoint sets), while if then . Remarkably, even though there can be at most sets of size , it is possible to have as many as sets of size (if for prime ).

To be a bit more formal, let be the maximum number of subsets of of size with pairwise intersection sizes at most (i.e., and for ). Assume that and . Then,

for , and

for .

Below we will show how to give upper and lower bounds on in the case where . (The case is a relatively straightforward application of inclusion-exclusion together with pigeonhole.)

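Suppose we have sets $S_1, \ldots, S_m \subseteq [n]$, each of size $s$, with pairwise intersections of size at most one. For each $a \in [n]$, we will define $T_a$ to be all the sets that contain $a$. Namely, $T_a = \{S_j : a \in S_j\}$. Note that all sets in $T_a$ have a pairwise intersection at the element $a$ and thus must be non-intersecting in $[n] \setminus \{a\}$. Thus, by pigeonhole, $|T_a| \le \frac{n-1}{s-1}$. Further, note that each $S_j$ occurs in $s$ distinct sets $T_a$. Putting these together, we thus have

$ms = \sum_{a \in [n]} |T_a| \le \frac{n(n-1)}{s-1}$.

This yields $m \le \frac{n(n-1)}{s(s-1)}$, and hence $m = O(n^2/s^2)$.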

Intuitively, organize $[n]$ into a rectangle with $s$ rows and $p$ columns, where $sp \le n$ (put any remaining elements off to the side). See figure above. Note that all columns of the rectangle are sets of size $s$ without any intersections and constitute $p$ sets. Further, we can add the sets composed of the diagonals and anti-diagonals (wrapping around on the left and right sides of the rectangle), which are sets of size $s$ without any pairwise intersections of size more than one. This will get us to $3p$ sets. We could attempt to create sets from more “generalized diagonals” where we start at the top of the rectangle and go horizontally by $a$ columns for every row that we go down (and wrap around on the left and right sides). However, if $p$ is divisible by $a$, these “generalized diagonals” can intersect some columns in two places instead of one.

However, if the number of columns $p$ is prime, then all such “generalized diagonals” will intersect each other in at most one point. More precisely, for every pair $(a, b) \in \{0, \ldots, p-1\}^2$, we can create a set by taking the element in column $(ar + b) \bmod p$ of the rectangle for every row $r \in \{0, \ldots, s-1\}$. There will be $p^2$ such sets that will all have size $s$ and will have pairwise intersections of size at most one, since two distinct linear functions agree on at most one point over a finite field.
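The construction is easy to check by brute force for small parameters. The following Python sketch makes a few assumptions: rows and columns are indexed from $0$, an element of the rectangle is represented as a `(row, column)` pair, and the function names are illustrative. It also compares the count against the double-counting bound $n(n-1)/(s(s-1))$ from the upper-bound argument, taking $n = sp$.

```python
from itertools import combinations


def generalized_diagonals(s, p):
    """One set per pair (a, b): row r contributes the cell in column (a*r + b) mod p."""
    return [
        frozenset((r, (a * r + b) % p) for r in range(s))
        for a in range(p)
        for b in range(p)
    ]


def max_pairwise_intersection(sets):
    """Largest intersection size over all pairs of distinct sets."""
    return max(len(x & y) for x, y in combinations(sets, 2))


def double_counting_bound(n, s):
    """Upper bound n(n-1) / (s(s-1)) on the number of sets, rounded down."""
    return (n * (n - 1)) // (s * (s - 1))


# Prime number of columns: 25 sets of size 3 on a 3 x 5 rectangle (n = 15).
prime_sets = generalized_diagonals(s=3, p=5)
prime_overlap = max_pairwise_intersection(prime_sets)

# Composite number of columns: slopes 0 and 2 collide twice when p = 4.
composite_overlap = max_pairwise_intersection(generalized_diagonals(s=3, p=4))
```

With $s = 3$ and $p = 5$ the sketch produces $25$ sets against a bound of $35$, so the construction is within a constant factor of the upper bound for these parameters; with $p = 4$ the sets for slopes $0$ and $2$ share two cells, which is the failure mode for non-prime $p$.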

By Bertrand’s Postulate and since , there is some prime number such that . We can use this as number of columns in a rectangle. Note that we need which is satisfied since . The number of “generalized diagonals” is as mentioned above. Thus, we have explicitly constructed sets of size with pairwise intersections at most one where

and hence as claimed.

In general, what if we allow the sets to have intersections of size $k$? The above sections study the case where $k = 1$, but the basic arguments generalize to larger values of $k$ as well. In particular, we have

for ,

for

where is the maximum number of possible subsets of of size with pairwise intersections at most . Note that this leaves open the question of what happens for ; in fact, there appears to be a phase transition around , but that is beyond the scope of this post.

The fact that there can be a large number of sets with small intersection when $s \le \sqrt{n}$ is actually a large part of the motivation behind this recent paper on a lower bound for the semi-random planted clique problem. To quote from the abstract, in the *semi-random* model, first “a set $S$ of vertices is chosen and all edges in $S$ are included; then, edges between $S$ and the rest of the graph are included with probability $\frac{1}{2}$, while edges not touching $S$ are allowed to vary arbitrarily.”

Clearly, if the clique has size then it is possible to create identical cliques (since we can choose the edges between the remaining vertices arbitrarily). To handle this ambiguity, we also specify one vertex which is guaranteed to be in the planted clique . If then this is enough to uniquely identify the planted clique with high probability.

However, if , then it is possible to create cliques such that every vertex lies in exactly two cliques, and all cliques intersect in at most element. Moreover, if we choose the cliques appropriately then it is information-theoretically difficult to tell which of the cliques was the original planted clique . As a result, no matter which vertex is specified, we cannot recover with probability greater than (since we can’t distinguish from the other clique that belongs to).

Here, having an intersection size of (rather than a much larger intersection) is key, because this makes it possible to “steal” from the random edges between and to create additional cliques that intersect with . If the intersection size was instead , then with high probability there would be no subset of vertices of that had a common neighbor outside of , and so there would be no way to create a clique that intersected in elements.

]]>

- The Dish is a favorite hiking attraction in the Stanford foothills. It’s also a powerful communication device – an inspiration to our Blog.
- Photo of the dish due to Linda A. Cicero | Stanford News Service (image here is cropped).
- Group photo due to our own Stanford CS’s Hector Garcia-Molina.

Hope to see many of you in the 2nd TOCA-SV Day, this Friday.

]]>