**Shafi Goldwasser** for the Motwani colloquium, telling us about *Pseudo Deterministic Algorithms and Proofs*, **Avishay Tal** about *Oracle Separation of BQP and the Polynomial Hierarchy*, and **Badih Ghazi** about *Resource-Efficient Common Randomness and Secret Key Generation*. We will also have student talks, food and drink, and a great and diverse group of theoreticians as usual.

I will devote several posts (by myself and others) to the (beautiful) “emerging theory of algorithmic fairness.” Most of these posts will be more technical, but I’d like to devote today’s post to a short discussion of what theoreticians can contribute to this multidisciplinary effort.

My own belief is that computer scientists cannot solve Algorithmic Fairness (and privacy in data analysis or any other issue of this sort) on their own. On the other hand, these issues, in their current computation-driven large-scale incarnation, cannot be seriously addressed without major involvement of computer scientists. Furthermore, what is needed (as I will try to demonstrate in future posts) is a true collaboration, rather than a division of work, where one community sub-contracts another for specific expertise.

One of the reasons the Theory of Computing is particularly suited to this challenge is our basic optimism in the face of complexities and even impossibilities. The topic of Algorithmic Fairness seems to be particularly entangled with such complexities. This is the source of a line of criticism on the inherent limitations of the “tech solutionist” approach to Algorithmic Fairness. For example, “discrimination is the result of biases in the data and cannot be addressed at the level of machine learning.” Another example: “unless we understand the causal structure we are analyzing, fairness cannot be obtained.” These criticisms (while not as devastating as they are sometimes presented) are not without merit, and they deserve a much more technical discussion (that will hopefully come in future posts). At this point I’d like to make two comments:

- The computational lens has served us well in the study of Cryptography, Game Theory, Learning, Privacy and beyond. There is already evidence that it is serving us well in the study of Algorithmic Fairness. I believe that the pessimistic view of what I would call “all-or-nothing-ism” ignores an incredible track record of Theory of Computing in addressing complicated human-involving subject areas, and ignores the progress already made on Algorithmic Fairness.
- Furthermore, no one is planning to stop analyzing data (for example in medical research) because our data is imperfect or because we didn’t figure out causality. Algorithmic Fairness requires both the best solutions we can come up with right now, and a concerted research effort to guarantee better fairness in the future.

While all too common, the term “technologists” in this context is unfortunate. Who are those mysterious “technologists?” Are they software engineers? Are they computer scientists? (and which sub area: Machine Learning? Theory? Others?) Or perhaps CEOs of technology companies? Or perhaps this refers to the investment firms and Wall Street, who seem to have such a huge sway over technology companies? Perhaps users of technology are to blame? Each of those is a completely different group of individuals with completely different sets of constraints and incentives. Lumping them all together is close to meaningless.

In a sequence of posts (by me and others and of increasing level of technical details), I hope to discuss the role of Theory of Computing in the study of the particularly important societal issue of Algorithmic Fairness. In this post, I’d like to briefly discuss the role of Academia more generally.

**The power and weakness of education**

An idea that is getting traction is that ethics and the societal impact of computation should be embedded in essentially all Computer Science courses. I am all for it! (In fact, ethics should be a major part of every curriculum on campus, not just Computer Science.) As a huge fraction of students take some Computer Science courses these days, this will raise technology consumers’ awareness of ethics in computation. It will also raise the awareness of software engineers, and eventually of the leadership of technology companies and, just as importantly, of policy makers.

But awareness, in itself, may not have much of an impact. Software engineers often have very little flexibility in shaping the products they develop, even when it comes to topics that more clearly affect the bottom line of their companies (this has to do with the quick pace and incentive structure of companies). Even the most philanthropic CEOs seem to run companies that violate basic ethical considerations. Here too, the incentive structure is much more to blame than lack of awareness. And even consumers who want to punish violators often do not, as many software companies are to a large extent monopolies. In other cases, violators operate behind the scenes, hidden from consumers.

**Developing the Knowledge and Tools**

I would also add that topics like privacy and algorithmic fairness require significant sophistication and much of the required knowledge and tools are yet to be developed. This means that academia (and funding agencies) should perform and support much more research. But (big) companies (that make their living exploiting sensitive data) should also hire many more researchers (of various disciplines) to develop the tools they need.

The great breakthroughs in Machine Learning within industry did not occur because the employees of those companies increased their awareness of the importance of data analysis. They happened because those companies employed talented and knowledgeable individuals and poured a lot of money into machine learning. Unless companies invest many more resources in their ethics, we are going to see the same recurring failures in protecting their users.

**Regulations**

As already mentioned, users are very limited in their ability to punish big companies, so it is unlikely that we will see the needed investment across the board (some companies are much better in this regard, but they are the exception rather than the rule). In addition to education, we need to enforce good behavior through legislation and regulation. Unfortunately, the direction of the current administration is to remove protections for consumers. Still, we can hope that Europe (as well as some of the more progressive U.S. states) will come to our rescue once again. As for the role of scientists, we should work with policy makers to develop and advocate for the “right” regulations.


*Search problem* means that we’re looking for something. *Total* means that what we’re looking for is guaranteed to exist. A famous example is Nash equilibrium: every game has at least one equilibrium, but [Daskalakis, Goldberg, Papadimitriou 2009] proved that it is PPAD-hard to find any of them.

A specific focus of this workshop is on connections to different sub-fields of Theory of Computer Science. We’ve seen exciting progress on those recently, and we hope the workshop can further bridge together all of the above (actually, all of TCS). Which brings me to my next point…

You!

We already have some fantastic speakers confirmed (schedule and workshop website coming soon), including an opening overview talk by Costis Daskalakis, who recently received the Nevanlinna Prize (in part) for his work on total search problems.

By the way, if you have something interesting to tell the community about total search problems, and we haven’t contacted you yet about giving a talk, please let us know. We can probably still accommodate you in the schedule, even if you don’t have a Nevanlinna Prize.

Looking forward to seeing y’all there – it will be totally awesome!

(Sorry- I couldn’t resist the bad pun…)


By Scarlett Sozzani

In response to and in support of recent activism around sexual assault and inclusivity at large, I want to take an opportunity to argue that issues of harassment, discrimination, bullying, and other egregiously insupportable actions can only exist when the victims are perceived to be weak, vulnerable, and powerless. And this kind of perception is often (and unwittingly) perpetuated through microaggressions by many members of our academic community. Even though microaggressions are arguably even more frequent than outright forms of discrimination and harassment, this issue remains largely unaddressed.

Microaggression is formally defined (on dictionary.com) as:

*a statement, action, or incident regarded as an instance of indirect, subtle, or unintentional discrimination against members of a marginalized group such as a racial or ethnic minority.*

A microaggression is difficult to identify because it is so subtle – what one person may consider a microaggression may seem like merely a rude or tactless comment to another person. And comments that do not overtly mention race, gender, class, sexual orientation, etc. are more difficult to directly attribute as an act of microaggression. Furthermore, microaggressions are sometimes unintentional, so the perpetrator might not even realize they are committing a microaggression against someone. It’s a very personal judgment, so perhaps a good question to ask is: “What is the likelihood that the perpetrator would have made this same comment to a person who identifies with the privileged majority?”

Microaggressions also come in many forms: not just in words, but also in tone, attitude, gestures, writing, and in all forms of interaction. The accumulated damage over many instances, over time, cannot be overstated. It elicits an intuitive response in the receivers of such microaggressions – a nagging feeling of self-doubt that one doesn’t belong, or isn’t good enough, or isn’t as good as the rest of the people in the room.

And beyond the predominantly discriminative definition of a microaggression, I would argue that any action that makes a person feel like their contributions are not valuable, and that they are not good enough to be standing where they are, is counterproductive to the collective aspirations of a community, especially an ambitious, high-flying research community.

Here are a few examples of microaggressions that I have felt in my very short time as a graduate student and in my various roles as a colleague, advisee, collaborator, and teaching assistant.

-“It’s a hard paper to read, especially if you don’t have the necessary background.”

-“You should be able to get the fellowship, right? You’re a young girl.”

-“How is that not what I just said?”

-“I erased your Piazza post answering a student’s ask for supplementary materials because I didn’t like the paper you referenced.”

-And the classic: “Hey guys…”

Let’s all aim to do a little bit better. Perhaps even go above and beyond in pushing against the current by acknowledging, highlighting, and talking about hard-earned and worthy contributions from our under-represented mentees and colleagues.

(See this video and this Quanta magazine article for more.)

In this post we want to celebrate another aspect of Costis’s work, on tackling statistical and modeling questions at the intersection of statistics, machine learning, and theory. Indeed, over the past years Costis and his students and collaborators have been at the forefront of some fundamental, yet quite topical questions: *how to make sense of data when we have too little of it, or too little time?*

On topics ranging from the daunting “curse of dimensionality” (how to say something meaningful about high-dimensional data, given limited computational power and/or observations? Under which assumptions, and in which scenarios can one still have a principled and sound approach to hypothesis testing, or density estimation, in this case?) to societal issues such as the tension between efficiency and privacy in hypothesis testing (is such a tension even necessary, or can we sometimes get differential privacy at no cost?), while exploring applications to biology and inference on genomic data, Costis’s contributions to these broad questions have been many.

Eagerly waiting for the next breakthroughs, once again — congratulations on this well-deserved award!

*(Image credit: Sarah A. King, from the MIT Technology Review.)*

Given strings $A$ and $B$ of $n$ characters each, the textbook dynamic programming algorithm finds their edit distance in time $O(n^2)$ (if you haven’t seen this in your undergrad algorithms class, consider asking your university for a refund on your tuition). Recent complexity breakthroughs [1][2] show that under plausible assumptions like SETH, quadratic time is almost optimal for exact algorithms. This is too bad, because we like to compute edit distance between very long strings, like entire genomes of two organisms (see also this article in Quanta). There is a sequence of near-linear time algorithms with improved approximation factors [1][2][3][4], but until now the state of the art was polylogarithmic; actually for near-linear time, this is still the state of the art:

**Open question 1**: Is there a constant-factor approximation to edit distance that runs in near-linear time?
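For reference, the textbook quadratic-time dynamic program mentioned above can be sketched as follows. This is only a minimal Python illustration of the standard algorithm (insertions, deletions and substitutions each cost 1); the function name `edit_distance` is chosen here for illustration and does not come from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Textbook dynamic program; runs in time proportional to len(a) * len(b)."""
    # prev[j] is the edit distance between the previous prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string costs i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if the characters match)
            ))
        prev = curr
    return prev[-1]
```

Only two rows of the table are kept at a time, so the memory is linear even though the running time stays quadratic.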

Here is a sketch of *an* algorithm. It is somewhat different from the algorithm in the paper because I wrote this post before finding the full paper online.

We partition each string into *windows*, or consecutive substrings of length $d$ each. We then restrict our attention to *window-compatible* matchings: that is, instead of looking for the globally optimum way to transform $A$ to $B$, we look for a partial matching between the $A$- and $B$-windows, and transform each $A$-window to its matching $B$-window (unmatched $A$-windows are deleted, and unmatched $B$-windows are inserted). It turns out that restricting to window-compatible matchings is almost without loss of generality.

In order to find the optimum window-compatible matching, we can find the distances between every pair of windows, and then use a (weighted) dynamic program of size . The reason I call it “Step 0” is because so far we made zero progress on running time: we still have to compute the edit distance between pairs, and each computation takes time , so time in total.

Approximating all the pairwise distances reduces to the following problem: given threshold $\Delta$, compute the bipartite graph $G_\Delta$ over the windows, where two windows $a$ and $b$ share an edge if $ED(a, b) \le \Delta$. In fact it suffices to compute an approximate $\tilde{G}_\Delta$, where $a$ and $b$ may share an edge even if their edit distance is a little more than $\Delta$.

**New Goal**: Compute $\tilde{G}_\Delta$ faster than naively computing all pairwise edit distances.
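To make the baseline concrete, here is a rough Python sketch of the naive approach that the new goal is trying to beat: split each string into windows, compute every pairwise distance, and keep the pairs below the threshold. The names `windows`, `naive_threshold_graph` and `mismatch_distance` are hypothetical, and the demo metric is a cheap stand-in; in the setting of this post `dist` would be edit distance.

```python
from typing import Callable, List, Set, Tuple


def windows(s: str, d: int) -> List[str]:
    """Split s into consecutive windows of length d (the last one may be shorter)."""
    return [s[i:i + d] for i in range(0, len(s), d)]


def naive_threshold_graph(
    a_windows: List[str],
    b_windows: List[str],
    threshold: int,
    dist: Callable[[str, str], int],
) -> Set[Tuple[int, int]]:
    """Edges (i, j) with dist(a_windows[i], b_windows[j]) <= threshold.

    Calls dist once per pair of windows, which is the cost we want to avoid.
    """
    return {
        (i, j)
        for i, wa in enumerate(a_windows)
        for j, wb in enumerate(b_windows)
        if dist(wa, wb) <= threshold
    }


def mismatch_distance(x: str, y: str) -> int:
    """Stand-in metric for the demo only: mismatched positions plus length gap."""
    return sum(cx != cy for cx, cy in zip(x, y)) + abs(len(x) - len(y))


# Usage: two strings cut into windows of length 4, threshold 1.
graph = naive_threshold_graph(
    windows("abcdabcf", 4), windows("abcdxxxx", 4), 1, mismatch_distance
)
```

Passing the metric in as a parameter keeps the sketch short; swapping in a real edit distance routine does not change the structure.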

While there are many edges in , say average degree : Draw a random edge , and let be two other neighbors of , respectively. Applying the **triangle inequality** (twice), we have that , so we can immediately add to . In expectation, have neighbors each, so we discovered a total of pairs; of which we expect that roughly correspond to *new* edges in . Repeat at most times until we discovered almost all the edges in . Notice that each iteration took us time (computing all the edit distances from and ); hence in total only . Thus we reduced to the sparse case in truly subquadratic time.
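The triangle-inequality step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's procedure: `expand_from_edge` is a hypothetical name, windows are referred to by index, and `dist` is any metric. If the seed edge and all the edges to neighbors have distance at most `threshold`, then applying the triangle inequality twice bounds every inferred pair by three times the threshold, without computing those distances directly.

```python
from typing import Callable, Sequence, Set, Tuple, TypeVar

W = TypeVar("W")


def expand_from_edge(
    a: int,
    b: int,
    a_side: Sequence[W],
    b_side: Sequence[W],
    threshold: float,
    dist: Callable[[W, W], float],
) -> Set[Tuple[int, int]]:
    """Infer approximate edges from one seed edge (a, b).

    Only distances from the two endpoints are computed, that is
    len(a_side) + len(b_side) calls to dist.
    """
    # Neighbors of a live on the b side; neighbors of b live on the a side.
    nbrs_of_a = [j for j, y in enumerate(b_side) if dist(a_side[a], y) <= threshold]
    nbrs_of_b = [i for i, x in enumerate(a_side) if dist(x, b_side[b]) <= threshold]
    # For a2 near b and b2 near a:
    #   dist(a2, b2) <= dist(a2, b) + dist(b, a) + dist(a, b2) <= 3 * threshold
    return {(i, j) for i in nbrs_of_b for j in nbrs_of_a}


# Usage with integers and absolute difference standing in for windows and edit distance.
a_side = [0, 1, 2, 10]
b_side = [0, 1, 3, 20]
inferred = expand_from_edge(1, 1, a_side, b_side, 1, lambda x, y: abs(x - y))
```

In the toy run, the pair `(2, 0)` is inferred even though its true distance is 2, above the threshold of 1 but within the relaxed bound of 3. This is the sense in which the discovered graph is only approximate.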

The algorithm up to this point is actually due to a recent paper by Boroujeni et al.; for the case when the graph is relatively sparse, they use Grover Search to discover all the remaining edges in quantum subquadratic time. It remains to see how to do it classically…

The main observation we need for this part is that if windows and are close, then in an optimum window-compatible matching they are probably not matched to -windows that are very far apart. And in the rare event that they are matched to far-apart -windows, the cost of inserting so many characters between and outweighs the cost of completely replacing if we had to. So once we have a candidate list of -windows that might match to, it is safe to only search for good matches for around each of those candidates. But when the graph is sparse, we have such a short list: the neighbors of !

We have to be a bit careful: for example, it is possible that is not matched to any of its neighbors in . But if we sample enough ‘s from some interval around , then either (i) at least one of them is matched to a neighbor in ; or (ii) doesn’t contribute much to reducing the edit distance for this interval, so it’s OK to miss some of those edges.

On the back of my virtual envelope, I think the above ideas give a -approximation. But as far as genomes go, up to a -approximation, you’re as closely related to your dog as to a banana. So it would be great to improve the approximation factor:

**Open question 2**: Is there a -approximation algorithm for edit distance in truly subquadratic (or near-linear) time?

Note that only the sparsification step loses more than in approximation. Also, none of the existing fine-grained hardness results rule out an -approximation, even in linear time!

In the same vein, Michael Ekstrand and Michael Veale (the Publicity Chairs for FAT* 2019) have asked me to disseminate the following announcement and CFP.

———–

We are pleased to announce the Call for Papers for the 2019 ACM Conference on Fairness, Accountability, and Transparency (FAT*), to be held in Atlanta, Georgia in January/February 2019.

FAT* is an interdisciplinary conference to connect social, technical and policy domains around broad questions of fairness, accountability and transparency of machine learning, information retrieval, and other computing systems. The conference this year features tracks on Theory And Security, Statistics, Machine Learning, and Data Mining. The inaugural conference at NYU in February 2018 had an acceptance rate of 25% and was sold-out, with 450 international attendees from across academia, industry and public policy.

Papers (8-10 pages, due August 23) are double-blind peer reviewed and published in conference proceedings in the ACM Digital Library. Authors can also opt for non-archival submission, subject to the same review process but only appearing as an abstract in the proceedings. The theoretical computer science community has been involved in work on algorithmic fairness since its inception, and we hope that you’ll consider FAT* as a venue for your work.

Please forward this call to other people or groups you think may be interested.

For more details, see https://fatconference.org/2019/cfp.html

1. For quantum communication, we lose a quadratic factor (corresponding to Grover’s search), i.e. our lower bound is only . But I don’t know how to use Grover’s search to improve over the naive upper bound. So, is the quantum communication of approximate Nash equilibrium closer to linear or quadratic?

Before we discuss why quantum communication of approximate Nash is interesting, it is helpful to first recall some game theory, and in particular remind ourselves why the classical (randomized) communication complexity of approximate Nash is important. Briefly, a two-player game is described by two matrices $A, B$; if Alice and Bob play actions $i, j$, their payoffs are $A_{i,j}$ and $B_{i,j}$, respectively. Typically, they want to use randomized strategies (called *mixed strategies*). We say that Alice and Bob are at an (approximate) Nash equilibrium if each player’s strategy is an (approximate) best response to the other player’s strategy. That is, once players are at an equilibrium, they may never want to leave it. The big question is: how do they get there in the first place?
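As a concrete illustration of the definition above, here is a small Python sketch that checks whether a pair of mixed strategies is an approximate Nash equilibrium. The names (`is_approx_nash`, payoff matrices `A` and `B`, strategies `x` and `y`, slack `eps`) are assumptions made for this example. It relies on the standard fact that some best response is always a pure strategy, so it suffices to compare against pure deviations.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]
Strategy = Sequence[float]


def expected_payoff(M: Matrix, x: Strategy, y: Strategy) -> float:
    """Expected payoff under M when Alice mixes by x (rows) and Bob by y (columns)."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def is_approx_nash(A: Matrix, B: Matrix, x: Strategy, y: Strategy, eps: float = 0.0) -> bool:
    """True if neither player can gain more than eps by deviating unilaterally."""
    tol = eps + 1e-12  # small slack for floating-point rounding
    alice_now = expected_payoff(A, x, y)
    bob_now = expected_payoff(B, x, y)
    # Best pure deviation for Alice against y, and for Bob against x.
    alice_best = max(sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x)))
    bob_best = max(sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y)))
    return alice_best <= alice_now + tol and bob_best <= bob_now + tol


# Usage: matching pennies, where the uniform strategies form the unique equilibrium.
pennies_A = [[1, -1], [-1, 1]]
pennies_B = [[-1, 1], [1, -1]]
uniform_ok = is_approx_nash(pennies_A, pennies_B, [0.5, 0.5], [0.5, 0.5])
pure_ok = is_approx_nash(pennies_A, pennies_B, [1, 0], [1, 0])
```

Note that this check needs both payoff matrices at once, which is exactly the information that uncoupled players do not share.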

A common approach to this question is to look for plausible *dynamics*, or procedures by which players update their strategies, and which guarantee convergence to Nash equilibrium. Defining “plausible” is a fascinating philosophical discussion far beyond the scope of this post, but two useful desiderata are: (i) *uncoupled dynamics*, namely each player knows only her own payoff matrix and the history of the game — this rules out the trivial dynamics where players start at a Nash equilibrium; and (ii) *efficient dynamics*, namely the dynamics must converge faster than it would take Alice to communicate her entire payoff matrix to Bob. Those are certainly not sufficient conditions for plausibility of dynamics, but our communication lower bound rules out *any* efficient uncoupled dynamics.

So, what is a natural model of quantum uncoupled dynamics? I was confused about this for a few weeks: people have studied quantum games where players’ actions are described by qubits, and the payoffs are determined based on their measurements. But this is a generalization of classical games, where we already know that the problem is hard. So I asked Shalev again, and he had another nice observation: the players can still send classical bits (aka play classical actions) — they merely need to share entangled qubits and measure them before deciding what strategies to play. (But admittedly this might not work if the police decide to search the Dilemma Prisoners for entangled qubits before their interrogation…)

One last motivational comment: my very superficial understanding of the real-world feasibility of all this stuff is that while there is a lot of buzz around the race toward the first quantum *computer* that may or may not be able to execute a “hello world!” program [1][2][3], quantum *communication* already allows cross-continental video conferences…

While writing this post, I realized that there is an entire literature on various ways to entangle quantum with game theory. My favorite is this paper by Alan Deckelbaum about **quantum correlated equilibrium**. Correlated equilibrium is a generalization of Nash equilibrium where a trusted *coordinating device* suggests to Alice and Bob pairs of actions drawn from a joint (correlated) distribution. The requirement is that Alice, after seeing the action the coordinating device suggested for her (but not the one for Bob), has no incentive to deviate from the suggestion. It is known that natural no-internal-regret dynamics converge to the *set* of correlated equilibria. But without the trusted coordinating device the players are still incentivized to keep modifying their strategies. (By the way, the classical communication complexity of approximate correlated equilibrium in two-player games is still open!)
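The no-deviation requirement for a (classical) correlated equilibrium can be written as a short check. The sketch below is illustrative only: `is_correlated_equilibrium` is a hypothetical name, `P[i][j]` is assumed to be the probability that the device suggests row `i` to Alice and column `j` to Bob, and `eps` bounds the gain weighted by the probability of the suggestion. It says nothing about the quantum variant discussed next.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def is_correlated_equilibrium(A: Matrix, B: Matrix, P: Matrix, eps: float = 0.0) -> bool:
    """True if no player gains more than eps by disobeying any suggestion."""
    n, m = len(P), len(P[0])
    tol = eps + 1e-12  # small slack for floating-point rounding
    # Alice is told to play row i and considers switching to row k.
    for i in range(n):
        for k in range(n):
            gain = sum(P[i][j] * (A[k][j] - A[i][j]) for j in range(m))
            if gain > tol:
                return False
    # Bob is told to play column j and considers switching to column k.
    for j in range(m):
        for k in range(m):
            gain = sum(P[i][j] * (B[i][k] - B[i][j]) for i in range(n))
            if gain > tol:
                return False
    return True


# Usage: the game of chicken (action 0 = dare, action 1 = chicken).
chicken_A = [[0, 7], [2, 6]]
chicken_B = [[0, 2], [7, 6]]
third = 1 / 3
never_both_dare = [[0, third], [third, third]]
always_both_dare = [[1, 0], [0, 0]]
```

The first distribution spreads its mass evenly over every outcome except both players daring and satisfies all the constraints, while the second puts all its mass on both players daring and fails them.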

Anyway, Deckelbaum points out that any correlated distribution can be simulated using quantum entanglement. Can quantum entanglement replace the trusted coordinating device? Sometimes, but there is a catch: sampling from the correlated distribution using quantum measurements requires players to cooperate with the sampling protocol. Specifically, Deckelbaum shows that some games have correlated equilibria that cannot be truthfully sampled with quantum entanglement, i.e. one of the players has an incentive to deviate from any sampling protocol. He defines quantum correlated equilibria as those that can be sampled truthfully, and asks what is the computational complexity of finding one. (Note that this is an easier question than the PPAD-complete Nash equilibrium, and harder than the polynomial-time correlated equilibrium.) So here is yet another nice question:

2. What is the (randomized/quantum) communication complexity of finding an approximate quantum correlated equilibrium?

Success in your career will be determined more by your weaknesses than by your strengths. Thus, if you imagine plotting a sequence of scores that rate your ability to carry out different kinds of tasks, it’s far better to have a high minimum than a high maximum. Try to identify your weaknesses and to overcome as many as you can.

Every large project has parts that are fun and parts that are dull. Learn to get through the dull parts. Never postpone a distasteful-but-necessary portion of work-to-be-done, unless there’s a very good reason why you’ll be able to do it better later.

Niels Henrik Abel gave wonderful advice: “Read the masters!” Take the time to read lots of papers that were written by top researchers when they were first discovering important ideas. Study the works of great computer scientists, and do your best to understand their mindset. In order to do this well, you’ll have to learn how to put yourself into their place, remembering what they knew and didn’t know at the time, and adjusting to their terminology and notation. The exercise of “getting inside another person’s head” is, in itself, extremely valuable for building your own mental skills.

Here’s a trick that I often use when reading a technical book or paper: After the author has stated a problem to be solved (or a theorem to be proved, etc.), I cover up the text and spend some time trying to solve that problem by myself. Similarly, before turning the page, I try to guess what’s on the next page. Of course I usually fail … but even in failure, I’m much more ready to understand the author’s solution, than if I hadn’t tried it first. Furthermore, with this modus operandi I’m repeatedly learning new ways to get past stumbling blocks.

Instead of promoting yourself aggressively, you should try to write so well that others can readily see for themselves the value of what you’ve done. Then they’ll spontaneously also tell their friends, and the word will spread. On the other hand, if a good writer comes to you and wants to publish an account of your work, it never hurts to have a good “press agent”.

PS. (from Don) re “reading the masters”

“The purpose of … reading is precisely to suspend one’s mind in the workings of another sensibility”.

— Guy Davenport, quoted in Harvard Magazine Nov-Dec 2017, p54