Unfortunately, but unsurprisingly, the conference will be virtual this year. But I’m sure that, thanks to Aaron Roth and the inaugural PC, we will make the best of what’s possible and have a great event.

In this post, we’ll discuss four works about secure distributed computation. First, we’ll talk about a method of using MDS (maximum distance separable) error-correcting codes to add security and privacy to general data storage (“Cross Subspace Alignment and the Asymptotic Capacity of X-Secure T-Private Information Retrieval” by Jia, Sun, Jafar).

Then we’ll discuss a method of adapting a coding strategy for straggler mitigation in matrix multiplication (“Polynomial codes: an optimal design for high-dimensional coded matrix multiplication” by Yu, Maddah-Ali, Avestimehr) to instead add security (“On the capacity of secure distributed matrix multiplication” by Chang, Tandon) or privacy (“Private Coded Matrix Multiplication” by Kim, Yang, Lee).

Throughout this post we will use variations on the following communication model:

The data in the grey box is only given to the master, so workers only have access to what they receive (via green arrows). Later on we will also suppose the workers have a shared library not available to the master. The workers do not communicate with each other as part of the computation, but we want to prevent them from figuring out anything about the data if they do talk to each other.

This model is related to *private computation* but not exactly the same. We assume the servers are “honest but curious”, meaning they won’t introduce malicious computations. We also only require the master to receive the final result, and don’t need to protect any data from the master. This is close to the BGW scheme ([Ben-Or, Goldwasser, Wigderson ’88]), but we do not allow workers to communicate with each other as part of the computation of the result.

We consider *unconditional* or *information-theoretic* security, meaning the data is protected even if the workers have unbounded computational power. Furthermore, we will consider having *perfect secrecy*, in which the mutual information between the information revealed to the workers and the actual messages is zero.
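To make the perfect-secrecy condition concrete, here is a toy sketch (my own illustration, not from the works discussed below) of the classic one-time pad, for which the mutual information between ciphertext and message is exactly zero:

```python
# Toy illustration of perfect secrecy: a 3-bit one-time pad.
# For each fixed message m, XOR with a uniform key k yields a
# uniform ciphertext, so the ciphertext reveals nothing about m
# and the mutual information I(ciphertext; message) is zero.

def ciphertext_distribution(m, nbits=3):
    """Distribution of m XOR k over a uniformly random key k."""
    counts = {}
    for k in range(2 ** nbits):
        c = m ^ k
        counts[c] = counts.get(c, 0) + 1
    total = 2 ** nbits
    return {c: cnt / total for c, cnt in counts.items()}

# The ciphertext distribution is uniform and identical for every message.
dists = [ciphertext_distribution(m) for m in range(8)]
assert all(d == dists[0] for d in dists)
assert all(abs(prob - 1 / 8) < 1e-12 for prob in dists[0].values())
```

Since the ciphertext distribution does not depend on the message, even a computationally unbounded observer learns nothing, which is exactly the guarantee we will ask of the workers’ views.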

Before we get into matrix-matrix multiplication, consider the problem of storing information on the workers to be retrieved by the master, such that it is “protected.” What do we mean by that? [Jia, Sun, and Jafar ’19] define X-secure T-private information retrieval as follows:

Let $W = \{W_1, \dots, W_K\}$ be a data set of $K$ messages, such that each $W_k$ consists of $L$ random bits. A storage scheme of $W$ on $N$ nodes is

1. **X-secure** if any set of up to $X$ servers cannot determine anything about any $W_k$, and

2. **T-private** if, given a query from the user to retrieve some data element $W_\theta$, any set of up to $T$ servers cannot determine the value of $\theta$. [Jia, Sun, and Jafar ’19]

Letting $Q_1, \dots, Q_N$ be the queries sent to each node and $S_1, \dots, S_N$ be the information stored on each node (all vectors of length $L$), we depict this as:

The information-theoretic requirements for this system to be correct can be summarized as follows (using the notation $Z_{[n]}$ for the set $\{Z_1, \dots, Z_n\}$):

| Property | Information-Theoretic Requirement |
| --- | --- |
| Data messages are $L$ bits | $H(W_k) = L$ for all $k \in [K]$ |
| Data messages are independent | $H(W_{[K]}) = KL$ |
| Data can be determined from the stored information | $H(W_{[K]} \mid S_{[N]}) = 0$ |
| User has no prior knowledge of server data | $I(S_{[N]}; Q_{[N]}^{[\theta]}) = 0$ |
| X-Security | $I(S_{\mathcal{X}}; W_{[K]}) = 0$ for all $\mathcal{X} \subseteq [N]$ with $\lvert\mathcal{X}\rvert \le X$ |
| T-Privacy | $I(Q_{\mathcal{T}}^{[\theta]}; \theta) = 0$ for all $\mathcal{T} \subseteq [N]$ with $\lvert\mathcal{T}\rvert \le T$ |
| Nodes answer only based on their data and received query | $H(A_n^{[\theta]} \mid Q_n^{[\theta]}, S_n) = 0$ |
| User can decode desired message from answers | $H(W_\theta \mid A_{[N]}^{[\theta]}, Q_{[N]}^{[\theta]}) = 0$ |

Given these constraints, Jia et al. give bounds on the capacity of the system. Capacity is the maximum achievable rate, where rate is defined as the number of bits requested by the user ($L$, the length of a single message) divided by the number of bits downloaded by the user. The bounds are in terms of $C_{TPIR}$, the capacity of T-private information retrieval (which is the same as the above definition, with only requirement 2).

If $N \le X + T$, then for arbitrary $K$, $C = \frac{N - X}{NK}$.

When $N > X + T$, the capacity is bounded above and below in terms of $C_{TPIR}$. [Jia, Sun, and Jafar ’19]

Jia et al. give schemes that achieve these bounds while preserving the privacy and security constraints by introducing random noise vectors into how the data is stored and the queries are constructed. The general scheme for $N > X + T$ uses *cross subspace alignment*, which essentially chooses how to construct the stored information and the queries such that the added noise mostly “cancels out” when the master combines all the responses from the servers. The scheme for $N \le X + T$ is straightforward to explain, and demonstrates the idea of using error-correcting codes that treat the random values as the message and the actual data as the “noise.”

For this scheme, the message length is set to $L = N - X$ (the number of nodes $N$, minus the maximum number of colluding servers $X$). First, we generate $K$ random bit vectors $Z_1, \dots, Z_K$, each of length $X$:

Next, apply an $(N, X)$ MDS code to each $Z_k$ to get $\bar{Z}_1, \dots, \bar{Z}_K$, which are encoded vectors of length $N$:

For our data $W_1, \dots, W_K$, we pad each vector with $X$ zeros at the front to get $\bar{W}_1, \dots, \bar{W}_K$ of length $N$:

Now that the dimensions line up, we can add the two together and store the $n$th column at the $n$th node:

To access the data, the user downloads all $NK$ stored bits. The length-$N$ string downloaded for row $k$ can be used to decode $W_k$: the first $X$ entries of $\bar{W}_k$ are all zero, so columns $1$ through $X$ have the values of $\bar{Z}_k$. This gives the user $X$ values from the MDS code used on each row, so they can decode and get $Z_k$ and $\bar{Z}_k$. Then a subtraction from the downloaded data gives $W_k$. Because of the MDS property of the code used to get $\bar{Z}_k$, this scheme is X-secure, and because the user downloads all bits regardless of the query, it is T-private.
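A minimal sketch of this storage scheme for a single message, using a Reed-Solomon code over the prime field GF(257) as the MDS code (the field, parameters, and helper functions are illustrative choices of mine, not from the paper):

```python
import random

# Sketch of the simple X-secure storage scheme: one message w of
# length L = N - X is hidden by a random Reed-Solomon codeword, so
# any X nodes see only one-time-padded symbols.

p = 257                          # prime field size (illustrative choice)
N, X = 5, 2                      # nodes and security threshold
L = N - X                        # message length
alphas = list(range(1, N + 1))   # distinct evaluation points

def rs_encode(z):
    """Evaluate the degree-(X-1) polynomial with coefficients z at all alphas."""
    return [sum(z[i] * pow(a, i, p) for i in range(X)) % p for a in alphas]

def interpolate(points, x):
    """Lagrange-interpolate the degree-(X-1) polynomial through points at x."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for k, (xk, _) in enumerate(points):
            if k != j:
                num = num * (x - xk) % p
                den = den * (xj - xk) % p
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

w = [random.randrange(p) for _ in range(L)]      # the data message
z = [random.randrange(p) for _ in range(X)]      # random noise coefficients
c = rs_encode(z)                                 # noise codeword, length N
d = [0] * X + w                                  # zero-padded data, length N
stored = [(c[j] + d[j]) % p for j in range(N)]   # symbol j goes to node j

# Recovery: the first X positions hold pure noise (the data is zero there),
# which determines the whole noise codeword; subtract it off.
known = list(zip(alphas[:X], stored[:X]))
recovered = [(stored[j] - interpolate(known, alphas[j])) % p for j in range(X, N)]
assert recovered == w
```

Any $X$ nodes see their data symbols added to $X$ evaluations of a uniformly random degree-$(X-1)$ polynomial; such evaluations are jointly uniform, so they act as a one-time pad.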

We now move on to the task of matrix-matrix multiplication. The methods for secure and private distributed matrix multiplication we will discuss shortly are based on *polynomial codes*, used by [Yu, Maddah-Ali, Avestimehr ’17] for doing distributed matrix multiplications robust to stragglers. Suppose the master has matrices $A \in \mathbb{F}_q^{s \times r}$ and $B \in \mathbb{F}_q^{s \times t}$ for some finite field $\mathbb{F}_q$, and wants to compute $A^\top B$. Assume $r$ and $t$ are divisible by $m$ and $n$ respectively, so we can represent the matrices divided into submatrices:

$$A = \begin{pmatrix} A_0 & A_1 & \cdots & A_{m-1} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{n-1} \end{pmatrix}$$

So to recover $A^\top B$, the master needs each entry of:

$$A^\top B = \begin{pmatrix} A_0^\top B_0 & \cdots & A_0^\top B_{n-1} \\ \vdots & \ddots & \vdots \\ A_{m-1}^\top B_0 & \cdots & A_{m-1}^\top B_{n-1} \end{pmatrix}$$

The key idea of polynomial codes is to encode $A$ and $B$ into polynomials $\tilde{A}(x) = \sum_{i=0}^{m-1} A_i x^i$ and $\tilde{B}(x) = \sum_{j=0}^{n-1} B_j x^{jm}$, whose evaluations are sent to the workers, where they are multiplied and the result is returned. The goal of Yu et al. was to create robustness to stragglers, and so they add redundancy in this process so that not all workers need to return a result for the master to be able to determine $A^\top B$. In particular, only $mn$ returned values are needed, so $N - mn$ servers can be slow or fail completely without hurting the computation. This method can be thought of as setting up the encodings of $A$ and $B$ so that the resulting multiplications are evaluations of a polynomial with coefficients $A_i^\top B_j$ at $N$ different points, which is equivalent to a Reed-Solomon code.
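A sketch of the construction with $m = n = 2$ and $N = 5$ workers over a prime field (the parameters and helper functions are my own illustrative choices, not from the paper):

```python
import random

# Sketch of polynomial codes for distributed matrix multiplication:
# m = n = 2 blocks, N = 5 workers, so any mn = 4 responses suffice
# and one straggler is tolerated.

p = 10007
m = n = 2
s, r, t = 4, 4, 4            # A is s x r, B is s x t; we want A^T B
N = 5

def rand_mat(rows, cols):
    return [[random.randrange(p) for _ in range(cols)] for _ in range(rows)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) % p
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

A, B = rand_mat(s, r), rand_mat(s, t)
# Column blocks: A = [A0 A1], B = [B0 B1].
A_blocks = [[row[i * (r // m):(i + 1) * (r // m)] for row in A] for i in range(m)]
B_blocks = [[row[j * (t // n):(j + 1) * (t // n)] for row in B] for j in range(n)]

def encode(blocks, x, powers):
    """Evaluate sum_d blocks[d] * x^powers[d] entrywise, mod p."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    out = [[0] * cols for _ in range(rows)]
    for blk, d in zip(blocks, powers):
        coef = pow(x, d, p)
        for i in range(rows):
            for j in range(cols):
                out[i][j] = (out[i][j] + coef * blk[i][j]) % p
    return out

xs = list(range(1, N + 1))
# Worker k gets A~(x_k) = sum_i A_i x^i and B~(x_k) = sum_j B_j x^{jm},
# and returns A~(x_k)^T B~(x_k).
responses = [matmul(transpose(encode(A_blocks, x, range(m))),
                    encode(B_blocks, x, [j * m for j in range(n)])) for x in xs]

# Pretend worker 3 (index 3) straggles: decode from any mn = 4 responses.
use = [0, 1, 2, 4]
def coeffs(evals, pts):
    """Solve the Vandermonde system mod p for the polynomial coefficients."""
    k = len(pts)
    M = [[pow(pts[i], d, p) for d in range(k)] + [evals[i]] for i in range(k)]
    for col in range(k):                      # Gaussian elimination mod p
        piv = next(i for i in range(col, k) if M[i][col])
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], p - 2, p)
        M[col] = [v * inv % p for v in M[col]]
        for i in range(k):
            if i != col and M[i][col]:
                f = M[i][col]
                M[i] = [(M[i][j] - f * M[col][j]) % p for j in range(k + 1)]
    return [M[i][k] for i in range(k)]

# The coefficient of x^{i+jm} is the block A_i^T B_j; check block (i=1, j=1).
rows, cols = r // m, t // n
pts = [xs[k] for k in use]
block11 = [[coeffs([responses[k][i][j] for k in use], pts)[1 + 1 * m]
            for j in range(cols)] for i in range(rows)]
assert block11 == matmul(transpose(A_blocks[1]), B_blocks[1])
```

Worker $k$’s response is the evaluation at $x_k$ of a degree-$(mn-1)$ matrix polynomial whose coefficients are the blocks $A_i^\top B_j$, so any $mn$ responses determine all blocks, exactly as in Reed-Solomon decoding.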

This idea is adapted by [Chang, Tandon ’18] to protect the data from colluding servers: noise is incorporated into the encodings such that the number of encoded matrices required to determine anything about the data is greater than the security threshold $T$. Since the master receives all $N$ responses, it is able to decode the result of $AB$, but no set of $T$ nodes can decode $A$, $B$, or $AB$. Similarly, [Kim, Yang, Lee ’19] adapt this idea to impose privacy on a matrix-matrix multiplication: workers are assumed to have a shared library of matrices $B_1, \dots, B_M$, and the user would like to compute $AB_j$ for some index $j$ without revealing the value of $j$ to the workers. The workers encode the entire library such that when the encoding is multiplied by an encoded input from the master, the result is useful to the master in decoding $AB_j$.

Chang and Tandon consider the following two privacy models, where up to $T$ servers may collude. The master also has $R$ (and in the second model, $S$), which are matrices of random values with the same dimensions as $A$ (and $B$). These are used in creating the encodings $\tilde{A}$ (and $\tilde{B}$).

$B$ is public, $A$ is private:

Both $A$ and $B$ private:
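As a toy illustration of the one-sided model (only $A$ kept secure), here is a minimal sketch assuming no matrix partitioning and a single honest-but-curious worker ($T = 1$); this is a simplified instance in the spirit of, not identical to, the Chang-Tandon scheme:

```python
import random

# One-sided security sketch: worker i receives A~(a_i) = A + a_i * R
# for a uniformly random mask R, so any single worker sees a
# one-time-padded matrix. The responses y_i = A~(a_i) B = AB + a_i*(RB)
# are evaluations of a degree-1 polynomial, so two responses let the
# master interpolate the constant term AB.

p = 10007
rows, inner, cols = 3, 3, 3

def rand_mat(r_, c_):
    return [[random.randrange(p) for _ in range(c_)] for _ in range(r_)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) % p
             for j in range(len(Q[0]))] for i in range(len(P))]

A, B = rand_mat(rows, inner), rand_mat(inner, cols)
R = rand_mat(rows, inner)                 # random mask, same shape as A

a1, a2 = 1, 2                             # distinct nonzero evaluation points
enc = lambda a: [[(A[i][j] + a * R[i][j]) % p for j in range(inner)]
                 for i in range(rows)]
y1, y2 = matmul(enc(a1), B), matmul(enc(a2), B)   # the workers' responses

# Interpolate y(x) = AB + x*(RB) at x = 0:
# AB = (a2*y1 - a1*y2) / (a2 - a1)  (mod p).
inv = pow(a2 - a1, p - 2, p)
AB = [[(a2 * y1[i][j] - a1 * y2[i][j]) * inv % p for j in range(cols)]
      for i in range(rows)]
assert AB == matmul(A, B)
```

Since $R$ is uniform, each worker’s view $A + a_i R$ is uniformly distributed regardless of $A$, while the master eliminates the noise term by polynomial interpolation.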

Kim, Yang, and Lee take a similar approach of applying the method of polynomial codes to *private* matrix multiplication. As before, there are $N$ workers, but now the master wants to multiply $A$ with some matrix $B_j$ in the shared library (all the workers have the shared library).

Since the master isn’t itself encoding the library, it has to tell the workers how to encode it so that it can reconstruct the desired product. This is done by having the master tell the workers what values they should use to evaluate the polynomial that corresponds to encoding each library matrix. We denote the encoding of the library done by each worker as a multivariate polynomial, which each node evaluates at the node-specific values it receives to get that node’s encoding. The worker multiplies this with the encoding of $A$ it receives, and returns the resulting value. All together, we get the following communication model:

As we’ve seen, coding techniques originally designed to add redundancy and protect against data loss can also be used to intentionally incorporate noise for data protection. In particular, this can be done when outsourcing matrix multiplications, making it a useful technique in many data processing and machine learning applications.

References:

- Jia, Zhuqing, Hua Sun, and Syed Ali Jafar. “Cross Subspace Alignment and the Asymptotic Capacity of X-Secure T-Private Information Retrieval.” *IEEE Transactions on Information Theory* 65.9 (2019): 5783-5798.
- Yu, Qian, Mohammad Maddah-Ali, and Salman Avestimehr. “Polynomial codes: an optimal design for high-dimensional coded matrix multiplication.” *Advances in Neural Information Processing Systems*. 2017.
- Chang, Wei-Ting, and Ravi Tandon. “On the capacity of secure distributed matrix multiplication.” *2018 IEEE Global Communications Conference (GLOBECOM)*. IEEE, 2018.
- Kim, Minchul, Heecheol Yang, and Jungwoo Lee. “Private Coded Matrix Multiplication.” *IEEE Transactions on Information Forensics and Security* (2019).


WIT is one of my favorite (if not *the* favorite) programs in the theory community. Many in our community share my enthusiasm (and theory groups fight for the honor of hosting these meetings). The reactions from past participants leave no room for doubt: this is an important and great experience. So if you fit the workshop’s qualifications, please do yourself a favor and apply!

The Women in Theory (WIT) Workshop is intended for graduate and exceptional undergraduate students in the area of theory of computer science. The workshop will feature technical talks and tutorials by senior and junior women in the field, as well as social events and activities. The motivation for the workshop is twofold. The first goal is to deliver an invigorating educational program; the second is to bring together theory women students from different departments and foster a sense of kinship and camaraderie.

The 7th WIT workshop will take place at Simons Institute at Berkeley, Jun 16 – 19, 2020.

**Confirmed Speakers**: Michal Feldman (Tel-Aviv University), Shafi Goldwasser (Simons, UC Berkeley)

**Organizers**: Tal Rabin (IBM), Shubhangi Saraf (Rutgers) and Lisa Zhang (Bell Labs).

**Local Host Institution:** Simons Institute at Berkeley.

**Local Arrangements**:

**Special Guest:** Omer Reingold (Stanford).

**Contact us:** womenintheory2020@gmail.com.

**To apply**: click here.

**Important dates:**

**Application deadline:** Feb 7, 2020

**Notification of acceptance:** March 15, 2020

**Workshop:** June 16-19, 2020.

One way to try to circumvent the computational barriers is by approximation algorithms (see my blog post). A different approach is to go with quantum algorithms: Grover’s search can solve NC-SAT in $O(2^{n/2})$ time. Even those of us less skeptical than Gil Kalai can probably agree that quadratic quantum speedups won’t be practical anytime soon. But in theory, I find the question of whether we can design subquadratic quantum algorithms for edit distance very interesting (see also [BEG+18]).

Alas, even with the power of quantum computers, we don’t know any truly subquadratic algorithms for computing LCS/ED. On the other hand, it is not clear how to rule out even linear-time algorithms. A few weeks ago, Buhrman, Patro, and Speelman posted a paper that gives a quantum *query complexity* lower bound for LCS/ED.

Open Problem: Close the gap between the known lower and upper bounds on the quantum query complexity of ED/LCS.

What is the “query complexity” of ED/LCS? Buhrman et al. consider the following query model: In LCS, we want a **maximum monotone matching** between the characters of the two strings, where we can match two characters if they’re identical. Now suppose that in the query complexity problem we still want to find a maximum monotone matching, but instead of the character-equality graph (which is a union of disjoint cliques), we have an arbitrary bipartite graph, and **given a pair of vertices, the oracle tells you if there is an edge between them**.
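To make the model concrete, here is a toy sketch (mine, not from the paper) of the trivial classical algorithm: an LCS-style dynamic program that finds a maximum monotone matching in an arbitrary bipartite graph with $O(n^2)$ oracle queries.

```python
# Trivial classical algorithm in the edge-oracle query model: an
# LCS-style dynamic program for maximum monotone matching, querying
# the oracle exactly once per vertex pair.

def max_monotone_matching(n1, n2, edge_oracle):
    queries = 0
    def edge(i, j):
        nonlocal queries
        queries += 1
        return edge_oracle(i, j)
    # f[i][j] = max monotone matching using the first i and j vertices.
    f = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            f[i][j] = max(f[i - 1][j], f[i][j - 1],
                          f[i - 1][j - 1] + (1 if edge(i - 1, j - 1) else 0))
    return f[n1][n2], queries

# On the character-equality graph of two strings this is exactly LCS.
s, t = "quantum", "query"
size, queries = max_monotone_matching(len(s), len(t), lambda i, j: s[i] == t[j])
assert size == 2                      # "qu" is a longest common subsequence
assert queries == len(s) * len(t)     # n1 * n2 oracle queries
```

An arbitrary bipartite graph plugs in the same way: only the oracle changes, which is exactly why the model generalizes LCS.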

This model may seem a bit counter-intuitive at first since the graphs may not correspond to any pair of strings; and indeed other models have been considered before [UAH76][AKO10]. But it turns out that this model is well-motivated by the NC-SETH lower bound (see discussion below).

What does the query complexity lower bound mean for algorithms on actual strings? Instead of a black-box oracle, our algorithms have access to an NC-circuit that implements it. Intuitively, we don’t know how to do very much with white-box circuits, so it seems plausible to hypothesize that the running time will be lower bounded by the query complexity. In some sense, this is a special case of the following *ultimate hardness hypothesis* that unifies a lot of the computational hardness assumptions that we like to assume but have no idea how to prove (e.g. P!=NP, P!=BQP, NC-SETH, FP!=PPAD, etc.):

[Ultimate Hardness Hypothesis] For every problem, the white-box computational complexity is lower bounded by the black-box query complexity.

In communication complexity, similar statements are known and are called simulation/lifting theorems (see e.g. Mika’s thesis). For computational complexity, there are obvious counterexamples such as “decide if the oracle can be implemented by a small circuit”. So it only makes sense to continue to assume the ultimate hardness hypothesis for “reasonable problems” instead of “every problem”.

But Buhrman et al. identify the following variant of the ultimate hardness hypothesis which I find very interesting. It is defined with respect to a property $\mathcal{P}$ which takes as input the truth table of a circuit and outputs True or False. Roughly, they hypothesize that:

[Buhrman et al.-QSETH, paraphrased] For every $\mathcal{P}$, deciding if $\mathcal{P}$ is true or false is as hard whether we’re given the actual circuit, or only an oracle with the guarantee that it is implemented by a small circuit.

At first read, I thought that arguments à la the impossibility of obfuscation [BGI+01] should refute this hypothesis, but a few weeks later I still don’t know how to prove it. Do you?

During my postdoc, I worked on the quantum query complexity of ED/LCS with Shalev Ben-David, Rolando La Placa, and John Wright. I was a bit bummed to find out that we got scooped by Buhrman et al., but I know of at least 3 other groups that were also scooped by the same paper, so at least we’re in good company.

At the time, a fellow postdoc from Psychology asked me what I was working on. I resisted the temptation to try to explain the various quantum variants of NC-SETH, and instead told him I was working on “DNA sequencing with quantum computers”. His reaction was priceless. Regardless of what you’re actually working on, try this line during the holidays when your relatives ask you about your work.


Applications will be accepted until the positions are filled, but review of applicants will begin after Dec 15.

Website: https://academicjobsonline.org/ajo/jobs/15578

Email: theory.stanford@gmail.com

**Parking**

There are **35** reserved spaces in **Tresidder Lot (L-39)** on **November 15, 2019**. Our space numbers will be **14-48**; see the map for location. Posted signs will read **Reserved for TOCA-SV CS Workshop**.

**To avoid a citation, vehicle information is required to obtain permission to park in the designated reserved area. Use:** https://stanford.nupark.com/v2/portal/eventregister/5201e8f7-9339-4acf-8416-62f137dbc523 and see the instructions.

*The Chrome and Firefox internet browsers are recommended and most compatible with the system.*

We prove bounds on the generalization error of convolutional networks. The bounds are characterized in terms of the training loss, the number of parameters, the Lipschitz constant of the loss, and the distance of the initial and final weights. The bounds are independent of the number of pixels in the input, as well as the width and height of hidden feature maps. These are the first bounds for DNNs with such guarantees. We present experiments with CIFAR-10, varying hyperparameters of a deep convolutional network, comparing our bounds with practical generalization gaps.

- Dean Doron, Stanford University

Existing techniques for derandomizing algorithms based on circuit lower bounds yield a large polynomial slowdown in running time. We show that assuming exponential lower bounds against nondeterministic circuits, we can convert any randomized algorithm running in time T to a deterministic one running in time nearly T^2. Under complexity-theoretic assumptions, such a slowdown is nearly optimal.

In this talk I will concentrate on the role of error-correcting codes in those techniques. We will see which properties of error-correcting codes are useful for constructing pseudorandomness primitives sufficient for derandomization, where they came short of achieving better slowdown, and how we can overcome that.

Based on joint work with Dana Moshkovitz, Justin Oh, and David Zuckerman.


The call for papers for FORC 2020 is out. The PC chair, Aaron Roth, has done a remarkable job forming a strong program committee, and we are off to a great start. Please consider submitting your research papers; we look forward to seeing many of you at FORC 2020 at the beginning of June at Harvard.
