Testing Juntas (or “Avoiding the Curse of Irrelevant Dimensionality”)
Juntas (or, with more words, “functions of few relevant attributes”) are a central concept in computational learning theory, analysis of Boolean functions, and machine learning. Without trying to motivate this here (see e.g. this blog post of Dick Lipton’s), let us first recap some notions.
A Boolean function $f\colon \{0,1\}^n \to \{0,1\}$ is said to be a $k$-junta (where one should think of $n$ as big, and $k$ as smallish, constant or say at most $O(\log n)$) if there is an unknown set of $k$ indices $i_1 < i_2 < \dots < i_k$ such that $f$ depends only on the variables $x_{i_1},\dots,x_{i_k}$. Put differently, if there exists a Boolean function $g\colon \{0,1\}^k \to \{0,1\}$ such that, for all $x\in\{0,1\}^n$, $f(x) = g(x_{i_1},\dots,x_{i_k})$ ($f$ is technically on $n$ variables, but "morally" on $k$). In particular, any Boolean function is trivially an $n$-junta. Also, because names are fun: a $1$-junta is called a dictator.
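(To make this concrete, here is a tiny Python illustration; the specific function and the choice of relevant coordinates are of course just mine, for the sake of the example.)

```python
import random

n = 100                 # ambient dimension: "big"
relevant = (4, 17, 42)  # the hidden set of k = 3 relevant indices

def g(a, b, c):
    # The "core" function on k = 3 variables.
    return (a & b) ^ c

def f(x):
    # A 3-junta on n = 100 variables: it ignores every coordinate outside `relevant`.
    return g(x[4], x[17], x[42])

x = [random.randint(0, 1) for _ in range(n)]
y = list(x)
y[7] = 1 - y[7]          # flip an irrelevant coordinate...
assert f(x) == f(y)      # ...the value of f never changes
```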
Now, if $f$ turns out to be a $k$-junta for some small value of $k$, then one can hope to do many things more efficiently: e.g., learn $f$; test some interesting property of $f$. With a little bit of luck, now the complexity of all these tasks could be much milder, with $k$ replacing (ish) all the $n$'s, with maybe some $\log n$'s thrown in for good measure.
So testing whether an unknown function is a $k$-junta (given, say, query access to it, or even random samples) could be a quite useful thing to have in our toolbox. Here enters property testing of juntas:

Given parameters $k\geq 1$, $\varepsilon\in(0,1]$, and query access to an unknown Boolean function $f\colon\{0,1\}^n\to\{0,1\}$, distinguish with high probability between the case where $f$ is a $k$-junta and the case where $f$ is at distance at least $\varepsilon$ (in normalized Hamming distance) from any $k$-junta.

(In the above, the main parameter is $n$, then $k$, and $\varepsilon$ comes last — and is often considered as a small constant for the sake of asymptotics. Further, the focus is on the worst-case number of queries made to the unknown function, the query complexity, and not the running time.)
Property testing algorithms come in many flavors: they can be non-adaptive (decide all the queries they will make to $f$ beforehand, based only on their private randomness), adaptive, one-sided (accept juntas with probability $1$ and only err, with small probability, on functions far from being juntas), two-sided; standard (as defined above) or tolerant (accept functions $\varepsilon'$-close to some $k$-junta, and reject those that are $\varepsilon$-far).
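(As a sanity check of what "far from any $k$-junta" means, here is a brute-force computation of the exact distance to the closest $k$-junta; it enumerates all candidate sets of relevant variables and all inputs, so it is exponential in $n$ and only meant for toy sizes. The code and names are mine, purely illustrative.)

```python
from itertools import combinations, product

def dist_to_k_juntas(f, n, k):
    # Exact normalized Hamming distance from f: {0,1}^n -> {0,1} to the closest
    # k-junta, by brute force over all candidate sets S of k relevant variables.
    # For each S, the best junta on S takes, on every assignment to S, the
    # majority value of f over the remaining coordinates.
    points = list(product([0, 1], repeat=n))
    best = 1.0
    for S in combinations(range(n), k):
        buckets = {}
        for x in points:
            buckets.setdefault(tuple(x[i] for i in S), []).append(f(x))
        errors = sum(min(vals.count(0), vals.count(1)) for vals in buckets.values())
        best = min(best, errors / len(points))
    return best

# XOR of three fixed variables is a 3-junta, but is 1/2-far from every 2-junta:
f = lambda x: x[0] ^ x[1] ^ x[2]
print(dist_to_k_juntas(f, n=4, k=3))  # 0.0
print(dist_to_k_juntas(f, n=4, k=2))  # 0.5
```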
So what do we know thus far?
1 Surprise: there is no $n$
Nobody likes to have three parameters, and nobody likes $n$ anyway. The first main insight (and surprising result), due to Fischer, Kindler, Ron, Safra, and Samorodnitsky [FKRSS04] who initiated the study of $k$-junta testing, is that, actually, one can get rid of $n$ altogether!
Namely, they give a tester (actually, a couple) with query complexity $\tilde{O}(k^2)/\varepsilon$, non-adaptive, and one-sided. The main idea underlying this result is the following: if one randomly partitions the $n$ variables into $\mathrm{poly}(k)$ bins, then with high probability the following happens. If $f$ is indeed a $k$-junta, then clearly at most $k$ of these bins will ever contain a relevant variable. Moreover, if $f$ is far from being a $k$-junta, then either more than $k$ bins will contain significantly "influential" variables, or many bins will contain many "relatively influential" variables.
So then, it suffices to estimate the "influence" of each of these bins, and reject if more than $k$ of them have noticeable influence.
Note that I used the word "influence" quite a lot in the previous paragraph, and this for a good reason: the quantity at play here is the influence of a set of variables, a generalization of the standard notion of influence of a variable in Boolean analysis. Defined as
$$\mathrm{Inf}_f(S) = \Pr_{x\sim\{0,1\}^n,\ y\sim\{0,1\}^S}\big[\, f(x)\neq f(x_{\overline{S}}\sqcup y)\,\big]$$
(where $x_{\overline{S}}\sqcup y$ denotes the input agreeing with $x$ outside of $S$ and with $y$ on $S$), this quantity captures the probability that "re-randomizing" the variables in $S$ will change the outcome of the function. Also, it is very simple to estimate given query access to $f$.
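(Here is a short Python sketch of both ingredients: a straightforward Monte Carlo estimator of $\mathrm{Inf}_f(S)$ as written above, and the resulting partition-and-check tester in the spirit of [FKRSS04]. The number of bins, the sample size, and the influence threshold below are placeholders of my own choosing, not the actual constants from the paper.)

```python
import random

def estimate_influence(f, n, S, samples=2000):
    # Monte Carlo estimate of Inf_f(S): the probability that re-randomizing
    # the coordinates in S changes the value of f.
    S = list(S)
    flips = 0
    for _ in range(samples):
        x = [random.randint(0, 1) for _ in range(n)]
        y = list(x)
        for i in S:
            y[i] = random.randint(0, 1)   # re-randomize the coordinates in S
        flips += (f(x) != f(y))
    return flips / samples

def junta_tester_sketch(f, n, k, eps, num_bins=None, samples=2000):
    # Partition-and-check sketch in the spirit of [FKRSS04]: randomly partition
    # the n variables into poly(k) bins, estimate the influence of each bin, and
    # accept iff at most k bins look noticeably influential.
    if num_bins is None:
        num_bins = 4 * k * k            # placeholder poly(k) choice
    bins = [[] for _ in range(num_bins)]
    for i in range(n):
        bins[random.randrange(num_bins)].append(i)
    threshold = eps / (4 * k)           # placeholder "noticeable influence" cutoff
    influential = sum(
        1 for b in bins if b and estimate_influence(f, n, b, samples) > threshold
    )
    return influential <= k
```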
2 Let’s bring that down
After this quite amazing result, Eric Blais [Blais08, Blais09] improved this bound twice: first, establishing a non-adaptive, two-sided upper bound of $\tilde{O}(k^{3/2})/\varepsilon$ queries, which hinges on a clever balancing between two sub-algorithms: one to detect few high-influence variables, and the other to detect many low-influence variables. As before, this algorithm relies on random partitioning, and estimating the influence.
The second algorithm yields an adaptive, one-sided upper bound of $O(k\log k + k/\varepsilon)$ queries—and proceeds by binary search, trying to find and eliminate influential bins one at a time, repeatedly starting with two inputs $x,y\in\{0,1\}^n$ with $f(x)\neq f(y)$ and "moving" from $x$ to $y$, changing all the variables inside a bin at once until the value of $f$ flips.
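(Here is a hedged sketch, in Python, of that binary-search step: given $x$, $y$ with $f(x)\neq f(y)$ and an ordered list of bins, it locates, with a logarithmic number of queries, one bin whose variables are responsible for the flip, and which therefore contains a relevant variable. This is only my illustration of the idea, not the actual algorithm of [Blais09].)

```python
def hybrid(x, y, bins, j):
    # The input that agrees with y on the coordinates of bins[0..j-1]
    # and with x everywhere else.
    z = list(x)
    for b in bins[:j]:
        for i in b:
            z[i] = y[i]
    return z

def find_relevant_bin(f, x, y, bins):
    # Requires f(x) != f(y), and bins partitioning all n coordinates.
    # Binary search over the "hybrid path" from x to y, changing one whole bin
    # at a time, to find a bin containing a relevant variable of f.
    # Uses O(log(len(bins))) queries to f.
    lo, hi = 0, len(bins)
    f_lo = f(hybrid(x, y, bins, lo))          # = f(x)
    assert f_lo != f(hybrid(x, y, bins, hi))  # = f(y), by assumption
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(hybrid(x, y, bins, mid)) != f_lo:
            hi = mid    # the flip occurs while swapping one of bins[lo..mid-1]
        else:
            lo = mid    # the flip occurs while swapping one of bins[mid..hi-1]
    return lo           # bins[lo] contains a relevant variable
```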
These last two upper bounds have remained unchallenged. For a rather good reason: they’re more or less tight, each of them.
Indeed, Chockler and Gutfreund [CG04], and Blais, Brody, and Matulef [BBM12] proved that $\Omega(k)$ queries were necessary, even for adaptive, two-sided algorithms. (This leaves a $\log k$ and an $\varepsilon$ hanging…) Then, after Servedio, Tan, and Wright [STW15] proved a slightly stronger lower bound (for some regime of $\varepsilon$) against non-adaptive testers, a result of Chen, Servedio, Tan, Waingarten, and Xie [CSTWX17] showed that, actually, $\tilde{\Omega}(k^{3/2}/\varepsilon)$ queries were necessary for non-adaptive, two-sided testing of $k$-juntas.
So, well, both of Blais’ testers are optimal. Are we done?
3 The Virtue of Tolerance
Not quite, actually. Up to some slack (see open problems below), we know the landscape of standard testing of $k$-juntas. But what about trying to distinguish between a very slightly noisy version of a $k$-junta, and something very far from being one? I.e., what about tolerant testing?
While all the above algorithms have some (very) small amount of tolerance built in (namely, something like $\mathrm{poly}(\varepsilon/k)$), the best algorithm known for "general" tolerant testing for a while was an $\exp(k/\varepsilon)$-query tolerant tester implied by… the algorithm of [Blais09]. At least, there is no $n$ in there either. But since it is totally conceivable that a tolerant testing algorithm with polynomial (in $k$ and $1/\varepsilon$) query complexity exists (we have no separation ruling it out), this is slightly unsatisfying.
Recently, with Eric Blais, Talya Eden, Amit Levi, and Dana Ron [BCELR18], we provided two different algorithms for tolerant testing of juntas.
The first provides a smooth trade-off between the amount of tolerance guaranteed and the query complexity: to distinguish $\rho\varepsilon$-close vs. $\varepsilon$-far (for a tolerance parameter $\rho\in(0,1)$), the query complexity is $O\!\left(\frac{k\log k}{\varepsilon\,\rho(1-\rho)^k}\right)$. In particular, for $\rho = \Theta(1/k)$ this (slightly) improves on the weak tolerance provided by the known (standard) testers, while having low polynomial query complexity. At the other end of the spectrum, setting $\rho$ to be a constant one gets a tolerant testing algorithm with query complexity $2^{O(k)}\cdot\frac{k\log k}{\varepsilon}$ (so, at least, the $1/\varepsilon$ moved out of the exponent).
The second builds on the properties of the influence function — which is not only directly related to the distance to $k$-junta, but also happens to be monotone, submodular, and generally very nice — to phrase the question as a submodular minimization problem under cardinality constraints. Which turns out to be a drag, since this problem is NP-hard, even to approximate. However, relaxing the cardinality constraint (replacing it with a suitable linear penalization term in the objective) makes the problem tractable, at the price of losing a bit in the solution. Specifically, now one can distinguish, with $\mathrm{poly}(k, 1/\varepsilon)$ queries, functions which are $\varepsilon$-close to some $k$-junta from those which are $C\varepsilon$-far from every $2k$-junta (for some absolute constant $C>1$).
Which almost solves the question, but not quite.
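(Schematically, and glossing over the exact normalizations and error terms, the approach compares the constrained problem to its penalized relaxation, with $\mathrm{Inf}_f(\overline{S})$ serving as a proxy for the distance of $f$ to the juntas on the set $S$ and $\lambda>0$ a suitably chosen penalty parameter; this is my schematic rendering of the idea, not the exact objective from [BCELR18]:)
$$\min_{\substack{S\subseteq[n]\\ |S|\le k}} \mathrm{Inf}_f\big(\overline{S}\big) \qquad\text{versus}\qquad \min_{S\subseteq[n]}\Big(\mathrm{Inf}_f\big(\overline{S}\big)+\lambda\,|S|\Big).$$
The second objective is a submodular set function (submodularity survives complementation and the addition of the modular term $\lambda|S|$), so it can be minimized with polynomially many evaluations.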
4 Intolerance is Easier?
Mentioned earlier was the fact that we do not have any separation between tolerant and standard (non-tolerant) testers for $k$-juntas. That was a bit of a lie: in yet-unpublished work, Levi and Waingarten [LW18] establish the following:
Any (possibly two-sided) non-adaptive tolerant tester for $k$-juntas must have query complexity $\tilde{\Omega}(k^2)$ (for constant proximity parameters).
Since the standard testing version can be done with $\tilde{O}(k^{3/2})/\varepsilon$ queries, this is a polynomial separation between standard and tolerant testing, and the only one I am aware of for a natural class of Boolean functions.
Now, the proof goes through a reduction to a graph testing question, in a model the authors introduce (that of a rejection sampling oracle, which on query $S\subseteq V$ returns $e\cap S$ for an edge $e$ sampled uniformly at random from the unknown graph $G=(V,E)$; and the cost of a query $S$ is its size $|S|$). In more detail, they show a lower bound of $\tilde{\Omega}(n^2)$ on the rejection sampling query cost needed to distinguish between (i) $G$ being a complete balanced bipartite graph with both sides of size $n$, and (ii) $G$ being the disjoint union of two copies of the clique $K_n$.
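(As a toy illustration of the model, here is a small simulation of such a rejection sampling oracle, together with the two graph families above; class and function names are mine.)

```python
import random

class RejectionSamplingOracle:
    # A query is a set S of vertices; the oracle samples an edge {u, v} uniformly
    # at random from the hidden graph and returns {u, v} & S; the query costs |S|.
    def __init__(self, edges):
        self.edges = list(edges)
        self.total_cost = 0

    def query(self, S):
        S = set(S)
        self.total_cost += len(S)
        u, v = random.choice(self.edges)
        return {u, v} & S

def complete_bipartite(n):
    # Complete balanced bipartite graph with sides {0..n-1} and {n..2n-1}.
    return [(u, v) for u in range(n) for v in range(n, 2 * n)]

def two_cliques(n):
    # Disjoint union of two copies of K_n, on the same 2n vertices.
    return ([(u, v) for u in range(n) for v in range(u + 1, n)]
            + [(u, v) for u in range(n, 2 * n) for v in range(u + 1, 2 * n)])

# The two hidden graphs the lower bound asks to distinguish:
oracle = RejectionSamplingOracle(complete_bipartite(50))
print(oracle.query({0, 1, 2, 60}), "cost so far:", oracle.total_cost)
```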
5 Open questions
The above may give the impression that all is done, and that the landscape is fully known (up to some technical details). Sadly (or fortunately), this is far from being true; below, I highlight a few questions I personally find very interesting and appealing. (I may have very bad taste, mind you.)
- The role of adaptivity in standard testing. We have pretty tight bounds (up to some $\log k$'s and $1/\varepsilon$'s) for both adaptive and non-adaptive testing of juntas. What about testing with bounded adaptivity (as introduced in [CG17])? Is there a smooth tradeoff between query complexity and number of rounds of adaptivity, or do things "jump" immediately?
- Tolerant testing. Can we get a $\mathrm{poly}(k,1/\varepsilon)$-query tolerant tester without the relaxation "close to $k$-junta vs. far from $2k$-junta"?
- Tolerant testing. Can we get a tester (even with query complexity exponential in $k$, but independent of $n$) which distinguishes "$\varepsilon$-close vs. $2\varepsilon$-far"? (It looks like all techniques relying on using the influence function, due to its slightly-loose relation to the distance to $k$-junta, have to lose a factor $2$ here.) If not, can we prove a lower bound against "influence-based testers"?
- Standard vs. tolerant testing. Can the separation of Levi and Waingarten be extended to adaptive algorithms?
And finally, one I am really fond of: changing the setting itself. Say a function $f$ (now defined on $\mathbb{R}^n$) is a $k$-junta if there exists an unknown rotation of the space such that, in this unknown basis, $f$ depends on at most $k$ coordinates. (That is, $f$ only depends on a low-dimensional subspace.) Can we test efficiently whether an unknown $f$ is such a $k$-junta?
References.
[BBM12] Blais, Brody, and Matulef. Property testing lower bounds via communication complexity. CCC, 2012.
[BCELR18] Blais, Canonne, Eden, Levi, and Ron. Tolerant Junta Testing and the Connection to Submodular Optimization and Function Isomorphism. SODA, 2018.
[Blais08] Blais. Improved bounds for testing juntas. RANDOM, 2008.
[Blais09] Blais. Testing juntas nearly optimally. STOC, 2009.
[CG04] Chockler and Gutfreund. A lower bound for testing juntas. Information Processing Letters, 2004.
[CG17] Canonne and Gur. An adaptivity hierarchy theorem for property testing. CCC, 2017.
[CSTWX17] Chen, Servedio, Tan, Waingarten, and Xie. Settling the query complexity of non-adaptive junta testing. CCC, 2017.
[FKRSS04] Fischer, Kindler, Ron, Safra, and Samorodnitsky. Testing juntas. Journal of Computer and System Sciences, 2004.
[LW18] Levi and Waingarten. Lower Bounds for Tolerant Junta and Unateness Testing via Rejection Sampling of Graphs. ECCC Report TR18-094, 2018. https://eccc.weizmann.ac.il/report/2018/094/ (see also the short summary on the Property Testing Review blog: https://ptreview.sublinear.info/?p=999)
[STW15] Servedio, Tan, and Wright. Adaptivity helps for testing juntas. CCC, 2015.