Source author record

Li-Yang Tan

Li-Yang Tan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computational Complexity; Data Structures and Algorithms; Machine Learning; Discrete Mathematics; Distributed, Parallel, and Cluster Computing; math.PR

Catalog footprint

What is connected

27 works, 6 topics, 4 close collaborators

preprint, 2022, arXiv

A Query-Optimal Algorithm for Finding Counterfactuals

We design an algorithm for finding counterfactuals with strong theoretical guarantees on its performance. For any monotone model $f : X^d \to \{0,1\}$ and instance $x^\star$, our algorithm makes \[ S(f)^{O(Δ_f(x^\star))}\cdot \log d \] queries to $f$ and returns an {\sl optimal} counterfactual for $x^\star$: a nearest instance $x'$ to $x^\star$ for which $f(x')\ne f(x^\star)$. Here $S(f)$ is the sensitivity of $f$, a discrete analogue of the Lipschitz constant, and $Δ_f(x^\star)$ is the distance from $x^\star$ to its nearest counterfactuals. The previous best known query complexity was $d^{O(Δ_f(x^\star))}$, achievable by brute-force local search. We further prove a lower bound of $S(f)^{Ω(Δ_f(x^\star))} + Ω(\log d)$ on the query complexity of any algorithm, thereby showing that the guarantees of our algorithm are essentially optimal.
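
As a point of reference for the abstract above, here is a minimal sketch of the brute-force baseline it mentions, not of the paper's algorithm. It assumes the domain is $X = \{0,1\}$ with Hamming distance; the function name `brute_force_counterfactual` and the example model `majority` are hypothetical. The search tries all points at distance 1, then 2, and so on, so it makes roughly $d^{O(Δ)}$ queries before it finds a nearest counterfactual.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, ...]


def brute_force_counterfactual(
    f: Callable[[Point], int], x_star: Sequence[int]
) -> Optional[Tuple[Point, int, int]]:
    """Return (counterfactual, distance, queries), or None if f is constant.

    Assumption: inputs are 0/1 vectors and distance is Hamming distance.
    """
    x_star = tuple(x_star)
    d = len(x_star)
    target = f(x_star)
    queries = 1
    # Search Hamming balls of increasing radius, so the first hit is nearest.
    for radius in range(1, d + 1):
        for flips in combinations(range(d), radius):
            y = list(x_star)
            for i in flips:
                y[i] = 1 - y[i]
            candidate = tuple(y)
            queries += 1
            if f(candidate) != target:
                return candidate, radius, queries
    return None


def majority(x: Point) -> int:
    """Hypothetical monotone model used only for the example."""
    return int(sum(x) > len(x) / 2)


near = brute_force_counterfactual(majority, (1, 1, 1, 0, 0))
far = brute_force_counterfactual(majority, (1, 1, 1, 1, 1))
print(near)  # counterfactual at distance 1
print(far)   # counterfactual at distance 3
```

The query counter makes the cost visible: the second call has to exhaust every point at distance 1 and 2 before reaching distance 3.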

preprint, 2022, arXiv

Almost 3-Approximate Correlation Clustering in Constant Rounds

We study parallel algorithms for correlation clustering. Each pair among $n$ objects is labeled as either "similar" or "dissimilar". The goal is to partition the objects into arbitrarily many clusters while minimizing the number of disagreements with the labels. Our main result is an algorithm that for any $ε> 0$ obtains a $(3+ε)$-approximation in $O(1/ε)$ rounds (of models such as massively parallel computation, local, and semi-streaming). This is a culminating point for the rich literature on parallel correlation clustering. On the one hand, the approximation (almost) matches a natural barrier of 3 for combinatorial algorithms. On the other hand, the algorithm's round-complexity is essentially constant. To achieve this result, we introduce a simple $O(1/ε)$-round parallel algorithm. Our main result is to provide an analysis of this algorithm, showing that it achieves a $(3+ε)$-approximation. Our analysis draws on new connections to sublinear-time algorithms. Specifically, it builds on the work of Yoshida, Yamamoto, and Ito [STOC'09] on bounding the "query complexity" of greedy maximal independent set. To our knowledge, this is the first application of this method in analyzing the approximation ratio of any algorithm.
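
To make the objective in the abstract above concrete, the following sketch counts the disagreements of a given clustering. It only evaluates the cost function; it is not the paper's parallel algorithm. The names `disagreements`, `similar`, and `cluster_of` are hypothetical, and every pair not listed as similar is assumed to be labeled dissimilar.

```python
from itertools import combinations
from typing import Iterable, Sequence, Tuple


def disagreements(
    n: int, similar: Iterable[Tuple[int, int]], cluster_of: Sequence[int]
) -> int:
    """Count pairs whose label conflicts with the clustering.

    A similar pair split across clusters, or a dissimilar pair placed in
    the same cluster, counts as one disagreement.
    """
    similar_pairs = {frozenset(p) for p in similar}
    cost = 0
    for u, v in combinations(range(n), 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        labeled_similar = frozenset((u, v)) in similar_pairs
        if same_cluster != labeled_similar:
            cost += 1
    return cost


# Four objects whose similar pairs form a path 0-1-2-3.
path = [(0, 1), (1, 2), (2, 3)]
print(disagreements(4, path, [0, 0, 0, 0]))  # one big cluster
print(disagreements(4, path, [0, 0, 1, 1]))  # two clusters of size two
print(disagreements(4, path, [0, 1, 2, 3]))  # all singletons
```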

preprint, 2022, arXiv

Fooling Gaussian PTFs via Local Hyperconcentration

We give a pseudorandom generator that fools degree-$d$ polynomial threshold functions over $n$-dimensional Gaussian space with seed length $\mathrm{poly}(d)\cdot \log n$. All previous generators had a seed length with at least a $2^d$ dependence on $d$. The key new ingredient is a Local Hyperconcentration Theorem, which shows that every degree-$d$ Gaussian polynomial is hyperconcentrated almost everywhere at scale $d^{-O(1)}$.

preprint, 2022, arXiv

Multitask Learning via Shared Features: Algorithms and Hardness

We investigate the computational efficiency of multitask learning of Boolean functions over the $d$-dimensional hypercube that are related by means of a feature representation of size $k \ll d$ shared across all tasks. We present a polynomial time multitask learning algorithm for the concept class of halfspaces with margin $γ$, which is based on a simultaneous boosting technique and requires only $\textrm{poly}(k/γ)$ samples-per-task and $\textrm{poly}(k\log(d)/γ)$ samples in total. In addition, we prove a computational separation, showing that assuming there exists a concept class that cannot be learned in the attribute-efficient model, we can construct another concept class that can be learned in the attribute-efficient model but cannot be multitask learned efficiently -- multitask learning this concept class either requires super-polynomial time complexity or a much larger total number of samples.

preprint, 2022, arXiv

On the power of adaptivity in statistical adversaries

We study a fundamental question concerning adversarial noise models in statistical problems where the algorithm receives i.i.d. draws from a distribution $\mathcal{D}$. The definitions of these adversaries specify the type of allowable corruptions (noise model) as well as when these corruptions can be made (adaptivity); the latter differentiates between oblivious adversaries that can only corrupt the distribution $\mathcal{D}$ and adaptive adversaries that can have their corruptions depend on the specific sample $S$ that is drawn from $\mathcal{D}$. In this work, we investigate whether oblivious adversaries are effectively equivalent to adaptive adversaries, across all noise models studied in the literature. Specifically, can the behavior of an algorithm $\mathcal{A}$ in the presence of oblivious adversaries always be well-approximated by that of an algorithm $\mathcal{A}'$ in the presence of adaptive adversaries? Our first result shows that this is indeed the case for the broad class of statistical query algorithms, under all reasonable noise models. We then show that in the specific case of additive noise, this equivalence holds for all algorithms. Finally, we map out an approach towards proving this statement in its fullest generality, for all algorithms and under all reasonable noise models.

preprint, 2022, arXiv

Open Problem: Properly learning decision trees in polynomial time?

The authors recently gave an $n^{O(\log\log n)}$ time membership query algorithm for properly learning decision trees under the uniform distribution (Blanc et al., 2021). The previous fastest algorithm for this problem ran in $n^{O(\log n)}$ time, a consequence of Ehrenfeucht and Haussler (1989)'s classic algorithm for the distribution-free setting. In this article we highlight the natural open problem of obtaining a polynomial-time algorithm, discuss possible avenues towards obtaining it, and state intermediate milestones that we believe are of independent interest.

preprint, 2022, arXiv

Popular decision tree algorithms are provably noise tolerant

Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. Our guarantees hold under the strongest noise model of nasty noise, and we provide near-matching upper and lower bounds on the allowable noise rate. We further show that these algorithms, which are simple and have long been central to everyday machine learning, enjoy provable guarantees in the noisy setting that are unmatched by existing algorithms in the theoretical literature on decision tree learning. Taken together, our results add to an ongoing line of research that seeks to place the empirical success of these practical decision tree algorithms on firm theoretical footing.

preprint, 2022, arXiv

Reconstructing decision trees

We give the first {\sl reconstruction algorithm} for decision trees: given queries to a function $f$ that is $\mathrm{opt}$-close to a size-$s$ decision tree, our algorithm provides query access to a decision tree $T$ where: $\circ$ $T$ has size $S = s^{O((\log s)^2/\varepsilon^3)}$; $\circ$ $\mathrm{dist}(f,T)\le O(\mathrm{opt})+\varepsilon$; $\circ$ every query to $T$ is answered with $\mathrm{poly}((\log s)/\varepsilon)\cdot \log n$ queries to $f$ and in $\mathrm{poly}((\log s)/\varepsilon)\cdot n\log n$ time. This yields a {\sl tolerant tester} that distinguishes functions that are close to size-$s$ decision trees from those that are far from size-$S$ decision trees. The polylogarithmic dependence on $s$ in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide new algorithms for reconstructing and testing these properties.

preprint, 2022, arXiv

The composition complexity of majority

We study the complexity of computing majority as a composition of local functions: \[ \text{Maj}_n = h(g_1,\ldots,g_m), \] where each $g_j :\{0,1\}^{n} \to \{0,1\}$ is an arbitrary function that queries only $k \ll n$ variables and $h : \{0,1\}^m \to \{0,1\}$ is an arbitrary combining function. We prove an optimal lower bound of \[ m \ge Ω\left( \frac{n}{k} \log k \right) \] on the number of functions needed, which is a factor $Ω(\log k)$ larger than the ideal $m = n/k$. We call this factor the composition overhead; previously, no superconstant lower bounds on it were known for majority. Our lower bound recovers, as a corollary and via an entirely different proof, the best known lower bound for bounded-width branching programs for majority (Alon and Maass '86, Babai et al. '90). It is also the first step in a plan that we propose for breaking a longstanding barrier in lower bounds for small-depth boolean circuits. Novel aspects of our proof include sharp bounds on the information lost as computation flows through the inner functions $g_j$, and the bootstrapping of lower bounds for a multi-output function (Hamming weight) into lower bounds for a single-output one (majority).
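
The decomposition $\text{Maj}_n = h(g_1,\ldots,g_m)$ in the abstract above can be illustrated with a simple construction, sketched here as an assumption-laden example rather than as anything taken from the paper. Each block of $k$ variables reports its Hamming weight in binary, one bit per inner function, so $m = (n/k)\cdot\lceil \log_2(k+1)\rceil$. This shows why a single bit per block (the ideal $m = n/k$) is not enough for this approach and where a $\log k$ factor naturally appears. All function names are hypothetical, and $k$ is assumed to divide $n$.

```python
from itertools import product
from typing import List, Sequence, Tuple

Spec = Tuple[Tuple[int, ...], int]  # (indices of the block, bit position)


def build_inner_functions(n: int, k: int) -> List[Spec]:
    """One inner function per (block, bit of the block's Hamming weight)."""
    assert n % k == 0, "sketch assumes k divides n"
    bits = k.bit_length()  # enough bits to write any weight in 0..k
    specs = []
    for start in range(0, n, k):
        block = tuple(range(start, start + k))
        for b in range(bits):
            specs.append((block, b))
    return specs


def g(spec: Spec, x: Sequence[int]) -> int:
    """Inner function: reads only the k variables in its block."""
    block, b = spec
    weight = sum(x[i] for i in block)
    return (weight >> b) & 1


def h(specs: List[Spec], outputs: Sequence[int], n: int) -> int:
    """Combining function: rebuilds the total weight from the reported bits."""
    total = sum(bit << b for (_, b), bit in zip(specs, outputs))
    return int(total > n / 2)


def composed_majority(x: Sequence[int], k: int) -> int:
    specs = build_inner_functions(len(x), k)
    return h(specs, [g(s, x) for s in specs], len(x))


n, k = 6, 3
m = len(build_inner_functions(n, k))
print(m, n // k)  # number of inner functions used vs. the ideal n/k
```

For $n = 6$ and $k = 3$ the sketch uses four inner functions where the ideal count would be two.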

preprint, 2022, arXiv

The Query Complexity of Certification

We study the problem of {\sl certification}: given queries to a function $f : \{0,1\}^n \to \{0,1\}$ with certificate complexity $\le k$ and an input $x^\star$, output a size-$k$ certificate for $f$'s value on $x^\star$. This abstractly models a central problem in explainable machine learning, where we think of $f$ as a blackbox model that we seek to explain the predictions of. For monotone functions, a classic local search algorithm of Angluin accomplishes this task with $n$ queries, which we show is optimal for local search algorithms. Our main result is a new algorithm for certifying monotone functions with $O(k^8 \log n)$ queries, which comes close to matching the information-theoretic lower bound of $Ω(k \log n)$. The design and analysis of our algorithm are based on a new connection to threshold phenomena in monotone functions. We further prove exponential-in-$k$ lower bounds when $f$ is non-monotone, and when $f$ is monotone but the algorithm is only given random examples of $f$. These lower bounds show that assumptions on the structure of $f$ and query access to it are both necessary for the polynomial dependence on $k$ that we achieve.
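
The local search baseline mentioned in the abstract above can be sketched as a greedy procedure. This is a minimal sketch in the spirit of that baseline, written under the assumption that $f$ is monotone; it is not the paper's $O(k^8 \log n)$ algorithm, and the names `greedy_certificate` and `example_f` are hypothetical. For $f(x^\star) = 1$ it lowers 1-coordinates one at a time while the value stays 1; for $f(x^\star) = 0$ it raises 0-coordinates symmetrically. The coordinates that could not be changed form the certificate.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, ...]


def greedy_certificate(
    f: Callable[[Point], int], x_star: Sequence[int]
) -> Tuple[List[int], int, int]:
    """Return (certificate coordinates, f(x_star), number of queries).

    Assumption: f is monotone. Fixing the returned coordinates to their
    values in x_star then forces f to take the value f(x_star).
    """
    x = list(x_star)
    value = f(tuple(x))
    queries = 1
    for i in range(len(x)):
        if x[i] != value:
            continue  # by monotonicity this coordinate is never needed
        x[i] = 1 - value  # tentatively relax coordinate i
        queries += 1
        if f(tuple(x)) != value:
            x[i] = value  # the relaxation changed the output: keep i
    certificate = [i for i in range(len(x)) if x[i] == value]
    return certificate, value, queries


def example_f(x: Point) -> int:
    """Hypothetical monotone function: (x0 AND x1) OR x2."""
    return int((x[0] and x[1]) or x[2])


print(greedy_certificate(example_f, (1, 1, 1)))
print(greedy_certificate(example_f, (1, 0, 0)))
```

The sketch spends at most one query per coordinate plus one for $x^\star$ itself, which matches the linear-in-$n$ cost the abstract attributes to local search.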

preprint, 2022, arXiv

Tradeoffs for small-depth Frege proofs

We study the complexity of small-depth Frege proofs and give the first tradeoffs between the size of each line and the number of lines. Existing lower bounds apply to the overall proof size -- the sum of sizes of all lines -- and do not distinguish between these notions of complexity. For depth-$d$ Frege proofs of the Tseitin principle on the $n \times n$ grid where each line is a size-$s$ formula, we prove that $\exp(n/2^{Ω(d\sqrt{\log s})})$ many lines are necessary. This yields new lower bounds on line complexity that are not implied by Håstad's recent $\exp(n^{Ω(1/d)})$ lower bound on the overall proof size. For $s = \mathrm{poly}(n)$, for example, our lower bound remains $\exp(n^{1-o(1)})$ for all $d = o(\sqrt{\log n})$, whereas Håstad's lower bound is $\exp(n^{o(1)})$ once $d = ω_n(1)$. Our main conceptual contribution is the simple observation that techniques for establishing correlation bounds in circuit complexity can be leveraged to establish such tradeoffs in proof complexity.

preprint, 2020, arXiv

New lower bounds for Massively Parallel Computation from query complexity

Roughgarden, Vassilvitskii, and Wang (JACM 18) recently introduced a novel framework for proving lower bounds for Massively Parallel Computation using techniques from boolean function complexity. We extend their framework in two different ways, to capture two common features of Massively Parallel Computation: $\circ$ Adaptivity, where machines can write to and adaptively read from shared memory throughout the execution of the computation. Recent work of Behnezhad et al. (SPAA 19) showed that adaptivity enables significantly improved round complexities for a number of central graph problems. $\circ$ Promise problems, where the algorithm only has to succeed on certain inputs. These inputs may have special structure that is of particular interest, or they may be representative of hard instances of the overall problem. Using this extended framework, we give the first unconditional lower bounds on the complexity of distinguishing whether an input graph is a cycle of length $n$ or two cycles of length $n/2$. This promise problem, 1v2-Cycle, has emerged as a central problem in the study of Massively Parallel Computation. We prove that any adaptive algorithm for the 1v2-Cycle problem with I/O capacity $O(n^{\varepsilon})$ per machine requires $Ω(1/\varepsilon)$ rounds, matching a recent upper bound of Behnezhad et al. In addition to strengthening the connections between Massively Parallel Computation and boolean function complexity, we also develop new machinery to reason about the latter. At the heart of our proofs are optimal lower bounds on the query complexity and approximate certificate complexity of the 1v2-Cycle problem.
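
For readers unfamiliar with the 1v2-Cycle promise problem described above, here is a small sequential sketch of the task itself. It is not a Massively Parallel Computation algorithm and says nothing about round complexity. It assumes the promise holds (the input is either one cycle of length $n$ or two cycles of length $n/2$, with $n/2 \ge 3$); the name `one_or_two_cycles` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def one_or_two_cycles(n: int, edges: Iterable[Tuple[int, int]]) -> int:
    """Return 1 if the graph is a single n-cycle, 2 if it is two n/2-cycles.

    Assumption: the promise holds, so every vertex has exactly two
    distinct neighbours.
    """
    adj: Dict[int, List[int]] = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Walk around the cycle containing vertex 0 and measure its length.
    prev, cur, length = None, 0, 0
    while True:
        a, b = adj[cur]
        nxt = b if a == prev else a
        prev, cur = cur, nxt
        length += 1
        if cur == 0:
            break
    return 1 if length == n else 2


single = [(i, (i + 1) % 6) for i in range(6)]
double = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
print(one_or_two_cycles(6, single), one_or_two_cycles(6, double))
```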

preprint, 2020, arXiv

Provable guarantees for decision tree induction: the agnostic setting

We give strengthened provable guarantees on the performance of widely employed and empirically successful {\sl top-down decision tree learning heuristics}. While prior works have focused on the realizable setting, we consider the more realistic and challenging {\sl agnostic} setting. We show that for all monotone functions $f$ and parameters $s\in \mathbb{N}$, these heuristics construct a decision tree of size $s^{\tilde{O}((\log s)/\varepsilon^2)}$ that achieves error $\le \mathsf{opt}_s + \varepsilon$, where $\mathsf{opt}_s$ denotes the error of the optimal size-$s$ decision tree for $f$. Previously, such a guarantee was not known to be achievable by any algorithm, even one that is not based on top-down heuristics. We complement our algorithmic guarantee with a near-matching $s^{\tilde{Ω}(\log s)}$ lower bound.

preprint, 2020, arXiv

The Power of Many Samples in Query Complexity

The randomized query complexity $R(f)$ of a boolean function $f\colon\{0,1\}^n\to\{0,1\}$ is famously characterized (via Yao's minimax) by the least number of queries needed to distinguish a distribution $D_0$ over $0$-inputs from a distribution $D_1$ over $1$-inputs, maximized over all pairs $(D_0,D_1)$. We ask: Does this task become easier if we allow query access to infinitely many samples from either $D_0$ or $D_1$? We show the answer is no: There exists a hard pair $(D_0,D_1)$ such that distinguishing $D_0^\infty$ from $D_1^\infty$ requires $Θ(R(f))$ many queries. As an application, we show that for any composed function $f\circ g$ we have $R(f\circ g) \geq Ω(\mathrm{fbs}(f)R(g))$ where $\mathrm{fbs}$ denotes fractional block sensitivity.

preprint, 2016, arXiv

Hypercontractive inequalities via SOS, and the Frankl--Rödl graph

Our main result is a formulation and proof of the reverse hypercontractive inequality in the sum-of-squares (SOS) proof system. As a consequence we show that for any constant $0 < γ\leq 1/4$, the SOS/Lasserre SDP hierarchy at degree $4\lceil \frac{1}{4γ}\rceil$ certifies the statement "the maximum independent set in the Frankl--Rödl graph $\mathrm{FR}^{n}_γ$ has fractional size $o(1)$". Here $\mathrm{FR}^{n}_γ = (V,E)$ is the graph with $V = \{0,1\}^n$ and $(x,y) \in E$ whenever $Δ(x,y) = (1-γ)n$ (an even integer). In particular, we show the degree-$4$ SOS algorithm certifies the chromatic number lower bound "$χ(\mathrm{FR}^{n}_{1/4}) = ω(1)$", even though $\mathrm{FR}^{n}_{1/4}$ is the canonical integrality gap instance for which standard SDP relaxations cannot even certify "$χ(\mathrm{FR}^{n}_{1/4}) > 3$". Finally, we also give an SOS proof of (a generalization of) the sharp $(2,q)$-hypercontractive inequality for any even integer $q$.

preprint, 2015, arXiv

An average-case depth hierarchy theorem for Boolean circuits

We prove an average-case depth hierarchy theorem for Boolean circuits over the standard basis of $\mathsf{AND}$, $\mathsf{OR}$, and $\mathsf{NOT}$ gates. Our hierarchy theorem says that for every $d \geq 2$, there is an explicit $n$-variable Boolean function $f$, computed by a linear-size depth-$d$ formula, which is such that any depth-$(d-1)$ circuit that agrees with $f$ on $(1/2 + o_n(1))$ fraction of all inputs must have size $\exp({n^{Ω(1/d)}}).$ This answers an open question posed by Håstad in his Ph.D. thesis. Our average-case depth hierarchy theorem implies that the polynomial hierarchy is infinite relative to a random oracle with probability 1, confirming a conjecture of Håstad, Cai, and Babai. We also use our result to show that there is no "approximate converse" to the results of Linial, Mansour, Nisan and Boppana on the total influence of small-depth circuits, thus answering a question posed by O'Donnell, Kalai, and Hatami. A key ingredient in our proof is a notion of \emph{random projections} which generalize random restrictions.

preprint, 2015, arXiv

An inequality for the Fourier spectrum of parity decision trees

We give a new bound on the sum of the linear Fourier coefficients of a Boolean function in terms of its parity decision tree complexity. This result generalizes an inequality of O'Donnell and Servedio for regular decision trees. We use this bound to obtain the first non-trivial lower bound on the parity decision tree complexity of the recursive majority function.

preprint, 2015, arXiv

Near-optimal small-depth lower bounds for small distance connectivity

We show that any depth-$d$ circuit for determining whether an $n$-node graph has an $s$-to-$t$ path of length at most $k$ must have size $n^{Ω(k^{1/d}/d)}$. The previous best circuit size lower bounds for this problem were $n^{k^{\exp(-O(d))}}$ (due to Beame, Impagliazzo, and Pitassi [BIP98]) and $n^{Ω((\log k)/d)}$ (following from a recent formula size lower bound of Rossman [Ros14]). Our lower bound is quite close to optimal, since a simple construction gives depth-$d$ circuits of size $n^{O(k^{2/d})}$ for this problem (and strengthening our bound even to $n^{k^{Ω(1/d)}}$ would require proving that undirected connectivity is not in $\mathsf{NC^1}.$) Our proof is by reduction to a new lower bound on the size of small-depth circuits computing a skewed variant of the "Sipser functions" that have played an important role in classical circuit lower bounds [Sip83, Yao85, Hås86]. A key ingredient in our proof of the required lower bound for these Sipser-like functions is the use of \emph{random projections}, an extension of random restrictions which were recently employed in [RST15]. Random projections allow us to obtain sharper quantitative bounds while employing simpler arguments, both conceptually and technically, than in the previous works [Ajt89, BPU92, BIP98, Ros14].

preprint, 2014, arXiv

Approximate resilience, monotonicity, and the complexity of agnostic learning

A function $f$ is $d$-resilient if all its Fourier coefficients of degree at most $d$ are zero, i.e., $f$ is uncorrelated with all low-degree parities. We study the notion of $\mathit{approximate}$ $\mathit{resilience}$ of Boolean functions, where we say that $f$ is $α$-approximately $d$-resilient if $f$ is $α$-close to a $[-1,1]$-valued $d$-resilient function in $\ell_1$ distance. We show that approximate resilience essentially characterizes the complexity of agnostic learning of a concept class $C$ over the uniform distribution. Roughly speaking, if all functions in a class $C$ are far from being $d$-resilient then $C$ can be learned agnostically in time $n^{O(d)}$ and conversely, if $C$ contains a function close to being $d$-resilient then agnostic learning of $C$ in the statistical query (SQ) framework of Kearns has complexity of at least $n^{Ω(d)}$. This characterization is based on the duality between $\ell_1$ approximation by degree-$d$ polynomials and approximate $d$-resilience that we establish. In particular, it implies that $\ell_1$ approximation by low-degree polynomials, known to be sufficient for agnostic learning over product distributions, is in fact necessary. Focusing on monotone Boolean functions, we exhibit the existence of near-optimal $α$-approximately $\widetilde{Ω}(α\sqrt{n})$-resilient monotone functions for all $α>0$. Prior to our work, it was conceivable even that every monotone function is $Ω(1)$-far from any $1$-resilient function. Furthermore, we construct simple, explicit monotone functions based on ${\sf Tribes}$ and ${\sf CycleRun}$ that are close to highly resilient functions. Our constructions are based on a fairly general resilience analysis and amplification. These structural results, together with the characterization, imply nearly optimal lower bounds for agnostic learning of monotone juntas.
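
The definition of $d$-resilience that opens the abstract above can be checked directly on small examples. The sketch below computes Fourier coefficients by brute force over $\{-1,1\}^n$ and tests whether all coefficients of degree at most $d$ vanish. It covers only exact resilience, not the approximate notion studied in the paper, and the function names and example functions are hypothetical.

```python
from itertools import combinations, product
from typing import Callable, Sequence, Tuple

Point = Tuple[int, ...]


def fourier_coefficient(
    f: Callable[[Point], float], n: int, subset: Sequence[int]
) -> float:
    """Correlation of f with the parity on `subset`, over uniform {-1,1}^n."""
    total = 0.0
    for x in product((-1, 1), repeat=n):
        chi = 1
        for i in subset:
            chi *= x[i]
        total += f(x) * chi
    return total / 2 ** n


def is_resilient(
    f: Callable[[Point], float], n: int, d: int, tol: float = 1e-12
) -> bool:
    """True if every Fourier coefficient of degree at most d is zero."""
    for size in range(d + 1):
        for subset in combinations(range(n), size):
            if abs(fourier_coefficient(f, n, subset)) > tol:
                return False
    return True


def parity3(x: Point) -> int:
    return x[0] * x[1] * x[2]


def majority3(x: Point) -> int:
    return 1 if sum(x) > 0 else -1


print(is_resilient(parity3, 3, 2), is_resilient(majority3, 3, 1))
```

Parity on three bits is 2-resilient because its only nonzero coefficient has degree 3, while majority on three bits already has nonzero degree-1 coefficients.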

preprint, 2014, arXiv

Boolean function monotonicity testing requires (almost) $n^{1/2}$ non-adaptive queries

We prove a lower bound of $Ω(n^{1/2 - c})$, for all $c>0$, on the query complexity of (two-sided error) non-adaptive algorithms for testing whether an $n$-variable Boolean function is monotone versus constant-far from monotone. This improves a $\tilde{Ω}(n^{1/5})$ lower bound for the same problem that was recently given in [CST14] and is very close to $Ω(n^{1/2})$, which we conjecture is the optimal lower bound for this model.

preprint, 2014, arXiv

Learning circuits with few negations

Monotone Boolean functions, and the monotone Boolean circuits that compute them, have been intensively studied in complexity theory. In this paper we study the structure of Boolean functions in terms of the minimum number of negations in any circuit computing them, a complexity measure that interpolates between monotone functions and the class of all functions. We study this generalization of monotonicity from the vantage point of learning theory, giving near-matching upper and lower bounds on the uniform-distribution learnability of circuits in terms of the number of negations they contain. Our upper bounds are based on a new structural characterization of negation-limited circuits that extends a classical result of A. A. Markov. Our lower bounds, which employ Fourier-analytic tools from hardness amplification, give new results even for circuits with no negations (i.e. monotone functions).

preprint, 2014, arXiv

New algorithms and lower bounds for monotonicity testing

We consider the problem of testing whether an unknown Boolean function $f$ is monotone versus $ε$-far from every monotone function. The two main results of this paper are a new lower bound and a new algorithm for this well-studied problem. Lower bound: We prove an $\tilde{Ω}(n^{1/5})$ lower bound on the query complexity of any non-adaptive two-sided error algorithm for testing whether an unknown Boolean function $f$ is monotone versus constant-far from monotone. This gives an exponential improvement on the previous lower bound of $Ω(\log n)$ due to Fischer et al. [FLN+02]. We show that the same lower bound holds for monotonicity testing of Boolean-valued functions over hypergrid domains $\{1,\ldots,m\}^n$ for all $m\ge 2$. Upper bound: We give an $\tilde{O}(n^{5/6})\text{poly}(1/ε)$-query algorithm that tests whether an unknown Boolean function $f$ is monotone versus $ε$-far from monotone. Our algorithm, which is non-adaptive and makes one-sided error, is a modified version of the algorithm of Chakrabarty and Seshadhri [CS13a], which makes $\tilde{O}(n^{7/8})\text{poly}(1/ε)$ queries.

preprint, 2013, arXiv

A composition theorem for the Fourier Entropy-Influence conjecture

The Fourier Entropy-Influence (FEI) conjecture of Friedgut and Kalai [FK96] seeks to relate two fundamental measures of Boolean function complexity: it states that $H[f] \leq C \cdot \mathrm{Inf}[f]$ holds for every Boolean function $f$, where $H[f]$ denotes the spectral entropy of $f$, $\mathrm{Inf}[f]$ is its total influence, and $C > 0$ is a universal constant. Despite significant interest in the conjecture it has only been shown to hold for a few classes of Boolean functions. Our main result is a composition theorem for the FEI conjecture. We show that if $g_1,\ldots,g_k$ are functions over disjoint sets of variables satisfying the conjecture, and if the Fourier transform of $F$ taken with respect to the product distribution with biases $E[g_1],\ldots,E[g_k]$ satisfies the conjecture, then their composition $F(g_1(x^1),\ldots,g_k(x^k))$ satisfies the conjecture. As an application we show that the FEI conjecture holds for read-once formulas over arbitrary gates of bounded arity, extending a recent result [OWZ11] which proved it for read-once decision trees. Our techniques also yield an explicit function with the largest known ratio of $C \geq 6.278$ between $H[f]$ and $\mathrm{Inf}[f]$, improving on the previous lower bound of 4.615.
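
The two quantities compared by the FEI conjecture can be computed by brute force for small functions. The sketch below uses the standard definitions $H[f] = \sum_S \hat f(S)^2 \log_2(1/\hat f(S)^2)$ and $\mathrm{Inf}[f] = \sum_S |S|\,\hat f(S)^2$ for $f : \{-1,1\}^n \to \{-1,1\}$; the function names and the example function are hypothetical, and nothing here reproduces the paper's composition theorem.

```python
from itertools import combinations, product
from math import log2
from typing import Callable, Dict, Tuple

Point = Tuple[int, ...]


def fourier_spectrum(f: Callable[[Point], int], n: int) -> Dict[Tuple[int, ...], float]:
    """Map each subset S of coordinates to the Fourier coefficient of f on S."""
    points = list(product((-1, 1), repeat=n))
    spectrum = {}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            total = 0
            for x in points:
                chi = 1
                for i in subset:
                    chi *= x[i]
                total += f(x) * chi
            spectrum[subset] = total / len(points)
    return spectrum


def spectral_entropy(spectrum: Dict[Tuple[int, ...], float]) -> float:
    """Shannon entropy (base 2) of the distribution given by squared coefficients."""
    weights = [c * c for c in spectrum.values() if c != 0]
    return sum(w * log2(1 / w) for w in weights)


def total_influence(spectrum: Dict[Tuple[int, ...], float]) -> float:
    """Sum over S of |S| times the squared coefficient on S."""
    return sum(len(s) * c * c for s, c in spectrum.items())


def majority3(x: Point) -> int:
    return 1 if sum(x) > 0 else -1


spec = fourier_spectrum(majority3, 3)
print(spectral_entropy(spec), total_influence(spec))
```

For majority on three bits the four nonzero coefficients each have squared weight $1/4$, so the ratio $H[f]/\mathrm{Inf}[f]$ is $2/1.5$, well below the constants discussed in the abstract.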

preprint, 2012, arXiv

Analysis of Boolean Functions

Scribe notes from the 2012 Barbados Workshop on Computational Complexity. A series of lectures on Analysis of Boolean Functions by Ryan O'Donnell, with a guest lecture by Per Austrin.

preprint, 2012, arXiv

New NP-hardness results for 3-Coloring and 2-to-1 Label Cover

We show that given a 3-colorable graph, it is NP-hard to find a 3-coloring with a $(16/17 + \varepsilon)$ fraction of the edges bichromatic. In a related result, we show that given a satisfiable instance of the 2-to-1 Label Cover problem, it is NP-hard to find a $(23/24 + \varepsilon)$-satisfying assignment.

preprint, 2012, arXiv

On the Distribution of the Fourier Spectrum of Halfspaces

Bourgain showed that any noise stable Boolean function $f$ can be well-approximated by a junta. In this note we give an exponential sharpening of the parameters of Bourgain's result under the additional assumption that $f$ is a halfspace.

preprint, 2010, arXiv

A regularity lemma, and low-weight approximators, for low-degree polynomial threshold functions

We give a "regularity lemma" for degree-$d$ polynomial threshold functions (PTFs) over the Boolean cube $\{-1,1\}^n$. This result shows that every degree-$d$ PTF can be decomposed into a constant number of subfunctions such that almost all of the subfunctions are close to being regular PTFs. Here a "regular" PTF is a PTF $\mathrm{sign}(p(x))$ where the influence of each variable on the polynomial $p(x)$ is a small fraction of the total influence of $p$. As an application of this regularity lemma, we prove that for any constants $d \geq 1$, $\varepsilon > 0$, every degree-$d$ PTF over $n$ variables can be approximated to accuracy $\varepsilon$ by a constant-degree PTF that has integer weights of total magnitude $O(n^d)$. This weight bound is shown to be optimal up to constant factors.
