Source author record

Max Hopkins

Max Hopkins appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, Computational Geometry, astro-ph.CO, astro-ph.IM, Computational Complexity, Computer Science and Game Theory, Data Structures and Algorithms, Discrete Mathematics, math.CO, Multiagent Systems, physics.data-an

Catalog footprint

What is connected

7 works

11 topics

4 close collaborators

preprint · 2022 · arXiv

Explicit Lower Bounds Against $Ω(n)$-Rounds of Sum-of-Squares

We construct an explicit family of 3-XOR instances hard for $Ω(n)$-levels of the Sum-of-Squares (SoS) semi-definite programming hierarchy. Not only is this the first explicit construction to beat brute force search (beyond low-order improvements (Tulsiani 2021, Pratt 2021)), combined with standard gap amplification techniques it also matches the (optimal) hardness of random instances up to imperfect completeness (Grigoriev TCS 2001, Schoenebeck FOCS 2008). Our result is based on a new form of small-set high dimensional expansion (SS-HDX) inspired by recent breakthroughs in locally testable and quantum LDPC codes. Adapting the recent framework of Dinur, Filmus, Harsha, and Tulsiani (ITCS 2021) for SoS lower bounds from the Ramanujan complex to this setting, we show any (bounded-degree) SS-HDX can be transformed into a highly unsatisfiable 3-XOR instance that cannot be refuted by $Ω(n)$-levels of SoS. We then show Leverrier and Zémor's (arXiv 2022) recent qLDPC construction gives the desired explicit family of bounded-degree SS-HDX. Incidentally, this gives the strongest known form of bi-directional high dimensional expansion to date.
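For readers unfamiliar with the objects in this abstract: a 3-XOR instance is a list of constraints of the form x_i XOR x_j XOR x_k = b over Boolean variables, and its value is the largest fraction of constraints that any single assignment satisfies. The sketch below computes that value by exhaustive search on a toy instance. The function names and the example constraints are hypothetical and not taken from the paper; the exponential-time search illustrates the brute-force baseline the abstract mentions, not the Sum-of-Squares hierarchy.

```python
from itertools import product

# A constraint (i, j, k, b) encodes x_i XOR x_j XOR x_k == b.
def satisfied_fraction(constraints, assignment):
    """Fraction of 3-XOR constraints satisfied by a 0/1 assignment."""
    hits = sum(
        1 for i, j, k, b in constraints
        if assignment[i] ^ assignment[j] ^ assignment[k] == b
    )
    return hits / len(constraints)

def max_xor_value(num_vars, constraints):
    """Best satisfied fraction over all 2^num_vars assignments (brute force)."""
    return max(
        satisfied_fraction(constraints, assignment)
        for assignment in product((0, 1), repeat=num_vars)
    )

# Hypothetical toy instance: the first two constraints use the same triple
# with opposite right-hand sides, so no assignment satisfies both.
toy_constraints = [(0, 1, 2, 0), (0, 1, 2, 1), (1, 2, 3, 1), (0, 2, 3, 0)]
toy_value = max_xor_value(4, toy_constraints)
```

An instance is "highly unsatisfiable" when this value is far below 1; the search above takes time exponential in the number of variables, which is why it is only usable on toy inputs.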

preprint · 2022 · arXiv

Sampling Equilibria: Fast No-Regret Learning in Structured Games

Learning and equilibrium computation in games are fundamental problems across computer science and economics, with applications ranging from politics to machine learning. Much of the work in this area revolves around a simple algorithm termed randomized weighted majority (RWM), also known as "Hedge" or "Multiplicative Weights Update," which is well known to achieve statistically optimal rates in adversarial settings (Littlestone and Warmuth '94, Freund and Schapire '99). Unfortunately, RWM comes with an inherent computational barrier: it requires maintaining and sampling from a distribution over all possible actions. In typical settings of interest the action space is exponentially large, seemingly rendering RWM useless in practice. In this work, we refute this notion for a broad variety of structured games, showing it is possible to efficiently (approximately) sample the action space in RWM in polylogarithmic time. This gives the first efficient no-regret algorithms for problems such as the (discrete) Colonel Blotto game, matroid congestion, matroid security, and basic dueling games. As an immediate corollary, we give a polylogarithmic time meta-algorithm to compute approximate Nash Equilibria for these games that is exponentially faster than prior methods in several important settings. Further, our algorithm is the first to efficiently compute equilibria for more involved variants of these games with general sums, more than two players, and, for Colonel Blotto, multiple resource types.
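To make the computational barrier described in this abstract concrete, here is a minimal sketch of textbook randomized weighted majority (Hedge) over an explicitly enumerated action set. It is not the paper's sampling algorithm: it stores one weight per action, which is exactly the cost that becomes prohibitive when the action space is exponentially large. The function name `hedge`, the learning rate, and the toy losses are assumptions chosen for illustration.

```python
import math
import random

def hedge(loss_rounds, learning_rate, rng=None):
    """Run randomized weighted majority over an explicit list of actions.

    loss_rounds: list of per-round loss vectors with entries in [0, 1].
    Returns (sampled action index per round, final weights).
    """
    rng = rng or random.Random(0)
    num_actions = len(loss_rounds[0])
    weights = [1.0] * num_actions
    choices = []
    for losses in loss_rounds:
        # Sample an action with probability proportional to its weight.
        choices.append(rng.choices(range(num_actions), weights=weights)[0])
        # Multiplicative update: exponentially down-weight lossy actions.
        weights = [
            w * math.exp(-learning_rate * loss)
            for w, loss in zip(weights, losses)
        ]
    return choices, weights

# Hypothetical toy data: action 2 always has loss 0, the others always loss 1.
rounds = [[1.0, 1.0, 0.0] for _ in range(50)]
choices, weights = hedge(rounds, learning_rate=0.5)
```

Each round costs time linear in the number of actions here, both for sampling and for the update; the abstract's claim is that for structured games the sampling step can instead be done approximately in polylogarithmic time.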

preprint · 2021 · arXiv

Bounded Memory Active Learning through Enriched Queries

The explosive growth of easily accessible unlabeled data has led to growing interest in active learning, a paradigm in which data-hungry learning algorithms adaptively select informative examples in order to lower prohibitively expensive labeling costs. Unfortunately, in standard worst-case models of learning, the active setting often provides no improvement over non-adaptive algorithms. To combat this, a series of recent works have considered a model in which the learner may ask enriched queries beyond labels. While such models have seen success in drastically lowering label costs, they tend to come at the expense of requiring large amounts of memory. In this work, we study what families of classifiers can be learned in bounded memory. To this end, we introduce a novel streaming variant of enriched-query active learning along with a natural combinatorial parameter called lossless sample compression that is sufficient for learning not only with bounded memory, but in a query-optimal and computationally efficient manner as well. Finally, we give three fundamental examples of classifier families with small, easy-to-compute lossless compression schemes when given access to basic enriched queries: axis-aligned rectangles, decision trees, and halfspaces in two dimensions.

preprint · 2020 · arXiv

A Novel CMB Component Separation Method: Hierarchical Generalized Morphological Component Analysis

We present a novel technique for Cosmic Microwave Background (CMB) foreground subtraction based on the framework of blind source separation. Inspired by previous work incorporating local variation to Generalized Morphological Component Analysis (GMCA), we introduce Hierarchical GMCA (HGMCA), a Bayesian hierarchical graphical model for source separation. We test our method on $N_{\rm side}=256$ simulated sky maps that include dust, synchrotron, free-free and anomalous microwave emission, and show that HGMCA reduces foreground contamination by $25\%$ over GMCA in both the regions included and excluded by the Planck UT78 mask, decreases the error in the measurement of the CMB temperature power spectrum to the $0.02-0.03\%$ level at $\ell>200$ (and $<0.26\%$ for all $\ell$), and reduces correlation to all the foregrounds. We find equivalent or improved performance when compared to state-of-the-art Internal Linear Combination (ILC)-type algorithms on these simulations, suggesting that HGMCA may be a competitive alternative to foreground separation techniques previously applied to observed CMB data. Additionally, we show that our performance does not suffer when we perturb model parameters or alter the CMB realization, which suggests that our algorithm generalizes well beyond our simplified simulations. Our results open a new avenue for constructing CMB maps through Bayesian hierarchical analysis.

preprint · 2020 · arXiv

Noise-tolerant, Reliable Active Classification with Comparison Queries

With the explosion of massive, widely available unlabeled data in the past years, finding label- and time-efficient, robust learning algorithms has become ever more important in theory and in practice. We study the paradigm of active learning, in which algorithms with access to large pools of data may adaptively choose what samples to label in the hope of exponentially increasing efficiency. By introducing comparisons, an additional type of query comparing two points, we provide the first time- and query-efficient algorithms for learning non-homogeneous linear separators robust to bounded (Massart) noise. We further provide algorithms for a generalization of the popular Tsybakov low noise condition, and show how comparisons provide a strong reliability guarantee that is often impractical or impossible with only labels: returning a classifier that makes no errors with high probability.

preprint · 2020 · arXiv

Point Location and Active Learning: Learning Halfspaces Almost Optimally

Given a finite set $X \subset \mathbb{R}^d$ and a binary linear classifier $c: \mathbb{R}^d \to \{0,1\}$, how many queries of the form $c(x)$ are required to learn the label of every point in $X$? Known as point location, this problem has inspired over 35 years of research in the pursuit of an optimal algorithm. Building on the prior work of Kane, Lovett, and Moran (ICALP 2018), we provide the first nearly optimal solution, a randomized linear decision tree of depth $\tilde{O}(d\log(|X|))$, improving on the previous best of $\tilde{O}(d^2\log(|X|))$ from Ezra and Sharir (Discrete and Computational Geometry, 2019). As a corollary, we also provide the first nearly optimal algorithm for actively learning halfspaces in the membership query model. En route to these results, we prove a novel characterization of Barthe's Theorem (Inventiones Mathematicae, 1998) of independent interest. In particular, we show that $X$ may be transformed into approximate isotropic position if and only if there exists no $k$-dimensional subspace with more than a $k/d$-fraction of $X$, and provide a similar characterization for exact isotropic position.

preprint · 2020 · arXiv

The Power of Comparisons for Actively Learning Linear Classifiers

In the world of big data, large but costly-to-label datasets dominate many fields. Active learning, a semi-supervised alternative to the standard PAC-learning model, was introduced to explore whether adaptive labeling could learn concepts with exponentially fewer labeled samples. While previous results show that active learning performs no better than its supervised alternative for important concept classes such as linear separators, we show that by adding weak distributional assumptions and allowing comparison queries, active learning requires exponentially fewer samples. Further, we show that these results hold as well for a stronger model of learning called Reliable and Probably Useful (RPU) learning. In this model, our learner is not allowed to make mistakes, but may instead answer "I don't know." While previous negative results showed this model to have intractably large sample complexity for label queries, we show that comparison queries make RPU-learning at worst logarithmically more expensive in both the passive and active regimes.
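One way to see why comparison queries can save labels is the following toy sketch, which assumes noiseless oracles and is not the algorithm from the paper. If a comparison oracle can order points by their signed distance to the hidden separator, then the labels along that order flip at most once, so a binary search with logarithmically many label queries recovers every label without error. The function `label_all_points`, the simulated oracles, and the hidden classifier `signed_distance` are all hypothetical.

```python
from functools import cmp_to_key

def label_all_points(points, compare, query_label):
    """Label every point using comparisons plus O(log n) label queries.

    compare(a, b) returns -1, 0 or 1 according to whether a's signed
    distance to the separator is below, equal to, or above b's.
    query_label(x) returns 1 if x is on the positive side, else 0.
    """
    ordered = sorted(points, key=cmp_to_key(compare))
    label_queries = 0
    lo, hi = 0, len(ordered)  # index of the first positive point is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        label_queries += 1
        if query_label(ordered[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    # Everything at or after index lo is positive, everything before is negative.
    labels = {p: int(i >= lo) for i, p in enumerate(ordered)}
    return labels, label_queries

# Hypothetical hidden classifier, used only to simulate the two oracles.
def signed_distance(p):
    return 2.0 * p[0] - 1.0 * p[1] + 0.5

def compare(a, b):
    da, db = signed_distance(a), signed_distance(b)
    return (da > db) - (da < db)

points = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
labels, used = label_all_points(
    points, compare, query_label=lambda p: int(signed_distance(p) > 0)
)
```

On these 49 points the sketch asks for at most 6 labels. It still spends on the order of n log n comparisons in the sort, so it illustrates only the saving in label queries, not the sample complexity bounds stated in the abstract.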
