Source author record

Bo Waggoner

Bo Waggoner appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, Computer Science and Game Theory, Data Structures and Algorithms, Artificial Intelligence, math.PR, math.ST, q-fin.EC, Statistics Theory

Catalog footprint

14 works, 8 topics, 4 close collaborators

preprint, 2026, arXiv

Trading off Consistency and Dimensionality of Convex Surrogates for the Mode

In multiclass classification over $n$ outcomes, the outcomes must be embedded into the reals with dimension at least $n-1$ in order to design a consistent surrogate loss that leads to the "correct" classification, regardless of the data distribution. For large $n$, such as in information retrieval and structured prediction tasks, optimizing a surrogate in $n-1$ dimensions is often intractable. We investigate ways to trade off surrogate loss dimension, the number of problem instances, and restricting the region of consistency in the simplex for multiclass classification. Following past work, we examine an intuitive embedding procedure that maps outcomes into the vertices of convex polytopes in a low-dimensional surrogate space. We show that full-dimensional subsets of the simplex exist around each point mass distribution for which consistency holds, but also, with less than $n-1$ dimensions, there exist distributions for which a phenomenon called hallucination occurs, which is when the optimal report under the surrogate loss is an outcome with zero probability. Looking towards application, we derive a result to check if consistency holds under a given polytope embedding and low-noise assumption, providing insight into when to use a particular embedding. We provide examples of embedding $n = 2^{d}$ outcomes into the $d$-dimensional unit cube and $n = d!$ outcomes into the $d$-dimensional permutahedron under low-noise assumptions. Finally, we demonstrate that with multiple problem instances, we can learn the mode with $\frac{n}{2}$ dimensions over the whole simplex.
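The cube embedding and the hallucination phenomenon described above can be illustrated with a small sketch. This is not the paper's construction: it assumes a squared-distance surrogate (whose optimal report is the mean of the embedded points) and a nearest-vertex link, both chosen here only for illustration. The function names `cube_embedding`, `mean_report`, and `nearest_vertex_link` are hypothetical.

```python
from itertools import product


def cube_embedding(d):
    """Embed n = 2**d outcomes as the vertices of the unit cube in R^d.

    Outcome i is mapped to the i-th binary vector in lexicographic order.
    """
    return [tuple(float(b) for b in bits) for bits in product((0, 1), repeat=d)]


def mean_report(p, vertices):
    """Optimal report under an ASSUMED squared-distance surrogate.

    Minimizing the expected squared distance to the embedded outcome
    yields the p-weighted mean of the vertices.
    """
    d = len(vertices[0])
    return tuple(
        sum(p[i] * vertices[i][j] for i in range(len(vertices))) for j in range(d)
    )


def nearest_vertex_link(u, vertices):
    """ASSUMED link: index of the vertex closest to u (ties go to the lowest index)."""
    def sqdist(v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(vertices)), key=lambda i: sqdist(vertices[i]))


vertices = cube_embedding(3)  # 8 outcomes embedded in 3 dimensions

# Low-noise case: a point mass on outcome 5 is reported correctly.
point_mass = [0.0] * 8
point_mass[5] = 1.0
assert nearest_vertex_link(mean_report(point_mass, vertices), vertices) == 5

# Mass split evenly over vertices 011, 101, 110 (indices 3, 5, 6).
p = [0.0] * 8
for i in (3, 5, 6):
    p[i] = 1.0 / 3.0
report = nearest_vertex_link(mean_report(p, vertices), vertices)
print(report, p[report])  # the reported outcome has zero probability under p
```

Under these assumptions the second distribution reports vertex 111, an outcome with zero probability, while the point mass is recovered exactly. This matches the abstract's picture of consistency near point masses and hallucination elsewhere.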

preprint, 2022, arXiv

An Embedding Framework for Consistent Polyhedral Surrogates

We formalize and study the natural approach of designing convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction. In this approach, one embeds each of the finitely many predictions (e.g. rankings) as a point in $\mathbb{R}^d$, assigns the original loss values to these points, and "convexifies" the loss in some way to obtain a surrogate. We establish a strong connection between this approach and polyhedral (piecewise-linear convex) surrogate losses. Given any polyhedral loss $L$, we give a construction of a link function through which $L$ is a consistent surrogate for the loss it embeds. Conversely, we show how to construct a consistent polyhedral surrogate for any given discrete loss. Our framework yields succinct proofs of consistency or inconsistency of various polyhedral surrogates in the literature, and for inconsistent surrogates, it further reveals the discrete losses for which these surrogates are consistent. We show some additional structure of embeddings, such as the equivalence of embedding and matching Bayes risks, and the equivalence of various notions of non-redundancy. Using these results, we establish that indirect elicitation, a necessary condition for consistency, is also sufficient when working with polyhedral surrogates.

preprint, 2022, arXiv

An Embedding Framework for the Design and Analysis of Consistent Polyhedral Surrogates

We formalize and study the natural approach of designing convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction. In this approach, one embeds each of the finitely many predictions (e.g. rankings) as a point in $\mathbb{R}^d$, assigns the original loss values to these points, and "convexifies" the loss in some way to obtain a surrogate. We establish a strong connection between this approach and polyhedral (piecewise-linear convex) surrogate losses: every discrete loss is embedded by some polyhedral loss, and every polyhedral loss embeds some discrete loss. Moreover, an embedding gives rise to a consistent link function as well as linear surrogate regret bounds. Our results are constructive, as we illustrate with several examples. In particular, our framework gives succinct proofs of consistency or inconsistency for various polyhedral surrogates in the literature, and for inconsistent surrogates, it further reveals the discrete losses for which these surrogates are consistent. We go on to show additional structure of embeddings, such as the equivalence of embedding and matching Bayes risks, and the equivalence of various notions of non-redundancy. Using these results, we establish that indirect elicitation, a necessary condition for consistency, is also sufficient when working with polyhedral surrogates.

preprint, 2022, arXiv

Balls and Bins -- Simple Concentration Bounds

Concentration bounds are given for throwing balls into bins independently according to a distribution $p$. The probability of a $k$-loaded bin after $m$ balls is shown to be controlled on both sides by $ρ_{m,k} := m \|p\|_k / k$. This gives concentration inequalities for the maximum load as well as for the waiting time until a $k$-loaded bin.
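As a rough illustration of the quantity in this abstract, the sketch below computes $ρ_{m,k} = m \|p\|_k / k$ and estimates by simulation how often a $k$-loaded bin appears. It assumes that "$k$-loaded" means a bin holding at least $k$ balls, which the abstract does not spell out. The function names are hypothetical, and the simulation only gives an empirical point of comparison; it does not reproduce the paper's bounds.

```python
import random
from collections import Counter


def k_norm(p, k):
    """Return the l_k norm of the probability vector p."""
    return sum(x ** k for x in p) ** (1.0 / k)


def rho(p, m, k):
    """Return rho_{m,k} = m * ||p||_k / k, as defined in the abstract."""
    return m * k_norm(p, k) / k


def has_k_loaded_bin(p, m, k, rng):
    """Throw m balls independently according to p.

    Reports whether some bin receives at least k balls, which is our
    ASSUMED reading of "k-loaded".
    """
    bins = rng.choices(range(len(p)), weights=p, k=m)
    return max(Counter(bins).values(), default=0) >= k


def estimate_probability(p, m, k, trials=2000, seed=0):
    """Monte Carlo estimate of the probability of a k-loaded bin after m balls."""
    rng = random.Random(seed)
    hits = sum(has_k_loaded_bin(p, m, k, rng) for _ in range(trials))
    return hits / trials


uniform = [1.0 / 100] * 100
for m in (5, 20, 80):
    print(m, rho(uniform, m, 3), estimate_probability(uniform, m, 3))
```

Printing the two columns side by side for growing $m$ shows how the empirical probability rises with $ρ_{m,k}$.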

preprint, 2022, arXiv

Contracts with Information Acquisition, via Scoring Rules

We consider a principal-agent problem where the agent may privately choose to acquire relevant information prior to taking a hidden action. This model generalizes two special cases: a classic moral hazard setting, and a more recently studied problem of incentivizing information acquisition (IA). We show that all of these problems can be reduced to the design of a proper scoring rule. Under a limited liability condition, we consider the special cases separately and then the general problem. We give novel results for the special case of IA, giving a closed form "pointed polyhedral cone" solution for the general multidimensional problem. We also describe a geometric, scoring-rules based solution to the case of the classic contracts problem. Finally, we give an efficient algorithm for the general problem of Contracts with Information Acquisition.

preprint, 2021, arXiv

Unifying Lower Bounds on Prediction Dimension of Consistent Convex Surrogates

Given a prediction task, understanding when one can and cannot design a consistent convex surrogate loss, particularly a low-dimensional one, is an important and active area of machine learning research. The prediction task may be given as a target loss, as in classification and structured prediction, or simply as a (conditional) statistic of the data, as in risk measure estimation. These two scenarios typically involve different techniques for designing and analyzing surrogate losses. We unify these settings using tools from property elicitation, and give a general lower bound on prediction dimension. Our lower bound tightens existing results in the case of discrete predictions, showing that previous calibration-based bounds can largely be recovered via property elicitation. For continuous estimation, our lower bound resolves an open problem on estimating measures of risk and uncertainty.

preprint, 2020, arXiv

A Smoothed Analysis of Online Lasso for the Sparse Linear Contextual Bandit Problem

We investigate the sparse linear contextual bandit problem where the parameter $θ$ is sparse. To relieve the sampling inefficiency, we utilize the "perturbed adversary" where the context is generated adversarially but with small random non-adaptive perturbations. We prove that the simple online Lasso supports the sparse linear contextual bandit with regret bound $\mathcal{O}(\sqrt{kT\log d})$ even when $d \gg T$, where $k$ and $d$ are the effective and ambient dimensions, respectively. Compared to the recent work from Sivakumar et al. (2020), our analysis does not rely on the precondition processing, adaptive perturbation (the adaptive perturbation violates the i.i.d. perturbation setting) or truncation on the error set. Moreover, the special structures in our results explicitly characterize how the perturbation affects exploration length, and guide the design of the perturbation together with the fundamental performance limit of the perturbation method. Numerical experiments are provided to complement the theoretical analysis.

preprint, 2020, arXiv

Computing Equilibria of Prediction Markets via Persuasion

We study the computation of equilibria in prediction markets in perhaps the most fundamental special case with two players and three trading opportunities. To do so, we show equivalence of prediction market equilibria with those of a simpler signaling game with commitment introduced by Kong and Schoenebeck (2018). We then extend their results by giving computationally efficient algorithms for additional parameter regimes. Our approach leverages a new connection between prediction markets and Bayesian persuasion, which also reveals interesting conceptual insights.

preprint, 2020, arXiv

Equal Opportunity in Online Classification with Partial Feedback

We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates -- introduced as "equal opportunity" in Hardt et al. (2016)) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle efficient algorithm that achieves the upper bound.

preprint, 2020, arXiv

Prophet Inequalities with Linear Correlations and Augmentations

In a classical online decision problem, a decision-maker who is trying to maximize her value inspects a sequence of arriving items to learn their values (drawn from known distributions), and decides when to stop the process by taking the current item. The goal is to prove a "prophet inequality": that she can do approximately as well as a prophet with foreknowledge of all the values. In this work, we investigate this problem when the values are allowed to be correlated. Since non-trivial guarantees are impossible for arbitrary correlations, we consider a natural "linear" correlation structure introduced by Bateni et al. [ESA 2015] as a generalization of the common-base value model of Chawla et al. [GEB 2015]. A key challenge is that threshold-based algorithms, which are commonly used for prophet inequalities, no longer guarantee good performance for linear correlations. We relate this roadblock to another "augmentations" challenge that might be of independent interest: many existing prophet inequality algorithms are not robust to slight increases in the values of the arriving items. We leverage this intuition to prove bounds (matching up to constant factors) that decay gracefully with the amount of correlation of the arriving items. We extend these results to the case of selecting multiple items by designing a new $(1+o(1))$ approximation ratio algorithm that is robust to augmentations.

preprint, 2016, arXiv

Descending Price Optimally Coordinates Search

Investigating potential purchases is often a substantial investment under uncertainty. Standard market designs, such as simultaneous or English auctions, compound this with uncertainty about the price a bidder will have to pay in order to win. As a result they tend to confuse the process of search both by leading to wasteful information acquisition on goods that have already found a good purchaser and by discouraging needed investigations of objects, potentially eliminating all gains from trade. In contrast, we show that the Dutch auction preserves all of its properties from a standard setting without information costs because it guarantees, at the time of information acquisition, a price at which the good can be purchased. Calibrations to start-up acquisition and timber auctions suggest that in practice the social losses through poor search coordination in standard formats are an order of magnitude or two larger than the (negligible) inefficiencies arising from ex-ante bidder asymmetries.

preprint, 2015, arXiv

$\ell_p$ Testing and Learning of Discrete Distributions

The classic problems of testing uniformity of and learning a discrete distribution, given access to independent samples from it, are examined under general $\ell_p$ metrics. The intuitions and results often contrast with the classic $\ell_1$ case. For $p > 1$, we can learn and test with a number of samples that is independent of the support size of the distribution: With an $\ell_p$ tolerance $ε$, $O(\max\{ \sqrt{1/ε^q}, 1/ε^2 \})$ samples suffice for testing uniformity and $O(\max\{ 1/ε^q, 1/ε^2\})$ samples suffice for learning, where $q=p/(p-1)$ is the conjugate of $p$. As this parallels the intuition that $O(\sqrt{n})$ and $O(n)$ samples suffice for the $\ell_1$ case, it seems that $1/ε^q$ acts as an upper bound on the "apparent" support size. For some $\ell_p$ metrics, uniformity testing becomes easier over larger supports: a 6-sided die requires fewer trials to test for fairness than a 2-sided coin, and a card-shuffler requires fewer trials than the die. In fact, this inverse dependence on support size holds if and only if $p > \frac{4}{3}$. The uniformity testing algorithm simply thresholds the number of "collisions" or "coincidences" and has an optimal sample complexity up to constant factors for all $1 \leq p \leq 2$. Another algorithm gives order-optimal sample complexity for $\ell_{\infty}$ uniformity testing. Meanwhile, the most natural learning algorithm is shown to have order-optimal sample complexity for all $\ell_p$ metrics. The author thanks Clément Canonne for discussions and contributions to this work.

preprint, 2015, arXiv

Low-Cost Learning via Active Data Procurement

We design mechanisms for online procurement of data held by strategic agents for machine learning tasks. The challenge is to use past data to actively price future data and give learning guarantees even when an agent's cost for revealing her data may depend arbitrarily on the data itself. We achieve this goal by showing how to convert a large class of no-regret algorithms into online posted-price and learning mechanisms. Our results in a sense parallel classic sample complexity guarantees, but with the key resource being money rather than quantity of data: With a budget constraint $B$, we give robust risk (predictive error) bounds on the order of $1/\sqrt{B}$. Because we use an active approach, we can often guarantee to do significantly better by leveraging correlations between costs and data. Our algorithms and analysis go through a model of no-regret learning with $T$ arriving pairs (cost, data) and a budget constraint of $B$. Our regret bounds for this model are on the order of $T/\sqrt{B}$ and we give lower bounds on the same order.

preprint, 2013, arXiv

Designing Markets for Daily Deals

Daily deals platforms such as Amazon Local, Google Offers, GroupOn, and LivingSocial have provided a new channel for merchants to directly market to consumers. In order to maximize consumer acquisition and retention, these platforms would like to offer deals that give good value to users. Currently, selecting such deals is done manually; however, the large number of submarkets and localities necessitates an automatic approach to selecting good deals and determining merchant payments. We approach this challenge as a market design problem. We postulate that merchants already have a good idea of the attractiveness of their deal to consumers as well as the amount they are willing to pay to offer their deal. The goal is to design an auction that maximizes a combination of the revenue of the auctioneer (platform), welfare of the bidders (merchants), and the positive externality on a third party (the consumer), despite the asymmetry of information about this consumer benefit. We design auctions that truthfully elicit this information from the merchants and maximize the social welfare objective, and we characterize the consumer welfare functions for which this objective is truthfully implementable. We generalize this characterization to a very broad mechanism-design setting and give examples of other applications.
