Source author record

Jon Schneider

Jon Schneider appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Computer Science and Game Theory, Data Structures and Algorithms, math.CO, Discrete Mathematics, Information Theory, math.DS, math.IT, math.PR, Computational Complexity, econ.EM, econ.TH, Information Retrieval

Catalog footprint

What is connected

18 works

13 topics

4 close collaborators


Preprint, 2026, arXiv

Density-Based Algorithms for Corruption-Robust Contextual Search and Convex Optimization

We study the problem of contextual search, a generalization of binary search in higher dimensions, in the adversarial noise model. Let $d$ be the dimension of the problem, $T$ be the time horizon and $C$ be the total amount of adversarial noise in the system. We focus on the $ε$-ball loss and the symmetric loss. For the $ε$-ball loss, we give a tight regret bound of $O(C + d \log(1/ε))$, improving over the $O(d^3 \log(1/ε) \log^2(T) + C \log(T) \log(1/ε))$ bound of Krishnamurthy et al. (Operations Research '23). For the symmetric loss, we give an efficient algorithm with regret $O(C+d \log T)$. To tackle the symmetric loss case, we study the more general setting of Corruption-Robust Convex Optimization with Subgradient feedback, which is of independent interest. Our techniques are a significant departure from prior approaches. Specifically, we keep track of density functions over the candidate target vectors instead of a knowledge set consisting of the candidate target vectors consistent with the feedback obtained.
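As a minimal sketch of the noiseless one-dimensional special case ($d = 1$, $C = 0$), plain binary search on $[0, 1]$ already incurs $ε$-ball loss in at most about $\log_2(1/ε)$ rounds. The function name, the unit interval and the feedback convention below are illustrative assumptions; this is not the density-based algorithm described in the abstract.

```python
def binary_search_regret(target, eps, horizon):
    """Count rounds in which the guess misses the target by more than eps.

    Hypothetical setup: the hidden target lies in [0, 1], each round we guess
    the midpoint of the current interval, and the (uncorrupted) feedback says
    whether the target is at least the guess.
    """
    lo, hi = 0.0, 1.0
    regret = 0
    for _ in range(horizon):
        guess = (lo + hi) / 2
        # eps-ball loss: 1 if the guess is farther than eps from the target.
        if abs(guess - target) > eps:
            regret += 1
        # Noiseless one-bit feedback halves the interval.
        if target >= guess:
            lo = guess
        else:
            hi = guess
    return regret


if __name__ == "__main__":
    print(binary_search_regret(0.3141, 1 / 1024, 1000))
```

Because the interval halves every round, the count stops growing once the interval is narrower than $2ε$, no matter how large the horizon is. Handling $C > 0$ corrupted responses is exactly what this naive scheme cannot do.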

Preprint, 2026, arXiv

Distributional Alignment Games for Answer-Level Fine-Tuning

We focus on the problem of \emph{Answer-Level Fine-Tuning} (ALFT), where the goal is to optimize a language model based on the correctness or properties of its final answers, rather than the specific reasoning traces used to produce them. Directly optimizing answer-level objectives is computationally intractable due to the need to marginalize over the vast space of latent reasoning paths. To overcome this, we propose a general game-theoretical framework that lifts the problem to a \emph{Distributional Alignment Game}. We formulate ALFT as a two-player game between a Policy (the generator) and a Target (an auxiliary distribution). We prove that the Nash Equilibrium of this game corresponds exactly to the solution of the original answer-level optimization problem. This variational perspective transforms the intractable marginalization problem into a tractable projection problem. We demonstrate that this framework unifies recent approaches to diversity and self-improvement (coherence) and provide efficient algorithms compatible with Group Relative Policy Optimization (GRPO), such as Coherence-GRPO, yielding significant complexity gains in mathematical reasoning tasks.

Preprint, 2026, arXiv

Generalized Distributional Alignment Games for Unbiased Answer-Level Fine-Tuning

The Distributional Alignment Game framework provides a powerful variational perspective on Answer-Level Fine-Tuning (ALFT). However, standard algorithms for these games rely on estimating logarithmic rewards from small batches, introducing a systematic bias due to Jensen's inequality that can destabilize training. In this paper, we systematically resolve this structural estimation bias. First, we generalize the alignment game to arbitrary Bregman divergences, showing that for a family of geometries inducing polynomial rewards, we can construct provably exact and unbiased estimators using U-statistics. Second, for the canonical KL divergence game where an exact solution is impossible, we derive a globally robust minimax polynomial estimator that is provably optimal, achieving the fundamental statistical error limit of $Θ(1/K^2)$, which we establish via the Ditzian-Totik theorem. Finally, we synthesize these two approaches to propose a novel Variance-Optimal Augmented Polynomial Optimization Program (AQP) Estimator, proving that by systematically reducing variance, our method achieves not only optimal bias but also provably accelerated game convergence, leading to more efficient and stable training with zero online computational overhead.

Preprint, 2024, arXiv

Optimal cross-learning for contextual bandits with unknown context distributions

We consider the problem of designing contextual bandit algorithms in the "cross-learning" setting of Balseiro et al., where the learner observes the loss for the action they play in all possible contexts, not just the context of the current round. We specifically consider the setting where losses are chosen adversarially and contexts are sampled i.i.d. from an unknown distribution. In this setting, we resolve an open problem of Balseiro et al. by providing an efficient algorithm with a nearly tight (up to logarithmic factors) regret bound of $\widetilde{O}(\sqrt{TK})$, independent of the number of contexts. As a consequence, we obtain the first nearly tight regret bounds for the problems of learning to bid in first-price auctions (under unknown value distributions) and sleeping bandits with a stochastic action set. At the core of our algorithm is a novel technique for coordinating the execution of a learning algorithm over multiple epochs in such a way as to remove correlations between estimation of the unknown distribution and the actions played by the algorithm. This technique may be of independent interest for other learning problems involving estimation of an unknown context distribution.

Preprint, 2022, arXiv

Bernoulli Factories for Flow-Based Polytopes

We construct explicit combinatorial Bernoulli factories for the class of \emph{flow-based polytopes}: integral 0/1-polytopes defined by a set of network flow constraints. This generalizes the results of Niazadeh et al. (who constructed an explicit factory for the specific case of bipartite perfect matchings) and provides novel exact sampling procedures for sampling paths, circulations, and $k$-flows. In the process, we uncover new connections to algebraic combinatorics.

Preprint, 2022, arXiv

Multiparameter Bernoulli Factories

We consider the problem of computing with many coins of unknown bias. We are given sample access to $n$ coins with \emph{unknown} biases $p_1,\dots, p_n$ and are asked to sample from a coin with bias $f(p_1, \dots, p_n)$ for a given function $f:[0,1]^n \rightarrow [0,1]$. We give a complete characterization of the functions $f$ for which this is possible. As a consequence, we show how to extend various combinatorial sampling procedures (most notably, the classic Sampford Sampling for $k$-subsets) to the boundary of the hypercube.
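To make the notion of a Bernoulli factory concrete, here is a minimal sketch of two classical textbook factories: one for $f(p_1, p_2) = p_1 p_2$ and von Neumann's trick for $f(p) = 1/2$. The function names and the convention that a coin is a zero-argument callable returning 0 or 1 are assumptions made for illustration; neither construction is taken from the paper.

```python
import random


def product_coin(coin_a, coin_b):
    """One flip of a coin with bias p_a * p_b, without knowing p_a or p_b."""
    # Flip both coins; the AND of two independent flips is 1 with
    # probability p_a * p_b.
    a = coin_a()
    b = coin_b()
    return a & b


def fair_coin(coin):
    """Von Neumann's trick: an exactly fair flip from a coin with 0 < p < 1."""
    while True:
        first, second = coin(), coin()
        # The outcomes (1, 0) and (0, 1) are equally likely, so keep the
        # first flip of the first unequal pair and discard equal pairs.
        if first != second:
            return first


def make_coin(p, rng):
    """Hypothetical helper: simulate a coin of bias p for experiments."""
    return lambda: 1 if rng.random() < p else 0


if __name__ == "__main__":
    rng = random.Random(0)
    coin = make_coin(0.8, rng)
    flips = [fair_coin(coin) for _ in range(10_000)]
    print(sum(flips) / len(flips))  # empirical frequency, close to 0.5
```

Both procedures use only samples from the input coins, which is the defining constraint of the problem.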

Preprint, 2022, arXiv

Strategizing against Learners in Bayesian Games

We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy, while the other, the optimizer, is a rational utility maximizer. We consider general Bayesian games, where the payoffs of both the optimizer and the learner could depend on the type, which is drawn from a publicly known distribution, but revealed privately to the learner. We address the following questions: (a) what is the bare minimum that the optimizer can guarantee to obtain regardless of the no-regret learning algorithm employed by the learner? (b) are there learning algorithms that cap the optimizer payoff at this minimum? (c) can these algorithms be implemented efficiently? While building this theory of optimizer-learner interactions, we define a new combinatorial notion of regret called polytope swap regret, that could be of independent interest in other settings.

Preprint, 2021, arXiv

Optimal Contextual Pricing and Extensions

In the contextual pricing problem a seller repeatedly obtains products described by an adversarially chosen feature vector in $\mathbb{R}^d$ and only observes the purchasing decisions of a buyer with a fixed but unknown linear valuation over the products. The regret measures the difference between the revenue the seller could have obtained knowing the buyer valuation and what can be obtained by the learning algorithm. We give a poly-time algorithm for contextual pricing with $O(d \log \log T + d \log d)$ regret which matches the $Ω(d \log \log T)$ lower bound up to the $d \log d$ additive factor. If we replace pricing loss by the symmetric loss, we obtain an algorithm with nearly optimal regret of $O(d \log d)$ matching the $Ω(d)$ lower bound up to $\log d$. These algorithms are based on a novel technique of bounding the value of the Steiner polynomial of a convex region at various scales. The Steiner polynomial is a degree $d$ polynomial with intrinsic volumes as the coefficients. We also study a generalized version of contextual search where the hidden linear function over the Euclidean space is replaced by a hidden function $f : \mathcal{X} \rightarrow \mathcal{Y}$ in a certain hypothesis class $\mathcal{H}$. We provide a generic algorithm with $O(d^2)$ regret where $d$ is the covering dimension of this class. This leads in particular to a $\tilde{O}(s^2)$ regret algorithm for linear contextual search if the linear function is guaranteed to be $s$-sparse. Finally we also extend our results to the noisy feedback model, where each round our feedback is flipped with a fixed probability $p < 1/2$.

Preprint, 2021, arXiv

Prior-free Dynamic Mechanism Design With Limited Liability

We study the problem of repeatedly auctioning off an item to one of $k$ bidders where: a) bidders have a per-round individual rationality constraint, b) bidders may leave the mechanism at any point, and c) the bidders' valuations are adversarially chosen (the prior-free setting). Without these constraints, the auctioneer can run a second-price auction to "sell the business" and receive the second highest total value for the entire stream of items. We show that under these constraints, the auctioneer can attain a constant fraction of the "sell the business" benchmark, but no more than $2/e$ of this benchmark. In the course of doing so, we design mechanisms for a single bidder problem of independent interest: how should you repeatedly sell an item to a (per-round IR) buyer with adversarial valuations if you know their total value over all rounds is $V$ but not how their value changes over time? We demonstrate a mechanism that achieves revenue $V/e$ and show that this is tight.

Preprint, 2020, arXiv

Learning Product Rankings Robust to Fake Users

In many online platforms, customers' decisions are substantially influenced by product rankings as most customers only examine a few top-ranked products. Concurrently, such platforms also use the same data corresponding to customers' actions to learn how these products must be ranked or ordered. These interactions in the underlying learning process, however, may incentivize sellers to artificially inflate their position by employing fake users, as exemplified by the emergence of click farms. Motivated by such fraudulent behavior, we study the ranking problem of a platform that faces a mixture of real and fake users who are indistinguishable from one another. We first show that existing learning algorithms, which are optimal in the absence of fake users, may converge to highly sub-optimal rankings under manipulation by fake users. To overcome this deficiency, we develop efficient learning algorithms under two informational environments: in the first setting, the platform is aware of the number of fake users, and in the second setting, it is agnostic to the number of fake users. For both these environments, we prove that our algorithms converge to the optimal ranking, while being robust to the aforementioned fraudulent behavior; we also present worst-case performance guarantees for our methods, and show that they significantly outperform existing algorithms. At a high level, our work employs several novel approaches to guarantee robustness such as: (i) constructing product-ordering graphs that encode the pairwise relationships between products inferred from the customers' actions; and (ii) implementing multiple levels of learning with a judicious amount of bi-directional cross-learning between levels.

Preprint, 2020, arXiv

Reserve Price Optimization for First Price Auctions

The display advertising industry has recently transitioned from second- to first-price auctions as its primary mechanism for ad allocation and pricing. In light of this, publishers need to re-evaluate and optimize their auction parameters, notably reserve prices. In this paper, we propose a gradient-based algorithm to adaptively update and optimize reserve prices based on estimates of bidders' responsiveness to experimental shocks in reserves. Our key innovation is to draw on the inherent structure of the revenue objective in order to reduce the variance of gradient estimates and improve convergence rates in both theory and practice. We show that revenue in a first-price auction can be usefully decomposed into a \emph{demand} component and a \emph{bidding} component, and introduce techniques to reduce the variance of each component. We characterize the bias-variance trade-offs of these techniques and validate the performance of our proposed algorithm through experiments on synthetic data and real display ad auctions data from Google ad exchange.

Preprint, 2016, arXiv

Competitive analysis of the top-K ranking problem

Motivated by applications in recommender systems, web search, social choice and crowdsourcing, we consider the problem of identifying the set of top $K$ items from noisy pairwise comparisons. In our setting, we are non-actively given $r$ pairwise comparisons between each pair of $n$ items, where each comparison has noise constrained by a very general noise model called the strong stochastic transitivity (SST) model. We analyze the competitive ratio of algorithms for the top-$K$ problem. In particular, we present a linear time algorithm for the top-$K$ problem which has a competitive ratio of $\tilde{O}(\sqrt{n})$; i.e., to solve any instance of top-$K$, our algorithm needs at most $\tilde{O}(\sqrt{n})$ times as many samples as the best possible algorithm for that instance (in contrast, all previously known algorithms for the top-$K$ problem have competitive ratios of $\tilde{Ω}(n)$ or worse). We further show that this is tight: any algorithm for the top-$K$ problem has competitive ratio at least $\tilde{Ω}(\sqrt{n})$.

Preprint, 2016, arXiv

Condorcet-Consistent and Approximately Strategyproof Tournament Rules

We consider the manipulability of tournament rules for round-robin tournaments of $n$ competitors. Specifically, $n$ competitors are competing for a prize, and a tournament rule $r$ maps the result of all $\binom{n}{2}$ pairwise matches (called a tournament, $T$) to a distribution over winners. Rule $r$ is Condorcet-consistent if whenever $i$ wins all $n-1$ of her matches, $r$ selects $i$ with probability $1$. We consider strategic manipulation of tournaments where player $j$ might throw their match to player $i$ in order to increase the likelihood that one of them wins the tournament. Regardless of the reason why $j$ chooses to do this, the potential for manipulation exists as long as $\Pr[r(T) = i]$ increases by more than $\Pr[r(T) = j]$ decreases. Unfortunately, it is known that every Condorcet-consistent rule is manipulable (Altman and Kleinberg). In this work, we address the question of how manipulable Condorcet-consistent rules must necessarily be, by trying to minimize the difference between the increase in $\Pr[r(T) = i]$ and decrease in $\Pr[r(T) = j]$ for any potential manipulating pair. We show that every Condorcet-consistent rule is in fact $1/3$-manipulable, and that selecting a winner according to a random single elimination bracket is not $α$-manipulable for any $α > 1/3$. We also show that many previously studied tournament formats are all $1/2$-manipulable, and the popular class of Copeland rules (any rule that selects a player with the most wins) are all in fact $1$-manipulable, the worst possible. Finally, we consider extensions to match-fixing among sets of more than two players.
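The manipulation gain defined above can be checked by brute force on a small instance. The sketch below, assuming $n = 4$ players and a uniformly random seeding of a single elimination bracket, enumerates all seedings exactly; the helper names and the specific tournament are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations


def make_tournament(n, results):
    """Build beats[i][j] (True if i beats j) from (winner, loser) pairs."""
    beats = [[False] * n for _ in range(n)]
    for winner, loser in results:
        beats[winner][loser] = True
    return beats


def bracket_winner(beats, order):
    """Winner of a single elimination bracket seeded in the given order.

    Assumes the number of players is a power of two.
    """
    players = list(order)
    while len(players) > 1:
        players = [a if beats[a][b] else b
                   for a, b in zip(players[0::2], players[1::2])]
    return players[0]


def random_bracket_distribution(beats):
    """Exact winner distribution over all seedings (feasible for small n)."""
    n = len(beats)
    orders = list(permutations(range(n)))
    wins = [0] * n
    for order in orders:
        wins[bracket_winner(beats, order)] += 1
    return [Fraction(w, len(orders)) for w in wins]


# Players 0, 1, 2 form a cycle (0 beats 1, 1 beats 2, 2 beats 0);
# player 3 loses to everyone.
before = make_tournament(4, [(0, 1), (1, 2), (2, 0), (0, 3), (1, 3), (2, 3)])
# Player j = 2 throws the match against player i = 0.
after = make_tournament(4, [(0, 1), (1, 2), (0, 2), (0, 3), (1, 3), (2, 3)])

p_before = random_bracket_distribution(before)
p_after = random_bracket_distribution(after)
# Increase in Pr[winner = i] minus decrease in Pr[winner = j].
gain = (p_after[0] - p_before[0]) - (p_before[2] - p_after[2])
print(p_before, p_after, gain)
```

In this instance the pair gains exactly $1/3$, which is consistent with the $1/3$ threshold stated in the abstract; the enumeration is only a sanity check on one tournament, not a proof of the bound.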

Preprint, 2015, arXiv

Information complexity is computable

The information complexity of a function $f$ is the minimum amount of information Alice and Bob need to exchange to compute the function $f$. In this paper we provide an algorithm for approximating the information complexity of an arbitrary function $f$ to within any additive error $α > 0$, thus resolving an open question as to whether information complexity is computable. In the process, we give the first explicit upper bound on the rate of convergence of the information complexity of $f$ when restricted to $b$-bit protocols to the (unrestricted) information complexity of $f$.

Preprint, 2015, arXiv

Tight space-noise tradeoffs in computing the ergodic measure

In this note we obtain tight bounds on the space-complexity of computing the ergodic measure of a low-dimensional discrete-time dynamical system affected by Gaussian noise. If the scale of the noise is $\varepsilon$, and the function describing the evolution of the system is not by itself a source of computational complexity, then the density function of the ergodic measure can be approximated within precision $δ$ in space polynomial in $\log 1/\varepsilon+\log\log 1/δ$. We also show that this bound is tight up to polynomial factors. In the course of showing the above, we prove a result of independent interest in space-bounded computation: that it is possible to exponentiate an $n$ by $n$ matrix to an exponentially large power in space polylogarithmic in $n$.

Preprint, 2014, arXiv

Fast Dynamic Pointer Following via Link-Cut Trees

In this paper, we study the problem of fast dynamic pointer following: given a directed graph $G$ where each vertex has outdegree $1$, efficiently support the operations of i) changing the outgoing edge of any vertex, and ii) finding the vertex $k$ vertices "after" a given vertex. We exhibit a solution to this problem based on link-cut trees that requires $O(\lg n)$ time per operation, and prove that this is optimal in the cell-probe complexity model.
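To pin down the two operations, here is a naive baseline sketch of the interface: updates cost $O(1)$ but a query walks $k$ pointers, so it costs $O(k)$. The class and method names are hypothetical, and this is deliberately not the link-cut tree solution from the paper, which achieves $O(\lg n)$ per operation.

```python
class NaivePointerFollowing:
    """Baseline for dynamic pointer following on a functional graph.

    successor[v] is the single outgoing edge of vertex v.
    """

    def __init__(self, successor):
        self.successor = list(successor)

    def set_next(self, v, w):
        """Operation (i): redirect the outgoing edge of v to w."""
        self.successor[v] = w

    def jump(self, v, k):
        """Operation (ii): return the vertex reached after k steps from v."""
        for _ in range(k):
            v = self.successor[v]
        return v


if __name__ == "__main__":
    graph = NaivePointerFollowing([1, 2, 0, 0])  # 0->1, 1->2, 2->0, 3->0
    print(graph.jump(3, 2))
    graph.set_next(0, 3)  # now 0->3, so 3 and 0 form a 2-cycle
    print(graph.jump(3, 2))
```

The baseline makes the difficulty visible: $k$ can be far larger than $n$, and any precomputed shortcut structure is invalidated by a single edge change.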

Preprint, 2012, arXiv

Polynomial sequences of binomial-type arising in graph theory

In this paper, we show that the solution to a large class of "tiling" problems is given by a polynomial sequence of binomial type. More specifically, we show that the number of ways to place a fixed set of polyominos on an $n\times n$ toroidal chessboard such that no two polyominos overlap is eventually a polynomial in $n$, and that certain sets of these polynomials satisfy binomial-type recurrences. We exhibit generalizations of this theorem to higher dimensions and other lattices. Finally, we apply the techniques developed in this paper to resolve an open question about the structure of coefficients of chromatic polynomials of certain grid graphs (namely that they also satisfy a binomial-type recurrence).

Preprint, 2011, arXiv

Enumeration and Quasipolynomiality of Chip-Firing Configurations

In this paper we explore enumeration problems related to the number of reachable configurations in a chip-firing game on a finite connected graph $G$. We define an auxiliary notion of debt-reachability and prove that the number of debt-reachable configurations from an initial configuration with $c$ chips on one vertex is a quasipolynomial in $c$. For the cycle graph $C_n$, we apply these results to compute a near-explicit formula for the number of debt-reachable configurations. We then derive polynomial asymptotic bounds for the number of debt-reachable and reachable configurations, and finally provide evidence for a quasipolynomiality conjecture regarding the number of reachable configurations.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.


Source provenance

Where this author record came from

arXiv, confidence 95%

external id: arxiv:2206.07528:author:3:jon-schneider

Imported May 21, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.02435:author:2:jon-schneider

Imported May 20, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2604.27166:author:2:jon-schneider

Imported May 20, 2026. Synced May 20, 2026.

Close collaborators

Renato Paes Leme (Researcher): 4 works

Mark Braverman (Researcher): 3 works

Mehryar Mohri (Researcher): 3 works

S. Matthew Weinberg (Researcher): 2 works
