Researcher profile

Leonard J. Schulman

Leonard J. Schulman contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
10 works
0 followers
11 topics
4 close collaborators

Published work

10 published item(s)

preprint · 2021 · arXiv

Hadamard Extensions and the Identification of Mixtures of Product Distributions

The Hadamard Extension of a matrix is the matrix consisting of all Hadamard products of subsets of its rows. This construction arises in the context of identifying a mixture of product distributions on binary random variables: full column rank of such extensions is a necessary ingredient of identification algorithms. We provide several results concerning when a Hadamard Extension has full column rank.
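The construction described in this abstract can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name `hadamard_extension`, the convention that the empty subset contributes an all-ones row, and the orientation (rows indexed by binary variables, columns by mixture components) are assumptions made here for the example.

```python
from itertools import combinations

import numpy as np


def hadamard_extension(M):
    """Stack the entrywise (Hadamard) products of every subset of rows of M.

    Assumption for this sketch: the empty subset yields the all-ones row,
    so an n-by-k input produces a 2**n-by-k output.
    """
    M = np.asarray(M, dtype=float)
    n, k = M.shape
    rows = []
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            row = np.ones(k)
            for i in subset:
                row = row * M[i]  # entrywise product with row i
            rows.append(row)
    return np.vstack(rows)


# Hypothetical example: 2 binary variables (rows), 3 mixture components (columns).
M = [[0.1, 0.5, 0.9],
     [0.2, 0.2, 0.7]]
H = hadamard_extension(M)
has_full_column_rank = np.linalg.matrix_rank(H) == H.shape[1]
```

For this particular `M` the extension has 4 rows and rank 3, so it has full column rank. Making two columns identical (two indistinguishable components) drops the rank, which is the kind of degeneracy the full-column-rank condition rules out.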

preprint · 2020 · arXiv

Source Identification for Mixtures of Product Distributions

We give an algorithm for source identification of a mixture of $k$ product distributions on $n$ bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)} n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O'Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).

preprint · 2020 · arXiv

The Sparse Hausdorff Moment Problem, with Application to Topic Models

We consider the problem of identifying, from its first $m$ noisy moments, a probability distribution on $[0,1]$ of support $k<\infty$. This is equivalent to the problem of learning a distribution on $m$ observable binary random variables $X_1,X_2,\dots,X_m$ that are iid conditional on a hidden random variable $U$ taking values in $\{1,2,\dots,k\}$. Our focus is on accomplishing this with $m=2k$, which is the minimum $m$ for which verifying that the source is a $k$-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models. We give an algorithm for identifying a $k$-mixture using samples of $m=2k$ iid binary random variables using a sample of size $\left(1/w_{\min}\right)^2 \cdot\left(1/ζ\right)^{O(k)}$ and post-sampling runtime of only $O(k^{2+o(1)})$ arithmetic operations. Here $w_{\min}$ is the minimum probability of an outcome of $U$, and $ζ$ is the minimum separation between the distinct success probabilities of the $X_i$s. Stated in terms of the moment problem, it suffices to know the moments to additive accuracy $w_{\min}\cdotζ^{O(k)}$. It is known that the sample complexity of any solution to the identification problem must be at least exponential in $k$. Previous results demonstrated either worse sample complexity and worse $O(k^c)$ runtime for some $c$ substantially larger than $2$, or similar sample complexity and much worse $k^{O(k^2)}$ runtime.
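To make the moment problem concrete, the sketch below recovers a $k$-atom distribution from exact moments using the classical Prony method. This is a textbook technique swapped in for illustration, not the algorithm of the paper, and it makes no attempt to handle noisy moments. The function name `prony_identify` and the choice of input (the $2k$ values $\mu_0,\dots,\mu_{2k-1}$, with $\mu_0 = 1$) are assumptions of this example.

```python
import numpy as np


def prony_identify(moments, k):
    """Recover k atoms and weights from exact moments mu_0..mu_{2k-1}.

    Classical Prony method (illustrative only):
      1. Solve a k-by-k Hankel system for the monic polynomial whose
         roots are the atoms.
      2. Find the roots.
      3. Solve a Vandermonde system for the weights.
    """
    mu = np.asarray(moments, dtype=float)
    # Hankel matrix H[j][l] = mu_{j+l}; solve H c = -(mu_k, ..., mu_{2k-1}).
    H = np.array([[mu[j + l] for l in range(k)] for j in range(k)])
    c = np.linalg.solve(H, -mu[k:2 * k])
    # Atoms are the roots of x^k + c_{k-1} x^{k-1} + ... + c_0.
    atoms = np.sort(np.roots(np.concatenate(([1.0], c[::-1]))).real)
    # Weights solve sum_i w_i * atoms_i**j = mu_j for j = 0..k-1.
    V = np.vander(atoms, N=k, increasing=True).T
    weights = np.linalg.solve(V, mu[:k])
    return atoms, weights


# Hypothetical source: atoms 0.2 and 0.7 with weights 0.4 and 0.6.
true_atoms = np.array([0.2, 0.7])
true_weights = np.array([0.4, 0.6])
moments = [float(np.sum(true_weights * true_atoms**j)) for j in range(4)]
atoms, weights = prony_identify(moments, k=2)
```

With exact moments the call recovers the atoms and weights up to floating-point error. The Hankel and Vandermonde systems become ill-conditioned as atoms move closer together, which is one way to see why the required moment accuracy in the abstract depends on the separation $ζ$.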

preprint · 2016 · arXiv

Market Dynamics of Best-Response with Lookahead

One attractive approach to market dynamics is the level $k$ model in which a level $0$ player adopts a very simple response to current conditions, a level $1$ player best-responds to a model in which others take level $0$ actions, and so forth. (This is analogous to $k$-ply exploration of game trees in AI, and to receding-horizon control in control theory.) If players have deterministic mental models with this kind of finite-level response, there is obviously no way their mental models can all be consistent. Nevertheless, there is experimental evidence that people act this way in many situations, motivating the question of what the dynamics of such interactions lead to. We address this question in the setting of Fisher Markets with constant elasticities of substitution (CES) utilities, in the weak gross substitutes (WGS) regime. We show that despite the inconsistency of the mental models, and even if players' models change arbitrarily from round to round, the market converges to its unique equilibrium. (We show this for both synchronous and asynchronous discrete-time updates.) Moreover, the result is computationally feasible in the sense that the convergence rate is linear, i.e., the distance to equilibrium decays exponentially fast. To the best of our knowledge, this is the first result that demonstrates, in Fisher markets, convergence at any rate for dynamics driven by a plausible model of seller incentives. Even for the simple case of (level $0$) best-response dynamics, where we observe that convergence at some rate can be derived from recent results in convex optimization, our result is the first to demonstrate a linear rate of convergence.

preprint · 2015 · arXiv

Analysis of a Classical Matrix Preconditioning Algorithm

We study a classical iterative algorithm for balancing matrices in the $L_\infty$ norm via a scaling transformation. This algorithm, which goes back to Osborne and Parlett & Reinsch in the 1960s, is implemented as a standard preconditioner in many numerical linear algebra packages. Surprisingly, despite its widespread use over several decades, no bounds were known on its rate of convergence. In this paper we prove that, for any irreducible $n\times n$ (real or complex) input matrix $A$, a natural variant of the algorithm converges in $O(n^3\log(nρ/\varepsilon))$ elementary balancing operations, where $ρ$ measures the initial imbalance of $A$ and $\varepsilon$ is the target imbalance of the output matrix. (The imbalance of $A$ is $\max_i |\log(a_i^{\text{out}}/a_i^{\text{in}})|$, where $a_i^{\text{out}},a_i^{\text{in}}$ are the maximum entries in magnitude in the $i$th row and column respectively.) This bound is tight up to the $\log n$ factor. A balancing operation scales the $i$th row and column so that their maximum entries are equal, and requires $O(m/n)$ arithmetic operations on average, where $m$ is the number of non-zero elements in $A$. Thus the running time of the iterative algorithm is $\tilde{O}(n^2m)$. This is the first time bound of any kind on any variant of the Osborne-Parlett-Reinsch algorithm. We also prove a conjecture of Chen that characterizes those matrices for which the limit of the balancing process is independent of the order in which balancing operations are performed.
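The balancing operation and the imbalance measure defined in this abstract can be sketched as follows. This is a hedged illustration only: it uses a simple round-robin sweep, which is not necessarily the variant analyzed in the paper, and it assumes that diagonal entries are excluded from the row and column maxima (a diagonal scaling leaves them unchanged). The names `imbalance`, `balance_index`, and `balance`, and the `max_sweeps` cap, are hypothetical.

```python
import math


def _off_diagonal_maxima(A, i):
    """Largest off-diagonal magnitudes in row i (out) and column i (in)."""
    n = len(A)
    out_max = max(abs(A[i][j]) for j in range(n) if j != i)
    in_max = max(abs(A[j][i]) for j in range(n) if j != i)
    return out_max, in_max


def imbalance(A):
    """max_i |log(a_i^out / a_i^in)|; assumes both maxima are non-zero."""
    worst = 0.0
    for i in range(len(A)):
        out_max, in_max = _off_diagonal_maxima(A, i)
        worst = max(worst, abs(math.log(out_max / in_max)))
    return worst


def balance_index(A, i):
    """Scale row i by d and column i by 1/d (in place) so their maxima match."""
    out_max, in_max = _off_diagonal_maxima(A, i)
    d = math.sqrt(in_max / out_max)
    for j in range(len(A)):
        if j != i:
            A[i][j] *= d
            A[j][i] /= d


def balance(A, eps=1e-6, max_sweeps=1000):
    """Round-robin sweeps until the imbalance is at most eps (or the cap hits)."""
    A = [row[:] for row in A]  # work on a copy
    for _ in range(max_sweeps):
        if imbalance(A) <= eps:
            break
        for i in range(len(A)):
            balance_index(A, i)
    return A


# Hypothetical 2x2 example with a badly scaled off-diagonal pair.
B = balance([[1.0, 100.0], [1.0, 1.0]])
```

Each operation is a diagonal similarity transformation, so the products $a_{ij} a_{ji}$ and the eigenvalues of the matrix are preserved while the row and column scales are brought closer together.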

preprint · 2015 · arXiv

Learning Arbitrary Statistical Mixtures of Discrete Distributions

We study the problem of learning from unlabeled samples very general statistical mixture models on large finite sets. Specifically, the model to be learned, $\vartheta$, is a probability distribution over probability distributions $p$, where each such $p$ is a probability distribution over $[n] = \{1,2,\dots,n\}$. When we sample from $\vartheta$, we do not observe $p$ directly, but only indirectly and in very noisy fashion, by sampling from $[n]$ repeatedly, independently $K$ times from the distribution $p$. The problem is to infer $\vartheta$ to high accuracy in transportation (earthmover) distance. We give the first efficient algorithms for learning this mixture model without making any restricting assumptions on the structure of the distribution $\vartheta$. We bound the quality of the solution as a function of the size of the samples $K$ and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering.

preprint · 2015 · arXiv

The Adversarial Noise Threshold for Distributed Protocols

We consider the problem of implementing distributed protocols, despite adversarial channel errors, on synchronous-messaging networks with arbitrary topology. In our first result we show that any $n$-party $T$-round protocol on an undirected communication network $G$ can be compiled into a robust simulation protocol on a sparse ($\mathcal{O}(n)$ edges) subnetwork so that the simulation tolerates an adversarial error rate of $Ω\left(\frac{1}{n}\right)$; the simulation has a round complexity of $\mathcal{O}\left(\frac{m \log n}{n} T\right)$, where $m$ is the number of edges in $G$. (So the simulation is work-preserving up to a $\log$ factor.) The adversary's error rate is within a constant factor of optimal. Given the error rate, the round complexity blowup is within a factor of $\mathcal{O}(k \log n)$ of optimal, where $k$ is the edge connectivity of $G$. We also determine that the maximum tolerable error rate on directed communication networks is $Θ(1/s)$ where $s$ is the number of edges in a minimum equivalent digraph. Next we investigate adversarial per-edge error rates, where the adversary is given an error budget on each edge of the network. We determine the exact limit for tolerable per-edge error rates on an arbitrary directed graph. However, the construction that approaches this limit has exponential round complexity, so we give another compiler, which transforms $T$-round protocols into $\mathcal{O}(mT)$-round simulations, and prove that for polynomial-query black box compilers, the per-edge error rate tolerated by this last compiler is within a constant factor of optimal.

preprint · 2014 · arXiv

Achieving Target Equilibria in Network Routing Games without Knowing the Latency Functions

The analysis of network routing games typically assumes, right at the onset, precise and detailed information about the latency functions. Such information may, however, be unavailable or difficult to obtain. Moreover, one is often primarily interested in enforcing a desired target flow as the equilibrium by suitably influencing player behavior in the routing game. We ask whether one can achieve target flows as equilibria without knowing the underlying latency functions. Our main result gives a crisp positive answer to this question. We show that, under fairly general settings, one can efficiently compute edge tolls that induce a given target multicommodity flow in a nonatomic routing game using a polynomial number of queries to an oracle that takes candidate tolls as input and returns the resulting equilibrium flow. This result is obtained via a novel application of the ellipsoid method. Our algorithm extends easily to many other settings, such as (i) when certain edges cannot be tolled or there is an upper bound on the total toll paid by a user, and (ii) general nonatomic congestion games. We obtain tighter bounds on the query complexity for series-parallel networks, and single-commodity routing games with linear latency functions, and complement these with a query-complexity lower bound. We also obtain strong positive results for Stackelberg routing to achieve target equilibria in series-parallel graphs. Our results build upon various new techniques that we develop pertaining to the computation of, and connections between, different notions of approximate equilibrium; properties of multicommodity flows and tolls in series-parallel graphs; and sensitivity of equilibrium flow with respect to tolls. Our results demonstrate that one can indeed circumvent the potentially-onerous task of modeling latency functions, and yet obtain meaningful results for the underlying routing game.

preprint · 2013 · arXiv

The Network Improvement Problem for Equilibrium Routing

In routing games, agents pick their routes through a network to minimize their own delay. A primary concern for the network designer in routing games is the average agent delay at equilibrium. A number of methods to control this average delay have received substantial attention, including network tolls, Stackelberg routing, and edge removal. A related approach with arguably greater practical relevance is that of making investments in improvements to the edges of the network, so that, for a given investment budget, the average delay at equilibrium in the improved network is minimized. This problem has received considerable attention in the literature on transportation research and a number of different algorithms have been studied. To our knowledge, none of this work gives guarantees on the output quality of any polynomial-time algorithm. We study a model for this problem introduced in transportation research literature, and present both hardness results and algorithms that obtain nearly optimal performance guarantees.

- We first show that a simple algorithm obtains good approximation guarantees for the problem. Despite its simplicity, we show that for affine delays the approximation ratio of 4/3 obtained by the algorithm cannot be improved.
- To obtain better results, we then consider restricted topologies. For graphs consisting of parallel paths with affine delay functions we give an optimal algorithm. However, for graphs that consist of a series of parallel links, we show the problem is weakly NP-hard.
- Finally, we consider the problem in series-parallel graphs, and give an FPTAS for this case.

Our work thus formalizes the intuition held by transportation researchers that the network improvement problem is hard, and presents topology-dependent algorithms that have provably tight approximation guarantees.