Source author record

Will Rosenbaum

Will Rosenbaum appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computational Complexity Data Structures and Algorithms Distributed, Parallel, and Cluster Computing Computer Science and Game Theory Cryptography and Security Discrete Mathematics Machine Learning math.CO math.ST Statistics Theory

Catalog footprint

What is connected

11works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Bias Reduction for Sum Estimation

In classical statistics and distribution testing, it is often assumed that elements can be sampled from some distribution $P$, and that when an element $x$ is sampled, the probability $P$ of sampling $x$ is also known. Recent work in distribution testing has shown that many algorithms are robust in the sense that they still produce correct output if the elements are drawn from any distribution $Q$ that is sufficiently close to $P$. This phenomenon raises interesting questions: under what conditions is a "noisy" distribution $Q$ sufficient, and what is the algorithmic cost of coping with this noise? We investigate these questions for the problem of estimating the sum of a multiset of $N$ real values $x_1, \ldots, x_N$. This problem is well-studied in the statistical literature in the case $P = Q$, where the Hansen-Hurwitz estimator is frequently used. We assume that for some known distribution $P$, values are sampled from a distribution $Q$ that is pointwise close to $P$. For every positive integer $k$ we define an estimator $ζ_k$ for $μ= \sum_i x_i$ whose bias is proportional to $γ^k$ (where our $ζ_1$ reduces to the classical Hansen-Hurwitz estimator). As a special case, we show that if $Q$ is pointwise $γ$-close to uniform and all $x_i \in \{0, 1\}$, for any $ε> 0$, we can estimate $μ$ to within additive error $εN$ using $m = Θ({N^{1-\frac{1}{k}} / ε^{2/k}})$ samples, where $k = \left\lceil (\log ε)/(\log γ)\right\rceil$. We show that this sample complexity is essentially optimal. Our bounds show that the sample complexity need not vary uniformly with the desired error parameter $ε$: for some values of $ε$, perturbations in its value have no asymptotic effect on the sample complexity, while for other values, any decrease in its value results in an asymptotically larger sample complexity.

preprint2022arXiv

Finding a Winning Strategy for Wordle is NP-complete

In this paper, we give a formal definition of the popular word-guessing game Wordle. We show that, in general, determining if a given Wordle instance admits a winning strategy is NP-complete. We also show that given a Wordle instance of size $N$, a winning strategy that uses $g$ guesses in the worst case (if any) can be found in time $N^{O(g)}$.

preprint2022arXiv

Packet Forwarding with a Locally Bursty Adversary

We consider packet forwarding in the adversarial queueing theory (AQT) model introduced by Borodin et al. We introduce a refinement of the AQT $(ρ, σ)$-bounded adversary, which we call a \emph{locally bursty adversary} (LBA) that parameterizes injection patterns jointly by edge utilization and packet origin. For constant ($O(1)$) parameters, the LBA model is strictly more permissive than the $(ρ, σ)$ model. For example, there are injection patterns in the LBA model with constant parameters that can only be realized as $(ρ, σ)$-bounded injection patterns with $ρ+ σ= Ω(n)$ (where $n$ is the network size). We show that the LBA model (unlike the $(ρ, σ)$ model) is closed under packet bundling and discretization operations. Thus, the LBA model allows one to reduce the study of general (uniform) capacity networks and inhomogenous packet sizes to unit capacity networks with homogeneous packets. On the algorithmic side, we focus on information gathering networks -- i.e., networks in which all packets share a common destination, and the union of packet routes forms a tree. We show that the Odd-Even Downhill (OED) forwarding protocol described independently by Dobrev et al.\ and Patt-Shamir and Rosenbaum achieves buffer space usage of $O(\log n)$ against all LBAs with constant parameters. OED is a local protocol, but we show that the upper bound is tight even when compared to centralized protocols. Our lower bound for the LBA model is in contrast to the $(ρ, σ)$-model, where centralized protocols can achieve worst-case buffer space usage $O(1)$ for $ρ, σ= O(1)$, while the $O(\log n)$ upper bound for OED is optimal only for local protocols.

preprint2022arXiv

Training-Time Attacks against k-Nearest Neighbors

Nearest neighbor-based methods are commonly used for classification tasks and as subroutines of other data-analysis methods. An attacker with the capability of inserting their own data points into the training set can manipulate the inferred nearest neighbor structure. We distill this goal to the task of performing a training-set data insertion attack against $k$-Nearest Neighbor classification ($k$NN). We prove that computing an optimal training-time (a.k.a. poisoning) attack against $k$NN classification is NP-Hard, even when $k = 1$ and the attacker can insert only a single data point. We provide an anytime algorithm to perform such an attack, and a greedy algorithm for general $k$ and attacker budget. We provide theoretical bounds and empirically demonstrate the effectiveness and practicality of our methods on synthetic and real-world datasets. Empirically, we find that $k$NN is vulnerable in practice and that dimensionality reduction is an effective defense. We conclude with a discussion of open problems illuminated by our analysis.

preprint2021arXiv

Stable Matchings with Restricted Preferences: Structure and Complexity

It is well known that every stable matching instance $I$ has a rotation poset $R(I)$ that can be computed efficiently and the downsets of $R(I)$ are in one-to-one correspondence with the stable matchings of $I$. Furthermore, for every poset $P$, an instance $I(P)$ can be constructed efficiently so that the rotation poset of $I(P)$ is isomorphic to $P$. In this case, we say that $I(P)$ realizes $P$. Many researchers exploit the rotation poset of an instance to develop fast algorithms or to establish the hardness of stable matching problems. In order to gain a parameterized understanding of the complexity of sampling stable matchings, Bhatnagar et al. [SODA 2008] introduced stable matching instances whose preference lists are restricted but nevertheless model situations that arise in practice. In this paper, we study four such parameterized restrictions; our goal is to characterize the rotation posets that arise from these models: $k$-bounded, $k$-attribute, $(k_1, k_2)$-list, $k$-range. We prove that there is a constant $k$ so that every rotation poset is realized by some instance in the first three models for some fixed constant $k$. We describe efficient algorithms for constructing such instances given the Hasse diagram of a poset. As a consequence, the fundamental problem of counting stable matchings remains $\#$BIS-complete even for these restricted instances. For $k$-range preferences, we show that a poset $P$ is realizable if and only if the Hasse diagram of $P$ has pathwidth bounded by functions of $k$. Using this characterization, we show that the following problems are fixed parameter tractable when parametrized by the range of the instance: exactly counting and uniformly sampling stable matchings, finding median, sex-equal, and balanced stable matchings.

preprint2020arXiv

PALS: Plesiochronous and Locally Synchronous Systems

Consider an arbitrary network of communicating modules on a chip, each requiring a local signal telling it when to execute a computational step. There are three common solutions to generating such a local clock signal: (i) by deriving it from a single, central clock source, (ii) by local, free-running oscillators, or (iii) by handshaking between neighboring modules. Conceptually, each of these solutions is the result of a perceived dichotomy in which (sub)systems are either clocked or fully asynchronous, suggesting that the designer's choice is limited to deciding where to draw the line between synchronous and asynchronous design. In contrast, we take the view that the better question to ask is how synchronous the system can and should be. Based on a distributed clock synchronization algorithm, we present a novel design providing modules with local clocks whose frequency bounds are almost as good as those of corresponding free-running oscillators, yet neighboring modules are guaranteed to have a phase offset substantially smaller than one clock cycle. Concretely, parameters obtained from a 15nm ASIC implementation running at 2GHz yield mathematical worst-case bounds of 30ps on phase offset for a 32x32 node grid network.

preprint2020arXiv

Seeing Far vs. Seeing Wide: Volume Complexity of Local Graph Problems

Consider a graph problem that is locally checkable but not locally solvable: given a solution we can check that it is feasible by verifying all constant-radius neighborhoods, but to find a solution each node needs to explore the input graph at least up to distance $Ω(\log n)$ in order to produce its output. We consider the complexity of such problems from the perspective of volume: how large a subgraph does a node need to see in order to produce its output. We study locally checkable graph problems on bounded-degree graphs. We give a number of constructions that exhibit tradeoffs between deterministic distance, randomized distance, deterministic volume, and randomized volume: - If the deterministic distance is linear, it is also known that randomized distance is near-linear. In contrast, we show that there are problems with linear deterministic volume but only logarithmic randomized volume. - We prove a volume hierarchy theorem for randomized complexity: among problems with linear deterministic volume complexity, there are infinitely many distinct randomized volume complexity classes between $Ω(\log n)$ and $O(n)$. This hierarchy persists even when restricting to problems whose randomized and deterministic distance complexities are $Θ(\log n)$. - Similar hierarchies exist for polynomial distance complexities: for any $k, \ell \in N$ with $k \leq \ell$, there are problems whose randomized and deterministic distance complexities are $Θ(n^{1/\ell})$, randomized volume complexities are $Θ(n^{1/k})$, and whose deterministic volume complexities are $Θ(n)$. Additionally, we consider connections between our volume model and massively parallel computation (MPC). We give a general simulation argument that any volume-efficient algorithm can be transformed into a space-efficient MPC algorithm.

preprint2020arXiv

Simple Counting and Sampling Algorithms for Graphs with Bounded Pathwidth

In this paper, we consider the problem of counting and sampling structures in graphs. We define a class of "edge universal labeling problems"---which include proper $k$-colorings, independent sets, and downsets---and describe simple algorithms for counting and uniformly sampling valid labelings of graphs, assuming a path decomposition is given. Thus, we show that several well-studied counting and sampling problems are fixed parameter tractable (FPT) when parameterized by the pathwidth of the input graph. We discuss connections to counting and sampling problems for distributive lattices and, in particular, we give a new FPT algorithm for exactly counting and uniformly sampling stable matchings.

preprint2015arXiv

Fast distributed almost stable marriages

In their seminal work on the Stable Marriage Problem, Gale and Shapley describe an algorithm which finds a stable matching in $O(n^2)$ communication rounds. Their algorithm has a natural interpretation as a distributed algorithm where each player is represented by a single processor. In this distributed model, Floreen, Kaski, Polishchuk, and Suomela recently showed that for bounded preference lists, terminating the Gale-Shapley algorithm after a constant number of rounds results in an almost stable matching. In this paper, we describe a new deterministic distributed algorithm which finds an almost stable matching in $O(\log^5 n)$ communication rounds for arbitrary preferences. We also present a faster randomized variant which requires $O(\log^2 n)$ rounds. This run-time can be improved to $O(1)$ rounds for "almost regular" (and in particular complete) preferences. To our knowledge, these are the first sub-polynomial round distributed algorithms for any variant of the stable marriage problem with unbounded preferences.

preprint2014arXiv

It's Not Easy Being Three: The Approximability of Three-Dimensional Stable Matching Problems

In 1976, Knuth asked if the stable marriage problem (SMP) can be generalized to marriages consisting of 3 genders. In 1988, Alkan showed that the natural generalization of SMP to 3 genders ($3$GSM) need not admit a stable marriage. Three years later, Ng and Hirschberg proved that it is NP-complete to determine if given preferences admit a stable marriage. They further prove an analogous result for the $3$ person stable assignment ($3$PSA) problem. In light of Ng and Hirschberg's NP-hardness result for $3$GSM and $3$PSA, we initiate the study of approximate versions of these problems. In particular, we describe two optimization variants of $3$GSM and $3$PSA: maximally stable marriage/matching (MSM) and maximum stable submarriage/submatching (MSS). We show that both variants are NP-hard to approximate within some fixed constant factor. Conversely, we describe a simple polynomial time algorithm which computes constant factor approximations for the maximally stable marriage and matching problems. Thus both variants of MSM are APX-complete.

preprint2014arXiv

On The Communication Complexity of Finding an (Approximate) Stable Marriage

In this paper, we consider the communication complexity of protocols that compute stable matchings. We work within the context of Gale and Shapley's original stable marriage problem\cite{GS62}: $n$ men and $n$ women each privately hold a total and strict ordering on all of the members of the opposite gender. They wish to collaborate in order to find a stable matching---a pairing of the men and women such that no unmatched pair mutually prefer each other to their assigned partners in the matching. We show that any communication protocol (deterministic, nondeterministic, or randomized) that correctly ouputs a stable matching requires $Ω(n^2)$ bits of communication. Thus, the original algorithm of Gale and Shapley is communication-optimal up to a logarithmic factor. We then introduce a "divorce metric" on the set of all matchings, which allows us to consider approximately stable matchings. We describe an efficient algorithm to compute the "distance to stability" of a given matching. We then show that even under the relaxed requirement that a protocol only yield an approximate stable matching, the $Ω(n^2)$ communication lower bound still holds.

Will Rosenbaum

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Bias Reduction for Sum Estimation

Finding a Winning Strategy for Wordle is NP-complete

Packet Forwarding with a Locally Bursty Adversary

Training-Time Attacks against k-Nearest Neighbors

Stable Matchings with Restricted Preferences: Structure and Complexity

PALS: Plesiochronous and Locally Synchronous Systems

Seeing Far vs. Seeing Wide: Volume Complexity of Local Graph Problems

Simple Counting and Sampling Algorithms for Graphs with Bounded Pathwidth

Fast distributed almost stable marriages

It's Not Easy Being Three: The Approximability of Three-Dimensional Stable Matching Problems

On The Communication Complexity of Finding an (Approximate) Stable Marriage