Researcher profile

Jelani Nelson

Jelani Nelson contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2022arXiv

Differentially Private All-Pairs Shortest Path Distances: Improved Algorithms and Lower Bounds

We study the problem of releasing the weights of all-pair shortest paths in a weighted undirected graph with differential privacy (DP). In this setting, the underlying graph is fixed and two graphs are neighbors if their edge weights differ by at most $1$ in the $\ell_1$-distance. We give an $ε$-DP algorithm with additive error $\tilde{O}(n^{2/3} / ε)$ and an $(ε, δ)$-DP algorithm with additive error $\tilde{O}(\sqrt{n} / ε)$ where $n$ denotes the number of vertices. This positively answers a question of Sealfon (PODS'16), who asked whether a $o(n)$-error algorithm exists. We also show that an additive error of $Ω(n^{1/6})$ is necessary for any sufficiently small $ε, δ> 0$. Finally, we consider a relaxed setting where a multiplicative approximation is allowed. We show that, with a multiplicative approximation factor $k$, %$2k - 1$, the additive error can be reduced to $\tilde{O}\left(n^{1/2 + O(1/k)} / ε\right)$ in the $ε$-DP case and $\tilde{O}(n^{1/3 + O(1/k)} / ε)$ in the $(ε, δ)$-DP case, respectively.

preprint2022arXiv

Estimation of Entropy in Constant Space with Improved Sample Complexity

Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal D$ over an alphabet of size $k$ up to $\pmε$ additive error by streaming over $(k/ε^3) \cdot \text{polylog}(1/ε)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant memory scheme that reduces the sample complexity to $(k/ε^2)\cdot \text{polylog}(1/ε)$. We conjecture that this is optimal up to $\text{polylog}(1/ε)$ factors.

preprint2022arXiv

On the Robustness of CountSketch to Adaptive Inputs

CountSketch is a popular dimensionality reduction technique that maps vectors to a lower dimension using randomized linear measurements. The sketch supports recovering $\ell_2$-heavy hitters of a vector (entries with $v[i]^2 \geq \frac{1}{k}\|\boldsymbol{v}\|^2_2$). We study the robustness of the sketch in adaptive settings where input vectors may depend on the output from prior inputs. Adaptive settings arise in processes with feedback or with adversarial attacks. We show that the classic estimator is not robust, and can be attacked with a number of queries of the order of the sketch size. We propose a robust estimator (for a slightly modified sketch) that allows for quadratic number of queries in the sketch size, which is an improvement factor of $\sqrt{k}$ (for $k$ heavy hitters) over prior work.

preprint2022arXiv

Optimal bounds for approximate counting

Storing a counter incremented $N$ times would naively consume $O(\log N)$ bits of memory. In 1978 Morris described the very first streaming algorithm: the "Morris Counter". His algorithm's space bound is a random variable, and it has been shown to be $O(\log\log N + \log(1/\varepsilon) + \log(1/δ))$ bits in expectation to provide a $(1+\varepsilon)$-approximation with probability $1-δ$ to the counter's value. We provide a new simple algorithm with a simple analysis showing that randomized space $O(\log\log N + \log(1/\varepsilon) + \log\log(1/δ))$ bits suffice for the same task, i.e. an exponentially improved dependence on the inverse failure probability. We then provide a new analysis showing that the original Morris Counter itself, after a minor but necessary tweak, actually also enjoys this same improved upper bound. Lastly, we prove a new lower bound for this task showing optimality of our upper bound. We thus completely resolve the asymptotic space complexity of approximate counting. Furthermore all our constants are explicit, and our lower bound and tightest upper bound differ by a multiplicative factor of at most $3+o(1)$.

preprint2022arXiv

Private Frequency Estimation via Projective Geometry

In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $\varepsilon$-LDP algorithm has communication cost $\lceil\log_2k\rceil$ bits in the private coin setting and $\varepsilon\log_2 e + O(1)$ in the public coin setting, and has computation cost $O(n + k\exp(\varepsilon) \log k)$ for the server to approximately reconstruct the frequency histogram, while achieving the state-of-the-art privacy-utility tradeoff. In many parameter settings used in practice this is a significant improvement over the $ O(n+k^2)$ computation cost that is achieved by the recent PI-RAPPOR algorithm (Feldman and Talwar; 2021). Our empirical evaluation shows a speedup of over 50x over PI-RAPPOR while using approximately 75x less memory for practically relevant parameter settings. In addition, the running time of our algorithm is within an order of magnitude of HadamardResponse (Acharya, Sun, and Zhang; 2019) and RecursiveHadamardResponse (Chen, Kairouz, and Ozgur; 2020) which have significantly worse reconstruction error. The error of our algorithm essentially matches that of the communication- and time-inefficient but utility-optimal SubsetSelection (SS) algorithm (Ye and Barg; 2017). Our new algorithm is based on using Projective Planes over a finite field to define a small collection of sets that are close to being pairwise independent and a dynamic programming algorithm for approximate histogram reconstruction on the server side. We also give an extension of PGR, which we call HybridProjectiveGeometryResponse, that allows trading off computation time with utility smoothly.

preprint2022arXiv

Randomized Algorithms for Scientific Computing (RASC)

Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of that workshop, "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.

preprint2022arXiv

Tricking the Hashing Trick: A Tight Lower Bound on the Robustness of CountSketch to Adaptive Inputs

CountSketch and Feature Hashing (the "hashing trick") are popular randomized dimensionality reduction methods that support recovery of $\ell_2$-heavy hitters (keys $i$ where $v_i^2 > ε\|\boldsymbol{v}\|_2^2$) and approximate inner products. When the inputs are {\em not adaptive} (do not depend on prior outputs), classic estimators applied to a sketch of size $O(\ell/ε)$ are accurate for a number of queries that is exponential in $\ell$. When inputs are adaptive, however, an adversarial input can be constructed after $O(\ell)$ queries with the classic estimator and the best known robust estimator only supports $\tilde{O}(\ell^2)$ queries. In this work we show that this quadratic dependence is in a sense inherent: We design an attack that after $O(\ell^2)$ queries produces an adversarial input vector whose sketch is highly biased. Our attack uses "natural" non-adaptive inputs (only the final adversarial input is chosen adaptively) and universally applies with any correct estimator, including one that is unknown to the attacker. In that, we expose inherent vulnerability of this fundamental method.

preprint2020arXiv

Margin-Based Generalization Lower Bounds for Boosted Classifiers

Boosting is one of the most successful ideas in machine learning. The most well-accepted explanations for the low generalization error of boosting algorithms such as AdaBoost stem from margin theory. The study of margins in the context of boosting algorithms was initiated by Schapire, Freund, Bartlett and Lee (1998) and has inspired numerous boosting algorithms and generalization bounds. To date, the strongest known generalization (upper bound) is the $k$th margin bound of Gao and Zhou (2013). Despite the numerous generalization upper bounds that have been proved over the last two decades, nothing is known about the tightness of these bounds. In this paper, we give the first margin-based lower bounds on the generalization error of boosted classifiers. Our lower bounds nearly match the $k$th margin bound and thus almost settle the generalization performance of boosted classifiers in terms of margins.

preprint2010arXiv

Bounded Independence Fools Degree-2 Threshold Functions

Let x be a random vector coming from any k-wise independent distribution over {-1,1}^n. For an n-variate degree-2 polynomial p, we prove that E[sgn(p(x))] is determined up to an additive epsilon for k = poly(1/epsilon). This answers an open question of Diakonikolas et al. (FOCS 2009). Using standard constructions of k-wise independent distributions, we obtain a broad class of explicit generators that epsilon-fool the class of degree-2 threshold functions with seed length log(n)*poly(1/epsilon). Our approach is quite robust: it easily extends to yield that the intersection of any constant number of degree-2 threshold functions is epsilon-fooled by poly(1/epsilon)-wise independence. Our results also hold if the entries of x are k-wise independent standard normals, implying for example that bounded independence derandomizes the Goemans-Williamson hyperplane rounding scheme. To achieve our results, we introduce a technique we dub multivariate FT-mollification, a generalization of the univariate form introduced by Kane et al. (SODA 2010) in the context of streaming algorithms. Along the way we prove a generalized hypercontractive inequality for quadratic forms which takes the operator norm of the associated matrix into account. These techniques may be of independent interest.