Researcher profile

Salil Vadhan

Salil Vadhan contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
7 topics
4 close collaborators

Published work

5 published items

Preprint, 2023, arXiv

Canonical Noise Distributions and Private Hypothesis Tests

$f$-DP has recently been proposed as a generalization of differential privacy allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of $f$-DP, we propose the concept of a canonical noise distribution (CND), the first mechanism designed for an arbitrary $f$-DP guarantee. The notion of CND captures whether an additive privacy mechanism perfectly matches the privacy guarantee of a given $f$. We prove that a CND always exists, and give a construction that produces a CND for any $f$. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private $p$-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data, within the general $f$-DP framework. We apply our techniques to the problem of difference of proportions testing, and construct a UMP unbiased (UMPU) "semi-private" test which upper bounds the performance of any $f$-DP test. Using this as a benchmark we propose a private test, based on the inversion of characteristic functions, which allows for optimal inference for the two population parameters and is nearly as powerful as the semi-private UMPU. When specialized to the case of $(ε,0)$-DP, we show empirically that our proposed test is more powerful than any $(ε/\sqrt 2)$-DP test and has more accurate type I errors than the classic normal approximation test.

Preprint, 2022, arXiv

High-precision Estimation of Random Walks in Small Space

We provide a deterministic $\tilde{O}(\log N)$-space algorithm for estimating random walk probabilities on undirected graphs, and more generally Eulerian directed graphs, to within inverse polynomial additive error ($ε=1/\mathrm{poly}(N)$) where $N$ is the length of the input. Previously, this problem was known to be solvable by a randomized algorithm using space $O(\log N)$ (following Aleliunas et al., FOCS 79) and by a deterministic algorithm using space $O(\log^{3/2} N)$ (Saks and Zhou, FOCS 95 and JCSS 99), both of which held for arbitrary directed graphs but had not been improved even for undirected graphs. We also give improvements on the space complexity of both of these previous algorithms for non-Eulerian directed graphs when the error is negligible ($ε=1/N^{ω(1)}$), generalizing what Hoza and Zuckerman (FOCS 18) recently showed for the special case of distinguishing whether a random walk probability is $0$ or greater than $ε$. We achieve these results by giving new reductions between powering Eulerian random-walk matrices and inverting Eulerian Laplacian matrices, providing a new notion of spectral approximation for Eulerian graphs that is preserved under powering, and giving the first deterministic $\tilde{O}(\log N)$-space algorithm for inverting Eulerian Laplacian matrices. The latter algorithm builds on the work of Murtagh et al. (FOCS 17) that gave a deterministic $\tilde{O}(\log N)$-space algorithm for inverting undirected Laplacian matrices, and the work of Cohen et al. (FOCS 19) that gave a randomized $\tilde{O}(N)$-time algorithm for inverting Eulerian Laplacian matrices. A running theme throughout these contributions is an analysis of "cycle-lifted graphs", where we take a graph and "lift" it to a new graph whose adjacency matrix is the tensor product of the original adjacency matrix and a directed cycle (or variants of one).
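The "cycle-lifted graph" construction mentioned at the end of this abstract can be illustrated with a small sketch. It assumes the plain directed cycle (not the variants the abstract alludes to) and one particular ordering of the tensor product; the function names `directed_cycle`, `kronecker`, and `cycle_lift` are hypothetical and chosen only for this illustration.

```python
def directed_cycle(k):
    """Adjacency matrix of the directed cycle on k vertices: i -> (i + 1) mod k."""
    return [[1 if j == (i + 1) % k else 0 for j in range(k)] for i in range(k)]


def kronecker(a, b):
    """Kronecker (tensor) product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    size = n * m
    return [
        [a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
        for i in range(size)
    ]


def cycle_lift(adj, k):
    """Lift a graph to k layers: vertex (s, u) points to (s + 1 mod k, v)
    whenever u -> v is an edge of the original graph.
    Assumption: the cycle is the left factor of the tensor product."""
    return kronecker(directed_cycle(k), adj)


# A single undirected edge between vertices 0 and 1, lifted over a 3-cycle.
edge = [[0, 1], [1, 0]]
lifted = cycle_lift(edge, 3)
```

In the lifted graph, a walk of length `t` from layer 0 ends in layer `t mod k`, so the layers keep track of how many steps a walk in the original graph has taken.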

Preprint, 2022, arXiv

Hypothesis Testing for Differentially Private Linear Regression

In this work, we design differentially private hypothesis tests for the following problems in the general linear model: testing a linear relationship and testing for the presence of mixtures. The majority of our hypothesis tests are based on differentially private versions of the $F$-statistic for the general linear model framework, which are uniformly most powerful unbiased in the non-private setting. We also present other tests for these problems, one of which is based on the differentially private nonparametric tests of Couch, Kazan, Shi, Bray, and Groce (CCS 2019), which is especially suited for the small dataset regime. We show that the differentially private $F$-statistic converges to the asymptotic distribution of its non-private counterpart. As a corollary, the statistical power of the differentially private $F$-statistic converges to the statistical power of the non-private $F$-statistic. Through a suite of Monte Carlo based experiments, we show that our tests achieve desired significance levels and have a high power that approaches the power of the non-private tests as we increase sample sizes or the privacy-loss parameter. We also show when our tests outperform existing methods in the literature.
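For orientation, the non-private $F$-statistic for testing a linear relationship in simple linear regression (null hypothesis: slope equal to zero) can be computed as below. This is only a sketch of the classical statistic; it adds no noise and is not the differentially private version designed in the paper. The function name `f_statistic_slope` is hypothetical.

```python
def f_statistic_slope(xs, ys):
    """Classical (non-private) F-statistic for H0: slope = 0 in y = a + b*x.

    Compares the residual sum of squares of the intercept-only model
    with that of the fitted line, using 1 and n - 2 degrees of freedom.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss_null = sum((y - y_bar) ** 2 for y in ys)
    rss_alt = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    if rss_alt == 0:
        return float("inf")  # perfect fit
    return (rss_null - rss_alt) / (rss_alt / (n - 2))


f_value = f_statistic_slope([0, 1, 2, 3], [0, 1, 1, 2])
```

A large value of the statistic is evidence against the null hypothesis of no linear relationship.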

Preprint, 2020, arXiv

Differentially Private Simple Linear Regression

Economics and social science research often requires analyzing datasets of sensitive personal information at fine granularity, with models fit to small subsets of the data. Unfortunately, such fine-grained analysis can easily reveal sensitive individual information. We study algorithms for simple linear regression that satisfy differential privacy, a constraint which guarantees that an algorithm's output reveals little about any individual input data record, even to an attacker with arbitrary side information about the dataset. We consider the design of differentially private algorithms for simple linear regression for small datasets, with tens to hundreds of datapoints, which is a particularly challenging regime for differential privacy. Focusing on a particular application to small-area analysis in economics research, we study the performance of a spectrum of algorithms we adapt to the setting. We identify key factors that affect their performance, showing through a range of experiments that algorithms based on robust estimators (in particular, the Theil-Sen estimator) perform well on the smallest datasets, but that other more standard algorithms do better as the dataset size increases.
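As background for the robust estimator named in this abstract, here is a minimal sketch of the classical, non-private Theil-Sen estimator: the slope is the median of all pairwise slopes. The differentially private adaptations studied in the paper are not shown, and the function name `theil_sen` is hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Non-private Theil-Sen fit: median of pairwise slopes, then the
    median of the residual offsets as the intercept."""
    points = list(zip(xs, ys))
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1  # skip pairs with identical x, whose slope is undefined
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept


# Four points on the line y = 2x + 1 plus one gross outlier.
slope, intercept = theil_sen([0, 1, 2, 3, 4], [1, 3, 5, 7, 100])
```

Because medians ignore a minority of extreme values, the single outlier does not move the fitted line, which is the kind of robustness that matters for datasets with only tens of points.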

Preprint, 2020, arXiv

Spectral Sparsification via Bounded-Independence Sampling

We give a deterministic, nearly logarithmic-space algorithm for mild spectral sparsification of undirected graphs. Given a weighted, undirected graph $G$ on $n$ vertices described by a binary string of length $N$, an integer $k\leq \log n$, and an error parameter $ε> 0$, our algorithm runs in space $\tilde{O}(k\log (N\cdot w_{\mathrm{max}}/w_{\mathrm{min}}))$ where $w_{\mathrm{max}}$ and $w_{\mathrm{min}}$ are the maximum and minimum edge weights in $G$, and produces a weighted graph $H$ with $\tilde{O}(n^{1+2/k}/ε^2)$ edges that spectrally approximates $G$, in the sense of Spielman and Teng [ST04], up to an error of $ε$. Our algorithm is based on a new bounded-independence analysis of Spielman and Srivastava's effective resistance based edge sampling algorithm [SS08] and uses results from recent work on space-bounded Laplacian solvers [MRSV17]. In particular, we demonstrate an inherent tradeoff (via upper and lower bounds) between the amount of (bounded) independence used in the edge sampling algorithm, denoted by $k$ above, and the resulting sparsity that can be achieved.