Researcher profile

Aleksandar Nikolov

Aleksandar Nikolov contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2022arXiv

Private Query Release via the Johnson-Lindenstrauss Transform

We introduce a new method for releasing answers to statistical queries with differential privacy, based on the Johnson-Lindenstrauss lemma. The key idea is to randomly project the query answers to a lower dimensional space so that the distance between any two vectors of feasible query answers is preserved up to an additive error. Then we answer the projected queries using a simple noise-adding mechanism, and lift the answers up to the original dimension. Using this method, we give, for the first time, purely differentially private mechanisms with optimal worst case sample complexity under average error for answering a workload of $k$ queries over a universe of size $N$. As other applications, we give the first purely private efficient mechanisms with optimal sample complexity for computing the covariance of a bounded high-dimensional distribution, and for answering 2-way marginal queries. We also show that, up to the dependence on the error, a variant of our mechanism is nearly optimal for every given query workload.

preprint2020arXiv

Locally Private Hypothesis Selection

We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the constraints of $\varepsilon$-local differential privacy, a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $\tilde O(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $Ω(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $\tilde O(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $\tilde O(k)$ samples and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.

preprint2020arXiv

Maximizing Determinants under Matroid Constraints

Given vectors $v_1,\dots,v_n\in\mathbb{R}^d$ and a matroid $M=([n],I)$, we study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S}v_i v_i^\top)$ is maximized. This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning. The current best results include an $e^{2k}$-estimation for any matroid of rank $k$ and a $(1+ε)^d$-approximation for a uniform matroid of rank $k\ge d+\frac dε$, where the rank $k\ge d$ denotes the desired size of the optimal set. Our main result is a new approximation algorithm with an approximation guarantee that depends only on the dimension $d$ of the vectors and not on the size $k$ of the output set. In particular, we show an $(O(d))^{d}$-estimation and an $(O(d))^{d^3}$-approximation for any matroid, giving a significant improvement over prior work when $k\gg d$. Our result relies on the existence of an optimal solution to a convex programming relaxation for the problem which has sparse support; in particular, no more than $O(d^2)$ variables of the solution have fractional values. The sparsity results rely on the interplay between the first-order optimality conditions for the convex program and matroid theory. We believe that the techniques introduced to show sparsity of optimal solutions to convex programs will be of independent interest. We also give a randomized algorithm that rounds a sparse fractional solution to a feasible integral solution to the original problem. To show the approximation guarantee, we utilize recent works on strongly log-concave polynomials and show new relationships between different convex programs studied for the problem. Finally, we use the estimation algorithm and sparsity results to give an efficient deterministic approximation algorithm with an approximation guarantee that depends solely on the dimension $d$.

preprint2020arXiv

On the Computational Complexity of Linear Discrepancy

Many problems in computer science and applied mathematics require rounding a vector $\mathbf{w}$ of fractional values lying in the interval $[0,1]$ to a binary vector $\mathbf{x}$ so that, for a given matrix $\mathbf{A}$, $\mathbf{A}\mathbf{x}$ is as close to $\mathbf{A}\mathbf{w}$ as possible. For example, this problem arises in LP rounding algorithms used to approximate $\mathsf{NP}$-hard optimization problems and in the design of uniformly distributed point sets for numerical integration. For a given matrix $\mathbf{A}$, the worst-case error over all choices of $\mathbf{w}$ incurred by the best possible rounding is measured by the linear discrepancy of $\mathbf{A}$, a quantity studied in discrepancy theory, and introduced by Lovasz, Spencer, and Vesztergombi (EJC, 1986). We initiate the study of the computational complexity of linear discrepancy. Our investigation proceeds in two directions: (1) proving hardness results and (2) finding both exact and approximate algorithms to evaluate the linear discrepancy of certain matrices. For (1), we show that linear discrepancy is $\mathsf{NP}$-hard. Thus we do not expect to find an efficient exact algorithm for the general case. Restricting our attention to matrices with a constant number of rows, we present a poly-time exact algorithm for matrices consisting of a single row and matrices with a constant number of rows and entries of bounded magnitude. We also present an exponential-time approximation algorithm for general matrices, and an algorithm that approximates linear discrepancy to within an exponential factor.

preprint2020arXiv

Private Query Release Assisted by Public Data

We study the problem of differentially private query release assisted by access to public data. In this problem, the goal is to answer a large class $\mathcal{H}$ of statistical queries with error no more than $α$ using a combination of public and private samples. The algorithm is required to satisfy differential privacy only with respect to the private samples. We study the limits of this task in terms of the private and public sample complexities. First, we show that we can solve the problem for any query class $\mathcal{H}$ of finite VC-dimension using only $d/α$ public samples and $\sqrt{p}d^{3/2}/α^2$ private samples, where $d$ and $p$ are the VC-dimension and dual VC-dimension of $\mathcal{H}$, respectively. In comparison, with only private samples, this problem cannot be solved even for simple query classes with VC-dimension one, and without any private samples, a larger public sample of size $d/α^2$ is needed. Next, we give sample complexity lower bounds that exhibit tight dependence on $p$ and $α$. For the class of decision stumps, we give a lower bound of $\sqrt{p}/α$ on the private sample complexity whenever the public sample size is less than $1/α^2$. Given our upper bounds, this shows that the dependence on $\sqrt{p}$ is necessary in the private sample complexity. We also give a lower bound of $1/α$ on the public sample complexity for a broad family of query classes, which by our upper bound, is tight in $α$.

preprint2013arXiv

Nearly Optimal Private Convolution

We study computing the convolution of a private input $x$ with a public input $h$, while satisfying the guarantees of $(ε, δ)$-differential privacy. Convolution is a fundamental operation, intimately related to Fourier Transforms. In our setting, the private input may represent a time series of sensitive events or a histogram of a database of confidential personal information. Convolution then captures important primitives including linear filtering, which is an essential tool in time series analysis, and aggregation queries on projections of the data. We give a nearly optimal algorithm for computing convolutions while satisfying $(ε, δ)$-differential privacy. Surprisingly, we follow the simple strategy of adding independent Laplacian noise to each Fourier coefficient and bounding the privacy loss using the composition theorem of Dwork, Rothblum, and Vadhan. We derive a closed form expression for the optimal noise to add to each Fourier coefficient using convex programming duality. Our algorithm is very efficient -- it is essentially no more computationally expensive than a Fast Fourier Transform. To prove near optimality, we use the recent discrepancy lowerbounds of Muthukrishnan and Nikolov and derive a spectral lower bound using a characterization of discrepancy in terms of determinants.

preprint2012arXiv

Optimal Private Halfspace Counting via Discrepancy

A range counting problem is specified by a set $P$ of size $|P| = n$ of points in $\mathbb{R}^d$, an integer weight $x_p$ associated to each point $p \in P$, and a range space ${\cal R} \subseteq 2^{P}$. Given a query range $R \in {\cal R}$, the target output is $R(\vec{x}) = \sum_{p \in R}{x_p}$. Range counting for different range spaces is a central problem in Computational Geometry. We study $(ε, δ)$-differentially private algorithms for range counting. Our main results are for the range space given by hyperplanes, that is, the halfspace counting problem. We present an $(ε, δ)$-differentially private algorithm for halfspace counting in $d$ dimensions which achieves $O(n^{1-1/d})$ average squared error. This contrasts with the $Ω(n)$ lower bound established by the classical result of Dinur and Nissim [PODS 2003] for arbitrary subset counting queries. We also show a matching lower bound on average squared error for any $(ε, δ)$-differentially private algorithm for halfspace counting. Both bounds are obtained using discrepancy theory. For the lower bound, we use a modified discrepancy measure and bound approximation of $(ε, δ)$-differentially private algorithms for range counting queries in terms of this discrepancy. We also relate the modified discrepancy measure to classical combinatorial discrepancy, which allows us to exploit known discrepancy lower bounds. This approach also yields a lower bound of $Ω((\log n)^{d-1})$ for $(ε, δ)$-differentially private orthogonal range counting in $d$ dimensions, the first known superconstant lower bound for this problem. For the upper bound, we use an approach inspired by partial coloring methods for proving discrepancy upper bounds, and obtain $(ε, δ)$-differentially private algorithms for range counting with polynomially bounded shatter function range spaces.

preprint2011arXiv

A counterexample to Beck's conjecture on the discrepancy of three permutations

Given three permutations on the integers 1 through n, consider the set system consisting of each interval in each of the three permutations. Jozsef Beck conjectured (c. 1987) that the discrepancy of this set system is O(1). We give a counterexample to this conjecture: for any positive integer n = 3^k, we exhibit three permutations whose corresponding set system has discrepancy Omega(log(n)). Our counterexample is based on a simple recursive construction, and our proof of the discrepancy lower bound is by induction. This example also disproves a generalization of Beck's conjecture due to Spencer, Srinivasan and Tetali, who conjectured that a set system corresponding to l permutations has discrepancy O(sqrt(l)).

preprint2010arXiv

Limits of Approximation Algorithms: PCPs and Unique Games (DIMACS Tutorial Lecture Notes)

These are the lecture notes for the DIMACS Tutorial "Limits of Approximation Algorithms: PCPs and Unique Games" held at the DIMACS Center, CoRE Building, Rutgers University on 20-21 July, 2009. This tutorial was jointly sponsored by the DIMACS Special Focus on Hardness of Approximation, the DIMACS Special Focus on Algorithmic Foundations of the Internet, and the Center for Computational Intractability with support from the National Security Agency and the National Science Foundation. The speakers at the tutorial were Matthew Andrews, Sanjeev Arora, Moses Charikar, Prahladh Harsha, Subhash Khot, Dana Moshkovitz and Lisa Zhang. The sribes were Ashkan Aazami, Dev Desai, Igor Gorodezky, Geetha Jagannathan, Alexander S. Kulikov, Darakhshan J. Mir, Alantha Newman, Aleksandar Nikolov, David Pritchard and Gwen Spencer.

preprint2010arXiv

Pan-private Algorithms: When Memory Does Not Help

Consider updates arriving online in which the $t$th input is $(i_t,d_t)$, where $i_t$'s are thought of as IDs of users. Informally, a randomized function $f$ is {\em differentially private} with respect to the IDs if the probability distribution induced by $f$ is not much different from that induced by it on an input in which occurrences of an ID $j$ are replaced with some other ID $k$ Recently, this notion was extended to {\em pan-privacy} where the computation of $f$ retains differential privacy, even if the internal memory of the algorithm is exposed to the adversary (say by a malicious break-in or by fiat by the government). This is a strong notion of privacy, and surprisingly, for basic counting tasks such as distinct counts, heavy hitters and others, Dwork et al~\cite{dwork-pan} present pan-private algorithms with reasonable accuracy. The pan-private algorithms are nontrivial, and rely on sampling. We reexamine these basic counting tasks and show improved bounds. In particular, we estimate the distinct count $\Dt$ to within $(1\pm \eps)\Dt \pm O(\polylog m)$, where $m$ is the number of elements in the universe. This uses suitably noisy statistics on sketches known in the streaming literature. We also present the first known lower bounds for pan-privacy with respect to a single intrusion. Our lower bounds show that, even if allowed to work with unbounded memory, pan-private algorithms for distinct counts can not be significantly more accurate than our algorithms. Our lower bound uses noisy decoding. For heavy hitter counts, we present a pan private streaming algorithm that is accurate to within $O(k)$ in worst case; previously known bound for this problem is arbitrarily worse. An interesting aspect of our pan-private algorithms is that, they deliberately use very small (polylogarithmic) space and tend to be streaming algorithms, even though using more space is not forbidden.