Source author record

Ainesh Bakshi

Ainesh Bakshi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Machine Learning Computational Geometry Computer Vision math.NA Numerical Analysis

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Low-Rank Approximation with $1/ε^{1/3}$ Matrix-Vector Products

We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm. Here, given access to a matrix $A$ through matrix-vector products, an accuracy parameter $ε$, and a target rank $k$, the goal is to find a rank-$k$ matrix $Z$ with orthonormal columns such that $\| A(I -ZZ^\top)\|_{S_p} \leq (1+ε)\min_{U^\top U = I_k} \|A(I - U U^\top)\|_{S_p}$, where $\|M\|_{S_p}$ denotes the $\ell_p$ norm of the the singular values of $M$. For the special cases of $p=2$ (Frobenius norm) and $p = \infty$ (Spectral norm), Musco and Musco (NeurIPS 2015) obtained an algorithm based on Krylov methods that uses $\tilde{O}(k/\sqrtε)$ matrix-vector products, improving on the naïve $\tilde{O}(k/ε)$ dependence obtainable by the power method, where $\tilde{O}$ suppresses poly$(\log(dk/ε))$ factors. Our main result is an algorithm that uses only $\tilde{O}(kp^{1/6}/ε^{1/3})$ matrix-vector products, and works for all $p \geq 1$. For $p = 2$ our bound improves the previous $\tilde{O}(k/ε^{1/2})$ bound to $\tilde{O}(k/ε^{1/3})$. Since the Schatten-$p$ and Schatten-$\infty$ norms are the same up to a $(1+ ε)$-factor when $p \geq (\log d)/ε$, our bound recovers the result of Musco and Musco for $p = \infty$. Further, we prove a matrix-vector query lower bound of $Ω(1/ε^{1/3})$ for any fixed constant $p \geq 1$, showing that surprisingly $\tildeΘ(1/ε^{1/3})$ is the optimal complexity for constant~$k$. To obtain our results, we introduce several new techniques, including optimizing over multiple Krylov subspaces simultaneously, and pinching inequalities for partitioned operators. Our lower bound for $p \in [1,2]$ uses the Araki-Lieb-Thirring trace inequality, whereas for $p>2$, we appeal to a norm-compression inequality for aligned partitioned operators.

preprint2021arXiv

List-Decodable Subspace Recovery: Dimension Independent Error in Polynomial Time

In list-decodable subspace recovery, the input is a collection of $n$ points $αn$ (for some $α\ll 1/2$) of which are drawn i.i.d. from a distribution $\mathcal{D}$ with a isotropic rank $r$ covariance $Π_*$ (the \emph{inliers}) and the rest are arbitrary, potential adversarial outliers. The goal is to recover a $O(1/α)$ size list of candidate covariances that contains a $\hatΠ$ close to $Π_*$. Two recent independent works (Raghavendra-Yau, Bakshi-Kothari 2020) gave the first efficient algorithm for this problem. These results, however, obtain an error that grows with the dimension (linearly in [RY] and logarithmically in BK) at the cost of quasi-polynomial running time) and rely on \emph{certifiable anti-concentration} - a relatively strict condition satisfied essentially only by the Gaussian distribution. In this work, we improve on these results on all three fronts: \emph{dimension-independent} error via a faster fixed-polynomial running time under less restrictive distributional assumptions. Specifically, we give a $poly(1/α) d^{O(1)}$ time algorithm that outputs a list containing a $\hatΠ$ satisfying $\|\hatΠ -Π_*\|_F \leq O(1/α)$. Our result only needs $\mathcal{D}$ to have \emph{certifiably hypercontractive} degree 2 polynomials. As a result, in addition to Gaussians, our algorithm applies to the uniform distribution on the hypercube and $q$-ary cubes and arbitrary product distributions with subgaussian marginals. Prior work (Raghavendra and Yau, 2020) had identified such distributions as potential hard examples as such distributions do not exhibit strong enough anti-concentration. When $\mathcal{D}$ satisfies certifiable anti-concentration, we obtain a stronger error guarantee of $\|\hatΠ-Π_*\|_F \leq η$ for any arbitrary $η> 0$ in $d^{O(poly(1/α) + \log (1/η))}$ time.

preprint2020arXiv

Testing Positive Semi-Definiteness via Random Submatrices

We study the problem of testing whether a matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ with bounded entries ($\|\mathbf{A}\|_\infty \leq 1$) is positive semi-definite (PSD), or $ε$-far in Euclidean distance from the PSD cone, meaning that $\min_{\mathbf{B} \succeq 0} \|\mathbf{A} - \mathbf{B}\|_F^2 > εn^2$, where $\mathbf{B} \succeq 0$ denotes that $\mathbf{B}$ is PSD. Our main algorithmic contribution is a non-adaptive tester which distinguishes between these cases using only $\tilde{O}(1/ε^4)$ queries to the entries of $\mathbf{A}$. If instead of the Euclidean norm we considered the distance in spectral norm, we obtain the "$\ell_\infty$-gap problem", where $\mathbf{A}$ is either PSD or satisfies $\min_{\mathbf{B}\succeq 0} \|\mathbf{A}- \mathbf{B}\|_2 > εn$. For this related problem, we give a $\tilde{O}(1/ε^2)$ query tester, which we show is optimal up to $\log(1/ε)$ factors. Our testers randomly sample a collection of principal submatrices and check whether these submatrices are PSD. Consequentially, our algorithms achieve one-sided error: whenever they output that $\mathbf{A}$ is not PSD, they return a certificate that $\mathbf{A}$ has negative eigenvalues. We complement our upper bound for PSD testing with Euclidean norm distance by giving a $\tildeΩ(1/ε^2)$ lower bound for any non-adaptive algorithm. Our lower bound construction is general, and can be used to derive lower bounds for a number of spectral testing problems. As an example of the applicability of our construction, we obtain a new $\tildeΩ(1/ε^4)$ sampling lower bound for testing the Schatten-$1$ norm with a $εn^{1.5}$ gap, extending a result of Balcan, Li, Woodruff, and Zhang [SODA'19]. In addition, it yields new sampling lower bounds for estimating the Ky-Fan Norm, and the cost of the best rank-$k$ approximation.

preprint2020arXiv

Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams

We study the Maximum Independent Set problem for geometric objects given in the data stream model. A set of geometric objects is said to be independent if the objects are pairwise disjoint. We consider geometric objects in one and two dimensions, i.e., intervals and disks. Let $α$ be the cardinality of the largest independent set. Our goal is to estimate $α$ in a small amount of space, given that the input is received as a one-pass stream. We also consider a generalization of this problem by assigning weights to each object and estimating $β$, the largest value of a weighted independent set. We initialize the study of this problem in the turnstile streaming model (insertions and deletions) and provide the first algorithms for estimating $α$ and $β$. For unit-length intervals, we obtain a $(2+ε)$-approximation to $α$ and $β$ in poly$(\frac{\log(n)}ε)$ space. We also show a matching lower bound. Combined with the $3/2$-approximation for insertion-only streams by Cabello and Perez-Lanterno [CP15], our result implies a separation between the insertion-only and turnstile model. For unit-radius disks, we obtain a $\left(\frac{8\sqrt{3}}π\right)$-approximation to $α$ and $β$ in poly$(\log(n), ε^{-1})$ space, which is closely related to the hexagonal circle packing constant. We provide algorithms for estimating $α$ for arbitrary-length intervals under a bounded intersection assumption and study the parameterized space complexity of estimating $α$ and $β$, where the parameter is the ratio of maximum to minimum interval length.

preprint2014arXiv

A Novel Feature Selection and Extraction Technique for Classification

This paper presents a versatile technique for the purpose of feature selection and extraction - Class Dependent Features (CDFs). We use CDFs to improve the accuracy of classification and at the same time control computational expense by tackling the curse of dimensionality. In order to demonstrate the generality of this technique, it is applied to handwritten digit recognition and text categorization.