Researcher profile

Ravindran Kannan

Ravindran Kannan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
5topics
2close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2020arXiv

Finding a latent k-simplex in O(k . nnz(data)) time via Subset Smoothing

In this paper we show that a large class of Latent variable models, such as Mixed Membership Stochastic Block(MMSB) Models, Topic Models, and Adversarial Clustering, can be unified through a geometric perspective, replacing model specific assumptions and algorithms for individual models. The geometric perspective leads to the formulation: \emph{find a latent $k-$ polytope $K$ in ${\bf R}^d$ given $n$ data points, each obtained by perturbing a latent point in $K$}. This problem does not seem to have been considered in the literature. The most important contribution of this paper is to show that the latent $k-$polytope problem admits an efficient algorithm under deterministic assumptions which naturally hold in Latent variable models considered in this paper. ur algorithm runs in time $O^*(k\; \mbox{nnz})$ matching the best running time of algorithms in special cases considered here and is better when the data is sparse, as is the case in applications. An important novelty of the algorithm is the introduction of \emph{subset smoothed polytope}, $K'$, the convex hull of the ${n\choose δn}$ points obtained by averaging all $δn$ subsets of the data points, for a given $δ\in (0,1)$. We show that $K$ and $K'$ are close in Hausdroff distance. Among the consequences of our algorithm are the following: (a) MMSB Models and Topic Models: the first quasi-input-sparsity time algorithm for parameter estimation for $k \in O^*(1)$, (b) Adversarial Clustering: In $k-$means, if, an adversary is allowed to move many data points from each cluster an arbitrary amount towards the convex hull of the centers of other clusters, our algorithm still estimates cluster centers well.

preprint2010arXiv

Clustering with Spectral Norm and the k-means Algorithm

There has been much progress on efficient algorithms for clustering data points generated by a mixture of $k$ probability distributions under the assumption that the means of the distributions are well-separated, i.e., the distance between the means of any two distributions is at least $Ω(k)$ standard deviations. These results generally make heavy use of the generative model and particular properties of the distributions. In this paper, we show that a simple clustering algorithm works without assuming any generative (probabilistic) model. Our only assumption is what we call a "proximity condition": the projection of any data point onto the line joining its cluster center to any other cluster center is $Ω(k)$ standard deviations closer to its own center than the other center. Here the notion of standard deviations is based on the spectral norm of the matrix whose rows represent the difference between a point and the mean of the cluster to which it belongs. We show that in the generative models studied, our proximity condition is satisfied and so we are able to derive most known results for generative models as corollaries of our main result. We also prove some new results for generative models - e.g., we can cluster all but a small fraction of points only assuming a bound on the variance. Our algorithm relies on the well known $k$-means algorithm, and along the way, we prove a result of independent interest -- that the $k$-means algorithm converges to the "true centers" even in the presence of spurious points provided the initial (estimated) centers are close enough to the corresponding actual centers and all but a small fraction of the points satisfy the proximity condition. Finally, we present a new technique for boosting the ratio of inter-center separation to standard deviation.

preprint2010arXiv

Spectral Methods for Matrices and Tensors

While Spectral Methods have long been used for Principal Component Analysis, this survey focusses on work over the last 15 years with three salient features: (i) Spectral methods are useful not only for numerical problems, but also discrete optimization problems (Constraint Optimization Problems - CSP's) like the max. cut problem and similar mathematical considerations underlie both areas. (ii) Spectral methods can be extended to tensors. The theory and algorithms for tensors are not as simple/clean as for matrices, but the survey describes methods for low-rank approximation which extend to tensors. These tensor approximations help us solve Max-$r$-CSP's for $r>2$ as well as numerical tensor problems. (iii) Sampling on the fly plays a prominent role in these methods. A primary result is that for any matrix, a random submatrix of rows/columns picked with probabilities proportional to the squared lengths (of rows/columns), yields estimates of the singular values as well as an approximation to the whole matrix.

preprint2010arXiv

Two new Probability inequalities and Concentration Results

Concentration results and probabilistic analysis for combinatorial problems like the TSP, MWST, graph coloring have received much attention, but generally, for i.i.d. samples (i.i.d. points in the unit square for the TSP, for example). Here, we prove two probability inequalities which generalize and strengthen Martingale inequalities. The inequalities provide the tools to deal with more general heavy-tailed and inhomogeneous distributions for combinatorial problems. We prove a wide range of applications - in addition to the TSP, MWST, graph coloring, we also prove more general results than known previously for concentration in bin-packing, sub-graph counts, Johnson-Lindenstrauss random projection theorem. It is hoped that the strength of the inequalities will serve many more purposes.