Researcher profile

Kaushik Sinha

Kaushik Sinha contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
2topics
2close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2020arXiv

Neural Neighborhood Encoding for Classification

Inspired by the fruit-fly olfactory circuit, the Fly Bloom Filter [Dasgupta et al., 2018] is able to efficiently summarize the data with a single pass and has been used for novelty detection. We propose a new classifier (for binary and multi-class classification) that effectively encodes the different local neighborhoods for each class with a per-class Fly Bloom Filter. The inference on test data requires an efficient {\tt FlyHash} [Dasgupta, et al., 2017] operation followed by a high-dimensional, but {\em sparse}, dot product with the per-class Bloom Filters. The learning is trivially parallelizable. On the theoretical side, we establish conditions under which the prediction of our proposed classifier on any test example agrees with the prediction of the nearest neighbor classifier with high probability. We extensively evaluate our proposed scheme with over $50$ data sets of varied data dimensionality to demonstrate that the predictive performance of our proposed neuroscience inspired classifier is competitive the the nearest-neighbor classifiers and other single-pass classifiers.

preprint2010arXiv

Learning Gaussian Mixtures with Arbitrary Separation

In this paper we present a method for learning the parameters of a mixture of $k$ identical spherical Gaussians in $n$-dimensional space with an arbitrarily small separation between the components. Our algorithm is polynomial in all parameters other than $k$. The algorithm is based on an appropriate grid search over the space of parameters. The theoretical analysis of the algorithm hinges on a reduction of the problem to 1 dimension and showing that two 1-dimensional mixtures whose densities are close in the $L^2$ norm must have similar means and mixing coefficients. To produce such a lower bound for the $L^2$ norm in terms of the distances between the corresponding means, we analyze the behavior of the Fourier transform of a mixture of Gaussians in 1 dimension around the origin, which turns out to be closely related to the properties of the Vandermonde matrix obtained from the component means. Analysis of this matrix together with basic function approximation results allows us to provide a lower bound for the norm of the mixture in the Fourier domain. In recent years much research has been aimed at understanding the computational aspects of learning parameters of Gaussians mixture distributions in high dimension. To the best of our knowledge all existing work on learning parameters of Gaussian mixtures assumes minimum separation between components of the mixture which is an increasing function of either the dimension of the space $n$ or the number of components $k$. In our paper we prove the first result showing that parameters of a $n$-dimensional Gaussian mixture model with arbitrarily small component separation can be learned in time polynomial in $n$.

preprint2010arXiv

Polynomial Learning of Distribution Families

The question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in theoretical computer science and machine learning. However, despite major progress, the general question of polynomial learnability of Gaussian mixture distributions still remained open. The current work resolves the question of polynomial learnability for Gaussian mixtures in high dimension with an arbitrary fixed number of components. The result on learning Gaussian mixtures relies on an analysis of distributions belonging to what we call "polynomial families" in low dimension. These families are characterized by their moments being polynomial in parameters and include almost all common probability distributions as well as their mixtures and products. Using tools from real algebraic geometry, we show that parameters of any distribution belonging to such a family can be learned in polynomial time and using a polynomial number of sample points. The result on learning polynomial families is quite general and is of independent interest. To estimate parameters of a Gaussian mixture distribution in high dimensions, we provide a deterministic algorithm for dimensionality reduction. This allows us to reduce learning a high-dimensional mixture to a polynomial number of parameter estimations in low dimension. Combining this reduction with the results on polynomial families yields our result on learning arbitrary Gaussian mixtures in high dimensions.