Researcher profile

Natesh S. Pillai

Natesh S. Pillai contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
15works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

15 published item(s)

preprint2020arXiv

Fast and memory-optimal dimension reduction using Kac's walk

In this work, we analyze dimension reduction algorithms based on the Kac walk and discrete variants. (1) For $n$ points in $\mathbb{R}^{d}$, we design an optimal Johnson-Lindenstrauss (JL) transform based on the Kac walk which can be applied to any vector in time $O(d\log{d})$ for essentially the same restriction on $n$ as in the best-known transforms due to Ailon and Liberty [SODA, 2008], and Bamberger and Krahmer [arXiv, 2017]. Our algorithm is memory-optimal, and outperforms existing algorithms in regimes when $n$ is sufficiently large and the distortion parameter is sufficiently small. In particular, this confirms a conjecture of Ailon and Chazelle [STOC, 2006] in a stronger form. (2) The same construction gives a simple transform with optimal Restricted Isometry Property (RIP) which can be applied in time $O(d\log{d})$ for essentially the same range of sparsity as in the best-known such transform due to Ailon and Rauhut [Discrete Comput. Geom., 2014]. (3) We show that by fixing the angle in the Kac walk to be $π/4$ throughout, one obtains optimal JL and RIP transforms with almost the same running time, thereby confirming -- up to a $\log\log{d}$ factor -- a conjecture of Avron, Maymounkov, and Toledo [SIAM J. Sci. Comput., 2010]. Our moment-based analysis of this modification of the Kac walk may also be of independent interest.

preprint2020arXiv

Universality and least singular values of random matrix products: a simplified approach

In this note, we show how to provide sharp control on the least singular value of a certain translated linearization matrix arising in the study of the local universality of products of independent random matrices. This problem was first considered in a recent work of Koppel, O'Rourke, and Vu, and compared to their work, our proof is substantially simpler and established in much greater generality . In particular, we only assume that the entries of the ensemble are centered, and have second and fourth moments uniformly bounded away from $0$ and infinity, whereas previous work assumed a uniform subgaussian decay condition and that the entries within each factor of the product are identically distributed. A consequence of our least singular value bound is that the four moment matching universality results for the products of independent random matrices, recently obtained by Koppel, O'Rourke, and Vu, hold under much weaker hypotheses. Our proof technique is also of independent interest in the study of structured sparse matrices.

preprint2015arXiv

Bayesian Nonparametric Weighted Sampling Inference

It has historically been a challenge to perform Bayesian inference in a design-based survey context. The present paper develops a Bayesian model for sampling inference in the presence of inverse-probability weights. We use a hierarchical approach in which we model the distribution of the weights of the nonsampled units in the population and simultaneously include them as predictors in a nonparametric Gaussian process regression. We use simulation studies to evaluate the performance of our procedure and compare it to the classical design-based estimator. We apply our method to the Fragile Family and Child Wellbeing Study. Our studies find the Bayesian nonparametric finite population estimator to be more robust than the classical design-based estimator without loss in efficiency, which works because we induce regularization for small cells and thus this is a way of automatically smoothing the highly variable weights.

preprint2013arXiv

Regularity of laws and ergodicity of hypoelliptic SDEs driven by rough paths

We consider differential equations driven by rough paths and study the regularity of the laws and their long time behavior. In particular, we focus on the case when the driving noise is a rough path valued fractional Brownian motion with Hurst parameter $H\in(\frac{1}{3},\frac{1}{2}]$. Our contribution in this work is twofold. First, when the driving vector fields satisfy Hörmander's celebrated "Lie bracket condition," we derive explicit quantitative bounds on the inverse of the Malliavin matrix. En route to this, we provide a novel "deterministic" version of Norris's lemma for differential equations driven by rough paths. This result, with the added assumption that the linearized equation has moments, will then yield that the transition laws have a smooth density with respect to Lebesgue measure. Our second main result states that under Hörmander's condition, the solutions to rough differential equations driven by fractional Brownian motion with $H\in(\frac{1}{3},\frac{1}{2}]$ enjoy a suitable version of the strong Feller property. Under a standard controllability condition, this implies that they admit a unique stationary solution that is physical in the sense that it does not "look into the future."

preprint2013arXiv

Statistical Inference for Stochastic Differential Equations with Memory

In this paper we construct a framework for doing statistical inference for discretely observed stochastic differential equations (SDEs) where the driving noise has 'memory'. Classical SDE models for inference assume the driving noise to be Brownian motion, or "white noise", thus implying a Markov assumption. We focus on the case when the driving noise is a fractional Brownian motion, which is a common continuous-time modeling device for capturing long-range memory. Since the likelihood is intractable, we proceed via data augmentation, adapting a familiar discretization and missing data approach developed for the white noise case. In addition to the other SDE parameters, we take the Hurst index to be unknown and estimate it from the data. Posterior sampling is performed via a Hybrid Monte Carlo algorithm on both the parameters and the missing data simultaneously so as to improve mixing. We point out that, due to the long-range correlations of the driving noise, careful discretization of the underlying SDE is necessary for valid inference. Our approach can be adapted to other types of rough-path driving processes such as Gaussian "colored" noise. The methodology is used to estimate the evolution of the memory parameter in US short-term interest rates.

preprint2012arXiv

Bayesian shrinkage

Penalized regression methods, such as $L_1$ regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated an amazing variety of continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In sharp contrast to the corresponding frequentist literature, very little is known about the properties of such priors. Focusing on a broad class of shrinkage priors, we provide precise results on prior and posterior concentration. Interestingly, we demonstrate that most commonly used shrinkage priors, including the Bayesian Lasso, are suboptimal in high-dimensional settings. A new class of Dirichlet Laplace (DL) priors are proposed, which are optimal and lead to efficient posterior computation exploiting results from normalized random measure theory. Finite sample performance of Dirichlet Laplace priors relative to alternatives is assessed in simulations.

preprint2012arXiv

Causal inference from $2^k$ factorial designs using the potential outcomes model

A framework for causal inference from two-level factorial designs is proposed. The framework utilizes the concept of potential outcomes that lies at the center stage of causal inference and extends Neyman's repeated sampling approach for estimation of causal effects and randomization tests based on Fisher's sharp null hypothesis to the case of 2-level factorial experiments. The framework allows for statistical inference from a finite population, permits definition and estimation of estimands other than "average factorial effects" and leads to more flexible inference procedures than those based on ordinary least squares estimation from a linear model.

preprint2012arXiv

Diffusion limits of the random walk Metropolis algorithm in high dimensions

Diffusion limits of MCMC methods in high dimensions provide a useful theoretical tool for studying computational complexity. In particular, they lead directly to precise estimates of the number of steps required to explore the target measure, in stationarity, as a function of the dimension of the state space. However, to date such results have mainly been proved for target measures with a product structure, severely limiting their applicability. The purpose of this paper is to study diffusion limits for a class of naturally occurring high-dimensional measures found from the approximation of measures on a Hilbert space which are absolutely continuous with respect to a Gaussian reference measure. The diffusion limit of a random walk Metropolis algorithm to an infinite-dimensional Hilbert space valued SDE (or SPDE) is proved, facilitating understanding of the computational complexity of the algorithm.

preprint2012arXiv

Edge universality of correlation matrices

Let $\widetilde{X}_{M\times N}$ be a rectangular data matrix with independent real-valued entries $[\widetilde{x}_{ij}]$ satisfying $\mathbb {E}\widetilde{x}_{ij}=0$ and $\mathbb {E}\widetilde{x}^2_{ij}=\frac{1}{M}$, $N,M\to\infty$. These entries have a subexponential decay at the tails. We will be working in the regime $N/M=d_N,\lim_{N\to\infty}d_N\neq0,1,\infty$. In this paper we prove the edge universality of correlation matrices ${X}^{\dagger}X$, where the rectangular matrix $X$ (called the standardized matrix) is obtained by normalizing each column of the data matrix $\widetilde{X}$ by its Euclidean norm. Our main result states that asymptotically the $k$-point ($k\geq1$) correlation functions of the extreme eigenvalues (at both edges of the spectrum) of the correlation matrix ${X}^{\dagger}X$ converge to those of the Gaussian correlation matrix, that is, Tracy-Widom law, and, thus, in particular, the largest and the smallest eigenvalues of ${X}^{\dagger}X$ after appropriate centering and rescaling converge to the Tracy-Widom distribution. The asymptotic distribution of extreme eigenvalues of the Gaussian correlation matrix has been worked out only recently. As a corollary of the main result in this paper, we also obtain that the extreme eigenvalues of Gaussian correlation matrices are asymptotically distributed according to the Tracy-Widom law. The proof is based on the comparison of Green functions, but the key obstacle to be surmounted is the strong dependence of the entries of the correlation matrix. We achieve this via a novel argument which involves comparing the moments of product of the entries of the standardized data matrix to those of the raw data matrix. Our proof strategy may be extended for proving the edge universality of other random matrix ensembles with dependent entries and hence is of independent interest.

preprint2012arXiv

Geometric ergodicity of a bead-spring pair with stochastic Stokes forcing

We consider a simple model for the fluctuating hydrodynamics of a flexible polymer in dilute solution, demonstrating geometric ergodicity for a pair of particles that interact with each other through a nonlinear spring potential while being advected by a stochastic Stokes fluid velocity field. This is a generalization of previous models which have used linear spring forces as well as white-in-time fluid velocity fields. We follow previous work combining control theoretic arguments, Lyapunov functions, and hypo-elliptic diffusion theory to prove exponential convergence via a Harris chain argument. In addition we allow the possibility of excluding certain "bad" sets in phase space in which the assumptions are violated but from which the system leaves with a controllable probability. This allows for the treatment of singular drifts, such as those derived from the Lennard-Jones potential, which is a novel feature of this work.

preprint2012arXiv

Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions

The Metropolis-adjusted Langevin (MALA) algorithm is a sampling algorithm which makes local moves by incorporating information about the gradient of the logarithm of the target density. In this paper we study the efficiency of MALA on a natural class of target measures supported on an infinite dimensional Hilbert space. These natural measures have density with respect to a Gaussian random field measure and arise in many applications such as Bayesian nonparametric statistics and the theory of conditioned diffusions. We prove that, started in stationarity, a suitably interpolated and scaled version of the Markov chain corresponding to MALA converges to an infinite dimensional diffusion process. Our results imply that, in stationarity, the MALA algorithm applied to an N-dimensional approximation of the target will take $\mathcal{O}(N^{1/3})$ steps to explore the invariant measure, comparing favorably with the Random Walk Metropolis which was recently shown to require $\mathcal{O}(N)$ steps when applied to the same class of problems.

preprint2011arXiv

On a Class of Shrinkage Priors for Covariance Matrix Estimation

We propose a flexible class of models based on scale mixture of uniform distributions to construct shrinkage priors for covariance matrix estimation. This new class of priors enjoys a number of advantages over the traditional scale mixture of normal priors, including its simplicity and flexibility in characterizing the prior density. We also exhibit a simple, easy to implement Gibbs sampler for posterior simulation which leads to efficient estimation in high dimensional problems. We first discuss the theory and computational details of this new approach and then extend the basic model to a new class of multivariate conditional autoregressive models for analyzing multivariate areal data. The proposed spatial model flexibly characterizes both the spatial and the outcome correlation structures at an appealing computational cost. Examples consisting of both synthetic and real-world data show the utility of this new framework in terms of robust estimation as well as improved predictive performance.

preprint2010arXiv

Ergodicity of hypoelliptic SDEs driven by fractional Brownian motion

We demonstrate that stochastic differential equations (SDEs) driven by fractional Brownian motion with Hurst parameter H > 1/2 have similar ergodic properties as SDEs driven by standard Brownian motion. The focus in this article is on hypoelliptic systems satisfying Hörmander's condition. We show that such systems satisfy a suitable version of the strong Feller property and we conclude that they admit a unique stationary solution that is physical in the sense that it does not "look into the future". The main technical result required for the analysis is a bound on the moments of the inverse of the Malliavin covariance matrix, conditional on the past of the driving noise.

preprint2010arXiv

Optimal tuning of the Hybrid Monte-Carlo Algorithm

We investigate the properties of the Hybrid Monte-Carlo algorithm (HMC) in high dimensions. HMC develops a Markov chain reversible w.r.t. a given target distribution $Π$ by using separable Hamiltonian dynamics with potential $-\logΠ$. The additional momentum variables are chosen at random from the Boltzmann distribution and the continuous-time Hamiltonian dynamics are then discretised using the leapfrog scheme. The induced bias is removed via a Metropolis-Hastings accept/reject rule. In the simplified scenario of independent, identically distributed components, we prove that, to obtain an $\mathcal{O}(1)$ acceptance probability as the dimension $d$ of the state space tends to $\infty$, the leapfrog step-size $h$ should be scaled as $h= l \times d^{-1/4}$. Therefore, in high dimensions, HMC requires $\mathcal{O}(d^{1/4})$ steps to traverse the state space. We also identify analytically the asymptotically optimal acceptance probability, which turns out to be 0.651 (to three decimal places). This is the choice which optimally balances the cost of generating a proposal, which {\em decreases} as $l$ increases, against the cost related to the average number of proposals required to obtain acceptance, which {\em increases} as $l$ increases.