Source author record

Dootika Vats

Dootika Vats appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computation, Methodology, math.ST, Statistics Theory, math.PR

Catalog footprint

What is connected

7 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

A principled stopping rule for importance sampling

Importance sampling (IS) is a Monte Carlo technique that relies on weighted samples, simulated from a proposal distribution, to estimate intractable integrals. The quality of the estimators improves with the number of samples. However, for achieving a desired quality of estimation, the required number of samples is unknown and depends on the quantity of interest, the estimator, and the chosen proposal. We present a sequential stopping rule that terminates simulation when the overall variability in estimation is relatively small. The proposed methodology closely connects to the idea of an effective sample size in IS and overcomes crucial shortcomings of existing metrics, e.g., it acknowledges multivariate estimation problems. Our stopping rule retains asymptotic guarantees and provides users a clear guideline on when to stop the simulation in IS.
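The abstract ties its stopping rule to the idea of an effective sample size in IS. As background only, the following is a minimal sketch of the classical Kish effective sample size for self-normalized importance sampling; it is not the stopping rule proposed in the paper. The target N(0, 1), the proposal N(0, 2^2), the integrand x^2, and the function names `kish_ess` and `self_normalized_is` are all assumptions chosen for illustration.

```python
import math
import random


def kish_ess(weights):
    """Classical Kish effective sample size: (sum w)^2 / sum w^2."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def self_normalized_is(n, seed=0):
    """Estimate E[X^2] under N(0, 1) with a N(0, 2^2) proposal (illustrative choice)."""
    rng = random.Random(seed)
    sigma = 2.0
    weights, values = [], []
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        # Unnormalized log weight: log target minus log proposal, constants dropped.
        log_w = -0.5 * x * x + 0.5 * (x / sigma) ** 2
        weights.append(math.exp(log_w))
        values.append(x * x)
    # Self-normalized estimator: weighted average of the integrand values.
    estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate, kish_ess(weights)


estimate, ess = self_normalized_is(5000)
print(f"estimate of E[X^2] = {estimate:.3f}, Kish ESS = {ess:.0f} of 5000")
```

The Kish quantity depends only on the weights, not on the integrand, which is one reason a metric that accounts for the quantity of interest can be preferable.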

Preprint, 2022, arXiv

Dimension-free Mixing for High-dimensional Bayesian Variable Selection

Yang et al. (2016) proved that the symmetric random walk Metropolis-Hastings algorithm for Bayesian variable selection is rapidly mixing under mild high-dimensional assumptions. We propose a novel MCMC sampler using an informed proposal scheme, which we prove achieves a much faster mixing time that is independent of the number of covariates, under the same assumptions. To the best of our knowledge, this is the first high-dimensional result which rigorously shows that the mixing rate of informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation. Motivated by the theoretical analysis of our sampler, we further propose a new approach called "two-stage drift condition" to studying convergence rates of Markov chains on general state spaces, which can be useful for obtaining tight complexity bounds in high-dimensional settings. The practical advantages of our algorithm are illustrated by both simulation studies and real data analysis.

Preprint, 2022, arXiv

Optimal Scaling of MCMC Beyond Metropolis

The problem of optimally scaling the proposal distribution in a Markov chain Monte Carlo algorithm is critical to the quality of the generated samples. Much work has gone into obtaining such results for various Metropolis-Hastings (MH) algorithms. Recently, acceptance probabilities other than MH have been employed in problems with intractable target distributions. Little guidance is available on tuning Gaussian proposal distributions in this situation. We obtain optimal scaling results for a general class of acceptance functions, which includes Barker's and Lazy-MH. In particular, optimal values for Barker's algorithm are derived and found to be significantly different from those obtained for the MH algorithm. Our theoretical conclusions are supported by numerical simulations indicating that when the optimal proposal variance is unknown, tuning to the optimal acceptance probability remains an effective strategy.
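To make the contrast between acceptance functions concrete, here is a minimal sketch comparing the standard MH acceptance probability with Barker's for a symmetric Gaussian random-walk proposal. It does not reproduce the paper's optimal scaling results. The one-dimensional N(0, 1) target, the proposal scale 2.4, and the function names are assumptions chosen for illustration.

```python
import math
import random


def mh_acceptance(ratio):
    """Metropolis-Hastings acceptance for a symmetric proposal: min(1, r)."""
    return min(1.0, ratio)


def barker_acceptance(ratio):
    """Barker's acceptance for a symmetric proposal: r / (1 + r)."""
    return ratio / (1.0 + ratio)


def random_walk_acceptance_rate(accept_fn, scale, n_steps=20000, seed=1):
    """Run a random-walk sampler on N(0, 1) and return the empirical acceptance rate."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, scale)
        # Target density ratio pi(y) / pi(x) for a standard normal target.
        ratio = math.exp(0.5 * (x * x - y * y))
        if rng.random() < accept_fn(ratio):
            x = y
            accepted += 1
    return accepted / n_steps


mh_rate = random_walk_acceptance_rate(mh_acceptance, scale=2.4)
barker_rate = random_walk_acceptance_rate(barker_acceptance, scale=2.4)
print(f"MH acceptance rate:     {mh_rate:.3f}")
print(f"Barker acceptance rate: {barker_rate:.3f}")
```

For any density ratio, Barker's acceptance probability is at most the MH probability and at least half of it, so the same proposal scale yields a lower acceptance rate under Barker's rule. That is consistent with the abstract's point that tuning targets carried over from MH need not apply.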

Preprint, 2021, arXiv

Bayesian equation selection on sparse data for discovery of stochastic dynamical systems

Often the underlying system of differential equations driving a stochastic dynamical system is assumed to be known, with inference conditioned on this assumption. We present a Bayesian framework for discovering this system of differential equations under assumptions that align with real-life scenarios, including the availability of relatively sparse data. Further, we discuss computational strategies that are critical in teasing out the important details about the dynamical system and algorithmic innovations to solve for acute parameter interdependence in the absence of rich data. This gives a complete Bayesian pathway for model identification via a variable selection paradigm and parameter estimation of the corresponding model using only the observed data. We present detailed computations and analysis of the Lorenz-96, Lorenz-63, and Ornstein-Uhlenbeck systems using the Bayesian framework we propose.

Preprint, 2020, arXiv

Assessing and Visualizing Simultaneous Simulation Error

Monte Carlo experiments produce samples in order to estimate features of a given distribution. However, simultaneous estimation of means and quantiles has received little attention, despite being common practice. In this setting we establish a multivariate central limit theorem for any finite combination of sample means and quantiles under the assumption of a strongly mixing process, which includes the standard Monte Carlo and Markov chain Monte Carlo settings. We build on this to provide a fast algorithm for constructing hyperrectangular confidence regions having the desired simultaneous coverage probability and a convenient marginal interpretation. The methods are incorporated into standard ways of visualizing the results of Monte Carlo experiments enabling the practitioner to more easily assess the reliability of the results. We demonstrate the utility of this approach in various Monte Carlo settings including simulation studies based on independent and identically distributed samples and Bayesian analyses using Markov chain Monte Carlo sampling.

Preprint, 2020, arXiv

Revisiting the Gelman-Rubin Diagnostic

Gelman and Rubin's (1992) convergence diagnostic is one of the most popular methods for terminating a Markov chain Monte Carlo (MCMC) sampler. Since the seminal paper, researchers have developed sophisticated methods for estimating variance of Monte Carlo averages. We show that these estimators find immediate use in the Gelman-Rubin statistic, a connection not previously established in the literature. We incorporate these estimators to upgrade both the univariate and multivariate Gelman-Rubin statistics, leading to improved stability in MCMC termination time. An immediate advantage is that our new Gelman-Rubin statistic can be calculated for a single chain. In addition, we establish a one-to-one relationship between the Gelman-Rubin statistic and effective sample size. Leveraging this relationship, we develop a principled termination criterion for the Gelman-Rubin statistic. Finally, we demonstrate the utility of our improved diagnostic via examples.
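For reference, the following is a minimal sketch of the classical multi-chain, univariate Gelman-Rubin statistic that the abstract takes as its starting point. It is not the improved statistic the paper proposes, which replaces the variance estimators and also works with a single chain. The function name `gelman_rubin` and the simulated chains are assumptions chosen for illustration.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classical potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


rng = random.Random(42)
# Four chains drawn from the same distribution: the statistic should be near 1.
agreeing = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains centered at different locations: the statistic should exceed 1.
disagreeing = [[rng.gauss(float(k), 1.0) for _ in range(1000)] for k in range(4)]

rhat_agree = gelman_rubin(agreeing)
rhat_disagree = gelman_rubin(disagreeing)
print(f"R-hat, agreeing chains:    {rhat_agree:.3f}")
print(f"R-hat, disagreeing chains: {rhat_disagree:.3f}")
```

The simulated chains here are independent draws rather than MCMC output, which keeps the sketch short but hides the autocorrelation that motivates better variance estimators.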

Preprint, 2016, arXiv

Strong Consistency of Multivariate Spectral Variance Estimators

Markov chain Monte Carlo (MCMC) algorithms are used to estimate features of interest of a distribution. The Monte Carlo error in estimation has an asymptotic normal distribution whose multivariate nature has so far been ignored in the MCMC community. We present a class of multivariate spectral variance estimators for the asymptotic covariance matrix in the Markov chain central limit theorem and provide conditions for strong consistency. We examine the finite sample properties of the multivariate spectral variance estimators and their eigenvalues in the context of a vector autoregressive process of order 1.
