Researcher profile

Víctor Elvira

Víctor Elvira contributes to research discovery and scholarly infrastructure.

9 works, 0 followers, 7 topics, 4 close collaborators

Published work

9 published items

preprint, 2022, arXiv

A principled stopping rule for importance sampling

Importance sampling (IS) is a Monte Carlo technique that relies on weighted samples, simulated from a proposal distribution, to estimate intractable integrals. The quality of the estimators improves with the number of samples. However, the number of samples required to achieve a desired quality of estimation is unknown and depends on the quantity of interest, the estimator, and the chosen proposal. We present a sequential stopping rule that terminates simulation when the overall variability in estimation is relatively small. The proposed methodology closely connects to the idea of an effective sample size in IS and overcomes crucial shortcomings of existing metrics, e.g., it acknowledges multivariate estimation problems. Our stopping rule retains asymptotic guarantees and provides users with a clear guideline on when to stop the simulation in IS.
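The idea of stopping once the estimated variability is small can be illustrated with a minimal sketch. This is not the rule from the paper: it is a generic batch-wise check on the delta-method standard error of a self-normalized IS estimator, and the target N(1, 1), the proposal N(0, 2^2), the tolerance, and all function names are assumptions chosen for illustration.

```python
import math
import random

def log_target(x):
    # Unnormalized log-density of N(1, 1); the constant is deliberately dropped.
    return -0.5 * (x - 1.0) ** 2

def log_proposal(x, scale=2.0):
    # Log-density of the proposal N(0, scale^2).
    return -0.5 * (x / scale) ** 2 - math.log(scale * math.sqrt(2.0 * math.pi))

def snis_with_stopping(f, tol=0.02, batch=500, max_samples=200_000, seed=0):
    """Self-normalized IS that stops when the estimated standard error < tol."""
    rng = random.Random(seed)
    xs, ws = [], []
    est, std_err = float("nan"), float("inf")
    while len(xs) < max_samples:
        for _ in range(batch):
            x = rng.gauss(0.0, 2.0)
            xs.append(x)
            ws.append(math.exp(log_target(x) - log_proposal(x)))
        total = sum(ws)
        est = sum(w * f(x) for w, x in zip(ws, xs)) / total
        # Delta-method variance estimate of the self-normalized estimator.
        var = sum((w * (f(x) - est)) ** 2 for w, x in zip(ws, xs)) / total ** 2
        std_err = math.sqrt(var)
        if std_err < tol:
            break
    return est, std_err, len(xs)
```

Calling `snis_with_stopping(lambda x: x)` estimates the target mean and reports how many samples were drawn before the check was satisfied. This sketch handles one scalar quantity only, whereas the abstract stresses multivariate estimation problems.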

preprint, 2022, arXiv

Large Data and (Not Even Very) Complex Ecological Models: When Worlds Collide

We consider the challenges that arise when fitting complex ecological models to 'large' data sets. In particular, we focus on random effect models, which are commonly used to describe the individual heterogeneity often present in ecological populations under study. In general, these models lead to a likelihood that is expressible only as an analytically intractable integral. Common techniques for fitting such models to data include, for example, numerical approximations of the integral or a Bayesian data augmentation approach. However, as the size of the data set increases (i.e. the number of individuals increases), these tools may become computationally infeasible. We present an efficient Bayesian model-fitting approach, whereby we initially sample from the posterior distribution of a smaller subsample of the data, before correcting this sample to obtain estimates of the posterior distribution of the full dataset, using an importance sampling approach. We consider several practical issues, including the subsampling mechanism, computational efficiencies (including the ability to parallelise the algorithm) and combining subsampling estimates using multiple subsampled datasets. We demonstrate the approach in relation to individual heterogeneity capture-recapture models. We initially demonstrate the feasibility of the approach via simulated data before considering a challenging real dataset of approximately 30,000 guillemots, and obtain posterior estimates in substantially reduced computational time.
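The "sample from a subsample posterior, then correct by importance sampling" step can be sketched on a toy problem. The model below (y_i ~ N(theta, 1) with a flat prior) is an assumption chosen because both posteriors are known in closed form. It is not the capture-recapture model from the abstract, and the function name is hypothetical.

```python
import math
import random

def subsample_then_correct(data, m, n_draws=5000, seed=1):
    """Posterior mean of theta for y_i ~ N(theta, 1) under a flat prior,
    computed by sampling the subsample posterior and reweighting."""
    rng = random.Random(seed)
    sub, rest = data[:m], data[m:]   # assumes the data are in random order
    sub_mean = sum(sub) / m
    # Step 1: draws from the subsample posterior N(mean(sub), 1/m).
    draws = [rng.gauss(sub_mean, 1.0 / math.sqrt(m)) for _ in range(n_draws)]
    # Step 2: the importance weight is the likelihood of the left-out data,
    # since full posterior / subsample posterior is proportional to it.
    log_w = [-0.5 * sum((y - th) ** 2 for y in rest) for th in draws]
    top = max(log_w)                 # subtract the maximum for numerical stability
    w = [math.exp(lw - top) for lw in log_w]
    return sum(wi * th for wi, th in zip(w, draws)) / sum(w)
```

In this toy case the exact full-data posterior mean is the sample mean of `data`, which gives a direct check on the corrected estimate.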

preprint, 2022, arXiv

Multiple Importance Sampling ELBO and Deep Ensembles of Variational Approximations

In variational inference (VI), the marginal log-likelihood is estimated using the standard evidence lower bound (ELBO), or improved versions such as the importance weighted ELBO (IWELBO). We propose the multiple importance sampling ELBO (MISELBO), a versatile yet simple framework. MISELBO is applicable in both amortized and classical VI, and it uses ensembles, e.g., deep ensembles, of independently inferred variational approximations. As far as we are aware, the concept of deep ensembles in amortized VI has not previously been established. We prove that MISELBO provides a tighter bound than the average of standard ELBOs, and demonstrate empirically that it gives tighter bounds than the average of IWELBOs. MISELBO is evaluated in density-estimation experiments that include MNIST and several real-data phylogenetic tree inference problems. First, on the MNIST dataset, MISELBO boosts the density-estimation performance of a state-of-the-art model, nouveau VAE. Second, in the phylogenetic tree inference setting, our framework enhances a state-of-the-art VI algorithm that uses normalizing flows. Beyond its technical benefits, MISELBO unveils connections between VI and recent advances in the importance sampling literature, paving the way for further methodological advances. We provide our code at https://github.com/Lagergren-Lab/MISELBO.
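A rough sketch of the comparison between the average of standard ELBOs and a mixture-denominator bound is given below. It assumes one-dimensional Gaussian approximations and a single importance sample per draw. The function names are hypothetical and this is not the code from the linked repository.

```python
import math
import random

def log_normal(z, mu, sigma):
    return -0.5 * ((z - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def log_mixture(z, components):
    # Log-density of the equally weighted mixture of the approximations.
    logs = [log_normal(z, mu, s) for mu, s in components]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs) / len(logs))

def elbo_estimates(log_joint, components, n_samples=500, seed=0):
    """Return (average of standard ELBOs, mixture-based bound) from the same draws."""
    rng = random.Random(seed)
    avg_elbo = mis_elbo = 0.0
    for mu, s in components:
        for _ in range(n_samples):
            z = rng.gauss(mu, s)
            # Standard ELBO term: denominator is the density that produced z.
            avg_elbo += log_joint(z) - log_normal(z, mu, s)
            # Mixture term: denominator is the whole ensemble.
            mis_elbo += log_joint(z) - log_mixture(z, components)
    total = n_samples * len(components)
    return avg_elbo / total, mis_elbo / total
```

When the unnormalized target is an exact multiple of the ensemble mixture, every mixture term equals the log normalizing constant, so the second value is exact while the first stays strictly below it.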

preprint, 2022, arXiv

Optimized Population Monte Carlo

Adaptive importance sampling (AIS) methods are increasingly used for the approximation of distributions and related intractable integrals in the context of Bayesian inference. Population Monte Carlo (PMC) algorithms are a subclass of AIS methods, widely used due to their ease of adaptation. In this paper, we propose a novel algorithm that exploits the benefits of the PMC framework and includes more efficient adaptive mechanisms, exploiting geometric information of the target distribution. In particular, the novel algorithm adapts the location and scale parameters of a set of importance densities (proposals). At each iteration, the location parameters are adapted by combining a versatile resampling strategy (i.e., using the information of previous weighted samples) with an advanced optimization-based scheme. Local second-order information of the target distribution is incorporated through a preconditioning matrix acting as a scaling metric on the gradient direction. A damped Newton approach is adopted to ensure robustness of the scheme. The resulting metric is also used to update the scale parameters of the proposals. We discuss several key theoretical foundations for the proposed approach. Finally, we show the successful performance of the proposed method in three numerical examples involving challenging distributions.
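The optimization part of such an adaptation can be sketched in one dimension. The code below shows a generic damped Newton step on a proposal location with step halving, and a scale taken from the local curvature. It omits the resampling component, and the function name, the halving rule, and the fallback behaviour are assumptions rather than the paper's algorithm.

```python
import math

def damped_newton_step(mu, log_target, grad, hess, max_halvings=20):
    """One damped Newton update of a proposal location in 1-D.
    Returns (new_location, scale); scale is None when the curvature is unusable."""
    precision = -hess(mu)            # negative Hessian of the log-target at mu
    if precision <= 0.0:
        return mu, None              # not locally log-concave: leave the proposal alone
    direction = grad(mu) / precision # gradient preconditioned by the inverse curvature
    scale = math.sqrt(1.0 / precision)
    step = 1.0
    # Damping: halve the step until the log-target does not decrease.
    for _ in range(max_halvings):
        candidate = mu + step * direction
        if log_target(candidate) >= log_target(mu):
            return candidate, scale
        step *= 0.5
    return mu, scale
```

For a Gaussian target the full step is accepted at once and lands on the mode, with the scale equal to the target's standard deviation.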

preprint, 2022, arXiv

Rethinking the Effective Sample Size

The effective sample size (ESS) is widely used in sample-based simulation methods for assessing the quality of a Monte Carlo approximation of a given distribution and of related integrals. In this paper, we revisit the approximation of the ESS in the specific context of importance sampling (IS). The derivation of this approximation, which we denote as $\widehat{\text{ESS}}$, is partially available in Kong (1992). This approximation has been widely used in the last 25 years due to its simplicity as a practical rule of thumb in a wide variety of importance sampling methods. However, we show that the multiple assumptions and approximations in the derivation of $\widehat{\text{ESS}}$ make it difficult to consider it even a reasonable approximation of the ESS. We extend the discussion of $\widehat{\text{ESS}}$ to the multiple importance sampling (MIS) setting, display numerical examples, and discuss several avenues for developing alternative metrics. This paper does not cover the use of ESS for MCMC algorithms.
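For reference, the rule of thumb under discussion is usually written as $\widehat{\text{ESS}} = 1 / \sum_{n=1}^{N} \bar{w}_n^2$, where $\bar{w}_n$ are the normalized importance weights. A minimal sketch follows (the function name is chosen here for illustration):

```python
def ess_hat(weights):
    """Rule-of-thumb effective sample size: 1 / sum of squared normalized weights."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)
```

The value ranges from 1 (a single sample carries all the weight) to N (all weights equal). It depends only on the weights and not on the function being integrated, which is one reason it is only a rough diagnostic.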

preprint, 2022, arXiv

Variance Analysis of Multiple Importance Sampling Schemes

Multiple importance sampling (MIS) is an increasingly used methodology where several proposal densities are used to approximate integrals, generally involving target probability density functions. The use of several proposals allows for a large variety of sampling and weighting schemes, so the practitioner must choose a given scheme, i.e., a sampling mechanism and a weighting function. A variance analysis has been proposed in Elvira et al. (2019, Statistical Science 34, 129-155), showing the superiority of the balanced heuristic estimator with respect to other competing schemes in some scenarios. However, some of their results are valid only for two proposals. In this paper, we extend and generalize these results, providing novel proofs that allow us to determine the variance relations among MIS schemes.
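Two of the weighting schemes being compared can be sketched as follows, assuming one sample per Gaussian proposal and a normalized target. "standard" divides by the density that generated the sample, while "balance" divides by the equally weighted mixture of all proposals. The function names and the toy setup are illustrative assumptions.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mis_estimate(f, target_pdf, proposals, scheme="balance", seed=0):
    """Draw one sample from each proposal and estimate the integral of f * target."""
    rng = random.Random(seed)
    n = len(proposals)
    total = 0.0
    for mu, sigma in proposals:
        x = rng.gauss(mu, sigma)
        if scheme == "balance":
            # Balance heuristic: denominator is the mixture of all proposals.
            denom = sum(normal_pdf(x, m, s) for m, s in proposals) / n
        else:
            # Standard weights: denominator is the proposal that produced x.
            denom = normal_pdf(x, mu, sigma)
        total += f(x) * target_pdf(x) / denom
    return total / n
```

In the extreme case where the target equals the proposal mixture and f is constant, the balance-heuristic estimate has zero variance, while the standard-weight estimate still fluctuates from run to run.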

preprint, 2021, arXiv

Importance Gaussian Quadrature

Importance sampling (IS) and numerical integration methods are usually employed for approximating moments of complicated target distributions. In its basic procedure, the IS methodology randomly draws samples from a proposal distribution and weights them accordingly, accounting for the mismatch between the target and proposal. In this work, we present a general framework of numerical integration techniques inspired by the IS methodology. The framework can also be seen as an incorporation of deterministic rules into IS methods, reducing the error of the estimators by several orders of magnitude in several problems of interest. The proposed approach extends the range of applicability of the Gaussian quadrature rules. For instance, the IS perspective allows us to use Gauss-Hermite rules in problems where the integrand does not involve a Gaussian distribution and, moreover, when the integrand can only be evaluated up to a normalizing constant, as is usually the case in Bayesian inference. The novel perspective makes use of recent advances in the multiple IS (MIS) and adaptive IS (AIS) literatures, and incorporates them into a wider numerical integration framework that combines several numerical integration rules that can be iteratively adapted. We analyze the convergence of the algorithms and provide some representative examples showing the superiority of the proposed approach in terms of performance.
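The basic idea of reweighting Gauss-Hermite nodes by a target-to-proposal ratio can be sketched with a fixed 3-point rule. This is a simplified reading of the approach: the self-normalized form, the Gaussian proposal N(mu, sigma^2), and the function name are assumptions made for illustration.

```python
import math

# 3-point Gauss-Hermite rule for the weight function exp(-t^2).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6.0,
              2.0 * math.sqrt(math.pi) / 3.0,
              math.sqrt(math.pi) / 6.0)

def importance_gauss_hermite(f, unnorm_target, mu, sigma):
    """Approximate E_target[f] using quadrature nodes placed under N(mu, sigma^2)
    and reweighted by target / proposal; the target may be unnormalized."""
    num = den = 0.0
    for t, v in zip(GH_NODES, GH_WEIGHTS):
        x = mu + math.sqrt(2.0) * sigma * t   # node mapped to the proposal's scale
        q = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        w = v * unnorm_target(x) / q          # quadrature weight times importance ratio
        num += w * f(x)
        den += w
    return num / den                          # self-normalization removes the unknown constant
```

When the proposal matches the target's shape the importance ratio is constant and the rule reduces to ordinary Gauss-Hermite quadrature, which is exact for polynomials up to degree 5 with three nodes.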

preprint, 2020, arXiv

GraphEM: EM algorithm for blind Kalman filtering under graphical sparsity constraints

Modeling and inference with multivariate sequences is central in a number of signal processing applications such as acoustics, social network analysis, biomedical, and finance, to name a few. The linear-Gaussian state-space model is a common way to describe a time series through the evolution of a hidden state, with the advantage of presenting a simple inference procedure due to the celebrated Kalman filter. A fundamental question when analyzing multivariate sequences is the search for relationships between their entries (or the modeled hidden states), especially when the inherent structure is a non-fully connected graph. In such a context, graphical modeling combined with parsimony constraints makes it possible to limit the proliferation of parameters and enables a compact data representation that is easier for experts to interpret. In this work, we propose a novel expectation-maximization algorithm for estimating the linear matrix operator in the state equation of a linear-Gaussian state-space model. Lasso regularization is included in the M-step, which we solve using a proximal splitting Douglas-Rachford algorithm. Numerical experiments illustrate the benefits of the proposed model and inference technique, named GraphEM, over competitors relying on Granger causality.
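The Kalman filter recursions that make inference simple in a linear-Gaussian state-space model can be sketched in the scalar case. This shows only the filtering step for known parameters, not the EM estimation of the state matrix or the Lasso M-step, and the parameter names are chosen here for illustration.

```python
def kalman_filter_1d(observations, a, q, h, r, m0, p0):
    """Filtering means and variances for the scalar model
    x_k = a * x_{k-1} + N(0, q),  y_k = h * x_k + N(0, r)."""
    m, p = m0, p0
    means, variances = [], []
    for y in observations:
        # Predict: propagate the previous posterior through the state equation.
        m_pred = a * m
        p_pred = a * p * a + q
        # Update: correct the prediction with the new observation.
        s = h * p_pred * h + r          # innovation variance
        k = p_pred * h / s              # Kalman gain
        m = m_pred + k * (y - h * m_pred)
        p = (1.0 - k * h) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances
```

With a static state (a = 1, q = 0) and a very diffuse prior, the filtered mean approaches the running average of the observations, which is a convenient sanity check.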

preprint, 2016, arXiv

Improving Population Monte Carlo: Alternative Weighting and Resampling Schemes

Population Monte Carlo (PMC) sampling methods are powerful tools for approximating distributions of static unknowns given a set of observations. These methods are iterative in nature: at each step they generate samples from a proposal distribution and assign them weights according to the importance sampling principle. Critical issues in applying PMC methods are the choice of the generating functions for the samples and the avoidance of sample degeneracy. In this paper, we propose three new schemes that considerably improve the performance of the original PMC formulation by allowing for better exploration of the space of unknowns and by selecting the surviving samples more adequately. A theoretical analysis is performed, proving the superiority of the novel schemes in terms of variance of the associated estimators and preservation of the sample diversity. Furthermore, we show that they outperform other state-of-the-art algorithms (both in terms of mean square error and robustness w.r.t. initialization) through extensive numerical simulations.
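The iterative sample, weight, and resample loop of PMC can be sketched as below. The weights use the mixture of all current proposals in the denominator, a common choice in the multiple IS literature. Whether this matches any of the paper's three schemes is an assumption, as are the one-dimensional setup, the fixed proposal scale, and the function names.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def pmc_mean(log_target, n_proposals=100, n_iters=10, sigma=1.0, seed=0):
    """Estimate the target mean with a basic PMC loop (n_iters must be >= 1)."""
    rng = random.Random(seed)
    locations = [rng.uniform(-10.0, 10.0) for _ in range(n_proposals)]
    for _ in range(n_iters):
        # Sampling: one draw from each Gaussian proposal.
        samples = [rng.gauss(mu, sigma) for mu in locations]
        # Weighting: target over the mixture of all current proposals.
        weights = []
        for x in samples:
            mixture = sum(normal_pdf(x, mu, sigma) for mu in locations) / n_proposals
            weights.append(math.exp(log_target(x)) / mixture)
        # Adaptation: multinomial resampling gives the next proposal locations.
        locations = rng.choices(samples, weights=weights, k=n_proposals)
    # Self-normalized estimate from the last iteration's weighted samples.
    return sum(w * x for w, x in zip(weights, samples)) / sum(weights)
```

For example, `pmc_mean(lambda x: -0.5 * (x - 3.0) ** 2)` targets an unnormalized N(3, 1), and the proposal locations migrate from their broad initial spread toward the region around 3 over the iterations.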