Researcher profile

Nicolas Chopin

Nicolas Chopin contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

preprint · 2023 · arXiv

Fast Slate Policy Optimization: Going Beyond Plackett-Luce

An increasingly important building block of large-scale machine learning systems is based on returning slates: ordered lists of items given a query. Applications of this technology include search, information retrieval and recommender systems. When the action space is large, decision systems are restricted to a particular structure to complete online queries quickly. This paper addresses the optimization of these large-scale decision systems given an arbitrary reward function. We cast this learning problem in a policy optimization framework and propose a new class of policies, born from a novel relaxation of decision functions. This results in a simple yet efficient learning algorithm that scales to massive action spaces. We compare our method to the commonly adopted Plackett-Luce policy class and demonstrate the effectiveness of our approach on problems with action space sizes on the order of millions.
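For readers unfamiliar with the Plackett-Luce baseline named in this abstract, the following is a minimal sketch of drawing one slate from a Plackett-Luce distribution with the Gumbel top-k trick. It illustrates the baseline policy class only, not the relaxation the paper proposes; the function name `sample_plackett_luce_slate` and the example scores are assumptions made for illustration.

```python
import math
import random


def sample_plackett_luce_slate(scores, k, rng):
    """Draw an ordered slate of k distinct items from a Plackett-Luce model.

    Gumbel top-k trick: perturb every score (logit) with independent
    Gumbel(0, 1) noise and keep the k largest perturbed scores, in order.
    """
    perturbed = []
    for item, score in enumerate(scores):
        u = rng.random()
        # rng.random() lies in [0, 1); redraw on exactly 0 so log(u) is defined.
        while u == 0.0:
            u = rng.random()
        gumbel = -math.log(-math.log(u))
        perturbed.append((score + gumbel, item))
    perturbed.sort(reverse=True)
    return [item for _, item in perturbed[:k]]


rng = random.Random(0)
scores = [2.0, 1.0, 0.5, 0.0, -1.0]  # hypothetical item scores (logits)
print(sample_plackett_luce_slate(scores, k=3, rng=rng))
```

This version perturbs and sorts every item, so one draw costs O(n log n) in the number of items.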

preprint · 2022 · arXiv

De-Sequentialized Monte Carlo: a parallel-in-time particle smoother

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed to approximate the joint distribution of the states given observations from a state-space model. We propose dSMC (de-Sequentialized Monte Carlo), a new particle smoother that is able to process $T$ observations in $\mathcal{O}(\log T)$ time on a parallel architecture. This compares favourably with standard particle smoothers, the complexity of which is linear in $T$. We derive $\mathcal{L}_p$ convergence results for dSMC, with an explicit upper bound, polynomial in $T$. We then discuss how to reduce the variance of the smoothing estimates computed by dSMC by (i) designing good proposal distributions for sampling the particles at the initialization of the algorithm, and (ii) using lazy resampling to increase the number of particles used in dSMC. Finally, we design a particle Gibbs sampler based on dSMC, which is able to perform parameter inference in a state-space model at a $\mathcal{O}(\log T)$ cost on parallel hardware.

preprint · 2022 · arXiv

Improved Gibbs samplers for Cosmic Microwave Background power spectrum estimation

We study different variants of the Gibbs sampler algorithm from the perspective of their applicability to the estimation of power spectra of the cosmic microwave background (CMB) anisotropies. These include approaches studied earlier in the CMB literature as well as new ones which are proposed in this work. We demonstrate all these variants on full and cut sky simulations and compare their performance, assessing both their computational and statistical efficiency. For this we employ a consistent comparison metric, an effective sample size (ESS) per second, commonly used in this context in the statistical literature. We show that one of the proposed approaches, referred to as Centered overrelax, which capitalizes on additional, auxiliary variables to minimize computational time needed per sample, and uses overrelaxation to decorrelate subsequent samples, performs better than the standard Gibbs sampler by a factor between one and two orders of magnitude in the nearly full-sky, satellite-like cases. It therefore potentially provides an interesting alternative to the currently favored approaches.
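The comparison metric mentioned in this abstract, effective sample size (ESS) per second, can be sketched as follows. This is a minimal illustration under stated assumptions: the autocorrelation sum is truncated at the first non-positive lag (a simple heuristic, not necessarily the estimator used in the paper), and the toy AR(1) chain `ar1_chain` merely stands in for the output of a sampler.

```python
import random
import time


def effective_sample_size(chain):
    """Estimate ESS = n / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive sample autocorrelation.
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        raise ValueError("chain has zero variance")
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


def ar1_chain(n, phi, rng):
    """Toy correlated chain: x_t = phi * x_{t-1} + standard normal noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


rng = random.Random(0)
start = time.perf_counter()
chain = ar1_chain(2000, 0.9, rng)
elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
ess = effective_sample_size(chain)
print(f"ESS = {ess:.1f}, ESS per second = {ess / elapsed:.1f}")
```

Dividing by wall-clock time is what lets a sampler with cheaper but more correlated draws be compared fairly against one with expensive but nearly independent draws.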

preprint · 2022 · arXiv

On resampling schemes for particle filters with weakly informative observations

We consider particle filters with weakly informative observations (or 'potentials') relative to the latent state dynamics. The particular focus of this work is on particle filters to approximate time-discretisations of continuous-time Feynman-Kac path integral models (a scenario that naturally arises when addressing filtering and smoothing problems in continuous time), but our findings are indicative of weakly informative settings beyond this context too. We study the performance of different resampling schemes, such as systematic resampling, SSP (Srinivasan sampling process) and stratified resampling, as the time-discretisation becomes finer and also identify their continuous-time limit, which is expressed as a suitably defined 'infinitesimal generator'. By contrasting these generators, we find that (certain modifications of) systematic and SSP resampling 'dominate' stratified and independent 'killing' resampling in terms of their limiting overall resampling rate. The reduced intensity of resampling manifests itself in lower variance in our numerical experiment. This efficiency result, through an ordering of the resampling rate, is new to the literature. The second major contribution of this work concerns the analysis of the limiting behaviour of the entire population of particles of the particle filter as the time discretisation becomes finer. We provide the first proof, under general conditions, that the particle approximation of the discretised continuous-time Feynman-Kac path integral models converges to a (uniformly weighted) continuous-time particle system.

preprint · 2020 · arXiv

Adaptive Tuning Of Hamiltonian Monte Carlo Within Sequential Monte Carlo

Sequential Monte Carlo (SMC) samplers form an attractive alternative to MCMC for Bayesian computation. However, their performance depends strongly on the Markov kernels used to rejuvenate particles. We discuss how to calibrate automatically (using the current particles) Hamiltonian Monte Carlo kernels within SMC. To do so, we build upon the adaptive SMC approach of Fearnhead and Taylor (2013), and we also suggest alternative methods. We illustrate the advantages of using HMC kernels within an SMC sampler via an extensive numerical study.
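To make concrete which quantities need calibrating, here is a minimal sketch of a single HMC move applied to a set of particles, with the two tuning parameters (`step_size` and `n_steps`) exposed. It is a plain HMC kernel for a scalar state with a standard normal target, assumed purely for illustration; it does not implement the adaptive tuning scheme of the paper, and all names are hypothetical.

```python
import math
import random


def leapfrog(q, p, grad_u, step_size, n_steps):
    """Leapfrog integration of Hamiltonian dynamics for a scalar state.

    grad_u is the gradient of the negative log target density.
    """
    p = p - 0.5 * step_size * grad_u(q)  # initial half step for momentum
    for i in range(n_steps):
        q = q + step_size * p  # full step for position
        if i < n_steps - 1:
            p = p - step_size * grad_u(q)  # full step for momentum
    p = p - 0.5 * step_size * grad_u(q)  # final half step for momentum
    return q, p


def hmc_step(q, u, grad_u, step_size, n_steps, rng):
    """One HMC move: fresh momentum, leapfrog trajectory, accept or reject."""
    p0 = rng.gauss(0.0, 1.0)
    q_new, p_new = leapfrog(q, p0, grad_u, step_size, n_steps)
    h_old = u(q) + 0.5 * p0 * p0
    h_new = u(q_new) + 0.5 * p_new * p_new
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < h_old - h_new:
        return q_new
    return q


def u(q):
    return 0.5 * q * q  # negative log density of N(0, 1), up to a constant


def grad_u(q):
    return q


rng = random.Random(0)
particles = [rng.gauss(0.0, 3.0) for _ in range(500)]  # overdispersed start
moved = [hmc_step(q, u, grad_u, 0.2, 10, rng) for q in particles]
print(sum(x * x for x in moved) / len(moved))
```

Both `step_size` and `n_steps` are fixed by hand here; choosing them from the current particles is the problem the abstract addresses.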

preprint · 2020 · arXiv

Metropolis-Hastings with Averaged Acceptance Ratios

Markov chain Monte Carlo (MCMC) methods to sample from a probability distribution $\pi$ defined on a space $(\Theta,\mathcal{T})$ consist of the simulation of realisations of Markov chains $\{\theta_{n},n\geq1\}$ of invariant distribution $\pi$ and such that the distribution of $\theta_{i}$ converges to $\pi$ as $i\rightarrow\infty$. In practice one is typically interested in the computation of expectations of functions, say $f$, with respect to $\pi$ and it is also required that averages $M^{-1}\sum_{n=1}^{M}f(\theta_{n})$ converge to the expectation of interest. The iterative nature of MCMC makes it difficult to develop generic methods to take advantage of parallel computing environments when interested in reducing time to convergence. While numerous approaches have been proposed to reduce the variance of ergodic averages, including averaging over independent realisations of $\{\theta_{n},n\geq1\}$ simulated on several computers, techniques to reduce the "burn-in" of MCMC are scarce. In this paper we explore a simple and generic approach to improve convergence to equilibrium of existing algorithms which rely on the Metropolis-Hastings (MH) update, the main building block of MCMC. The main idea is to use averages of the acceptance ratio w.r.t. multiple realisations of random variables involved, while preserving $\pi$ as invariant distribution. The methodology requires limited change to existing code, is naturally suited to parallel computing and is shown on our examples to provide substantial performance improvements both in terms of convergence to equilibrium and variance of ergodic averages. In some scenarios gains are observed even on a serial machine.
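As a point of reference for the MH update this abstract builds on, the following is a minimal sketch of standard random-walk Metropolis-Hastings, together with the ergodic average $M^{-1}\sum_{n=1}^{M}f(\theta_{n})$ for $f(\theta)=\theta$. It deliberately shows the ordinary single-ratio update, not the averaged acceptance ratio proposed in the paper; the target (a standard normal), the step size and the burn-in length are assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, theta0, n_iter, step, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    theta = theta0
    log_p = log_target(theta)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        log_p_prop = log_target(proposal)
        # Symmetric proposal: the acceptance ratio is pi(proposal) / pi(theta).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_prop - log_p:
            theta, log_p = proposal, log_p_prop
            accepted += 1
        samples.append(theta)
    return samples, accepted / n_iter


def log_target(theta):
    return -0.5 * theta * theta  # log density of N(0, 1), up to a constant


rng = random.Random(0)
samples, rate = metropolis_hastings(log_target, 5.0, 20000, 1.0, rng)
kept = samples[1000:]  # discard an assumed burn-in of 1000 iterations
print(f"acceptance rate = {rate:.2f}, ergodic average = {sum(kept) / len(kept):.3f}")
```

The chain starts at 5.0, far from the mode, so the discarded prefix is a small instance of the "burn-in" cost that the paper aims to reduce.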

preprint · 2020 · arXiv

Negative association, ordering and convergence of resampling methods

We study convergence and convergence rates for resampling schemes. Our first main result is a general consistency theorem based on the notion of negative association, which is applied to establish the almost-sure weak convergence of measures output from Kitagawa's (1996) stratified resampling method. Carpenter et al.'s (1999) systematic resampling method is similar in structure but can fail to converge depending on the order of the input samples. We introduce a new resampling algorithm based on a stochastic rounding technique of Srinivasan (2001), which shares some attractive properties of systematic resampling, but which exhibits negative association and therefore converges irrespective of the order of the input samples. We confirm a conjecture made by Kitagawa (1996) that ordering input samples by their states in $\mathbb{R}$ yields a faster rate of convergence; we establish that when particles are ordered using the Hilbert curve in $\mathbb{R}^d$, the variance of the resampling error is ${\scriptscriptstyle\mathcal{O}}(N^{-(1+1/d)})$ under mild conditions, where $N$ is the number of particles. We use these results to establish asymptotic properties of particle algorithms based on resampling schemes that differ from multinomial resampling.
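The structural similarity between stratified and systematic resampling noted in this abstract is easy to see in code. The sketch below assumes normalized weights and returns ancestor indices; the helper name `_inverse_cdf` is hypothetical, and the new Srinivasan-based algorithm introduced in the paper is not implemented here.

```python
import random


def _inverse_cdf(weights, points):
    """Map increasing points in [0, 1] to ancestor indices via the weight CDF."""
    ancestors = []
    i, cum = 0, weights[0]
    for u in points:
        # The index guard protects against round-off in the cumulative sum.
        while u > cum and i < len(weights) - 1:
            i += 1
            cum += weights[i]
        ancestors.append(i)
    return ancestors


def stratified_resampling(weights, rng):
    """One independent uniform per stratum [k/n, (k+1)/n)."""
    n = len(weights)
    points = [(k + rng.random()) / n for k in range(n)]
    return _inverse_cdf(weights, points)


def systematic_resampling(weights, rng):
    """A single uniform shared by all strata."""
    n = len(weights)
    u = rng.random()
    points = [(k + u) / n for k in range(n)]
    return _inverse_cdf(weights, points)


rng = random.Random(0)
weights = [0.5, 0.3, 0.2]  # hypothetical normalized particle weights
print(stratified_resampling(weights, rng))
print(systematic_resampling(weights, rng))
```

The only difference is whether each stratum gets its own uniform or all strata share one. Because systematic resampling uses a single uniform, its output depends on how the input particles are ordered, which is the sensitivity the abstract describes.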