Source author record

Sylvain Le Corff

Sylvain Le Corff appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.ST, Statistics Theory, Machine Learning, math.PR, Methodology, Applications, eess.IV, eess.SP

Catalog footprint

What is connected

19 works

8 topics

4 close collaborators


preprint 2023 arXiv

State and parameter learning with PaRIS particle Gibbs

Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother (PaRIS) is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as maximum likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity, limited memory requirements and comes with non-asymptotic bounds, convergence results and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao-Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.
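To make the self-normalised weighting mentioned in the abstract concrete, here is a minimal sketch of a bootstrap particle filter, the basic SMC building block. It is not PaRIS and not the PPG sampler. The linear Gaussian model, the function name `bootstrap_filter`, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random

def bootstrap_filter(observations, n_particles=500, phi=0.9,
                     sigma_x=1.0, sigma_y=1.0, seed=0):
    """Estimate filtering means for the (assumed) model
    X_t = phi * X_{t-1} + sigma_x * V_t,  Y_t = X_t + sigma_y * W_t,
    with V_t, W_t independent standard Gaussian noise."""
    rng = random.Random(seed)
    # Initialise particles from the stationary law of the state chain.
    stationary_sd = sigma_x / math.sqrt(1.0 - phi ** 2)
    particles = [rng.gauss(0.0, stationary_sd) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Propagate each particle through the state dynamics.
        particles = [phi * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weight by the observation likelihood (log scale for stability).
        log_w = [-0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        # Self-normalisation: dividing by the random total is the source
        # of the bias that the abstract refers to.
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling according to the normalised weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means

filtering_means = bootstrap_filter([2.0] * 20)
```

The estimate at each step is a ratio of two random sums, so it is consistent as the number of particles grows but not unbiased for a fixed number of particles.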

preprint 2022 arXiv

Amortized backward variational inference in nonlinear state-space models

We consider the problem of state estimation in general state-space models using variational inference. For a generic variational family defined using the same backward decomposition as the actual joint smoothing distribution, we establish for the first time that, under mixing assumptions, the variational approximation of expectations of additive state functionals induces an error which grows at most linearly in the number of observations. This guarantee is consistent with the known upper bounds for the approximation of smoothing distributions using standard Monte Carlo methods. Moreover, we propose an amortized inference framework where a neural network shared over all time steps outputs the parameters of the variational kernels. We also study empirically parametrizations which allow analytical marginalization of the variational distributions, and therefore lead to efficient smoothing algorithms. Significant improvements are made over state-of-the-art variational solutions, especially when the generative model depends on a strongly nonlinear and noninjective mixing function.

preprint 2022 arXiv

Diffusion bridges vector quantized Variational AutoEncoders

Vector Quantized-Variational AutoEncoders (VQ-VAE) are generative models based on discrete latent representations of the data, where inputs are mapped to a finite set of learned embeddings. To generate new samples, an autoregressive prior distribution over the discrete states must be trained separately. This prior is generally very complex and leads to slow generation. In this work, we propose a new model to train the prior and the encoder/decoder networks simultaneously. We build a diffusion bridge between a continuous coded vector and a non-informative prior distribution. The latent discrete states are then given as random functions of these continuous vectors. We show that our model is competitive with the autoregressive prior on the mini-Imagenet and CIFAR datasets and is efficient in both optimization and sampling. Our framework also extends the standard VQ-VAE and enables end-to-end training.
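The mapping of inputs to "a finite set of learned embeddings" can be sketched as a nearest-neighbour lookup in a codebook, which is the deterministic quantization step of a standard VQ-VAE. This is not the diffusion bridge model of the paper, where the discrete states are random functions of the continuous vectors. The function name `quantize` and the toy codebook are hypothetical.

```python
def quantize(vector, codebook):
    """Return (index, embedding) of the codebook entry closest to `vector`
    in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    index = min(range(len(codebook)),
                key=lambda k: sq_dist(vector, codebook[k]))
    return index, codebook[index]

# Toy codebook with three 2-dimensional embeddings (illustrative values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
index, embedding = quantize([0.9, 0.1], codebook)
```

Here the encoded vector `[0.9, 0.1]` is closest to the second embedding, so the discrete latent state is the index of that entry.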

preprint 2021 arXiv

Deconvolution with unknown noise distribution is possible for multivariate signals

This paper considers the deconvolution problem in the case where the target signal is multidimensional and no information is known about the noise distribution. More precisely, no assumption is made on the noise distribution and no samples are available to estimate it: the deconvolution problem is solved based only on the corrupted signal observations. We establish the identifiability of the model up to translation when the signal has a Laplace transform with an exponential growth smaller than $2$ and when it can be decomposed into two dependent components. Then, we propose an estimator of the probability density function of the signal without any assumption on the noise distribution. As this estimator depends on the lightness of the tail of the signal distribution, which is usually unknown, a model selection procedure is proposed to obtain an estimator that is adaptive in this parameter, with the same rate of convergence as the estimator with a known tail parameter. Finally, we establish a lower bound on the minimax rate of convergence that matches the upper bound.

preprint 2021 arXiv

Joint self-supervised blind denoising and noise estimation

We propose a novel self-supervised image blind denoising approach in which two neural networks jointly predict the clean signal and infer the noise distribution. Assuming that the noisy observations are conditionally independent given the signal, the networks can be jointly trained without clean training data. Therefore, our approach is particularly relevant for biomedical image denoising where the noise is difficult to model precisely and clean training data are usually unavailable. Our method significantly outperforms current state-of-the-art self-supervised blind denoising algorithms on six publicly available biomedical image datasets. We also show empirically with synthetic noisy data that our model captures the noise distribution efficiently. Finally, the described framework is simple, lightweight and computationally efficient, making it useful in practical cases.

preprint 2020 arXiv

End-to-end deep metamodeling to calibrate and optimize energy loads

In this paper, we propose a new end-to-end methodology to optimize the energy performance and the comfort, air quality and hygiene of large buildings. A metamodel based on a Transformer network is introduced and trained using a dataset sampled with a simulation program. Then, a few physical parameters and the building management system settings of this metamodel are calibrated using the CMA-ES optimization algorithm and real data obtained from sensors. Finally, the optimal settings to minimize the energy loads while maintaining a target thermal comfort and air quality are obtained using a multi-objective optimization procedure. The numerical experiments illustrate how this metamodel ensures a significant gain in energy efficiency while being computationally much more appealing than models requiring a huge number of physical parameters to be estimated.

preprint 2020 arXiv

Identifiability and consistent estimation of nonparametric translation hidden Markov models with general state space

This paper considers hidden Markov models where the observations are given as the sum of a latent state which lies in a general state space and some independent noise with unknown distribution. It is shown that these fully nonparametric translation models are identifiable with respect to both the distribution of the latent variables and the distribution of the noise, mainly under a light-tail assumption on the latent variables. Two nonparametric estimation methods are proposed and we prove that the corresponding estimators are consistent for the weak convergence topology. These results are illustrated with numerical experiments.

preprint 2020 arXiv

Learning the distribution of latent variables in paired comparison models with round-robin scheduling

Paired comparison data considered in this paper originate from the comparison of a large number N of individuals in couples. The dataset is a collection of results of contests between two individuals when each of them has faced n opponents, where n is much smaller than N. Individuals are represented by independent and identically distributed random parameters characterizing their abilities. The paper studies the maximum likelihood estimator of the parameters' distribution. The analysis relies on the construction of a graphical model encoding conditional dependencies of the observations which are the outcomes of the first n contests each individual is involved in. This graphical model makes it possible to prove geometric loss of memory properties and to deduce the asymptotic behavior of the likelihood function. This paper focuses on graphical models obtained from round-robin scheduling of these contests. Following a classical construction in learning theory, the asymptotic likelihood is used to measure performance of the maximum likelihood estimator. Risk bounds for this estimator are finally obtained by sub-Gaussian deviation results for Markov chains applied to the graphical model.

preprint 2016 arXiv

On the two-filter approximations of marginal smoothing distributions in general state space models

A prevalent problem in general state space models is the approximation of the smoothing distribution of a state conditional on the observations from the past, the present, and the future. The aim of this paper is to provide a rigorous analysis of such approximations of smoothed distributions provided by the two-filter algorithms. We extend the results available for the approximation of smoothing distributions to these two-filter approaches which combine a forward filter approximating the filtering distributions with a backward information filter approximating a quantity proportional to the posterior distribution of the state given future observations.

preprint 2016 arXiv

Optimal scaling of the Random Walk Metropolis algorithm under Lp mean differentiability

This paper considers the optimal scaling problem for high-dimensional random walk Metropolis algorithms for densities which are differentiable in Lp mean but which may be irregular at some points (like the Laplace density for example) and/or are supported on an interval. Our main result is the weak convergence of the Markov chain (appropriately rescaled in time and space) to a Langevin diffusion process as the dimension d goes to infinity. Because the log-density might be non-differentiable, the limiting diffusion could be singular. The scaling limit is established under assumptions which are much weaker than the ones used in the original derivation of [6]. This result has important practical implications for the use of random walk Metropolis algorithms in Bayesian frameworks based on sparsity inducing priors.
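As an illustration of the setting, the following sketch runs a random walk Metropolis chain on a product of Laplace densities, the irregular example named in the abstract, with proposal standard deviation ell / sqrt(d). The names `rwm` and `log_laplace`, the value `ell=2.0`, the starting point, and the chain length are assumptions made for this example and are not taken from the paper.

```python
import math
import random

def log_laplace(x):
    """Log-density (up to a constant) of a product of standard Laplace laws.
    It is not differentiable wherever a coordinate equals zero."""
    return -sum(abs(xi) for xi in x)

def rwm(log_target, d, n_steps=2000, ell=2.0, seed=0):
    """Random walk Metropolis with Gaussian increments of scale ell/sqrt(d).
    Returns the final state and the empirical acceptance rate."""
    rng = random.Random(seed)
    step = ell / math.sqrt(d)
    x = [1.0] * d            # arbitrary start away from the mode
    log_p = log_target(x)
    accepted = 0
    for _ in range(n_steps):
        proposal = [xi + step * rng.gauss(0.0, 1.0) for xi in x]
        log_p_new = log_target(proposal)
        # Metropolis acceptance probability min(1, pi(proposal) / pi(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            accepted += 1
    return x, accepted / n_steps

state, acceptance_rate = rwm(log_laplace, d=20)
```

Varying `ell` and watching the acceptance rate is the usual practical way to tune such a chain: too small a scale accepts almost everything but moves slowly, too large a scale rejects most proposals.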

preprint 2016 arXiv

Statistical Inference for Oscillation Processes

A new model for time series with a specific oscillation pattern is proposed. The model consists of a hidden phase process controlling the speed of polling and a nonparametric curve characterizing the pattern, leading together to a generalized state space model. Identifiability of the model is proved and a method for statistical inference based on a particle smoother and a nonparametric EM algorithm is developed. In particular, the oscillation pattern and the unobserved phase process are estimated. The proposed algorithms are computationally efficient and their performance is assessed through simulations and an application to human electrocardiogram recordings.

preprint 2015 arXiv

A shrinkage-thresholding Metropolis adjusted Langevin algorithm for Bayesian variable selection

This paper introduces a new Markov Chain Monte Carlo method for Bayesian variable selection in high dimensional settings. The algorithm is a Hastings-Metropolis sampler with a proposal mechanism which combines a Metropolis Adjusted Langevin (MALA) step, which proposes local moves, with a shrinkage-thresholding step, which allows new models to be proposed. The geometric ergodicity of this new trans-dimensional Markov Chain Monte Carlo sampler is established. An extensive numerical experiment, on simulated and real data, is presented to illustrate the performance of the proposed algorithm in comparison with some more classical trans-dimensional algorithms.
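A rough sketch of how a Langevin move and a thresholding step might be chained is given below. Soft-thresholding is used as one standard shrinkage-thresholding operator; whether it matches the operator of the paper is an assumption. The function names, the step size and the threshold are hypothetical, and the Hastings-Metropolis acceptance step is deliberately omitted because the density of a thresholded proposal needs careful treatment.

```python
import math
import random

def soft_threshold(x, gamma):
    """Shrink each coordinate towards zero by gamma and set coordinates
    with magnitude at most gamma exactly to zero."""
    out = []
    for v in x:
        if v > gamma:
            out.append(v - gamma)
        elif v < -gamma:
            out.append(v + gamma)
        else:
            out.append(0.0)
    return out

def thresholded_langevin_proposal(x, grad_log_target, step, gamma, rng):
    """Langevin move (gradient drift plus Gaussian noise) followed by
    soft-thresholding, so the proposal can switch coordinates off."""
    grad = grad_log_target(x)
    moved = [xi + 0.5 * step * gi + math.sqrt(step) * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, grad)]
    return soft_threshold(moved, gamma)

shrunk = soft_threshold([3.0, -0.5, 1.0, -2.0], 1.0)
```

Coordinates set exactly to zero correspond to variables excluded from the model, which is how a thresholding step can propose a change of model rather than only a local move.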

preprint 2015 arXiv

Consistent estimation of the filtering and marginal smoothing distributions in nonparametric hidden Markov models

In this paper, we consider the filtering and smoothing recursions in nonparametric finite state space hidden Markov models (HMMs) when the parameters of the model are unknown and replaced by estimators. We provide an explicit and time uniform control of the filtering and smoothing errors in total variation norm as a function of the parameter estimation errors. We prove that the risk for the filtering and smoothing errors may be uniformly upper bounded by the risk of the estimators. It has been proved very recently that statistical inference for finite state space nonparametric HMMs is possible. We study how the recent spectral methods developed in the parametric setting may be extended to the nonparametric framework and we give explicit upper bounds for the L2-risk of the nonparametric spectral estimators. When the observation space is compact, this provides explicit rates for the filtering and smoothing errors in total variation norm. The performance of the spectral method is assessed with simulated data for both the estimation of the (nonparametric) conditional distribution of the observations and the estimation of the marginal smoothing distributions.
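The filtering recursion for a finite state space HMM, and the total variation error incurred when a parameter is replaced by an estimate, can be sketched as follows. The two-state example, the perturbed transition matrix and the function names are assumptions for illustration. Total variation is computed with the 1/2 convention.

```python
def forward_filter(init, transition, likelihoods):
    """Filtering distributions of a finite state space HMM.
    `likelihoods[t][j]` is the likelihood of observation t in state j."""
    n = len(init)
    filters = []
    predictive = list(init)
    for lik in likelihoods:
        # Correction: reweight the predictive law by the likelihood.
        unnorm = [predictive[j] * lik[j] for j in range(n)]
        total = sum(unnorm)
        filt = [u / total for u in unnorm]
        filters.append(filt)
        # Prediction: push the filter through the transition matrix.
        predictive = [sum(filt[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
    return filters

def total_variation(p, q):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

init = [0.5, 0.5]
true_q = [[0.9, 0.1], [0.1, 0.9]]
estimated_q = [[0.8, 0.2], [0.2, 0.8]]   # stand-in for an estimator
liks = [[0.8, 0.2], [0.8, 0.2], [0.3, 0.7]]

exact = forward_filter(init, true_q, liks)
plug_in = forward_filter(init, estimated_q, liks)
errors = [total_variation(p, q) for p, q in zip(exact, plug_in)]
```

The list `errors` tracks, time step by time step, how far the plug-in filter drifts from the exact one, which is the quantity the abstract controls in terms of the parameter estimation error.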

preprint 2015 arXiv

Nonparametric regression on hidden phi-mixing variables: identifiability and consistency of a pseudo-likelihood based estimation procedure

This paper outlines a new nonparametric estimation procedure for unobserved phi-mixing processes. It is assumed that the only information on the stationary hidden states (Xk) is given by the process (Yk), where Yk is a noisy observation of f(Xk). The paper introduces a maximum pseudo-likelihood procedure to estimate the function f and the distribution of the hidden states using blocks of observations of length b. The identifiability of the model is studied in the particular cases b=1 and b=2. The consistency of the estimators of f and of the distribution of the hidden states as the number of observations grows to infinity is established.

preprint 2012 arXiv

Convergence of a Particle-based Approximation of the Block Online Expectation Maximization Algorithm

Online variants of the Expectation Maximization (EM) algorithm have recently been proposed to perform parameter inference with large data sets or data streams, in independent latent models and in hidden Markov models. Nevertheless, the convergence properties of these algorithms remain an open problem at least in the hidden Markov case. This contribution deals with a new online EM algorithm which updates the parameter at some deterministic times. Some convergence results have been derived even in general latent models such as hidden Markov models. These properties rely on the assumption that some intermediate quantities are available in closed form or can be approximated by Monte Carlo methods when the Monte Carlo error vanishes rapidly enough. In this paper, we propose an algorithm which approximates these quantities using Sequential Monte Carlo methods. The convergence of this algorithm and of an averaged version is established and their performance is illustrated through Monte Carlo experiments.

preprint 2012 arXiv

Non-asymptotic deviation inequalities for smoothed additive functionals in non-linear state-space models

The approximation of fixed-interval smoothing distributions is a key issue in inference for general state-space hidden Markov models (HMM). This contribution establishes non-asymptotic bounds for the Forward Filtering Backward Smoothing (FFBS) and the Forward Filtering Backward Simulation (FFBSi) estimators of fixed-interval smoothing functionals. We show that the rate of convergence of the Lq-mean errors of both methods depends on the number of observations T and the number of particles N only through the ratio T/N for additive functionals. In the case of the FFBS, this improves recent results providing bounds depending on T and the square root of N.

preprint 2012 arXiv

Online Expectation Maximization based algorithms for inference in hidden Markov models

The Expectation Maximization (EM) algorithm is a versatile tool for model parameter estimation in latent data models. When processing large data sets or data streams, however, EM becomes intractable since it requires the whole data set to be available at each iteration of the algorithm. In this contribution, a new generic online EM algorithm for model parameter inference in general hidden Markov models is proposed. This new algorithm updates the parameter estimate after a block of observations is processed (online). The convergence of this new algorithm is established, and the rate of convergence is studied showing the impact of the block size. An averaging procedure is also proposed to improve the rate of convergence. Finally, practical illustrations are presented to highlight the performance of these algorithms in comparison to other online maximum likelihood procedures.

preprint 2012 arXiv

Simultaneous Localization and Mapping Problem in Wireless Sensor Networks

Mobile device localization in wireless sensor networks is a challenging task. It has already been addressed when the WiFi propagation maps of the access points are modeled deterministically. However, this procedure does not take into account the environmental dynamics and also assumes an offline human training calibration. In this paper, the maps are made of an average indoor propagation model combined with a perturbation field which represents the influence of the environment. This perturbation field is equipped with a prior distribution. The device localization is dealt with using Sequential Monte Carlo methods and relies on the estimation of the propagation maps. This inference task is performed online, i.e. using the observations sequentially, with a recently proposed online Expectation Maximization based algorithm. The performance of the algorithm is illustrated through Monte Carlo experiments.

preprint 2012 arXiv

Supplement paper to "Online Expectation Maximization based algorithms for inference in hidden Markov models"

This is a supplementary material to the paper "Online Expectation Maximization based algorithms for inference in hidden Markov models". It contains further technical derivations and additional simulation results.
