Source author record

Judith Rousseau

Judith Rousseau appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

math.ST, Statistics Theory, Methodology, Computation, Applications, Machine Learning, math.HO, stat.OT

Catalog footprint

What is connected

35 works

8 topics

4 close collaborators

preprint 2023 arXiv

Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement

Bayesian coresets approximate a posterior distribution by building a small weighted subset of the data points. Any inference procedure that is too computationally expensive to be run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited by either a significant run-time or the need for the user to specify a low-cost approximation to the full posterior. We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of data, and then optimizes the weights using a novel quasi-Newton method. Our algorithm is a simple-to-implement, black-box method that does not require the user to specify a low-cost posterior approximation. It is the first to come with a general high-probability bound on the KL divergence of the output coreset posterior. Experiments demonstrate that our method provides significant improvements in coreset quality against alternatives with comparable construction times, with far less storage cost and user input required.
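As a rough illustration of the coreset idea in this abstract, the sketch below draws a uniformly random subset and gives each point the weight N/M, so that the weighted log-likelihood stands in for the full-data log-likelihood. The Gaussian model and the function names are assumptions made for this example, and the quasi-Newton weight refinement described in the abstract is not reproduced here.

```python
import math
import random

def gaussian_loglik(x, theta):
    """Log-density of N(theta, 1) at x (toy model assumed for illustration)."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (x - theta) ** 2

def full_loglik(data, theta):
    """Log-likelihood of the full data set."""
    return sum(gaussian_loglik(x, theta) for x in data)

def uniform_coreset(data, m, rng):
    """Pick m points uniformly without replacement, each with weight N/m.

    This is only the initial subsampling step; a real construction would
    go on to optimise the weights.
    """
    idx = rng.sample(range(len(data)), m)
    weight = len(data) / m
    return [(data[i], weight) for i in idx]

def coreset_loglik(coreset, theta):
    """Weighted log-likelihood evaluated on the coreset only."""
    return sum(w * gaussian_loglik(x, theta) for x, w in coreset)

rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(2000)]
coreset = uniform_coreset(data, 100, rng)
```

With uniform weights the coreset log-likelihood is an unbiased but noisy estimate of the full one; reducing that error for a fixed subset is what a weight-optimisation step is for.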

preprint 2022 arXiv

A flexible, random histogram kernel for discrete-time Hawkes processes

Hawkes processes are self-exciting stochastic processes used to describe phenomena whereby past events increase the probability of the occurrence of future events. This work presents a flexible approach for modelling a variant of these, namely discrete-time Hawkes processes. Most standard models of Hawkes processes rely on a parametric form for the function describing the influence of past events, referred to as the triggering kernel. This is likely to be insufficient to capture the true excitation pattern, particularly for complex data. By utilising trans-dimensional Markov chain Monte Carlo inference techniques, our proposed model for the triggering kernel can take the form of any step function, affording significantly more flexibility than a parametric form. We first demonstrate the utility of the proposed model through a comprehensive simulation study. This includes univariate scenarios, and multivariate scenarios whereby there are multiple interacting Hawkes processes. We then apply the proposed model to several case studies: the interaction between two countries during the early to middle stages of the COVID-19 pandemic, taking Italy and France as an example, and the interaction of terrorist activity between two countries in close spatial proximity, Indonesia and the Philippines, and then within three regions of the Philippines.
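To make the notion of a step-function triggering kernel concrete, here is a minimal sketch of a discrete-time conditional intensity. The additive form (baseline plus kernel-weighted past counts), the parameter names and the numbers are assumptions chosen for illustration and are not taken from the paper; the trans-dimensional MCMC inference is not shown.

```python
from bisect import bisect_left

def step_kernel(lag, edges, heights):
    """Piecewise-constant kernel: heights[i] for lags in (edges[i], edges[i+1]].

    Lags at or below edges[0], or beyond edges[-1], contribute nothing.
    """
    if lag <= edges[0] or lag > edges[-1]:
        return 0.0
    i = bisect_left(edges, lag) - 1  # index of the bin containing the lag
    return heights[i]

def intensity(t, counts, mu, edges, heights):
    """Assumed form: mu + sum over s < t of counts[s] * g(t - s)."""
    excitation = sum(
        counts[s] * step_kernel(t - s, edges, heights) for s in range(t)
    )
    return mu + excitation

# Hypothetical example: lags 1-2 carry weight 0.3, lags 3-5 carry weight 0.1.
edges = [0, 2, 5]
heights = [0.3, 0.1]
counts = [1, 0, 2, 0, 0, 0, 0]
rates = [intensity(t, counts, 0.5, edges, heights) for t in range(len(counts))]
```

Changing the number of entries in `edges` and `heights` changes the dimension of the kernel, which is why inference over such kernels calls for a trans-dimensional sampler.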

preprint 2022 arXiv

Bayesian Nonparametrics for Sparse Dynamic Networks

In this paper we propose a Bayesian nonparametric approach to modelling sparse time-varying networks. A positive parameter is associated with each node of a network, which models the sociability of that node. Sociabilities are assumed to evolve over time, and are modelled via a dynamic point process model. The model is able to capture long term evolution of the sociabilities. Moreover, it yields sparse graphs, where the number of edges grows subquadratically with the number of nodes. The evolution of the sociabilities is described by a tractable time-varying generalised gamma process. We provide some theoretical insights into the model and apply it to three datasets: a simulated network, a network of hyperlinks between communities on Reddit, and a network of co-occurrences of words in Reuters news articles after the September 11th attacks.

preprint 2022 arXiv

Evidence estimation in finite and infinite mixture models and applications

Estimating the model evidence - or marginal likelihood of the data - is a notoriously difficult task for finite and infinite mixture models and we reexamine here different Monte Carlo techniques advocated in the recent literature, as well as novel approaches based on Geyer's (1994) reverse logistic regression technique, Chib's (1995) algorithm, and Sequential Monte Carlo (SMC). Applications are numerous. In particular, testing for the number of components in a finite mixture model or against the fit of a finite mixture model for a given dataset has long been and still is an issue of much interest, albeit yet missing a fully satisfactory resolution. Using a Bayes factor to find the right number of components K in a finite mixture model is known to provide a consistent procedure. We furthermore establish the consistency of the Bayes factor when comparing a parametric family of finite mixtures against the nonparametric 'strongly identifiable' Dirichlet Process Mixture (DPM) model.

preprint 2022 arXiv

Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit

Recent work by Jacot et al. (2018) has shown that training a neural network using gradient descent in parameter space is related to kernel gradient descent in function space with respect to the Neural Tangent Kernel (NTK). Lee et al. (2019) built on this result by establishing that the output of a neural network trained using gradient descent can be approximated by a linear model when the network width is large. Indeed, under regularity conditions, the NTK converges to a time-independent kernel in the infinite-width limit. This regime is often called the NTK regime. In parallel, recent works on signal propagation (Poole et al., 2016; Schoenholz et al., 2017; Hayou et al., 2019a) studied the impact of the initialization and the activation function on signal propagation in deep neural networks. In this paper, we connect these two theories by quantifying the impact of the initialization and the activation function on the NTK when the network depth becomes large. In particular, we provide a comprehensive analysis of the convergence rates of the NTK regime to the infinite depth regime.

preprint 2017 arXiv

Adaptive density estimation based on a mixture of Gammas

We consider the problem of Bayesian density estimation on the positive semiline for possibly unbounded densities. We propose a hierarchical Bayesian estimator based on the gamma mixture prior which can be viewed as a location mixture. We study convergence rates of Bayesian density estimators based on such mixtures. We construct approximations of the local Hölder densities, and of their extension to unbounded densities, to be continuous mixtures of gamma distributions, leading to approximations of such densities by finite mixtures. These results are then used to derive posterior concentration rates, with priors based on these mixture models. The rates are minimax (up to a log n term) and since the priors are independent of the smoothness the rates are adaptive to the smoothness.

preprint 2016 arXiv

Asymptotic behaviour of the empirical Bayes posteriors associated to maximum marginal likelihood estimator

We consider the asymptotic behaviour of the marginal maximum likelihood empirical Bayes posterior distribution in a general setting. First we characterize the set where the maximum marginal likelihood estimator is located with high probability. Then we provide oracle-type upper and lower bounds for the contraction rates of the empirical Bayes posterior. We also show that the hierarchical Bayes posterior achieves the same contraction rate as the maximum marginal likelihood empirical Bayes posterior. We demonstrate the applicability of our general results for various models and prior distributions by deriving upper and lower bounds for the contraction rates of the corresponding empirical and hierarchical Bayes posterior distributions.

preprint 2016 arXiv

Clustering action potential spikes: Insights on the use of overfitted finite mixture models and Dirichlet process mixture models

The modelling of action potentials from extracellular recordings, or spike sorting, is a rich area of neuroscience research in which latent variable models are often used. Two such models, Overfitted Finite Mixture models (OFMs) and Dirichlet Process Mixture models (DPMs) are considered to provide insights for unsupervised clustering of complex, multivariate medical data when the number of clusters is unknown. OFM and DPM are structured in a similar hierarchical fashion but they are based on different philosophies with different underlying assumptions. This study investigates how these differences impact on a real study of spike sorting, for the estimation of multivariate Gaussian location-scale mixture models in the presence of common difficulties arising from complex medical data. The results provide insights allowing the future analyst to choose an approach suited to the situation and goal of the research problem at hand.

preprint 2016 arXiv

Overfitting hidden Markov models with an unknown number of states

This paper presents new theory and methodology for the Bayesian estimation of overfitted hidden Markov models, with finite state space. The goal is then to achieve posterior emptying of extra states. A prior configuration is constructed which favours configurations where the hidden Markov chain remains ergodic although it empties out some of the states. Asymptotic posterior convergence rates are proven theoretically, and demonstrated with a large sample simulation. The problem of overfitted HMMs is then considered in the context of smaller sample sizes, and due to computational and mixing issues two alternative prior structures are studied, one commonly used in practice, and a mixture of the two priors. The Prior Parallel Tempering approach of van Havre (2015) is also extended to HMMs to allow MCMC estimation of the complex posterior space. A replicate simulation study and an in-depth exploration are performed to compare the three priors with hyperparameters chosen according to the asymptotic constraints alongside less informative alternatives.

preprint 2016 arXiv

Some comments about "Penalising model component complexity" by Simpson et al. (2017)

This note discusses the paper "Penalising model component complexity" by Simpson et al. (2017). While we acknowledge the highly novel approach to prior construction and commend the authors for setting new-encompassing principles that will Bayesian modelling, and while we perceive the potential connection with other branches of the literature, we remain uncertain as to what extent the principles exposed in the paper can be developed outside specific models, given their lack of precision. The very notions of model component, base model, overfitting prior are for instance conceptual rather than mathematical and we thus fear the concept of penalised complexity may not go further than extending first-guess priors into larger families, thus failing to establish reference priors on a novel sound ground.

preprint 2016 arXiv

Some comments about A Bayesian criterion for singular models by M. Drton and M. Plummer

These are written comments about the Read Paper A Bayesian criterion for singular models by M. Drton and M. Plummer, read to the Royal Statistical Society on October 5, 2016. The discussion was delivered by Judith Rousseau.

preprint 2016 arXiv

Some comments about James Watson's and Chris Holmes' "Approximate Models and Robust Decisions": Nonparametric Bayesian clay for robust decision bricks

This note discusses Watson and Holmes (2016) and their proposals towards more robust Bayesian decisions. While we acknowledge and commend the authors for setting new and all-encompassing principles of Bayesian robustness, and we appreciate the strong anchoring of those within a decision-theoretic referential, we remain uncertain as to which extent such principles can be applied outside binary decisions. We also wonder at the ultimate relevance of Kullback-Leibler neighbourhoods to characterise robustness and favour extensions along non-parametric axes.

preprint 2016 arXiv

Tails assumptions and posterior concentration rates for mixtures of Gaussians

Nowadays in density estimation, posterior rates of convergence for location and location-scale mixtures of Gaussians are only known under light-tail assumptions, with better rates achieved by location mixtures. It is conjectured, but not proved, that the situation should be reversed under heavy-tail assumptions. The conjecture is based on the feeling that there is no need to achieve a good order of approximation in regions with few data (say, in the tails), favoring location-scale mixtures which allow for spatially varying order of approximation. Here we test the previous argument on the Gaussian errors mean regression model with random design, for which the light-tail assumption is not required for proofs. Although we cannot invalidate the conjecture due to the lack of a lower bound, we find that even with heavy-tail assumptions, location-scale mixtures apparently always perform worse than location mixtures. However, the proofs suggest introducing hybrid location-scale mixtures that are found to outperform both location and location-scale mixtures, whatever the nature of the tails. Finally, we show that all tail assumptions can be released at the price of making the prior distribution covariate dependent.

preprint 2015 arXiv

A Bernstein-von Mises theorem for smooth functionals in semiparametric models

A Bernstein-von Mises theorem is derived for general semiparametric functionals. The result is applied to a variety of semiparametric problems in i.i.d. and non-i.i.d. situations. In particular, new tools are developed to handle semiparametric bias, in particular for nonlinear functionals and in cases where regularity is possibly low. Examples include the squared $L^2$-norm in Gaussian white noise, nonlinear functionals in density estimation, as well as functionals in autoregressive models. For density estimation, a systematic study of BvM results for two important classes of priors is provided, namely random histograms and Gaussian process priors.

preprint 2015 arXiv

Comment on Article by Berger, Bernardo, and Sun

Discussion of Overall Objective Priors by James O. Berger, Jose M. Bernardo, Dongchu Sun [arXiv:1504.02689].

preprint 2015 arXiv

Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets"

Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets" by Szabó, van der Vaart and van Zanten [arXiv:1310.4489v5].

preprint 2015 arXiv

On adaptive posterior concentration rates

We investigate the problem of deriving posterior concentration rates under different loss functions in nonparametric Bayes. We first provide a lower bound on posterior coverages of shrinking neighbourhoods that relates the metric or loss under which the shrinking neighbourhood is considered, and an intrinsic pre-metric linked to frequentist separation rates. In the Gaussian white noise model, we construct feasible priors based on a spike and slab procedure reminiscent of wavelet thresholding that achieve adaptive rates of contraction under $L^2$ or $L^{\infty}$ metrics when the underlying parameter belongs to a collection of Hölder balls and that moreover achieve our lower bound. We analyse the consequences in terms of asymptotic behaviour of posterior credible balls as well as frequentist minimax adaptive estimation. Our results are appended with an upper bound for the contraction rate under an arbitrary loss in a generic regular experiment. The upper bound is attained for certain sieve priors and enables us to extend our results to density estimation.

preprint 2015 arXiv

Overfitting Bayesian Mixture Models with an Unknown Number of Components

This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov Chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via a Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results will reflect uncertainty in the final model and will report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally light-weight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.

preprint 2014 arXiv

About the posterior distribution in hidden Markov models with unknown number of states

We consider finite state space stationary hidden Markov models (HMMs) in the situation where the number of hidden states is unknown. We provide a frequentist asymptotic evaluation of Bayesian analysis methods. Our main result gives posterior concentration rates for the marginal densities, that is for the density of a fixed number of consecutive observations. Using conditions on the prior, we are then able to define a consistent Bayesian estimator of the number of hidden states. It is known that the likelihood ratio test statistic for overfitted HMMs has a nonstandard behaviour and is unbounded. Our conditions on the prior may be seen as a way to penalize parameters to avoid this phenomenon. Although inference of parameters is a much more difficult task than inference of marginal densities, we still provide a precise description of the situation when the observations are i.i.d. and we allow for $2$ possible hidden states.

preprint 2014 arXiv

Bayesian matrix completion: prior specification

Low-rank matrix estimation from incomplete measurements recently received increased attention due to the emergence of several challenging applications, such as recommender systems; see in particular the famous Netflix challenge. While the behaviour of algorithms based on nuclear norm minimization is now well understood, an as yet unexplored avenue of research is the behaviour of Bayesian algorithms in this context. In this paper, we briefly review the priors used in the Bayesian literature for matrix completion. A standard approach is to assign an inverse gamma prior to the singular values of a certain singular value decomposition of the matrix of interest; this prior is conjugate. However, we show that two other types of priors (again for the singular values) may be conjugate for this model: a gamma prior, and a discrete prior. Conjugacy is very convenient, as it makes it possible to implement either Gibbs sampling or Variational Bayes. Interestingly enough, the maximum a posteriori for these different priors is related to the nuclear norm minimization problems. We also compare all these priors on simulated datasets, and on the classical MovieLens and Netflix datasets.

preprint 2014 arXiv

Posterior concentration rates for counting processes with Aalen multiplicative intensities

We provide general conditions to derive posterior concentration rates for Aalen counting processes. The conditions are designed to resemble those proposed in the literature for the problem of density estimation, for instance in Ghosal et al. (2000), so that existing results on density estimation can be adapted to the present setting. We apply the general theorem to some prior models including Dirichlet process mixtures of uniform densities to estimate monotone non-increasing intensities and log-splines.

preprint 2014 arXiv

Posterior concentration rates for empirical Bayes procedures, with applications to Dirichlet Process mixtures

In this paper we provide general conditions to check on the model and the prior to derive posterior concentration rates for data-dependent priors (or empirical Bayes approaches). We aim at providing conditions that are close to the conditions provided in the seminal paper by Ghosal and van der Vaart (2007a). We then apply the general theorem to two different settings: the estimation of a density using Dirichlet process mixtures of Gaussian random variables with base measure depending on some empirical quantities and the estimation of the intensity of a counting process under the Aalen model. A simulation study for inhomogeneous Poisson processes also illustrates our results. In the former case we also derive some results on the estimation of the mixing density and on the deconvolution problem. In the latter, we provide a general theorem on posterior concentration rates for counting processes with Aalen multiplicative intensity with priors not depending on the data.

preprint 2014 arXiv

Using informative priors in the estimation of mixtures over time with application to aerosol particle size distributions

The issue of using informative priors for estimation of mixtures at multiple time points is examined. Several different informative priors and an independent prior are compared using samples of actual and simulated aerosol particle size distribution (PSD) data. Measurements of aerosol PSDs refer to the concentration of aerosol particles in terms of their size, which is typically multimodal in nature and collected at frequent time intervals. The use of informative priors is found to better identify component parameters at each time point and more clearly establish patterns in the parameters over time. Some caveats to this finding are discussed.

preprint 2013 arXiv

Bayesian optimal adaptive estimation using a sieve prior

We derive rates of contraction of posterior distributions on nonparametric models resulting from sieve priors. The aim of the paper is to provide general conditions to get posterior rates when the parameter space has a general structure, and rate adaptation when the parameter space is, e.g., a Sobolev class. The conditions employed, although standard in the literature, are combined in a different way. The results are applied to density, regression, nonlinear autoregression and Gaussian white noise models. In the latter we have also considered a loss function which is different from the usual l2 norm, namely the pointwise loss. In this case it is possible to prove that the adaptive Bayesian approach for the l2 loss is strongly suboptimal and we provide a lower bound on the rate.

preprint 2013 arXiv

Non parametric finite translation mixtures with dependent regime

In this paper we consider non parametric finite translation mixtures. We prove that all the parameters of the model are identifiable as soon as the matrix that defines the joint distribution of two consecutive latent variables is non singular and the translation parameters are distinct. Under this assumption, we provide a consistent estimator of the number of populations, of the translation parameters and of the distribution of two consecutive latent variables, which we prove to be asymptotically normally distributed under mild dependency assumptions. We propose a non parametric estimator of the unknown translated density. In case the latent variables form a Markov chain (Hidden Markov models), we prove an oracle inequality leading to the fact that this estimator is minimax adaptive over regularity classes of densities.

preprint 2012 arXiv

Bayes and empirical Bayes: do they merge?

Bayesian inference is attractive for its coherence and good frequentist properties. However, it is a common experience that eliciting an honest prior may be difficult and, in practice, people often take an empirical Bayes approach, plugging empirical estimates of the prior hyperparameters into the posterior distribution. Even if not rigorously justified, the underlying idea is that, when the sample size is large, empirical Bayes leads to "similar" inferential answers. Yet, precise mathematical results seem to be missing. In this work, we give a more rigorous justification in terms of merging of Bayes and empirical Bayes posterior distributions. We consider two notions of merging: Bayesian weak merging and frequentist merging in total variation. Since weak merging is related to consistency, we provide sufficient conditions for consistency of empirical Bayes posteriors. Also, we show that, under regularity conditions, the empirical Bayes procedure asymptotically selects the value of the hyperparameter for which the prior mostly favors the "truth". Examples include empirical Bayes density estimation with Dirichlet process mixtures.

preprint 2012 arXiv

Bayesian nonparametric estimation of the spectral density of a long or intermediate memory Gaussian process

A stationary Gaussian process is said to be long-range dependent (resp., anti-persistent) if its spectral density $f(λ)$ can be written as $f(λ)=|λ|^{-2d}g(|λ|)$, where $0<d<1/2$ (resp., $-1/2<d<0$), and $g$ is continuous and positive. We propose a novel Bayesian nonparametric approach for the estimation of the spectral density of such processes. We prove posterior consistency for both $d$ and $g$, under appropriate conditions on the prior distribution. We establish the rate of convergence for a general class of priors and apply our results to the family of fractionally exponential priors. Our approach is based on the true likelihood and does not resort to Whittle's approximation.

preprint 2012 arXiv

Bayesian semi-parametric estimation of the long-memory parameter under FEXP-priors

For a Gaussian time series with long-memory behavior, we use the FEXP-model for semi-parametric estimation of the long-memory parameter $d$. The true spectral density $f_o$ is assumed to have long-memory parameter $d_o$ and a FEXP-expansion of Sobolev-regularity $\beta > 1$. We prove that when $k$ follows a Poisson or geometric prior, or a sieve prior increasing at rate $n^{\frac{1}{1+2\beta}}$, $d$ converges to $d_o$ at a suboptimal rate. When the sieve prior increases at rate $n^{\frac{1}{2\beta}}$ however, the minimax rate is almost obtained. Our results can be seen as a Bayesian equivalent of the result which Moulines and Soulier obtained for some frequentist estimators.

preprint 2012 arXiv

Computational aspects of Bayesian spectral density estimation

Gaussian time-series models are often specified through their spectral density. Such models present several computational challenges, in particular because of the non-sparse nature of the covariance matrix. We derive a fast approximation of the likelihood for such models. We propose to sample from the approximate posterior (that is, the prior times the approximate likelihood), and then to recover the exact posterior through importance sampling. We show that the variance of the importance sampling weights vanishes as the sample size goes to infinity. We explain why the approximate posterior may typically be multi-modal, and we derive a Sequential Monte Carlo sampler based on an annealing sequence in order to sample from that target distribution. Performance of the overall approach is evaluated on simulated and real datasets. In addition, for one real world dataset, we provide some numerical evidence that a Bayesian approach to semi-parametric estimation of spectral density may provide more reasonable results than its frequentist counterparts.
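The correction step mentioned in this abstract (draw from an approximate posterior, then reweight towards the exact one) can be sketched with self-normalised importance sampling. The two one-dimensional Gaussian log-densities below are stand-ins assumed for illustration; the paper's likelihood approximation and its Sequential Monte Carlo sampler are not reproduced.

```python
import math
import random

def normalized_weights(samples, log_exact, log_approx):
    """Self-normalised importance weights, computed stably in log space."""
    log_w = [log_exact(x) - log_approx(x) for x in samples]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

def effective_sample_size(weights):
    """Equals n for uniform weights and shrinks as the weights degenerate."""
    return 1.0 / sum(w * w for w in weights)

# Toy targets (assumptions): exact posterior N(0, 1), approximation N(0, 1.5^2).
def log_exact(x):
    return -0.5 * x * x

def log_approx(x):
    return -0.5 * (x / 1.5) ** 2

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.5) for _ in range(5000)]
weights = normalized_weights(samples, log_exact, log_approx)

# Reweighted estimate of E[x^2] under the exact target (true value is 1).
second_moment = sum(w * x * x for w, x in zip(weights, samples))
```

The effective sample size gives a quick check on weight variability: the closer the approximation is to the exact posterior, the closer it stays to the number of draws.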

preprint 2011 arXiv

Inherent Difficulties of Non-Bayesian Likelihood-based Inference, as Revealed by an Examination of a Recent Book by Aitkin

For many decades, statisticians have made attempts to prepare the Bayesian omelette without breaking the Bayesian eggs; that is, to obtain probabilistic likelihood-based inferences without relying on informative prior distributions. A recent example is Murray Aitkin's recent book, Statistical Inference, which presents an approach to statistical hypothesis testing based on comparisons of posterior distributions of likelihoods under competing models. Aitkin develops and illustrates his method using some simple examples of inference from iid data and two-way tests of independence. We analyze in this note some consequences of the inferential paradigm adopted therein, discussing why the approach is incompatible with a Bayesian perspective and why we do not find it relevant for applied work.

preprint 2010 arXiv

Bayesian Inference

This chapter provides an overview of Bayesian inference, mostly emphasising that it is a universal method for summarising uncertainty and making estimates and predictions using probability statements conditional on observed data and an assumed model (Gelman 2008). The Bayesian perspective is thus applicable to all aspects of statistical inference, while being open to the incorporation of information items resulting from earlier experiments and from expert opinions. We provide here the basic elements of Bayesian analysis when considered for standard models, referring to Marin and Robert (2007) and to Robert (2007) for book-length entries. In the following, we refrain from embarking upon philosophical discussions about the nature of knowledge (see, e.g., Robert 2007, Chapter 10), opting instead for a mathematically sound presentation of an eminently practical statistical methodology. We indeed believe that the most convincing arguments for adopting a Bayesian version of data analyses are in the versatility of this tool and in the large range of existing applications, rather than in those polemical arguments.

preprint 2010 arXiv

Harold Jeffreys's Theory of Probability Revisited

Published exactly seventy years ago, Jeffreys's Theory of Probability (1939) has had a unique impact on the Bayesian community and is now considered to be one of the main classics in Bayesian Statistics as well as the initiator of the objective Bayes school. In particular, its advances on the derivation of noninformative priors as well as on the scaling of Bayes factors have had a lasting impact on the field. However, the book reflects the characteristics of the time, especially in terms of mathematical rigor. In this paper we point out the fundamental aspects of this reference work, especially the thorough coverage of testing problems and the construction of both estimation and testing noninformative priors based on functional divergences. Our major aim here is to help modern readers in navigating in this difficult text and in concentrating on passages that are still relevant today.

preprint 2010 arXiv

On Bayesian Data Analysis

This introduction to Bayesian statistics presents the main concepts as well as the principal reasons advocated in favour of Bayesian modelling. We cover the various approaches to prior determination as well as the basic asymptotic arguments in favour of using Bayes estimators. The testing aspects of Bayesian inference are also examined in detail.

preprint 2010 arXiv

Rates of convergence for the posterior distributions of mixtures of Betas and adaptive nonparametric estimation of the density

In this paper, we investigate the asymptotic properties of nonparametric Bayesian mixtures of Betas for estimating a smooth density on $[0,1]$. We consider a parametrization of Beta distributions in terms of mean and scale parameters and construct a mixture of these Betas in the mean parameter, while putting a prior on this scaling parameter. We prove that such Bayesian nonparametric models have good frequentist asymptotic properties. We determine the posterior rate of concentration around the true density and prove that it is the minimax rate of concentration when the true density belongs to a Hölder class with regularity $β$, for all positive $β$, leading to a minimax adaptive estimating procedure of the density. We also believe that the approximating results obtained on these mixtures of Beta densities can be of interest in a frequentist framework.

preprint2010arXiv

Rejoinder: Harold Jeffreys's Theory of Probability Revisited

We are grateful to all discussants of our re-visitation for their strong support in our enterprise and for their overall agreement with our perspective. Further discussions with them and other leading statisticians showed that the legacy of Theory of Probability is alive and lasting. [arXiv:0804.3173]

35 published item(s)

Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement

A flexible, random histogram kernel for discrete-time Hawkes processes

Bayesian Nonparametrics for Sparse Dynamic Networks

Evidence estimation in finite and infinite mixture models and applications

Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit

Adaptive density estimation based on a mixture of Gammas

Asymptotic behaviour of the empirical Bayes posteriors associated to maximum marginal likelihood estimator

Clustering action potential spikes: Insights on the use of overfitted finite mixture models and Dirichlet process mixture models

Overfitting hidden Markov models with an unknown number of states

Some comments about "Penalising model component complexity" by Simpson et al. (2017)

Some comments about A Bayesian criterion for singular models by M. Drton and M. Plummer

Some comments about James Watson's and Chris Holmes' "Approximate Models and Robust Decisions": Nonparametric Bayesian clay for robust decision bricks

Tails assumptions and posterior concentration rates for mixtures of Gaussians

A Bernstein-von Mises theorem for smooth functionals in semiparametric models

Comment on Article by Berger, Bernardo, and Sun

Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets"

On adaptive posterior concentration rates

Overfitting Bayesian Mixture Models with an Unknown Number of Components

About the posterior distribution in hidden Markov models with unknown number of states

Bayesian matrix completion: prior specification

Posterior concentration rates for counting processes with Aalen multiplicative intensities

Posterior concentration rates for empirical Bayes procedures, with applications to Dirichlet Process mixtures

Using informative priors in the estimation of mixtures over time with application to aerosol particle size distributions

Bayesian optimal adaptive estimation using a sieve prior

Non parametric finite translation mixtures with dependent regime

Bayes and empirical Bayes: do they merge?

Bayesian nonparametric estimation of the spectral density of a long or intermediate memory Gaussian process

Bayesian semi-parametric estimation of the long-memory parameter under FEXP-priors

Computational aspects of Bayesian spectral density estimation

Inherent Difficulties of Non-Bayesian Likelihood-based Inference, as Revealed by an Examination of a Recent Book by Aitkin

Bayesian Inference

Harold Jeffreys's Theory of Probability Revisited

On Bayesian Data Analysis

Rates of convergence for the posterior distributions of mixtures of Betas and adaptive nonparametric estimation of the density

Rejoinder: Harold Jeffreys's Theory of Probability Revisited