Source author record

Denys Pommeret

Denys Pommeret appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Methodology, math.ST, Statistics Theory, Machine Learning

Catalog footprint

What is connected

7 works

4 topics

4 close collaborators

preprint, 2021, arXiv

Mixed data Deep Gaussian Mixture Model: A clustering model for mixed datasets

Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the variables in order to design groups. In this work we introduce a multilayer architecture model-based clustering method called Mixed Deep Gaussian Mixture Model (MDGMM) that can be viewed as an automatic way to merge the clustering performed separately on continuous and non-continuous data. This architecture is flexible and can be adapted to mixed as well as to continuous or non-continuous data. In this sense we generalize Generalized Linear Latent Variable Models and Deep Gaussian Mixture Models. We also design a new initialisation strategy and a data-driven method that selects the best specification of the model and the optimal number of clusters for a given dataset "on the fly". Besides, our model provides continuous low-dimensional representations of the data, which can be a useful tool to visualize mixed datasets. Finally, we validate the performance of our approach by comparing its results with state-of-the-art mixed data clustering models over several commonly used datasets.

preprint, 2012, arXiv

Likelihood-Free Parallel Tempering

Approximate Bayesian Computational (ABC) methods (or likelihood-free methods) have appeared in the past fifteen years as useful methods to perform Bayesian analyses when the likelihood is analytically or computationally intractable. Several ABC methods have been proposed: Markov chain Monte Carlo (MCMC) methods have been developed by Marjoram et al. (2003) and by Bortot et al. (2007) for instance, and sequential methods have been proposed among others by Sisson et al. (2007), Beaumont et al. (2009) and Del Moral et al. (2009). Until now, while ABC-MCMC methods remain the reference, sequential ABC methods have appeared to outperform them (see for example McKinley et al. (2009) or Sisson et al. (2007)). In this paper a new algorithm combining population-based MCMC methods with ABC requirements is proposed, using an analogy with the Parallel Tempering algorithm (Geyer, 1991). Performances are compared with existing ABC algorithms on simulations and on a real example.
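
As a point of reference for the likelihood-free idea described in this abstract, here is a minimal sketch of plain ABC rejection sampling. It is not the algorithm proposed in the paper. The toy model (a normal with unit variance and unknown mean), the flat prior on [-5, 5], the sample mean as summary statistic, and the function name `abc_rejection` are all assumptions made for illustration.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws, tolerance, seed=0):
    """Plain ABC rejection sampling for the mean of a unit-variance normal.

    Illustrative assumptions: flat prior on [-5, 5], sample mean as summary.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        # 1. Draw a candidate parameter from the (assumed) flat prior.
        theta = rng.uniform(-5.0, 5.0)
        # 2. Simulate a dataset from the model; no likelihood is evaluated.
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        summary = sum(simulated) / n_obs
        # 3. Keep the candidate only if the simulated summary is close
        #    to the observed one.
        if abs(summary - observed_mean) <= tolerance:
            accepted.append(theta)
    return accepted


samples = abc_rejection(observed_mean=1.0, n_obs=20, n_draws=5000, tolerance=0.1)
posterior_mean = sum(samples) / len(samples)
```

The accepted values approximate draws from the posterior of the mean given the observed summary; a smaller `tolerance` tightens the approximation at the cost of fewer accepted draws.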

preprint, 2012, arXiv

Parallel Tempering with Equi-Energy Moves

The Equi-Energy Sampler (EES) introduced by Kou et al. [2006] is based on a population of chains which are updated by local moves and global moves, also called equi-energy jumps. The state space is partitioned into energy rings, and the current state of a chain can jump to a past state of an adjacent chain that has an energy level close to its own. This algorithm has been developed to facilitate global moves between different chains, resulting in a good exploration of the state space by the target chain. This method seems to be more efficient than the classical Parallel Tempering (PT) algorithm. However, it is difficult to use in combination with a Gibbs sampler and it necessitates increased storage. In this paper we propose an adaptation of this EES that combines PT with the principle of swapping between chains with the same energy levels. This adaptation, which we shall call Parallel Tempering with Equi-Energy Moves (PTEEM), keeps the original idea of the EES method while ensuring good theoretical properties and practical implementation, even if combined with a Gibbs sampler. Performances of the PTEEM algorithm are compared with those of the EES and of the standard PT algorithms in the context of mixture models, and in a problem of identification of gene regulatory binding motifs.
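
For readers unfamiliar with the classical PT algorithm mentioned in this abstract, the following sketch shows the standard Metropolis acceptance probability for swapping the states of two tempered chains. It is the textbook PT swap rule, not the PTEEM move from the paper. It assumes chain k targets a density proportional to exp(-E(x) / T_k); the function and argument names are hypothetical.

```python
import math


def swap_acceptance(energy_i, energy_j, temp_i, temp_j):
    """Acceptance probability for exchanging the states of chains i and j.

    Assumes chain k targets a density proportional to exp(-E(x) / T_k).
    """
    # Log of the Metropolis ratio for exchanging the two current states.
    log_ratio = (energy_i - energy_j) * (1.0 / temp_i - 1.0 / temp_j)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


# Cold chain (T=1) currently at higher energy than the hot chain (T=2):
# the swap moves the lower-energy state to the cold chain and is always accepted.
p_favourable = swap_acceptance(3.0, 1.0, 1.0, 2.0)

# The reverse situation is accepted only with probability exp(-1).
p_unfavourable = swap_acceptance(1.0, 3.0, 1.0, 2.0)
```

Under this rule the acceptance probability equals 1 whenever the two states have the same energy, whatever the temperatures.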

preprint, 2011, arXiv

A study of variable selection using g-prior distribution with ridge parameter

In the Bayesian stochastic search variable selection framework, a common prior distribution for the regression coefficients is the g-prior of Zellner (1986). However, there are two standard cases in which the associated covariance matrix does not exist, and the conventional prior of Zellner cannot be used: if the number of observations is lower than the number of variables (large p and small n paradigm), or if some variables are linear combinations of others. In such situations a prior distribution derived from the prior of Zellner can be used, by introducing a ridge parameter. This prior, introduced by Gupta and Ibrahim (2007), is a flexible and simple adaptation of the g-prior. In this paper we study the influence of the ridge parameter on the selection of variables. A simple way to choose the associated hyper-parameters is proposed. The method is valid for any generalized linear mixed model and we focus on the case of probit mixed models when some variables are linear combinations of others. The method is applied to both simulated and real datasets obtained from Affymetrix microarray experiments. Results are compared to those obtained with the Bayesian Lasso.
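
The two priors contrasted in this abstract can be written out as follows. This is a sketch in commonly used notation, which is assumed here and may differ from the paper's parameterization: X is the n × p design matrix, σ² the error variance, g the scale of the g-prior and λ > 0 the ridge parameter.

```latex
% Zellner's g-prior: requires X^T X to be invertible
\beta \mid \sigma^2 \sim \mathcal{N}_p\!\left(0,\; g\,\sigma^2\,(X^\top X)^{-1}\right)

% Ridge-adjusted g-prior: X^T X + \lambda I_p is positive definite for any \lambda > 0
\beta \mid \sigma^2 \sim \mathcal{N}_p\!\left(0,\; g\,\sigma^2\,(X^\top X + \lambda I_p)^{-1}\right)
```

When n < p, or when some columns of X are linear combinations of others, XᵀX is singular and the first covariance matrix is undefined. Adding λI_p shifts every eigenvalue of XᵀX up by λ, so the second covariance matrix always exists.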

preprint, 2011, arXiv

Comparing Two Contaminated Samples

We consider the problem of testing whether two samples of contaminated data, possibly paired, are from the same distribution. It is assumed that the contaminations are additive noises with known moments of all orders. The test statistic is based on the polynomial moments of the difference between observations and noises. A data-driven selection is proposed to choose automatically the number of involved polynomials. We present a simulation study in order to investigate the power of the proposed test within discrete and continuous cases. A real-data example is presented to demonstrate the method.

preprint, 2011, arXiv

Nonparametric test for detecting change in distribution with panel data

This paper considers the problem of comparing two processes with panel data. A nonparametric test is proposed for detecting a monotone change in the link between the two process distributions. The test statistic is of CUSUM type, based on the empirical distribution functions. The asymptotic distribution of the proposed statistic is derived and its finite sample property is examined by bootstrap procedures through Monte Carlo simulations.

preprint, 2011, arXiv

Testing for equality between two transformations of random variables

Consider two random variables contaminated by two unknown transformations. The aim of this paper is to test the equality of those transformations. Two cases are distinguished: first, the two random variables have known distributions. Second, they are unknown but observed before contaminations. We propose a nonparametric test statistic based on empirical cumulative distribution functions. Monte Carlo studies are performed to analyze the level and the power of the test. An illustration is presented through a real data set.
