Source author record

Gérard Govaert

Gérard Govaert appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Methodology, math.ST, Statistics Theory, Other Quantitative Biology

Catalog footprint

What is connected

9 works

5 topics

4 close collaborators


preprint, 2014, arXiv

Construction d'une plate-forme intégrée pour la cartographie de l'exposition des populations aux substances chimiques de l'environnement

Analysis of the link between the environment and health has become a major public health concern, as shown by the emergence of two national environmental health plans. To carry out such analysis, policy-makers need tools that identify the geographic areas where potential overexposure to toxic substances is observed. The objective of the SIGFRIED 1 project (geographic information system (GIS), environmental risk factors and cancer deaths) is to build a modeling platform that assesses, through a spatial approach, the exposure of the French population to chemical substances and identifies the determinants of this exposure. Exposure is assessed by probabilistic multimedia modeling. The epistemological problems associated with the absence of data are mitigated by tools that apply spatial analysis techniques. An example is given for the Nord-Pas-de-Calais and Picardie regions, for cadmium, nickel and lead. Exposure is calculated over a duration of 70 years, on the basis of data available around 2004, on a grid of squares 1 km on a side. For Nord-Pas-de-Calais, for example, the indicators define two areas for cadmium and three for lead. These are linked to the region's industrial history: the mining basin, metallurgical activities, and the Lille metropolitan area. The contribution of the various exposure pathways varies substantially from one pollutant to another. The resulting exposure maps identify the geographic areas where environmental field studies should be conducted as a priority. The GIS thus constructed forms the basis of a platform in which source emission data, environmental measurements, exposure data, and then health and socioeconomic data can be combined.

preprint, 2013, arXiv

A hidden process regression model for functional data description. Application to curve discrimination

A new approach for functional data description is proposed in this paper. It consists of a regression model with a discrete hidden logistic process which is adapted for modeling curves with abrupt or smooth regime changes. The model parameters are estimated in a maximum likelihood framework through a dedicated Expectation Maximization (EM) algorithm. From the proposed generative model, a curve discrimination rule is derived using the Maximum A Posteriori rule. The proposed model is evaluated using simulated curves and real world curves acquired during railway switch operations, by performing comparisons with the piecewise regression approach in terms of curve modeling and classification.
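One way to picture how a logistic process can produce either smooth or abrupt regime changes is the sketch below. It is an illustration only, under an assumption the abstract does not spell out: that each regime's activation probability is a softmax of a logit that is linear in time. The function names, the weights and the polynomial coefficients are all hypothetical and are not taken from the paper.

```python
import math

def regime_probabilities(t, weights):
    """Return one activation probability per regime at time t.

    Assumption (not from the abstract): regime k has logit w0 + w1 * t,
    and the probabilities are the softmax of those logits.
    `weights` is a list of (w0, w1) pairs, one per regime.
    """
    logits = [w0 + w1 * t for (w0, w1) in weights]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_curve(t, weights, polys):
    """Mix the polynomial regimes with the probabilities above.

    `polys` holds one coefficient list per regime, lowest degree first.
    """
    probs = regime_probabilities(t, weights)
    values = [sum(c * t ** j for j, c in enumerate(coefs)) for coefs in polys]
    return sum(p * v for p, v in zip(probs, values))

# Hypothetical settings: both switch regimes at t = 0.5, with different slopes.
smooth = [(0.0, 0.0), (-5.0, 10.0)]    # gentle slope: gradual transition
abrupt = [(0.0, 0.0), (-50.0, 100.0)]  # steep slope: near-step transition

if __name__ == "__main__":
    for t in (0.4, 0.5, 0.6):
        print(t, regime_probabilities(t, smooth), regime_probabilities(t, abrupt))
```

In this sketch the slope of the logit controls how sharp the change is: the larger its magnitude, the closer the transition gets to a hard switch between regimes.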

preprint, 2013, arXiv

A regression model with a hidden logistic process for feature extraction from time series

A new approach for feature extraction from time series is proposed in this paper. This approach consists of a specific regression model incorporating a discrete hidden logistic process. The model parameters are estimated by maximum likelihood using a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. A piecewise regression algorithm and its iterative variant are also considered for comparison. An experimental study using simulated and real data shows that the proposed approach performs well.

preprint, 2013, arXiv

A regression model with a hidden logistic process for signal parametrization

A new approach for signal parametrization, which consists of a specific regression model incorporating a discrete hidden logistic process, is proposed. The model parameters are estimated by maximum likelihood using a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. An experimental study using simulated and real data shows that the proposed approach performs well.

preprint, 2013, arXiv

Classification automatique de données temporelles en classes ordonnées

This paper proposes a method for segmenting temporal data into ordered classes. It is based on mixture models and a discrete latent process that activates the classes successively. The classification can be performed either by maximizing the likelihood via the EM algorithm or by simultaneously optimizing the model parameters and the partition via the CEM algorithm. These two algorithms can be seen as alternatives to Fisher's algorithm that reduce its computing time.
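For context on the baseline named in this abstract, the sketch below shows the classical dynamic-programming formulation usually associated with Fisher's algorithm: an exact search for the split of an ordered sequence into k contiguous segments that minimizes the within-segment sum of squares. It is not the paper's EM or CEM method, and the choice of a piecewise-constant least-squares cost, the function name and the example data are assumptions made for illustration.

```python
def fisher_segmentation(x, k):
    """Split the ordered sequence x into k contiguous segments.

    Minimizes the total within-segment sum of squared deviations from
    the segment mean (an assumed cost, chosen for illustration).
    Returns (change_points, total_cost), where change_points are the
    start indices of segments 2..k.
    """
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(x)")

    # Prefix sums give the cost of any segment x[i:j] in constant time.
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting x[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers to recover the segment start indices.
    starts, j = [], n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return starts[1:], best[k][n]

if __name__ == "__main__":
    # Hypothetical data with three flat regimes.
    print(fisher_segmentation([0, 0, 0, 5, 5, 5, 9, 9], 3))
```

The triple loop makes the cost of this exact search grow with k times the square of the sequence length, which gives a sense of why faster alternatives are of interest for long series.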

preprint, 2013, arXiv

Model-based clustering and segmentation of time series with changes in regime

Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementation of the Expectation-Maximization (EM) algorithm. Within the context of a railway application, this paper introduces a novel mixture model for dealing with time series that are subject to changes in regime. The proposed approach consists in modeling each cluster by a regression model in which the polynomial coefficients vary according to a discrete hidden process. In particular, this approach makes use of logistic functions to model the (smooth or abrupt) transitions between regimes. The model parameters are estimated by the maximum likelihood method solved by an Expectation-Maximization algorithm. The proposed approach can also be regarded as a clustering approach which operates by finding groups of time series having common changes in regime. In addition to providing a time series partition, it therefore provides a time series segmentation. The problem of selecting the optimal numbers of clusters and segments is solved by means of the Bayesian Information Criterion (BIC). The proposed approach is shown to be efficient using a variety of simulated time series and real-world time series of electrical power consumption from rail switching operations.
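The BIC-based selection step mentioned above can be sketched as follows. This assumes the common penalized log-likelihood form, BIC = log-likelihood minus half the number of free parameters times the log of the sample size, where larger is better; the abstract does not say which convention the paper uses. The candidate models and their fitted values are made-up placeholders, not results from the paper.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Penalized log-likelihood form of BIC (assumed convention).

    Larger values are better under this sign convention.
    """
    return log_likelihood - 0.5 * n_params * math.log(n_obs)

def select_by_bic(candidates, n_obs):
    """Return the candidate dict with the highest BIC.

    Each candidate is a dict with hypothetical keys:
    'clusters', 'segments', 'log_likelihood', 'n_params'.
    """
    return max(
        candidates,
        key=lambda c: bic(c["log_likelihood"], c["n_params"], n_obs),
    )

if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    fitted = [
        {"clusters": 2, "segments": 2, "log_likelihood": -520.0, "n_params": 8},
        {"clusters": 2, "segments": 3, "log_likelihood": -480.0, "n_params": 14},
        {"clusters": 3, "segments": 3, "log_likelihood": -478.0, "n_params": 20},
    ]
    chosen = select_by_bic(fitted, n_obs=200)
    print(chosen["clusters"], chosen["segments"])
```

With these placeholder values the richest model gains only a little likelihood over the middle one, so the penalty term tips the choice toward the smaller model.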

preprint, 2013, arXiv

Model-based clustering with Hidden Markov Model regression for time series with regime changes

This paper introduces a novel model-based approach for clustering time series that present changes in regime. It consists of a mixture of polynomial regressions governed by hidden Markov chains. The underlying hidden process for each cluster successively activates several polynomial regimes over time. Parameter estimation is performed by maximum likelihood through a dedicated Expectation-Maximization (EM) algorithm. The proposed approach is evaluated using simulated time series and real-world time series from a railway diagnosis application. Comparisons with existing approaches for time series clustering, including the standard EM for Gaussian mixtures, $K$-means clustering, the standard mixture of regression models and the mixture of Hidden Markov Models, demonstrate the effectiveness of the proposed approach.

preprint, 2013, arXiv

Modèle à processus latent et algorithme EM pour la régression non linéaire

This paper introduces a nonlinear regression approach consisting of a specific regression model that incorporates a latent process, allowing various polynomial regression models to be activated preferentially and smoothly. The model parameters are estimated by maximum likelihood via a dedicated expectation-maximization (EM) algorithm. An experimental study using simulated and real data sets shows that the proposed approach performs well.

preprint, 2013, arXiv

Time series modeling by a regression approach based on a latent process

Time series are used in many domains, including finance, engineering, economics and bioinformatics, generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process that smoothly or abruptly activates different polynomial regression models. The model parameters are estimated by maximum likelihood using a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated and real-world data compared it with two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.
