Source author record

Alice Cleynen

Alice Cleynen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Applications, Computation, math.ST, Methodology, Statistics Theory, math.OC

Catalog footprint

What is connected

7 works

6 topics

4 close collaborators


Preprint, 2024, arXiv

Medical follow-up optimization: A Monte-Carlo planning strategy

Designing patient-specific follow-up strategies is a crucial step towards personalized medicine in cancer. Tools to help doctors decide on treatment allocation together with the next visit date, based on patient preferences and medical observations, would be particularly beneficial. Such tools should be based on realistic models of disease progress under the impact of medical treatments, involve the design of (multi-)objective functions that a treatment strategy should optimize along the patient's medical journey, and include efficient resolution algorithms to optimize personalized follow-up by taking the patient's history and preferences into account. We propose to model cancer evolution with a Piecewise Deterministic Markov Process where patients alternate between remission and relapse phases with disease-specific tumor evolution. This model is controlled via the online optimization of a long-term cost function accounting for treatment side-effects, the burden of hospital visits and the impact of the disease on quality of life. Optimization is based on noisy measurements of blood markers at visit dates. We leverage the Partially-Observed Monte-Carlo Planning algorithm to solve this continuous-time, continuous-state problem, taking advantage of the nearly-deterministic nature of cancer evolution. We show that this approximate solution of the exact model performs better than the exact resolution of the corresponding discrete model, while allowing for more versatility in the cost function model.

Preprint, 2015, arXiv

Model selection for the segmentation of multiparameter exponential family distributions

We consider the segmentation problem of univariate distributions from the exponential family with multiple parameters. In segmentation, the choice of the number of segments remains a difficult issue due to the discrete nature of the change-points. In this general exponential family distribution framework, we propose a penalized log-likelihood estimator where the penalty is inspired by papers of L. Birgé and P. Massart. The resulting estimator is proved to satisfy an oracle inequality. We then further study the particular case of categorical variables by comparing the values of the key constants when derived from the specification of our general approach and when obtained by working directly with the characteristics of this distribution. Finally, a simulation study is conducted to assess the performance of our criterion for the exponential distribution, and an application on real data modelled by the categorical distribution is provided.

Preprint, 2013, arXiv

Comparing change-point locations of independent profiles with application to gene annotation

We are interested in the comparison of transcript boundaries from cells which originated in different environments. The goal is to assess whether this phenomenon, called differential splicing, is used to modify the transcription of the genome in response to stress factors. We address this question by comparing the change-point locations in the individual segmentation of each profile, which correspond to the RNA-Seq data for a gene in one growth condition. This requires the ability to evaluate the uncertainty of the change-point positions, and the work of Rigaill et al. (2011) provides an appropriate framework in such a case. Building on their approach, we propose two methods for the comparison of change-points, and illustrate our results on a dataset from a yeast species. We show that the UTR boundaries are subject to differential splicing, while the intron boundaries are conserved in all profiles. Our approach is implemented in an R package called EBS which is available on CRAN.

Preprint, 2013, arXiv

Fast estimation of the ICL criterion for change-point detection problems with applications to Next-Generation Sequencing data

In this paper, we consider the Integrated Completed Likelihood (ICL) as a useful criterion for estimating the number of changes in the underlying distribution of data in problems where detecting the precise location of these changes is the main goal. The exact computation of the ICL requires O(Kn^2) operations (with K the number of segments and n the number of data-points) which is prohibitive in many practical situations with large sequences of data. We describe a framework to estimate the ICL with O(Kn) complexity. Our approach is general in the sense that it can accommodate any given model distribution. We checked the run-time and validity of our approach on simulated data and demonstrate its good performance when analyzing real Next-Generation Sequencing (NGS) data using a negative binomial model.

Preprint, 2013, arXiv

Finite state space non parametric Hidden Markov Models are in general identifiable

In this paper, we prove that finite state space non parametric hidden Markov models are identifiable as soon as the transition matrix of the latent Markov chain has full rank and the emission probability distributions are linearly independent. We then propose several non parametric likelihood based estimation methods, which we apply to models used in applications. We finally show on examples that the use of non parametric modeling and estimation may improve classification performance.

Preprint, 2013, arXiv

Segmentation of the Poisson and negative binomial rate models: a penalized estimator

We consider the segmentation problem of Poisson and negative binomial (i.e. overdispersed Poisson) rate distributions. In segmentation, an important issue remains the choice of the number of segments. To this end, we propose a penalized log-likelihood estimator where the penalty function is constructed in a non-asymptotic context following the works of L. Birgé and P. Massart. The resulting estimator is proved to satisfy an oracle inequality. The performance of our criterion is assessed using simulated and real datasets in the RNA-seq data analysis context.
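
As a rough illustration of the idea in this abstract, the sketch below segments a count sequence under a Poisson model and then picks the number of segments by minimizing a penalized negative log-likelihood. It is a minimal sketch under stated assumptions, not the authors' method: it uses the classic O(Kn^2) dynamic program rather than a pruned algorithm, and the penalty `beta * k * (1 + log(n / k))` is a placeholder shape chosen for illustration, not the calibrated penalty derived in the paper. All function names and the `beta` parameter are hypothetical.

```python
import math


def poisson_segment_cost(prefix, i, j):
    """Negative Poisson log-likelihood of y[i:j] at its MLE rate (constants dropped)."""
    s = prefix[j] - prefix[i]          # sum of counts in the segment
    n = j - i                          # segment length
    if s == 0:
        return 0.0                     # rate 0, using the convention 0 * log(0) = 0
    return s - s * math.log(s / n)     # n*lam - s*log(lam) with lam = s/n


def segment(y, k_max):
    """Classic dynamic program: best cost of splitting y[:j] into k segments."""
    n = len(y)
    prefix = [0]
    for v in y:
        prefix.append(prefix[-1] + v)
    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k_max + 1)]
    back = [[0] * (n + 1) for _ in range(k_max + 1)]
    cost[0][0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # i is the start of the last segment
                c = cost[k - 1][i] + poisson_segment_cost(prefix, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i
    return cost, back


def change_points(back, k, n):
    """Backtrack the segment starts for the best k-segment solution."""
    starts = []
    j = n
    for kk in range(k, 0, -1):
        j = back[kk][j]
        starts.append(j)
    return starts[::-1][1:]            # drop the leading 0


def select_k(y, k_max, beta=1.0):
    """Choose k by minimizing cost + penalty (placeholder penalty, an assumption)."""
    n = len(y)
    cost, back = segment(y, k_max)

    def pen(k):
        return beta * k * (1.0 + math.log(n / k))

    best_k = min(range(1, k_max + 1), key=lambda k: cost[k][n] + pen(k))
    return best_k, change_points(back, best_k, n)
```

For example, `select_k([1] * 20 + [20] * 20, k_max=4)` can be used to check that a single clear rate change is recovered. In practice the constant `beta` would need calibration, which this sketch does not attempt.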

Preprint, 2013, arXiv

Segmentor3IsBack: an R package for the fast and exact segmentation of Seq-data

Genome annotation is an important issue in biology which has long been addressed with gene prediction methods and manual experiments requiring biological expertise. The expanding Next Generation Sequencing technologies and their enhanced precision allow a new approach to the domain: the segmentation of RNA-Seq data to determine gene boundaries. Because of its almost linear complexity, we propose to use the Pruned Dynamic Programming (PDP) Algorithm, whose performance had been acknowledged for CGH arrays, for Seq-experiment outputs. This requires the adaptation of the algorithm to the negative binomial distribution with which we model the data. We show that if the dispersion in the signal is known, the PDP algorithm can be used and we provide an estimator for this dispersion. We then propose to estimate the number of segments, which can be associated with coding or non-coding regions of the genome, using an oracle penalty. We illustrate the results of our approach on a real data-set and show its good performance. Our algorithm is available as an R package on the CRAN repository.
