Source author record

Erwan Le Pennec

Erwan Le Pennec appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.ST, Statistics Theory, Machine Learning, math.OC

Catalog footprint

What is connected

8 works

4 topics

4 close collaborators

preprint, 2022, arXiv

Robust Reinforcement Learning with Distributional Risk-averse formulation

Robust Reinforcement Learning tries to make predictions more robust to changes in the dynamics or rewards of the system. This problem is particularly important when the dynamics and rewards of the environment are estimated from the data. In this paper, we approximate the Robust Reinforcement Learning problem constrained with a $\Phi$-divergence using an approximate Risk-Averse formulation. We show that the classical Reinforcement Learning formulation can be robustified using standard deviation penalization of the objective. Two algorithms based on Distributional Reinforcement Learning, one for discrete and one for continuous action spaces, are proposed and tested in a classical Gym environment to demonstrate the robustness of the algorithms.
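As a rough illustration of standard deviation penalization, the sketch below scores each action by the mean of its sampled returns minus a multiple of their standard deviation, then picks the best score. It is not the paper's algorithm: the function names, the `alpha` weight, and the toy return samples are assumptions made for this example only.

```python
from statistics import mean, pstdev


def risk_averse_value(return_samples, alpha=1.0):
    """Mean return penalized by alpha times the (population) standard deviation."""
    return mean(return_samples) - alpha * pstdev(return_samples)


def select_action(samples_per_action, alpha=1.0):
    """Return the index of the action with the highest penalized value."""
    scores = [risk_averse_value(s, alpha) for s in samples_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical return samples for two actions (illustrative values only).
samples = [
    [0.0, 10.0],  # action 0: higher mean, high spread
    [4.0, 4.0],   # action 1: lower mean, no spread
]

risk_neutral_choice = select_action(samples, alpha=0.0)
risk_averse_choice = select_action(samples, alpha=1.0)
```

With `alpha=0.0` the rule reduces to the usual mean-return criterion and favors the high-variance action; with `alpha=1.0` the penalty makes the low-variance action preferable.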

preprint, 2020, arXiv

Challenging common bolus advisor for self-monitoring type-I diabetes patients using Reinforcement Learning

Patients with diabetes who are self-monitoring have to decide right before each meal how much insulin they should take. A standard bolus advisor exists, but it has never been proven to be optimal in any sense. We challenged this rule by applying Reinforcement Learning techniques to data simulated with T1DM, an FDA-approved simulator developed by Kovatchev et al. that models the gluco-insulin interaction. Results show that the optimal bolus rule is fairly different from the standard bolus advisor and, if followed, can avoid hypoglycemia episodes.

preprint, 2019, arXiv

Learning from both experts and data

In this work we study the problem of inferring a discrete probability distribution using both expert knowledge and empirical data. This is an important issue for many applications where the scarcity of data prevents a purely empirical approach. In this context, it is common to rely first on prior domain knowledge before proceeding to online data acquisition. We are particularly interested in the intermediate regime where we do not have enough data to do without the experts' prior, but enough to correct it if necessary. We present a novel way to tackle this issue with a method that provides an objective way to choose the weight given to experts relative to data. We show, both empirically and theoretically, that our proposed estimator is always more efficient than the best of the two models (expert or data) within a constant.

preprint, 2013, arXiv

Gaussian Mixture Regression model with logistic weights, a penalized maximum likelihood approach

We wish to estimate a conditional density using a Gaussian Mixture Regression model with logistic weights and means depending on the covariate. We aim at selecting the number of components of this model as well as the other parameters by a penalized maximum likelihood approach. We provide a lower bound on the penalty, proportional up to a logarithmic term to the dimension of each model, that ensures an oracle inequality for our estimator. Our theoretical analysis is supported by some numerical experiments.

preprint, 2012, arXiv

Conditional Density Estimation by Penalized Likelihood Model Selection and Applications

In this technical report, we consider conditional density estimation with a maximum likelihood approach. Under weak assumptions, we obtain a theoretical bound for a Kullback-Leibler type loss for a single model maximum likelihood estimate. We use a penalized model selection technique to select a best model within a collection. We give a general condition on the penalty choice that leads to an oracle type inequality for the resulting estimate. This construction is applied to two examples of partition-based conditional density models, models in which the conditional density depends only in a piecewise manner on the covariate. The first example relies on classical piecewise polynomial densities while the second uses Gaussian mixtures with varying mixing proportions but the same mixture components. We show how this last case is related to an unsupervised segmentation application that was the source of our motivation for this study.
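The selection step can be pictured with a generic penalized-likelihood criterion: among the fitted models, keep the one that minimizes the negative log-likelihood plus a penalty growing with the model dimension. This is only a sketch under simplifying assumptions. The linear penalty `kappa * dim`, the function name, and the numbers are hypothetical and are not taken from the report, which states its own condition on the penalty.

```python
def select_model(neg_log_liks, dims, kappa=1.0):
    """Return the model name minimizing neg. log-likelihood + kappa * dimension.

    neg_log_liks: dict mapping model name -> negative log-likelihood at the MLE
    dims:         dict mapping model name -> model dimension (parameter count)
    kappa:        penalty weight (an assumption of this sketch)
    """
    criterion = {
        name: neg_log_liks[name] + kappa * dims[name] for name in neg_log_liks
    }
    return min(criterion, key=criterion.get)


# Hypothetical fitted models (illustrative values only).
neg_log_liks = {"K=1": 120.0, "K=2": 100.0, "K=3": 98.0}
dims = {"K=1": 2, "K=2": 5, "K=3": 8}

chosen = select_model(neg_log_liks, dims, kappa=1.0)
```

A larger `kappa` favors smaller models, while `kappa=0.0` simply returns the model with the best raw likelihood, which is typically the largest one.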

preprint, 2012, arXiv

Radon needlet thresholding

We provide a new algorithm for the treatment of the noisy inversion of the Radon transform using an appropriate thresholding technique adapted to a well-chosen new localized basis. We establish minimax results and prove their optimality. In particular, we prove that the procedures provided here are able to attain minimax bounds for any $\mathbb{L}_p$ loss. It is important to notice that most of the minimax bounds obtained here are new to our knowledge. It is also important to emphasize the adaptation properties of our procedures with respect to the regularity (sparsity) of the object to recover and to inhomogeneous smoothness. We perform a numerical study that is of importance since we especially have to discuss the cubature problems and propose an averaging procedure that is mostly in the spirit of the cycle spinning performed for periodic signals.

preprint, 2009, arXiv

Inversion of noisy Radon transform by SVD based needlet

A linear method for inverting noisy observations of the Radon transform is developed based on decomposition systems (needlets) with rapidly decaying elements induced by the Radon transform SVD basis. Upper bounds of the risk of the estimator are established in $L^p$ ($1\le p\le \infty$) norms for functions with Besov space smoothness. A practical implementation of the method is given and several examples are discussed.

preprint, 2008, arXiv

Thresholding methods to estimate the copula density

This paper deals with the problem of multivariate copula density estimation. Using wavelet methods, we provide two shrinkage procedures based on thresholding rules for which knowledge of the regularity of the copula density to be estimated is not necessary. These methods, said to be adaptive, are proved to perform very well when adopting the minimax and the maxiset approaches. Moreover, we show that these procedures can be discriminated in the maxiset sense. We produce an estimation algorithm whose qualities are evaluated through simulations. Finally, we propose a real-life application to financial data.
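For readers unfamiliar with thresholding rules, the sketch below shows the two textbook variants applied to a list of coefficients: hard thresholding keeps or kills each coefficient, while soft thresholding also shrinks the survivors toward zero. These are generic rules and may not match the two procedures studied in the paper. The function names, the threshold `t`, and the coefficients are assumptions for illustration.

```python
def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the others to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero out small coefficients and shrink the remaining ones by t."""
    shrunk = []
    for c in coeffs:
        if abs(c) > t:
            sign = 1.0 if c > 0 else -1.0
            shrunk.append(sign * (abs(c) - t))
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical wavelet coefficients (illustrative values only).
coeffs = [0.5, -2.0, 1.5]

kept = hard_threshold(coeffs, t=1.0)
shrunk = soft_threshold(coeffs, t=1.0)
```

In an estimation pipeline, the thresholded coefficients would then be fed to the inverse wavelet transform to rebuild the density estimate.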