Source author record

Jean-Michel Loubes

Jean-Michel Loubes appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.ST, Statistics Theory, Machine Learning, Methodology, Applications, math.PR, Artificial Intelligence, cs.CY, Databases, econ.EM, math.OC, q-fin.RM

Catalog footprint

What is connected

41 works

12 topics

4 close collaborators


preprint · 2026 · arXiv

Generalized Functional ANOVA in Closed-Form: A Unified View of Additive Explanations

The functional ANOVA, or Hoeffding decomposition, provides a principled framework for interpretability by decomposing a model prediction into main effects and higher-order interactions. For independent inputs, this classical decomposition is explicit. It is closely connected to SHAP values, generalized additive models, and orthogonal polynomial expansions, and therefore constitutes a fundamental tool for additive explainability. In the more general and realistic dependent setting, however, obtaining a tractable representation and estimating the decomposition from data remain challenging. In this work, we address this problem for continuous inputs. By combining Hilbert space methods with the generalized functional ANOVA, we build an explicit Riesz basis for the decomposition, which makes it easy to compute. Our formulation recovers the classical independent case and its associated orthogonal decomposition. Building on this representation, we propose a simple yet effective algorithm to estimate the decomposition from a data sample in a model-agnostic setting, and we compare it empirically with several state-of-the-art explanation methods, demonstrating the power of the approach.
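For orientation, the classical independent-input case mentioned in this abstract can be written out as follows. This is the standard textbook form of the Hoeffding decomposition, not the paper's generalized construction; the symbols $f$, $X_A$ and $f_A$ are notation assumed here for a square-integrable model $f$ with independent inputs $X_1,\dots,X_d$.

```latex
f(X) = \sum_{A \subseteq \{1,\dots,d\}} f_A(X_A),
\qquad
f_A(x_A) = \sum_{B \subseteq A} (-1)^{|A|-|B|}\,
           \mathbb{E}\bigl[f(X) \mid X_B = x_B\bigr],
% lowest-order terms:
\qquad
f_\emptyset = \mathbb{E}[f(X)],
\quad
f_{\{i\}}(x_i) = \mathbb{E}[f(X) \mid X_i = x_i] - f_\emptyset,
% orthogonality of the components under independence:
\qquad
\mathbb{E}\bigl[f_A(X_A)\, f_B(X_B)\bigr] = 0 \quad \text{for } A \neq B .
```

The orthogonality in the last line is what is lost when inputs are dependent, which is the setting the abstract addresses.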

preprint · 2023 · arXiv

On the coalitional decomposition of parameters of interest

Understanding the behavior of a black-box model with probabilistic inputs can be based on the decomposition of a parameter of interest (e.g., its variance) into contributions attributed to each coalition of inputs (i.e., subsets of inputs). In this paper, we produce conditions for obtaining unambiguous and interpretable decompositions of very general parameters of interest. This allows us to recover known decompositions, which hold under weaker assumptions than stated in the literature.

preprint · 2023 · arXiv

Transport-based Counterfactual Models

Counterfactual frameworks have grown popular in machine learning both for explaining algorithmic decisions and for defining individual notions of fairness that are more intuitive than typical group fairness conditions. However, state-of-the-art models to compute counterfactuals are either unrealistic or infeasible. In particular, while Pearl's causal inference provides appealing rules to calculate counterfactuals, it relies on a model that is unknown and hard to discover in practice. We address the problem of designing realistic and feasible counterfactuals in the absence of a causal model. We define transport-based counterfactual models as collections of joint probability distributions between observable distributions, and show their connection to causal counterfactuals. More specifically, we argue that optimal-transport theory defines relevant transport-based counterfactual models, as they are numerically feasible, statistically faithful, and can coincide under some assumptions with causal counterfactual models. Finally, these models make counterfactual approaches to fairness feasible, and we illustrate their practicality and efficiency on fair learning. With this paper, we aim at laying out the theoretical foundations for a new, implementable approach to counterfactual thinking.

preprint · 2022 · arXiv

An improved central limit theorem and fast convergence rates for entropic transportation costs

We prove a central limit theorem for the entropic transportation cost between subgaussian probability measures, centered at the population cost. This is the first result which allows for asymptotically valid inference for entropic optimal transport between measures which are not necessarily discrete. In the compactly supported case, we complement these results with new, faster, convergence rates for the expected entropic transportation cost between empirical measures. Our proof is based on strengthening convergence results for dual solutions to the entropic optimal transport problem.

preprint · 2022 · arXiv

Central Limit Theorems for Semidiscrete Wasserstein Distances

We prove a Central Limit Theorem for the empirical optimal transport cost, $\sqrt{\frac{nm}{n+m}}\{\mathcal{T}_c(P_n,Q_m)-\mathcal{T}_c(P,Q)\}$, in the semidiscrete case, i.e., when the distribution $P$ is supported on $N$ points, but without assumptions on $Q$. We show that the asymptotic distribution is the supremum of a centered Gaussian process, which is Gaussian under some additional conditions on the probability $Q$ and on the cost. Such results imply the central limit theorem for the $p$-Wasserstein distance, for $p\geq 1$. This means that, for fixed $N$, the curse of dimensionality is avoided. To better understand the influence of such $N$, we provide bounds of $E|\mathcal{W}_1(P,Q_m)-\mathcal{W}_1(P,Q)|$ depending on $m$ and $N$. Finally, the semidiscrete framework provides a control on the second derivative of the dual formulation, which yields the first central limit theorem for the optimal transport potentials. The results are supported by simulations that help to visualize the given limits and bounds. We also analyse the cases where the classical bootstrap works.

preprint · 2022 · arXiv

Dimension Reduction for time series with Variational AutoEncoders

In this work, we explore dimensionality reduction techniques for univariate and multivariate time series data. We especially conduct a comparison between wavelet decomposition and convolutional variational autoencoders for dimension reduction. We show that variational autoencoders are a good option for reducing the dimension of high dimensional data like ECG. We make these comparisons on a real world, publicly available, ECG dataset that has lots of variability and use the reconstruction error as the metric. We then explore the robustness of these models with noisy data whether for training or inference. These tests are intended to reflect the problems that exist in real-world time series data and the VAE was robust to both tests.

preprint · 2022 · arXiv

Explaining Machine Learning Models using Entropic Variable Projection

In this paper, we present a new explainability formalism designed to shed light on how each input variable of a test set impacts the predictions of machine learning models. Hence, we propose a group explainability formalism for trained machine learning decision rules, based on their response to the variability of the input variables distribution. In order to emphasize the impact of each input variable, this formalism uses an information theory framework that quantifies the influence of all input-output observations based on entropic projections. This is thus the first unified and model agnostic formalism enabling data scientists to interpret the dependence between the input variables, their impact on the prediction errors, and their influence on the output predictions. Convergence rates of the entropic projections are provided in the large sample case. Most importantly, we prove that computing an explanation in our framework has a low algorithmic complexity, making it scalable to real-life large datasets. We illustrate our strategy by explaining complex decision rules learned by using XGBoost, Random Forest or Deep Neural Network classifiers on various datasets such as Adult Income, MNIST, CelebA, Boston Housing, Iris, as well as synthetic ones. We finally make clear its differences with the explainability strategies LIME and SHAP, that are based on single observations. Results can be reproduced by using the freely distributed Python toolbox https://gems-ai.aniti.fr/.

preprint · 2022 · arXiv

Fairness constraint in Structural Econometrics and Application to fair estimation using Instrumental Variables

A supervised machine learning algorithm determines a model from a learning sample that will be used to predict new observations. To this end, it aggregates individual characteristics of the observations of the learning sample. But this information aggregation does not consider any potential selection on unobservables and any status-quo biases which may be contained in the training sample. The latter bias has raised concerns around the so-called fairness of machine learning algorithms, especially towards disadvantaged groups. In this chapter, we review the issue of fairness in machine learning through the lens of structural econometrics models in which the unknown index is the solution of a functional equation and issues of endogeneity are explicitly accounted for. We model fairness as a linear operator whose null space contains the set of strictly fair indexes. A fair solution is obtained by projecting the unconstrained index into the null space of this operator or by directly finding the closest solution of the functional equation in this null space. We also acknowledge that policymakers may incur a cost when moving away from the status quo. Approximate fairness is achieved by introducing a fairness penalty in the learning procedure and balancing more or less heavily the influence between the status quo and a fully fair solution.

preprint · 2022 · arXiv

GAN Estimation of Lipschitz Optimal Transport Maps

This paper introduces the first statistically consistent estimator of the optimal transport map between two probability distributions, based on neural networks. Building on theoretical and practical advances in the field of Lipschitz neural networks, we define a Lipschitz-constrained generative adversarial network penalized by the quadratic transportation cost. Then, we demonstrate that, under regularity assumptions, the obtained generator converges uniformly to the optimal transport map as the sample size increases to infinity. Furthermore, we show through a number of numerical experiments that the learnt mapping has promising performance. In contrast to previous work tackling either statistical guarantees or practicality, we provide an expressive and feasible estimator which paves the way for optimal transport applications where the asymptotic behaviour must be certified.

preprint · 2021 · arXiv

A Consistent Extension of Discrete Optimal Transport Maps for Machine Learning Applications

Optimal transport maps define a one-to-one correspondence between probability distributions, and as such have grown popular for machine learning applications. However, these maps are generally defined on empirical observations and cannot be generalized to new samples while preserving asymptotic properties. We extend a novel method to learn a consistent estimator of a continuous optimal transport map from two empirical distributions. The consequences of this work are two-fold: first, it makes it possible to extend the transport plan to new observations without recomputing the discrete optimal transport map; second, it provides statistical guarantees to machine learning applications of optimal transport. We illustrate the strength of this approach by deriving a consistent framework for transport-based counterfactual explanations in fairness.

preprint · 2021 · arXiv

Central Limit Theorems for General Transportation Costs

We consider the problem of optimal transportation with general cost between an empirical measure and a general target probability on $\mathbb{R}^d$, with $d \ge 1$. We extend results in [19] and prove asymptotic stability of both optimal transport maps and potentials for a large class of costs in $\mathbb{R}^d$. We derive a central limit theorem (CLT) towards a Gaussian distribution for the empirical transportation cost under minimal assumptions, with a new proof based on the Efron-Stein inequality and on the sequential compactness of the closed unit ball in $L^2(P)$ for the weak topology. We also provide CLTs for empirical Wasserstein distances in the special case of potential costs $|\cdot|^p$, $p > 1$.

preprint · 2020 · arXiv

A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set

Applications based on Machine Learning models have now become an indispensable part of everyday life and the professional world. A critical question has recently arisen among the population: do algorithmic decisions convey any type of discrimination against specific groups of the population or minorities? In this paper, we show the importance of understanding how a bias can be introduced into automatic decisions. We first present a mathematical framework for the fair learning problem, specifically in the binary classification setting. We then propose to quantify the presence of bias by using the standard Disparate Impact index on the real and well-known Adult income data set. Finally, we check the performance of different approaches aiming to reduce the bias in binary classification outcomes. Importantly, we show that some intuitive methods are ineffective. This sheds light on the fact that trying to make fair machine learning models may be a particularly challenging task, in particular when the training observations contain a bias.
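As a rough illustration of the Disparate Impact index named in this abstract, the sketch below computes the ratio of positive-prediction rates between two groups. It assumes the common convention DI = P(ŷ = 1 | S = 0) / P(ŷ = 1 | S = 1), with S = 0 coded as the unprivileged group; the function name and the toy data are hypothetical and are not taken from the paper or from the Adult data set.

```python
import numpy as np

def disparate_impact(y_pred, s):
    """Ratio of positive-prediction rates between two groups.

    Assumed coding: s == 0 is the unprivileged group, s == 1 the privileged one.
    Returns P(y_hat = 1 | s = 0) / P(y_hat = 1 | s = 1).
    """
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    rate_unprivileged = y_pred[s == 0].mean()
    rate_privileged = y_pred[s == 1].mean()
    return rate_unprivileged / rate_privileged

# Toy, made-up binary predictions and group labels.
y_hat = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
di = disparate_impact(y_hat, group)
print(di)
```

Under this convention a value of 1 means both groups receive positive predictions at the same rate, and values well below 1 indicate that the unprivileged group is favoured less often.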

preprint · 2020 · arXiv

Gaussian Processes indexed on the symmetric group: prediction and learning

In the framework of supervised learning of a real function defined on a space X, the so-called Kriging method relies on a real Gaussian field defined on X. The Euclidean case is well known and has been widely studied. In this paper, we explore the less classical case where X is the non-commutative finite group of permutations. In this setting, we propose and study a harmonic analysis of the covariance operators that makes it possible to consider Gaussian process models and forecasting issues. Our theory is motivated by statistical ranking problems.

preprint · 2020 · arXiv

optimalFlow: Optimal-transport approach to flow cytometry gating and population matching

Data obtained from Flow Cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals, with different characteristics such as illness, age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments and the different types of flow cytometers are some of the technical causes of variability. This mixture of sources of variability makes the use of supervised machine learning for identification of cell populations difficult. The present work is conceived as a combination of strategies to facilitate the task of supervised gating. We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning, restricted to the new groups, performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state-of-the-art techniques in the proposed datasets. Our code is freely available as optimalFlow, a Bioconductor R package, at https://bioconductor.org/packages/optimalFlow. optimalFlowTemplates+optimalFlowClassification addresses the problem of using supervised learning while accounting for biological and technical variability. Our methodology provides a robust automated gating workflow that handles the intrinsic variability of flow cytometry data well. Our main innovation is the methodology itself and the optimal-transport techniques that we apply to flow cytometry analysis.

preprint · 2020 · arXiv

Projection to Fairness in Statistical Learning

In the context of regression, we consider the fundamental question of making an estimator fair while preserving its prediction accuracy as much as possible. To that end, we define its projection to fairness as its closest fair estimator in a sense that reflects prediction accuracy. Our methodology leverages tools from optimal transport to construct efficiently the projection to fairness of any given estimator as a simple post-processing step. Moreover, our approach precisely quantifies the cost of fairness, measured in terms of prediction accuracy.

preprint · 2020 · arXiv

Review of Mathematical frameworks for Fairness in Machine Learning

A review of the main fairness definitions and fair learning methodologies proposed in the literature over recent years is presented from a mathematical point of view. Following our independence-based approach, we consider how to build fair algorithms and the consequences on the degradation of their performance compared to the possibly unfair case. This corresponds to the price for fairness given by the criteria statistical parity or equality of odds. Novel results giving the expressions of the optimal fair classifier and the optimal fair predictor (under a linear regression Gaussian model) in the sense of equality of odds are presented.

preprint · 2020 · arXiv

Risk Measures Estimation Under Wasserstein Barycenter

Randomness in financial markets requires modern and robust multivariate models of risk measures. This paper proposes a new approach for modeling multivariate risk measures under Wasserstein barycenters of probability measures supported on location-scatter families. Simple and advanced copula-based multivariate Value at Risk models are compared with the derived technique. The performance of the model is also checked on United States market indices during the financial crisis caused by COVID-19. The introduced model behaves satisfactorily in both ordinary and volatile periods of asset prices, providing realistic VaR forecasts in this era of social distancing.

preprint · 2020 · arXiv

The statistical effect of entropic regularization in optimal transportation

We propose to tackle the problem of understanding the effect of regularization in Sinkhorn algorithms. In the case of Gaussian distributions we provide a closed form for the regularized optimal transport, which gives a better understanding of the effect of the regularization within a statistical framework.

preprint · 2016 · arXiv

Big Data analytics. Three use cases with R, Python and Spark

Management and analysis of big data are systematically associated with a distributed data architecture in the Hadoop and now Spark frameworks. This article offers an introduction for statisticians to these technologies by comparing the performance obtained by the direct use of three reference environments (R, Python Scikit-learn, Spark MLlib) on three public use cases: character recognition, recommending films, and categorizing products. The main result is that, while Spark is very efficient for data munging and for recommendation by collaborative filtering (non-negative factorization), current implementations of conventional learning methods (logistic regression, random forests) in MLlib or SparkML compete poorly, or not at all, with the habitual use of these methods (R, Python Scikit-learn) in an integrated or undistributed architecture.

preprint · 2016 · arXiv

Destination Prediction by Trajectory Distribution Based Model

In this paper we propose a new method to predict the final destination of vehicle trips based on their initial partial trajectories. We first review how we obtained a clustering of trajectories that describes user behaviour. Then, we explain how we model main traffic flow patterns by a mixture of 2D Gaussian distributions. This yields a density-based clustering of locations, which produces a data-driven grid of similar points within each pattern. We present how this model can be used to predict the final destination of a new trajectory based on its first locations using a two-step procedure: first, we assign the new trajectory to the clusters it most likely belongs to; second, we use characteristics from trajectories inside these clusters to predict the final destination. Finally, we present experimental results of our methods for classification of trajectories and final destination prediction on datasets of timestamped GPS locations of taxi trips. We test our methods on two different datasets, to assess the capacity of our method to adapt automatically to different subsets.

preprint · 2016 · arXiv

Existence and Consistency of Wasserstein Barycenters

In this paper, based on the Fréchet mean, we define a notion of barycenter corresponding to a usual notion of statistical mean. We prove the existence of Wasserstein barycenters of random distributions defined on a geodesic space (E, d). We also prove the consistency of this barycenter in a general setting, that includes taking barycenters of empirical versions of the distributions or of a growing set of distributions.
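To make the notion of a Wasserstein barycenter concrete, here is a minimal sketch for the simplest special case only: empirical distributions on the real line with equal sample sizes, using the 2-Wasserstein distance. In that case the barycenter's quantile function is the weighted average of the input quantile functions, so averaging order statistics is enough. The function name `barycenter_1d` is hypothetical, and this does not cover the general geodesic-space setting of the paper.

```python
import numpy as np

def barycenter_1d(samples, weights=None):
    """Support points of the 2-Wasserstein barycenter of 1D empirical measures.

    `samples` is a list of equal-length 1D samples. On the real line the
    barycenter is obtained by averaging quantile functions, i.e. by averaging
    the sorted samples position by position.
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float), axis=1)
    k = sorted_samples.shape[0]
    if weights is None:
        weights = np.full(k, 1.0 / k)   # uniform weights by default
    weights = np.asarray(weights, dtype=float)
    return weights @ sorted_samples

a = [0.0, 2.0, 1.0]
b = [10.0, 12.0, 11.0]
bary = barycenter_1d([a, b])
print(bary)
```

Note that the result is a shifted copy of the inputs rather than the two-cluster mixture that a pointwise average of densities would give, which is the reason barycenters are used as a notion of mean for distributions.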

preprint · 2015 · arXiv

A statistical analysis of a deformation model with Wasserstein barycenters : estimation procedure and goodness of fit test

We propose a study of a distribution registration model for general deformation functions. In this framework, we provide estimators of the deformations as well as a goodness of fit test of the model. For this, we consider a criterion based on the Fréchet mean (or barycenter) of the warped distributions, whose study makes inference on the model possible. In particular we obtain the asymptotic distribution and a bootstrap procedure for the Wasserstein variation.

preprint · 2015 · arXiv

Review and Perspective for Distance Based Trajectory Clustering

In this paper we tackle the issue of clustering trajectories of geolocalized observations. Using clustering techniques based on the choice of a distance between the observations, we first provide a comprehensive review of the different distances used in the literature to compare trajectories. Then, based on the limitations of these methods, we introduce a new distance: Symmetrized Segment-Path Distance (SSPD). We finally compare this new distance to the others according to their corresponding clustering results obtained using both hierarchical clustering and affinity propagation methods.
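The sketch below shows one common reading of the Symmetrized Segment-Path Distance: the segment-path distance from one trajectory to another is the mean, over the points of the first, of the distance to the nearest segment of the second, and SSPD is the average of the two directions. It assumes planar Euclidean coordinates and trajectories with at least two points; the helper names are hypothetical, and details such as geodesic distances on GPS data are left out.

```python
import numpy as np

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment [a, b]."""
    ab = b - a
    denom = ab @ ab
    if denom == 0.0:                      # degenerate segment: a == b
        return float(np.linalg.norm(p - a))
    t = np.clip((p - a) @ ab / denom, 0.0, 1.0)   # clamp projection to segment
    return float(np.linalg.norm(p - (a + t * ab)))

def point_to_trajectory(p, traj):
    """Distance from p to the closest segment of a polyline trajectory."""
    return min(point_to_segment(p, traj[i], traj[i + 1])
               for i in range(len(traj) - 1))

def spd(t1, t2):
    """Segment-path distance: mean distance from the points of t1 to t2."""
    return float(np.mean([point_to_trajectory(p, t2) for p in t1]))

def sspd(t1, t2):
    """Symmetrized segment-path distance between two trajectories."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    return 0.5 * (spd(t1, t2) + spd(t2, t1))

# Two parallel toy trajectories one unit apart, sampled differently.
traj_a = [[0, 0], [1, 0], [2, 0]]
traj_b = [[0, 1], [2, 1]]
d = sspd(traj_a, traj_b)
print(d)
```

Because distances are taken to segments rather than to sampled points, the two trajectories above are one unit apart even though they have different numbers of points.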

preprint · 2014 · arXiv

A Kriging procedure for processes indexed by graphs

We provide a new kriging procedure of processes on graphs. Based on the construction of Gaussian random processes indexed by graphs, we extend to this framework the usual linear prediction method for spatial random fields, known as kriging. We provide the expression of the estimator of such a random field at unobserved locations as well as a control for the prediction error.
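The linear prediction step referred to here can be sketched with the usual zero-mean (simple) kriging formulas: weights w = K⁻¹k₀, prediction wᵀz, and error variance c₀₀ − wᵀk₀. The covariance used below, exp(−graph distance) on a three-vertex path graph, is a hypothetical stand-in chosen because it is positive definite on a path; it is not the paper's construction of Gaussian processes indexed by graphs.

```python
import numpy as np

# Hypothetical covariance on the path graph 0 - 1 - 2:
# cov(i, j) = exp(-graph_distance(i, j)), an AR(1)-type covariance.
dist = np.abs(np.subtract.outer(np.arange(3), np.arange(3)))
cov = np.exp(-dist.astype(float))

def simple_kriging(cov, obs_idx, z_obs, target):
    """Zero-mean best linear predictor at `target` and its error variance."""
    K = cov[np.ix_(obs_idx, obs_idx)]   # covariance among observed vertices
    k0 = cov[obs_idx, target]           # covariance between observed and target
    w = np.linalg.solve(K, k0)          # kriging weights
    pred = float(w @ z_obs)
    var = float(cov[target, target] - w @ k0)
    return pred, var

# Observe vertices 0 and 2, predict the unobserved middle vertex 1.
pred, var = simple_kriging(cov, [0, 2], np.array([1.0, 3.0]), 1)
print(pred, var)
```

The returned variance plays the role of the prediction-error control mentioned in the abstract, in this simplified setting where the covariance is assumed known.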

preprint · 2014 · arXiv

A unified framework for the study of the PLS estimator's properties

In this paper we propose a new approach to study the properties of the Partial Least Squares (PLS) estimator. This approach relies on the link between PLS and discrete orthogonal polynomials. Indeed many important PLS objects can be expressed in terms of some specific discrete orthogonal polynomials, called the residual polynomials. Based on the explicit analytical expression we have stated for these polynomials in terms of signal and noise, we provide a new framework for the study of PLS. Furthermore, we show that this new approach allows us to simplify and retrieve independent proofs of many classical results (proved earlier by different authors using various approaches and tools). This general and unifying approach also sheds new light on PLS and helps to gain insight into its properties.

preprint · 2014 · arXiv

Big Data Analytics - Retour vers le Futur 3; De Statisticien à Data Scientist

The rapid evolution of information systems managing more and more voluminous data has caused profound paradigm shifts in the job of statistician, who has successively become data miner, bioinformatician and now data scientist. Without claiming completeness, and after illustrating these successive mutations, this article briefly introduces the new research issues that are quickly arising in Statistics, and more generally in Mathematics, in order to integrate the characteristics of big data: volume, variety and velocity.

preprint · 2014 · arXiv

Efficient estimation of conditional covariance matrices for dimension reduction

Let $\boldsymbol{X}\in \mathbb{R}^p$ and $Y\in \mathbb{R}$. In this paper we propose an estimator of the conditional covariance matrix, $\mathrm{Cov}(\mathbb{E}[\boldsymbol{X}\vert Y])$, in an inverse regression setting. Based on the estimation of a quadratic functional, this methodology provides an efficient estimator from a semi parametric point of view. We consider a functional Taylor expansion of $\mathrm{Cov}(\mathbb{E}[\boldsymbol{X}\vert Y])$ under some mild conditions and the effect of using an estimate of the unknown joint distribution. The asymptotic properties of this estimator are also provided.

preprint · 2014 · arXiv

Group Lasso for generalized linear models in high dimension

Nowadays an increasing amount of data is available and we have to deal with models in high dimension (number of covariates much larger than the sample size). Under a sparsity assumption it is reasonable to hope that we can make a good estimation of the regression parameter. This sparsity assumption, as well as a block structuring of the covariates into groups with similar modes of behavior, is for example quite natural in genomics. A huge amount of scientific literature exists for Gaussian linear models, including the Lasso estimator and also the Group Lasso estimator, which promotes group sparsity under a priori knowledge of the groups. We extend this Group Lasso procedure to generalized linear models and we study the properties of this estimator for sparse high-dimensional generalized linear models to find convergence rates. We provide oracle inequalities for the prediction and estimation error under assumptions on the covariates and under a condition on the design matrix. We show the ability of this estimator to recover a good sparse approximation of the true model. Finally, we extend these results to the case of an Elastic net penalty and we apply them to the so-called Poisson regression case which, unlike logistic regression, has not been studied in this context.

preprint · 2014 · arXiv

PLS: a new statistical insight through the prism of orthogonal polynomials

Partial Least Squares (PLS) is a dimension reduction method used to remove multicollinearities in a regression model. However, contrary to Principal Components Analysis (PCA), the PLS components are also chosen to be optimal for predicting the response $Y$. In this paper we provide a new and explicit formula for the residuals. We show that the residuals are completely determined by the spectrum of the design matrix and by the noise on the observations. Because little is known about the behaviour of the PLS components, we also investigate their statistical properties in a regression context. New results on regression and prediction error for PLS are stated under the assumption of a low variance of the noise.

preprint · 2013 · arXiv

A robust algorithm for template curve estimation based on manifold embedding

This paper considers the problem of finding a meaningful template function that represents the common pattern of a sample of curves. To address this issue, a novel algorithm based on a robust version of the isometric feature mapping (Isomap) algorithm is developed. Assuming that the functional data lie on an intrinsically low-dimensional smooth manifold with unknown underlying structure, we propose an approximation of the geodesic distance. This approximation is used to compute the corresponding empirical Fréchet median function, which provides an intrinsic estimator of the template function. Unlike the Isomap method, the algorithm has the advantage of being parameter free and easier to use. Comparisons with other methods, with both simulated and real datasets, show that the algorithm works well and outperforms these methods.

preprint · 2013 · arXiv

Distribution's template estimate with Wasserstein metrics

In this paper we tackle the problem of comparing distributions of random variables and defining a mean pattern between a sample of random events. Using barycenters of measures in the Wasserstein space, we propose an iterative version as an estimation of the mean distribution. Moreover, when the distributions are a common measure warped by a centered random operator, the barycenter makes it possible to recover this distribution template.

preprint · 2013 · arXiv

Functional calibration estimation by the maximum entropy on the mean principle

We extend the problem of obtaining an estimator for the finite population mean parameter incorporating complete auxiliary information through calibration estimation in survey sampling but considering a functional data framework. The functional calibration sampling weights of the estimator are obtained by matching the calibration estimation problem with the maximum entropy on the mean principle. In particular, the calibration estimation is viewed as an infinite dimensional linear inverse problem following the structure of the maximum entropy on the mean approach. We give a precise theoretical setting and estimate the functional calibration weights assuming, as prior measures, the centered Gaussian and compound Poisson random measures. Additionally, through a simple simulation study, we show that our functional calibration estimator improves its accuracy compared with the Horvitz-Thompson estimator.

preprint · 2012 · arXiv

Adaptive Covariance Estimation with model selection

We provide in this paper a fully adaptive penalized procedure to select a covariance among a collection of models, observing i.i.d. replications of the process at fixed observation points. For this we generalize previous results of Bigot et al. and propose to use a data-driven penalty to obtain an oracle inequality for the estimator. We prove that this method is an extension to the matricial regression model of the work by Baraud.

preprint · 2012 · arXiv

Gaussian stationary processes over graphs, general frame and maximum likelihood identification

In this paper, using spectral theory of Hilbertian operators, we study ARMA Gaussian processes indexed by graphs. We extend Whittle maximum likelihood estimation of the parameters for the corresponding spectral density and show their asymptotic optimality.

preprint · 2012 · arXiv

Modeling Weather Conditions Consequences on Road Trafficking Behaviors

We provide a model to understand how adverse weather conditions modify traffic flow dynamics. We first prove that the microscopic Free Flow Speed of the vehicles is changed and then provide a rule to model this change. For this, we consider a thresholded linear model, corresponding to an application of a MARS model to road traffic. This model adapts itself locally to the whole road network and provides accurate, unbiased speed forecasts using live or short-term forecasted weather data.

preprint · 2011 · arXiv

Adaptive estimation of spectral densities via wavelet thresholding and information projection

In this paper, we study the problem of adaptive estimation of the spectral density of a stationary Gaussian process. For this purpose, we consider a wavelet-based method which combines the ideas of wavelet approximation and estimation by information projection in order to guarantee that the solution is a nonnegative function. The spectral density of the process is estimated by projecting the wavelet thresholding expansion of the periodogram onto a family of exponential functions. This ensures that the spectral density estimator is a strictly positive function. Then, by Bochner's theorem, the corresponding estimator of the covariance function is positive semidefinite. The theoretical behavior of the estimator is established in terms of the rate of convergence of the Kullback-Leibler discrepancy over Besov classes. We also show the excellent practical performance of the estimator in some numerical experiments.

preprint 2011 arXiv

Estimation error for blind Gaussian time series prediction

We tackle the issue of the blind prediction of a Gaussian time series. For this, we construct a projection operator built by plugging an empirical covariance estimate into a Schur complement decomposition of the projector. This operator is then used to compute the predictor. Rates of convergence of the estimates are given.
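
As a rough illustration of the plug-in idea in this abstract, the sketch below predicts the next value of a series from its last p values, using empirical autocovariances in place of the unknown covariance. For a zero-mean Gaussian vector, the best predictor of the future given the past is Sigma_fp Sigma_pp^{-1} x_past, which is the regression term appearing in the Schur complement of the joint covariance. The function names, the choice of a biased autocovariance estimate, and the window length p are assumptions made for this example, not the construction used in the paper.

```python
def empirical_autocov(x, max_lag):
    """Biased empirical autocovariances r(0), ..., r(max_lag) of a series."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def plug_in_predictor(x, p):
    """Predict the next value of x from its last p values (hypothetical helper).

    Returns (prediction, weights); weights[i] multiplies the value observed
    i + 1 steps before the predicted time point.
    """
    r = empirical_autocov(x, p)
    # Estimated covariance of the past window (Toeplitz, most recent first).
    sigma_pp = [[r[abs(i - j)] for j in range(p)] for i in range(p)]
    # Estimated covariance between the next value and each past value.
    sigma_fp = [r[i + 1] for i in range(p)]
    weights = solve(sigma_pp, sigma_fp)
    mean = sum(x) / len(x)
    recent = [x[-1 - i] - mean for i in range(p)]
    return mean + sum(w * v for w, v in zip(weights, recent)), weights
```

Calling `plug_in_predictor(series, 2)` on a list of floats returns the forecast together with the estimated regression weights; on a long AR(1) sample the first weight should be close to the autoregressive coefficient.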

preprint 2011 arXiv

Group Lasso estimation of high-dimensional covariance matrices

In this paper, we consider the Group Lasso estimator of the covariance matrix of a stochastic process corrupted by additive noise. We propose to estimate the covariance matrix in a high-dimensional setting under the assumption that the process has a sparse representation in a large dictionary of basis functions. Using a matrix regression model, we propose a new methodology for high-dimensional covariance matrix estimation based on empirical contrast regularization by a group Lasso penalty. Using such a penalty, the method selects a sparse set of basis functions in the dictionary used to approximate the process, leading to an approximation of the covariance matrix in a low-dimensional space. Consistency of the estimator is studied in Frobenius and operator norms, and an application to sparse PCA is proposed.
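
To make the selection effect of a group Lasso penalty concrete, here is a minimal sketch of group soft-thresholding, the proximal operator of the penalty lam * sum_g ||beta_g||_2. It shrinks each group of coefficients toward zero and removes a group entirely when its Euclidean norm is at most lam, which is how whole basis functions can drop out of the dictionary. The function name and the list-of-lists layout are assumptions made for this example; the paper's estimator is defined through a matrix regression model and is not reproduced here.

```python
import math


def group_soft_threshold(groups, lam):
    """Apply group soft-thresholding with level lam to each coefficient group.

    groups: list of lists of floats, one inner list per group (assumed layout).
    Returns a new list of groups; a group with norm <= lam becomes all zeros.
    """
    out = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        # Shrink the whole group by a common factor, or kill it entirely.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * v for v in g])
    return out
```

For instance, with `lam = 1.0` a group such as `[3.0, 4.0]` (norm 5) is scaled by 0.8, while a group such as `[0.1, 0.2]` is set to zero.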

preprint 2011 arXiv

LAN property for some fractional type Brownian motion

We study asymptotic expansion of the likelihood of a certain class of Gaussian processes characterized by their spectral density $f_\theta$. We consider the case where $f_\theta(x) \sim_{x\to 0} |x|^{-\alpha(\theta)} L_\theta(x)$ with $L_\theta$ a slowly varying function and $\alpha(\theta) \in (-\infty, 1)$. We prove the LAN property for these models, which include in particular fractional Brownian motion or ARFIMA processes.

preprint 2011 arXiv

Manifold embedding for curve registration

We focus on the problem of finding a good representative of a sample of random curves warped from a common pattern f. We first prove that such a problem can be recast in a manifold framework. Then, we propose an estimator of the common pattern f based on an approximate geodesic distance on a suitable manifold. We then compare the proposed method to more classical methods.

preprint 2011 arXiv

Unbiased risk estimation method for covariance estimation

We consider a model selection estimator of the covariance of a random process. Using the Unbiased Risk Estimation (URE) method, we build an estimator of the risk which allows us to select an estimator from a collection of models. Then, we present an oracle inequality which ensures that the risk of the selected estimator is close to the risk of the oracle. Simulations show the efficiency of this methodology.

41 published item(s)

Generalized Functional ANOVA in Closed-Form: A Unified View of Additive Explanations

On the coalitional decomposition of parameters of interest

Transport-based Counterfactual Models

An improved central limit theorem and fast convergence rates for entropic transportation costs

Central Limit Theorems for Semidiscrete Wasserstein Distances

Dimension Reduction for time series with Variational AutoEncoders

Explaining Machine Learning Models using Entropic Variable Projection

Fairness constraint in Structural Econometrics and Application to fair estimation using Instrumental Variables

GAN Estimation of Lipschitz Optimal Transport Maps

A Consistent Extension of Discrete Optimal Transport Maps for Machine Learning Applications

Central Limit Theorems for General Transportation Costs

A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set

Gaussian Processes indexed on the symmetric group: prediction and learning

optimalFlow: Optimal-transport approach to flow cytometry gating and population matching

Projection to Fairness in Statistical Learning

Review of Mathematical frameworks for Fairness in Machine Learning

Risk Measures Estimation Under Wasserstein Barycenter

The statistical effect of entropic regularization in optimal transportation

Big Data analytics. Three use cases with R, Python and Spark

Destination Prediction by Trajectory Distribution Based Model

Existence and Consistency of Wasserstein Barycenters

A statistical analysis of a deformation model with Wasserstein barycenters : estimation procedure and goodness of fit test

Review and Perspective for Distance Based Trajectory Clustering

A Kriging procedure for processes indexed by graphs

A unified framework for the study of the PLS estimator's properties

Big Data Analytics - Retour vers le Futur 3; De Statisticien à Data Scientist

Efficient estimation of conditional covariance matrices for dimension reduction

Group Lasso for generalized linear models in high dimension

PLS: a new statistical insight through the prism of orthogonal polynomials

A robust algorithm for template curve estimation based on manifold embedding

Distribution's template estimate with Wasserstein metrics

Functional calibration estimation by the maximum entropy on the mean principle

Adaptive Covariance Estimation with model selection

Gaussian stationary processes over graphs, general frame and maximum likelihood identification

Modeling Weather Conditions Consequences on Road Trafficking Behaviors

Adaptive estimation of spectral densities via wavelet thresholding and information projection

Estimation error for blind Gaussian time series prediction

Group Lasso estimation of high-dimensional covariance matrices

LAN property for some fractional type Brownian motion

Manifold embedding for curve registration

Unbiased risk estimation method for covariance estimation