Researcher profile

Jean-Michel Loubes

Jean-Michel Loubes contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
18 works
0 followers
11 topics
4 close collaborators

Published work

18 published item(s)

preprint | 2026 | arXiv

Generalized Functional ANOVA in Closed-Form: A Unified View of Additive Explanations

The functional ANOVA, or Hoeffding decomposition, provides a principled framework for interpretability by decomposing a model prediction into main effects and higher-order interactions. For independent inputs, this classical decomposition is explicit. It is closely connected to SHAP values, generalized additive models, and orthogonal polynomial expansions, and therefore constitutes a fundamental tool for additive explainability. In the more general and realistic dependent setting, however, obtaining a tractable representation and estimating the decomposition from data remain challenging. In this work, we address this problem for continuous inputs. By combining Hilbert space methods with the generalized functional ANOVA, we build an explicit Riesz basis for the decomposition, which makes it easy to compute. Our formulation recovers the classical independent case and its associated orthogonal decomposition. Building on this representation, we propose a simple yet effective algorithm to estimate the decomposition from a data sample in a model-agnostic setting, and we compare it empirically with several state-of-the-art explanation methods, demonstrating the power of the approach.
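
For reference, the classical independent-input decomposition mentioned in this abstract can be written as below. This is the standard textbook form for a square-integrable function of independent inputs, not the generalized construction of the paper; the symbols $f$, $X = (X_1, \dots, X_d)$ and $f_A$ are notation assumed here for illustration.

```latex
f(X) = \sum_{A \subseteq \{1,\dots,d\}} f_A(X_A),
\qquad
f_A(x_A) = \sum_{B \subseteq A} (-1)^{|A|-|B|}\,
           \mathbb{E}\bigl[f(X) \mid X_B = x_B\bigr],
\\[4pt]
\mathbb{E}\bigl[f_A(X_A)\, f_B(X_B)\bigr] = 0 \quad (A \neq B),
\qquad
\operatorname{Var} f(X) = \sum_{A \neq \emptyset} \operatorname{Var} f_A(X_A).
```

Here the term for $B = \emptyset$ is the unconditional mean $\mathbb{E}[f(X)]$. The orthogonality in the second line is what fails in general when the inputs are dependent.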

preprint | 2023 | arXiv

On the coalitional decomposition of parameters of interest

Understanding the behavior of a black-box model with probabilistic inputs can be based on the decomposition of a parameter of interest (e.g., its variance) into contributions attributed to each coalition of inputs (i.e., subsets of inputs). In this paper, we give conditions for obtaining unambiguous and interpretable decompositions of very general parameters of interest. This allows us to recover known decompositions under weaker assumptions than stated in the literature.

preprint | 2023 | arXiv

Transport-based Counterfactual Models

Counterfactual frameworks have grown popular in machine learning both for explaining algorithmic decisions and for defining individual notions of fairness, which are more intuitive than typical group fairness conditions. However, state-of-the-art models to compute counterfactuals are either unrealistic or infeasible. In particular, while Pearl's causal inference provides appealing rules to calculate counterfactuals, it relies on a model that is unknown and hard to discover in practice. We address the problem of designing realistic and feasible counterfactuals in the absence of a causal model. We define transport-based counterfactual models as collections of joint probability distributions between observable distributions, and show their connection to causal counterfactuals. More specifically, we argue that optimal-transport theory defines relevant transport-based counterfactual models, as they are numerically feasible, statistically faithful, and can coincide under some assumptions with causal counterfactual models. Finally, these models make counterfactual approaches to fairness feasible, and we illustrate their practicality and efficiency on fair learning. With this paper, we aim to lay out the theoretical foundations for a new, implementable approach to counterfactual thinking.

preprint | 2022 | arXiv

An improved central limit theorem and fast convergence rates for entropic transportation costs

We prove a central limit theorem for the entropic transportation cost between subgaussian probability measures, centered at the population cost. This is the first result which allows for asymptotically valid inference for entropic optimal transport between measures which are not necessarily discrete. In the compactly supported case, we complement these results with new, faster, convergence rates for the expected entropic transportation cost between empirical measures. Our proof is based on strengthening convergence results for dual solutions to the entropic optimal transport problem.
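
To make the quantity in this abstract concrete, here is a minimal sketch of the entropic transportation cost between two empirical measures, computed with plain Sinkhorn iterations. It is not the authors' code. The function name `sinkhorn`, the squared Euclidean cost, the uniform weights and the default values of `eps` and `n_iter` are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn(x, y, eps=1.0, n_iter=200):
    """Entropic OT between uniform empirical measures on the rows of x and y.

    Returns the coupling, its transport cost <plan, C>, and the entropic cost
    <plan, C> + eps * KL(plan | a x b). Squared Euclidean cost is assumed.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)          # uniform weights on x (assumption)
    b = np.full(m, 1.0 / m)          # uniform weights on y (assumption)
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-C / eps)             # Gibbs kernel
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):          # alternate scaling to match both marginals
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]
    transport_cost = float((plan * C).sum())
    kl = float((plan * np.log(plan / (a[:, None] * b[None, :]))).sum())
    return plan, transport_cost, transport_cost + eps * kl
```

For small `eps` or widely spread data the kernel `K` can underflow, in which case a log-domain implementation would be the safer choice.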

preprint | 2022 | arXiv

Central Limit Theorems for Semidiscrete Wasserstein Distances

We prove a Central Limit Theorem for the empirical optimal transport cost, $\sqrt{\frac{nm}{n+m}}\{\mathcal{T}_c(P_n,Q_m)-\mathcal{T}_c(P,Q)\}$, in the semidiscrete case, i.e., when the distribution $P$ is supported on $N$ points, but without assumptions on $Q$. We show that the asymptotic distribution is the supremum of a centered Gaussian process, which is Gaussian under some additional conditions on the probability $Q$ and on the cost. Such results imply the central limit theorem for the $p$-Wasserstein distance, for $p\geq 1$. This means that, for fixed $N$, the curse of dimensionality is avoided. To better understand the influence of $N$, we provide bounds on $E|\mathcal{W}_1(P,Q_m)-\mathcal{W}_1(P,Q)|$ depending on $m$ and $N$. Finally, the semidiscrete framework provides a control on the second derivative of the dual formulation, which yields the first central limit theorem for the optimal transport potentials. The results are supported by simulations that help to visualize the given limits and bounds. We also analyse the cases where the classical bootstrap works.

preprint | 2022 | arXiv

Dimension Reduction for time series with Variational AutoEncoders

In this work, we explore dimensionality reduction techniques for univariate and multivariate time series data. In particular, we compare wavelet decomposition and convolutional variational autoencoders for dimension reduction. We show that variational autoencoders are a good option for reducing the dimension of high-dimensional data such as ECG. We make these comparisons on a real-world, publicly available ECG dataset that exhibits substantial variability, and use the reconstruction error as the metric. We then explore the robustness of these models to noisy data, whether at training or at inference time. These tests are intended to reflect the problems that exist in real-world time series data, and the VAE was robust to both tests.

preprint | 2022 | arXiv

Explaining Machine Learning Models using Entropic Variable Projection

In this paper, we present a new explainability formalism designed to shed light on how each input variable of a test set impacts the predictions of machine learning models. Hence, we propose a group explainability formalism for trained machine learning decision rules, based on their response to the variability of the input variables distribution. In order to emphasize the impact of each input variable, this formalism uses an information theory framework that quantifies the influence of all input-output observations based on entropic projections. This is thus the first unified and model agnostic formalism enabling data scientists to interpret the dependence between the input variables, their impact on the prediction errors, and their influence on the output predictions. Convergence rates of the entropic projections are provided in the large sample case. Most importantly, we prove that computing an explanation in our framework has a low algorithmic complexity, making it scalable to real-life large datasets. We illustrate our strategy by explaining complex decision rules learned by using XGBoost, Random Forest or Deep Neural Network classifiers on various datasets such as Adult Income, MNIST, CelebA, Boston Housing, Iris, as well as synthetic ones. We finally make clear its differences with the explainability strategies LIME and SHAP, that are based on single observations. Results can be reproduced by using the freely distributed Python toolbox https://gems-ai.aniti.fr/.

preprint | 2022 | arXiv

Fairness constraint in Structural Econometrics and Application to fair estimation using Instrumental Variables

A supervised machine learning algorithm determines a model from a learning sample that will be used to predict new observations. To this end, it aggregates individual characteristics of the observations of the learning sample. But this information aggregation does not consider any potential selection on unobservables and any status-quo biases which may be contained in the training sample. The latter bias has raised concerns around the so-called fairness of machine learning algorithms, especially towards disadvantaged groups. In this chapter, we review the issue of fairness in machine learning through the lens of structural econometrics models in which the unknown index is the solution of a functional equation and issues of endogeneity are explicitly accounted for. We model fairness as a linear operator whose null space contains the set of strictly fair indexes. A fair solution is obtained by projecting the unconstrained index into the null space of this operator or by directly finding the closest solution of the functional equation in this null space. We also acknowledge that policymakers may incur a cost when moving away from the status quo. Approximate fairness is obtained by introducing a fairness penalty in the learning procedure that balances the influence of the status quo against that of a fully fair solution.

preprint | 2022 | arXiv

GAN Estimation of Lipschitz Optimal Transport Maps

This paper introduces the first statistically consistent estimator of the optimal transport map between two probability distributions, based on neural networks. Building on theoretical and practical advances in the field of Lipschitz neural networks, we define a Lipschitz-constrained generative adversarial network penalized by the quadratic transportation cost. Then, we demonstrate that, under regularity assumptions, the obtained generator converges uniformly to the optimal transport map as the sample size increases to infinity. Furthermore, we show through a number of numerical experiments that the learnt mapping has promising performance. In contrast to previous work tackling either statistical guarantees or practicality, we provide an expressive and feasible estimator which paves the way for optimal transport applications where the asymptotic behaviour must be certified.

preprint | 2021 | arXiv

A Consistent Extension of Discrete Optimal Transport Maps for Machine Learning Applications

Optimal transport maps define a one-to-one correspondence between probability distributions, and as such have grown popular for machine learning applications. However, these maps are generally defined on empirical observations and cannot be generalized to new samples while preserving asymptotic properties. We extend a novel method to learn a consistent estimator of a continuous optimal transport map from two empirical distributions. The consequences of this work are two-fold: first, it makes it possible to extend the transport plan to new observations without recomputing the discrete optimal transport map; second, it provides statistical guarantees to machine learning applications of optimal transport. We illustrate the strength of this approach by deriving a consistent framework for transport-based counterfactual explanations in fairness.

preprint | 2021 | arXiv

Central Limit Theorems for General Transportation Costs

We consider the problem of optimal transportation with general cost between an empirical measure and a general target probability on $\mathbb{R}^d$, with $d \ge 1$. We extend results in [19] and prove asymptotic stability of both optimal transport maps and potentials for a large class of costs in $\mathbb{R}^d$. We derive a central limit theorem (CLT) towards a Gaussian distribution for the empirical transportation cost under minimal assumptions, with a new proof based on the Efron-Stein inequality and on the sequential compactness of the closed unit ball in $L^2(P)$ for the weak topology. We also provide CLTs for empirical Wasserstein distances in the special case of potential costs $|\cdot|^p$, $p > 1$.

preprint | 2020 | arXiv

A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set

Applications based on Machine Learning models have now become an indispensable part of everyday life and the professional world. A critical question has recently arisen among the population: do algorithmic decisions convey any type of discrimination against specific groups of the population or minorities? In this paper, we show the importance of understanding how a bias can be introduced into automatic decisions. We first present a mathematical framework for the fair learning problem, specifically in the binary classification setting. We then propose to quantify the presence of bias by using the standard Disparate Impact index on the real and well-known Adult income data set. Finally, we check the performance of different approaches aiming to reduce the bias in binary classification outcomes. Importantly, we show that some intuitive methods are ineffective. This sheds light on the fact that trying to make fair machine learning models may be a particularly challenging task, in particular when the training observations contain a bias.
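
As an illustration of the Disparate Impact index named in this abstract, the sketch below computes the ratio of positive-prediction rates between two groups. The function name `disparate_impact` and the coding convention (`s == 0` for the protected group, `s == 1` for the reference group, predictions in {0, 1}) are assumptions made here, not taken from the paper.

```python
import numpy as np

def disparate_impact(y_pred, s):
    """Ratio P(y_pred = 1 | s = 0) / P(y_pred = 1 | s = 1).

    Assumed coding: s == 0 is the protected group, s == 1 the reference
    group, and y_pred holds binary decisions in {0, 1}.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    s = np.asarray(s)
    rate_protected = y_pred[s == 0].mean()   # positive rate in protected group
    rate_reference = y_pred[s == 1].mean()   # positive rate in reference group
    return rate_protected / rate_reference

# Toy data: 1 of 4 positives in the protected group, 3 of 4 in the reference.
print(disparate_impact([1, 0, 0, 0, 1, 1, 1, 0], [0, 0, 0, 0, 1, 1, 1, 1]))
```

A value of 1 corresponds to statistical parity between the two groups; values well below 1 indicate that the protected group receives positive decisions less often.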

preprint | 2020 | arXiv

Gaussian Processes indexed on the symmetric group: prediction and learning

In the framework of the supervised learning of a real function defined on a space X, the so-called Kriging method relies on a real Gaussian field defined on X. The Euclidean case is well known and has been widely studied. In this paper, we explore the less classical case where X is the noncommutative finite group of permutations. In this setting, we propose and study a harmonic analysis of the covariance operators that makes it possible to consider Gaussian process models and forecasting issues. Our theory is motivated by statistical ranking problems.

preprint | 2020 | arXiv

optimalFlow: Optimal-transport approach to flow cytometry gating and population matching

Data obtained from Flow Cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals, with different characteristics such as illness, age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments and the different types of flow cytometers are some of the technical causes of variability. This mixture of sources of variability makes the use of supervised machine learning for identification of cell populations difficult. The present work is conceived as a combination of strategies to facilitate the task of supervised gating. We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning, restricted to the new groups, performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state-of-the-art techniques on the proposed datasets. Our code is freely available as optimalFlow, a Bioconductor R package, at https://bioconductor.org/packages/optimalFlow. optimalFlowTemplates + optimalFlowClassification addresses the problem of using supervised learning while accounting for biological and technical variability. Our methodology provides a robust automated gating workflow that handles the intrinsic variability of flow cytometry data well. Our main innovation is the methodology itself and the optimal-transport techniques that we apply to flow cytometry analysis.

preprint | 2020 | arXiv

Projection to Fairness in Statistical Learning

In the context of regression, we consider the fundamental question of making an estimator fair while preserving its prediction accuracy as much as possible. To that end, we define its projection to fairness as its closest fair estimator in a sense that reflects prediction accuracy. Our methodology leverages tools from optimal transport to construct efficiently the projection to fairness of any given estimator as a simple post-processing step. Moreover, our approach precisely quantifies the cost of fairness, measured in terms of prediction accuracy.
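
The sketch below illustrates one common way optimal transport is used as a fairness post-processing step for real-valued predictions: each group's predictions are mapped, through their within-group ranks, to a weighted average of the group quantile functions (a one-dimensional Wasserstein barycenter). It is an illustration under assumptions made here and is not claimed to be the construction of the paper; the name `fair_postprocess`, the mid-rank convention and the use of empirical group proportions as weights are hypothetical choices.

```python
import numpy as np

def fair_postprocess(pred, s):
    """Map each group's predictions onto a common 1D barycenter distribution.

    pred: real-valued predictions; s: group label of each observation.
    Within-group ordering is preserved; ties are not handled specially.
    """
    pred = np.asarray(pred, dtype=float)
    s = np.asarray(s)
    groups = np.unique(s)
    weights = {g: np.mean(s == g) for g in groups}   # empirical group proportions
    fair = np.empty_like(pred)
    for g in groups:
        mask = s == g
        p = pred[mask]
        # Mid-rank of each prediction within its own group, in (0, 1).
        ranks = (np.argsort(np.argsort(p)) + 0.5) / len(p)
        # Barycenter quantile function: weighted average of group quantiles.
        fair[mask] = sum(weights[h] * np.quantile(pred[s == h], ranks)
                         for h in groups)
    return fair
```

After this step the post-processed predictions have approximately the same distribution in every group, while the ranking of individuals inside each group is unchanged.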

preprint | 2020 | arXiv

Review of Mathematical frameworks for Fairness in Machine Learning

A review of the main fairness definitions and fair learning methodologies proposed in the literature in recent years is presented from a mathematical point of view. Following our independence-based approach, we consider how to build fair algorithms and the consequences on the degradation of their performance compared to the possibly unfair case. This corresponds to the price for fairness given by the criteria statistical parity or equality of odds. Novel results giving the expressions of the optimal fair classifier and the optimal fair predictor (under a linear regression Gaussian model) in the sense of equality of odds are presented.
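
For readers unfamiliar with the two criteria named in this abstract, their usual definitions for a binary classifier are recalled below. The notation is assumed here for illustration: $g(X)$ is the predicted label, $Y$ the true label and $S \in \{0, 1\}$ the sensitive attribute.

```latex
\text{Statistical parity:}\quad
\mathbb{P}\bigl(g(X) = 1 \mid S = 0\bigr) = \mathbb{P}\bigl(g(X) = 1 \mid S = 1\bigr),
\\[4pt]
\text{Equality of odds:}\quad
\mathbb{P}\bigl(g(X) = 1 \mid Y = y,\, S = 0\bigr)
= \mathbb{P}\bigl(g(X) = 1 \mid Y = y,\, S = 1\bigr),
\quad y \in \{0, 1\}.
```

The first criterion asks that the prediction be independent of $S$; the second asks for this independence conditionally on the true label.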

preprint | 2020 | arXiv

Risk Measures Estimation Under Wasserstein Barycenter

Randomness in financial markets requires modern and robust multivariate models of risk measures. This paper proposes a new approach for modeling multivariate risk measures under Wasserstein barycenters of probability measures supported on location-scatter families. Simple and advanced copula-based multivariate Value at Risk models are compared with the derived technique. The performance of the model is also checked on United States market indices during the financial crisis caused by COVID-19. The introduced model behaves satisfactorily in both common and volatile periods of asset prices, providing realistic VaR forecasts in this era of social distancing.
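
A univariate toy version of the idea can be sketched as follows. For one-dimensional Gaussian distributions, the 2-Wasserstein barycenter with weights $w_i$ is again Gaussian, with mean $\sum_i w_i m_i$ and standard deviation $\sum_i w_i \sigma_i$, so its Value at Risk is a Gaussian quantile. This is a deliberate simplification of the multivariate location-scatter setting of the paper; the function name `barycenter_var` and the convention that VaR is the `alpha`-quantile of the loss distribution are assumptions made here.

```python
from statistics import NormalDist

def barycenter_var(means, stds, weights, alpha=0.95):
    """VaR (alpha-quantile) of the W2 barycenter of 1D Gaussian loss models.

    means, stds: parameters of each Gaussian; weights: nonnegative, summing
    to one. Univariate simplification, for illustration only.
    """
    bary_mean = sum(w * m for w, m in zip(weights, means))
    bary_std = sum(w * sd for w, sd in zip(weights, stds))  # std devs average in 1D
    return NormalDist(bary_mean, bary_std).inv_cdf(alpha)

# Two loss models with equal weight: N(0, 1) and N(0, 3^2).
print(barycenter_var([0.0, 0.0], [1.0, 3.0], [0.5, 0.5], alpha=0.975))
```

In this example the barycenter is N(0, 2^2), so the result is twice the standard normal 97.5% quantile.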