Researcher profile

Bertrand Michel

Bertrand Michel contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
9 topics
4 close collaborators

Published work

7 published items

preprint, 2026, arXiv

Concentration of the empirical measure in Wasserstein distance: bounds involving the covering dimension

We give concentration inequalities in Wasserstein distance for the empirical measure of a sequence of independent and identically distributed random variables with values in a Polish space $E$. These inequalities involve the covering dimension of the support of the distribution of the variables. More precisely, we obtain a complete extension of the concentration inequalities of Fournier and Guillin [2015] in the case where $E = \mathbb{R}^d$, in which the covering dimension replaces the dimension of the ambient space $E$.
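
As a small illustration of the quantity this abstract is about, the sketch below computes the Wasserstein distance between two empirical measures on the real line. It is not taken from the paper: it assumes two real-valued samples of equal size, for which the optimal coupling simply matches sorted values, and the function name `wasserstein_1d` and its parameter `p` are hypothetical.

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p distance between the empirical measures of two equal-size real samples.

    Assumption: one-dimensional data and equal sample sizes, so the optimal
    transport plan pairs the i-th smallest x with the i-th smallest y.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    xs, ys = sorted(xs), sorted(ys)
    # Average transport cost under the monotone (sorted) coupling.
    cost = sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)
    return cost ** (1 / p)
```

The paper itself works in a general Polish space and compares the empirical measure with the underlying distribution; the one-dimensional case is used here only because it has a closed form.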

preprint, 2022, arXiv

Topological phase estimation method for reparameterized periodic functions

We consider a signal composed of several periods of a periodic function, of which we observe a noisy reparametrisation. The phase estimation problem consists of finding that reparametrisation, and, in particular, the number of observed periods. Existing methods are well-suited to the setting where the periodic function is known, or at least, simple. We consider the case when it is unknown and we propose an estimation method based on the shape of the signal. We use the persistent homology of sublevel sets of the signal to capture the temporal structure of its local extrema. We infer the number of periods in the signal by counting points in the persistence diagram and their multiplicities. Using the estimated number of periods, we construct an estimator of the reparametrisation. It is based on counting the number of sufficiently prominent local minima in the signal. This work is motivated by a vehicle positioning problem, on which we evaluated the proposed method.
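
The idea of counting prominent local minima through the persistence of sublevel sets can be sketched as follows. This is a generic 0-dimensional persistence computation for a sampled 1D signal using a union-find structure, not the authors' estimator; the function names and the `min_persistence` threshold are hypothetical.

```python
import math

def sublevel_persistence(signal):
    """0-dimensional persistence pairs (birth, death) of the sublevel sets of a 1D signal."""
    if not signal:
        return []
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    parent = [None] * n  # None means "not yet in the sublevel set"
    birth = {}           # index -> value at which that index entered

    def find(i):
        # Root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the current level.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < signal[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((signal[order[0]], math.inf))
    return pairs

def count_prominent_minima(signal, min_persistence):
    """Number of local minima whose persistence exceeds the (hypothetical) threshold."""
    return sum(1 for b, d in sublevel_persistence(signal) if d - b > min_persistence)
```

For example, on three sampled periods of a cosine, `count_prominent_minima([math.cos(2 * math.pi * 3 * k / 300) for k in range(301)], 1.0)` counts one prominent minimum per period.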

preprint, 2021, arXiv

Learning with tree tensor networks: complexity estimates and model selection

Tree tensor networks, or tree-based tensor formats, are prominent model classes for the approximation of high-dimensional functions in computational and data science. They correspond to sum-product neural networks with a sparse connectivity associated with a dimension tree and widths given by a tuple of tensor ranks. The approximation power of these models has been proved to be (near to) optimal for classical smoothness classes. However, in an empirical risk minimization framework with a limited number of observations, the dimension tree and ranks should be selected carefully to balance estimation and approximation errors. We propose and analyze a complexity-based model selection method for tree tensor networks in an empirical risk minimization framework and we analyze its performance over a wide range of smoothness classes. Given a family of model classes associated with different trees, ranks, tensor product feature spaces and sparsity patterns for sparse tensor networks, a model is selected (à la Barron, Birgé, Massart) by minimizing a penalized empirical risk, with a penalty depending on the complexity of the model class and derived from estimates of the metric entropy of tree tensor networks. This choice of penalty yields a risk bound for the selected predictor. In a least-squares setting, after deriving fast rates of convergence of the risk, we show that our strategy is (near to) minimax adaptive to a wide range of smoothness classes including Sobolev or Besov spaces (with isotropic, anisotropic or mixed dominating smoothness) and analytic functions. We discuss the role of sparsity of the tensor network for obtaining optimal performance in several regimes. In practice, the amplitude of the penalty is calibrated with a slope heuristics method. Numerical experiments in a least-squares regression setting illustrate the performance of the strategy.
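
The selection rule described in this abstract has the generic form of a penalized empirical risk criterion. The notation below is an assumption made for illustration and is not taken from the paper: $\mathcal{M}$ indexes the candidate model classes $M_m$ (trees, ranks, feature spaces, sparsity patterns), $\widehat{R}_n$ is the empirical risk on $n$ observations, $\mathrm{pen}(m)$ is a complexity-based penalty, and $\lambda$ is the amplitude that would be calibrated by the slope heuristics.

```latex
\hat f_m \in \arg\min_{f \in M_m} \widehat{R}_n(f),
\qquad
\hat m \in \arg\min_{m \in \mathcal{M}}
  \Big\{ \widehat{R}_n(\hat f_m) + \lambda \, \mathrm{pen}(m) \Big\}
```

The selected predictor is then $\hat f_{\hat m}$.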

preprint, 2021, arXiv

Statistical analysis of Mapper for stochastic and multivariate filters

Reeb spaces, as well as their discretized versions called Mappers, are common descriptors used in Topological Data Analysis, with plenty of applications in various fields of science, such as computational biology and data visualization, among others. The stability of the Mapper and the quantification of its rate of convergence to the Reeb space have been studied extensively in recent works [BBMW19, CO17, CMO18, MW16], focusing on the case where a scalar-valued filter is used for the computation of Mapper. On the other hand, much less is known in the multivariate case, when the codomain of the filter is $\mathbb{R}^p$, and in the general case, when it is a general metric space $(Z, d_Z)$, instead of $\mathbb{R}$. The few results that are available in this setting [DMW17, MW16] can only handle continuous topological spaces and cannot be used as is for finite metric spaces representing data, such as point clouds and distance matrices. In this article, we introduce a slight modification of the usual Mapper construction and we give risk bounds for estimating the Reeb space using this estimator. Our approach applies in particular to the setting where the filter function used to compute Mapper is also estimated from data, such as the eigenfunctions of PCA. Our results are given with respect to the Gromov-Hausdorff distance, computed with specific filter-based pseudometrics for Mappers and Reeb spaces defined in [DMW17]. We finally provide applications of this setting in statistics and machine learning for different kinds of target filters, as well as numerical experiments that demonstrate the relevance of our approach.

preprint, 2020, arXiv

Bayesian hierarchical models for the prediction of the driver flow and passenger waiting times in a stochastic carpooling service

Carpooling is an integral component in smart carbon-neutral cities, in particular to facilitate home-work commuting. We study an innovative carpooling service developed by the start-up Ecov which specialises in home-work commutes in peri-urban and rural regions. When a passenger makes a carpooling request, a designated driver is not assigned as in a traditional carpooling service; rather the passenger waits for the first driver, from a population of non-professional drivers who are already en route, to arrive. We propose a two-stage Bayesian hierarchical model that overcomes the considerable difficulties due to the sparsely observed driver and passenger data from an embryonic stochastic carpooling service and delivers high-quality predictions of driver flow and passenger waiting times. The first stage focuses on the driver flow, whose predictions are aggregated at the daily level to compensate for the data sparsity. The second stage processes this single daily driver flow into sub-daily (e.g. hourly) predictions of the passenger waiting times. We demonstrate that our model mostly outperforms frequentist and non-hierarchical Bayesian methods for observed data from an operational carpooling service in Lyon, France, and we also validated our model on simulated data.

preprint, 2020, arXiv

Gaussian linear model selection in a dependent context

In this paper, we study the nonparametric linear model, when the error process is a dependent Gaussian process. We focus on the estimation of the mean vector via a model selection approach. We first give the general theoretical form of the penalty function, ensuring that the penalized estimator among a collection of models satisfies an oracle inequality. Then we derive a penalty shape involving the spectral radius of the covariance matrix of the errors, which can be chosen proportional to the dimension when the error process is stationary and short range dependent. However, this penalty can be too rough in some cases, in particular when the error process is long range dependent. In a second part, we focus on the fixed-design regression model assuming that the error process is a stationary Gaussian process. We propose a model selection procedure in order to estimate the mean function via piecewise polynomials on a regular partition, when the error process is either short range dependent, long range dependent or anti-persistent. We present different kinds of penalties, depending on the memory of the process. For each case, an adaptive estimator is built, and the rates of convergence are computed. Thanks to several sets of simulations, we study the performance of these different penalties for all types of errors (short memory, long memory and anti-persistent errors). Finally, we give an application of our method to the well-known Nile data, which clearly shows that the type of dependence of the error process must be taken into account.
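
A penalty proportional to the model dimension, as mentioned for the short range dependent case, can be illustrated with the simplest piecewise polynomial family: piecewise constant functions on a regular partition. The sketch below is a minimal illustration under stated assumptions, not the procedure of the paper. The names are hypothetical, and `penalty_constant` is a stand-in for whatever constant the theory prescribes (it would absorb the noise level and the dependence structure of the errors).

```python
def piecewise_constant_fit(y, d):
    """Least-squares fit of y by a function constant on each of d cells of a regular partition."""
    n = len(y)
    fitted = [0.0] * n
    for c in range(d):
        lo, hi = c * n // d, (c + 1) * n // d
        if hi > lo:  # a cell can be empty when d > n
            mean = sum(y[lo:hi]) / (hi - lo)
            fitted[lo:hi] = [mean] * (hi - lo)
    return fitted

def select_dimension(y, dims, penalty_constant):
    """Pick the dimension minimizing residual sum of squares + penalty_constant * dimension."""
    def criterion(d):
        fitted = piecewise_constant_fit(y, d)
        rss = sum((a - b) ** 2 for a, b in zip(y, fitted))
        # Assumed penalty shape: proportional to the dimension d of the model.
        return rss + penalty_constant * d
    return min(dims, key=criterion)
```

For a signal with a single jump, such as ten zeros followed by ten ones, `select_dimension(y, [1, 2, 4], 0.5)` picks the two-cell model: one cell underfits and four cells pay extra penalty without reducing the residual.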