Researcher profile

Yohann De Castro

Yohann De Castro contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2026arXiv

Fast Spawn\&Prune (FS\&P): Global convergence of stochastic conic particle gradient descent via birth/death process

We investigate the global optimization of the objective function arising in continuous sparse regression, specifically the Beurling LASSO (BLASSO), over the space of measures. While Conic Particle Gradient Descent (CPGD) methods are computationally efficient, they may become trapped in local minima due to the non-convexity of the parameterization. To overcome this limitation, we introduce Fast Spawn\&Prune (FS\&P), a stochastic algorithm that extends FastPart introduced in De Castro et al. (2025) and combines CPGD with a birth-death process. The birth mechanism ensures asymptotic global exploration by introducing particles in regions where first-order optimality conditions are violated, while the death process preserves computational efficiency by pruning non-informative particles. We provide the first theoretical guarantee of global convergence for this class of discrete-time stochastic algorithms, without requiring exponentially large initializations. Furthermore, we derive explicit convergence rates for the excess risk, which scale as $\mathcal{O}\big(\left(\log K / K\right)^{\frac{1}{2(2+d)}}\big)$, where $K$ denotes the number of iterations and d the dimension of the domain, thereby quantifying the trade-off between global exploration and local refinement. Moreover, the sample complexity is $\mathcal{O}\big(N^{-\frac{1}{4(2+d)}}\big)$ (up to logarithmic factors). We also propose a horizon-free variant that does not require prior knowledge of the iteration budget.

preprint2022arXiv

Concentration inequality for U-statistics of order two for uniformly ergodic Markov chains

We prove a new concentration inequality for U-statistics of order two for uniformly ergodic Markov chains. Working with bounded and $π$-canonical kernels, we show that we can recover the convergence rate of Arcones and Gin{é} who proved a concentration result for U-statistics of independent random variables and canonical kernels. Our result allows for a dependence of the kernels $h_{i,j}$ with the indexes in the sums, which prevents the use of standard blocking tools. Our proof relies on an inductive analysis where we use martingale techniques, uniform ergodicity, Nummelin splitting and Bernstein's type inequality. Assuming further that the Markov chain starts from its invariant distribution, we prove a Bernstein-type concentration inequality that provides sharper convergence rate for small variance terms.

preprint2022arXiv

Markov Random Geometric Graph (MRGG): A Growth Model for Temporal Dynamic Networks

We introduce Markov Random Geometric Graphs (MRGGs), a growth model for temporal dynamic networks. It is based on a Markovian latent space dynamic: consecutive latent points are sampled on the Euclidean Sphere using an unknown Markov kernel; and two nodes are connected with a probability depending on a unknown function of their latent geodesic distance. More precisely, at each stamp-time $k$ we add a latent point $X_k$ sampled by jumping from the previous one $X_{k-1}$ in a direction chosen uniformly $Y_k$ and with a length $r_k$ drawn from an unknown distribution called the latitude function. The connection probabilities between each pair of nodes are equal to the envelope function of the distance between these two latent points. We provide theoretical guarantees for the non-parametric estimation of the latitude and the envelope functions.We propose an efficient algorithm that achieves those non-parametric estimation tasks based on an ad-hoc Hierarchical Agglomerative Clustering approach. As a by product, we show how MRGGs can be used to detect dependence structure in growing graphs and to solve link prediction problems.

preprint2022arXiv

Minimax Estimation of Partially-Observed Vector AutoRegressions

High-dimensional time series are a core ingredient of the statistical modeling toolkit, for which numerous estimation methods are known.But when observations are scarce or corrupted, the learning task becomes much harder.The question is: how much harder? In this paper, we study the properties of a partially-observed Vector AutoRegressive process, which is a state-space model endowed with a stochastic observation mechanism.Our goal is to estimate its sparse transition matrix, but we only have access to a small and noisy subsample of the state components.Interestingly, the sampling process itself is random and can exhibit temporal correlations, a feature shared by many realistic data acquisition scenarios.We start by describing an estimator based on the Yule-Walker equation and the Dantzig selector, and we give an upper bound on its non-asymptotic error.Then, we provide a matching minimax lower bound, thus proving near-optimality of our estimator.The convergence rate we obtain sheds light on the role of several key parameters such as the sampling ratio, the amount of noise and the number of non-zero coefficients in the transition matrix.These theoretical findings are commented and illustrated by numerical experiments on simulated data.

preprint2022arXiv

Random Geometric Graph: Some recent developments and perspectives

The Random Geometric Graph (RGG) is a random graph model for network data with an underlying spatial representation. Geometry endows RGGs with a rich dependence structure and often leads to desirable properties of real-world networks such as the small-world phenomenon and clustering. Originally introduced to model wireless communication networks, RGGs are now very popular with applications ranging from network user profiling to protein-protein interactions in biology. RGGs are also of purely theoretical interest since the underlying geometry gives rise to challenging mathematical questions. Their resolutions involve results from probability, statistics, combinatorics or information theory, placing RGGs at the intersection of a large span of research communities. This paper surveys the recent developments in RGGs from the lens of high dimensional settings and non-parametric inference. We also explain how this model differs from classical community based random graph models and we review recent works that try to take the best of both worlds. As a by-product, we expose the scope of the mathematical tools used in the proofs.

preprint2021arXiv

Forecasting Nonnegative Time Series via Sliding Mask Method (SMM) and Latent Clustered Forecast (LCF)

We consider nonnegative time series forecasting framework. Based on recent advances in Nonnegative Matrix Factorization (NMF) and Archetypal Analysis, we introduce two procedures referred to as Sliding Mask Method (SMM) and Latent Clustered Forecast (LCF). SMM is a simple and powerful method based on time window prediction using Completion of Nonnegative Matrices. This new procedure combines low nonnegative rank decomposition and matrix completion where the hidden values are to be forecasted. LCF is two stage: it leverages archetypal analysis for dimension reduction and clustering of time series, then it uses any black-box supervised forecast solver on the clustered latent representation. Theoretical guarantees on uniqueness and robustness of the solution of NMF Completion-type problems are also provided for the first time. Finally, numerical experiments on real-world and synthetic data-set confirms forecasting accuracy for both the methodologies.

preprint2020arXiv

Adaptive Estimation of Nonparametric Geometric Graphs

This article studies the recovery of graphons when they are convolution kernels on compact (symmetric) metric spaces. This case is of particular interest since it covers the situation where the probability of an edge depends only on some unknown nonparametric function of the distance between latent points, referred to as Nonparametric Geometric Graphs (NGG). In this setting, adaptive estimation of NGG is possible using a spectral procedure combined with a Goldenshluger-Lepski adaptation method. The latent spaces covered by our framework encompass (among others) compact symmetric spaces of rank one, namely real spheres and projective spaces. For these latter, explicit computations of the eigen-basis and of the model complexity can be achieved, leading to quantitative non-asymptotic results. The time complexity of our method scales cubicly in the size of the graph and exponentially in the regularity of the graphon. Hence, this paper offers an algorithmically and theoretically efficient procedure to estimate smooth NGG. As a by product, this paper shows a non-asymptotic concentration result on the spectrum of integral operators defined by symmetric kernels (not necessarily positive).

preprint2020arXiv

SuperMix: Sparse Regularization for Mixtures

This paper investigates the statistical estimation of a discrete mixing measure $μ$0 involved in a kernel mixture model. Using some recent advances in l1-regularization over the space of measures, we introduce a "data fitting and regularization" convex program for estimating $μ$0 in a grid-less manner from a sample of mixture law, this method is referred to as Beurling-LASSO. Our contribution is twofold: we derive a lower bound on the bandwidth of our data fitting term depending only on the support of $μ$0 and its so-called "minimum separation" to ensure quantitative support localization error bounds; and under a so-called "non-degenerate source condition" we derive a non-asymptotic support stability property. This latter shows that for a sufficiently large sample size n, our estimator has exactly as many weighted Dirac masses as the target $μ$0 , converging in amplitude and localization towards the true ones. Finally, we also introduce some tractable algorithms for solving this convex program based on "Sliding Frank-Wolfe" or "Conic Particle Gradient Descent". Statistical performances of this estimator are investigated designing a so-called "dual certificate", which is appropriate to our setting. Some classical situations, as e.g. mixtures of super-smooth distributions (e.g. Gaussian distributions) or ordinary-smooth distributions (e.g. Laplace distributions), are discussed at the end of the paper.