Researcher profile

Peggy Cénac

Peggy Cénac contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
5 topics
4 close collaborators

Published work

8 published items

preprint, 2020, arXiv

An efficient Averaged Stochastic Gauss-Newton algorithm for estimating parameters of non linear regressions models

Non linear regression models are a standard tool for modeling real phenomena, with several applications in machine learning, ecology, econometrics, and other fields. Estimating the parameters of the model has garnered a lot of attention for many years. We focus here on a recursive method for estimating parameters of non linear regressions. Indeed, methods of this kind, the most famous of which are probably the stochastic gradient algorithm and its averaged version, make it possible to deal efficiently with massive data arriving sequentially. Nevertheless, they can be, in practice, very sensitive to the case where the eigenvalues of the Hessian of the functional we would like to minimize are at different scales. To avoid this problem, we first introduce an online Stochastic Gauss-Newton algorithm. In order to improve the behavior of the estimates in case of bad initialization, we also introduce a new Averaged Stochastic Gauss-Newton algorithm and prove its asymptotic efficiency.

preprint, 2020, arXiv

Variable Length Memory Chains: characterization of stationary probability measures

Variable Length Memory Chains (VLMC), which are generalizations of finite order Markov chains, turn out to be an essential tool to model random sequences in many domains, as well as an interesting object in contemporary probability theory. The question of the existence of stationary probability measures leads us to introduce a key combinatorial structure for words produced by a VLMC: the Longest Internal Suffix. This notion allows us to state a necessary and sufficient condition for a general VLMC to admit a unique invariant probability measure. This condition turns out to get a much simpler form for a subclass of VLMC: the stable VLMC. This natural subclass, unlike the general case, enjoys a renewal property. Namely, a stable VLMC induces a semi-Markov chain on an at most countable state space. Unfortunately, this discrete time renewal process does not contain the whole information of the VLMC, preventing the study of a stable VLMC from being reduced to the study of its induced semi-Markov chain. For a subclass of stable VLMC, the convergence in distribution of a VLMC towards its stationary probability measure is established. Finally, finite state space semi-Markov chains turn out to be very special stable VLMC, shedding some new light on their limit distributions.

preprint, 2012, arXiv

Almost sure central limit theorems for random ratios and applications to LSE for fractional Ornstein-Uhlenbeck processes

We investigate an almost sure central limit theorem (ASCLT) for sequences of random variables having the form of a ratio of two terms such that the numerator satisfies the ASCLT and the denominator is a positive term which converges almost surely to 1. This result leads to the ASCLT for least squares estimators for Ornstein-Uhlenbeck processes driven by fractional Brownian motion.

preprint, 2012, arXiv

Persistent random walks, variable length Markov chains and piecewise deterministic Markov processes

A classical random walk $(S_t, t\in\mathbb{N})$ is defined by $S_t:=\displaystyle\sum_{n=0}^t X_n$, where $(X_n)$ are i.i.d. When the increments $(X_n)_{n\in\mathbb{N}}$ are a first-order Markov chain, a short memory is introduced in the dynamics of $(S_t)$. This so-called "persistent" random walk is no longer Markovian and, under suitable conditions, the rescaled process converges towards the integrated telegraph noise (ITN) as the time-scale and space-scale parameters tend to zero (see Herrmann and Vallois, 2010; Tapiero and Vallois). The ITN process is effectively non-Markovian too. The aim is to consider persistent random walks $(S_t)$ whose increments are Markov chains with variable order which can be infinite. This variable memory is highlighted by a one-to-one correspondence between $(X_n)$ and a suitable Variable Length Markov Chain (VLMC), since for a VLMC the dependency on the past can be unbounded. The key fact is to consider the non-Markovian letter process $(X_n)$ as the margin of a couple $(X_n,M_n)_{n\ge 0}$ where $(M_n)_{n\ge 0}$ stands for the memory of the process $(X_n)$. We prove that, under a suitable rescaling, $(S_n,X_n,M_n)$ converges in distribution towards a continuous-time process $(S^0(t),X(t),M(t))$. The process $(S^0(t))$ is a semi-Markov and Piecewise Deterministic Markov Process whose paths are piecewise linear.
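As a rough illustration of the simplest case described in this abstract (increments forming a first-order Markov chain), the following Python sketch simulates a persistent random walk with increments in {+1, -1}. The function name `persistent_walk` and the parameters `p_stay_up` and `p_stay_down` are hypothetical and chosen for this example only; they do not come from the paper, and the variable-order case studied there is not covered.

```python
import random


def persistent_walk(n_steps, p_stay_up=0.9, p_stay_down=0.9, seed=0):
    """Simulate S_t = X_0 + ... + X_t where (X_n) is a two-state Markov chain.

    Assumptions (illustrative only): increments take values +1 or -1,
    and the chain keeps its current increment with probability
    p_stay_up (if the increment is +1) or p_stay_down (if it is -1).
    """
    rng = random.Random(seed)
    x = 1 if rng.random() < 0.5 else -1  # initial increment X_0
    s = x                                # S_0 = X_0
    path = [s]
    for _ in range(n_steps):
        p_stay = p_stay_up if x == 1 else p_stay_down
        if rng.random() >= p_stay:
            x = -x                       # the increment changes direction
        s += x
        path.append(s)
    return path
```

With both persistence probabilities equal to 0.5 the increments are i.i.d. and the classical random walk is recovered; values close to 1 produce the long straight runs that, after rescaling, resemble piecewise linear paths.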

preprint, 2012, arXiv

Recursive estimation of the conditional geometric median in Hilbert spaces

A recursive estimator of the conditional geometric median in Hilbert spaces is studied. It is based on a stochastic gradient algorithm whose aim is to minimize a weighted L1 criterion and is consequently well adapted for robust online estimation. The weights are controlled by a kernel function and an associated bandwidth. Almost sure convergence and L2 rates of convergence are proved under general conditions on the conditional distribution as well as the sequence of descent steps of the algorithm and the sequence of bandwidths. Asymptotic normality is also proved for the averaged version of the algorithm with an optimal rate of convergence. A simulation study confirms the interest of this new and fast algorithm when the sample sizes are large. Finally, the ability of these recursive algorithms to deal with very high-dimensional data is illustrated on the robust estimation of television audience profiles conditional on the total time spent watching television over a period of 24 hours.

preprint, 2011, arXiv

Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm

With the progress of measurement apparatus and the development of automatic sensors it is not unusual anymore to get thousands of samples of observations taking values in high-dimensional spaces such as functional spaces. In such large samples of high-dimensional data, outlying curves may not be uncommon and even a few individuals may corrupt simple statistical indicators such as the mean trajectory. We focus here on the estimation of the geometric median, which is a direct generalization of the real median and has nice robustness properties. The geometric median being defined as the minimizer of a simple convex functional that is differentiable everywhere when the distribution has no atoms, it is possible to estimate it with online gradient algorithms. Such algorithms are very fast and can deal with large samples. Furthermore, they can also be simply updated when the data arrive sequentially. We state the almost sure consistency and the L2 rates of convergence of the stochastic gradient estimator as well as the asymptotic normality of its averaged version. We get that the asymptotic distribution of the averaged version of the algorithm is the same as that of the classic estimators which are based on the minimization of the empirical loss function. The performance of our averaged sequential estimator, both in terms of computation speed and accuracy of the estimations, is evaluated with a small simulation study. Our approach is also illustrated on a sample of more than 5000 individual television audiences measured every second over a period of 24 hours.
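The averaged stochastic gradient idea in this abstract can be sketched in a few lines of Python for finite-dimensional data. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `averaged_sgd_median`, the step sequence `c_gamma * n ** (-alpha)`, the default values of `c_gamma` and `alpha`, and the choice to start from the first observation are all choices made for this example.

```python
import math


def averaged_sgd_median(samples, c_gamma=1.0, alpha=0.75):
    """Estimate the geometric median of a stream of points (tuples of floats).

    Assumptions (illustrative only): steps gamma_n = c_gamma * n**(-alpha)
    with alpha in (1/2, 1), and both sequences initialised at the first point.
    """
    stream = iter(samples)
    z = list(next(stream))      # stochastic gradient iterate
    z_bar = list(z)             # running average of the iterates
    for n, x in enumerate(stream, start=1):
        diff = [xi - zi for xi, zi in zip(x, z)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:
            gamma = c_gamma * n ** (-alpha)
            # Move a distance gamma towards the new observation:
            # (x - z) / ||x - z|| is minus the gradient of ||x - z||.
            z = [zi + gamma * d / norm for zi, d in zip(z, diff)]
        # Recursive mean of all iterates seen so far.
        z_bar = [zb + (zi - zb) / (n + 1) for zb, zi in zip(z_bar, z)]
    return z_bar
```

Each observation is used once and then discarded, so memory use does not grow with the sample size. Because every update moves the iterate by a bounded distance regardless of how far away the observation lies, a small fraction of distant outliers has limited influence on the result, in contrast to the sample mean.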

preprint, 2011, arXiv

Uncommon Suffix Tries

Common assumptions on the source producing the words inserted in a suffix trie with $n$ leaves lead to a $\log n$ height and saturation level. We provide an example of a suffix trie whose height increases faster than a power of $n$ and another one whose saturation level is negligible with respect to $\log n$. Both are built from VLMC (Variable Length Markov Chain) probabilistic sources; they are easily extended to families of sources having the same properties. The first example corresponds to a "logarithmic infinite comb" and enjoys a non uniform polynomial mixing. The second one corresponds to a "factorial infinite comb" for which mixing is uniform and exponential.

preprint, 2010, arXiv

Variable length Markov chains and dynamical sources

Infinite random sequences of letters can be viewed as stochastic chains or as strings produced by a source, in the sense of information theory. The relationship between Variable Length Markov Chains (VLMC) and probabilistic dynamical sources is studied. We establish a probabilistic frame for context trees and VLMC and we prove that any VLMC is a dynamical source for which we explicitly build the mapping. On two examples, the "comb" and the "bamboo blossom", we find a necessary and sufficient condition for the existence and the uniqueness of a stationary probability measure for the VLMC. These two examples are detailed in order to provide the associated Dirichlet series as well as the generating functions of word occurrences.
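To give a concrete feel for a variable length memory, the Python sketch below simulates a binary chain in which the law of the next letter depends only on the number of 0s written since the most recent 1. This is one plausible reading of a comb-shaped context tree and is an assumption of this example; the function name `simulate_comb_vlmc` and the callable `prob_zero` are hypothetical and do not come from the paper.

```python
import random


def simulate_comb_vlmc(n, prob_zero, seed=0):
    """Generate n letters in {0, 1} with memory equal to the current run of 0s.

    prob_zero(k) is the probability that the next letter is 0 given that
    the last k letters are 0 and are preceded by a 1 (illustrative
    parametrisation; the chain starts as if a 1 had just been written).
    """
    rng = random.Random(seed)
    letters = []
    run = 0                      # number of 0s since the last 1
    for _ in range(n):
        if rng.random() < prob_zero(run):
            letters.append(0)
            run += 1             # the relevant past grows by one letter
        else:
            letters.append(1)
            run = 0              # writing a 1 resets the memory
    return letters
```

For instance, `simulate_comb_vlmc(1000, lambda k: 1.0 / (k + 2))` makes long runs of 0s increasingly unlikely, while a constant `prob_zero` reduces the chain to an i.i.d. sequence. The amount of past needed to predict the next letter is unbounded whenever `prob_zero` keeps changing with `k`.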