Researcher profile

Michel Broniatowski

Michel Broniatowski contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
18 works
0 followers
11 topics
4 close collaborators

Published work

18 published item(s)

Preprint, 2022, arXiv

A Unifying Framework for Some Directed Distances in Statistics

Density-based directed distances -- particularly known as divergences -- between probability distributions are widely used in statistics as well as in the adjacent research fields of information theory, artificial intelligence and machine learning. Prominent examples are the Kullback-Leibler information distance (relative entropy), which is closely connected to the omnipresent maximum likelihood estimation method, and Pearson's chi-square distance, which is used for the celebrated chi-square goodness-of-fit test. Another line of statistical inference is built upon distribution-function-based divergences such as the prominent (weighted versions of) Cramér-von Mises test statistics and Anderson-Darling test statistics, which are frequently applied for goodness-of-fit investigations; some more recent methods deal with (other kinds of) cumulative paired divergences and closely related concepts. In this paper, we provide a general framework which covers in particular both the above-mentioned density-based and distribution-function-based divergence approaches; the dissimilarity of quantiles and of other statistical functionals will be included as well. From this framework, we structurally extract numerous classical and also state-of-the-art (including new) procedures. Furthermore, we deduce new concepts of dependence between random variables, as alternatives to the celebrated mutual information. Some variational representations are discussed, too.
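As a small illustration of the two density-based divergences named in this abstract, the following sketch computes the Kullback-Leibler divergence and Pearson's chi-square divergence between two distributions on a finite set. It is not taken from the paper: the function names and the example vectors `p` and `q` are assumptions chosen for illustration, and the general framework of the paper is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / q_i) on a finite set.

    Terms with p_i == 0 contribute 0; the result is infinite if p puts mass
    where q has none.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def pearson_chi_square(p, q):
    """Pearson chi-square divergence sum_i (p_i - q_i)^2 / q_i on a finite set."""
    total = 0.0
    for pi, qi in zip(p, q):
        if qi == 0:
            if pi > 0:
                return math.inf
            continue
        total += (pi - qi) ** 2 / qi
    return total


# Hypothetical example distributions on three points.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q), kl_divergence(q, p))
print(pearson_chi_square(p, q), pearson_chi_square(q, p))
```

Swapping the arguments changes both values, which is why these quantities are called directed distances rather than metrics.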

Preprint, 2020, arXiv

A sequential design for extreme quantiles estimation under binary sampling

We propose a sequential design method aiming at the estimation of an extreme quantile based on a sample of dichotomic data corresponding to peaks over a given threshold. This study is motivated by an industrial challenge in material reliability and consists in estimating a failure quantile from trials whose outcomes are reduced to indicators of whether the specimens have failed at the tested stress levels. The solution proposed is a sequential design making use of a splitting approach, decomposing the target probability level into a product of probabilities of conditional events of higher order. The method consists in gradually targeting the tail of the distribution and sampling under truncated distributions. The model is GEV or Weibull, and sequential estimation of its parameters involves an improved maximum likelihood procedure for binary data, due to the large uncertainty associated with such restricted information.
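The splitting idea mentioned in this abstract can be sketched as follows: a small tail probability is written as a product of conditional probabilities, each of moderate size. The example below is only an illustration under stated assumptions. It uses an exponential distribution (not the GEV or Weibull model of the paper), a hypothetical grid of thresholds, and upper-tail events; the paper's actual design and estimation procedure are not reproduced.

```python
import math


def survival(x, rate=1.0):
    """P(X > x) for an exponential distribution (illustrative model only)."""
    return math.exp(-rate * x)


def splitting_factors(thresholds, rate=1.0):
    """Return P(X > s_1) followed by P(X > s_j | X > s_{j-1}) for j >= 2.

    The thresholds must be increasing; the product of the returned factors
    equals P(X > s_m) for the last threshold s_m.
    """
    factors = [survival(thresholds[0], rate)]
    for previous, current in zip(thresholds, thresholds[1:]):
        factors.append(survival(current, rate) / survival(previous, rate))
    return factors


# Hypothetical increasing thresholds approaching the tail.
thresholds = [2.0, 4.0, 6.0, 8.0]
factors = splitting_factors(thresholds)
target = math.prod(factors)

print(factors)
print(target, survival(thresholds[-1]))
```

Each factor is far larger than the target probability, so each conditional stage is easier to estimate from a limited number of trials than the tail probability itself.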

Preprint, 2020, arXiv

Minimum divergence estimators, Maximum Likelihood and the generalized bootstrap

This paper is an attempt to set a justification for making use of some discrepancy indexes, starting from the classical Maximum Likelihood definition, and adapting the corresponding basic principle of inference to situations where minimization of those indexes between a model and some extension of the empirical measure of the data appears as its natural extension. This leads to the so-called generalized bootstrap setting, for which minimum divergence inference seems to replace the Maximum Likelihood one.

1 Motivation and context

Divergences between probability measures are widely used in Statistics and Data Science in order to perform inference under models of various kinds, parametric or semiparametric, or even in nonparametric settings. The corresponding methods extend the likelihood paradigm and insert inference in some minimum "distance" framing, which provides a convenient description for the properties of the resulting estimators and tests, under the model or under misspecification. Furthermore, they pave the way to a large number of competitive methods, which allow for trade-offs between efficiency and robustness, among others. Many families of such divergences have been proposed, some of them stemming from classical statistics (such as the Chi-square), while others have their origin in other fields such as Information theory. Some measures of discrepancy involve regularity of the corresponding probability measures while others seem to be restricted to measures on finite or countable spaces, at least when using them as inferential tools, hence in situations when the elements of a model have to be confronted with a dataset. The choice of a specific discrepancy measure in a specific context is somewhat arbitrary in many cases, although the resulting conclusion of the inference might differ accordingly, above all under misspecification; however, the need for such approaches is clear when aiming at robustness.

Preprint, 2014, arXiv

Long runs under a conditional limit distribution

This paper presents a sharp approximation of the density of long runs of a random walk conditioned on its end value or by an average of a function of its summands as their number tends to infinity. In the large deviation range of the conditioning event it extends the Gibbs conditional principle in the sense that it provides a description of the distribution of the random walk on long subsequences. An approximation of the density of the runs is also obtained when the conditioning event states that the end value of the random walk belongs to a thin or a thick set with a nonempty interior. The approximations hold either in probability under the conditional distribution of the random walk, or in total variation norm between measures. An application of the approximation scheme to the evaluation of rare event probabilities through importance sampling is provided. When the conditioning event is in the range of the central limit theorem, it provides a tool for statistical inference in the sense that it produces an effective way to implement the Rao-Blackwell theorem for the improvement of estimators; it also leads to conditional inference procedures in models with nuisance parameters. An algorithm for the simulation of such long runs is presented, together with an algorithm determining the maximal length for which the approximation is valid up to a prescribed accuracy.

Preprint, 2013, arXiv

Light tails: Gibbs conditional principle under extreme deviation

Let $X_{1},\ldots,X_{n}$ denote an i.i.d. sample with light tail distribution and $S_{1}^{n}$ denote the sum of its terms; let $a_{n}$ be a real sequence going to infinity with $n$. In a previous paper (\cite{BoniaCao}) it is proved that as $n\rightarrow\infty$, given $\left(S_{1}^{n}/n>a_{n}\right)$, all terms $X_{i}$ concentrate around $a_{n}$ with probability going to 1. This paper explores the asymptotic distribution of $X_{1}$ under the conditioning events $\left(S_{1}^{n}/n=a_{n}\right)$ and $\left(S_{1}^{n}/n\geq a_{n}\right)$. It is proved that under some regularity property, the asymptotic conditional distribution of $X_{1}$ given $\left(S_{1}^{n}/n=a_{n}\right)$ can be approximated in variation norm by the tilted distribution at point $a_{n}$, thereby extending the classical LDP case developed in Diaconis and Freedman (1988). Also, under $\left(S_{1}^{n}/n\geq a_{n}\right)$ the dominating point property holds. It also considers the case when the $X_{i}$'s are $\mathbb{R}^{d}$-valued, $f$ is a real valued function defined on $\mathbb{R}^{d}$ and the conditioning event writes $\left(U_{1}^{n}/n=a_{n}\right)$ or $\left(U_{1}^{n}/n\geq a_{n}\right)$ with $U_{1}^{n}:=\left(f(X_{1})+\cdots+f(X_{n})\right)/n$ and $f(X_{1})$ has a light tail distribution. As a by-product some attention is paid to the estimation of high level sets of functions.
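For readers unfamiliar with the term, the tilted distribution referred to in this abstract is usually defined as follows. The notation ($p$ for the density of $X_{1}$, $\Phi$ for its moment generating function, $\pi^{a}$ for the tilted density) is an assumption made here for illustration and may differ from the paper's.

```latex
% Standard definition of the exponentially tilted density (notation assumed).
% p      : density of X_1
% \Phi   : moment generating function of X_1, finite near t
\[
  \Phi(t) := \mathbb{E}\left[e^{tX_{1}}\right],
  \qquad
  \pi^{a}(x) := \frac{e^{tx}\,p(x)}{\Phi(t)},
  \qquad
  \text{where } t = t(a) \text{ solves } \frac{d}{dt}\log\Phi(t) = a .
\]
% With this choice of t, the tilted density \pi^{a} has expectation a.
```

In the classical Gibbs conditional principle the point $a$ is fixed; the abstract concerns the case where $a=a_{n}$ grows with $n$.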

Preprint, 2012, arXiv

A conditional limit theorem for random walks under extreme deviation

This paper explores a conditional Gibbs theorem for a random walk induced by i.i.d. $(X_{1},\ldots,X_{n})$ conditioned on an extreme deviation of its sum, $(S_{1}^{n}=na_{n})$ or $(S_{1}^{n}>na_{n})$, where $a_{n}\rightarrow\infty$. It is proved that when the summands have light tails with some additional regularity property, the asymptotic conditional distribution of $X_{1}$ can be approximated in variation norm by the tilted distribution at point $a_{n}$, thereby extending the classical LDP case.

Preprint, 2012, arXiv

Conditional inference in parametric models

This paper presents a new approach to conditional inference, based on the simulation of samples conditioned by a statistic of the data. Also, an explicit expression for the approximation of the conditional likelihood of long runs of the sample given the observed statistic is provided. It is shown that when the conditioning statistic is sufficient for a given parameter, the approximating density is still invariant with respect to the parameter. A new Rao-Blackwellisation procedure is proposed, and simulation shows that the Lehmann-Scheffé theorem is valid for this approximation. Conditional inference for exponential families with a nuisance parameter is also studied, leading to Monte Carlo tests. Finally, the estimation of the parameter of interest through conditional likelihood is considered. Comparison with the parametric bootstrap method is discussed.

Preprint, 2012, arXiv

Stretched random walks and the behaviour of their summands

This paper explores the joint behaviour of the summands of a random walk when their mean value goes to infinity as its length increases. It is proved that all the summands must share the same value, which extends previous results in the context of large exceedances of finite sums of i.i.d. random variables. Some consequences are drawn pertaining to the local behaviour of a random walk conditioned on a large deviation constraint on its end value. It is shown that the sample paths exhibit local oblique segments with increasing size and slope as the length of the random walk increases.

Preprint, 2012, arXiv

Towards zero variance estimators for rare event probabilities

Improving Importance Sampling estimators for rare event probabilities requires sharp approximations of conditional densities. This is achieved for events $E_{n}:=(f(X_{1})+\ldots+f(X_{n}))\in A_{n}$ where the summands are i.i.d. and $E_{n}$ is a large or moderate deviation event. The approximation of the conditional density of the real r.v.'s $X_{i}$, for $1\leq i\leq k_{n}$, with respect to $E_{n}$ on long runs, when $k_{n}/n\to 1$, is handled. The maximal value of $k$ compatible with a given accuracy is discussed; algorithms and simulated results are presented.
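To give a concrete sense of the setting, the sketch below implements the classical importance sampling scheme based on i.i.d. exponentially tilted summands, which is the kind of baseline such work seeks to improve; it is not the estimator of the paper. The choice of standard normal summands, the event $(S_{1}^{n}/n>a)$, and all function names and parameter values are assumptions made for illustration.

```python
import math
import random


def exact_tail(n, a):
    """Exact P(S_n / n > a) for i.i.d. standard normal summands."""
    return 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2.0))


def tilted_is_estimate(n, a, reps, seed=0):
    """Importance sampling estimate of P(S_n / n > a) with tilted proposals.

    For N(0, 1) summands, exponential tilting with parameter t gives N(t, 1);
    taking t = a centres the proposal on the rare event. Each replication is
    weighted by the likelihood ratio exp(-t * S + n * t**2 / 2).
    """
    rng = random.Random(seed)
    t = a
    total = 0.0
    for _ in range(reps):
        s = sum(rng.gauss(t, 1.0) for _ in range(n))
        if s > n * a:
            total += math.exp(-t * s + n * t * t / 2.0)
    return total / reps


# Hypothetical parameter values for the illustration.
exact = exact_tail(10, 1.0)
estimate = tilted_is_estimate(n=10, a=1.0, reps=20000)

print(exact, estimate)
```

With naive Monte Carlo, only a handful of the 20000 replications would hit the event; under the tilted proposal roughly half of them do, which is what makes the weighted estimate stable.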

Preprint, 2012, arXiv

Weighted sampling, Maximum Likelihood and minimum divergence estimators

This paper explores Maximum Likelihood in parametric models in the context of Sanov type Large Deviation Probabilities. MLE in parametric models under weighted sampling is shown to be associated with the minimization of a specific divergence criterion defined with respect to the distribution of the weights. Some properties of the resulting inferential procedure are presented; Bahadur efficiency of tests is also considered in this context.

Preprint, 2011, arXiv

An estimation method for the chi-square divergence with application to test of hypotheses

We propose a new definition of the chi-square divergence between distributions. Based on convexity properties and duality, this version of the χ^2 is well suited both for the classical applications of the χ^2 for the analysis of contingency tables and for the statistical tests for parametric models, for which it has been advocated to be robust against inliers. We present two applications in testing. In the first one we deal with tests for finite and infinite numbers of linear constraints, while, in the second one, we apply χ^2-methodology for parametric testing against contamination.

Preprint, 2011, arXiv

Long runs under point conditioning. The real case

This paper presents a sharp approximation of the density of long runs of a random walk conditioned on its end value or by an average of a function of its summands as their number tends to infinity. The conditioning event is of moderate or large deviation type. The result extends the Gibbs conditional principle in the sense that it provides a description of the distribution of the random walk on long subsequences. An algorithm for the simulation of such long runs is presented, together with an algorithm determining the maximal length for which the approximation is valid up to a prescribed accuracy.

Preprint, 2011, arXiv

Minimum divergence estimators, maximum likelihood and exponential families

In this note we prove the dual representation formula of the divergence between two distributions in a parametric model. Resulting estimators for the divergence as well as for the parameter are derived. These estimators do not make use of any grouping or smoothing. It is proved that all differentiable divergences induce the same estimator of the parameter on any regular exponential family, which is nothing else but the MLE.

Preprint, 2011, arXiv

Upper bounds for the error in some interpolation and extrapolation designs

This paper deals with probabilistic upper bounds for the error in functional estimation defined on some interpolation and extrapolation designs, when the function to estimate is supposed to be analytic. The error pertaining to the estimate may depend on various factors: the frequency of observations on the knots, the position and number of the knots, and also the error committed when approximating the function through its Taylor expansion. When the number of observations is fixed, all these parameters are determined by the choice of the design and by the choice of the estimator of the unknown function. The scope of the paper is therefore to determine a rule for the minimal number of observations required to achieve an upper bound of the error on the estimate with a given maximal probability.

Preprint, 2010, arXiv

Bivariate Cox model and copulas

This paper introduces a new class of Cox models for dependent bivariate data. The impact of the covariate on the dependence of the variables is captured through the modification of their copula. Various classes of well-known copulas are stable under the model (Archimedean type and extreme value copulas), meaning that the covariate acts in a simple and explicit way on the copula in the class; specific parametric classes are considered.

Preprint, 2010, arXiv

Minimization of divergences on sets of signed measures

We consider the minimization problem of $\phi$-divergences between a given probability measure $P$ and subsets $\Omega$ of the vector space $\mathcal{M}_\mathcal{F}$ of all signed finite measures which integrate a given class $\mathcal{F}$ of bounded or unbounded measurable functions. The vector space $\mathcal{M}_\mathcal{F}$ is endowed with the weak topology induced by the class $\mathcal{F}\cup \mathcal{B}_b$ where $\mathcal{B}_b$ is the class of all bounded measurable functions. We treat the problems of existence and characterization of the $\phi$-projections of $P$ on $\Omega$. We consider also the dual equality and the dual attainment problems when $\Omega$ is defined by linear constraints.