Source author record

Michel Broniatowski

Michel Broniatowski appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.ST Statistics Theory math.PR Methodology Applications Computation Information Theory Machine Learning math.CA math.IT math.NA math.OC

Catalog footprint

What is connected

23 works

12 topics

4 close collaborators

preprint, 2022, arXiv

A Unifying Framework for Some Directed Distances in Statistics

Density-based directed distances, commonly known as divergences, between probability distributions are widely used in statistics as well as in the adjacent research fields of information theory, artificial intelligence and machine learning. Prominent examples are the Kullback-Leibler information distance (relative entropy), which is closely connected to the omnipresent maximum likelihood estimation method, and Pearson's chi-square distance, which is used for the celebrated chi-square goodness-of-fit test. Another line of statistical inference is built upon distribution-function-based divergences such as the prominent (weighted versions of) Cramér-von Mises and Anderson-Darling test statistics, which are frequently applied for goodness-of-fit investigations; some more recent methods deal with (other kinds of) cumulative paired divergences and closely related concepts. In this paper, we provide a general framework which covers in particular both the above-mentioned density-based and distribution-function-based divergence approaches; the dissimilarity of quantiles, and of other statistical functionals, is included as well. From this framework, we structurally extract numerous classical and also state-of-the-art (including new) procedures. Furthermore, we deduce new concepts of dependence between random variables, as alternatives to the celebrated mutual information. Some variational representations are discussed, too.
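As a small illustration of the two density-based examples named in this abstract, the following sketch computes the Kullback-Leibler divergence and Pearson's chi-square distance between two discrete distributions. It is not taken from the paper: the function names and the restriction to finite probability vectors with strictly positive second argument are assumptions made for the example.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / q_i).

    Assumes p and q are probability vectors of equal length and q_i > 0.
    Terms with p_i == 0 contribute zero by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def pearson_chi_square(p, q):
    """Pearson chi-square distance sum_i (p_i - q_i)^2 / q_i (assumes q_i > 0)."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

# Hypothetical example distributions on two points.
p = [0.5, 0.5]
q = [0.25, 0.75]

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
chi2_pq = pearson_chi_square(p, q)
```

Both quantities vanish when the two arguments coincide, and swapping the arguments generally changes the value, which is why such divergences are called directed distances.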

preprint, 2020, arXiv

A sequential design for extreme quantiles estimation under binary sampling

We propose a sequential design method aiming at the estimation of an extreme quantile based on a sample of dichotomous data corresponding to peaks over a given threshold. This study is motivated by an industrial challenge in material reliability and consists in estimating a failure quantile from trials whose outcomes are reduced to indicators of whether the specimens have failed at the tested stress levels. The proposed solution is a sequential design making use of a splitting approach, decomposing the target probability level into a product of probabilities of conditional events of higher order. The method consists in gradually targeting the tail of the distribution and sampling under truncated distributions. The model is GEV or Weibull, and sequential estimation of its parameters involves an improved maximum likelihood procedure for binary data, due to the large uncertainty associated with such restricted information.

preprint, 2020, arXiv

Minimum divergence estimators, Maximum Likelihood and the generalized bootstrap

This paper attempts to justify the use of some discrepancy indexes, starting from the classical Maximum Likelihood definition and adapting the corresponding basic principle of inference to situations where minimization of those indexes between a model and some extension of the empirical measure of the data appears as its natural extension. This leads to the so-called generalized bootstrap setting, in which minimum divergence inference seems to replace Maximum Likelihood inference.

1 Motivation and context

Divergences between probability measures are widely used in Statistics and Data Science in order to perform inference under models of various kinds, parametric or semiparametric, or even in nonparametric settings. The corresponding methods extend the likelihood paradigm and insert inference in some minimum "distance" framing, which provides a convenient description for the properties of the resulting estimators and tests, under the model or under misspecification. Furthermore they pave the way to a large number of competitive methods, which allow for trade-offs between efficiency and robustness, among others. Many families of such divergences have been proposed, some of them stemming from classical statistics (such as the Chi-square), while others have their origin in other fields such as Information theory. Some measures of discrepancy involve regularity of the corresponding probability measures while others seem to be restricted to measures on finite or countable spaces, at least when using them as inferential tools, hence in situations when the elements of a model have to be confronted with a dataset. The choice of a specific discrepancy measure in a specific context is somewhat arbitrary in many cases, although the resulting conclusion of the inference might differ accordingly, above all under misspecification; however the need for such approaches is clear when aiming at robustness.

preprint, 2016, arXiv

A Gibbs Conditional theorem under extreme deviation

We explore some properties of the conditional distribution of an i.i.d. sample under large exceedances of its sum. Thresholds for the asymptotic independence of the summands are observed, in contrast with the classical case when the conditioning event is in the range of a large deviation. This paper is an extension to [7]. Tools include a new Edgeworth expansion adapted to specific triangular arrays where the rows are generated by tilted distributions with diverging parameters, together with some Abelian type results.

preprint, 2016, arXiv

SAFIP: a streaming algorithm for inverse problems

This paper presents a new algorithm for solving inverse problems of the form f(x) = 0, for x a vector of dimension d and f an arbitrary function satisfying a mild regularity condition. The set of solutions S may be infinite. The algorithm produces a good coverage of S with a limited number of evaluations of the function f. It is therefore appropriate for complex problems where those evaluations are costly. Various examples are presented, with d varying from 2 to 10. Proofs of convergence and of coverage of S are presented.

preprint, 2016, arXiv

Two Iterative Proximal-Point Algorithms for the Calculus of Divergence-based Estimators with Application to Mixture Models

Estimators derived from an EM algorithm are not robust since they are based on the maximization of the likelihood function. We propose a proximal-point algorithm based on the EM algorithm which aims to minimize a divergence criterion. Resulting estimators are generally robust against outliers and misspecification. An EM-type proximal-point algorithm is also introduced in order to produce robust estimators for mixture models. Convergence properties of the two algorithms are treated. We relax an identifiability condition imposed on the proximal term in the literature, a condition which is generally not fulfilled by mixture models. The convergence of the introduced algorithms is discussed on a two-component Weibull mixture and a two-component Gaussian mixture, entailing a condition on the initialization of the EM algorithm in order for the latter to converge. Simulations on mixture models using different statistical divergences are provided to confirm the validity of our work and the robustness of the resulting estimators against outliers in comparison to the EM algorithm.

preprint, 2014, arXiv

A sharp Abelian theorem for the Laplace transform

This paper states asymptotic equivalents for the first three moments of the Esscher transform of a distribution on R with smooth density in the upper tail. As a by-product it provides a tail approximation for its moment generating function, and shows that the Esscher transforms have a Gaussian behavior for large values of the parameter.
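For readers unfamiliar with the object studied here, the standard definition of the Esscher transform and the identities linking its first three moments to the moment generating function can be written as follows. The notation ($p$, $\Phi$, $\pi_{t}$, $m$, $s^{2}$, $\mu_{3}$) is an assumption chosen for this note and is not taken from the paper; the display only states the textbook definition, not the asymptotic equivalents that the paper derives.

```latex
% Esscher transform (exponential tilting) of a density p on R,
% defined for every t at which the moment generating function is finite.
\[
  \pi_{t}(x) = \frac{e^{tx}\,p(x)}{\Phi(t)},
  \qquad
  \Phi(t) = \int_{\mathbb{R}} e^{tx}\,p(x)\,dx .
\]
% Mean, variance and third central moment of \pi_t are the first three
% derivatives of the cumulant generating function \log\Phi.
\[
  m(t) = \frac{d}{dt}\log\Phi(t),
  \qquad
  s^{2}(t) = \frac{d^{2}}{dt^{2}}\log\Phi(t),
  \qquad
  \mu_{3}(t) = \frac{d^{3}}{dt^{3}}\log\Phi(t).
\]
```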

preprint, 2014, arXiv

Long runs under a conditional limit distribution

This paper presents a sharp approximation of the density of long runs of a random walk conditioned on its end value or by an average of a function of its summands as their number tends to infinity. In the large deviation range of the conditioning event it extends the Gibbs conditional principle in the sense that it provides a description of the distribution of the random walk on long subsequences. An approximation of the density of the runs is also obtained when the conditioning event states that the end value of the random walk belongs to a thin or a thick set with a nonempty interior. The approximations hold either in probability under the conditional distribution of the random walk, or in total variation norm between measures. An application of the approximation scheme to the evaluation of rare event probabilities through importance sampling is provided. When the conditioning event is in the range of the central limit theorem, it provides a tool for statistical inference in the sense that it produces an effective way to implement the Rao-Blackwell theorem for the improvement of estimators; it also leads to conditional inference procedures in models with nuisance parameters. An algorithm for the simulation of such long runs is presented, together with an algorithm determining the maximal length for which the approximation is valid up to a prescribed accuracy.

preprint, 2014, arXiv

Some overview on unbiased interpolation and extrapolation designs

This paper considers the construction of optimal designs due to Hoel and Levine and Guest. It focuses on the relation between the theory of the uniform approximation of functions and the optimality of the designs. An application to accelerated tests is also presented. The multivariate case is also handled in some special situations.

preprint, 2013, arXiv

Light tails: Gibbs conditional principle under extreme deviation

Let $X_{1},\ldots,X_{n}$ denote an i.i.d. sample with light tail distribution and $S_{1}^{n}$ denote the sum of its terms; let $a_{n}$ be a real sequence going to infinity with $n$. In a previous paper (\cite{BoniaCao}) it is proved that as $n\rightarrow\infty$, given $\left(S_{1}^{n}/n>a_{n}\right)$, all terms $X_{i}$ concentrate around $a_{n}$ with probability going to 1. This paper explores the asymptotic distribution of $X_{1}$ under the conditioning events $\left(S_{1}^{n}/n=a_{n}\right)$ and $\left(S_{1}^{n}/n\geq a_{n}\right)$. It is proved that under some regularity property, the asymptotic conditional distribution of $X_{1}$ given $\left(S_{1}^{n}/n=a_{n}\right)$ can be approximated in variation norm by the tilted distribution at point $a_{n}$, thereby extending the classical LDP case developed in Diaconis and Freedman (1988). Also under $\left(S_{1}^{n}/n\geq a_{n}\right)$ the dominating point property holds. It also considers the case when the $X_{i}$'s are $\mathbb{R}^{d}$-valued, $f$ is a real valued function defined on $\mathbb{R}^{d}$ and the conditioning event writes $\left(U_{1}^{n}/n=a_{n}\right)$ or $\left(U_{1}^{n}/n\geq a_{n}\right)$ with $U_{1}^{n}:=\left(f(X_{1})+\ldots+f(X_{n})\right)/n$ and $f(X_{1})$ has a light tail distribution. As a by-product some attention is paid to the estimation of high level sets of functions.

preprint, 2012, arXiv

A conditional limit theorem for random walks under extreme deviation

This paper explores a conditional Gibbs theorem for a random walk induced by i.i.d. $(X_{1},\ldots,X_{n})$ conditioned on an extreme deviation of its sum, $(S_{1}^{n}=na_{n})$ or $(S_{1}^{n}>na_{n})$, where $a_{n}\rightarrow\infty$. It is proved that when the summands have light tails with some additional regularity property, the asymptotic conditional distribution of $X_{1}$ can be approximated in variation norm by the tilted distribution at point $a_{n}$, thereby extending the classical LDP case.

preprint, 2012, arXiv

Conditional inference in parametric models

This paper presents a new approach to conditional inference, based on the simulation of samples conditioned by a statistic of the data. Also an explicit expression for the approximation of the conditional likelihood of long runs of the sample given the observed statistic is provided. It is shown that when the conditioning statistic is sufficient for a given parameter, the approximating density is still invariant with respect to the parameter. A new Rao-Blackwellisation procedure is proposed and simulation shows that the Lehmann-Scheffé theorem is valid for this approximation. Conditional inference for exponential families with nuisance parameter is also studied, leading to Monte Carlo tests. Finally the estimation of the parameter of interest through conditional likelihood is considered. Comparison with the parametric bootstrap method is discussed.

preprint, 2012, arXiv

Stretched random walks and the behaviour of their summands

This paper explores the joint behaviour of the summands of a random walk when their mean value goes to infinity as its length increases. It is proved that all the summands must share the same value, which extends previous results in the context of large exceedances of finite sums of i.i.d. random variables. Some consequences are drawn pertaining to the local behaviour of a random walk conditioned on a large deviation constraint on its end value. It is shown that the sample paths exhibit local oblique segments with increasing size and slope as the length of the random walk increases.

preprint, 2012, arXiv

Towards zero variance estimators for rare event probabilities

Improving Importance Sampling estimators for rare event probabilities requires sharp approximations of conditional densities. This is achieved for events $E_{n}:=(f(X_{1})+\ldots+f(X_{n}))\in A_{n}$ where the summands are i.i.d. and $E_{n}$ is a large or moderate deviation event. The approximation of the conditional density of the real r.v.'s $X_{i}$, for $1\leq i\leq k_{n}$, with respect to $E_{n}$ on long runs, when $k_{n}/n\to 1$, is handled. The maximal value of $k$ compatible with a given accuracy is discussed; algorithms and simulated results are presented.
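To make the Importance Sampling setting concrete, here is a minimal sketch of the classical baseline that such work aims to improve on: i.i.d. sampling from an exponentially tilted distribution. It is not the conditional-density scheme of the paper. The Gaussian summands, the event "sample mean exceeds a", the function names and the parameter values are all assumptions chosen so that the exact probability is available for comparison.

```python
import math
import random

def tilted_is_estimate(n, a, num_samples, seed=0):
    """Estimate P(mean of n standard normals > a) by importance sampling.

    Each summand is drawn from the tilted law N(a, 1) instead of N(0, 1),
    so that the rare event occurs about half of the time.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        s = sum(rng.gauss(a, 1.0) for _ in range(n))
        if s > n * a:
            # Likelihood ratio of N(0,1)^n against N(a,1)^n at the sampled point.
            total += math.exp(-a * s + n * a * a / 2.0)
    return total / num_samples

def exact_probability(n, a):
    """Exact value, since the mean of n standard normals is N(0, 1/n)."""
    return 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2.0))

# Hypothetical settings: 10 summands, threshold 1.0 on the sample mean.
estimate = tilted_is_estimate(n=10, a=1.0, num_samples=20000, seed=1)
exact = exact_probability(n=10, a=1.0)
```

With these settings the target probability is below one in a thousand, so a naive Monte Carlo run of the same size would observe only a handful of hits, while the tilted sampler hits the event on roughly half of its draws.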

preprint, 2012, arXiv

Weighted sampling, Maximum Likelihood and minimum divergence estimators

This paper explores Maximum Likelihood in parametric models in the context of Sanov type Large Deviation Probabilities. MLE in parametric models under weighted sampling is shown to be associated with the minimization of a specific divergence criterion defined with respect to the distribution of the weights. Some properties of the resulting inferential procedure are presented; Bahadur efficiency of tests is also considered in this context.

preprint, 2011, arXiv

An estimation method for the chi-square divergence with application to test of hypotheses

We propose a new definition of the chi-square divergence between distributions. Based on convexity properties and duality, this version of the χ^2 is well suited both for the classical applications of the χ^2 for the analysis of contingency tables and for the statistical tests for parametric models, for which it has been advocated to be robust against inliers. We present two applications in testing. In the first one we deal with tests for finite and infinite numbers of linear constraints, while, in the second one, we apply χ^2-methodology for parametric testing against contamination.

preprint, 2011, arXiv

Decomposable Pseudodistances and Applications in Statistical Estimation

The aim of this paper is to introduce new statistical criteria for estimation, suitable for inference in models with common continuous support. This proposal follows directly from a renewed interest in divergence-based inference tools embedding the most classical ones, such as maximum likelihood, Chi-square or Kullback-Leibler. General pseudodistances with decomposable structure are considered, which allow minimum pseudodistance estimators to be defined without using nonparametric density estimators. A special class of pseudodistances indexed by $\alpha>0$, leading for $\alpha\downarrow 0$ to the Kullback-Leibler divergence, is presented in detail. Corresponding estimation criteria are developed and asymptotic properties are studied. The estimation method is then extended to regression models. Finally, some examples based on Monte Carlo simulations are discussed.

preprint, 2011, arXiv

Divergences and Duality for Estimation and Test under Moment Condition Models

We introduce estimation and test procedures through divergence minimization for models satisfying linear constraints with unknown parameter. These procedures extend the empirical likelihood (EL) method and share common features with generalized empirical likelihood approach. We treat the problems of existence and characterization of the divergence projections of probability distributions on sets of signed finite measures. We give a precise characterization of duality, for the proposed class of estimates and test statistics, which is used to derive their limiting distributions (including the EL estimate and the EL ratio statistic) both under the null hypotheses and under alternatives or misspecification. An approximation to the power function is deduced as well as the sample size which ensures a desired power for a given alternative.

preprint, 2011, arXiv

Long runs under point conditioning. The real case

This paper presents a sharp approximation of the density of long runs of a random walk conditioned on its end value or by an average of a function of its summands as their number tends to infinity. The conditioning event is of moderate or large deviation type. The result extends the Gibbs conditional principle in the sense that it provides a description of the distribution of the random walk on long subsequences. An algorithm for the simulation of such long runs is presented, together with an algorithm determining the maximal length for which the approximation is valid up to a prescribed accuracy.

preprint, 2011, arXiv

Minimum divergence estimators, maximum likelihood and exponential families

In this note we prove the dual representation formula of the divergence between two distributions in a parametric model. Resulting estimators for the divergence and for the parameter are derived. These estimators do not make use of any grouping or smoothing. It is proved that all differentiable divergences induce the same estimator of the parameter on any regular exponential family, which is nothing else but the MLE.

preprint, 2011, arXiv

Upper bounds for the error in some interpolation and extrapolation designs

This paper deals with probabilistic upper bounds for the error in functional estimation defined on some interpolation and extrapolation designs, when the function to estimate is assumed to be analytic. The error pertaining to the estimate may depend on various factors: the frequency of observations on the knots, the position and number of the knots, and also on the error committed when approximating the function through its Taylor expansion. When the number of observations is fixed, all these parameters are determined by the choice of the design and by the choice of the estimator of the unknown function. The scope of the paper is therefore to determine a rule for the minimal number of observations required to achieve an upper bound of the error on the estimate with a given maximal probability.

preprint, 2010, arXiv

Bivariate Cox model and copulas

This paper introduces a new class of Cox models for dependent bivariate data. The impact of the covariate on the dependence of the variables is captured through the modification of their copula. Various classes of well known copulas are stable under the model (Archimedean type and extreme value copulas), meaning that the covariate acts in a simple and explicit way on the copula within the class; specific parametric classes are considered.

preprint, 2010, arXiv

Minimization of divergences on sets of signed measures

We consider the minimization problem of $\phi$-divergences between a given probability measure $P$ and subsets $\Omega$ of the vector space $\mathcal{M}_\mathcal{F}$ of all signed finite measures which integrate a given class $\mathcal{F}$ of bounded or unbounded measurable functions. The vector space $\mathcal{M}_\mathcal{F}$ is endowed with the weak topology induced by the class $\mathcal{F}\cup \mathcal{B}_b$ where $\mathcal{B}_b$ is the class of all bounded measurable functions. We treat the problems of existence and characterization of the $\phi$-projections of $P$ on $\Omega$. We consider also the dual equality and the dual attainment problems when $\Omega$ is defined by linear constraints.
