Researcher profile

Bernard Ycart

Bernard Ycart contributes to research discovery and scholarly infrastructure.

Published work

18 published item(s)

Preprint, 2015, arXiv

Gärtner-Ellis condition for squared asymptotically stationary Gaussian processes

The Gärtner-Ellis condition for the square of an asymptotically stationary Gaussian process is established. The same limit holds for the conditional distribution given any fixed initial point, which entails weak multiplicative ergodicity. The limit is shown to be the Laplace transform of a convolution of Gamma distributions with a Poisson compound of exponentials. A proof based on Wiener-Hopf factorization induces a probabilistic interpretation of the limit in terms of a regression problem.

Preprint, 2014, arXiv

1827 : la mode de la statistique en France; origine, extension, personnages

Largely independent of the scientific development of the discipline, a fashion for statistics developed in France from 1827 on. It was probably sparked by Charles Dupin's 'Carte figurative de l'instruction populaire', with its famous Saint-Malo Geneva line, supposed to separate the educated North from the ignorant South. It became attractive to produce, under the name 'statistics', more or less quantitative descriptions of any subject. Beyond literary records, the phenomenon can be measured by its semantic penetration in the press. Although the ambition of most of these amateurs remained strictly descriptive, some of them did raise the issue of proving through numbers. This is particularly remarkable, since within institutional science the techniques of statistical proof, introduced by Laplace at the end of the 18th century, remained largely ignored for a very long time.

Preprint, 2014, arXiv

Jakob Bielfeld (1717–1770) and the diffusion of statistical concepts in eighteenth century Europe

Published between 1760 and 1770, Bielfeld's writings prove that scholars of the time were acquainted with the concepts of both political arithmetic and German statistik, long before they merged into a new discipline at the beginning of the following century. It is argued here that these works may have been an important source of diffusion of statistical concepts at the end of the eighteenth century. Bielfeld is now almost completely forgotten, and the reasons for his lack of posthumous fame are examined.

Preprint, 2014, arXiv

Large scale statistical analysis of GEO datasets

The problem addressed here is that of simultaneous treatment of several gene expression datasets, possibly collected under different experimental conditions and/or platforms. Using robust statistics, a large scale statistical analysis has been conducted over 20 datasets downloaded from the Gene Expression Omnibus repository. The differences between datasets are compared to the variability inside a given dataset. Evidence that meaningful biological information can be extracted by merging different sources is provided.

Preprint, 2014, arXiv

Simultaneous growth of two cancer cell lines evidences variability in growth rates

Cancer cells co-cultured in vitro reveal unexpected differential growth rates that classical exponential growth models cannot account for. Two non-interacting cell lines were grown in the same culture, and counts of each species were recorded at periodic times. The relative growth of population ratios was found to depend on the initial proportion, in contradiction with the traditional exponential growth model. The proposed explanation is the variability of growth rates for clones inside the same cell line. This leads to a log-quadratic growth model that provides both a theoretical explanation of the observed phenomenon and a better fit to our growth data.
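One way to see how variable clone growth rates could give a log-quadratic curve is sketched below. It assumes, purely for illustration, that clone growth rates are normally distributed with mean `mu` and standard deviation `sigma`; the abstract does not state the distribution. Under that assumption the expected population size is proportional to exp(mu*t + sigma^2*t^2/2), so its logarithm is quadratic in t. All names and parameter values are hypothetical.

```python
import math
import random


def log_mean_growth(mu, sigma, t, n_clones=50_000, seed=0):
    """Monte Carlo estimate of log E[exp(r * t)] for clone rates r ~ Normal(mu, sigma)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_clones):
        r = rng.gauss(mu, sigma)   # growth rate of one clone (assumed Gaussian)
        total += math.exp(r * t)   # each clone grows exponentially at its own rate
    return math.log(total / n_clones)


def log_quadratic(mu, sigma, t):
    """Closed form under the Gaussian assumption: mu*t + sigma^2 * t^2 / 2."""
    return mu * t + 0.5 * (sigma * t) ** 2
```

Comparing `log_mean_growth(0.5, 0.1, t)` with `log_quadratic(0.5, 0.1, t)` for a few values of `t` shows the simulated curve bending away from the straight line `mu * t` that a single common growth rate would give.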

Preprint, 2014, arXiv

Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis

Gene Set Enrichment Analysis (GSEA) is a basic tool for genomic data treatment. From a statistical point of view, the centering of its test statistic does not allow the derivation of asymptotic results. A test statistic with a different centering is proposed. Under the null hypothesis, the convergence in distribution of the new test statistic is proved, using the theory of empirical processes. The limiting distribution can be computed by Monte-Carlo simulation. The test defined in this way has been called the Weighted Kolmogorov Smirnov (WKS) test. Because a single evaluation of the asymptotic distribution serves for many different gene sets, computing times are shorter. Using expression data from the GEO repository, tested against the MSig Database C2, a comparison between the classical GSEA test and the new procedure has been conducted. Our conclusion is that, beyond its mathematical and algorithmic advantages, the WKS test could in many cases be more informative than the classical GSEA test.

Preprint, 2013, arXiv

Exponential growth of bifurcating processes with ancestral dependence

Branching processes are classical growth models in cell kinetics. In their construction, it is usually assumed that cell lifetimes are independent random variables, which has been proved false in experiments. Models of dependent lifetimes are considered here, in particular bifurcating Markov chains. Under hypotheses of stationarity and multiplicative ergodicity, the corresponding branching process is proved to have the same type of asymptotics as its classic counterpart in the i.i.d. supercritical case: the cell population grows exponentially, and the growth rate is related to the exponent of multiplicative ergodicity in the same way as it is related to the Laplace transform of lifetimes in the i.i.d. case. An identifiable model for which the multiplicative ergodicity coefficients and the growth rate can be explicitly computed is proposed.
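For the i.i.d. case mentioned above, the classical result for a binary-splitting Bellman-Harris process is that the exponential growth rate nu solves 2 E[exp(-nu T)] = 1, where T is a cell lifetime. The sketch below solves that equation by bisection. It covers only the i.i.d. benchmark, not the dependent-lifetime model of the paper, and the function names are hypothetical.

```python
import math


def malthusian_rate(laplace, lo=1e-9, hi=1e3, tol=1e-12):
    """Solve 2 * laplace(nu) = 1 for nu by bisection.

    `laplace(nu)` must return E[exp(-nu * T)] for the lifetime T. It is
    decreasing in nu, so 2 * laplace(nu) - 1 changes sign once on (lo, hi).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 2.0 * laplace(mid) > 1.0:
            lo = mid   # nu is still too small
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def exponential_lifetime(rate):
    """Laplace transform of an exponential lifetime with the given rate."""
    return lambda nu: rate / (rate + nu)


def constant_lifetime(tau):
    """Laplace transform of a deterministic lifetime tau."""
    return lambda nu: math.exp(-nu * tau)
```

With exponential lifetimes of rate `r` the solution is `r` itself, and with a constant lifetime `tau` it is `log(2) / tau`, which gives two quick sanity checks on the solver.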

Preprint, 2013, arXiv

Fluctuation analysis with cell deaths

The classical Luria-Delbrück model for fluctuation analysis is extended to the case where cells can either divide or die at the end of their generation time. This leads to a family of probability distributions generalizing the Luria-Delbrück family, and depending on three parameters: the expected number of mutations, the relative fitness of normal cells compared to mutants, and the death probability of mutants. The probabilistic treatment is similar to that of the classical case; simulation and computing algorithms are provided. The estimation problem is discussed: if the death probability is known, the two other parameters can be reliably estimated. If the death probability is unknown, the model can be identified only for large samples.
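A small side calculation may help in reading the death-probability parameter. If a cell dies with probability `delta` and otherwise divides in two at the end of its generation time, the clone it founds is a Galton-Watson tree with 0 or 2 offspring, and its extinction probability is the smallest root of q = delta + (1 - delta) q^2. The sketch below is an illustration under that reading of the abstract, not one of the algorithms the paper provides; function names are hypothetical.

```python
def extinction_probability(delta):
    """Smallest root in [0, 1] of q = delta + (1 - delta) * q**2."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must be a probability")
    if delta >= 0.5:
        return 1.0   # critical or subcritical: the clone dies out almost surely
    return delta / (1.0 - delta)


def extinction_by_iteration(delta, n_iter=10_000):
    """Same quantity via the fixed-point iteration q -> delta + (1 - delta) * q**2 from q = 0."""
    q = 0.0
    for _ in range(n_iter):
        q = delta + (1.0 - delta) * q * q
    return q
```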

Preprint, 2013, arXiv

Fluctuation analysis: can estimates be trusted?

The estimation of mutation probabilities and relative fitnesses in fluctuation analysis is based on the unrealistic hypothesis that the single-cell times to division are exponentially distributed. Using the classical Luria-Delbrück distribution outside its modelling hypotheses induces a substantial bias in the estimation of the relative fitness. The model is extended here to any division time distribution. Mutant counts follow a generalization of the Luria-Delbrück distribution, which depends on the mean number of mutations, the relative fitness of normal cells compared to mutants, and the division time distribution of mutant cells. Empirical probability generating function techniques yield precise estimates both of the mean number of mutations and of the relative fitness of normal cells compared to mutants. In the case where no information is available on the division time distribution, it is shown that the estimation procedure using constant division times yields more reliable results. Numerical results on both observed and simulated data are reported.

Preprint, 2013, arXiv

Statistical data mining for symbol associations in genomic databases

A methodology is proposed to automatically detect significant symbol associations in genomic databases. A new statistical test is proposed to assess the significance of a group of symbols when found in several genesets of a given database. Applied to symbol pairs, the thresholded p-values of the test define a graph structure on the set of symbols. The cliques of that graph are significant symbol associations, linked to a set of genesets where they can be found. The method can be applied to any database, and is illustrated on the MSigDB C2 database. Many of the symbol associations detected in C2 or in non-specific selections correspond to already known interactions. On more specific selections of C2, many previously unknown symbol associations have been detected. These associations unveil new candidates for gene or protein interactions, needing further investigation for biological evidence.
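The graph step described above (threshold pairwise p-values, then read associations off the cliques) can be sketched as follows. This is an illustration only: the statistical test that produces the p-values is not reproduced, the p-values are taken as given input, and the clique search uses the standard Bron-Kerbosch enumeration, which the abstract does not name. Function names are hypothetical.

```python
def association_graph(pair_pvalues, threshold):
    """Adjacency sets: two symbols are linked when their pair p-value is below threshold."""
    graph = {}
    for (a, b), p in pair_pvalues.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if p < threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))   # nothing can extend it: maximal
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}        # v is done; do not revisit it
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques
```

Each returned clique is a set of symbols that are pairwise significant at the chosen threshold; symbols with no significant partner come back as singleton cliques and can be filtered out.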

Preprint, 2012, arXiv

A case of mathematical eponymy: the Vandermonde determinant

We study the historical process that led to the worldwide adoption, throughout mathematical research papers and textbooks, of the denomination "Vandermonde determinant". The mathematical object can be related to two passages in Vandermonde's writings, of which one inspired Cauchy's definition of determinants. Influential citations of Cauchy and Jacobi may have initiated the naming process. It started during the second half of the 19th century as a pedagogical practice in France. The spread in textbooks and research journals began during the first half of the 20th century, and only reached full acceptance after the 1960s. The naming process is still ongoing, in the sense that the volume of publications using the denomination grows significantly faster than the overall volume of the field.

Preprint, 2012, arXiv

Alberti's letter counts

Four centuries before modern statistical linguistics was born, Leon Battista Alberti (1404–1472) compared the frequency of vowels in Latin poems and orations, making the first quantified observation of a stylistic difference ever. Using a corpus of 20 Latin texts (over 5 million letters), Alberti's observations are statistically assessed. Letter counts prove that poets used significantly more a's, e's, and y's, whereas orators used more of the other vowels. The sample sizes needed to justify the assertions are studied, and proved to be within reach for Alberti's scholarship.
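A comparison of this kind can be sketched with a pooled two-proportion z statistic on the share of one letter among all letters of two texts. The choice of test is an assumption made for illustration; the abstract does not say which test was used, and the function names are hypothetical.

```python
import math
from collections import Counter


def letter_counts(text):
    """Count alphabetic characters, case-insensitively."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def two_proportion_z(k1, n1, k2, n2):
    """z statistic for equal proportions k1/n1 and k2/n2, using the pooled estimate."""
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (k1 / n1 - k2 / n2) / se


def compare_letter(letter, text_a, text_b):
    """z statistic comparing the share of `letter` among all letters of two texts."""
    ca, cb = letter_counts(text_a), letter_counts(text_b)
    return two_proportion_z(ca[letter], sum(ca.values()), cb[letter], sum(cb.values()))
```

A positive value of `compare_letter("a", poem, oration)` would mean the letter is relatively more frequent in the first text. The statistic is undefined when the letter is absent from both texts, since the pooled standard error is then zero.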

Preprint, 2012, arXiv

Letter counting: a stem cell for Cryptology, Quantitative Linguistics, and Statistics

Counting letters in written texts is a very ancient practice. It has accompanied the development of Cryptology, Quantitative Linguistics, and Statistics. In Cryptology, counting frequencies of the different characters in an encrypted message is the basis of the so-called frequency analysis method. In Quantitative Linguistics, the proportion of vowels to consonants in different languages was studied long before authorship attribution. In Statistics, the alternation of vowels and consonants was the only example that Markov ever gave of his theory of chained events. A short history of letter counting is presented. The three domains, Cryptology, Quantitative Linguistics, and Statistics, are then examined, focusing on the interactions with the other two fields through letter counting. In conclusion, the eclecticism of scholars of past centuries, their background in the humanities, and their familiarity with cryptograms are identified as contributing factors to the mutual enrichment process described here.

Preprint, 2012, arXiv

Statistics for the Luria-Delbrück distribution

The Luria-Delbrück distribution is a classical model of mutations in cell kinetics. It is obtained as a limit when the probability of mutation tends to zero and the number of divisions to infinity. It can be interpreted as a compound Poisson distribution (for the number of mutations) of exponential mixtures (for the developing time of mutant clones) of geometric distributions (for the number of cells produced by a mutant clone in a given time). The probabilistic interpretation, and a rigorous proof of convergence in the general case, are deduced from classical results on Bellman-Harris branching processes. The two parameters of the Luria-Delbrück distribution are the expected number of mutations, which is the parameter of interest, and the relative fitness of normal cells compared to mutants, which is the heavy tail exponent. Both can be simultaneously estimated by the maximum likelihood method. However, the computation becomes numerically unstable as soon as the maximal value of the sample is large, which occurs frequently due to the heavy tail property. Based on the empirical generating function, robust estimators are proposed and their asymptotic variance is given. They are comparable in precision to maximum likelihood estimators, with a much broader range of calculability, better numerical stability, and negligible computing time.
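The compound structure described above can be sketched as a small simulation. This is an illustration, not the authors' code. It assumes a Poisson number of mutations with mean `alpha`, clone development times that are exponential with rate `rho`, and a clone observed after time `t` having a geometric size on {1, 2, ...} with success probability exp(-t). Function names are hypothetical, and the empirical generating function is shown only as the raw quantity the abstract mentions, not as the proposed estimators.

```python
import math
import random


def poisson(rng, mean):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def geometric(rng, p):
    """Geometric draw on {1, 2, ...} with success probability p, by inversion."""
    if p <= 0.0:
        raise ValueError("p must be positive")
    if p >= 1.0:
        return 1
    u = 1.0 - rng.random()   # uniform on (0, 1], so log(u) is defined
    return int(math.log(u) / math.log1p(-p)) + 1


def luria_delbruck_draw(rng, alpha, rho):
    """One mutant count: Poisson(alpha) clones, each Geometric(exp(-T)) with T ~ Exp(rho)."""
    total = 0
    for _ in range(poisson(rng, alpha)):
        t = rng.expovariate(rho)   # development time of this clone (assumed model)
        total += geometric(rng, math.exp(-t))
    return total


def empirical_pgf(sample, z):
    """Empirical probability generating function: mean of z**x over the sample."""
    return sum(z ** x for x in sample) / len(sample)
```

Under these assumptions the probability of observing no mutant is exp(-alpha), so minus the logarithm of the fraction of zero counts in a simulated sample gives a rough check on `alpha`.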