Source author record

Rodrigo Labouriau

Rodrigo Labouriau appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Applications, Machine Learning, math.CA, math.FA, math.ST, Methodology, Molecular Networks, stat.OT, Statistics Theory

Catalog footprint

What is connected

7 works

9 topics

4 close collaborators

Preprint, 2021, arXiv

A Multivariate Methodology for Analysing Students' Performance Using Register Data

We present a new method for jointly modelling students' results in the university's admission exams and their performance in subsequent courses at the university. The case considered involves all the students who enrolled in 2014 at the University of Campinas in evening study programs in fields related to the exact sciences. We collected the number of attempts each of those students needed to pass the university course in geometry and their results in the admission exams in seven disciplines. The method combines multivariate generalised linear mixed models (GLMM) with graphical models that represent the covariance structure of the random components. The models we used allowed us to discuss the association between quantities of very different natures. We used Gaussian GLMMs to model performance in the admission exams and a discrete-time Cox proportional hazards model with frailty, represented by a GLMM, to describe the number of attempts needed to pass Geometry. The analyses were stratified into two populations: the students who received a bonus giving advantages in the university's admission process to compensate for social and racial inequalities, and those who did not receive the compensation. The two populations presented different patterns. Using general properties of graphical models, we argue that, on the one hand, the predicted performance in the Mathematics admission exam could be used as the sole predictor of performance in geometry for the students who received the bonus. On the other hand, the predicted performance in the Portuguese admission exam could be used as the single predictor of performance in geometry for the students who did not receive the bonus.
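
The abstract describes the number of attempts through a discrete-time survival model written as a GLMM. A common way to set up such a model (not necessarily the authors' exact construction) is to expand each student's record into one binary row per attempt, so that a binary-response GLM with a complementary log-log link reproduces the discrete-time proportional hazards likelihood. The sketch below shows only the expansion step; the function name, the row layout and the `passed` censoring flag are hypothetical.

```python
def expand_attempts(student_id, attempts, passed):
    """Expand one student's record into one row per attempt.

    Hypothetical layout: each row is (student_id, attempt_number, event),
    where event is 1 only on the attempt at which the course was passed.
    If `passed` is False the record is treated as censored after `attempts`
    attempts, so every row has event 0.
    """
    rows = []
    for k in range(1, attempts + 1):
        event = 1 if (passed and k == attempts) else 0
        rows.append((student_id, k, event))
    return rows


# A student who passed on the third attempt contributes three binary rows.
example_rows = expand_attempts("s1", 3, True)
```

The expanded rows can then be handed to any mixed-model routine that supports binary responses, with the attempt number entering as a factor and the student (or a shared random component) as the frailty term.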

Preprint, 2020, arXiv

An introduction to Bent Jorgensen's ideas

We briefly present some key aspects of the theory and use of dispersion models, for which Bent Jorgensen played a crucial role as a driving force and a source of inspiration. Starting with the general notion of dispersion models, built using minimalistic mathematical assumptions, we specialize in two classes of families of distributions with different statistical flavors: exponential dispersion and proper dispersion models. The construction of dispersion models involves the solution of integral equations that are, in general, intractable. These difficulties disappear when more mathematical structure is assumed: the problem reduces to the calculation of a moment generating function or of a Riemann-Stieltjes integral for the exponential dispersion and the proper dispersion models, respectively. A new technique for constructing dispersion models based on characteristic functions is introduced, turning the integral equations above into a tractable convolution equation and yielding examples of dispersion models that are neither proper dispersion nor exponential dispersion models. A corollary is that the cardinalities of the classes of regular and non-regular dispersion models are both large. Some selected applications are discussed, including exponential family non-linear models (of which generalized linear models are particular cases) and several models for clustered and dependent data based on a latent Lévy process.

Preprint, 2020, arXiv

Construction and Extension of Dispersion Models

There are two main classes of dispersion models studied in the literature: proper dispersion models (PDM) and exponential dispersion models (EDM). Dispersion models that are neither proper nor exponential dispersion models are termed here non-standard dispersion models (NSDM). This paper presents a technique for constructing new PDMs and NSDMs. This construction provides a solution to an open question in the theory of dispersion models about the extension of non-standard dispersion models. Given a unit deviance function, a dispersion model is usually constructed by calculating a normalising function that makes the density function integrate to one. This calculation involves the solution of non-trivial integral equations. The main idea explored here is to use characteristic functions of real non-lattice symmetric probability measures to construct a family of unit deviances that are sufficiently regular to make the associated integral equations tractable. The integral equations associated with those unit deviances admit a trivial solution, in the sense that the normalising function is a constant function independent of the observed values. However, we show, using the machinery of distributions (i.e., generalised functions) and expansions of the normalising function with respect to specially constructed Riesz systems, that those integral equations also admit infinitely many non-trivial solutions, generating many NSDMs. We conclude that the cardinality of the class of non-standard dispersion models is larger than the cardinality of the class of real non-lattice symmetric probability measures.
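
The integral equation mentioned in the abstract can be written out as follows. This is a sketch in the notation commonly used for dispersion models, which is an assumption here since the abstract does not fix symbols: $d(y;\mu)$ is the unit deviance, $\sigma^2$ the dispersion parameter and $a(y;\sigma^2)$ the normalising function.

```latex
% Assumed notation: d = unit deviance, \sigma^2 = dispersion parameter,
% a = normalising function. The second display is the integral equation
% that a must satisfy for the density to integrate to one.
\[
  p(y;\mu,\sigma^2) = a(y;\sigma^2)\,
    \exp\!\Bigl\{-\frac{1}{2\sigma^2}\,d(y;\mu)\Bigr\},
\]
\[
  \int a(y;\sigma^2)\,
    \exp\!\Bigl\{-\frac{1}{2\sigma^2}\,d(y;\mu)\Bigr\}\,dy = 1
  \quad\text{for all } \mu .
\]
% Familiar special case: the normal distribution, where the normalising
% function does not depend on the observed value y.
\[
  d(y;\mu) = (y-\mu)^2, \qquad a(y;\sigma^2) = (2\pi\sigma^2)^{-1/2}.
\]
```

The normal case illustrates what a normalising function that is constant in the observed values looks like; the non-trivial solutions discussed in the abstract are those for which $a$ genuinely depends on $y$.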

Preprint, 2016, arXiv

The Laplace transform and polynomial approximation in L2

This short note gives a sufficient condition for the class of polynomials to be dense in the space of square integrable functions with respect to a finite measure dominated by the Lebesgue measure on the real line, here denoted by $L^2$. It is shown that if the Laplace transform of the measure in play is bounded in a neighbourhood of the origin, then the moments of all orders are finite and the class of polynomials is dense in $L^2$. The existence of the moments of all orders is well known for the case where the measure is concentrated on the positive real line (see Feller, 1966), but the result concerning the polynomial approximation is original, even though the proof is relatively simple. Additionally, an alternative, stronger condition is given that is easier to verify and does not involve calculating the Laplace transform. The condition essentially says that the density of the measure should have exponentially decaying tails. The tools presented are of interest for constructing semiparametric extensions of classic parametric models.
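
The first half of the claim (finite moments of all orders) follows from an elementary bound, sketched here. The symbols $\mu$, $L$ and $s$, and the two-sided sign convention for the Laplace transform, are assumptions made for illustration and are not taken from the note; the density of the polynomials is the harder part and is not reproduced.

```latex
% Assumed notation: \mu is a finite measure on \mathbb{R} and
% L(t) = \int_{\mathbb{R}} e^{tx}\,\mu(dx) is its two-sided Laplace transform.
% Assumption: L(s) and L(-s) are finite for some s > 0.
% Step 1: e^{s|x|} \ge (s|x|)^n / n! gives a pointwise bound on |x|^n.
\[
  |x|^n \;\le\; \frac{n!}{s^n}\, e^{s|x|}
        \;\le\; \frac{n!}{s^n}\,\bigl(e^{sx} + e^{-sx}\bigr),
  \qquad n = 0, 1, 2, \dots
\]
% Step 2: integrate the bound with respect to \mu.
\[
  \int_{\mathbb{R}} |x|^n\,\mu(dx)
  \;\le\; \frac{n!}{s^n}\,\bigl(L(s) + L(-s)\bigr) \;<\; \infty .
\]
```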

Preprint, 2014, arXiv

A Note on the Identifiability of Generalized Linear Mixed Models

I present here a simple proof that, under general regularity conditions, the standard parametrization of generalized linear mixed models is identifiable. The proof is based on the assumptions that generalized linear mixed models make on the first and second order moments, together with some mild general regularity conditions, and is therefore extensible to quasi-likelihood based generalized linear models. In particular, binomial and Poisson mixed models with a dispersion parameter are identifiable when equipped with the standard parametrization.

Preprint, 2014, arXiv

Multivariate Survival Mixed Models for Genetic Analysis of Longevity Traits

A class of multivariate mixed survival models for continuous and discrete time with a complex covariance structure is introduced in the context of quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics, although the discussion presented concentrates on longevity studies. The framework presented allows models based on continuous time to be combined with models based on discrete time in a joint analysis. The continuous time models are approximations of the frailty model in which the hazard function is assumed to be piecewise constant. The discrete time models used are multivariate variants of the discrete relative risk models. These models allow for regular parametric likelihood-based inference by exploiting a coincidence of their likelihood functions with the likelihood functions of suitably defined multivariate generalized linear mixed models. The models include a dispersion parameter, which is essential for obtaining a decomposition of the variance of the trait of interest as a sum of terms representing the additive genetic effects, environmental effects and unspecified sources of variability, as required in quantitative genetic applications. The methods presented are implemented in such a way that large and complex quantitative genetic data can be analyzed.

Preprint, 2010, arXiv

Characterization of differentially expressed genes using high-dimensional co-expression networks

We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. The set-up of Gaussian graphical models is used to construct representations of the co-expression network in such a way that redundancy and the propagation of spurious information along the network are avoided. The proposed inference procedure is based on the minimization of the Bayesian Information Criterion (BIC) in the class of decomposable graphical models. This class of models can be used to represent complex relationships and has suitable properties that allow effective inference in problems with a high degree of complexity (e.g. several thousands of genes) and a small number of observations (e.g. 10-100), as typically occur in high-throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we construct a compact representation of the co-expression network that allows the identification of regions with a high concentration of differentially expressed genes. It is argued that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than differentially expressed genes located in less interconnected regions. Based on that idea, a measure of uncertainty that resembles the notion of relative entropy is proposed. Our methods are illustrated with three publicly available data sets on microarray experiments (the largest involving more than 50,000 genes and 64 patients) and a short simulation study.
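
To make the BIC criterion concrete: for a Gaussian graphical model on a decomposable graph, the maximized log-likelihood factorizes over the cliques and separators of the graph, so the BIC of a candidate graph can be computed from small sub-blocks of the sample covariance matrix. The sketch below illustrates that standard factorization only; it is not the authors' search procedure. All function names are hypothetical, the cliques and separators are assumed to come from a junction tree of the graph (separators listed with multiplicity), and pure standard-library code is used to keep the example self-contained.

```python
import math


def covariance_mle(rows):
    """Maximum-likelihood covariance matrix (divides by n, not n - 1)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(p)] for i in range(p)]


def log_det(matrix, idx):
    """Log-determinant of the sub-block indexed by idx, via Cholesky.

    Returns 0.0 for an empty index list. math.sqrt/math.log raise
    ValueError if the sub-block is not positive definite.
    """
    k = len(idx)
    chol = [[0.0] * k for _ in range(k)]
    total = 0.0
    for a in range(k):
        for b in range(a + 1):
            s = matrix[idx[a]][idx[b]] - sum(
                chol[a][c] * chol[b][c] for c in range(b))
            if a == b:
                chol[a][a] = math.sqrt(s)
                total += 2.0 * math.log(chol[a][a])
            else:
                chol[a][b] = s / chol[b][b]
    return total


def decomposable_bic(rows, cliques, separators):
    """BIC of a Gaussian graphical model on a decomposable graph.

    rows: list of observations (each a list of p numbers).
    cliques, separators: lists of lists of variable indices, assumed to
    come from a junction tree of the graph.
    """
    n, p = len(rows), len(rows[0])
    cov = covariance_mle(rows)
    # Maximized log-likelihood: clique terms minus separator terms.
    log_lik = -0.5 * n * (p * math.log(2 * math.pi) + p
                          + sum(log_det(cov, c) for c in cliques)
                          - sum(log_det(cov, s) for s in separators))
    # Free parameters: p means, p variances, one term per edge.
    edges = {(i, j) for c in cliques for i in c for j in c if i < j}
    n_params = 2 * p + len(edges)
    return -2.0 * log_lik + math.log(n) * n_params


# Toy data (made up for illustration): 6 observations of 3 variables.
data = [[1.0, 2.0, 0.5], [2.0, 1.5, 1.0], [3.0, 3.5, 2.5],
        [4.0, 3.0, 2.0], [5.0, 5.5, 4.5], [6.0, 5.0, 3.0]]
bic_full = decomposable_bic(data, [[0, 1, 2]], [])
bic_chain = decomposable_bic(data, [[0, 1], [1, 2]], [[1]])
```

Because each term involves only a clique or a separator, the computation stays feasible when the number of variables is far larger than the number of observations, provided every clique is smaller than the sample size.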
