Source author record

Ana Arribas-Gil

Ana Arribas-Gil appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record

Methodology, math.ST, Statistics Theory, Applications, Computation, Genomics, Quantitative Methods

Catalog footprint

What is connected

7 works

7 topics

4 close collaborators

Preprint, 2014, arXiv

Bayesian Regression Analysis of Data with Random Effects Covariates from Nonlinear Longitudinal Measurements

Joint models for a wide class of response variables and longitudinal measurements consist of a mixed-effects model to fit longitudinal trajectories whose random effects enter as covariates in a generalized linear model for the primary response. They provide a useful way to assess the association between these two kinds of data, which in clinical studies are often collected jointly on a series of individuals, and may help in understanding, for instance, the mechanisms of recovery from a certain disease or the efficacy of a given therapy. The most common joint model in this framework is based on a linear mixed model for the longitudinal data. However, for complex datasets the linearity assumption may be too restrictive. Some works have considered generalizing this setting with the use of a nonlinear mixed-effects model for the longitudinal trajectories, but the proposed estimation procedures based on likelihood approximations have been shown (De la Cruz et al., 2011) to exhibit some computational efficiency problems. In this article we propose an MCMC-based estimation procedure in the joint model with a nonlinear mixed-effects model for the longitudinal data and a generalized linear model for the primary response. Moreover, we consider that the errors in the longitudinal model may be correlated. We apply our method to the analysis of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. We also conduct a simulation study to assess the importance of modelling correlated errors and quantify the consequences of model misspecification.
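The two-part structure described in this abstract can be written out as a rough sketch. The notation below is an assumption made for illustration and is not taken from the paper: y_{ij} is the j-th longitudinal measurement of subject i at time t_{ij}, f is a nonlinear trajectory function, b_i are the subject-level random effects, z_i is the primary response, g is a link function, and R_i is a correlation matrix for the within-subject errors.

```latex
% Longitudinal submodel: nonlinear mixed-effects model, errors possibly correlated
y_{ij} = f(t_{ij};\, b_i) + \varepsilon_{ij},
\qquad b_i \sim N(\mu, \Sigma),
\qquad \varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{i n_i})^\top \sim N(0,\, \sigma^2 R_i)

% Primary-response submodel: generalized linear model with the random effects as covariates
g\bigl( \mathbb{E}[\, z_i \mid b_i \,] \bigr) = \beta_0 + \beta^\top b_i
```

Under this reading, taking R_i as the identity matrix gives the uncorrelated-error case, which is presumably the kind of misspecification the simulation study examines.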

Preprint, 2013, arXiv

Shape Outlier Detection and Visualization for Functional Data: the Outliergram

We propose a new method to visualize and detect shape outliers in samples of curves. In functional data analysis we observe curves defined over a given real interval, and shape outliers are those curves that exhibit a different shape from the rest of the sample. Whereas magnitude outliers, that is, curves that exhibit atypically high or low values at some points or across the whole interval, are in general easy to identify, shape outliers are often masked among the rest of the curves and thus difficult to detect. In this article we exploit the relation between two depths for functional data to help visualize curves in terms of shape and to develop an algorithm for shape outlier detection. We illustrate the use of the visualization tool, the outliergram, through several examples and assess the performance of the algorithm in a simulation study. We apply them to the detection of outliers in a children's growth dataset in which the girls' sample is contaminated with boys' curves and vice versa.
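As a rough illustration of the two quantities such a plot can be built on, the sketch below computes a modified band depth (bands formed by pairs of curves) and a modified epigraph index for curves sampled on a common grid. It assumes these are the two depth measures the abstract refers to. The function names are hypothetical, the pairwise loop is a brute-force implementation chosen for readability, and the outlier detection rule itself is not reproduced here.

```python
from itertools import combinations

import numpy as np


def modified_band_depth(curves):
    """Modified band depth using bands delimited by pairs of curves.

    `curves` has shape (n_curves, n_points); all curves share one grid.
    For each curve, average over all pairs (j, k) the proportion of grid
    points at which the curve lies between curves j and k.
    """
    x = np.asarray(curves, dtype=float)
    n = x.shape[0]
    pairs = list(combinations(range(n), 2))
    depth = np.zeros(n)
    for j, k in pairs:
        lower = np.minimum(x[j], x[k])
        upper = np.maximum(x[j], x[k])
        inside = (x >= lower) & (x <= upper)  # shape (n_curves, n_points)
        depth += inside.mean(axis=1)
    return depth / len(pairs)


def modified_epigraph_index(curves):
    """Average proportion of (curve, grid point) pairs lying on or above each curve."""
    x = np.asarray(curves, dtype=float)
    # above[i, j, t] is True where curve j is on or above curve i at point t
    above = x[None, :, :] >= x[:, None, :]
    return above.mean(axis=(1, 2))


if __name__ == "__main__":
    # Toy sample (made up): three flat curves at different levels
    sample = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    print(modified_band_depth(sample))
    print(modified_epigraph_index(sample))
```

Plotting the first quantity against the second for every curve gives a scatter in which curves of similar shape tend to follow a common pattern, which is the kind of display the abstract describes.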

Preprint, 2012, arXiv

Lasso-type estimators for Semiparametric Nonlinear Mixed-Effects Models Estimation

Parametric nonlinear mixed-effects models (NLMEs) are now widely used in biometrical studies, especially in pharmacokinetics research and HIV dynamics models, due to, among other aspects, the computational advances achieved in recent years. However, this kind of model may not be flexible enough for complex longitudinal data analysis. Semiparametric NLMEs (SNMMs) have been proposed by Ke and Wang (2001). These models are a good compromise and retain nice features of both parametric and nonparametric models, resulting in more flexible models than standard parametric NLMEs. However, SNMMs are complex models for which estimation still remains a challenge. The estimation procedure proposed by Ke and Wang (2001) is based on a combination of log-likelihood approximation methods for parametric estimation and smoothing spline techniques for nonparametric estimation. In this work, we propose new estimation strategies in SNMMs. On the one hand, we use the Stochastic Approximation version of the EM algorithm (Delyon et al., 1999) to obtain exact ML and REML estimates of the fixed effects and variance components. On the other hand, we propose a LASSO-type method to estimate the unknown nonlinear function. We derive oracle inequalities for this nonparametric estimator. We combine the two approaches in a general estimation procedure that we illustrate with simulated and real data.

Preprint, 2012, arXiv

Pairwise Dynamic Time Warping for Event Data

We introduce a new version of dynamic time warping for samples of observed event times that are modeled as time-warped intensity processes. Our approach is developed within a framework where for each experimental unit or subject in a sample, one observes a random number of event times or random locations. As in our setting the number of observed events differs from subject to subject, usual landmark alignment methods that require the number of events to be the same across subjects are not feasible. We address this challenge by applying dynamic time warping, initially by aligning the event times for pairs of subjects, regardless of whether the numbers of observed events within the considered pair of subjects match. The information about pairwise alignments is then combined to extract an overall alignment of the events for each subject across the entire sample. This overall alignment provides a useful description of event data and can be used as a pre-processing step for subsequent analysis. The method is illustrated with a historical fertility study and with on-line auction data.
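For readers unfamiliar with the building block, here is a minimal sketch of classic dynamic time warping applied to two lists of event times of different lengths. It is the textbook recursion with an absolute-difference local cost, not the intensity-based pairwise procedure proposed in the paper. The function name and the example event times are hypothetical.

```python
def dtw_cost(events_a, events_b):
    """Classic dynamic time warping cost between two sequences of event times.

    The sequences may have different lengths. Each event is matched to at
    least one event of the other sequence, and matches preserve time order.
    """
    n, m = len(events_a), len(events_b)
    inf = float("inf")
    # cost[i][j]: minimal cost of aligning the first i events of A
    # with the first j events of B
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(events_a[i - 1] - events_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # event i of A shares the match of event j of B
                cost[i][j - 1],      # event j of B shares the match of event i of A
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]


if __name__ == "__main__":
    # Two subjects with different numbers of events (made-up times)
    subject_1 = [0.5, 2.0, 3.5]
    subject_2 = [0.4, 1.9, 2.1, 3.6]
    print(dtw_cost(subject_1, subject_2))
```

Because the recursion lets one event match several events of the other subject, no equal-count requirement arises, which is the property the abstract contrasts with landmark alignment.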

Preprint, 2011, arXiv

A context dependent pair hidden Markov model for statistical alignment

This article proposes a novel approach to statistical alignment of nucleotide sequences by introducing a context-dependent structure on the substitution process in the underlying evolutionary model. We propose to estimate alignments and context-dependent mutation rates relying on the observation of two homologous sequences. The procedure is based on a generalized pair hidden Markov structure, where, conditional on the alignment path, the nucleotide sequences follow a Markov distribution. We use a stochastic approximation expectation maximization (SAEM) algorithm to give accurate estimators of parameters and alignments. We provide results both on simulated data and on vertebrate genomes, which are known to have a high mutation rate from the CG dinucleotide. In particular, we establish that the method improves the accuracy of the alignment of a human pseudogene and its functional gene.

Preprint, 2009, arXiv

Parameter Estimation in multiple-hidden i.i.d. models from biological multiple alignment

In this work we deal with parameter estimation in a latent variable model, namely the multiple-hidden i.i.d. model, which is derived from multiple alignment algorithms. We first provide a rigorous formalism for the homology structure of k sequences related by a star-shaped phylogenetic tree in the context of multiple alignment based on indel evolution models. We discuss possible definitions of likelihoods and compare them to the criterion used in multiple alignment algorithms. The existence of two different information divergence rates is established, and a divergence property is shown under additional assumptions. This would yield consistency for the parameter in parametrization schemes for which the divergence property holds. We finally extend the definition of the multiple-hidden i.i.d. model and the results obtained to the case in which the sequences are related by an arbitrary phylogenetic tree. Simulations illustrate different cases which are not covered by our results.

Preprint, 2005, arXiv

Parameter estimation in pair hidden Markov models

This paper deals with parameter estimation in pair hidden Markov models (pair-HMMs). We first provide a rigorous formalism for these models and discuss possible definitions of likelihoods. The model being biologically motivated, some restrictions with respect to the full parameter space naturally occur. The existence of two different information divergence rates is established, and a divergence property (namely positivity at values different from the true one) is shown under additional assumptions. This yields consistency for the parameter in parametrization schemes for which the divergence property holds. Simulations illustrate different cases which are not covered by our results.