Source author record

C. F. Beckmann

C. F. Beckmann appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Methodology, Applications, Computation

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators


Preprint, 2022, arXiv

Evaluation of data imputation strategies in complex, deeply-phenotyped data sets: the case of the EU-AIMS Longitudinal European Autism Project

An increasing number of large-scale multi-modal research initiatives have been conducted in the typically developing population, as well as in psychiatric cohorts. Missing data is a common problem in such datasets due to the difficulty of assessing multiple measures on a large number of participants. The consequences of missing data accumulate when researchers aim to explore relationships between multiple measures. Here we aim to evaluate different imputation strategies to fill in missing values in clinical data from a large (total N=764) and deeply characterised (i.e. a range of clinical and cognitive instruments administered) sample of N=453 autistic individuals and N=311 control individuals recruited as part of the EU-AIMS Longitudinal European Autism Project (LEAP) consortium. In particular, we consider a total of 160 clinical measures divided into 15 overlapping subsets of participants. We use two simple but common univariate strategies, mean and median imputation, as well as a Round Robin regression approach involving four independent multivariate regression models: a linear model (Bayesian Ridge regression) and several non-linear models (Decision Trees, Extra Trees and K-Neighbours regression). We evaluate the models using the traditional mean squared error on available data that were removed, and additionally consider the KL divergence between the observed and the imputed distributions. We show that all of the multivariate approaches tested provide a substantial improvement compared to typical univariate approaches. Further, our analyses reveal that across all 15 data subsets tested, an Extra Trees regression approach provided the best global results. This allows the selection of a unique model to impute missing data for the LEAP project and deliver a fixed set of imputed clinical data to be used by researchers working with the LEAP dataset in the future.
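The evaluation protocol described in this abstract (hide some observed values, impute them, score the error on the hidden cells) can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the LEAP pipeline: the toy two-column data, the function names, and the use of a plain least-squares line in place of the Bayesian Ridge, tree-based and K-Neighbours regressors named above. The round-robin step is reduced to its simplest case, two columns regressed on each other in turn.

```python
import random
import statistics


def mask_observed(rows, fraction, rng):
    """Hide a fraction of the observed cells; return the masked copy and the hidden cells."""
    observed = [(i, j) for i, row in enumerate(rows)
                for j, value in enumerate(row) if value is not None]
    hidden = rng.sample(observed, int(len(observed) * fraction))
    masked = [list(row) for row in rows]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def impute_univariate(rows, stat):
    """Fill each missing cell with a per-column statistic (mean or median)."""
    fills = [stat([row[j] for row in rows if row[j] is not None])
             for j in range(len(rows[0]))]
    return [[fills[j] if value is None else value for j, value in enumerate(row)]
            for row in rows]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def impute_round_robin(rows, n_rounds=5):
    """Two-column round robin: regress each column on the other, in turn."""
    filled = impute_univariate(rows, statistics.fmean)  # initial guess
    for _ in range(n_rounds):
        for target in (0, 1):
            other = 1 - target
            # Train only on rows where the target was actually observed.
            train = [r for r, orig in zip(filled, rows) if orig[target] is not None]
            slope, intercept = fit_line([r[other] for r in train],
                                        [r[target] for r in train])
            for r, orig in zip(filled, rows):
                if orig[target] is None:
                    r[target] = intercept + slope * r[other]
    return filled


def mse_on_hidden(truth, imputed, hidden):
    """Mean squared error restricted to the cells that were hidden."""
    return sum((truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden) / len(hidden)


# Hypothetical toy data: two strongly related measures.
rng = random.Random(0)
truth = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    truth.append([x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)])

masked, hidden = mask_observed(truth, 0.2, rng)
scores = {
    "mean": mse_on_hidden(truth, impute_univariate(masked, statistics.fmean), hidden),
    "median": mse_on_hidden(truth, impute_univariate(masked, statistics.median), hidden),
    "round_robin": mse_on_hidden(truth, impute_round_robin(masked), hidden),
}
print(scores)
```

On correlated columns like these, the regression-based imputer can exploit the relationship between measures, which is the intuition behind the multivariate strategies outperforming mean and median imputation.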

Preprint, 2016, arXiv

Bayesian estimators of the Gamma distribution

In this paper we introduce two Bayesian estimators for learning the parameters of the Gamma distribution. The first algorithm uses a well-known unnormalized conjugate prior for the Gamma shape, and the second uses a non-linear approximation to the likelihood and a prior on the shape that is conjugate to the approximated likelihood. In both cases we use the Laplace approximation to compute the required expectations. We perform a theoretical comparison between maximum likelihood and the presented Bayesian algorithms that allows us to provide non-informative parameter values for the priors' hyperparameters. We also provide a numerical comparison using synthetic data. The introduction of these novel Bayesian estimators opens the possibility of including Gamma distributions in more complex Bayesian structures, e.g. variational Bayesian mixture models.
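For orientation, the maximum likelihood baseline that the abstract compares against can be sketched as follows. This is not the paper's Bayesian estimator. It is the standard ML fit, in which the shape k solves log(k) - digamma(k) = log(mean(x)) - mean(log(x)) and the scale is mean(x) / k. The function names, the bisection solver and the synthetic data are assumptions chosen for illustration.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0 via the recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def fit_gamma_ml(samples):
    """Maximum likelihood shape and scale of a Gamma distribution."""
    mean = sum(samples) / len(samples)
    mean_log = sum(math.log(v) for v in samples) / len(samples)
    s = math.log(mean) - mean_log  # positive unless all samples are equal
    lo, hi = 1e-6, 1e6
    # log(k) - digamma(k) decreases in k, so bisect on a log scale.
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if math.log(mid) - digamma(mid) > s:
            lo = mid  # gap still too large: the shape must be bigger
        else:
            hi = mid
    shape = math.sqrt(lo * hi)
    return shape, mean / shape


# Synthetic data; random.gammavariate takes (shape, scale).
rng = random.Random(1)
samples = [rng.gammavariate(3.0, 2.0) for _ in range(5000)]
shape_hat, scale_hat = fit_gamma_ml(samples)
print(shape_hat, scale_hat)
```

Bisection is used here instead of Newton's method only because it cannot diverge; it is slower but needs no derivative of the digamma function.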

Preprint, 2016, arXiv

Estimating an Inverse Gamma distribution

In this paper we introduce five different algorithms based on the method of moments, maximum likelihood and full Bayesian estimation for learning the parameters of the Inverse Gamma distribution. We also provide an expression for the KL divergence for Inverse Gamma distributions, which allows us to quantify the estimation accuracy of each of the algorithms. All the presented algorithms are novel. The most relevant novelties include the first conjugate prior for the Inverse Gamma shape parameter, which allows analytical Bayesian inference, and two very fast algorithms, a maximum likelihood and a Bayesian one, both based on likelihood approximation. In order to compute expectations under the proposed distributions we use the Laplace approximation. The introduction of these novel Bayesian estimators opens the possibility of including Inverse Gamma distributions in more complex Bayesian structures, e.g. variational Bayesian mixture models. The algorithms introduced in this paper are compared computationally using synthetic data, and interesting relationships between the maximum likelihood and the Bayesian approaches are derived.
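As a rough illustration of two ingredients named in this abstract, the sketch below fits an Inverse Gamma distribution by the textbook method of moments and scores the fit with a KL divergence. It is not taken from the paper: the moment estimator shown is the standard one (valid only when the shape exceeds 2), and the KL expression is the usual Gamma result in the rate parameterisation, which carries over because the reciprocal of an Inverse Gamma(shape a, scale b) variable is Gamma(shape a, rate b) and KL divergence is unchanged by that transformation. All function names are hypothetical.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0 via the recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def fit_inverse_gamma_moments(samples):
    """Method of moments: mean = b / (a - 1), variance = mean^2 / (a - 2)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((v - mean) ** 2 for v in samples) / (n - 1)
    shape = mean * mean / var + 2.0
    return shape, mean * (shape - 1.0)


def kl_inverse_gamma(a1, b1, a2, b2):
    """KL(p || q) for p = InvGamma(a1, b1) and q = InvGamma(a2, b2), shape/scale."""
    return ((a1 - a2) * digamma(a1)
            - math.lgamma(a1) + math.lgamma(a2)
            + a2 * (math.log(b1) - math.log(b2))
            + a1 * (b2 - b1) / b1)


# Synthetic data: the reciprocal of Gamma(shape a, scale 1 / b) is InvGamma(a, b).
true_shape, true_scale = 10.0, 9.0
rng = random.Random(2)
samples = [1.0 / rng.gammavariate(true_shape, 1.0 / true_scale) for _ in range(20000)]

shape_hat, scale_hat = fit_inverse_gamma_moments(samples)
error = kl_inverse_gamma(true_shape, true_scale, shape_hat, scale_hat)
print(shape_hat, scale_hat, error)
```

Using the KL divergence as the score folds errors in shape and scale into a single number, which is what makes it convenient for comparing estimators.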

Preprint, 2016, arXiv

Variational Mixture Models with Gamma or inverse-Gamma components

Mixture models with Gamma and/or inverse-Gamma distributed mixture components are useful for medical image tissue segmentation or as post-hoc models for regression coefficients obtained from linear regression within a Generalised Linear Modeling (GLM) framework, used in this case to separate stochastic (Gaussian) noise from some kind of positive or negative "activation" (modeled as Gamma or inverse-Gamma distributed). To date, the most common choice in this context is Gaussian/Gamma mixture models learned through a maximum likelihood (ML) approach; we recently extended this algorithm to mixture models with inverse-Gamma components. Here, we introduce a fully analytical Variational Bayes (VB) learning framework for both Gamma and/or inverse-Gamma components. We use synthetic and resting state fMRI data to compare the performance of the ML and VB algorithms in terms of area under the curve and computational cost. We observed that the ML Gaussian/Gamma model is very expensive, especially when considering high resolution images; furthermore, these solutions are highly variable and can occasionally overestimate the activations severely. The Bayesian Gauss-Gamma is in general the fastest algorithm but provides solutions that are too dense. The maximum likelihood Gaussian/inverse-Gamma is also very fast but in general provides very sparse solutions. The variational Gaussian/inverse-Gamma mixture model is the most robust and its cost is acceptable even for high resolution images. Further, the presented methodology represents an essential building block that can be directly used in more complex inference tasks specially designed to analyse MRI-fMRI data; such models include, for example, analytical variational mixture models with adaptive spatial regularization or better source models for new spatial blind source separation approaches.
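To make the noise-versus-activation idea concrete, the sketch below evaluates a two-component Gaussian/Gamma mixture with fixed parameters and returns the posterior probability that a value belongs to the Gamma "activation" component. It covers only this single responsibility computation, not the ML or VB learning procedures compared in the abstract. The parameter values and function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gamma_pdf(x, shape, scale):
    """Density of a Gamma distribution; zero for non-positive x."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def activation_responsibility(x, weight_gamma, noise_std, shape, scale):
    """Posterior probability that x came from the Gamma (activation) component."""
    noise = (1.0 - weight_gamma) * gaussian_pdf(x, 0.0, noise_std)
    active = weight_gamma * gamma_pdf(x, shape, scale)
    total = noise + active
    return active / total if total > 0.0 else 0.0


# Hypothetical parameters: unit-variance noise, 20% of values activated.
params = dict(weight_gamma=0.2, noise_std=1.0, shape=4.0, scale=2.0)
for value in (-1.0, 0.1, 2.0, 8.0):
    print(value, activation_responsibility(value, **params))
```

Thresholding this probability is one simple way to turn a fitted mixture into an activation map; how dense or sparse that map comes out depends on the fitted parameters, which is where the estimators in the abstract differ.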
