Source author record

Leonhard Held

Leonhard Held appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Methodology, Applications, physics.soc-ph, Computation, math.ST, physics.data-an, Statistics Theory

Catalog footprint

What is connected

12 works

7 topics

4 close collaborators


preprint, 2026, arXiv

Bayes Factor Group Sequential Designs

The Bayes factor, the data-based updating factor from prior to posterior odds, is a principled measure of relative evidence for two competing hypotheses. It is naturally suited to sequential data analysis in settings such as clinical trials and animal experiments, where early stopping for efficacy or futility is desirable. However, designing such studies is challenging because computing design characteristics, such as the probability of obtaining conclusive evidence or the expected sample size, typically requires computationally intensive Monte Carlo simulations, as no closed-form or efficient numerical methods exist. To address this issue, we extend results from classical group sequential design theory to sequential Bayes factor designs. The key idea is to derive Bayes factor stopping regions in terms of the z-statistic and use the known distribution of the cumulative z-statistics to compute stopping probabilities through multivariate normal integration. The resulting method is fast, accurate, and simulation-free. We illustrate it with examples from clinical trials, animal experiments, and psychological studies. We also provide an open-source implementation in the bfpwr R package. Our method makes exploring sequential Bayes factor designs as straightforward as classical group sequential designs, enabling experimenters to rapidly design informative and efficient experiments.
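The idea of expressing a Bayes factor stopping rule as a region for the z-statistic can be illustrated for a single analysis. The sketch below is not the bfpwr implementation and does not perform the multivariate normal integration over several looks; it assumes a point null hypothesis and a normal prior centred at the null under the alternative, with `g` (a hypothetical name) denoting the prior variance divided by the squared standard error. Under these assumptions z is N(0, 1) under the null and N(0, 1 + g) marginally under the alternative, so the Bayes factor has a closed form that can be inverted for |z|.

```python
import math
from statistics import NormalDist

def bf01(z, g):
    """Bayes factor of H0: theta = 0 against H1: theta ~ N(0, g * se^2).

    Ratio of the N(0, 1) density to the N(0, 1 + g) density at z.
    """
    return math.sqrt(1 + g) * math.exp(-0.5 * z * z * g / (1 + g))

def z_threshold(k, g):
    """Smallest |z| at which BF01 <= 1/k (evidence level k >= 1 for H1)."""
    return math.sqrt((1 + g) / g * (math.log(1 + g) + 2 * math.log(k)))

def single_look_stop_prob(k, g, true_z_mean=0.0):
    """P(|Z| >= threshold) at one look, assuming Z ~ N(true_z_mean, 1)."""
    crit = z_threshold(k, g)
    z = NormalDist(mu=true_z_mean, sigma=1.0)
    return z.cdf(-crit) + (1 - z.cdf(crit))
```

With several looks, the same thresholds would define a stopping region at each analysis, and the joint distribution of the cumulative z-statistics would be needed to obtain the stopping probabilities.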

preprint, 2022, arXiv

Comparing Confidence Intervals for a Binomial Proportion with the Interval Score

There are over 55 different ways to construct a confidence or credible interval (CI) for the binomial proportion. Methods to compare them are necessary to decide which should be used in practice. The interval score has been suggested to compare prediction intervals. This score is a proper scoring rule that combines the coverage as a measure of calibration and the width as a measure of sharpness. We evaluate eleven CIs for the binomial proportion based on the expected interval score and propose a summary measure which can take into account different weightings of the underlying true proportion. Under uniform weighting, the expected interval score recommends the Wilson CI or Bayesian credible intervals with a uniform prior. If extremely low or high proportions receive more weight, the score recommends Bayesian credible intervals based on Jeffreys' prior. While more work is needed to theoretically justify the use of the interval score for the comparison of CIs, our results suggest that it constitutes a useful method to combine coverage and width in one measure. This novel approach could also be used in other applications.
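A minimal sketch of the expected interval score may make the comparison concrete. It assumes the standard interval score for a (1 - alpha) interval (width plus a penalty of 2/alpha times the distance by which the interval misses the target) applied to the true proportion, and it averages that score over the binomial distribution of the data. The function names are illustrative, and only the Wilson and Wald intervals are shown, not the eleven methods of the paper.

```python
import math
from statistics import NormalDist

def wilson_ci(x, n, alpha=0.05):
    """Wilson score interval for x successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = (x + z * z / 2) / (n + z * z)
    half = z / (n + z * z) * math.sqrt(x * (n - x) / n + z * z / 4)
    return centre - half, centre + half

def wald_ci(x, n, alpha=0.05):
    """Wald interval, truncated to [0, 1]."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)

def interval_score(lower, upper, truth, alpha=0.05):
    """Width plus a penalty when the interval misses the true value."""
    score = upper - lower
    if truth < lower:
        score += 2 / alpha * (lower - truth)
    if truth > upper:
        score += 2 / alpha * (truth - upper)
    return score

def expected_interval_score(ci, n, p, alpha=0.05):
    """Average the interval score over X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        prob = math.comb(n, x) * p**x * (1 - p) ** (n - x)
        lower, upper = ci(x, n, alpha)
        total += prob * interval_score(lower, upper, p, alpha)
    return total
```

Calling `expected_interval_score(wilson_ci, 20, 0.05)` and the same with `wald_ci` gives one point of the comparison; a summary measure would then weight such values over a grid of true proportions.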

preprint, 2020, arXiv

A marginal moment matching approach for fitting endemic-epidemic models to underreported disease surveillance counts

Count data are often subject to underreporting, especially in infectious disease surveillance. We propose an approximate maximum likelihood method to fit count time series models from the endemic-epidemic class to underreported data. The approach is based on marginal moment matching where underreported processes are approximated through completely observed processes from the same class. Moreover, the form of the bias when underreporting is ignored or taken into account via multiplication factors is analysed. Notably, we show that this leads to a downward bias in model-based estimates of the effective reproductive number. A marginal moment matching approach can also be used to account for reporting intervals which are longer than the mean serial interval of a disease. The good performance of the proposed methodology is demonstrated in simulation studies. An extension to time-varying parameters and reporting probabilities is discussed and applied in a case study on weekly rotavirus gastroenteritis counts in Berlin, Germany.
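To see why the moments of an underreported process differ from those of the latent one, consider a simple illustration. It assumes underreporting acts as independent binomial thinning with a constant reporting probability, which is a common formalisation and not necessarily the exact setup of the paper; the function name is hypothetical. The reported mean and variance then follow from the laws of total expectation and total variance.

```python
def thinned_moments(mean, variance, reporting_prob):
    """Mean and variance of Y | X ~ Binomial(X, pi), given the moments of X.

    E[Y]   = pi * E[X]
    Var[Y] = pi^2 * Var[X] + pi * (1 - pi) * E[X]
    """
    reported_mean = reporting_prob * mean
    reported_var = (reporting_prob**2 * variance
                    + reporting_prob * (1 - reporting_prob) * mean)
    return reported_mean, reported_var
```

For example, a latent count with mean 10 and variance 30 that is reported with probability 0.5 has variance-to-mean ratio 3, while the reported count has ratio 2, so thinning changes the dispersion structure and not only the level.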

preprint, 2020, arXiv

Endemic-epidemic models with discrete-time serial interval distributions for infectious disease prediction

Multivariate count time series models are an important tool for the analysis and prediction of infectious disease spread. We consider the endemic-epidemic framework, an autoregressive model class for infectious disease surveillance counts, and replace the default autoregression on counts from the previous time period with more flexible weighting schemes inspired by discrete-time serial interval distributions. We employ three different parametric formulations, each with an additional unknown weighting parameter estimated via a profile likelihood approach, and compare them to an unrestricted nonparametric approach. The new methods are illustrated in a univariate analysis of dengue fever incidence in San Juan, Puerto Rico, and a spatio-temporal study of viral gastroenteritis in the twelve districts of Berlin. We assess the predictive performance of the suggested models and several reference models at various forecast horizons. In both applications, the performance of the endemic-epidemic models is considerably improved by the proposed weighting schemes.
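The replacement of a first-order autoregression by a weighted sum over several lags can be sketched as follows. The example uses normalised geometric weights as one plausible parametric form; the parameterisation via `kappa`, the truncation at `max_lag`, and the function names are assumptions made for illustration, and no estimation of the weighting parameter is attempted.

```python
def geometric_weights(kappa, max_lag):
    """Normalised geometric lag weights for lags 1..max_lag, 0 < kappa <= 1."""
    raw = [kappa * (1 - kappa) ** (d - 1) for d in range(1, max_lag + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def lagged_intensity(counts, weights):
    """Weighted sum of the most recent counts; weights[0] applies to lag 1."""
    recent = counts[::-1][:len(weights)]
    return sum(w * y for w, y in zip(weights, recent))
```

Setting `kappa = 1` puts all weight on lag 1 and so recovers the autoregression on the previous time period only; smaller values spread the weight over earlier counts.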

preprint, 2016, arXiv

Model-based testing for space-time interaction using point processes: An application to psychiatric hospital admissions in an urban area

Spatio-temporal interaction is inherent to cases of infectious diseases and occurrences of earthquakes, whereas the spread of other events, such as cancer or crime, is less evident. Statistical significance tests of space-time clustering usually assess the correlation between the spatial and temporal (transformed) distances of the events. Although appealing through simplicity, these classical tests do not adjust for the underlying population nor can they account for a distance decay of interaction. We propose to use the framework of an endemic-epidemic point process model to jointly estimate a background event rate explained by seasonal and areal characteristics, as well as a superposed epidemic component representing the hypothesis of interest. We illustrate this new model-based test for space-time interaction by analysing psychiatric inpatient admissions in Zurich, Switzerland (2007-2012). Several socio-economic factors were found to be associated with the admission rate, but there was no evidence of general clustering of the cases.

preprint, 2015, arXiv

Approximate Bayesian Model Selection with the Deviance Statistic

Bayesian model selection poses two main challenges: the specification of parameter priors for all models, and the computation of the resulting Bayes factors between models. There is now a large literature on automatic and objective parameter priors in the linear model. One important class is that of $g$-priors, which were recently extended from linear to generalized linear models (GLMs). We show that the resulting Bayes factors can be approximated by test-based Bayes factors (Johnson [Scand. J. Stat. 35 (2008) 354-368]) using the deviance statistics of the models. To estimate the hyperparameter $g$, we propose empirical and fully Bayes approaches and link the former to minimum Bayes factors and shrinkage estimates from the literature. Furthermore, we describe how to approximate the corresponding posterior distribution of the regression coefficients based on the standard GLM output. We illustrate the approach with the development of a clinical prediction model for 30-day survival in the GUSTO-I trial using logistic regression.
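A short sketch of a test-based Bayes factor computed from a deviance statistic follows. It assumes the usual form in which a deviance z with d degrees of freedom (model against the null model) gives a Bayes factor of (1 + g)^(-d/2) exp(g / (1 + g) * z / 2), and it uses the value of g that maximises this expression as a local empirical Bayes estimate. Function names are illustrative, and the fully Bayes treatment of g is not covered.

```python
import math

def log_tbf(deviance, df, g):
    """Log test-based Bayes factor of a model against the null model."""
    return -0.5 * df * math.log1p(g) + 0.5 * g / (1 + g) * deviance

def local_eb_g(deviance, df):
    """Value of g maximising the test-based Bayes factor, truncated at 0."""
    return max(deviance / df - 1.0, 0.0)
```

Plugging `local_eb_g(deviance, df)` into `log_tbf` gives the largest Bayes factor attainable within this family of priors, which is where the link to minimum Bayes factors (stated in terms of evidence for the null) comes from.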

preprint, 2014, arXiv

Power-law models for infectious disease spread

Short-time human travel behaviour can be described by a power law with respect to distance. We incorporate this information in space-time models for infectious disease surveillance data to better capture the dynamics of disease spread. Two previously established model classes are extended, which both decompose disease risk additively into endemic and epidemic components: a spatio-temporal point process model for individual-level data and a multivariate time-series model for aggregated count data. In both frameworks, a power-law decay of spatial interaction is embedded into the epidemic component and estimated jointly with all other unknown parameters using (penalised) likelihood inference. Whereas the power law can be based on Euclidean distance in the point process model, a novel formulation is proposed for count data where the power law depends on the order of the neighbourhood of discrete spatial units. The performance of the new approach is investigated by a reanalysis of individual cases of invasive meningococcal disease in Germany (2002-2008) and count data on influenza in 140 administrative districts of Southern Germany (2001-2008). In both applications, the power law substantially improves model fit and predictions, and is reasonably close to alternative qualitative formulations, where distance and order of neighbourhood, respectively, are treated as a factor. Implementation in the R package surveillance allows the approach to be applied in other settings.
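For count data on discrete spatial units, a power law in the order of neighbourhood can be sketched in a few lines. This is an illustration and not the code of the surveillance package: the adjacency structure is a plain dictionary, the neighbourhood order is taken to be the shortest path length in the adjacency graph, the unit itself is excluded, and the weights are normalised to sum to one. All of these choices, and the function names, are assumptions.

```python
from collections import deque

def neighbourhood_orders(adjacency, source):
    """Breadth-first search: order of neighbourhood of every reachable unit."""
    orders = {source: 0}
    queue = deque([source])
    while queue:
        unit = queue.popleft()
        for neighbour in adjacency[unit]:
            if neighbour not in orders:
                orders[neighbour] = orders[unit] + 1
                queue.append(neighbour)
    return orders

def power_law_weights(adjacency, source, decay):
    """Normalised weights proportional to order ** (-decay) for other units."""
    orders = neighbourhood_orders(adjacency, source)
    raw = {u: o ** -decay for u, o in orders.items() if o > 0}
    total = sum(raw.values())
    return {u: w / total for u, w in raw.items()}
```

With `decay = 0` all other units receive equal weight, and larger values of `decay` concentrate the weight on first-order neighbours; in the model the decay parameter would be estimated jointly with the remaining parameters.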

preprint, 2013, arXiv

Bayesian analysis of measurement error models using INLA

To account for measurement error (ME) in explanatory variables, Bayesian approaches provide a flexible framework, as expert knowledge about unobserved covariates can be incorporated in the prior distributions. However, given the analytic intractability of the posterior distribution, model inference has so far had to be performed via time-consuming and complex Markov chain Monte Carlo implementations. In this paper we extend the integrated nested Laplace approximations (INLA) approach to formulate Gaussian ME models in generalized linear mixed models. We present three applications, and show how parameter estimates are obtained for common ME models, such as the classical and Berkson error models, including heteroscedastic variances. To illustrate the practical feasibility, R code is provided.

preprint, 2013, arXiv

Sensitivity analysis for Bayesian hierarchical models

Prior sensitivity examination plays an important role in applied Bayesian analyses. This is especially true for Bayesian hierarchical models, where interpretability of the parameters within deeper layers in the hierarchy becomes challenging. In addition, lack of information together with identifiability issues may imply that the prior distributions for such models have an undesired influence on the posterior inference. Despite its relevance, prior sensitivity analysis is currently carried out with informal approaches, which require repetitive re-runs of the model with ad-hoc modified base prior parameter values. Formal approaches to prior sensitivity analysis suffer from a lack of popularity in practice, mainly due to their high computational cost and absence of software implementation. We propose a novel formal approach to prior sensitivity analysis which is fast and accurate. It quantifies sensitivity without the need for a model re-run. We develop a ready-to-use priorSens package in R for routine prior sensitivity investigation with R-INLA. Through a series of examples we show how our approach can be used to detect high prior sensitivities of some parameters as well as identifiability issues in possibly over-parametrized Bayesian hierarchical models.

preprint, 2012, arXiv

Estimation and extrapolation of time trends in registry data---Borrowing strength from related populations

To analyze and project age-specific mortality or morbidity rates, age-period-cohort (APC) models are very popular. Bayesian approaches facilitate estimation and improve predictions by assigning smoothing priors to age, period and cohort effects. Adjustments for overdispersion are straightforward using additional random effects. When rates are further stratified, for example, by countries, multivariate APC models can be used, where differences of stratum-specific effects are interpretable as log relative risks. Here, we incorporate correlated stratum-specific smoothing priors and correlated overdispersion parameters into the multivariate APC model, and use Markov chain Monte Carlo and integrated nested Laplace approximations for inference. Compared to a model without correlation, the new approach may lead to more precise relative risk estimates, as shown in an application to chronic obstructive pulmonary disease mortality in three regions of England and Wales. Furthermore, the imputation of missing data for one particular stratum may be improved, since the new approach takes advantage of the remaining strata if the corresponding observations are available there. This is shown in an application to female mortality in Denmark, Sweden and Norway from the 20th century, where we treat for each country in turn either the first or second half of the observations as missing and then impute the omitted data. The projections are compared to those obtained from a univariate APC model and an extended Lee-Carter demographic forecasting approach using the proper Dawid-Sebastiani scoring rule.

preprint, 2012, arXiv

Mixtures of g-Priors for Generalised Additive Model Selection with Penalised Splines

We propose an objective Bayesian approach to the selection of covariates and their penalised splines transformations in generalised additive models. Specification of a reasonable default prior for the model parameters and combination with a multiplicity-correction prior for the models themselves is crucial for this task. Here we use well-studied and well-behaved continuous mixtures of g-priors as default priors. We introduce the methodology in the normal model and extend it to non-normal exponential families. A simulation study and an application from the literature illustrate the proposed approach. An efficient implementation is available in the R-package "hypergsplines".

preprint, 2010, arXiv

Hyper-g Priors for Generalized Linear Models

We develop an extension of the classical Zellner's g-prior to generalized linear models. The prior on the hyperparameter g is handled in a flexible way, so that any continuous proper hyperprior f(g) can be used, giving rise to a large class of hyper-g priors. Connections with the literature are described in detail. A fast and accurate integrated Laplace approximation of the marginal likelihood makes inference in large model spaces feasible. For posterior parameter estimation we propose an efficient and tuning-free Metropolis-Hastings sampler. The methodology is illustrated with variable selection and automatic covariate transformation in the Pima Indians diabetes data set.
