Source author record

Thomas Kneib

Thomas Kneib appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology Applications Machine Learning Computation math.ST q-fin.ST Quantitative Methods Statistics Theory

Catalog footprint

20 works

8 topics

4 close collaborators


preprint, 2022, arXiv

Bayesian Conditional Transformation Models

Recent developments in statistical regression methodology shift away from pure mean regression towards distributional regression models. One important strand thereof is that of conditional transformation models (CTMs). CTMs infer the entire conditional distribution directly by applying a transformation function to the response conditionally on a set of covariates towards a simple log-concave reference distribution. Thereby, CTMs allow not only variance, kurtosis or skewness but the complete conditional distribution to depend on the explanatory variables. We propose a Bayesian notion of conditional transformation models (BCTMs) focusing on exactly observed continuous responses, but also incorporating extensions to randomly censored and discrete responses. Rather than relying on Bernstein polynomials that have been considered in likelihood-based CTMs, we implement a spline-based parametrization for monotonic effects that are supplemented with smoothness priors. Furthermore, we are able to benefit from the Bayesian paradigm via easily obtainable credible intervals and other quantities without relying on large sample approximations. A simulation study demonstrates the competitiveness of our approach against its likelihood-based counterpart but also Bayesian additive models of location, scale and shape and Bayesian quantile regression. Two applications illustrate the versatility of BCTMs in problems involving real world data, again including the comparison with various types of competitors.
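
As a reading aid, the transformation idea described in the abstract can be written down as follows. This is a generic sketch of a conditional transformation model, not the specific parametrization of the paper; the symbols h, F_Z and f_Z are notation assumed here for the transformation function and for the distribution function and density of the reference distribution.

```latex
% Assumed notation: h(y | x) is monotonically increasing in y,
% F_Z and f_Z are the CDF and density of the reference distribution.
\begin{align}
  F_{Y \mid X = x}(y) &= F_Z\bigl(h(y \mid x)\bigr), \\
  f_{Y \mid X = x}(y) &= f_Z\bigl(h(y \mid x)\bigr)\,
                         \frac{\partial h(y \mid x)}{\partial y}.
\end{align}
```

Because the whole conditional distribution function is determined by h, every feature of the distribution (not only its mean or variance) can vary with the covariates x.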

preprint, 2022, arXiv

Bayesian Discrete Conditional Transformation Models

We propose a novel Bayesian model framework for discrete ordinal and count data based on conditional transformations of the responses. The conditional transformation function is estimated from the data in conjunction with an a priori chosen reference distribution. For count responses, the resulting transformation model is novel in the sense that it is a Bayesian fully parametric yet distribution-free approach that can additionally account for excess zeros with additive transformation function specifications. For ordinal categoric responses, our cumulative link transformation model allows the inclusion of linear and nonlinear covariate effects that can additionally be made category-specific, resulting in (non-)proportional odds or hazards models and more, depending on the choice of the reference distribution. Inference is conducted by a generic modular Markov chain Monte Carlo algorithm where multivariate Gaussian priors enforce specific properties such as smoothness on the functional effects. To illustrate the versatility of Bayesian discrete conditional transformation models, applications to counts of patent citations in the presence of excess zeros and on treating forest health categories in a discrete partial proportional odds model are presented.

preprint, 2022, arXiv

Distributional Gradient Boosting Machines

We present a unified probabilistic gradient boosting framework for regression tasks that models and predicts the entire conditional distribution of a univariate response variable as a function of covariates. Our likelihood-based approach allows us to either model all conditional moments of a parametric distribution, or to approximate the conditional cumulative distribution function via Normalizing Flows. Our framework uses XGBoost and LightGBM as its underlying computational backbones. Modelling and predicting the entire conditional distribution greatly enhances existing tree-based gradient boosting implementations, as it allows the creation of probabilistic forecasts from which prediction intervals and quantiles of interest can be derived. Empirical results show that our framework achieves state-of-the-art forecast accuracy.

preprint, 2022, arXiv

Multivariate Distributional Stochastic Frontier Models

The primary objective of Stochastic Frontier (SF) Analysis is the deconvolution of the estimated composed error terms into noise and inefficiency. Assuming a parametric production function (e.g. Cobb-Douglas, Translog, etc.) might lead to false inefficiency estimates. To overcome this limiting assumption, the production function can be modelled utilizing P-splines. Application of this powerful and flexible tool enables modelling of a wide range of production functions. Additionally, one can allow the parameters of the composed error distribution to depend on covariates in a functional form. The SF model can then be cast into the framework of a Generalized Additive Model for Location, Scale and Shape (GAMLSS). Furthermore, a decision-making unit (DMU) typically produces multiple outputs. It does this by operating several sub-DMUs, which each employ a production process to produce a single output. Therefore, the production processes of the sub-DMUs are typically not independent. Consequently, the inefficiencies may be expected to be dependent, too. In this paper, the Distributional Stochastic Frontier Model (DSFM) is introduced. The multivariate distribution of the composed error term is modeled using a copula. As a result, the presented model is a generalization of the model for seemingly unrelated stochastic frontier regressions by Lai and Huang (2013).

preprint, 2022, arXiv

Probabilistic Time Series Forecasts with Autoregressive Transformation Models

Probabilistic forecasting of time series is an important matter in many applications and research fields. In order to draw conclusions from a probabilistic forecast, we must ensure that the model class used to approximate the true forecasting distribution is expressive enough. Yet, characteristics of the model itself, such as its uncertainty or its feature-outcome relationship are not of lesser importance. This paper proposes Autoregressive Transformation Models (ATMs), a model class inspired by various research directions to unite expressive distributional forecasts using a semi-parametric distribution assumption with an interpretable model specification. We demonstrate the properties of ATMs both theoretically and through empirical evaluation on several simulated and real-world forecasting datasets.

preprint, 2021, arXiv

Adaptive shrinkage of smooth functional effects towards a predefined functional subspace

In this paper, we propose a new horseshoe-type prior hierarchy for adaptively shrinking spline-based functional effects towards a predefined vector space of parametric functions. Instead of shrinking each spline coefficient towards zero, we use an adapted horseshoe prior to control the deviation from the predefined vector space. For this purpose, the modified horseshoe prior is set up with one scale parameter per spline and not one per coefficient. The presented prior allows for a large number of basis functions to capture all kinds of functional effects while the estimated functional effect is prevented from a highly oscillating overfit. We achieve this by integrating a smoothing penalty similar to the random walk prior commonly applied in Bayesian P-spline priors. In a simulation study, we demonstrate the properties of the new prior specification and compare it to other approaches from the literature. Furthermore, we showcase the applicability of the proposed method by estimating the energy consumption in Germany over the course of a day. For inference, we rely on Markov chain Monte Carlo simulations combining Gibbs sampling for the spline coefficients with slice sampling for all scale parameters in the model.

preprint, 2020, arXiv

Analytic expressions for the Cumulative Distribution Function of the Composed Error Term in Stochastic Frontier Analysis with Truncated Normal and Exponential Inefficiencies

In the stochastic frontier model, the composed error term consists of the measurement error and the inefficiency term. A general assumption is that the inefficiency term follows a truncated normal or exponential distribution. In a wide variety of models, evaluating the cumulative distribution function of the composed error term is required. This work introduces and proves four representation theorems for these distributions - two for each distributional assumption. These representations can be utilized for a fast and accurate evaluation.

preprint, 2020, arXiv

Beyond unidimensional poverty analysis using distributional copula models for mixed ordered-continuous outcomes

Poverty is a multidimensional concept often comprising a monetary outcome and other welfare dimensions such as education, subjective well-being or health, that are measured on an ordinal scale. In applied research, multidimensional poverty is ubiquitously assessed by studying each poverty dimension independently in univariate regression models or by combining several poverty dimensions into a scalar index. This inhibits a thorough analysis of the potentially varying interdependence between the poverty dimensions. We propose a multivariate copula generalized additive model for location, scale and shape (copula GAMLSS or distributional copula model) to tackle this challenge. By relating the copula parameter to covariates, we specifically examine if certain factors determine the dependence between poverty dimensions. Furthermore, specifying the full conditional bivariate distribution allows us to derive several features such as poverty risks and dependence measures coherently from one model for different individuals. We demonstrate the approach by studying two important poverty dimensions: income and education. Since the level of education is measured on an ordinal scale while income is continuous, we extend the bivariate copula GAMLSS to the case of mixed ordered-continuous outcomes. The new model is integrated into the GJRM package in R and applied to data from Indonesia. Particular emphasis is given to the spatial variation of the income-education dependence and groups of individuals at risk of being simultaneously poor in both education and income dimensions.

preprint, 2019, arXiv

Bayesian Effect Selection in Structured Additive Distributional Regression Models

We propose a novel spike and slab prior specification with scaled beta prime marginals for the importance parameters of regression coefficients to allow for general effect selection within the class of structured additive distributional regression. This enables us to model effects on all distributional parameters for arbitrary parametric distributions, and to consider various effect types such as non-linear or spatial effects as well as hierarchical regression structures. Our spike and slab prior relies on a parameter expansion that separates blocks of regression coefficients into overall scalar importance parameters and vectors of standardised coefficients. Hence, we can work with a scalar quantity for effect selection instead of a possibly high-dimensional effect vector, which yields improved shrinkage and sampling performance compared to the classical normal-inverse-gamma prior. We investigate the propriety of the posterior, show that the prior yields desirable shrinkage properties, propose a way of eliciting prior parameters and provide efficient Markov Chain Monte Carlo sampling. Using both simulated and three large-scale data sets, we show that our approach is applicable for data with a potentially large number of covariates, multilevel predictors accounting for hierarchically nested data and non-standard response distributions, such as bivariate normal or zero-inflated Poisson.

preprint, 2018, arXiv

Lost in translation: On the impact of data coding on penalized regression with interactions

Penalized regression approaches are standard tools in quantitative genetics. It is known that the fit of an ordinary least squares (OLS) regression is independent of certain transformations of the coding of the predictor variables, and that the standard mixed model ridge regression best linear unbiased prediction (RRBLUP) is neither affected by translations of the variable coding, nor by global scaling. However, it has been reported that an extended version of this mixed model, which incorporates interactions by products of markers as additional predictor variables, is affected by translations of the marker coding. In this work, we identify the cause of this loss of invariance in a general context of penalized regression on polynomials in the predictor variables. We show that in most cases, translating the coding of the predictor variables has an impact on effect estimates, with an exception when only the size of the coefficients of monomials of highest total degree is penalized. The invariance of RRBLUP can thus be considered as a special case of this setting, with a polynomial of total degree 1, where the size of the fixed effect (total degree 0) is not penalized but all coefficients of monomials of total degree 1 are. The extended RRBLUP, which includes interactions, is not invariant to translations because it does not only penalize interactions (total degree 2), but also additive effects (total degree 1). Our observations are not restricted to ridge regression, but generally valid for penalized regressions, for instance also for the $\ell_1$ penalty of LASSO.
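
The loss of invariance described in this abstract can be reproduced numerically. The following is a minimal sketch under stated assumptions, not the authors' code: it uses plain ridge regression with an unpenalized intercept as a stand-in for RRBLUP, simulated markers, and an arbitrary penalty of 5.0. The helper names `ridge_fit` and `with_interaction` are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression with an unpenalized intercept; returns fitted values."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    P = lam * np.eye(p + 1)
    P[0, 0] = 0.0  # the intercept (total degree 0) is not penalized
    beta = np.linalg.solve(Z.T @ Z + P, Z.T @ y)
    return Z @ beta

def with_interaction(M):
    """Append the product of the two marker columns as an extra predictor."""
    return np.column_stack([M, M[:, 0] * M[:, 1]])

rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(40, 2)).astype(float)  # markers coded 0/1/2
y = M[:, 0] - 0.5 * M[:, 1] + 0.8 * M[:, 0] * M[:, 1] + rng.normal(size=40)
M_shift = M - 1.0  # the same markers coded -1/0/1

# Additive model: the fitted values agree under both codings.
additive_gap = np.max(np.abs(ridge_fit(M, y, 5.0) - ridge_fit(M_shift, y, 5.0)))

# Interaction model with all non-intercept terms penalized: the fits differ.
interaction_gap = np.max(np.abs(
    ridge_fit(with_interaction(M), y, 5.0)
    - ridge_fit(with_interaction(M_shift), y, 5.0)
))

print(f"additive gap:    {additive_gap:.2e}")
print(f"interaction gap: {interaction_gap:.2e}")
```

In the additive model the shift is absorbed by the unpenalized intercept. With the product term, the shift moves part of the interaction effect into the main effects, which are penalized, so the fit changes.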

preprint, 2016, arXiv

Boosting Joint Models for Longitudinal and Time-to-Event Data

Joint models for longitudinal and time-to-event data have gained a lot of attention in the last few years as they are a helpful technique to approach a common data structure in clinical studies where longitudinal outcomes are recorded alongside event times. Those two processes are often linked and the two outcomes should thus be modeled jointly in order to prevent the potential bias introduced by independent modelling. Commonly, joint models are estimated in likelihood-based expectation maximization or Bayesian approaches using frameworks where variable selection is problematic and which do not immediately work for high-dimensional data. In this paper, we propose a boosting algorithm tackling these challenges by being able to simultaneously estimate predictors for joint models and automatically select the most influential variables even in high-dimensional data situations. We analyse the performance of the new algorithm in a simulation study and apply it to the Danish cystic fibrosis registry which collects longitudinal lung function data on patients with cystic fibrosis together with data regarding the onset of pulmonary infections. This is the first approach to combine state-of-the-art algorithms from the field of machine learning with the model class of joint models, providing a fully data-driven mechanism to select variables and predictor effects in a unified framework of boosting joint models.

preprint, 2015, arXiv

Bayesian structured additive distributional regression with an application to regional income inequality in Germany

We propose a generic Bayesian framework for inference in distributional regression models in which each parameter of a potentially complex response distribution and not only the mean is related to a structured additive predictor. The latter is composed additively of a variety of different functional effect types such as nonlinear effects, spatial effects, random coefficients, interaction surfaces or other (possibly nonstandard) basis function representations. To enforce specific properties of the functional effects such as smoothness, informative multivariate Gaussian priors are assigned to the basis function coefficients. Inference can then be based on computationally efficient Markov chain Monte Carlo simulation techniques where a generic procedure makes use of distribution-specific iteratively weighted least squares approximations to the full conditionals. The framework of distributional regression encompasses many special cases relevant for treating nonstandard response structures such as highly skewed nonnegative responses, overdispersed and zero-inflated counts or shares including the possibility for zero- and one-inflation. We discuss distributional regression along a study on determinants of labour incomes for full-time working males in Germany with a particular focus on regional differences after the German reunification. Controlling for age, education, work experience and local disparities, we estimate full conditional income distributions allowing us to study various distributional quantities such as moments, quantiles or inequality measures in a consistent manner in one joint model. Detailed guidance on practical aspects of model choice including the selection of several competing distributions for labour incomes and the consideration of different covariate effects on the income distribution complete the distributional regression analysis. We find that next to a lower expected income, full-time working men in East Germany also face a more unequal income distribution than men in the West, ceteris paribus.

preprint, 2015, arXiv

Markov-switching generalized additive models

We consider Markov-switching regression models, i.e. models for time series regression analyses where the functional relationship between covariates and response is subject to regime switching controlled by an unobservable Markov chain. Building on the powerful hidden Markov model machinery and the methods for penalized B-splines routinely used in regression analyses, we develop a framework for nonparametrically estimating the functional form of the effect of the covariates in such a regression model, assuming an additive structure of the predictor. The resulting class of Markov-switching generalized additive models is immensely flexible, and contains as special cases the common parametric Markov-switching regression models and also generalized additive and generalized linear models. The feasibility of the suggested maximum penalized likelihood approach is demonstrated by simulation and further illustrated by modelling how energy price in Spain depends on the Euro/Dollar exchange rate.

preprint, 2015, arXiv

Maximum penalized likelihood estimation in semiparametric capture-recapture models

We discuss the semiparametric modeling of mark-recapture-recovery data where the temporal and/or individual variation of model parameters is explained via covariates. Typically, in such analyses a fixed (or mixed) effects parametric model is specified for the relationship between the model parameters and the covariates of interest. In this paper, we discuss the modeling of the relationship via the use of penalized splines, to allow for considerably more flexible functional forms. Corresponding models can be fitted via numerical maximum penalized likelihood estimation, employing cross-validation to choose the smoothing parameters in a data-driven way. Our contribution builds on and extends the existing literature, providing a unified inferential framework for semiparametric mark-recapture-recovery models for open populations, where the interest typically lies in the estimation of survival probabilities. The approach is applied to two real datasets, corresponding to grey herons (Ardea cinerea), where we model the survival probability as a function of environmental condition (a time-varying global covariate), and Soay sheep (Ovis aries), where we model the survival probability as a function of individual weight (a time-varying individual-specific covariate). The proposed semiparametric approach is compared to a standard parametric (logistic) regression and new interesting underlying dynamics are observed in both cases.

preprint, 2014, arXiv

A Unified Framework of Constrained Regression

Generalized additive models (GAMs) play an important role in modeling and understanding complex relationships in modern applied statistics. They allow for flexible, data-driven estimation of covariate effects. Yet researchers often have a priori knowledge of certain effects, which might be monotonic or periodic (cyclic) or should fulfill boundary conditions. We propose a unified framework to incorporate these constraints for both univariate and bivariate effect estimates and for varying coefficients. As the framework is based on component-wise boosting methods, variables can be selected intrinsically, and effects can be estimated for a wide range of different distributional assumptions. Bootstrap confidence intervals for the effect estimates are derived to assess the models. We present three case studies from environmental sciences to illustrate the proposed seamless modeling framework. All discussed constrained effect estimates are implemented in the comprehensive R package mboost for model-based boosting.

preprint, 2014, arXiv

Nonparametric inference in hidden Markov models using P-splines

Hidden Markov models (HMMs) are flexible time series models in which the distributions of the observations depend on unobserved serially correlated states. The state-dependent distributions in HMMs are usually taken from some class of parametrically specified distributions. The choice of this class can be difficult, and an unfortunate choice can have serious consequences for example on state estimates, on forecasts and generally on the resulting model complexity and interpretation, in particular with respect to the number of states. We develop a novel approach for estimating the state-dependent distributions of an HMM in a nonparametric way, which is based on the idea of representing the corresponding densities as linear combinations of a large number of standardized B-spline basis functions, imposing a penalty term on non-smoothness in order to maintain a good balance between goodness-of-fit and smoothness. We illustrate the nonparametric modeling approach in a real data application concerned with vertical speeds of a diving beaked whale, demonstrating that compared to parametric counterparts it can lead to models that are more parsimonious in terms of the number of states yet fit the data equally well.
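
The penalty on non-smoothness mentioned in this abstract can be illustrated with the difference penalty commonly used for P-splines. This is a sketch under the assumption of a second-order difference penalty on the B-spline coefficients; the exact penalty used in the paper may differ, and the function name `difference_penalty` and the example coefficient vectors are hypothetical.

```python
import numpy as np

def difference_penalty(coefs, order=2, lam=1.0):
    """Return lam times the sum of squared order-th differences of coefs."""
    coefs = np.asarray(coefs, dtype=float)
    # Differencing the rows of an identity matrix yields the difference matrix D.
    D = np.diff(np.eye(len(coefs)), n=order, axis=0)
    return lam * float(np.sum((D @ coefs) ** 2))

# Coefficients on a straight line have zero second differences (no penalty).
smooth = np.linspace(0.0, 1.0, 10)

# Alternating coefficients produce large second differences (heavy penalty).
wiggly = np.array([0.0, 1.0] * 5)

penalty_smooth = difference_penalty(smooth)
penalty_wiggly = difference_penalty(wiggly)
print(penalty_smooth, penalty_wiggly)
```

Subtracting such a term from the log-likelihood is what trades goodness-of-fit against smoothness: a larger `lam` pulls neighbouring coefficients towards a locally linear pattern.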

preprint, 2014, arXiv

Semiparametric stochastic volatility modelling using penalized splines

Stochastic volatility (SV) models mimic many of the stylized facts attributed to time series of asset returns, while maintaining conceptual simplicity. The commonly made assumption of conditionally normally distributed or Student-t-distributed returns, given the volatility, has however been questioned. In this manuscript, we introduce a novel maximum penalized likelihood approach for estimating the conditional distribution in an SV model in a nonparametric way, thus avoiding any potentially critical assumptions on the shape. The considered framework exploits the strengths both of the powerful hidden Markov model machinery and of penalized B-splines, and constitutes a powerful and flexible alternative to recently developed Bayesian approaches to semiparametric SV modelling. We demonstrate the feasibility of the approach in a simulation study before outlining its potential in applications to three series of returns on stocks and one series of stock index returns.

preprint, 2013, arXiv

Bayesian Geoadditive Expectile Regression

Regression classes modeling more than the mean of the response have attracted a lot of attention in recent years. Expectile regression is a special and computationally convenient case of this family of models. Expectiles offer a quantile-like characterisation of a complete distribution and include the mean as a special case. In the frequentist framework, accounting for the impact of many covariates with very different structures has been made possible. We propose Bayesian expectile regression based on the asymmetric normal distribution. This makes it possible to incorporate, for example, linear, nonlinear, spatial and random effects in one model. Furthermore, detailed inference on the estimated parameters can be conducted. Proposal densities based on iteratively weighted least squares updates for the resulting Markov chain Monte Carlo (MCMC) simulation algorithm are proposed, and the potential of the approach for extending the flexibility of expectile regression towards complex semiparametric regression specifications is discussed.
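
To make the notion of an expectile concrete, the sketch below computes sample expectiles by asymmetrically weighted least squares. It illustrates the frequentist definition only and is not the Bayesian sampler proposed in the paper; the function name `expectile` and the toy data are assumptions made for illustration.

```python
import numpy as np

def expectile(y, tau, n_iter=200, tol=1e-12):
    """Sample tau-expectile (0 < tau < 1) via iteratively reweighted means."""
    y = np.asarray(y, dtype=float)
    e = y.mean()
    for _ in range(n_iter):
        # Observations above the current value get weight tau, the rest 1 - tau.
        w = np.where(y > e, tau, 1.0 - tau)
        e_new = np.sum(w * y) / np.sum(w)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e

y = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
e10 = expectile(y, 0.1)
e50 = expectile(y, 0.5)
e90 = expectile(y, 0.9)
print(e10, e50, e90)
```

For tau = 0.5 the weights are symmetric and the expectile coincides with the mean, which is the special case mentioned in the abstract; other values of tau move the estimate towards the lower or upper tail.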

preprint, 2013, arXiv

Penalized Likelihood and Bayesian Function Selection in Regression Models

Challenging research in various fields has driven a wide range of methodological advances in variable selection for regression models with high-dimensional predictors. In comparison, selection of nonlinear functions in models with additive predictors has been considered only more recently. Several competing suggestions have been developed at about the same time and often do not refer to each other. This article provides a state-of-the-art review on function selection, focusing on penalized likelihood and Bayesian concepts, relating various approaches to each other in a unified framework. In an empirical comparison, also including boosting, we evaluate several methods through applications to simulated and real data, thereby providing some guidance on their performance in practice.

preprint, 2011, arXiv

Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models

Structured additive regression provides a general framework for complex Gaussian and non-Gaussian regression models, with predictors comprising arbitrary combinations of nonlinear functions and surfaces, spatial effects, varying coefficients, random effects and further regression terms. The large flexibility of structured additive regression makes function selection a challenging and important task, aiming at (1) selecting the relevant covariates, (2) choosing an appropriate and parsimonious representation of the impact of covariates on the predictor and (3) determining the required interactions. We propose a spike-and-slab prior structure for function selection that allows the inclusion or exclusion of single coefficients as well as blocks of coefficients representing specific model terms. A novel multiplicative parameter expansion is required to obtain good mixing and convergence properties in a Markov chain Monte Carlo simulation approach and is shown to induce desirable shrinkage properties. In simulation studies and with (real) benchmark classification data, we investigate sensitivity to hyperparameter settings and compare performance to competitors. The flexibility and applicability of our approach are demonstrated in an additive piecewise exponential model with time-varying effects for right-censored survival times of intensive care patients with sepsis. Geoadditive and additive mixed logit model applications are discussed in an extensive appendix.
