Researcher profile

Lucas Kook

Lucas Kook contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Comparison of generalised additive models and neural networks in applications: A systematic review

Neural networks have become a popular tool in predictive modelling, more commonly associated with machine learning and artificial intelligence than with statistics. Generalised Additive Models (GAMs) are flexible non-linear statistical models that retain interpretability. Both are state-of-the-art in their own right, with their respective advantages and disadvantages. This paper analyses how these two model classes have performed on real-world tabular data. Following PRISMA guidelines, we conducted a systematic review of papers that performed empirical comparisons of GAMs and neural networks. Eligible papers were identified, yielding 143 papers, with 430 datasets. Key attributes at both paper and dataset levels were extracted and reported. Beyond summarising comparisons, we analyse reported performance metrics using mixed-effects modelling to investigate potential characteristics that can explain and quantify observed differences, including application area, study year, sample size, number of predictors, and neural network complexity. Across datasets, no consistent evidence of superiority was found for either GAMs or neural networks when considering the most frequently reported metrics (RMSE, $R^2$, and AUC). Neural networks tended to outperform in larger datasets and in those with more predictors, but this advantage narrowed over time. Conversely, GAMs remained competitive, particularly in smaller data settings, while retaining interpretability. Reporting of dataset characteristics and neural network complexity was incomplete in much of the literature, limiting transparency and reproducibility. This review highlights that GAMs and neural networks should be viewed as complementary approaches rather than competitors. For many tabular applications, the performance trade-off is modest, and interpretability may favour GAMs.

preprint2022arXiv

Deep interpretable ensembles

Ensembles improve prediction performance and allow uncertainty quantification by aggregating predictions from multiple models. In deep ensembling, the individual models are usually black box neural networks, or recently, partially interpretable semi-structured deep transformation models. However, interpretability of the ensemble members is generally lost upon aggregation. This is a crucial drawback of deep ensembles in high-stake decision fields, in which interpretable models are desired. We propose a novel transformation ensemble which aggregates probabilistic predictions with the guarantee to preserve interpretability and yield uniformly better predictions than the ensemble members on average. Transformation ensembles are tailored towards interpretable deep transformation models but are applicable to a wider range of probabilistic neural networks. In experiments on several publicly available data sets, we demonstrate that transformation ensembles perform on par with classical deep ensembles in terms of prediction performance, discrimination, and calibration. In addition, we demonstrate how transformation ensembles quantify both aleatoric and epistemic uncertainty, and produce minimax optimal predictions under certain conditions.

preprint2022arXiv

deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression

In this paper we describe the implementation of semi-structured deep distributional regression, a flexible framework to learn conditional distributions based on the combination of additive regression models and deep networks. Our implementation encompasses (1) a modular neural network building system based on the deep learning library \pkg{TensorFlow} for the fusion of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, as well as (3) pre-processing steps necessary to set up such models. The software package allows to define models in a user-friendly manner via a formula interface that is inspired by classical statistical model frameworks such as \pkg{mgcv}. The packages' modular design and functionality provides a unique resource for both scalable estimation of complex statistical models and the combination of approaches from deep learning and statistics. This allows for state-of-the-art predictive performance while simultaneously retaining the indispensable interpretability of classical statistical models.

preprint2022arXiv

Distributional Anchor Regression

Prediction models often fail if train and test data do not stem from the same distribution. Out-of-distribution (OOD) generalization to unseen, perturbed test data is a desirable but difficult-to-achieve property for prediction models and in general requires strong assumptions on the data generating process (DGP). In a causally inspired perspective on OOD generalization, the test data arise from a specific class of interventions on exogenous random variables of the DGP, called anchors. Anchor regression models, introduced by Rothenhaeusler et al. (2021), protect against distributional shifts in the test data by employing causal regularization. However, so far anchor regression has only been used with a squared-error loss which is inapplicable to common responses such as censored continuous or ordinal data. Here, we propose a distributional version of anchor regression which generalizes the method to potentially censored responses with at least an ordered sample space. To this end, we combine a flexible class of parametric transformation models for distributional regression with an appropriate causal regularizer under a more general notion of residuals. In an exemplary application and several simulation scenarios we demonstrate the extent to which OOD generalization is possible.