Source author record

Thomas Nagler

Thomas Nagler appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, Machine Learning, Applications, Computation

Catalog footprint

What is connected

7 works

4 topics

4 close collaborators

Preprint, 2023, arXiv

Second-Order Uncertainty Quantification: Variance-Based Measures

Uncertainty quantification is a critical aspect of machine learning models, providing important insights into the reliability of predictions and aiding the decision-making process in real-world applications. This paper proposes a novel way to use variance-based measures to quantify uncertainty on the basis of second-order distributions in classification problems. A distinctive feature of the measures is the ability to reason about uncertainties on a class-based level, which is useful in situations where nuanced decision-making is required. Recalling some properties from the literature, we highlight that the variance-based measures satisfy important (axiomatic) properties. In addition to this axiomatic approach, we present empirical results showing the measures to be effective and competitive with commonly used entropy-based measures.
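As a rough illustration of what class-wise, variance-based measures can look like, the sketch below assumes the second-order distribution is represented by a finite sample of first-order probability vectors (for example, ensemble predictions). It applies the law of total variance to each class indicator: total uncertainty is mean * (1 - mean), the aleatoric part is the expected Bernoulli variance, and the epistemic part is the variance of the class probability. The function name, the sample representation, and this particular decomposition are assumptions made for illustration; the abstract does not spell out the paper's exact definitions.

```python
from typing import Dict, List, Sequence


def classwise_uncertainty(samples: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Per-class total, aleatoric and epistemic uncertainty from sampled probability vectors."""
    n = len(samples)
    k = len(samples[0])
    total, aleatoric, epistemic = [], [], []
    for c in range(k):
        probs = [s[c] for s in samples]
        mean = sum(probs) / n
        # Epistemic part: variance of the class probability across samples.
        epi = sum((p - mean) ** 2 for p in probs) / n
        # Aleatoric part: expected Bernoulli variance of the class indicator.
        ale = sum(p * (1.0 - p) for p in probs) / n
        total.append(mean * (1.0 - mean))
        aleatoric.append(ale)
        epistemic.append(epi)
    return {"total": total, "aleatoric": aleatoric, "epistemic": epistemic}


# Three hypothetical ensemble members predicting over three classes.
ensemble = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
]
result = classwise_uncertainty(ensemble)
```

In this toy ensemble the members agree on the third class, so its epistemic term is essentially zero, while the first two classes carry the disagreement. For every class the total equals the sum of the aleatoric and epistemic parts.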

Preprint, 2022, arXiv

Solving estimating equations with copulas

Thanks to their ability to capture complex dependence structures, copulas are frequently used to glue random variables into a joint model with arbitrary marginal distributions. More recently, they have been applied to solve statistical learning problems such as regression or classification. Framing such approaches as solutions of estimating equations, we generalize them in a unified framework. We can then obtain simultaneous, coherent inferences across multiple regression-like problems. We derive consistency, asymptotic normality, and validity of the bootstrap for corresponding estimators. The conditions allow for both continuous and discrete data as well as parametric, nonparametric, and semiparametric estimators of the copula and marginal distributions. The versatility of this methodology is illustrated by several theoretical examples, a simulation study, and an application to financial portfolio allocation.

Preprint, 2022, arXiv

Stationary vine copula models for multivariate time series

Multivariate time series exhibit two types of dependence: across variables and across time points. Vine copulas are graphical models for the dependence and can conveniently capture both types of dependence in the same model. We derive the maximal class of graph structures that guarantee stationarity under a natural and verifiable condition called translation invariance. We propose computationally efficient methods for estimation, simulation, prediction, and uncertainty quantification and show their validity by asymptotic results and simulations. The theoretical results allow for misspecified models and, even when specialized to the iid case, go beyond what is available in the literature. Their proofs are based on new results for general semiparametric method-of-moment estimators, which may be of independent interest. The new model class is illustrated by an application to forecasting returns of a portfolio of 20 stocks, where the models show excellent forecast performance. The paper is accompanied by an open-source software implementation.

Preprint, 2022, arXiv

Statistical Dependence Analyses of Operational Flight Data Used for Landing Reconstruction Enhancement

The Rauch-Tung-Striebel (RTS) smoother is widely used for state estimation; it is used here to improve data quality with respect to physical coherence and to increase resolution. The purpose of this paper is to enhance the performance of the RTS smoother in reconstructing an aircraft landing using only data recorded on board. In this way, errors and uncertainties in operational flight data (e.g., altitude, attitude, position, speed) recorded during flights of civil aircraft are minimized. These data can be used for subsequent analyses of flight safety or efficiency, which is commonly referred to as Flight Data Monitoring (FDM). Statistical assumptions of the smoother theory are not always verified in applications but are (consciously or not) assumed to hold. These assumptions can hardly be verified before the smoother is applied; however, they can be verified using the results of an initial smoother iteration, and modifications of specific smoother characteristics can then be suggested. This project specifically verifies assumptions on the measurement noise characteristics. Variance and covariance of the measurement noise can be checked after the initial smoother application. It is found that these characteristics change over time and should be accounted for with a time-varying covariance matrix. This sequence of matrices is estimated by kernel smoothing and replaces the fixed, diagonal covariance matrix initially assumed for the first smoother run. The results of this second smoother iteration are mostly improved compared to the initial iteration, i.e., the errors are significantly reduced. Subsequently, the remaining dependence structures in the residuals of the second smoother iteration can be captured by copula models. Their interpretation is useful for revising the physical model used by the RTS smoother.
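The kernel-smoothing step can be pictured with a minimal sketch. It assumes a single measurement channel with zero-mean residuals from a first smoother run, and estimates a time-varying noise variance as a Nadaraya-Watson average of squared residuals with a Gaussian kernel. The function name, the kernel choice, the bandwidth, and the residual values are all hypothetical; the abstract does not state which kernel or bandwidth the authors used. A full covariance matrix would smooth outer products of residual vectors in the same way.

```python
import math
from typing import List, Sequence


def kernel_smoothed_variance(residuals: Sequence[float], bandwidth: float) -> List[float]:
    """Nadaraya-Watson estimate of a time-varying noise variance from residuals."""
    n = len(residuals)
    squared = [r * r for r in residuals]
    estimates = []
    for t in range(n):
        # Gaussian kernel weights centred at time index t.
        weights = [math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in range(n)]
        total = sum(weights)
        estimates.append(sum(w * q for w, q in zip(weights, squared)) / total)
    return estimates


# Hypothetical residuals whose spread grows over time.
residuals = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
variance_path = kernel_smoothed_variance(residuals, bandwidth=1.5)
```

The resulting sequence rises from values near the early squared residuals to values near the late ones, which is the kind of time-varying estimate that could replace a fixed variance in a second smoother run.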

Preprint, 2021, arXiv

Explaining predictive models using Shapley values and non-parametric vine copulas

The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. If the features are in reality dependent, this may lead to incorrect explanations. Hence, there have recently been attempts at appropriately modelling/estimating the dependence between the features. Although the proposed methods clearly outperform the traditional approach assuming independence, they have their weaknesses. In this paper we propose two new approaches for modelling the dependence between the features. Both approaches are based on vine copulas, which are flexible tools for modelling multivariate non-Gaussian distributions able to characterise a wide range of complex dependencies. The performance of the proposed methods is evaluated on simulated data sets and a real data set. The experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
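For orientation, the sketch below computes exact Shapley values by enumerating all feature coalitions, given a value function v(S). In prediction explanation, v(S) is typically the expected model output conditional on the features in S, and that conditional expectation is where a dependence model such as a vine copula would enter. Here the value function is a hypothetical toy function supplied directly, so the code shows only the Shapley weighting, not the paper's estimation approach.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def shapley_values(n_features: int, value: Callable[[FrozenSet[int]], float]) -> List[float]:
    """Exact Shapley values by enumerating all feature coalitions."""
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [i for i in range(n_features) if i != j]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (M - |S| - 1)! / M! for coalitions of this size.
            weight = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature j when added to coalition s.
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


# Hypothetical value function standing in for v(S) = E[f(x) | x_S].
def toy_value(coalition: FrozenSet[int]) -> float:
    contributions = {0: 1.0, 1: 2.0, 2: 3.0}
    base = sum(contributions[i] for i in coalition)
    # Interaction between features 0 and 1 when both are known.
    bonus = 1.0 if {0, 1} <= coalition else 0.0
    return base + bonus


phi = shapley_values(3, toy_value)
```

The values sum to v(all features) minus v(empty set), and the interaction bonus is split equally between the two features that produce it. Enumeration costs grow exponentially in the number of features, so this form is only practical for small problems.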

Preprint, 2021, arXiv

R friendly multi-threading in C++

Calling multi-threaded C++ code from R has its perils. Since the R interpreter is single-threaded, one must not check for user interruptions or print to the R console from multiple threads. One can, however, synchronize with R from the main thread. The R package RcppThread (current version 1.0.0) contains a header-only C++ library for thread-safe communication with R that exploits this fact. It includes C++ classes for threads, a thread pool, and parallel loops that routinely synchronize with R. This article explains the package's functionality and gives examples of its usage. The synchronization mechanism may also apply to other threading frameworks. Benchmarks suggest that, although synchronization causes overhead, the parallel abstractions of RcppThread are competitive with other popular libraries in typical scenarios encountered in statistical computing.

Preprint, 2016, arXiv

Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas

Practical applications of nonparametric density estimators in more than three dimensions suffer a great deal from the well-known curse of dimensionality: convergence slows down as dimension increases. We show that one can evade the curse of dimensionality by assuming a simplified vine copula model for the dependence between variables. We formulate a general nonparametric estimator for such a model and show under high-level assumptions that the speed of convergence is independent of dimension. We further discuss a particular implementation for which we validate the high-level assumptions and establish its asymptotic normality. Simulation experiments illustrate a large gain in finite sample performance when the simplifying assumption is at least approximately true. But even when it is severely violated, the vine copula based approach proves advantageous as soon as more than a few variables are involved. Lastly, we give an application of the estimator to a classification problem from astrophysics.
