Researcher profile

Johanna F. Ziegel

Johanna F. Ziegel contributes to research discovery and scholarly infrastructure.

Trust snapshot

Trust 21 (Emerging), Verification L1, unclaimed author
16 works
0 followers
8 topics
4 close collaborators

Published work

16 published items

Preprint, 2022, arXiv

Distributional (Single) Index Models

A Distributional (Single) Index Model (DIM) is a semi-parametric model for distributional regression, that is, estimation of conditional distributions given covariates. The method is a combination of classical single index models for the estimation of the conditional mean of a response given covariates, and isotonic distributional regression. The model for the index is parametric, whereas the conditional distributions are estimated non-parametrically under a stochastic ordering constraint. We show consistency of our estimators and apply them to a highly challenging data set on the length of stay (LoS) of patients in intensive care units. We use the model to provide skillful and calibrated probabilistic predictions for the LoS of individual patients, that outperform the available methods in the literature.

Preprint, 2022, arXiv

Sequentially valid tests for forecast calibration

Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on a regular basis, such as every hour, day, or month, and their quality is monitored continuously. However, the classical statistical tools for forecast evaluation are static, in the sense that statistical tests for forecast calibration are only valid if the evaluation period is fixed in advance. Recently, e-values have been introduced as a new, dynamic method for assessing statistical significance. An e-value is a non-negative random variable with expected value at most one under a null hypothesis. Large e-values give evidence against the null hypothesis, and the multiplicative inverse of an e-value is a conservative p-value. E-values are particularly suitable for sequential forecast evaluation, since they naturally lead to statistical tests which are valid under optional stopping. This article proposes e-values for testing probabilistic calibration of forecasts, which is one of the most important notions of calibration. The proposed methods are also more generally applicable for sequential goodness-of-fit testing. We demonstrate that the e-values are competitive in terms of power when compared to extant methods, which do not allow sequential testing. Furthermore, they provide important and useful insights in the evaluation of probabilistic weather forecasts.
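The two properties of e-values stated in this abstract can be illustrated with a minimal Python sketch. It is not the construction from the paper: the function names are hypothetical, and the e-values passed in are assumed to be valid sequential e-values (each with conditional expectation at most one under the null hypothesis), so that their running product can be monitored and the test stopped at any time.

```python
def e_to_p(e_value: float) -> float:
    """Conservative p-value from an e-value: min(1, 1/e)."""
    if e_value < 0:
        raise ValueError("an e-value must be non-negative")
    return 1.0 if e_value == 0 else min(1.0, 1.0 / e_value)


def running_e_process(e_values):
    """Cumulative product of sequentially observed e-values."""
    products, current = [], 1.0
    for e in e_values:
        current *= e
        products.append(current)
    return products


def first_rejection(e_values, alpha=0.05):
    """Index of the first time the running product reaches 1/alpha, else None."""
    for t, product in enumerate(running_e_process(e_values)):
        if product >= 1.0 / alpha:
            return t
    return None
```

Rejecting as soon as the running product reaches 1/alpha is the usual way such a sequence is turned into a level-alpha test that does not require fixing the evaluation period in advance.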

Preprint, 2022, arXiv

Valid sequential inference on probability forecast performance

Probability forecasts for binary events play a central role in many applications. Their quality is commonly assessed with proper scoring rules, which assign forecasts a numerical score such that a correct forecast achieves a minimal expected score. In this paper, we construct e-values for testing the statistical significance of score differences of competing forecasts in sequential settings. E-values have been proposed as an alternative to p-values for hypothesis testing, and they can easily be transformed into conservative p-values by taking the multiplicative inverse. The e-values proposed in this article are valid in finite samples without any assumptions on the data generating processes. They also allow optional stopping, so a forecast user may decide to interrupt evaluation, taking into account the available data at any time, and still draw statistically valid inference, which is generally not true for classical p-value based tests. In a case study on postprocessing of precipitation forecasts, state-of-the-art forecast dominance tests and e-values lead to the same conclusions.
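The notion of a proper scoring rule used above can be checked numerically. The following Python sketch uses the Brier score, a standard proper scoring rule for binary events; the choice of score, the true probability `q = 0.3`, and the forecast grid are illustrative assumptions and are not taken from the paper.

```python
def brier_score(p: float, y: int) -> float:
    """Brier score of probability forecast p for binary outcome y (0 or 1)."""
    return (p - y) ** 2


def expected_brier(p: float, q: float) -> float:
    """Expected Brier score of forecast p when the event has probability q."""
    return q * brier_score(p, 1) + (1 - q) * brier_score(p, 0)


# Assumed true event probability and a grid of candidate forecasts.
q = 0.3
grid = [i / 100 for i in range(101)]

# The forecast with the smallest expected score on the grid.
best = min(grid, key=lambda p: expected_brier(p, q))
```

Under these assumptions the minimizer `best` coincides with the true probability `q`, which is what "a correct forecast achieves a minimal expected score" means for this score.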

Preprint, 2020, arXiv

Evaluating Range Value at Risk Forecasts

The debate about which quantitative risk measure to choose in practice has mainly focused on the dichotomy between Value at Risk (VaR) -- a quantile -- and Expected Shortfall (ES) -- a tail expectation. Range Value at Risk (RVaR) is a natural interpolation between these two prominent risk measures, which constitutes a tradeoff between the sensitivity of the latter and the robustness of the former, turning it into a practically relevant risk measure on its own. As such, there is a need to statistically validate RVaR forecasts and to compare and rank the performance of different RVaR models, tasks subsumed under the term 'backtesting' in finance. The predictive performance is best evaluated and compared in terms of strictly consistent loss or scoring functions, that is, functions which are minimised in expectation by the correct RVaR forecast. It has been shown recently that RVaR, much like ES, does not admit strictly consistent scoring functions, i.e., it is not elicitable. Mitigating this negative result, this paper shows that a triplet of RVaR with two VaR components at different levels is elicitable. We characterise the class of strictly consistent scoring functions for this triplet. Additional properties of these scoring functions are examined, including the diagnostic tool of Murphy diagrams. The results are illustrated with a simulation study, and we put our approach in perspective with respect to the classical approach of trimmed least squares in robust regression.
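To make the "interpolation" between VaR and ES concrete, here is a minimal Python sketch of an empirical RVaR. It assumes the common definition of RVaR at levels alpha < beta as the average of the quantiles (VaR) over levels between alpha and beta; sign and tail conventions differ across the literature, and the helper names and the midpoint-rule approximation are illustrative choices, not the paper's method.

```python
import math


def empirical_quantile(sorted_sample, level):
    """Lower empirical quantile of an already sorted sample."""
    k = max(1, math.ceil(level * len(sorted_sample)))
    return sorted_sample[k - 1]


def empirical_rvar(sample, alpha, beta, steps=1000):
    """Average of empirical quantiles over levels in (alpha, beta).

    Approximates (1 / (beta - alpha)) * integral of VaR_g dg over [alpha, beta]
    with a midpoint rule on `steps` equally spaced levels.
    """
    if not 0.0 <= alpha < beta <= 1.0:
        raise ValueError("need 0 <= alpha < beta <= 1")
    xs = sorted(sample)
    width = (beta - alpha) / steps
    levels = [alpha + (i + 0.5) * width for i in range(steps)]
    return sum(empirical_quantile(xs, g) for g in levels) / steps


# Toy sample 1, 2, ..., 100; averaging the upper 10% of quantiles.
toy_rvar = empirical_rvar(range(1, 101), 0.9, 1.0)
```

With beta equal to 1 the average runs over the whole upper tail, which corresponds to an ES-type quantity under this convention, while shrinking the interval towards a single level approaches a single quantile.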

Preprint, 2020, arXiv

Local Estimation of a Multivariate Density and its Derivatives

We analyze four different approaches to estimate a multivariate probability density (or the log-density) and its first and second order derivatives. Two methods, local log-likelihood and local Hyvärinen score estimation, are in terms of weighted scoring rules with local quadratic models. The other two approaches are matching of local moments and kernel density estimation. All estimators depend on a general kernel, and we use the Gaussian kernel to provide explicit examples. Asymptotic properties of the estimators are derived and compared. In terms of rates of convergence, a refined local moment matching estimator is the best.

Preprint, 2020, arXiv

Optimal solutions to the isotonic regression problem

In general, the solution to a regression problem is the minimizer of a given loss criterion, and depends on the specified loss function. The nonparametric isotonic regression problem is special, in that optimal solutions can be found by solely specifying a functional. These solutions will then be minimizers under all loss functions simultaneously as long as the loss functions have the requested functional as the Bayes act. For the functional, the only requirement is that it can be defined via an identification function, with examples including the expectation, quantile, and expectile functionals. Generalizing classical results, we characterize the optimal solutions to the isotonic regression problem for such functionals, and extend the results from the case of totally ordered explanatory variables to partial orders. For total orders, we show that any solution resulting from the pool-adjacent-violators algorithm is optimal. It is noteworthy that simultaneous optimality is unattainable in the unimodal regression problem, despite its close connection.
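The pool-adjacent-violators algorithm mentioned above can be sketched in a few lines of Python for the simplest case. The sketch assumes the expectation functional, unit weights, and responses already sorted by a totally ordered covariate; the function name `pav_mean` is hypothetical, and the general functionals and partial orders treated in the paper are not covered.

```python
def pav_mean(y):
    """Nondecreasing fit to y by pooling adjacent violators (block means)."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Pool while the previous block mean exceeds the latest block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit
```

For example, `pav_mean([1, 3, 2, 4])` pools the violating pair 3, 2 into their mean and returns `[1.0, 2.5, 2.5, 4.0]`.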

Preprint, 2016, arXiv

Derivatives of isotropic positive definite functions on spheres

We show that isotropic positive definite functions on the $d$-dimensional sphere which are $2k$ times differentiable at zero have $2k+[(d-1)/2]$ continuous derivatives on $(0,\pi)$. This result is analogous to the result for radial positive definite functions on Euclidean spaces. We prove optimality of the result for all odd dimensions. The proof relies on montée, descente and turning bands operators on spheres which parallel the corresponding operators originating in the work of Matheron for radial positive definite functions on Euclidean spaces.

Preprint, 2015, arXiv

Cross-calibration of probabilistic forecasts

When providing probabilistic forecasts for uncertain future events, it is common to strive for calibrated forecasts, that is, the predictive distribution should be compatible with the observed outcomes. Several notions of calibration are available in the case of a single forecaster, along with diagnostic tools and statistical tests to assess calibration in practice. Often, there is more than one forecaster providing predictions, and these forecasters may use information of the others and therefore influence one another. We extend common notions of calibration, where each forecaster is analysed individually, to notions of cross-calibration where each forecaster is analysed with respect to the other forecasters in a natural way. It is shown theoretically and in simulation studies that cross-calibration is a stronger requirement on a forecaster than calibration. Analogously to calibration for individual forecasters, we provide diagnostic tools and statistical tests to assess forecasters in terms of cross-calibration. The methods are illustrated in simulation examples and applied to probabilistic forecasts for inflation rates by the Bank of England.

Preprint, 2015, arXiv

Higher order elicitability and Osband's principle

A statistical functional, such as the mean or the median, is called elicitable if there is a scoring function or loss function such that the correct forecast of the functional is the unique minimizer of the expected score. Such scoring functions are called strictly consistent for the functional. The elicitability of a functional opens the possibility to compare competing forecasts and to rank them in terms of their realized scores. In this paper, we explore the notion of elicitability for multi-dimensional functionals and give both necessary and sufficient conditions for strictly consistent scoring functions. We cover the case of functionals with elicitable components, but we also show that one-dimensional functionals that are not elicitable can be a component of a higher order elicitable functional. In the case of the variance this is a known result. However, an important result of this paper is that spectral risk measures with a spectral measure with finite support are jointly elicitable if one adds the 'correct' quantiles. A direct consequence of applied interest is that the pair (Value at Risk, Expected Shortfall) is jointly elicitable under mild conditions that are usually fulfilled in risk management applications.
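The definition of elicitability for the two one-dimensional examples named above, the mean and the median, can be illustrated empirically. The Python sketch below assumes the squared error as a score for the mean and the pinball (quantile) loss at level 0.5 for the median; the toy data, the candidate grid, and all names are illustrative assumptions, and the multi-dimensional results of the paper are not reproduced.

```python
def squared_error(x, y):
    """Score for the mean: forecast x, outcome y."""
    return (x - y) ** 2


def pinball_loss(x, y, level):
    """Score for the quantile at `level`: forecast x, outcome y."""
    indicator = 1.0 if y <= x else 0.0
    return (indicator - level) * (x - y)


def mean_score(score, forecast, outcomes):
    """Average realized score of a fixed forecast over the outcomes."""
    return sum(score(forecast, y) for y in outcomes) / len(outcomes)


# Toy outcomes with one large value, so that mean and median differ clearly.
outcomes = [1, 2, 3, 4, 100]
candidates = [0.5 * i for i in range(201)]  # forecasts 0.0, 0.5, ..., 100.0

best_for_mean = min(
    candidates, key=lambda x: mean_score(squared_error, x, outcomes))
best_for_median = min(
    candidates,
    key=lambda x: mean_score(lambda f, y: pinball_loss(f, y, 0.5), x, outcomes))
```

On this toy sample the squared error is minimized at the sample mean and the pinball loss at the sample median, so each score ranks forecasts of "its" functional correctly.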

Preprint, 2014, arXiv

Coherence and elicitability

The risk of a financial position is usually summarized by a risk measure. As this risk measure has to be estimated from historical data, it is important to be able to verify and compare competing estimation procedures. In statistical decision theory, risk measures for which such verification and comparison is possible are called elicitable. It is known that quantile-based risk measures such as value at risk are elicitable. In this paper we show that law-invariant spectral risk measures such as expected shortfall are not elicitable unless they reduce to minus the expected value. Hence, it is unclear how to perform forecast verification or comparison. However, the class of elicitable law-invariant coherent risk measures does not reduce to minus the expected value. We show that it consists of certain expectiles.

Preprint, 2014, arXiv

Distortion Risk Measures and Elicitability

We discuss equivalent axiomatic characterizations of distortion risk measures, and give a novel and concise proof of the characterization of elicitable distortion risk measures. Elicitability has recently been discussed as a desirable criterion for risk measures, motivated by statistical considerations of forecasting. We reveal the mathematical conflict between the requirements of elicitability and comonotonic additivity which intuitively explains why only Value-at-Risk and the mean are elicitable distortion risk measures in a general sense.

Preprint, 2014, arXiv

Limit theorems for nondegenerate U-statistics of continuous semimartingales

This paper presents the asymptotic theory for nondegenerate $U$-statistics of high frequency observations of continuous Itô semimartingales. We prove uniform convergence in probability and show a functional stable central limit theorem for the standardized version of the $U$-statistic. The limiting process in the central limit theorem turns out to be conditionally Gaussian with mean zero. Finally, we indicate potential statistical applications of our probabilistic results.

Preprint, 2014, arXiv

Risk measures with the CxLS property

In the present contribution we characterize law determined convex risk measures that have convex level sets at the level of distributions. By relaxing the assumptions in Weber (2006), we show that these risk measures can be identified with a class of generalized shortfall risk measures. As a direct consequence, we are able to extend the results in Ziegel (2014) and Bellini and Bignozzi (2014) on convex elicitable risk measures and confirm that expectiles are the only elicitable coherent risk measures. Further, we provide a simple characterization of robustness for convex risk measures in terms of a weak notion of mixture continuity.

Preprint, 2013, arXiv

Copula Calibration

We propose notions of calibration for probabilistic forecasts of general multivariate quantities. Probabilistic copula calibration is a natural analogue of probabilistic calibration in the univariate setting. It can be assessed empirically by checking for the uniformity of the copula probability integral transform (CopPIT), which is invariant under coordinate permutations and coordinatewise strictly monotone transformations of the predictive distribution and the outcome. The CopPIT histogram can be interpreted as a generalization and variant of the multivariate rank histogram, which has been used to check the calibration of ensemble forecasts. Climatological copula calibration is an analogue of marginal calibration in the univariate setting. Methods and tools are illustrated in a simulation study and applied to compare raw numerical model and statistically postprocessed ensemble forecasts of bivariate wind vectors.