Researcher profile

Holger Dette

Holger Dette contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging. Verification L1. Unclaimed author.
23 works
0 followers
7 topics
4 close collaborators

Published work

23 published item(s)

preprint, 2026, arXiv

Inference for Multiple Change-points in Piecewise Locally Stationary Time Series

Change-point detection and locally stationary time series modeling are two major approaches for the analysis of non-stationary data. The former aims to identify stationary phases by detecting abrupt changes in the dynamics of a time series model, while the latter employs (locally) time-varying models to describe smooth changes in dependence structure of a time series. However, in some applications, abrupt and smooth changes can co-exist, and neither of the two approaches alone can model the data adequately. In this paper, we propose a novel likelihood-based procedure for the inference of multiple change-points in locally stationary time series. In contrast to traditional change-point analysis where an abrupt change occurs in a real-valued parameter, a change in locally stationary time series occurs in a parameter curve, and can be classified as a jump or a kink depending on whether the curve is discontinuous or not. We show that the proposed method can consistently estimate the number, locations, and the types of change-points. Two different asymptotic distributions corresponding respectively to jump and kink estimators are also established. Extensive simulation studies and a real data application to financial time series are provided.

preprint, 2026, arXiv

Sequential Eigenvalue Statistics for Change-Point Detection in Covariance Matrices

Testing for change points in sequences of covariance matrices is an important and equally challenging problem in statistical methodology with applications in various fields. Motivated by the observation that even in cases where the ratio between dimension and sample size is as small as $0.05$, tests based on a fixed-dimension asymptotics do not keep their preassigned level, we propose to derive critical values of test statistics using an asymptotic regime where the dimension diverges at the same rate as the sample size. This paper introduces a novel and well-founded statistical methodology for detecting change points in a sequence of moderately dimensional covariance matrices. Our approach utilizes a min-type statistic based on a sequential process of likelihood ratio statistics. This is used to construct a test for the hypothesis of the existence of a change point with a corresponding estimator for its location. We provide theoretical guarantees by thoroughly analyzing the asymptotic properties of the sequential process of likelihood ratio statistics. In particular, we prove weak convergence towards a Gaussian process under the null hypothesis of no change. To identify the challenging dependency structure between consecutive test statistics, we employ tools from random matrix theory and stochastic processes.

preprint, 2023, arXiv

Testing separability for continuous functional data

Analyzing the covariance structure of data is a fundamental task of statistics. While this task is simple for low-dimensional observations, it becomes challenging for more intricate objects, such as multivariate functions. Here, the covariance can be so complex that just saving a non-parametric estimate is impractical and structural assumptions are necessary to tame the model. One popular assumption for space-time data is separability of the covariance into purely spatial and temporal factors. In this paper, we present a new test for separability in the context of dependent functional time series. While most of the related work studies functional data in a Hilbert space of square integrable functions, we model the observations as objects in the space of continuous functions equipped with the supremum norm. We argue that this (mathematically challenging) setup enhances interpretability for users and is more in line with practical preprocessing. Our test statistic measures the maximal deviation between the estimated covariance kernel and a separable approximation. Critical values are obtained by a non-standard multiplier bootstrap for dependent data. We prove the statistical validity of our approach and demonstrate its practicability in a simulation study and a data example.

preprint, 2022, arXiv

An RKHS approach for pivotal inference in functional linear regression

We develop methodology for testing hypotheses regarding the slope function in functional linear regression for time series via a reproducing kernel Hilbert space approach. In contrast to most of the literature, which considers tests for the exact nullity of the slope function, we are interested in the null hypothesis that the slope function vanishes only approximately, where deviations are measured with respect to the $L^2$-norm. An asymptotically pivotal test is proposed, which does not require the estimation of nuisance parameters and long-run covariances. The key technical tools to prove the validity of our approach include a uniform Bahadur representation and a weak invariance principle for a sequential process of estimates of the slope function. Both scalar-on-function and function-on-function linear regression are considered and finite-sample methods for implementing our methodology are provided. We also illustrate the potential of our methods by means of a small simulation study and a data example.

preprint, 2022, arXiv

Detecting relevant changes in the spatiotemporal mean function

For a spatiotemporal process $\{X_j(s,t) \mid s \in S,\ t \in T\}_{j=1,\ldots,n}$, where $S$ denotes the set of spatial locations and $T$ the time domain, we consider the problem of testing for a change in the sequence of mean functions. In contrast to most of the literature, we are not interested in arbitrarily small changes, but only in changes with a norm exceeding a given threshold. Asymptotically distribution-free tests are proposed, which do not require the estimation of the long-run spatiotemporal covariance structure. In particular, we consider a fully functional approach and a test based on the cumulative sum paradigm, investigate the large sample properties of the corresponding test statistics, and study their finite sample properties by means of a simulation study.

preprint, 2022, arXiv

Statistical Quantification of Differential Privacy: A Local Approach

In this work, we introduce a new approach for statistical quantification of differential privacy in a black box setting. We present estimators and confidence intervals for the optimal privacy parameter of a randomized algorithm $A$, as well as other key variables (such as the "data-centric privacy level"). Our estimators are based on a local characterization of privacy and in contrast to the related literature avoid the process of "event selection" - a major obstacle to privacy validation. This makes our methods easy to implement and user-friendly. We show fast convergence rates of the estimators and asymptotic validity of the confidence intervals. An experimental study of various algorithms confirms the efficacy of our approach.

preprint, 2021, arXiv

Efficient prediction of grain boundary energies from atomistic simulations via sequential design

Data-based materials science is the new promise to accelerate materials design. Especially in computational materials science, data generation can easily be automated. Usually, the focus is on processing and evaluating the data to derive rules or to discover new materials, while less attention is paid to the strategy for generating the data. In this work, we show that by a sequential design of experiment scheme, the process of generating and learning from the data can be combined to discover the relevant sections of the parameter space. Our example is the energy of grain boundaries as a function of their geometric degrees of freedom, calculated via atomistic simulations. The sampling of this grain boundary energy space, or even subspaces of it, represents a challenge due to the presence of deep cusps of the energy, which are located at irregular intervals of the geometric parameters. Existing approaches to sample grain boundary energy subspaces therefore need either a huge number of data points or a priori knowledge of the positions of these cusps. We combine statistical methods with atomistic simulations and a sequential sampling technique and compare this strategy to a regular sampling technique. We thereby demonstrate that this sequential design is able to sample a subspace with a minimal number of points while finding unknown cusps automatically.

preprint, 2021, arXiv

Optimal designs for comparing regression curves -- dependence within and between groups

We consider the problem of designing experiments for the comparison of two regression curves describing the relation between a predictor and a response in two groups, where the data between and within the group may be dependent. In order to derive efficient designs we use results from stochastic analysis to identify the best linear unbiased estimator (BLUE) in a corresponding continuous time model. It is demonstrated that in general simultaneous estimation using the data from both groups yields more precise results than estimation of the parameters separately in the two groups. Using the BLUE from simultaneous estimation, we then construct an efficient linear estimator for finite sample size by minimizing the mean squared error between the optimal solution in the continuous time model and its discrete approximation with respect to the weights (of the linear estimator). Finally, the optimal design points are determined by minimizing the maximal width of a simultaneous confidence band for the difference of the two regression functions. The advantages of the new approach are illustrated by means of a simulation study, where it is shown that the use of the optimal designs yields substantially narrower confidence bands than the application of uniform designs.

preprint, 2021, arXiv

Relevant change points in high dimensional time series

This paper investigates the problem of detecting relevant change points in the mean vector, say $\mu_t = (\mu_{1,t}, \ldots, \mu_{d,t})^T$, of a high dimensional time series $(Z_t)_{t \in \mathbb{Z}}$. While the recent literature on testing for change points in this context considers hypotheses for the equality of the means $\mu_h^{(1)}$ and $\mu_h^{(2)}$ before and after the change points in the different components, we are interested in a null hypothesis of the form $$ H_0: |\mu^{(1)}_{h} - \mu^{(2)}_{h}| \leq \Delta_h \quad \text{for all } h = 1, \ldots, d, $$ where $\Delta_1, \ldots, \Delta_d$ are given thresholds for which a smaller difference of the means in the $h$-th component is considered to be non-relevant. We propose a new test for this problem based on the maximum of squared and integrated CUSUM statistics and investigate its properties as the sample size $n$ and the dimension $d$ both converge to infinity. In particular, using Gaussian approximations for the maximum of a large number of dependent random variables, we show that on certain points of the boundary of the null hypothesis a standardised version of the maximum converges weakly to a Gumbel distribution.
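
As background for the CUSUM idea named in this abstract, here is a minimal sketch of the classical univariate CUSUM change-point estimator in Python. It is not the squared and integrated high-dimensional statistic of the paper; the function name `cusum_change_point` and the single-change setting are assumptions made for illustration only.

```python
def cusum_change_point(x):
    """Classical single-change CUSUM estimator (illustrative sketch).

    Returns (k, jump), where k is the estimated number of observations
    before the change and jump is the difference between the sample
    means after and before position k. Requires len(x) >= 2.
    """
    n = len(x)
    total = sum(x)
    best_k, best_abs, partial = 1, -1.0, 0.0
    for k in range(1, n):
        partial += x[k - 1]
        # CUSUM value S_k = sum_{i<=k} x_i - (k/n) * sum_{i<=n} x_i
        s_k = partial - k / n * total
        if abs(s_k) > best_abs:
            best_k, best_abs = k, abs(s_k)
    mean_before = sum(x[:best_k]) / best_k
    mean_after = sum(x[best_k:]) / (n - best_k)
    return best_k, mean_after - mean_before
```

In the relevant-change setting, the estimated jump in component $h$ would be compared with the threshold for that component rather than with zero. A plug-in comparison of this kind is not a level-controlled test; the calibrated critical values are what the paper derives.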

preprint, 2020, arXiv

A distribution free test for changes in the trend function of locally stationary processes

In the common time series model $X_{i,n} = \mu(i/n) + \varepsilon_{i,n}$ with non-stationary errors, we consider the problem of detecting a significant deviation of the mean function $\mu$ from a benchmark $g(\mu)$ (such as the initial value $\mu(0)$ or the average trend $\int_{0}^{1} \mu(t)\,dt$). The problem is motivated by a more realistic modelling of change point analysis, where one is interested in identifying relevant deviations in a smoothly varying sequence of means $(\mu(i/n))_{i=1,\ldots,n}$ and cannot assume that the sequence is piecewise constant. A test for this type of hypothesis is developed using an appropriate estimator for the integrated squared deviation of the mean function and the threshold. By a new concept of self-normalization adapted to non-stationary processes, an asymptotically pivotal test for the hypothesis of a relevant deviation is constructed. The results are illustrated by means of a simulation study and a data example.

preprint, 2020, arXiv

A new approach for open-end sequential change point monitoring

We propose a new sequential monitoring scheme for changes in the parameters of a multivariate time series. In contrast to procedures proposed in the literature, which compare an estimator from the training sample with an estimator calculated from the remaining data, we suggest dividing the sample at each time point after the training sample. Estimators from the sample before and after all separation points are then continuously compared by calculating a maximum of norms of their differences. For open-end scenarios our approach yields an asymptotic level $\alpha$ procedure, which is consistent under the alternative of a change in the parameter. By means of a simulation study it is demonstrated that the new method outperforms the commonly used procedures with respect to power, and the feasibility of our approach is illustrated by analyzing two data examples.

preprint, 2020, arXiv

A note on optimal designs for estimating the slope of a polynomial regression

In this note we consider the optimal design problem for estimating the slope of a polynomial regression with no intercept at a given point, say $z$. In contrast to previous work, which considers symmetric design spaces, we investigate the model on the interval $[0, a]$ and characterize those values of $z$ for which an explicit solution of the optimal design is possible.

preprint, 2020, arXiv

A Portmanteau-type test for detecting serial correlation in locally stationary functional time series

The Portmanteau test provides the vanilla method for detecting serial correlations in classical univariate time series analysis. The method is extended to the case of observations from a locally stationary functional time series. Asymptotic critical values are obtained by a suitable block multiplier bootstrap procedure. The test is shown to asymptotically hold its level and to be consistent against general alternatives.
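
For orientation, the classical univariate Portmanteau statistic that this abstract starts from can be sketched as follows. This is the textbook Ljung-Box form for a scalar series, not the functional, locally stationary extension or the block multiplier bootstrap developed in the paper; the function names are illustrative.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def ljung_box(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1}^{h} rho_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )
```

In the classical setting with independent, identically distributed observations, Q is compared with a chi-square quantile with `max_lag` degrees of freedom. That calibration does not carry over to locally stationary functional data, which is why the paper obtains critical values by a bootstrap.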

preprint, 2020, arXiv

Are deviations in a gradually varying mean relevant? A testing approach based on sup-norm estimators

Classical change point analysis aims at (1) detecting abrupt changes in the mean of a possibly non-stationary time series and at (2) identifying regions where the mean exhibits a piecewise constant behavior. In many applications, however, it is more reasonable to assume that the mean changes gradually in a smooth way. Those gradual changes may either be non-relevant (i.e., small), or relevant for a specific problem at hand, and the present paper presents statistical methodology to detect the latter. More precisely, we consider the common nonparametric regression model $X_{i} = \mu(i/n) + \varepsilon_{i}$ with possibly non-stationary errors and propose a test for the null hypothesis that the maximum absolute deviation of the regression function $\mu$ from a functional $g(\mu)$ (such as the value $\mu(0)$ or the integral $\int_{0}^{1} \mu(t)\,dt$) is smaller than a given threshold on a given interval $[x_{0},x_{1}] \subseteq [0,1]$. A test for this type of hypothesis is developed using an appropriate estimator, say $\hat d_{\infty,n}$, for the maximum deviation $d_{\infty} = \sup_{t \in [x_{0},x_{1}]} |\mu(t) - g(\mu)|$. We derive the limiting distribution of an appropriately standardized version of $\hat d_{\infty,n}$, where the standardization depends on the Lebesgue measure of the set of extremal points of the function $\mu(\cdot) - g(\mu)$. A refined procedure based on an estimate of this set is developed and its consistency is proved. The results are illustrated by means of a simulation study and a data example.

preprint, 2020, arXiv

Design admissibility and de la Garza phenomenon in multi-factor experiments

The determination of an optimal design for a given regression problem is an intricate optimization problem, especially for models with multivariate predictors. Design admissibility and invariance are main tools to reduce the complexity of the optimization problem and have been successfully applied for models with univariate predictors. In particular several authors have developed sufficient conditions for the existence of saturated designs in univariate models, where the number of support points of the optimal design equals the number of parameters. These results generalize the celebrated de la Garza phenomenon (de la Garza, 1954) which states that for a polynomial regression model of degree $p-1$ any optimal design can be based on at most $p$ points. This paper provides - for the first time - extensions of these results for models with a multivariate predictor. In particular we study a geometric characterization of the support points of an optimal design to provide sufficient conditions for the occurrence of the de la Garza phenomenon in models with multivariate predictors and characterize properties of admissible designs in terms of admissibility of designs in conditional univariate regression models.

preprint, 2020, arXiv

Detecting relevant differences in the covariance operators of functional time series -- a sup-norm approach

In this paper we propose statistical inference tools for the covariance operators of functional time series in the two sample and change point problem. In contrast to most of the literature, the focus of our approach is not testing the null hypothesis of exact equality of the covariance operators. Instead we propose to formulate the null hypotheses in the form that "the distance between the operators is small", where we measure deviations by the sup-norm. We provide powerful bootstrap tests for this type of hypothesis, investigate their asymptotic properties and study their finite sample properties by means of a simulation study.

preprint, 2020, arXiv

Efficient model-based Bioequivalence Testing

The classical approach to analyzing pharmacokinetic (PK) data in bioequivalence studies that compare two different formulations is to perform noncompartmental analysis (NCA) followed by two one-sided tests (TOST). In this regard, the PK parameters $AUC$ and $C_{max}$ are obtained for both treatment groups and their geometric mean ratios are considered. According to current guidelines by the U.S. Food and Drug Administration and the European Medicines Agency, the formulations are declared to be sufficiently similar if the $90\%$ confidence interval for these ratios falls between $0.8$ and $1.25$. As NCA is not a reliable approach in case of sparse designs, a model-based alternative has already been proposed for the estimation of $AUC$ and $C_{max}$ using non-linear mixed effects models. Here we propose another test, more powerful than the TOST, and demonstrate its superiority through a simulation study both for NCA and model-based approaches. For products with high variability in PK parameters, this method appears to have type I errors closer to the conventionally accepted significance level of $0.05$, suggesting its potential use in situations where conventional bioequivalence analysis is not applicable.
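
The conventional decision rule described in this abstract (a 90% confidence interval for the geometric mean ratio lying within 0.8 and 1.25) can be sketched as follows. This is a simplified illustration under stated assumptions: two independent groups, a normal approximation on the log scale, and hypothetical function names. Regulatory analyses typically use crossover designs and t-quantiles, and the more powerful test proposed in the paper is not implemented here.

```python
import math
from statistics import NormalDist, mean, variance


def ratio_confidence_interval(test, reference, level=0.90):
    """Approximate confidence interval for the geometric mean ratio test/reference.

    Assumes two independent samples of positive values (e.g. AUC) and
    uses a normal approximation for the difference of log-scale means.
    """
    log_t = [math.log(v) for v in test]
    log_r = [math.log(v) for v in reference]
    diff = mean(log_t) - mean(log_r)
    se = math.sqrt(variance(log_t) / len(log_t) + variance(log_r) / len(log_r))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(diff - z * se), math.exp(diff + z * se)


def is_bioequivalent(test, reference, lower=0.8, upper=1.25):
    """Declare equivalence if the 90% interval lies within [lower, upper]."""
    lo, hi = ratio_confidence_interval(test, reference)
    return lower <= lo and hi <= upper
```

Checking that the 90% interval lies inside the limits is equivalent to running two one-sided tests at level 0.05 each, which is why the interval uses 90% rather than 95% coverage.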

preprint, 2020, arXiv

Efficient tests for bio-equivalence in functional data

We study the problem of testing the equivalence of functional parameters (such as the mean or variance function) in the two sample functional data problem. In contrast to previous work, which reduces the functional problem to a multiple testing problem for the equivalence of scalar data by comparing the functions at each point, our approach is based on an estimate of a distance measuring the maximum deviation between the two functional parameters. Equivalence is claimed if the estimate for the maximum deviation does not exceed a given threshold. A bootstrap procedure is proposed to obtain quantiles for the distribution of the test statistic and consistency of the corresponding test is proved in the large sample scenario. As the methods proposed here avoid the use of the intersection-union principle they are less conservative and more powerful than the currently available methodology.

preprint, 2020, arXiv

Prediction in locally stationary time series

We develop an estimator for the high-dimensional covariance matrix of a locally stationary process with a smoothly varying trend and use this statistic to derive consistent predictors in non-stationary time series. In contrast to the currently available methods for this problem the predictor developed here does not rely on fitting an autoregressive model and does not require a vanishing trend. The finite sample properties of the new methodology are illustrated by means of a simulation study and a financial indices study.

preprint, 2020, arXiv

Quantifying deviations from separability in space-time functional processes

The estimation of covariance operators of spatio-temporal data is in many applications only computationally feasible under simplifying assumptions, such as separability of the covariance into strictly temporal and spatial factors. Powerful tests for this assumption have been proposed in the literature. However, as real world systems, such as climate data, are notoriously inseparable, validating this assumption by statistical tests seems inherently questionable. In this paper we present an alternative approach: By virtue of separability measures, we quantify how strongly the data's covariance operator diverges from a separable approximation. Confidence intervals localize these measures with statistical guarantees. This method provides users with a flexible tool to weigh the computational gains of a separable model against the associated increase in bias. As separable approximations we consider the established methods of partial traces and partial products, and develop weak convergence principles for the corresponding estimators. Moreover, we also prove such results for estimators of optimal, separable approximations, which are arguably of most interest in applications. In particular, we present for the first time statistical inference for this object, which has been confined to estimation previously. Besides confidence intervals, our results encompass tests for approximate separability. All methods proposed in this paper are free of nuisance parameters and require neither computationally expensive resampling procedures nor the estimation of nuisance parameters. A simulation study underlines the advantages of our approach and its applicability is demonstrated by the investigation of German annual temperature data.

preprint, 2020, arXiv

Statistical Inference for High Dimensional Panel Functional Time Series

In this paper we develop statistical inference tools for high dimensional functional time series. We introduce a new concept of physical dependent processes in the space of square integrable functions, which adopts the idea of basis decomposition of functional data in these spaces, and derive Gaussian and multiplier bootstrap approximations for sums of high dimensional functional time series. These results have numerous important statistical consequences. Exemplarily, we consider the development of joint simultaneous confidence bands for the mean functions and the construction of tests for the hypotheses that the mean functions in the spatial dimension are parallel. The results are illustrated by means of a small simulation study and in the analysis of Canadian temperature data.

preprint, 2020, arXiv

Testing relevant hypotheses in functional time series via self-normalization

In this paper we develop methodology for testing relevant hypotheses about functional time series in a tuning-free way. Instead of testing for exact equality, for example for the equality of two mean functions from two independent time series, we propose to test the null hypothesis of no relevant deviation. In the two sample problem this means that an $L^2$-distance between the two mean functions is smaller than a pre-specified threshold. For such hypotheses self-normalization, which was introduced by Shao (2010) and Shao and Zhang (2010) and is commonly used to avoid the estimation of nuisance parameters, is not directly applicable. We develop new self-normalized procedures for testing relevant hypotheses in the one sample, two sample and change point problem and investigate their asymptotic properties. Finite sample properties of the proposed tests are illustrated by means of a simulation study and data examples. Our main focus is on functional time series, but extensions to other settings are also briefly discussed.

preprint, 2018, arXiv

A nonparametric test for stationarity in functional time series

We propose a new measure for stationarity of a functional time series, which is based on an explicit representation of the $L^2$-distance between the spectral density operator of a non-stationary process and its best ($L^2$-)approximation by a spectral density operator corresponding to a stationary process. This distance can easily be estimated by sums of Hilbert-Schmidt inner products of periodogram operators (evaluated at different frequencies), and asymptotic normality of an appropriately standardized version of the estimator can be established under the null hypothesis and the alternative. As a result we obtain a simple asymptotic frequency domain level $\alpha$ test (using the quantiles of the normal distribution) for the hypothesis of stationarity of functional time series. Other applications, such as asymptotic confidence intervals for a measure of stationarity or the construction of tests for "relevant deviations from stationarity", are also briefly mentioned. We demonstrate in a small simulation study that the new method has very good finite sample properties. Moreover, we apply our test to annual temperature curves.