Source author record

Holger Dette

Holger Dette appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, Unclaimed source record

math.ST, Statistics Theory, Methodology, Applications, math.PR, math.CA, Computation, cond-mat.mtrl-sci, Cryptography and Security, econ.EM, q-fin.GN, q-fin.ST

Catalog footprint

What is connected

84 works

12 topics

4 close collaborators

preprint 2026 arXiv

Inference for Multiple Change-points in Piecewise Locally Stationary Time Series

Change-point detection and locally stationary time series modeling are two major approaches for the analysis of non-stationary data. The former aims to identify stationary phases by detecting abrupt changes in the dynamics of a time series model, while the latter employs (locally) time-varying models to describe smooth changes in the dependence structure of a time series. However, in some applications, abrupt and smooth changes can co-exist, and neither of the two approaches alone can model the data adequately. In this paper, we propose a novel likelihood-based procedure for the inference of multiple change-points in locally stationary time series. In contrast to traditional change-point analysis where an abrupt change occurs in a real-valued parameter, a change in locally stationary time series occurs in a parameter curve, and can be classified as a jump or a kink depending on whether the curve is discontinuous or not. We show that the proposed method can consistently estimate the number, locations, and the types of change-points. Two different asymptotic distributions corresponding respectively to jump and kink estimators are also established. Extensive simulation studies and a real data application to financial time series are provided.

preprint 2026 arXiv

Sequential Eigenvalue Statistics for Change-Point Detection in Covariance Matrices

Testing for change points in sequences of covariance matrices is an important and equally challenging problem in statistical methodology with applications in various fields. Motivated by the observation that even in cases where the ratio between dimension and sample size is as small as $0.05$, tests based on fixed-dimension asymptotics do not keep their preassigned level, we propose to derive critical values of test statistics using an asymptotic regime where the dimension diverges at the same rate as the sample size. This paper introduces a novel and well-founded statistical methodology for detecting change points in a sequence of moderately dimensional covariance matrices. Our approach utilizes a min-type statistic based on a sequential process of likelihood ratio statistics. This is used to construct a test for the hypothesis of the existence of a change point with a corresponding estimator for its location. We provide theoretical guarantees by thoroughly analyzing the asymptotic properties of the sequential process of likelihood ratio statistics. In particular, we prove weak convergence towards a Gaussian process under the null hypothesis of no change. To identify the challenging dependency structure between consecutive test statistics, we employ tools from random matrix theory and stochastic processes.

preprint 2023 arXiv

Testing separability for continuous functional data

Analyzing the covariance structure of data is a fundamental task of statistics. While this task is simple for low-dimensional observations, it becomes challenging for more intricate objects, such as multivariate functions. Here, the covariance can be so complex that just saving a non-parametric estimate is impractical and structural assumptions are necessary to tame the model. One popular assumption for space-time data is separability of the covariance into purely spatial and temporal factors. In this paper, we present a new test for separability in the context of dependent functional time series. While most of the related work studies functional data in a Hilbert space of square integrable functions, we model the observations as objects in the space of continuous functions equipped with the supremum norm. We argue that this (mathematically challenging) setup enhances interpretability for users and is more in line with practical preprocessing. Our test statistic measures the maximal deviation between the estimated covariance kernel and a separable approximation. Critical values are obtained by a non-standard multiplier bootstrap for dependent data. We prove the statistical validity of our approach and demonstrate its practicability in a simulation study and a data example.

preprint 2022 arXiv

An RKHS approach for pivotal inference in functional linear regression

We develop methodology for testing hypotheses regarding the slope function in functional linear regression for time series via a reproducing kernel Hilbert space approach. In contrast to most of the literature, which considers tests for the exact nullity of the slope function, we are interested in the null hypothesis that the slope function vanishes only approximately, where deviations are measured with respect to the $L^2$-norm. An asymptotically pivotal test is proposed, which does not require the estimation of nuisance parameters and long-run covariances. The key technical tools to prove the validity of our approach include a uniform Bahadur representation and a weak invariance principle for a sequential process of estimates of the slope function. Both scalar-on-function and function-on-function linear regression are considered and finite-sample methods for implementing our methodology are provided. We also illustrate the potential of our methods by means of a small simulation study and a data example.

preprint 2022 arXiv

Detecting relevant changes in the spatiotemporal mean function

For a spatiotemporal process $\{X_j(s,t) | ~s \in S~,~t \in T \}_{j =1, \ldots , n} $, where $S$ denotes the set of spatial locations and $T$ the time domain, we consider the problem of testing for a change in the sequence of mean functions. In contrast to most of the literature, we are not interested in arbitrarily small changes, but only in changes with a norm exceeding a given threshold. Asymptotically distribution free tests are proposed, which do not require the estimation of the long-run spatiotemporal covariance structure. In particular, we consider a fully functional approach and a test based on the cumulative sum paradigm, investigate the large sample properties of the corresponding test statistics and study their finite sample properties by means of a simulation study.

preprint 2022 arXiv

Statistical Quantification of Differential Privacy: A Local Approach

In this work, we introduce a new approach for statistical quantification of differential privacy in a black box setting. We present estimators and confidence intervals for the optimal privacy parameter of a randomized algorithm $A$, as well as other key variables (such as the "data-centric privacy level"). Our estimators are based on a local characterization of privacy and in contrast to the related literature avoid the process of "event selection" - a major obstacle to privacy validation. This makes our methods easy to implement and user-friendly. We show fast convergence rates of the estimators and asymptotic validity of the confidence intervals. An experimental study of various algorithms confirms the efficacy of our approach.
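As a rough illustration of what a pointwise view of privacy can look like, the sketch below evaluates the absolute log-ratio of two output densities on a grid and takes its maximum. This is one standard way to express the privacy parameter of a mechanism with continuous output, assumed here for illustration; the abstract does not spell out its own characterization, and the paper's estimators work from samples rather than known densities. The Laplace mechanism, the function names and the grid are all hypothetical choices.

```python
import math

def laplace_log_density(t, loc, scale):
    """Log-density of a Laplace(loc, scale) distribution at the point t."""
    return -math.log(2.0 * scale) - abs(t - loc) / scale

def max_log_ratio(loc_a, loc_b, scale, grid):
    """Largest absolute log-ratio of the two output densities over the grid.

    loc_a and loc_b stand for the query values on two neighbouring
    databases (an assumption made for this example only).
    """
    return max(
        abs(laplace_log_density(t, loc_a, scale) - laplace_log_density(t, loc_b, scale))
        for t in grid
    )

# Query values differing by 1 with noise scale 2: the maximum is 1 / 2.
grid = [i / 10.0 for i in range(-50, 51)]
eps = max_log_ratio(0.0, 1.0, 2.0, grid)
```

In a black box setting the densities are unknown, so a quantity of this kind would have to be estimated from repeated runs of the algorithm.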

preprint 2021 arXiv

Efficient prediction of grain boundary energies from atomistic simulations via sequential design

Data-based materials science is the new promise to accelerate materials design. Especially in computational materials science, data generation can easily be automated. Usually, the focus is on processing and evaluating the data to derive rules or to discover new materials, while less attention is paid to the strategy to generate the data. In this work, we show that by a sequential design of experiment scheme, the process of generating and learning from the data can be combined to discover the relevant sections of the parameter space. Our example is the energy of grain boundaries as a function of their geometric degrees of freedom, calculated via atomistic simulations. The sampling of this grain boundary energy space, or even subspaces of it, represents a challenge due to the presence of deep cusps of the energy, which are located at irregular intervals of the geometric parameters. Existing approaches to sample grain boundary energy subspaces therefore either need a huge amount of datapoints or a priori knowledge of the positions of these cusps. We combine statistical methods with atomistic simulations and a sequential sampling technique and compare this strategy to a regular sampling technique. We thereby demonstrate that this sequential design is able to sample a subspace with a minimal amount of points while finding unknown cusps automatically.

preprint 2021 arXiv

Optimal designs for comparing regression curves -- dependence within and between groups

We consider the problem of designing experiments for the comparison of two regression curves describing the relation between a predictor and a response in two groups, where the data between and within the group may be dependent. In order to derive efficient designs we use results from stochastic analysis to identify the best linear unbiased estimator (BLUE) in a corresponding continuous time model. It is demonstrated that in general simultaneous estimation using the data from both groups yields more precise results than estimation of the parameters separately in the two groups. Using the BLUE from simultaneous estimation, we then construct an efficient linear estimator for finite sample size by minimizing the mean squared error between the optimal solution in the continuous time model and its discrete approximation with respect to the weights (of the linear estimator). Finally, the optimal design points are determined by minimizing the maximal width of a simultaneous confidence band for the difference of the two regression functions. The advantages of the new approach are illustrated by means of a simulation study, where it is shown that the use of the optimal designs yields substantially narrower confidence bands than the application of uniform designs.

preprint 2021 arXiv

Relevant change points in high dimensional time series

This paper investigates the problem of detecting relevant change points in the mean vector, say $μ_t =(μ_{1,t},\ldots ,μ_{d,t})^T$ of a high dimensional time series $(Z_t)_{t\in \mathbb{Z}}$. While the recent literature on testing for change points in this context considers hypotheses for the equality of the means $μ_h^{(1)}$ and $μ_h^{(2)}$ before and after the change points in the different components, we are interested in a null hypothesis of the form $$ H_0: |μ^{(1)}_{h} - μ^{(2)}_{h} | \leq Δ_h ~~~\mbox{ for all } ~~h=1,\ldots ,d $$ where $Δ_1, \ldots , Δ_d$ are given thresholds for which a smaller difference of the means in the $h$-th component is considered to be non-relevant. We propose a new test for this problem based on the maximum of squared and integrated CUSUM statistics and investigate its properties as the sample size $n$ and the dimension $d$ both converge to infinity. In particular, using Gaussian approximations for the maximum of a large number of dependent random variables, we show that on certain points of the boundary of the null hypothesis a standardised version of the maximum converges weakly to a Gumbel distribution.
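For readers unfamiliar with CUSUM statistics, the following is a minimal sketch of the classical univariate CUSUM process for a single change in the mean. It is only the textbook building block, not the squared and integrated high-dimensional statistic of the paper, and it ignores the thresholds $Δ_h$. The function names are hypothetical.

```python
import math

def cusum_process(x):
    """Classical CUSUM values |S_k - (k/n) S_n| / sqrt(n) for k = 1, ..., n."""
    n = len(x)
    total = sum(x)
    partial = 0.0
    values = []
    for k in range(1, n + 1):
        partial += x[k - 1]
        values.append(abs(partial - k * total / n) / math.sqrt(n))
    return values

def cusum_change_estimate(x):
    """Return (k_hat, max_value): the index maximising the CUSUM process
    (1-based, last observation before the estimated change) and its value."""
    values = cusum_process(x)
    k_hat = max(range(len(values)), key=values.__getitem__) + 1
    return k_hat, values[k_hat - 1]

# A noiseless mean shift from 0 to 1 after the 10th observation.
k_hat, stat = cusum_change_estimate([0.0] * 10 + [1.0] * 10)
```

A classical test would compare the maximum with a quantile of its limiting distribution; in the high-dimensional setting of the abstract the maximum is taken over components as well, which leads to the Gumbel limit mentioned above.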

preprint 2020 arXiv

A distribution free test for changes in the trend function of locally stationary processes

In the common time series model $X_{i,n} = μ(i/n) + \varepsilon_{i,n}$ with non-stationary errors we consider the problem of detecting a significant deviation of the mean function $μ$ from a benchmark $g (μ)$ (such as the initial value $μ(0)$ or the average trend $\int_{0}^{1} μ(t) dt$). The problem is motivated by a more realistic modelling of change point analysis, where one is interested in identifying relevant deviations in a smoothly varying sequence of means $ (μ(i/n))_{i =1,\ldots ,n }$ and cannot assume that the sequence is piecewise constant. A test for this type of hypotheses is developed using an appropriate estimator for the integrated squared deviation of the mean function and the threshold. By a new concept of self-normalization adapted to non-stationary processes an asymptotically pivotal test for the hypothesis of a relevant deviation is constructed. The results are illustrated by means of a simulation study and a data example.

preprint 2020 arXiv

A new approach for open-end sequential change point monitoring

We propose a new sequential monitoring scheme for changes in the parameters of a multivariate time series. In contrast to procedures proposed in the literature, which compare an estimator from the training sample with an estimator calculated from the remaining data, we suggest dividing the sample at each time point after the training sample. Estimators from the sample before and after all separation points are then continuously compared by calculating a maximum of norms of their differences. For open-end scenarios our approach yields an asymptotic level $α$ procedure, which is consistent under the alternative of a change in the parameter. By means of a simulation study it is demonstrated that the new method outperforms the commonly used procedures with respect to power, and the feasibility of our approach is illustrated by analyzing two data examples.

preprint 2020 arXiv

A note on optimal designs for estimating the slope of a polynomial regression

In this note we consider the optimal design problem for estimating the slope of a polynomial regression with no intercept at a given point, say $z$. In contrast to previous work, which considers symmetric design spaces, we investigate the model on the interval $[0, a]$ and characterize those values of $z$ where an explicit solution of the optimal design is possible.

preprint 2020 arXiv

A Portmanteau-type test for detecting serial correlation in locally stationary functional time series

The Portmanteau test provides the vanilla method for detecting serial correlations in classical univariate time series analysis. The method is extended to the case of observations from a locally stationary functional time series. Asymptotic critical values are obtained by a suitable block multiplier bootstrap procedure. The test is shown to asymptotically hold its level and to be consistent against general alternatives.
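As a point of reference for the classical univariate case, here is a minimal sketch of the Ljung-Box statistic, one common Portmanteau variant. It is chosen here for illustration only; the abstract does not say which variant is extended, and the functional, locally stationary version with bootstrap critical values is not reproduced. The function name is hypothetical.

```python
def ljung_box(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1}^{max_lag} rho_k^2 / (n - k),
    where rho_k is the lag-k sample autocorrelation of the series x."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)  # n times the sample variance
    q = 0.0
    for k in range(1, max_lag + 1):
        # lag-k sample autocovariance (up to the common factor 1/n)
        acov = sum(centered[t] * centered[t - k] for t in range(k, n))
        rho = acov / denom
        q += rho * rho / (n - k)
    return n * (n + 2) * q

# A strongly alternating series has a large lag-1 autocorrelation.
q_stat = ljung_box([1.0, -1.0] * 10, max_lag=1)
```

In the classical setting the statistic is compared with a chi-square quantile with `max_lag` degrees of freedom (for a raw series with no fitted model); the paper instead obtains critical values from a block multiplier bootstrap.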

preprint 2020 arXiv

Are deviations in a gradually varying mean relevant? A testing approach based on sup-norm estimators

Classical change point analysis aims at (1) detecting abrupt changes in the mean of a possibly non-stationary time series and at (2) identifying regions where the mean exhibits a piecewise constant behavior. In many applications however, it is more reasonable to assume that the mean changes gradually in a smooth way. Those gradual changes may either be non-relevant (i.e., small), or relevant for a specific problem at hand, and the present paper presents statistical methodology to detect the latter. More precisely, we consider the common nonparametric regression model $X_{i} = μ(i/n) + \varepsilon_{i}$ with possibly non-stationary errors and propose a test for the null hypothesis that the maximum absolute deviation of the regression function $μ$ from a functional $g (μ)$ (such as the value $μ(0)$ or the integral $\int_{0}^{1} μ(t) dt$) is smaller than a given threshold on a given interval $[x_{0},x_{1}] \subseteq [0,1]$. A test for this type of hypotheses is developed using an appropriate estimator, say $\hat d_{\infty, n}$, for the maximum deviation $ d_{\infty}= \sup_{t \in [x_{0},x_{1}]} |μ(t) - g( μ) |$. We derive the limiting distribution of an appropriately standardized version of $\hat d_{\infty,n}$, where the standardization depends on the Lebesgue measure of the set of extremal points of the function $μ(\cdot)-g(μ)$. A refined procedure based on an estimate of this set is developed and its consistency is proved. The results are illustrated by means of a simulation study and a data example.

preprint 2020 arXiv

Design admissibility and de la Garza phenomenon in multi-factor experiments

The determination of an optimal design for a given regression problem is an intricate optimization problem, especially for models with multivariate predictors. Design admissibility and invariance are main tools to reduce the complexity of the optimization problem and have been successfully applied for models with univariate predictors. In particular several authors have developed sufficient conditions for the existence of saturated designs in univariate models, where the number of support points of the optimal design equals the number of parameters. These results generalize the celebrated de la Garza phenomenon (de la Garza, 1954) which states that for a polynomial regression model of degree $p-1$ any optimal design can be based on at most $p$ points. This paper provides - for the first time - extensions of these results for models with a multivariate predictor. In particular we study a geometric characterization of the support points of an optimal design to provide sufficient conditions for the occurrence of the de la Garza phenomenon in models with multivariate predictors and characterize properties of admissible designs in terms of admissibility of designs in conditional univariate regression models.

preprint 2020 arXiv

Detecting relevant differences in the covariance operators of functional time series -- a sup-norm approach

In this paper we propose statistical inference tools for the covariance operators of functional time series in the two sample and change point problem. In contrast to most of the literature, the focus of our approach is not testing the null hypothesis of exact equality of the covariance operators. Instead we propose to formulate the null hypotheses in the form that "the distance between the operators is small", where we measure deviations by the sup-norm. We provide powerful bootstrap tests for these types of hypotheses, investigate their asymptotic properties and study their finite sample properties by means of a simulation study.

preprint 2020 arXiv

Efficient model-based Bioequivalence Testing

The classical approach to analyze pharmacokinetic (PK) data in bioequivalence studies aiming to compare two different formulations is to perform noncompartmental analysis (NCA) followed by two one-sided tests (TOST). In this regard the PK parameters $AUC$ and $C_{max}$ are obtained for both treatment groups and their geometric mean ratios are considered. According to current guidelines by the U.S. Food and Drug Administration and the European Medicines Agency the formulations are declared to be sufficiently similar if the $90\%$ confidence interval for these ratios falls between $0.8$ and $1.25$. As NCA is not a reliable approach in case of sparse designs, a model-based alternative has already been proposed for the estimation of $AUC$ and $C_{max}$ using non-linear mixed effects models. Here we propose another, more powerful test than the TOST and demonstrate its superiority through a simulation study both for NCA and model-based approaches. For products with high variability on PK parameters, this method appears to have type I errors closer to the conventionally accepted significance level of $0.05$, suggesting its potential use in situations where conventional bioequivalence analysis is not applicable.
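The classical decision rule described above (not the more powerful test proposed in the paper) can be sketched as follows. The function name, the input format (an estimated log-scale difference and its standard error) and the use of a normal quantile in place of a $t$ quantile are simplifying assumptions made for this illustration.

```python
import math
from statistics import NormalDist

def ci_inclusion_bioequivalent(log_diff, se, level=0.90, lower=0.8, upper=1.25):
    """Check whether the confidence interval for a geometric mean ratio
    lies within [lower, upper].

    log_diff: estimated difference of log-means (test minus reference)
    se:       standard error of log_diff
    Assumption: normal approximation instead of a t quantile.
    Returns (ci_low, ci_high, decision) on the ratio scale.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    ci_low = math.exp(log_diff - z * se)
    ci_high = math.exp(log_diff + z * se)
    return ci_low, ci_high, (lower <= ci_low and ci_high <= upper)

# Equal geometric means with a small standard error: interval inside the limits.
low, high, similar = ci_inclusion_bioequivalent(0.0, 0.05)
```

With a larger standard error (for example `se=0.2`) the interval exceeds the limits even when the point estimate of the ratio is exactly 1, which illustrates why highly variable products are difficult for the conventional analysis.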

preprint 2020 arXiv

Efficient tests for bio-equivalence in functional data

We study the problem of testing the equivalence of functional parameters (such as the mean or variance function) in the two sample functional data problem. In contrast to previous work, which reduces the functional problem to a multiple testing problem for the equivalence of scalar data by comparing the functions at each point, our approach is based on an estimate of a distance measuring the maximum deviation between the two functional parameters. Equivalence is claimed if the estimate for the maximum deviation does not exceed a given threshold. A bootstrap procedure is proposed to obtain quantiles for the distribution of the test statistic and consistency of the corresponding test is proved in the large sample scenario. As the methods proposed here avoid the use of the intersection-union principle they are less conservative and more powerful than the currently available methodology.
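To make the maximum-deviation idea concrete, the sketch below computes the largest absolute difference between two sample mean functions observed on a common grid. It covers only the point estimate for the mean function; the bootstrap quantiles that the actual test requires are not implemented, and the function names and data layout (one list of grid values per curve) are assumptions.

```python
def mean_function(curves):
    """Pointwise sample mean of curves observed on a common grid."""
    n = len(curves)
    grid_size = len(curves[0])
    return [sum(curve[j] for curve in curves) / n for j in range(grid_size)]

def max_deviation(sample_a, sample_b):
    """Estimate of the sup-norm distance between the two mean functions,
    evaluated on the observation grid."""
    mean_a = mean_function(sample_a)
    mean_b = mean_function(sample_b)
    return max(abs(a - b) for a, b in zip(mean_a, mean_b))

# Two small samples of curves on a three-point grid.
d_hat = max_deviation([[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]],
                      [[1.0, 2.0, 5.0], [1.0, 2.0, 5.0]])
```

Working with a single distance avoids combining pointwise equivalence tests across the grid, which is the source of the conservativeness the abstract attributes to the intersection-union principle.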

preprint 2020 arXiv

Prediction in locally stationary time series

We develop an estimator for the high-dimensional covariance matrix of a locally stationary process with a smoothly varying trend and use this statistic to derive consistent predictors in non-stationary time series. In contrast to the currently available methods for this problem the predictor developed here does not rely on fitting an autoregressive model and does not require a vanishing trend. The finite sample properties of the new methodology are illustrated by means of a simulation study and a financial indices study.

preprint 2020 arXiv

Quantifying deviations from separability in space-time functional processes

The estimation of covariance operators of spatio-temporal data is in many applications only computationally feasible under simplifying assumptions, such as separability of the covariance into strictly temporal and spatial factors. Powerful tests for this assumption have been proposed in the literature. However, as real world systems, such as climate data, are notoriously inseparable, validating this assumption by statistical tests seems inherently questionable. In this paper we present an alternative approach: By virtue of separability measures, we quantify how strongly the data's covariance operator diverges from a separable approximation. Confidence intervals localize these measures with statistical guarantees. This method provides users with a flexible tool to weigh the computational gains of a separable model against the associated increase in bias. As separable approximations we consider the established methods of partial traces and partial products, and develop weak convergence principles for the corresponding estimators. Moreover, we also prove such results for estimators of optimal, separable approximations, which are arguably of most interest in applications. In particular we present for the first time statistical inference for this object, which has been confined to estimation previously. Besides confidence intervals, our results encompass tests for approximate separability. All methods proposed in this paper are free of nuisance parameters and require neither computationally expensive resampling procedures nor the estimation of nuisance parameters. A simulation study underlines the advantages of our approach and its applicability is demonstrated by the investigation of German annual temperature data.

preprint 2020 arXiv

Statistical Inference for High Dimensional Panel Functional Time Series

In this paper we develop statistical inference tools for high dimensional functional time series. We introduce a new concept of physical dependent processes in the space of square integrable functions, which adopts the idea of basis decomposition of functional data in these spaces, and derive Gaussian and multiplier bootstrap approximations for sums of high dimensional functional time series. These results have numerous important statistical consequences. Exemplarily, we consider the development of joint simultaneous confidence bands for the mean functions and the construction of tests for the hypotheses that the mean functions in the spatial dimension are parallel. The results are illustrated by means of a small simulation study and in the analysis of Canadian temperature data.

preprint 2020 arXiv

Testing relevant hypotheses in functional time series via self-normalization

In this paper we develop methodology for testing relevant hypotheses about functional time series in a tuning-free way. Instead of testing for exact equality, for example for the equality of two mean functions from two independent time series, we propose to test the null hypothesis of no relevant deviation. In the two sample problem this means that an $L^2$-distance between the two mean functions is smaller than a pre-specified threshold. For such hypotheses self-normalization, which was introduced by Shao (2010) and Shao and Zhang (2010) and is commonly used to avoid the estimation of nuisance parameters, is not directly applicable. We develop new self-normalized procedures for testing relevant hypotheses in the one sample, two sample and change point problem and investigate their asymptotic properties. Finite sample properties of the proposed tests are illustrated by means of a simulation study and data examples. Our main focus is on functional time series, but extensions to other settings are also briefly discussed.

preprint 2018 arXiv

A nonparametric test for stationarity in functional time series

We propose a new measure for stationarity of a functional time series, which is based on an explicit representation of the $L^2$-distance between the spectral density operator of a non-stationary process and its best ($L^2$-)approximation by a spectral density operator corresponding to a stationary process. This distance can easily be estimated by sums of Hilbert-Schmidt inner products of periodogram operators (evaluated at different frequencies), and asymptotic normality of an appropriately standardized version of the estimator can be established for the corresponding estimate under the null hypothesis and alternative. As a result we obtain a simple asymptotic frequency domain level $α$ test (using the quantiles of the normal distribution) for the hypothesis of stationarity of functional time series. Other applications such as asymptotic confidence intervals for a measure of stationarity or the construction of tests for "relevant deviations from stationarity", are also briefly mentioned. We demonstrate in a small simulation study that the new method has very good finite sample properties. Moreover, we apply our test to annual temperature curves.

preprint 2016 arXiv

Bayesian $D$-optimal designs for error-in-variables models

Bayesian optimality criteria provide a robust design strategy to parameter misspecification. We develop an approximate design theory for Bayesian $D$-optimality for non-linear regression models with covariates subject to measurement errors. Both maximum likelihood and least squares estimation are studied and explicit characterisations of the Bayesian $D$-optimal saturated designs for the Michaelis-Menten, Emax and exponential regression models are provided. Several data examples are considered for the case of no preference for specific parameter values, where Bayesian $D$-optimal saturated designs are calculated using the uniform prior and compared to several other designs, including the corresponding locally $D$-optimal designs, which are often used in practice.

preprint 2016 arXiv

Best linear unbiased estimators in continuous time regression models

In this paper the problem of best linear unbiased estimation is investigated for continuous-time regression models. We prove several general statements concerning the explicit form of the best linear unbiased estimator (BLUE), in particular when the error process is a smooth process with one or several derivatives of the response process available for construction of the estimators. We derive the explicit form of the BLUE for many specific models including the cases of continuous autoregressive errors of order two and integrated error processes (such as integrated Brownian motion). The results are illustrated by several examples.

preprint 2016 arXiv

Change point detection in autoregressive models with no moment assumptions

In this paper we consider the problem of detecting a change in the parameters of an autoregressive process, where the moments of the innovation process do not necessarily exist. An empirical likelihood ratio test for the existence of a change point is proposed and its asymptotic properties are studied. In contrast to other work on change point tests using empirical likelihood, we do not assume knowledge of the location of the change point. In particular, we prove that the maximizer of the empirical likelihood is a consistent estimator for the parameters of the autoregressive model in the case of no change point and derive the limiting distribution of the corresponding test statistic under the null hypothesis. We also establish consistency of the new test. A nice feature of the method consists in the fact that the resulting test is asymptotically distribution free and does not require an estimate of the long run variance. The asymptotic properties of the test are investigated by means of a small simulation study, which demonstrates good finite sample properties of the proposed method.

preprint 2016 arXiv

Detecting long-range dependence in non-stationary time series

An important problem in time series analysis is the discrimination between non-stationarity and long-range dependence. Most of the literature considers the problem of testing specific parametric hypotheses of non-stationarity (such as a change in the mean) against long-range dependent stationary alternatives. In this paper we suggest a simple approach, which can be used to test the null hypothesis of a general non-stationary short-memory process against the alternative of a non-stationary long-memory process. The test procedure works in the spectral domain and uses a sequence of approximating tvFARIMA models to estimate the time varying long-range dependence parameter. We prove uniform consistency of this estimate and asymptotic normality of an averaged version. These results yield a simple test (based on the quantiles of the standard normal distribution), and it is demonstrated in a simulation study that - despite its semi-parametric nature - the new test outperforms the currently available methods, which are constructed to discriminate between specific parametric hypotheses of non-stationary short-range and stationary long-range dependence.

preprint 2016 arXiv

Equivalence of dose response curves

This paper investigates the question of whether the difference between two parametric models $m_1,m_2$ describing the relation between a response variable and several covariates in two different groups is practically irrelevant, such that inference can be performed on the basis of the pooled sample. Statistical methodology is developed to test the hypotheses $H_0 : d(m_1,m_2)\geq ε$ versus $H_1 : d(m_1,m_2) < ε$ to demonstrate equivalence between the two regression curves $m_1,m_2$ for a pre-specified threshold $ε$, where $d$ denotes a measure of the distance between $m_1$ and $m_2$. Our approach is based on the asymptotic properties of a suitable estimator $d(\hat{m}_1; \hat{m}_2)$ of this distance. In order to improve the approximation of the nominal level for small sample sizes a bootstrap test is developed, which addresses the specific form of the interval hypotheses. In particular, data has to be generated under the null hypothesis, which implicitly defines a manifold for the parameter vector. The results are illustrated by means of a simulation study and a data example. It is demonstrated that the new methods substantially improve on currently available approaches with respect to power and approximation of the nominal level.

preprint 2016 arXiv

Hankel determinants of random moment sequences

For $ t \in [0,1]$ let $\underline{H}_{2\lfloor nt \rfloor} = ( m_{i+j})_{i,j=0}^{\lfloor nt \rfloor} $ denote the Hankel matrix of order $2\lfloor nt \rfloor$ of a random vector $(m_1,\ldots ,m_{2n})$ on the moment space $\mathcal{M}_{2n}(I)$ of all moments (up to the order $2n$) of probability measures on the interval $I \subset \mathbb{R} $. In this paper we study the asymptotic properties of the stochastic process $\{ \log \det \underline{H}_{2\lfloor nt \rfloor} \}_{t\in [0,1]}$ as $n \to \infty$. In particular weak convergence and corresponding large deviation principles are derived after appropriate standardization.

preprint2016arXiv

Multiscale inference for a multivariate density with applications to X-ray astronomy

In this paper we propose methods for inference of the geometric features of a multivariate density. Our approach uses multiscale tests for the monotonicity of the density at arbitrary points in arbitrary directions. In particular, a significance test for a mode at a specific point is constructed. Moreover, we develop multiscale methods for identifying regions of monotonicity and a general procedure for detecting the modes of a multivariate density. It is shown that the latter method localizes the modes with an effectively optimal rate. The theoretical results are illustrated by means of a simulation study and a data example. The new method is applied to and motivated by the determination and verification of the position of high-energy sources from X-ray observations by the Swift satellite, which is important for a multiwavelength analysis of objects such as Active Galactic Nuclei.

preprint2016arXiv

Multiscale inference for multivariate deconvolution

In this paper we provide new methodology for inference of the geometric features of a multivariate density in deconvolution. Our approach is based on multiscale tests to detect significant directional derivatives of the unknown density at arbitrary points in arbitrary directions. The multiscale method is used to identify regions of monotonicity and to construct a general procedure for the detection of modes of the multivariate density. Moreover, as an important application a significance test for the presence of a local maximum at a pre-specified point is proposed. The performance of the new methods is investigated from a theoretical point of view and the finite sample properties are illustrated by means of a small simulation study.

preprint2016arXiv

On Wigner-Ville Spectra and the Unicity of Time-Varying Quantile-Based Spectral Densities

The unicity of the time-varying quantile-based spectrum proposed in Birr et al. (2016) is established via an asymptotic representation result involving Wigner-Ville spectra.

preprint2016arXiv

Optimal designs for active controlled dose finding trials with efficacy-toxicity outcomes

Nonlinear regression models addressing both efficacy and toxicity outcomes are increasingly used in dose-finding trials, such as in pharmaceutical drug development. However, research on related experimental design problems for corresponding active controlled trials is still scarce. In this paper we derive optimal designs to estimate efficacy and toxicity in an active controlled clinical dose finding trial when the bivariate continuous outcomes are modeled either by polynomials up to degree 2, the Michaelis-Menten model, the Emax model, or a combination thereof. We determine upper bounds on the number of different dose levels required for the optimal design and provide conditions under which the boundary points of the design space are included in the optimal design. We also provide an analytical description of the minimally supported $D$-optimal designs and show that they do not depend on the correlation between the bivariate outcomes. We illustrate the proposed methods with numerical examples and demonstrate the advantages of the $D$-optimal design for a trial, which has recently been considered in the literature.

preprint2016arXiv

Optimal designs for comparing regression models with correlated observations

We consider the problem of efficient statistical inference for comparing two regression curves estimated from two samples of dependent measurements. Based on a representation of the best pair of linear unbiased estimators in continuous time models as a stochastic integral, an efficient pair of linear unbiased estimators with corresponding optimal designs for finite sample size is constructed. This pair minimises the width of the confidence band for the difference between the estimated curves. We thus extend results readily available in the literature to the case of correlated observations and provide an easily implementable and efficient solution. The advantages of using such pairs of estimators with corresponding optimal designs for the comparison of regression models are illustrated via numerical examples.

preprint2016arXiv

Optimal designs for dose response curves with common parameters

A common problem in Phase II clinical trials is the comparison of dose response curves corresponding to different treatment groups. If the effect of the dose level is described by parametric regression models and the treatments differ in the administration frequency (but not in the sort of drug) a reasonable assumption is that the regression models for the different treatments share common parameters. This paper develops optimal design theory for the comparison of different regression models with common parameters. We derive upper bounds on the number of support points of admissible designs, and explicit expressions for $D$-optimal designs are derived for frequently used dose response models with a common location parameter. If the location and scale parameter in the different models coincide, minimally supported designs are determined and sufficient conditions for their optimality in the class of all designs derived. The results are illustrated in a dose-finding study comparing monthly and weekly administration.

preprint2016arXiv

Optimal designs for regression models with autoregressive errors structure

In the one-parameter regression model with AR(1) and AR(2) errors we find explicit expressions and a continuous approximation of the optimal discrete design for the signed least squares estimator. The results are used to derive the optimal variance of the best linear estimator in the continuous time model and to construct efficient estimators and corresponding optimal designs for finite samples. The resulting procedure (estimator and design) provides nearly the same efficiency as the weighted least squares estimator and its variance is close to the optimal variance in the continuous time model. The results are illustrated by several examples demonstrating the feasibility of our approach.

preprint2016arXiv

Optimal discrimination designs for semi-parametric models

Much of the work in the literature on optimal discrimination designs assumes that the models of interest are fully specified, apart from unknown parameters in some models. Recent work allows errors in the models to be non-normally distributed but still requires the specification of the mean structures. This research is motivated by the interesting work of Otsu (2008) to discriminate among semi-parametric models by generalizing the KL-optimality criterion proposed by López-Fidalgo et al. (2007) and Tommasi and López-Fidalgo (2010). In our work we provide further important insights into this interesting optimality criterion. In particular, we propose a practical strategy for finding optimal discrimination designs among semi-parametric models that can also be verified using an equivalence theorem. In addition, we study properties of such optimal designs and identify important cases where the proposed semi-parametric optimal discrimination designs coincide with the celebrated T-optimal designs.

preprint2016arXiv

Quantile Spectral Analysis for Locally Stationary Time Series

Classical spectral methods are subject to two fundamental limitations: they can only account for covariance-related serial dependencies, and they require second-order stationarity. Much attention has been devoted lately to quantile-based spectral methods that go beyond covariance-based serial dependence features. At the same time, covariance-based methods relaxing stationarity into much weaker local stationarity conditions have been developed for a variety of time-series models. Here, we are combining those two approaches by proposing quantile-based spectral methods for locally stationary processes. We therefore introduce a time-varying version of the copula spectra that have been recently proposed in the literature, along with a suitable local lag-window estimator. We propose a new definition of local strict stationarity that allows us to handle completely general non-linear processes without any moment assumptions, thus accommodating our quantile-based concepts and methods. We establish a central limit theorem for the new estimators, and illustrate the power of the proposed methodology by means of a simulation study. Moreover, in two empirical studies (namely of the Standard & Poor's 500 series and a temperature dataset recorded in Hohenpeissenberg) we demonstrate that the new approach detects important variations in serial dependence structures both across time and across quantiles. Such variations remain completely undetected, and are actually undetectable, via classical covariance-based spectral methods.

preprint2016arXiv

Quantile spectral processes: Asymptotic analysis and inference

Quantile- and copula-related spectral concepts recently have been considered by various authors. Those spectra, in their most general form, provide a full characterization of the copulas associated with the pairs $(X_t,X_{t-k})$ in a process $(X_t)_{t\in\mathbb{Z}}$, and account for important dynamic features, such as changes in the conditional shape (skewness, kurtosis), time-irreversibility, or dependence in the extremes that their traditional counterparts cannot capture. Despite various proposals for estimation strategies, only quite incomplete asymptotic distributional results are available so far for the proposed estimators, which constitutes an important obstacle for their practical application. In this paper, we provide a detailed asymptotic analysis of a class of smoothed rank-based cross-periodograms associated with the copula spectral density kernels introduced in Dette et al. [Bernoulli 21 (2015) 781-831]. We show that, for a very general class of (possibly nonlinear) processes, properly scaled and centered smoothed versions of those cross-periodograms, indexed by couples of quantile levels, converge weakly, as stochastic processes, to Gaussian processes. A first application of those results is the construction of asymptotic confidence intervals for copula spectral density kernels. The same convergence results also provide asymptotic distributions (under serially dependent observations) for a new class of rank-based spectral methods involving the Fourier transforms of rank-based serial statistics such as the Spearman, Blomqvist or Gini autocovariance coefficients.

preprint2015arXiv

A new approach to optimal designs for correlated observations

This paper presents a new and efficient method for the construction of optimal designs for regression models with dependent error processes. In contrast to most of the work in this field, which starts with a model for a finite number of observations and considers the asymptotic properties of estimators and designs as the sample size converges to infinity, our approach is based on a continuous time model. We use results from stochastic analysis to identify the best linear unbiased estimator (BLUE) in this model. Based on the BLUE, we construct an efficient linear estimator and corresponding optimal designs in the model for finite sample size by minimizing the mean squared error between the optimal solution in the continuous time model and its discrete approximation with respect to the weights (of the linear estimator) and the optimal design points, in particular in the multi-parameter case. In contrast to previous work on the subject the resulting estimators and corresponding optimal designs are very efficient and easy to implement. This means that they are practically not distinguishable from the weighted least squares estimator and the corresponding optimal designs, which have to be found numerically by non-convex discrete optimization. The advantages of the new approach are illustrated in several numerical examples.

preprint2015arXiv

Change point analysis of second order characteristics in non-stationary time series

An important assumption in the work on testing for structural breaks in time series consists in the fact that the model is formulated such that the stochastic process under the null hypothesis of "no change-point" is stationary. This assumption is crucial to derive (asymptotic) critical values for the corresponding testing procedures using an elegant and powerful mathematical theory, but it might not be very realistic from a practical point of view. This paper develops change point analysis under less restrictive assumptions and deals with the problem of detecting change points in the marginal variance and correlation structures of a non-stationary time series. A CUSUM approach is proposed, which is used to test the "classical" hypothesis of the form $H_0: θ_1=θ_2$ vs. $H_1: θ_1 \neq θ_2$, where $θ_1$ and $θ_2$ denote second order parameters of the process before and after a change point. The asymptotic distribution of the CUSUM test statistic is derived under the null hypothesis. This distribution depends in a complicated way on the dependency structure of the nonlinear non-stationary time series and a bootstrap approach is developed to generate critical values. The results are then extended to test the hypothesis of a non relevant change point, i.e. $H_0: | θ_1-θ_2 | \leq δ$, which reflects the fact that inference should not be changed if the difference between the parameters before and after the change-point is small. In contrast to previous work, our approach requires neither that the mean be constant nor, in the case of testing for lag $k$-correlation, that the mean, variance and fourth order joint cumulants be constant under the null hypothesis. In particular, we allow that the variance has a change point at a different location than the auto-covariance.

preprint2015arXiv

Confidence bands for multivariate and time dependent inverse regression models

Uniform asymptotic confidence bands for a multivariate regression function in an inverse regression model with a convolution-type operator are constructed. The results are derived using strong approximation methods and a limit theorem for the supremum of a stationary Gaussian field over an increasing system of sets. As a particular application, asymptotic confidence bands for a time dependent regression function $f_t(x)$ ($x\in \mathbb {R}^d,t\in \mathbb {R}$) in a convolution-type inverse regression model are obtained. Finally, we demonstrate the practical feasibility of our proposed methods in a simulation study and an application to the estimation of the luminosity profile of the elliptical galaxy NGC5017. To the best knowledge of the authors, the results presented in this paper are the first which provide uniform confidence bands for multivariate nonparametric function estimation in inverse problems.

preprint2015arXiv

Confidence Corridors for Multivariate Generalized Quantile Regression

We focus on the construction of confidence corridors for multivariate nonparametric generalized quantile regression functions. This construction is based on asymptotic results for the maximal deviation between a suitable nonparametric estimator and the true function of interest which follow after a series of approximation steps including a Bahadur representation, a new strong approximation theorem and exponential tail inequalities for Gaussian random fields. As a byproduct we also obtain confidence corridors for the regression function in the classical mean regression. In order to deal with the problem of slowly decreasing error in coverage probability of the asymptotic confidence corridors, which results in meager coverage for small sample sizes, a simple bootstrap procedure is designed based on the leading term of the Bahadur representation. The finite sample properties of both procedures are investigated by means of a simulation study and it is demonstrated that the bootstrap procedure considerably outperforms the asymptotic bands in terms of coverage accuracy. Finally, the bootstrap confidence corridors are used to study the efficacy of the National Supported Work Demonstration, which is a randomized employment enhancement program launched in the 1970s. This article has supplementary materials.

preprint2015arXiv

Detecting gradual changes in locally stationary processes

In a wide range of applications, the stochastic properties of the observed time series change over time. The changes often occur gradually rather than abruptly: the properties are (approximately) constant for some time and then slowly start to change. In many cases, it is of interest to locate the time point where the properties start to vary. In contrast to the analysis of abrupt changes, methods for detecting smooth or gradual change points are less developed and often require strong parametric assumptions. In this paper, we develop a fully nonparametric method to estimate a smooth change point in a locally stationary framework. We set up a general procedure which allows us to deal with a wide variety of stochastic properties including the mean, (auto)covariances and higher moments. The theoretical part of the paper establishes the convergence rate of the new estimator. In addition, we examine its finite sample performance by means of a simulation study and illustrate the methodology by two applications to financial return data.

preprint2015arXiv

Efficient computation of Bayesian optimal discriminating designs

An efficient algorithm for the determination of Bayesian optimal discriminating designs for competing regression models is developed, where the main focus is on models with general distributional assumptions beyond the "classical" case of normally distributed homoscedastic errors. For this purpose we consider a Bayesian version of the Kullback-Leibler (KL) optimality criterion introduced by López-Fidalgo et al. (2007). Discretizing the prior distribution leads to local KL-optimal discriminating design problems for a large number of competing models. All currently available methods either require a large computation time or fail to calculate the optimal discriminating design, because they can only deal efficiently with a few model comparisons. In this paper we develop a new algorithm for the determination of Bayesian optimal discriminating designs with respect to the Kullback-Leibler criterion. It is demonstrated that the new algorithm is able to calculate the optimal discriminating designs with reasonable accuracy and computational time in situations where all currently available procedures are either slow or fail.

preprint2015arXiv

Model Selection versus Model Averaging in Dose Finding Studies

Phase II dose finding studies in clinical drug development are typically conducted to adequately characterize the dose response relationship of a new drug. An important decision is then on the choice of a suitable dose response function to support dose selection for the subsequent Phase III studies. In this paper we compare different approaches for model selection and model averaging using mathematical properties as well as simulations. Accordingly, we review and illustrate asymptotic properties of model selection criteria and investigate their behavior when changing the sample size but keeping the effect size constant. In a large scale simulation study we investigate how the various approaches perform in realistically chosen settings. Finally, the different methods are illustrated with a recently conducted Phase II dose finding study in patients with chronic obstructive pulmonary disease.

preprint2015arXiv

Of copulas, quantiles, ranks and spectra: An $L_1$-approach to spectral analysis

In this paper, we present an alternative method for the spectral analysis of a univariate, strictly stationary time series $\{Y_t\}_{t\in \mathbb {Z}}$. We define a "new" spectrum as the Fourier transform of the differences between copulas of the pairs $(Y_t,Y_{t-k})$ and the independence copula. This object is called a copula spectral density kernel and allows one to separate the marginal and serial aspects of a time series. We show that this spectrum is closely related to the concept of quantile regression. Like quantile regression, which provides much more information about conditional distributions than classical location-scale regression models, copula spectral density kernels are more informative than traditional spectral densities obtained from classical autocovariances. In particular, copula spectral density kernels, in their population versions, provide (asymptotically provide, in their sample versions) a complete description of the copulas of all pairs $(Y_t,Y_{t-k})$. Moreover, they inherit the robustness properties of classical quantile regression, and do not require any distributional assumptions such as the existence of finite moments. In order to estimate the copula spectral density kernel, we introduce rank-based Laplace periodograms which are calculated as bilinear forms of weighted $L_1$-projections of the ranks of the observed time series onto a harmonic regression model. We establish the asymptotic distribution of those periodograms, and the consistency of adequately smoothed versions. The finite-sample properties of the new methodology, and its potential for applications, are briefly investigated by simulations and a short empirical example.

preprint2015arXiv

Optimal designs in regression with correlated errors

This paper discusses the problem of determining optimal designs for regression models, when the observations are dependent and taken on an interval. A complete solution of this challenging optimal design problem is given for a broad class of regression models and covariance kernels. We propose a class of estimators which are only slightly more complicated than the ordinary least-squares estimators. We then demonstrate that we can design the experiments, such that asymptotically the new estimators achieve the same precision as the best linear unbiased estimator computed for the whole trajectory of the process. As a by-product we derive explicit expressions for the BLUE in the continuous time model and analytic expressions for the optimal designs in a wide class of regression models. We also demonstrate that for a finite number of observations the precision of the proposed procedure, which includes the estimator and design, is very close to the best achievable. The results are illustrated on a few numerical examples.

preprint2015arXiv

Quantile Correlations: Uncovering temporal dependencies in financial time series

We conduct an empirical study using the quantile-based correlation function to uncover the temporal dependencies in financial time series. The study uses intraday data for the S&P 500 stocks from the New York Stock Exchange. After establishing an empirical overview we compare the quantile-based correlation function to stochastic processes from the GARCH family and find striking differences. This motivates us to propose the quantile-based correlation function as a powerful tool to assess the agreements between stochastic processes and empirical data.

preprint2015arXiv

Spectral analysis of the Moore-Penrose inverse of a large dimensional sample covariance matrix

For a sample of $n$ independent identically distributed $p$-dimensional centered random vectors with covariance matrix $\mathbf{\Sigma}_n$ let $\tilde{\mathbf{S}}_n$ denote the usual sample covariance (centered by the mean) and $\mathbf{S}_n$ the non-centered sample covariance matrix (i.e. the matrix of second moment estimates), where $p> n$. In this paper, we provide the limiting spectral distribution and central limit theorem for linear spectral statistics of the Moore-Penrose inverse of $\mathbf{S}_n$ and $\tilde{\mathbf{S}}_n$. We consider the large dimensional asymptotics when the number of variables $p\rightarrow\infty$ and the sample size $n\rightarrow\infty$ such that $p/n\rightarrow c\in (1, +\infty)$. We present a Marchenko-Pastur law for both types of matrices, which shows that the limiting spectral distributions for both sample covariance matrices are the same. On the other hand, we demonstrate that the asymptotic distribution of linear spectral statistics of the Moore-Penrose inverse of $\tilde{\mathbf{S}}_n$ differs in the mean from that of $\mathbf{S}_n$.

preprint2014arXiv

$E$-optimal designs for second-order response surface models

$E$-optimal experimental designs for a second-order response surface model with $k\geq1$ predictors are investigated. If the design space is the $k$-dimensional unit cube, Galil and Kiefer [J. Statist. Plann. Inference 1 (1977a) 121-132] determined optimal designs in a restricted class of designs (defined by the multiplicity of the minimal eigenvalue) and stated their universal optimality as a conjecture. In this paper, we prove this claim and show that these designs are in fact $E$-optimal in the class of all approximate designs. Moreover, if the design space is the unit ball, $E$-optimal designs have not been found so far and we also provide a complete solution to this optimal design problem. The main difficulty in the construction of $E$-optimal designs for the second-order response surface model consists in the fact that for $k>1$ the multiplicity of the minimum eigenvalue of the "optimal information matrix" is larger than one (in contrast to the case $k=1$) and as a consequence the corresponding optimality criterion is not differentiable at the optimal solution. These difficulties are solved by considering nonlinear Chebyshev approximation problems, which arise from a corresponding equivalence theorem. The extremal polynomials which solve these Chebyshev problems are constructed explicitly, leading to a complete solution of the corresponding $E$-optimal design problems.

preprint2014arXiv

Bayesian T-optimal discriminating designs

The problem of constructing Bayesian optimal discriminating designs for a class of regression models with respect to the T-optimality criterion introduced by Atkinson and Fedorov (1975a) is considered. It is demonstrated that the discretization of the integral with respect to the prior distribution leads to locally T-optimal discrimination design problems with a large number of model comparisons. Currently available methods can only deal with a few comparisons, but the discretization of the Bayesian prior easily leads to discrimination design problems for more than 100 competing models. A new efficient method is developed to deal with problems of this type. It combines some features of the classical exchange type algorithm with the gradient methods. Convergence is proved and it is demonstrated that the new method can find Bayesian optimal discriminating designs in situations where all currently available procedures fail.

preprint2014arXiv

Designing dose finding studies with an active control for exponential families

In a recent paper Dette et al. (2014) introduced optimal design problems for dose finding studies with an active control. These authors concentrated on regression models with normally distributed errors (with known variance) and the problem of determining optimal designs for estimating the smallest dose which achieves the same treatment effect as the active control. This paper discusses the problem of designing active-controlled dose finding studies from a broader perspective. In particular, we consider a general class of optimality criteria and models arising from an exponential family, which are frequently used for analyzing count data. We investigate under which circumstances optimal designs for dose finding studies including a placebo can be used to obtain optimal designs for studies with an active control. Optimal designs are constructed for several situations and the differences arising from different distributional assumptions are investigated in detail. In particular, our results are applicable for constructing optimal experimental designs to analyze active-controlled dose finding studies with discrete data, and we illustrate the efficiency of the new optimal designs with two recent examples from our consulting projects.

preprint2014arXiv

Detecting Gradual Changes in Locally Stationary Processes

In a wide range of applications, the stochastic properties of the observed time series change over time. The changes often occur gradually rather than abruptly: the properties are (approximately) constant for some time and then slowly start to change. In such situations, it is frequently of interest to locate the time point where the properties start to vary. In contrast to the analysis of abrupt changes, methods for detecting smooth or gradual change points are less developed and often require strong parametric assumptions. In this paper, we develop a fully nonparametric method to estimate a smooth change point in a locally stationary framework. We set up a general procedure which allows us to deal with a wide variety of stochastic properties including the mean, (auto)covariances and higher-order moments. The theoretical part of the paper establishes the convergence rate of the new estimator. In addition, we examine its finite sample performance by means of a simulation study and illustrate the methodology by applications to temperature and financial return data.

preprint2014arXiv

Detecting relevant changes in time series models

Most of the literature on change-point analysis by means of hypothesis testing considers hypotheses of the form $H_0 : θ_1 = θ_2$ vs. $H_1 : θ_1 \neq θ_2$, where $θ_1$ and $θ_2$ denote parameters of the process before and after a change point. This paper takes a different perspective and investigates the null hypotheses of no relevant changes, i.e. $H_0 : \| θ_1 - θ_2 \| \leq Δ$, where $\| \cdot \|$ is an appropriate norm. This formulation of the testing problem is motivated by the fact that in many applications a modification of the statistical analysis might not be necessary if the difference between the parameters before and after the change-point is small. A general approach to problems of this type is developed which is based on the CUSUM principle. For the asymptotic analysis weak convergence of the sequential empirical process has to be established under the alternative of non-stationarity, and it is shown that the resulting test statistic is asymptotically normally distributed. Several applications of the methodology are given including tests for relevant changes in the mean, variance, parameter in a linear regression model and distribution function among others. The finite sample properties of the new tests are investigated by means of a simulation study and illustrated by analyzing a data example from economics.
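For orientation, the CUSUM principle mentioned above can be sketched for the simplest case, a change in the mean. This is the textbook CUSUM path $|S_k - (k/n) S_n| / \sqrt{n}$ with partial sums $S_k$, not the relevant-change test statistic of the paper; it produces no critical values, and the function names `cusum_path` and `cusum_statistic` as well as the toy series are assumptions made for illustration.

```python
def cusum_path(x):
    """Return |S_k - (k/n) * S_n| / sqrt(n) for k = 1..n, where S_k is the k-th partial sum."""
    n = len(x)
    total = sum(x)
    path, partial = [], 0.0
    for k, value in enumerate(x, start=1):
        partial += value
        path.append(abs(partial - k / n * total) / n ** 0.5)
    return path

def cusum_statistic(x):
    """Return the maximum of the CUSUM path and the (1-based) index where it is attained."""
    path = cusum_path(x)
    k_hat = max(range(len(path)), key=path.__getitem__) + 1
    return path[k_hat - 1], k_hat

# Toy series (hypothetical): the mean jumps from 0 to 1 after observation 50.
x = [0.0] * 50 + [1.0] * 50
stat, k_hat = cusum_statistic(x)
```

On this noise-free toy series the path peaks at the last observation before the jump, which is why the maximizing index is commonly used as a change-point estimator. With real data the statistic would still have to be standardized by a long-run variance estimate and compared against a critical value.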

preprint2014arXiv

Nonparametric tests for detecting breaks in the jump behaviour of a time-continuous process

This paper is concerned with tests for changes in the jump behaviour of a time-continuous process. Based on results on weak convergence of a sequential empirical tail integral process, asymptotics of certain test statistics for breaks in the jump measure of an Ito semimartingale are constructed. Whenever limiting distributions depend in a complicated way on the unknown jump measure, empirical quantiles are obtained using a multiplier bootstrap scheme. An extensive simulation study shows a good performance of our tests in finite samples.

preprint2014arXiv

Optimal designs for comparing curves

We consider the optimal design problem for a comparison of two regression curves, which is used to establish the similarity between the dose response relationships of two groups. An optimal pair of designs minimizes the width of the confidence band for the difference between the two regression functions. Optimal design theory (equivalence theorems, efficiency bounds) is developed for this non standard design problem and for some commonly used dose response models optimal designs are found explicitly. The results are illustrated in several examples modeling dose response relationships. It is demonstrated that the optimal pair of designs for the comparison of the regression curves is not the pair of the optimal designs for the individual models. In particular it is shown that the use of the optimal designs proposed in this paper instead of commonly used "non-optimal" designs yields a reduction of the width of the confidence band by more than 50%.

preprint2013arXiv

A test for stationarity based on empirical processes

In this paper we investigate the problem of testing the assumption of stationarity in locally stationary processes. The test is based on an estimate of a Kolmogorov-Smirnov type distance between the true time varying spectral density and its best approximation through a stationary spectral density. Convergence of a time varying empirical spectral process indexed by a class of certain functions is proved, and furthermore the consistency of a bootstrap procedure is shown which is used to approximate the limiting distribution of the test statistic. Compared to other methods proposed in the literature for the problem of testing for stationarity the new approach has at least two advantages: On one hand, the test can detect local alternatives converging to the null hypothesis at any rate $g_T\to0$ such that $g_TT^{1/2}\to \infty$, where $T$ denotes the sample size. On the other hand, the estimator is based on only one regularization parameter while most alternative procedures require two. Finite sample properties of the method are investigated by means of a simulation study, and a comparison with several other tests is provided which have been proposed in the literature.

preprint2013arXiv

Censored quantile regression processes under dependence and penalization

We consider quantile regression processes from censored data under dependent data structures and derive a uniform Bahadur representation for those processes. We also consider cases where the dimension of the parameter in the quantile regression model is large. It is demonstrated that traditional penalized estimators such as the adaptive lasso yield sub-optimal rates if the coefficients of the quantile regression cross zero. New penalization techniques are introduced which are able to deal with specific problems of censored data and yield estimates with an optimal rate. In contrast to most of the literature, the asymptotic analysis does not require the assumption of independent observations, but is based on rather weak assumptions, which are satisfied for many kinds of dependent data.

preprint2013arXiv

Complete classes of designs for nonlinear regression models and principal representations of moment spaces

In a recent paper Yang and Stufken [Ann. Statist. 40 (2012a) 1665-1685] gave sufficient conditions for complete classes of designs for nonlinear regression models. In this note we demonstrate that there is an alternative way to validate this result. Our main argument utilizes the fact that boundary points of moment spaces generated by Chebyshev systems possess unique representations.

preprint2013arXiv

Detection of multiple structural breaks in multivariate time series

We propose a new nonparametric procedure for the detection and estimation of multiple structural breaks in the autocovariance function of a multivariate (second-order) piecewise stationary process, which also identifies the components of the series where the breaks occur. The new method is based on a comparison of the estimated spectral distribution on different segments of the observed time series and consists of three steps. First, it starts with a consistent test, which allows one to establish the existence of structural breaks at a controlled type I error. Secondly, it estimates sets containing possible break points, and finally these sets are reduced to identify the relevant structural breaks and the corresponding components which are responsible for the changes in the autocovariance structure. In contrast to all other methods which have been proposed in the literature, our approach does not make any parametric assumptions, is not especially designed for detecting one single change point and addresses the problem of multiple structural breaks in the autocovariance function directly with no use of the binary segmentation algorithm. We prove that the new procedure detects all components and the corresponding locations where structural breaks occur with probability converging to one as the sample size increases and provide data-driven rules for the selection of all regularization parameters. The results are illustrated by analyzing financial returns, and in a simulation study it is demonstrated that the new procedure outperforms the currently available nonparametric methods for detecting breaks in the dependency structure of multivariate time series.

preprint2013arXiv

Measuring stationarity in long-memory processes

In this paper we consider the problem of measuring stationarity in locally stationary long-memory processes. We introduce an $L_2$-distance between the spectral density of the locally stationary process and its best approximation under the assumption of stationarity. The distance is estimated by a numerical approximation of the integrated spectral periodogram and asymptotic normality of the resulting estimate is established. The results can be used to construct a simple test for the hypothesis of stationarity in locally stationary long-range dependent processes. We also propose a bootstrap procedure to improve the approximation of the nominal level and prove its consistency. Throughout the paper, we will work with Riemann sums of a squared periodogram instead of integrals (as it is usually done in the literature) and as a by-product of independent interest it is demonstrated that the two approaches behave differently in the limit.

preprint2013arXiv

Misspecification in copula-based regression

In a recent paper Noh et al. (2013) proposed a new semiparametric estimate of a regression function with a multivariate predictor, which is based on a specification of the dependence structure between the predictor and the response by means of a parametric copula. This paper investigates the effect which occurs under misspecification of the parametric model. We demonstrate that even for a one- or two-dimensional predictor the error caused by a "wrong" specification of the parametric family is rather severe, if the regression is not monotone in one of the components of the predictor. Moreover, we also show that these problems occur for all of the commonly used copula families and we illustrate in several examples that the copula-based regression may lead to invalid results even when more flexible copula models such as vine copulae (with the common parametric families) are used in the estimation procedure.

preprint2013arXiv

Multiplier bootstrap of tail copulas with applications

For the problem of estimating lower tail and upper tail copulas, we propose two bootstrap procedures for approximating the distribution of the corresponding empirical tail copulas. The first method uses a multiplier bootstrap of the empirical tail copula process and requires estimation of the partial derivatives of the tail copula. The second method avoids this estimation problem and uses multipliers in the two-dimensional empirical distribution function and in the estimates of the marginal distributions. For both multiplier bootstrap procedures, we prove consistency. For these investigations, we demonstrate that the common assumption of the existence of continuous partial derivatives in the literature on tail copula estimation is so restrictive that the tail copula corresponding to tail independence is the only tail copula with this property. Moreover, we are able to solve this problem and prove weak convergence of the empirical tail copula process under nonrestrictive smoothness assumptions that are satisfied for many commonly used models. These results are applied in several statistical problems, including minimum distance estimation and goodness-of-fit testing.

preprint2013arXiv

Optimal design for linear models with correlated observations

In the common linear regression model the problem of determining optimal designs for least squares estimation is considered in the case where the observations are correlated. A necessary condition for the optimality of a given design is provided, which extends the classical equivalence theory for optimal designs in models with uncorrelated errors to the case of dependent data. If the regression functions are eigenfunctions of an integral operator defined by the covariance kernel, it is shown that the corresponding measure defines a universally optimal design. For several models universally optimal designs can be identified explicitly. In particular, it is proved that the uniform distribution is universally optimal for a class of trigonometric regression models with a broad class of covariance kernels and that the arcsine distribution is universally optimal for the polynomial regression model with correlation structure defined by the logarithmic potential. To the best knowledge of the authors these findings provide the first explicit results on optimal designs for regression models with correlated observations, which are not restricted to the location scale model.

preprint2013arXiv

Optimal designs for multi-response generalized linear models with applications in thermal spraying

We consider the problem of designing experiments for investigating particle in-flight properties in thermal spraying. Observations are available on an extensive design for an initial day and thereafter in limited number for any particular day. Generalized linear models including additional day effects are used for analyzing the process, where the models vary with respect to different responses. We construct robust D-optimal designs to collect additional data on any current day, which are efficient for the estimation of the parameters in all models under consideration. These designs improve a reference fractional factorial design substantially. We also investigate designs, which maximize the power of the test for an additional day effect. The results are used to design additional experiments of the thermal spraying process and a comparison of the statistical analysis based on a reference design as well as on a selected D-optimal design is performed.

preprint2013arXiv

Optimal designs for nonlinear regression models with respect to non-informative priors

In nonlinear regression models the Fisher information depends on the parameters of the model. Consequently, optimal designs maximizing some functional of the information matrix cannot be implemented directly but require some preliminary knowledge about the unknown parameters. Bayesian optimality criteria provide an attractive solution to this problem. These criteria depend sensitively on a reasonable specification of a prior distribution for the model parameters which might not be available in all applications. In this paper we investigate Bayesian optimality criteria with non-informative prior distributions. In particular, we study the Jeffreys and the Berger-Bernardo prior for which the corresponding optimality criteria are not necessarily concave. Several examples are investigated where optimal designs with respect to the new criteria are calculated and compared to Bayesian optimal designs based on a uniform and a functional uniform prior.

preprint2013arXiv

Optimal discriminating designs for several competing regression models

The problem of constructing optimal discriminating designs for a class of regression models is considered. We investigate a version of the $T_p$-optimality criterion as introduced by Atkinson and Fedorov [Biometrika 62 (1975a) 289-303]. The numerical construction of optimal designs is very hard and challenging, if the number of pairwise comparisons is larger than 2. It is demonstrated that optimal designs with respect to this type of criteria can be obtained by solving (nonlinear) vector-valued approximation problems. We use a characterization of the best approximations to develop an efficient algorithm for the determination of the optimal discriminating designs. The new procedure is compared with the currently available methods in several numerical examples, and we demonstrate that the new method can find optimal discriminating designs in situations where the currently available procedures fail.

preprint2013arXiv

Robust T-optimal discriminating designs

This paper considers the problem of constructing optimal discriminating experimental designs for competing regression models on the basis of the T-optimality criterion introduced by Atkinson and Fedorov [Biometrika 62 (1975) 57-70]. T-optimal designs depend on unknown model parameters and it is demonstrated that these designs are sensitive with respect to misspecification. As a solution to this problem we propose a Bayesian and standardized maximin approach to construct robust and efficient discriminating designs on the basis of the T-optimality criterion. It is shown that the corresponding Bayesian and standardized maximin optimality criteria are closely related to linear optimality criteria. For the problem of discriminating between two polynomial regression models which differ in the degree by two the robust T-optimal discriminating designs can be found explicitly. The results are illustrated in several examples.

preprint2013arXiv

Smooth backfitting in additive inverse regression

We consider the problem of estimating an additive regression function in an inverse regression model with a convolution type operator. A smooth backfitting procedure is developed and asymptotic normality of the resulting estimator is established. Compared to other methods for the estimation in additive models the new approach neither requires observations on a regular grid nor the estimation of the joint density of the predictor. It is also demonstrated by means of a simulation study that the backfitting estimator outperforms the marginal integration method at least by a factor of two with respect to the integrated mean squared error criterion.

preprint2012arXiv

$T$-optimal designs for discrimination between two polynomial models

This paper is devoted to the explicit construction of optimal designs for discrimination between two polynomial regression models of degree $n-2$ and $n$. In a fundamental paper, Atkinson and Fedorov [Biometrika 62 (1975a) 57--70] proposed the $T$-optimality criterion for this purpose. Recently, Atkinson [MODA 9, Advances in Model-Oriented Design and Analysis (2010) 9--16] determined $T$-optimal designs for polynomials up to degree 6 numerically and based on these results he conjectured that the support points of the optimal design are cosines of the angles that divide half of the circle into equal parts if the coefficient of $x^{n-1}$ in the polynomial of larger degree vanishes. In the present paper we give a strong justification of the conjecture and determine all $T$-optimal designs explicitly for any degree $n\in\mathbb{N}$. In particular, we show that there exists a one-dimensional class of $T$-optimal designs. Moreover, we also present a generalization to the case when the ratio between the coefficients of $x^{n-1}$ and $x^n$ is smaller than a certain critical value. Because of the complexity of the optimization problem, $T$-optimal designs have only been determined numerically so far, and this paper provides the first explicit solution of the $T$-optimal design problem since its introduction by Atkinson and Fedorov [Biometrika 62 (1975a) 57--70]. Finally, for the remaining cases (where the ratio of coefficients is larger than the critical value), we propose a numerical procedure to calculate the $T$-optimal designs. The results are also illustrated in an example.
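As a small illustration of the conjectured support points, the sketch below computes the cosines of the angles that split the half circle $[0, \pi]$ into equal parts. Taking $n$ equal parts (which gives $n+1$ points $\cos(i\pi/n)$, $i = 0, \dots, n$) is an assumption about how to read the abstract, and the function name `half_circle_cosines` is hypothetical. The design weights are not addressed here.

```python
import math


def half_circle_cosines(n):
    """Cosines of the angles i*pi/n for i = n, ..., 0, returned in increasing order.

    Assumption: the half circle is divided into n equal parts.
    """
    return [math.cos(math.pi * i / n) for i in range(n, -1, -1)]


# Candidate support points on [-1, 1] for n = 4 (illustrative choice).
points = half_circle_cosines(4)
print(points)
```

The points are symmetric about zero and cluster towards the endpoints of $[-1, 1]$, in contrast to an equally spaced grid.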

preprint2012arXiv

Asymptotic optimal designs under long-range dependence error structure

We discuss the optimal design problem in regression models with long-range dependence error structure. Asymptotic optimal designs are derived and it is demonstrated that these designs depend only indirectly on the correlation function. Several examples are investigated to illustrate the theory. Finally, the optimal designs are compared with asymptotic optimal designs which were derived by Bickel and Herzberg [Ann. Statist. 7 (1979) 77--95] for regression models with short-range dependent error.

preprint2012arXiv

Distributions on unbounded moment spaces and random moment sequences

In this paper we define distributions on moment spaces corresponding to measures on the real line with an unbounded support. We identify these distributions as limiting distributions of random moment vectors defined on compact moment spaces and as distributions corresponding to random spectral measures associated with the Jacobi, Laguerre and Hermite ensemble from random matrix theory. For random vectors on the unbounded moment spaces we prove a central limit theorem where the centering vectors correspond to the moments of the Marchenko-Pastur distribution and Wigner's semi-circle law.

preprint2012arXiv

Model checks for the volatility under microstructure noise

We consider the problem of testing the parametric form of the volatility for high frequency data. It is demonstrated that in the presence of microstructure noise commonly used tests do not keep the preassigned level and are inconsistent. The concept of preaveraging is used to construct new tests, which do not suffer from these drawbacks. These tests are based on a Kolmogorov-Smirnov or Cramer-von-Mises functional of an integrated stochastic process, for which weak convergence to a (conditional) Gaussian process is established. The finite sample properties of a bootstrap version of the test are illustrated by means of a simulation study.

preprint2012arXiv

Significance testing in quantile regression

We consider the problem of testing significance of predictors in multivariate nonparametric quantile regression. A stochastic process is proposed, which is based on a comparison of the responses with a nonparametric quantile regression estimate under the null hypothesis. It is demonstrated that under the null hypothesis this process converges weakly to a centered Gaussian process and the asymptotic properties of the test under fixed and local alternatives are also discussed. In particular we show that, in contrast to the nonparametric approach based on estimation of $L^2$-distances, the new test is able to detect local alternatives which converge to the null hypothesis with any rate $a_n \to 0$ such that $a_n \sqrt{n} \to \infty$ (here $n$ denotes the sample size). We also present a small simulation study illustrating the finite sample properties of a bootstrap version of the corresponding Kolmogorov-Smirnov test.

preprint2012arXiv

Zeros and ratio asymptotics for matrix orthogonal polynomials

Ratio asymptotics for matrix orthogonal polynomials with recurrence coefficients $A_n$ and $B_n$ having limits $A$ and $B$ respectively (the matrix Nevai class) were obtained by Durán. In the present paper we obtain an alternative description of the limiting ratio. We generalize it to recurrence coefficients which are asymptotically periodic with higher periodicity, and/or which are slowly varying as a function of a parameter. Under such assumptions, we also find the limiting zero distribution of the matrix orthogonal polynomials, generalizing results by Durán-López-Saff and Dette-Reuther to the non-Hermitian case. Our proofs are based on "normal family" arguments and on the solution to a quadratic eigenvalue problem. As an application of our results we obtain new explicit formulas for the spectral measures of the matrix Chebyshev polynomials of the first and second kind, and we derive the asymptotic eigenvalue distribution for a class of random band matrices generalizing the tridiagonal matrices introduced by Dumitriu-Edelman.

preprint2011arXiv

A note on the de la Garza phenomenon for locally optimal designs

The celebrated de la Garza phenomenon states that for a polynomial regression model of degree $p-1$ any optimal design can be based on at most $p$ design points. In a remarkable paper, Yang [Ann. Statist. 38 (2010) 2499--2524] showed that this phenomenon exists in many locally optimal design problems for nonlinear models. In the present note, we present a different view point on these findings using results about moment theory and Chebyshev systems. In particular, we show that this phenomenon occurs in an even larger class of models than considered so far.

preprint2011arXiv

A test for Archimedeanity in bivariate copula models

We propose a new test for the hypothesis that a bivariate copula is an Archimedean copula. The test statistic is based on a combination of two measures resulting from the characterization of Archimedean copulas by the property of associativity and by a strict upper bound on the diagonal by the Fréchet-upper bound. We prove weak convergence of this statistic and show that the critical values of the corresponding test can be determined by the multiplier bootstrap method. The test is shown to be consistent against all departures from Archimedeanity if the copula satisfies weak smoothness assumptions. A simulation study is presented which illustrates the finite sample properties of the new test.

preprint2011arXiv

Matrix measures, random moments and Gaussian ensembles

We consider the moment space $\mathcal{M}_n$ corresponding to $p \times p$ real or complex matrix measures defined on the interval $[0,1]$. The asymptotic properties of the first $k$ components of a uniformly distributed vector $(S_{1,n}, ..., S_{n,n})^* \sim \mathcal{U} (\mathcal{M}_n)$ are studied if $n \to \infty$. In particular, it is shown that an appropriately centered and standardized version of the vector $(S_{1,n}, ..., S_{k,n})^*$ converges weakly to a vector of $k$ independent $p \times p$ Gaussian ensembles. For the proof of our results we use some new relations between ordinary moments and canonical moments of matrix measures which are of their own interest. In particular, it is shown that the first $k$ canonical moments corresponding to the uniform distribution on the real or complex moment space $\mathcal{M}_n$ are independent multivariate Beta distributed random variables and that each of these random variables converges in distribution (if the parameters converge to infinity) to the Gaussian orthogonal ensemble or to the Gaussian unitary ensemble, respectively.

preprint2011arXiv

New estimators of the Pickands dependence function and a test for extreme-value dependence

We propose a new class of estimators for Pickands dependence function which is based on the concept of minimum distance estimation. An explicit integral representation of the function $A^*(t)$, which minimizes a weighted $L^2$-distance between the logarithm of the copula $C(y^{1-t},y^t)$ and functions of the form $A(t)\log(y)$ is derived. If the unknown copula is an extreme-value copula, the function $A^*(t)$ coincides with Pickands dependence function. Moreover, even if this is not the case, the function $A^*(t)$ always satisfies the boundary conditions of a Pickands dependence function. The estimators are obtained by replacing the unknown copula by its empirical counterpart and weak convergence of the corresponding process is shown. A comparison with the commonly used estimators is performed from a theoretical point of view and by means of a simulation study. Our asymptotic and numerical results indicate that some of the new estimators outperform the estimators, which were recently proposed by Genest and Segers [Ann. Statist. 37 (2009) 2990--3022]. As a by-product of our results, we obtain a simple test for the hypothesis of an extreme-value copula, which is consistent against all positive quadrant dependent alternatives satisfying weak differentiability assumptions of first order.

preprint2011arXiv

Response-adaptive dose-finding under model uncertainty

Dose-finding studies are frequently conducted to evaluate the effect of different doses or concentration levels of a compound on a response of interest. Applications include the investigation of a new medicinal drug, a herbicide or fertilizer, a molecular entity, an environmental toxin, or an industrial chemical. In pharmaceutical drug development, dose-finding studies are of critical importance because of regulatory requirements that marketed doses are safe and provide clinically relevant efficacy. Motivated by a dose-finding study in moderate persistent asthma, we propose response-adaptive designs addressing two major challenges in dose-finding studies: uncertainty about the dose-response models and large variability in parameter estimates. To allocate new cohorts of patients in an ongoing study, we use optimal designs that are robust under model uncertainty. In addition, we use a Bayesian shrinkage approach to stabilize the parameter estimates over the successive interim analyses used in the adaptations. This approach allows us to calculate updated parameter estimates and model probabilities that can then be used to calculate the optimal design for subsequent cohorts. The resulting designs are hence robust with respect to model misspecification and additionally can efficiently adapt to the information accrued in an ongoing study. We focus on adaptive designs for estimating the minimum effective dose, although alternative optimality criteria or mixtures thereof could be used, enabling the design to address multiple objectives.

preprint2010arXiv

Optimal designs for discriminating between dose-response models in toxicology studies

We consider design issues for toxicology studies when we have a continuous response and the true mean response is only known to be a member of a class of nested models. This class of non-linear models was proposed by toxicologists who were concerned only with estimation problems. We develop robust and efficient designs for model discrimination and for estimating parameters in the selected model at the same time. In particular, we propose designs that maximize the minimum of $D$- or $D_1$-efficiencies over all models in the given class. We show that our optimal designs are efficient for determining an appropriate model from the postulated class, quite efficient for estimating model parameters in the identified model and also robust with respect to model misspecification. To facilitate the use of optimal design ideas in practice, we have also constructed a website that freely enables practitioners to generate a variety of optimal designs for a range of models and also enables them to evaluate the efficiency of any design.

preprint2010arXiv

Optimal designs for random effect models with correlated errors with applications in population pharmacokinetics

We consider the problem of constructing optimal designs for population pharmacokinetics which use random effect models. It is common practice in the design of experiments in such studies to assume uncorrelated errors for each subject. In the present paper a new approach is introduced to determine efficient designs for nonlinear least squares estimation which addresses the problem of correlation between observations corresponding to the same subject. We use asymptotic arguments to derive optimal design densities, and the designs for finite sample sizes are constructed from the quantiles of the corresponding optimal distribution function. It is demonstrated that compared to the optimal exact designs, whose determination is a hard numerical problem, these designs are very efficient. Alternatively, the designs derived from asymptotic theory could be used as starting designs for the numerical computation of exact optimal designs. Several examples of linear and nonlinear models are presented in order to illustrate the methodology. In particular, it is demonstrated that naively chosen equally spaced designs may lead to less accurate estimation.
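The step from a design density to a finite design can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction used in the paper: the design interval is taken to be $[0, 1]$, the $n$ points are placed at the quantile levels $i/(n-1)$ for $i = 0, \dots, n-1$, the quantile function is obtained by bisection, and the names `quantile` and `design_from_cdf` as well as the two example distribution functions are hypothetical.

```python
def quantile(cdf, u, lo=0.0, hi=1.0, tol=1e-12):
    """Invert a continuous, nondecreasing CDF on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def design_from_cdf(cdf, n):
    """n-point design at the quantile levels i/(n-1) (assumed convention), n >= 2."""
    return [quantile(cdf, i / (n - 1)) for i in range(n)]


# A uniform design density gives equally spaced points.
uniform_design = design_from_cdf(lambda t: t, 5)
# An illustrative density 2t (CDF t^2) moves points towards the right endpoint.
skewed_design = design_from_cdf(lambda t: t * t, 5)
print(uniform_design)
print(skewed_design)
```

With a non-uniform density the resulting points are no longer equally spaced, which is the kind of design the abstract contrasts with naively chosen equally spaced ones.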

preprint1994arXiv

Some new asymptotic properties for the zeros of Jacobi, Laguerre and Hermite polynomials

For the generalized Jacobi, Laguerre and Hermite polynomials $P_n^{(α_n, β_n)} (x)$, $L_n^{(α_n)} (x)$ and $H_n^{(γ_n)} (x)$ the limit distributions of the zeros are found, when the sequences $α_n$ or $β_n$ tend to infinity with a larger order than $n$. The derivation uses special properties of the sequences in the corresponding recurrence formulae. The results are used to give second order approximations for the largest and smallest zero which improve (and generalize) the limit statements in a paper of Moak, Saff and Varga [11].

84 published item(s)

Inference for Multiple Change-points in Piecewise Locally Stationary Time Series

Sequential Eigenvalue Statistics for Change-Point Detection in Covariance Matrices

Testing separability for continuous functional data

An RKHS approach for pivotal inference in functional linear regression

Detecting relevant changes in the spatiotemporal mean function

Statistical Quantification of Differential Privacy: A Local Approach

Efficient prediction of grain boundary energies from atomistic simulations via sequential design

Optimal designs for comparing regression curves -- dependence within and between groups

Relevant change points in high dimensional time series

A distribution free test for changes in the trend function of locally stationary processes

A new approach for open-end sequential change point monitoring

A note on optimal designs for estimating the slope of a polynomial regression

A Portmanteau-type test for detecting serial correlation in locally stationary functional time series

Are deviations in a gradually varying mean relevant? A testing approach based on sup-norm estimators

Design admissibility and de la Garza phenomenon in multi-factor experiments

Detecting relevant differences in the covariance operators of functional time series -- a sup-norm approach

Efficient model-based Bioequivalence Testing

Efficient tests for bio-equivalence in functional data

Prediction in locally stationary time series

Quantifying deviations from separability in space-time functional processes

Statistical Inference for High Dimensional Panel Functional Time Series

Testing relevant hypotheses in functional time series via self-normalization

A nonparametric test for stationarity in functional time series

Bayesian $D$-optimal designs for error-in-variables models

Best linear unbiased estimators in continuous time regression models

Change point detection in autoregressive models with no moment assumptions

Detecting long-range dependence in non-stationary time series

Equivalence of dose response curves

Hankel determinants of random moment sequences

Multiscale inference for a multivariate density with applications to X-ray astronomy

Multiscale inference for multivariate deconvolution

On Wigner-Ville Spectra and the Unicity of Time-Varying Quantile-Based Spectral Densities

Optimal designs for active controlled dose finding trials with efficacy-toxicity outcomes

Optimal designs for comparing regression models with correlated observations

Optimal designs for dose response curves with common parameters

Optimal designs for regression models with autoregressive errors structure

Optimal discrimination designs for semi-parametric models

Quantile Spectral Analysis for Locally Stationary Time Series

Quantile spectral processes: Asymptotic analysis and inference

A new approach to optimal designs for correlated observations

Change point analysis of second order characteristics in non-stationary time series

Confidence bands for multivariate and time dependent inverse regression models

Confidence Corridors for Multivariate Generalized Quantile Regression

Detecting gradual changes in locally stationary processes

Efficient computation of Bayesian optimal discriminating designs

Model Selection versus Model Averaging in Dose Finding Studies

Of copulas, quantiles, ranks and spectra: An $L_1$-approach to spectral analysis

Optimal designs in regression with correlated errors

Quantile Correlations: Uncovering temporal dependencies in financial time series

Spectral analysis of the Moore-Penrose inverse of a large dimensional sample covariance matrix

$E$-optimal designs for second-order response surface models

Bayesian T-optimal discriminating designs

Designing dose finding studies with an active control for exponential families

Detecting Gradual Changes in Locally Stationary Processes

Detecting relevant changes in time series models

Nonparametric tests for detecting breaks in the jump behaviour of a time-continuous process

Optimal designs for comparing curves

A test for stationarity based on empirical processes

Censored quantile regression processes under dependence and penalization

Complete classes of designs for nonlinear regression models and principal representations of moment spaces

Detection of multiple structural breaks in multivariate time series

Measuring stationarity in long-memory processes

Misspecification in copula-based regression

Multiplier bootstrap of tail copulas with applications

Optimal design for linear models with correlated observations

Optimal designs for multi-response generalized linear models with applications in thermal spraying

Optimal designs for nonlinear regression models with respect to non-informative priors

Optimal discriminating designs for several competing regression models

Robust T-optimal discriminating designs

Smooth backfitting in additive inverse regression

$T$-optimal designs for discrimination between two polynomial models

Asymptotic optimal designs under long-range dependence error structure

Distributions on unbounded moment spaces and random moment sequences

Model checks for the volatility under microstructure noise