Source author record

Lukas Steinberger

Lukas Steinberger appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

math.ST Statistics Theory

Catalog footprint

What is connected

4 works

2 topics

3 close collaborators

preprint, 2022, arXiv

Conditional predictive inference for stable algorithms

We investigate generically applicable and intuitively appealing prediction intervals based on $k$-fold cross validation. We focus on the conditional coverage probability of the proposed intervals, given the observations in the training sample (hence, training conditional validity), and show that it is close to the nominal level, in an appropriate sense, provided that the underlying algorithm used for computing point predictions is sufficiently stable when feature-response pairs are omitted. Our results are based on a finite sample analysis of the empirical distribution function of $k$-fold cross validation residuals and hold in non-parametric settings with only minimal assumptions on the error distribution. To illustrate our results, we also apply them to high-dimensional linear predictors, where we obtain uniform asymptotic training conditional validity as both sample size and dimension tend to infinity at the same rate and consistent parameter estimation typically fails. These results show that despite the serious problems of resampling procedures for inference on the unknown parameters (cf. Bickel and Freedman, 1983; El Karoui and Purdom, 2018; Mammen, 1996), cross validation methods can be successfully applied to obtain reliable predictive inference even in high dimensions and conditionally on the training data.
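The idea of building a prediction interval from $k$-fold cross validation residuals can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the exact construction analyzed in the paper: the point predictor is a one-feature least-squares line (`fit_line`, a hypothetical stand-in for any stable algorithm), folds are interleaved, and the interval is the full-sample prediction shifted by empirical quantiles of the out-of-fold residuals. All function names are illustrative.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (stand-in for any point predictor)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def empirical_quantile(values, q):
    """Smallest order statistic with at least a fraction q of values at or below it."""
    s = sorted(values)
    idx = max(0, min(len(s) - 1, math.ceil(q * len(s)) - 1))
    return s[idx]


def kfold_residuals(xs, ys, k, fit=fit_line):
    """Out-of-fold residuals: each point is predicted by a model that never saw it."""
    n = len(xs)
    residuals = [0.0] * n
    for fold in range(k):
        test = set(range(fold, n, k))  # interleaved folds, assumes k <= n
        train_x = [xs[i] for i in range(n) if i not in test]
        train_y = [ys[i] for i in range(n) if i not in test]
        predict = fit(train_x, train_y)
        for i in test:
            residuals[i] = ys[i] - predict(xs[i])
    return residuals


def kfold_prediction_interval(xs, ys, x_new, k=5, alpha=0.1, fit=fit_line):
    """Full-sample prediction plus the alpha/2 and 1-alpha/2 residual quantiles."""
    res = kfold_residuals(xs, ys, k, fit)
    center = fit(xs, ys)(x_new)
    return (center + empirical_quantile(res, alpha / 2),
            center + empirical_quantile(res, 1 - alpha / 2))


# Usage on simulated data (the data-generating model is an assumption for the demo).
rng = random.Random(0)
demo_x = [rng.uniform(0, 1) for _ in range(200)]
demo_y = [2 + 3 * x + rng.gauss(0, 1) for x in demo_x]
print(kfold_prediction_interval(demo_x, demo_y, x_new=0.5))
```

Swapping `fit_line` for another fitting routine with the same signature leaves the interval construction unchanged, which reflects the "generically applicable" character described in the abstract.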

preprint, 2022, arXiv

Interactive versus non-interactive locally differentially private estimation: Two elbows for the quadratic functional

Local differential privacy has recently received increasing attention from the statistics community as a valuable tool to protect the privacy of individual data owners without the need for a trusted third party. Similar to the classical notion of randomized response, the idea is that data owners randomize their true information locally and only release the perturbed data. Many different protocols for such local perturbation procedures can be designed. In most estimation problems studied in the literature so far, however, no significant difference in terms of minimax risk between purely non-interactive protocols and protocols that allow for some amount of interaction between individual data providers could be observed. In this paper we show that for estimating the integrated square of a density, sequentially interactive procedures improve substantially over the best possible non-interactive procedure in terms of minimax rate of estimation. In particular, in the non-interactive scenario we identify an elbow in the minimax rate at $s=\frac34$, whereas in the sequentially interactive scenario the elbow is at $s=\frac12$. This is markedly different both from the case of direct observations, where the elbow is well known to be at $s=\frac14$, and from the case where Laplace noise is added to the original data, where an elbow at $s=\frac94$ is obtained. We also provide adaptive estimators that achieve the optimal rate up to log factors, draw connections to non-parametric goodness-of-fit testing and estimation of more general integral functionals, and conduct a series of numerical experiments. The fact that a particular locally differentially private, but interactive, mechanism improves over the simple non-interactive one is also of great importance for practical implementations of local differential privacy.
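The classical randomized response mechanism mentioned in the abstract can be illustrated for a single binary attribute. The sketch below is only that textbook non-interactive mechanism, not the paper's estimator of the quadratic functional: each data owner keeps the true bit with probability $e^{\varepsilon}/(1+e^{\varepsilon})$ and flips it otherwise, and the collector debiases the average of the reports. Function names and the simulated population are assumptions made for the example.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit under epsilon-local differential privacy."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomize_bit(bit, epsilon, rng):
    """Local perturbation: the owner releases only this randomized bit."""
    return bit if rng.random() < keep_probability(epsilon) else 1 - bit


def estimate_proportion(reports, epsilon):
    """Unbiased estimate of the true proportion of ones (requires epsilon > 0).

    E[report] = pi * (2p - 1) + (1 - p), where p is the keep probability,
    so the observed mean is inverted through that affine map.
    """
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Usage on a simulated population with 30% ones (an assumption for the demo).
rng = random.Random(1)
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(20000)]
reports = [randomize_bit(b, 1.0, rng) for b in true_bits]
print(estimate_proportion(reports, 1.0))
```

A smaller `epsilon` pushes the keep probability toward one half, which strengthens privacy and inflates the variance of the debiased estimate by the factor $1/(2p-1)^2$.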

preprint, 2016, arXiv

Leave-one-out prediction intervals in linear regression models with many variables

We study prediction intervals based on leave-one-out residuals in a linear regression model where the number of explanatory variables can be large compared to sample size. We establish uniform asymptotic validity (conditional on the training sample) of the proposed interval under minimal assumptions on the unknown error distribution and the high dimensional design. Our intervals are generic in the sense that they are valid for a large class of linear predictors used to obtain a point forecast, such as robust M-estimators, James-Stein type estimators and penalized estimators like the LASSO. These results show that despite the serious problems of resampling procedures for inference on the unknown parameters, leave-one-out methods can be successfully applied to obtain reliable predictive inference even in high dimensions.
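Leave-one-out residuals are the raw ingredient of the intervals described above. For ordinary least squares they can be obtained without refitting, using the standard identity that the leave-one-out residual equals the in-sample residual divided by one minus the leverage. The sketch below checks that identity against brute-force refitting in the one-regressor case with an intercept. It is an assumption-laden illustration (plain least squares, one feature, hypothetical function names) and does not cover the robust or penalized predictors the paper also treats.

```python
import random


def ols_line(xs, ys):
    """Intercept and slope of the least-squares line (assumes xs are not all equal)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def loo_residuals_refit(xs, ys):
    """Brute force: refit n times, each time omitting one observation."""
    out = []
    for i in range(len(xs)):
        a, b = ols_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        out.append(ys[i] - (a + b * xs[i]))
    return out


def loo_residuals_shortcut(xs, ys):
    """Single fit: divide each residual by (1 - leverage)."""
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a, b = ols_line(xs, ys)
    out = []
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - mx) ** 2 / sxx  # diagonal entry of the hat matrix
        out.append((y - (a + b * x)) / (1.0 - leverage))
    return out


# Usage on simulated data (the data-generating model is an assumption for the demo).
rng = random.Random(2)
demo_x = [rng.gauss(0, 1) for _ in range(50)]
demo_y = [1 - 2 * x + rng.gauss(0, 1) for x in demo_x]
gap = max(abs(u - v) for u, v in zip(loo_residuals_refit(demo_x, demo_y),
                                     loo_residuals_shortcut(demo_x, demo_y)))
print(gap)
```

Empirical quantiles of these residuals, added to the full-sample point forecast, give one simple form of a leave-one-out prediction interval.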

preprint, 2016, arXiv

The relative effects of dimensionality and multiplicity of hypotheses on the F-test in linear regression

Recently, several authors have re-examined the power of the classical F-test in linear regression in a 'large-p, large-n' framework (cf. Zhong and Chen, 2011; Wang and Cui, 2013). They highlight the loss of power as the number of regressors p increases relative to sample size n. These papers essentially focus only on the overall test of the null hypothesis that all p slope coefficients are equal to zero. Here, we consider the general case of testing q linear hypotheses on the (p+1)-dimensional regression parameter vector that includes p slope coefficients and an intercept parameter. In the case of Gaussian design, we describe the dependence of the local asymptotic power function on both the relative number of parameters p and the number of hypotheses q being tested, showing that the negative effect of dimensionality is less severe if the number of hypotheses is small. Using the recent work of Srivastava and Vershynin (2013) on high dimensional sample covariance matrices, we are also able to substantially generalize previous results for non-Gaussian regressors.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint