Source author record

Paul Kabaila

Paul Kabaila appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

11 works
5 topics
4 close collaborators

Published work

11 published item(s)

Preprint, 2020, arXiv

Confidence intervals in general regression models that utilize uncertain prior information

We consider a general regression model, without a scale parameter. Our aim is to construct a confidence interval for a scalar parameter of interest $θ$ that utilizes the uncertain prior information that a distinct scalar parameter $τ$ takes the specified value $t$. This confidence interval should have good coverage properties. It should also have scaled expected length, where the scaling is with respect to the usual confidence interval, that (a) is substantially less than 1 when the prior information is correct, (b) has a maximum value that is not too large and (c) is close to 1 when the data and prior information are highly discordant. The asymptotic joint distribution of the maximum likelihood estimators of $θ$ and $τ$ is similar to the joint distribution of these estimators in the particular case of a linear regression with normally distributed errors having known variance. This similarity is used to construct a confidence interval with the desired properties by using the confidence interval, computed using the R package ciuupi, that utilizes the uncertain prior information in this particular linear regression case. An important practical application of this confidence interval is to a quantal bioassay carried out to compare two similar compounds. In this context, the uncertain prior information is that the hypothesis of "parallelism" holds. We provide extensive numerical results that illustrate the properties of this confidence interval in this context.

Preprint, 2015, arXiv

The impact of a Hausman pretest on the coverage probability and expected length of confidence intervals

In the analysis of clustered and longitudinal data, which includes a covariate that varies both between and within clusters (e.g. time-varying covariate in longitudinal data), a Hausman pretest is commonly used to decide whether subsequent inference is made using the linear random intercept model or the fixed effects model. We assess the effect of this pretest on the coverage probability and expected length of a confidence interval for the slope parameter. Our results show that for the small levels of significance of the Hausman pretest commonly used in applications, the minimum coverage probability of this confidence interval can be far below nominal. Furthermore, the expected length of this confidence interval is, on average, larger than the expected length of a confidence interval for the slope parameter based on the fixed effects model with the same minimum coverage.

Preprint, 2014, arXiv

A new method of randomization of lattice rules for multiple integration

Cranley and Patterson put forward the following randomization as the basis for the estimation of the error of a lattice rule for an integral of a one-periodic function over the unit cube in s dimensions. The lattice rule is randomized using independent random shifts in each coordinate direction that are uniformly distributed in the interval [0,1]. This randomized lattice rule results in an unbiased estimator of the multiple integral. However, in practice, random variables that are independent and uniformly distributed on [0,1] are not available, since this would require an infinite number of random independent bits. A more realistic practical implementation of the Cranley and Patterson randomization uses rs independent random bits, in the following way. The lattice rule is randomized using independent random shifts in each coordinate direction that are uniformly distributed on {0, 1/2^r, ..., (2^r-1)/2^r}, where r may be large. For a rank-1 lattice rule with 2^m quadrature points and r >= m, we show that this randomized lattice rule leads to an estimator of the multiple integral that typically has a large bias. We therefore propose that these rs independent random bits be used to perform a new randomization that employs an extension, in the number of quadrature points, to a lattice rule with 2^(m+sr) quadrature points (leading to embedded lattice rules). This new randomization is shown to lead to an estimator of the multiple integral that has much smaller bias.
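The finite-bit version of the Cranley and Patterson shift described in this abstract can be sketched in a few lines of Python. This is only an illustration of the shifted rank-1 rule with shifts drawn from {0, 1/2^r, ..., (2^r-1)/2^r}; it is not the new randomization proposed in the paper. The function names, the generating vector z = [1, 5] and the test integrand are hypothetical choices made for the example.

```python
import math
import random


def discrete_shift(s, r, rng):
    """Draw s independent shifts, each uniform on {0, 1/2^r, ..., (2^r - 1)/2^r}.

    Each coordinate consumes exactly r random bits, so r*s bits in total.
    """
    return [rng.getrandbits(r) / 2**r for _ in range(s)]


def shifted_lattice_estimate(f, z, m, shift):
    """Average f over the 2^m points of a rank-1 lattice rule, shifted mod 1.

    z is the (assumed) integer generating vector; point j has coordinates
    frac(j * z_i / 2^m + shift_i).
    """
    n = 2**m
    total = 0.0
    for j in range(n):
        point = [(((j * zi) % n) / n + di) % 1.0 for zi, di in zip(z, shift)]
        total += f(point)
    return total / n


# Hypothetical usage: a one-periodic integrand on the unit square whose
# exact integral is 1.
rng = random.Random(0)
z = [1, 5]  # illustrative generating vector, not taken from the paper
shift = discrete_shift(len(z), r=8, rng=rng)
estimate = shifted_lattice_estimate(
    lambda x: 1.0 + math.cos(2 * math.pi * x[0]), z, m=4, shift=shift
)
print(estimate)
```

Repeating the estimate over many independent shifts and averaging gives the randomized estimator whose bias the abstract discusses; with a grid-restricted shift that average is no longer guaranteed to be unbiased.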

Preprint, 2013, arXiv

On randomized confidence intervals for the binomial probability

Suppose that X_1,X_2,...,X_n are independent and identically Bernoulli(theta) distributed. Also suppose that our aim is to find an exact confidence interval for theta that is the intersection of a 1-α/2 upper confidence interval and a 1-α/2 lower confidence interval. The Clopper-Pearson interval is the standard such confidence interval for theta, which is widely used in practice. We consider the randomized confidence interval of Stevens, 1950 and present some extensions, including pseudorandomized confidence intervals. We also consider the "data-randomized" confidence interval of Korn, 1987 and point out some additional attractive features of this interval. We also contribute to the discussion about the practical use of such confidence intervals.
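For reference, the Clopper-Pearson interval mentioned above can be computed as the intersection of two one-sided exact limits, each at level 1-α/2. The sketch below uses only the Python standard library and solves for each limit by bisection on the binomial tail probability. The function names are illustrative, and the randomized intervals of Stevens and Korn are not implemented here.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def _solve_increasing(g, target, tol=1e-12):
    """Find p in [0, 1] with g(p) = target, for g increasing in p (bisection)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided interval for theta given x successes in n trials."""
    # Lower limit: the theta at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2
    )
    # Upper limit: the theta at which P(X <= x) equals alpha/2,
    # i.e. P(X > x) equals 1 - alpha/2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2
    )
    return lower, upper


print(clopper_pearson(3, 20))
```

Because X is discrete, each one-sided limit has coverage at least 1-α/2 rather than exactly 1-α/2, which is the conservatism that randomized intervals are designed to remove.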

Preprint, 2012, arXiv

Simultaneous confidence intervals for the population cell means, for two-by-two factorial data, that utilize uncertain prior information

Consider a two-by-two factorial experiment with more than 1 replicate. Suppose that we have uncertain prior information that the two-factor interaction is zero. We describe new simultaneous frequentist confidence intervals for the 4 population cell means, with simultaneous confidence coefficient 1-alpha, that utilize this prior information in the following sense. These simultaneous confidence intervals define a cube with expected volume that (a) is relatively small when the two-factor interaction is zero and (b) has maximum value that is not too large. Also, these intervals coincide with the standard simultaneous confidence intervals obtained by Tukey's method, with simultaneous confidence coefficient 1-alpha, when the data strongly contradict the prior information that the two-factor interaction is zero. We illustrate the application of these new simultaneous confidence intervals to a real data set.

Preprint, 2010, arXiv

Variable-width confidence intervals in Gaussian regression and penalized maximum likelihood estimators

Hard thresholding, LASSO, adaptive LASSO and SCAD point estimators have been suggested for use in the linear regression context when most of the components of the regression parameter vector are believed to be zero, a sparsity type of assumption. Pötscher and Schneider, 2010, Electronic Journal of Statistics, have considered the properties of fixed-width confidence intervals that include one of these point estimators (for all possible data values). They consider a normal linear regression model with orthogonal regressors and show that these confidence intervals are longer than the standard confidence interval (based on the maximum likelihood estimator) when the tuning parameter for these point estimators is chosen to lead to either conservative or consistent model selection. We extend this analysis to the case of variable-width confidence intervals that include one of these point estimators (for all possible data values). In consonance with these findings of Pötscher and Schneider, we find that these confidence intervals perform poorly by comparison with the standard confidence interval, when the tuning parameter for these point estimators is chosen to lead to consistent model selection. However, when the tuning parameter for these point estimators is chosen to lead to conservative model selection, our conclusions differ from those of Pötscher and Schneider. We consider the variable-width confidence intervals of Farchione and Kabaila, 2008, Statistics & Probability Letters, which have advantages over the standard confidence interval in the context that there is a belief in a sparsity type of assumption. These variable-width confidence intervals are shown to include the hard thresholding, LASSO, adaptive LASSO and SCAD estimators (for all possible data values) provided that the tuning parameters for these estimators are chosen to belong to an appropriate interval.

Preprint, 2009, arXiv

The Asymptotic Efficiency of Improved Prediction Intervals

Barndorff-Nielsen and Cox (1994, p.319) modify an estimative prediction limit to obtain an improved prediction limit with better coverage properties. Kabaila and Syuhada (2008) present a simulation-based approximation to this improved prediction limit, which avoids the extensive algebraic manipulations required for this modification. We present a modification of an estimative prediction interval, analogous to the Barndorff-Nielsen and Cox modification, to obtain an improved prediction interval with better coverage properties. We also present an analogue, for the prediction interval context, of this simulation-based approximation. The parameter estimator on which the estimative and improved prediction limits and intervals are based is assumed to have the same asymptotic distribution as the (conditional) maximum likelihood estimator. The improved prediction limit and interval depend on the asymptotic conditional bias of this estimator. This bias can be very sensitive to very small changes in the estimator. It may require considerable effort to find this bias. We show, however, that the improved prediction limit and interval have asymptotic efficiencies that are functionally independent of this bias. Thus, improved prediction limits and intervals obtained using the Barndorff-Nielsen and Cox type of methodology can conveniently be based on the (conditional) maximum likelihood estimator, whose asymptotic conditional bias is given by the formula of Vidoni (2004, p.144). Also, improved prediction limits and intervals obtained using Kabaila and Syuhada type approximations have asymptotic efficiencies that are independent of the estimator on which these intervals are based.

Preprint, 2007, arXiv

Upper bounds on the minimum coverage probability of confidence intervals in regression after variable selection

We consider a linear regression model, with the parameter of interest a specified linear combination of the regression parameter vector. We suppose that, as a first step, a data-based model selection (e.g. by preliminary hypothesis tests or minimizing AIC) is used to select a model. It is common statistical practice to then construct a confidence interval for the parameter of interest based on the assumption that the selected model had been given to us a priori. This assumption is false and it can lead to a confidence interval with poor coverage properties. We provide an easily-computed finite sample upper bound (calculated by repeated numerical evaluation of a double integral) to the minimum coverage probability of this confidence interval. This bound applies for model selection by any of the following methods: minimum AIC, minimum BIC, maximum adjusted R-squared, minimum Mallows' Cp and t-tests. The importance of this upper bound is that it delineates general categories of design matrices and model selection procedures for which this confidence interval has poor coverage properties. This upper bound is shown to be a finite sample analogue of an earlier large sample upper bound due to Kabaila and Leeb.
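The coverage problem described in this abstract can be seen in a small Monte Carlo sketch. The setup below is a deliberately simplified assumption and not the paper's method or its upper bound: two estimators with known unit variances and correlation rho, a preliminary z-test of τ = 0, and a naive nominal 95% interval for θ computed as if the selected model had been fixed in advance. All names and parameter values are illustrative.

```python
import math
import random


def naive_coverage(tau, rho, reps=20000, seed=0):
    """Monte Carlo coverage of the naive 95% interval for theta after a pretest.

    Assumed toy model: (theta_hat, tau_hat) is bivariate normal with means
    (theta, tau), unit variances and correlation rho.
    """
    rng = random.Random(seed)
    z = 1.959963984540054  # 0.975 quantile of the standard normal
    theta = 0.0  # true value; coverage does not depend on it in this model
    root = math.sqrt(1 - rho**2)
    covered = 0
    for _ in range(reps):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        tau_hat = tau + e1
        theta_hat = theta + rho * e1 + root * e2
        if abs(tau_hat) < z:
            # Pretest accepts tau = 0: use the restricted estimator, which has
            # standard deviation sqrt(1 - rho^2) but bias -rho * tau.
            centre = theta_hat - rho * tau_hat
            half_width = z * root
        else:
            # Pretest rejects tau = 0: use the full-model estimator.
            centre = theta_hat
            half_width = z
        covered += abs(centre - theta) <= half_width
    return covered / reps


# Hypothetical settings: strong correlation, tau near the pretest boundary.
print(naive_coverage(tau=2.0, rho=0.9))
print(naive_coverage(tau=0.0, rho=0.0))
```

Under these assumed settings the first call should report coverage far below the nominal 0.95, while the second, where the pretest has no effect on the interval, should be close to nominal.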