Source author record

Steven N. MacEachern

Steven N. MacEachern appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Methodology, Applications, Machine Learning, math.ST, stat.OT, Statistics Theory

Catalog footprint

What is connected

8 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

Rediscovering a little known fact about the t-test and the F-test: Algebraic, Geometric, Distributional and Graphical Considerations

We discuss the role that the null hypothesis should play in the construction of a test statistic used to make a decision about that hypothesis. To construct the test statistic for a point null hypothesis about a binomial proportion, a common recommendation is to act as if the null hypothesis is true. We argue that, on the surface, the one-sample t-test of a point null hypothesis about a Gaussian population mean does not appear to follow the recommendation. We show how simple algebraic manipulations of the usual t-statistic lead to an equivalent test procedure consistent with the recommendation. We provide geometric intuition regarding this equivalence and we consider extensions to testing nested hypotheses in Gaussian linear models. We discuss an application to graphical residual diagnostics where the form of the test statistic makes a practical difference. By examining the formulation of the test statistic from multiple perspectives in this familiar example, we provide simple, concrete illustrations of some important issues that can guide the formulation of effective solutions to more complex statistical problems.
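The equivalence described above can be sketched numerically. The abstract does not give formulas, so the null-based statistic below (named `t0` here, a hypothetical name) is an illustrative reconstruction, not necessarily the paper's notation. It swaps the usual variance estimate, centered at the sample mean, for one centered at the hypothesized mean `mu0`. Because sum (x_i - mu0)^2 = sum (x_i - xbar)^2 + n (xbar - mu0)^2, the two statistics satisfy t0^2 = n t^2 / (n - 1 + t^2), which is increasing in |t|, so rejecting for large |t| and rejecting for large |t0| describe the same test.

```python
import math


def t_statistics(xs, mu0):
    """Return (t, t0): the usual one-sample t-statistic and an illustrative
    variant whose variance estimate is centered at the null value mu0."""
    n = len(xs)
    xbar = sum(xs) / n
    # Usual estimate: deviations from the sample mean, divisor n - 1.
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    # Null-based estimate: deviations from mu0, divisor n (acts as if H0 holds).
    s2_null = sum((x - mu0) ** 2 for x in xs) / n
    t = math.sqrt(n) * (xbar - mu0) / math.sqrt(s2)
    t0 = math.sqrt(n) * (xbar - mu0) / math.sqrt(s2_null)
    return t, t0


def t0_from_t(t, n):
    """Map the usual t to the null-based statistic: t0^2 = n t^2 / (n - 1 + t^2)."""
    return math.copysign(math.sqrt(n * t * t / (n - 1 + t * t)), t)


sample = [4.1, 5.3, 6.0, 4.8, 5.9, 6.4]  # made-up data, for illustration only
t, t0 = t_statistics(sample, mu0=5.0)
```

Since the map from t to t0 is one-to-one and monotone, the two forms differ in how they are displayed and interpreted, not in the decisions they produce.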

Preprint, 2020, arXiv

The Dependent Dirichlet Process and Related Models

Standard regression approaches assume that some finite number of the response distribution characteristics, such as location and scale, change as a (parametric or nonparametric) function of predictors. However, it is not always appropriate to assume a location/scale representation, where the error distribution has unchanging shape over the predictor space. In fact, it often happens in applied research that the distribution of responses under study changes with predictors in ways that cannot be reasonably represented by a finite dimensional functional form. This can seriously affect the answers to the scientific questions of interest, and therefore more general approaches are indeed needed. This gives rise to the study of fully nonparametric regression models. We review some of the main Bayesian approaches that have been employed to define probability models where the complete response distribution may vary flexibly with predictors. We focus on developments based on modifications of the Dirichlet process, historically termed dependent Dirichlet processes, and some of the extensions that have been proposed to tackle this general problem using nonparametric approaches.

Preprint, 2018, arXiv

Bayesian Restricted Likelihood Methods: Conditioning on Insufficient Statistics in Bayesian Regression

Bayesian methods have proven themselves to be successful across a wide range of scientific problems and have many well-documented advantages over competing methods. However, these methods run into difficulties for two major and prevalent classes of problems: handling data sets with outliers and dealing with model misspecification. We outline the drawbacks of previous solutions to both of these problems and propose a new method as an alternative. When working with the new method, the data is summarized through a set of insufficient statistics, targeting inferential quantities of interest, and the prior distribution is updated with the summary statistics rather than the complete data. By careful choice of conditioning statistics, we retain the main benefits of Bayesian methods while reducing the sensitivity of the analysis to features of the data not captured by the conditioning statistics. For reducing sensitivity to outliers, classical robust estimators (e.g., M-estimators) are natural choices for conditioning statistics. A major contribution of this work is the development of a data augmented Markov chain Monte Carlo (MCMC) algorithm for the linear model and a large class of summary statistics. We demonstrate the method on simulated and real data sets containing outliers and subject to model misspecification. Success is manifested in better predictive performance for data points of interest as compared to competing methods.

Preprint, 2016, arXiv

Bandwidth Selection for Kernel Density Estimation with a Markov Chain Monte Carlo Sample

Markov chain Monte Carlo samplers produce dependent streams of variates drawn from the limiting distribution of the Markov chain. With this as motivation, we introduce novel univariate kernel density estimators which are appropriate for the stationary sequences of dependent variates. We modify the asymptotic mean integrated squared error criterion to account for dependence and find that the modified criterion suggests data-driven adjustments to standard bandwidth selection methods. Simulation studies show that our proposed methods find bandwidths close to the optimal value while standard methods lead to smaller bandwidths and hence to undersmoothed density estimates. Empirically, the proposed methods have considerably smaller integrated mean squared error than do standard methods.

Preprint, 2015, arXiv

Block Hyper-g Priors in Bayesian Regression

The development of prior distributions for Bayesian regression has traditionally been driven by the goal of achieving sensible model selection and parameter estimation. The formalization of properties that characterize good performance has led to the development and popularization of thick tailed mixtures of g priors such as the Zellner--Siow and hyper-g priors. The properties of a particular prior are typically illuminated under limits on the likelihood or the prior. In this paper we introduce a new, conditional information asymptotic that is motivated by the common data analysis setting where at least one regression coefficient is much larger than others. We analyze existing mixtures of g priors under this limit and reveal two new behaviors, Essentially Least Squares (ELS) estimation and the Conditional Lindley's Paradox (CLP), and argue that these behaviors are, in general, undesirable. As the driver behind both of these behaviors is the use of a single, latent scale parameter that is common to all coefficients, we propose a block hyper-g prior, defined by first partitioning the covariates into groups and then placing independent hyper-g priors on the corresponding blocks of coefficients. We provide conditions under which ELS and the CLP are avoided by the new class of priors, and provide consistency results under traditional sample size asymptotics.

Preprint, 2012, arXiv

Regularization of Case-Specific Parameters for Robustness and Efficiency

Regularization methods allow one to handle a variety of inferential problems where there are more covariates than cases. This allows one to consider a potentially enormous number of covariates for a problem. We exploit the power of these techniques, supersaturating models by augmenting the "natural" covariates in the problem with an additional indicator for each case in the data set. We attach a penalty term for these case-specific indicators which is designed to produce a desired effect. For regression methods with squared error loss, an $\ell_1$ penalty produces a regression which is robust to outliers and high leverage cases; for quantile regression methods, an $\ell_2$ penalty decreases the variance of the fit enough to overcome an increase in bias. The paradigm thus allows us to robustify procedures which lack robustness and to increase the efficiency of procedures which are robust. We provide a general framework for the inclusion of case-specific parameters in regularization problems, describing the impact on the effective loss for a variety of regression and classification problems. We outline a computational strategy by which existing software can be modified to solve the augmented regularization problem, providing conditions under which such modification will converge to the optimum solution. We illustrate the benefits of including case-specific parameters in the context of mean regression and quantile regression through analysis of NHANES and linguistic data sets.
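A small sketch may help show why an $\ell_1$ penalty on a case-specific parameter robustifies squared error loss. It relies on a standard fact rather than on details from the paper: minimizing 0.5 * (r - gamma)^2 + lam * |gamma| over the case-specific shift gamma gives a soft-thresholded gamma, and the minimized value is the Huber loss of the residual r with threshold lam. The function names and the 0.5 scaling are assumptions made for illustration; the paper's own formulation may be scaled differently.

```python
def soft_threshold(r, lam):
    """Minimizer over gamma of 0.5 * (r - gamma)**2 + lam * abs(gamma)."""
    if r > lam:
        return r - lam
    if r < -lam:
        return r + lam
    return 0.0


def profiled_loss(r, lam):
    """Squared error loss after optimizing out the case-specific shift gamma."""
    gamma = soft_threshold(r, lam)
    return 0.5 * (r - gamma) ** 2 + lam * abs(gamma)


def huber(r, lam):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(r)
    return 0.5 * a * a if a <= lam else lam * a - 0.5 * lam * lam


# The effective loss grows linearly, not quadratically, for large residuals.
residuals = [-5.0, -1.0, -0.3, 0.0, 0.7, 1.0, 3.0]
effective = [profiled_loss(r, lam=1.0) for r in residuals]
```

Cases with small residuals keep gamma at zero and are fit as usual, while a case with a large residual absorbs the excess into its own gamma, which limits its influence on the fit.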

Preprint, 2012, arXiv

Restricting exchangeable nonparametric distributions

Distributions over exchangeable matrices with infinitely many columns, such as the Indian buffet process, are useful in constructing nonparametric latent variable models. However, the distribution implied by such models over the number of features exhibited by each data point may be poorly suited for many modeling tasks. In this paper, we propose a class of exchangeable nonparametric priors obtained by restricting the domain of existing models. Such models allow us to specify the distribution over the number of features per data point, and can achieve better performance on data sets where the number of features is not well-modeled by the original distribution.

Preprint, 2011, arXiv

Bayesian Synthesis: Combining subjective analyses, with an application to ozone data

Bayesian model averaging enables one to combine the disparate predictions of a number of models in a coherent fashion, leading to superior predictive performance. The improvement in performance arises from averaging models that make different predictions. In this work, we tap into perhaps the biggest driver of different predictions---different analysts---in order to gain the full benefits of model averaging. In a standard implementation of our method, several data analysts work independently on portions of a data set, eliciting separate models which are eventually updated and combined through a specific weighting method. We call this modeling procedure Bayesian Synthesis. The methodology helps to alleviate concerns about the sizable gap between the foundational underpinnings of the Bayesian paradigm and the practice of Bayesian statistics. In experimental work we show that human modeling has predictive performance superior to that of many automatic modeling techniques, including AIC, BIC, Smoothing Splines, CART, Bagged CART, Bayes CART, BMA and LARS, and only slightly inferior to that of BART. We also show that Bayesian Synthesis further improves predictive performance. Additionally, we examine the predictive performance of a simple average across analysts, which we dub Convex Synthesis, and find that it also produces an improvement.
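The simple average across analysts, which the abstract calls Convex Synthesis, can be sketched as follows. The weighting used by Bayesian Synthesis itself is not described in the abstract, so it is not reproduced here. The numbers and function names are made up for illustration. The sketch shows one general reason averaging can help: by convexity of squared error, the mean squared error of the averaged prediction never exceeds the average of the analysts' individual mean squared errors.

```python
def convex_synthesis(predictions):
    """Average predictions pointwise across analysts, with equal weights."""
    k = len(predictions)
    return [sum(column) / k for column in zip(*predictions)]


def mse(pred, truth):
    """Mean squared prediction error."""
    return sum((p - y) ** 2 for p, y in zip(pred, truth)) / len(truth)


# Hypothetical held-out responses and three analysts' predictions for them.
truth = [1.0, 2.0, 3.0, 4.0]
analysts = [
    [1.2, 1.7, 3.4, 4.1],
    [0.8, 2.4, 2.9, 3.6],
    [1.1, 2.2, 2.5, 4.5],
]

combined = convex_synthesis(analysts)
combined_mse = mse(combined, truth)
average_individual_mse = sum(mse(p, truth) for p in analysts) / len(analysts)
```

The guarantee is only relative to the average analyst; the combined prediction can still be worse than the single best analyst.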
