Source author record

P. Richard Hahn

P. Richard Hahn appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Methodology, Applications, Machine Learning, q-fin.ST, Artificial Intelligence, math.FA, math.PR, math.ST, Statistics Theory

Catalog footprint

What is connected

17 works

9 topics

4 close collaborators

Preprint, 2022, arXiv

Bayesian decision theory for tree-based adaptive screening tests with an application to youth delinquency

Crime prevention strategies based on early intervention depend on accurate risk assessment instruments for identifying high risk youth. It is important in this context that the instruments be convenient to administer, which means, in particular, that they should also be reasonably brief; adaptive screening tests are useful for this purpose. Adaptive tests constructed using classification and regression trees are becoming a popular alternative to traditional Item Response Theory (IRT) approaches for adaptive testing. However, tree-based adaptive tests lack a principled criterion for terminating the test. This paper develops a Bayesian decision theory framework for measuring the trade-off between brevity and accuracy, when considering tree-based adaptive screening tests of different lengths. We also present a novel method for designing tree-based adaptive tests, motivated by this framework. The framework and associated adaptive test method are demonstrated through an application to youth delinquency risk assessment in Honduras; it is shown that an adaptive test requiring a subject to answer fewer than 10 questions can identify high risk youth nearly as accurately as an unabridged survey containing 173 items.

Preprint, 2022, arXiv

Do forecasts of bankruptcy cause bankruptcy? A machine learning sensitivity analysis

It is widely speculated that auditors' public forecasts of bankruptcy are, at least in part, self-fulfilling prophecies in the sense that they might actually cause bankruptcies that would not have otherwise occurred. This conjecture is hard to prove, however, because the strong association between bankruptcies and bankruptcy forecasts could simply indicate that auditors are skillful forecasters with unique access to highly predictive covariates. In this paper, we investigate the causal effect of bankruptcy forecasts on bankruptcy using nonparametric sensitivity analysis. We contrast our analysis with two alternative approaches: a linear bivariate probit model with an endogenous regressor, and a recently developed bound on risk ratios called E-values. Additionally, our machine learning approach incorporates a monotonicity constraint corresponding to the assumption that bankruptcy forecasts do not make bankruptcies less likely. Finally, a tree-based posterior summary of the treatment effect estimates allows us to explore which observable firm characteristics moderate the inducement effect.
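For readers unfamiliar with the E-value comparison mentioned in this abstract, the sketch below computes the commonly cited E-value for an observed risk ratio (VanderWeele and Ding, 2017). It is not taken from the paper; the function name `e_value` is hypothetical, and the formula shown is the point-estimate version only.

```python
import math


def e_value(risk_ratio: float) -> float:
    """E-value for an observed risk ratio (point estimate only).

    Hypothetical helper for illustration; not code from the paper.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective ratios (below 1) are inverted first so the formula applies.
    rr = risk_ratio if risk_ratio >= 1.0 else 1.0 / risk_ratio
    # E = RR + sqrt(RR * (RR - 1)); equals 1 when RR == 1.
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    print(e_value(2.0))
```

The E-value is the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the observed ratio.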

Preprint, 2020, arXiv

A Survey of Learning Causality with Data: Problems and Methods

This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.

Preprint, 2020, arXiv

A Symmetric Prior for Multinomial Probit Models

Fitted probabilities from widely used Bayesian multinomial probit models can depend strongly on the choice of a base category, which is used to uniquely identify the parameters of the model. This paper proposes a novel identification strategy, and associated prior distribution for the model parameters, that renders the prior symmetric with respect to relabeling the outcome categories. The new prior permits an efficient Gibbs algorithm that samples rank-deficient covariance matrices without resorting to Metropolis-Hastings updates.

Preprint, 2020, arXiv

Estimating heterogeneous effects of continuous exposures using Bayesian tree ensembles: revisiting the impact of abortion rates on crime

In estimating the causal effect of a continuous exposure or treatment, it is important to control for all confounding factors. However, most existing methods require parametric specification for how control variables influence the outcome or generalized propensity score, and inference on treatment effects is usually sensitive to this choice. Additionally, it is often the goal to estimate how the treatment effect varies across observed units. To address this gap, we propose a semiparametric model using Bayesian tree ensembles for estimating the causal effect of a continuous treatment or exposure which (i) does not require a priori parametric specification of the influence of control variables, and (ii) allows for identification of effect modification by pre-specified moderators. The main parametric assumption we make is that the effect of the exposure on the outcome is linear, with the steepness of this relationship determined by a nonparametric function of the moderators, and we provide heuristics to diagnose the validity of this assumption. We apply our methods to revisit a 2001 study of how abortion rates affect incidence of crime.

Preprint, 2020, arXiv

Semi-supervised learning and the question of true versus estimated propensity scores

A straightforward application of semi-supervised machine learning to the problem of treatment effect estimation would be to consider data as "unlabeled" if treatment assignment and covariates are observed but outcomes are unobserved. According to this formulation, large unlabeled data sets could be used to estimate a high dimensional propensity function and causal inference using a much smaller labeled data set could proceed via weighted estimators using the learned propensity scores. In the limiting case of infinite unlabeled data, one may estimate the high dimensional propensity function exactly. However, longstanding advice in the causal inference community suggests that estimated propensity scores (from labeled data alone) are actually preferable to true propensity scores, implying that the unlabeled data is actually useless in this context. In this paper we examine this paradox and propose a simple procedure that reconciles the strong intuition that a known propensity function should be useful for estimating treatment effects with the previous literature suggesting otherwise. Further, simulation studies suggest that direct regression may be preferable to inverse-propensity weighted estimators in many circumstances.
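As a rough illustration of the weighted estimators this abstract refers to, the sketch below implements a basic inverse-propensity-weighted estimate of the average treatment effect. It is not the procedure proposed in the paper; the function name `ipw_ate` and the toy data are assumptions, and the propensity scores are simply passed in, whether known or estimated.

```python
def ipw_ate(treated, outcomes, propensities):
    """Inverse-propensity-weighted estimate of the average treatment effect.

    Hypothetical helper for illustration; not code from the paper.
    treated:      0/1 treatment indicators
    outcomes:     observed outcomes
    propensities: P(treated = 1 | covariates), each strictly between 0 and 1
    """
    n = len(outcomes)
    if n == 0 or len(treated) != n or len(propensities) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for z, y, e in zip(treated, outcomes, propensities):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Treated units are weighted by 1/e, control units by 1/(1 - e).
        total += z * y / e - (1 - z) * y / (1.0 - e)
    return total / n


if __name__ == "__main__":
    # Toy data with a constant propensity of 0.5 (a made-up example).
    print(ipw_ate([1, 1, 0, 0], [3.0, 5.0, 1.0, 2.0], [0.5] * 4))
```

With a constant propensity of 0.5, the estimate reduces to the difference in group means, which is a quick sanity check on the weighting.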

Preprint, 2016, arXiv

Regularization and confounding in linear regression for treatment effect estimation

This paper investigates the use of regularization priors in the context of treatment effect estimation using observational data where the number of control variables is large relative to the number of observations. First, the phenomenon of regularization-induced confounding is introduced, which refers to the tendency of regularization priors to adversely bias treatment effect estimates by over-shrinking control variable regression coefficients. Then, a simultaneous regression model is presented which permits regularization priors to be specified in a way that avoids this unintentional re-confounding. The new model is illustrated on synthetic and empirical data.

Preprint, 2016, arXiv

Sparse Mean-Variance Portfolios: A Penalized Utility Approach

This paper considers mean-variance optimization under uncertainty, specifically when one desires a sparsified set of optimal portfolio weights. From the standpoint of a Bayesian investor, our approach produces a small portfolio from many potential assets while acknowledging uncertainty in asset returns and parameter estimates. We demonstrate the procedure using static and dynamic models for asset returns.

Preprint, 2016, arXiv

Variable Selection in Seemingly Unrelated Regressions with Random Predictors

This paper considers linear model selection when the response is vector-valued and the predictors are randomly observed. We propose a new approach that decouples statistical inference from the selection step in a "post-inference model summarization" strategy. We study the impact of predictor uncertainty on the model selection procedure. The method is demonstrated through an application to asset pricing.

Preprint, 2015, arXiv

A Bayesian partial identification approach to inferring the prevalence of accounting misconduct

This paper describes the use of flexible Bayesian regression models for estimating a partially identified probability function. Our approach permits efficient sensitivity analysis concerning the posterior impact of priors on the partially identified component of the regression model. The new methodology is illustrated on an important problem where only partially observed data is available: inferring the prevalence of accounting misconduct among publicly traded U.S. businesses.

Preprint, 2015, arXiv

Model specification via sequential coherence and backward induction

This paper describes how to specify probability models for data analysis via a backward induction procedure. The new approach yields coherent, prior-free uncertainty assessment. After presenting some intuition-building examples, the new approach is applied to a kernel density estimator, which leads to a novel method for computing point-wise credible intervals in nonparametric density estimation. The new approach has two additional advantages: 1) the posterior mean density can be accurately approximated without resorting to Monte Carlo simulation and 2) concentration bounds are easily established as a function of sample size.

Preprint, 2015, arXiv

Optimal ETF Selection for Passive Investing

This paper considers the problem of isolating a small number of exchange traded funds (ETFs) that suffice to capture the fundamental dimensions of variation in U.S. financial markets. First, the data is fit to a vector-valued Bayesian regression model, which is a matrix-variate generalization of the well known stochastic search variable selection (SSVS) of George and McCulloch (1993). ETF selection is then performed using the decoupled shrinkage and selection (DSS) procedure described in Hahn and Carvalho (2015), adapted in two ways: to the vector-response setting and to incorporate stochastic covariates. The selected set of ETFs is obtained under a number of different penalty and modeling choices. Optimal portfolios are constructed from selected ETFs by maximizing the Sharpe ratio posterior mean, and they are compared to the (unknown) optimal portfolio based on the full Bayesian model. We compare our selection results to popular ETF advisor Wealthfront.com. Additionally, we consider selecting ETFs by modeling a large set of mutual funds.

Preprint, 2014, arXiv

A Bayesian hierarchical model for inferring player strategy types in a number guessing game

This paper presents an in-depth statistical analysis of an experiment designed to measure the extent to which players in a simple game behave according to a popular behavioral economic model. The p-beauty contest is a multi-player number guessing game that has been widely used to study strategic behavior. This paper describes beauty contest experiments for an audience of data analysts, with a special focus on a class of models for game play called k-step thinking models, which allow each player in the game to employ an idiosyncratic strategy. We fit a Bayesian statistical model to estimate the proportion of our player population whose game play is compatible with a k-step thinking model. Our findings put this number at approximately 25%.
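To make the k-step thinking idea concrete, the sketch below computes the guess of a level-k player under one common textbook formulation of the p-beauty contest. The details are assumptions rather than anything stated in the abstract: guesses lie in [0, 100], a level-0 player guesses 50 on average, each level-k player best-responds to a population of level-(k-1) players, and a player's own effect on the average is ignored. The function name `k_step_guess` and the default p = 2/3 are hypothetical.

```python
def k_step_guess(k: int, p: float = 2 / 3, level0_guess: float = 50.0) -> float:
    """Guess of a level-k player in a p-beauty contest.

    Hypothetical helper for illustration; not the model fitted in the paper.
    A level-k player assumes everyone else is level k-1 and targets
    p times their guess, giving level0_guess * p**k.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return level0_guess * p ** k


if __name__ == "__main__":
    for level in range(4):
        print(level, k_step_guess(level))
```

Under these assumptions the guesses shrink geometrically toward zero as k grows, which is why observed guesses can be used to infer how many steps of reasoning a player appears to take.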

Preprint, 2014, arXiv

A Structural Approach to Coordinate-Free Statistics

We consider the question of learning in general topological vector spaces. By exploiting known (or parametrized) covariance structures, our Main Theorem demonstrates that any continuous linear map corresponds to a certain isomorphism of embedded Hilbert spaces. By inverting this isomorphism and extending continuously, we construct a version of the Ordinary Least Squares estimator in absolute generality. Our Gauss-Markov theorem demonstrates that OLS is a "best linear unbiased estimator", extending the classical result. We construct a stochastic version of the OLS estimator, which is a continuous disintegration exactly for the class of "uncorrelated implies independent" (UII) measures. As a consequence, Gaussian measures always exhibit continuous disintegrations through continuous linear maps, extending a theorem of the first author. Applying this framework to some problems in machine learning, we prove a useful representation theorem for covariance tensors, and show that OLS defines a good kriging predictor for vector-valued arrays on general index spaces. We also construct a support-vector machine classifier in this setting. We hope that our article shines light on some deeper connections between probability theory, statistics and machine learning, and may serve as a point of intersection for these three communities.

Preprint, 2014, arXiv

Decoupling shrinkage and selection in Bayesian linear models: a posterior summary perspective

Selecting a subset of variables for linear models remains an active area of research. This paper reviews many of the recent contributions to the Bayesian model selection and shrinkage prior literature. A posterior variable selection summary is proposed, which distills a full posterior distribution over regression coefficients into a sequence of sparse linear predictors.

Preprint, 2014, arXiv

Shrinkage priors for linear instrumental variable models with many instruments

This paper addresses the weak instruments problem in linear instrumental variable models from a Bayesian perspective. The new approach has two components. First, a novel predictor-dependent shrinkage prior is developed for the many instruments setting. The prior is constructed based on a factor model decomposition of the matrix of observed instruments, allowing many instruments to be incorporated into the analysis in a robust way. Second, the new prior is implemented via an importance sampling scheme, which utilizes posterior Monte Carlo samples from a first-stage Bayesian regression analysis. This modular computation makes sensitivity analyses straightforward. Two simulation studies are provided to demonstrate the advantages of the new method. As an empirical illustration, the new method is used to estimate a key parameter in macro-economic models: the elasticity of inter-temporal substitution. The empirical analysis produces substantive conclusions in line with previous studies, but certain inconsistencies of earlier analyses are resolved.

Preprint, 2010, arXiv

Predictor-dependent shrinkage for linear regression via partial factor modeling

In prediction problems with more predictors than observations, it can sometimes be helpful to use a joint probability model, $π(Y,X)$, rather than a purely conditional model, $π(Y \mid X)$, where $Y$ is a scalar response variable and $X$ is a vector of predictors. This approach is motivated by the fact that in many situations the marginal predictor distribution $π(X)$ can provide useful information about the parameter values governing the conditional regression. However, under very mild misspecification, this marginal distribution can also lead conditional inferences astray. Here, we explore these ideas in the context of linear factor models, to understand how they play out in a familiar setting. The resulting Bayesian model performs well across a wide range of covariance structures, on real and simulated data.
