Source author record

Carlos M. Carvalho

Carlos M. Carvalho appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Methodology, Applications, q-fin.ST, Computation

Catalog footprint

What is connected

11 works

4 topics

4 close collaborators

preprint · 2022 · arXiv

Bayesian inference for treatment effects under nested subsets of controls

When constructing a model to estimate the causal effect of a treatment, it is necessary to control for other factors which may have confounding effects. Because the ignorability assumption is not testable, however, it is usually unclear which minimal set of controls is appropriate -- as is their appropriate functional form in the model -- and effect estimation can be sensitive to these choices. A common approach in this case is to fit several models, each with a different control specification (under the assumption that the available controls are sufficient but possibly not all necessary to deconfound the treatment effect), but it is difficult to reconcile inference for the treatment effect under the multiple resulting posterior distributions. Therefore we propose a two-stage approach to measure the sensitivity of effect estimation with respect to control specification. In the first stage, a model is fit with all available controls using a prior carefully selected to adjust for confounding. In the second stage, posterior distributions are calculated for the treatment effect under submodels of nested sets of controls using projected posteriors under the full model, providing valid Bayesian inference. We demonstrate how our approach can be used to detect influential confounders in a dataset, and apply it in a sensitivity analysis of an observational study measuring the effect of legalized abortion on crime rates.

preprint · 2020 · arXiv

Estimating heterogeneous effects of continuous exposures using Bayesian tree ensembles: revisiting the impact of abortion rates on crime

In estimating the causal effect of a continuous exposure or treatment, it is important to control for all confounding factors. However, most existing methods require parametric specification for how control variables influence the outcome or generalized propensity score, and inference on treatment effects is usually sensitive to this choice. Additionally, it is often the goal to estimate how the treatment effect varies across observed units. To address this gap, we propose a semiparametric model using Bayesian tree ensembles for estimating the causal effect of a continuous treatment or exposure which (i) does not require a priori parametric specification of the influence of control variables, and (ii) allows for identification of effect modification by pre-specified moderators. The main parametric assumption we make is that the effect of the exposure on the outcome is linear, with the steepness of this relationship determined by a nonparametric function of the moderators, and we provide heuristics to diagnose the validity of this assumption. We apply our methods to revisit a 2001 study of how abortion rates affect incidence of crime.

preprint · 2020 · arXiv

Model interpretation through lower-dimensional posterior summarization

Nonparametric regression models have recently surged in their power and popularity, accompanying the trend of increasing dataset size and complexity. While these models have proven their predictive ability in empirical settings, they are often difficult to interpret and do not address the underlying inferential goals of the analyst or decision maker. In this paper, we propose a modular two-stage approach for creating parsimonious, interpretable summaries of complex models which allow freedom in the choice of modeling technique and the inferential target. In the first stage, a flexible model is fit which is believed to be as accurate as possible. In the second stage, lower-dimensional summaries are constructed by projecting draws from the distribution onto simpler structures. These summaries naturally come with valid Bayesian uncertainty estimates. Further, since we use the data only once to move from prior to posterior, these uncertainty estimates remain valid across multiple summaries and after iteratively refining a summary. We apply our method and demonstrate its strengths across a range of simulated and real datasets. Code to reproduce the examples shown is available at github.com/spencerwoody/ghost.
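
The second stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the one-covariate linear summary, and the simulated stand-in "posterior draws" are all assumptions made for illustration. Each draw of the fitted function is projected onto a straight line by least squares, and the spread of the projected slopes gives an uncertainty statement for the summary.

```python
import random

def project_onto_line(x, f):
    """Least-squares intercept and slope for fitted values f at inputs x."""
    n = len(x)
    x_bar = sum(x) / n
    f_bar = sum(f) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxf = sum((xi - x_bar) * (fi - f_bar) for xi, fi in zip(x, f))
    slope = sxf / sxx
    return f_bar - slope * x_bar, slope

def summarize_draws(x, draws):
    """Project every posterior draw of the fitted function onto a line."""
    return [project_onto_line(x, f) for f in draws]

# Stand-in for posterior draws of a flexible model's fitted values.
# ASSUMPTION: in practice these would come from the first-stage model.
rng = random.Random(0)
x = [i / 10 for i in range(11)]
draws = []
for _ in range(200):
    a = rng.gauss(1.0, 0.1)
    b = rng.gauss(2.0, 0.2)
    draws.append([a + b * xi + 0.3 * xi ** 2 for xi in x])

summaries = summarize_draws(x, draws)
slopes = sorted(slope for _, slope in summaries)
# Central 95% interval for the projected slope, taken from the sorted draws.
interval = (slopes[int(0.025 * len(slopes))], slopes[int(0.975 * len(slopes)) - 1])
```

Because the projection is applied draw by draw, the same set of draws could be reused for a different summary without refitting the first-stage model.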

preprint · 2020 · arXiv

Targeted Smooth Bayesian Causal Forests: An analysis of heterogeneous treatment effects for simultaneous versus interval medical abortion regimens over gestation

We introduce Targeted Smooth Bayesian Causal Forests (tsBCF), a nonparametric Bayesian approach for estimating heterogeneous treatment effects which vary smoothly over a single covariate in the observational data setting. The tsBCF method induces smoothness by parameterizing terminal tree nodes with smooth functions, and allows for separate regularization of treatment effects versus prognostic effect of control covariates. Smoothing parameters for prognostic and treatment effects can be chosen to reflect prior knowledge or tuned in a data-dependent way. We use tsBCF to analyze a new clinical protocol for early medical abortion. Our aim is to assess relative effectiveness of simultaneous versus interval administration of mifepristone and misoprostol over the first nine weeks of gestation. The model reflects our expectation that the relative effectiveness varies smoothly over gestation, but not necessarily over other covariates. We demonstrate the performance of the tsBCF method on benchmarking experiments. Software for tsBCF is available at https://github.com/jestarling/tsbcf/.

preprint · 2016 · arXiv

Regularization and confounding in linear regression for treatment effect estimation

This paper investigates the use of regularization priors in the context of treatment effect estimation using observational data where the number of control variables is large relative to the number of observations. First, the phenomenon of regularization-induced confounding is introduced, which refers to the tendency of regularization priors to adversely bias treatment effect estimates by over-shrinking control variable regression coefficients. Then, a simultaneous regression model is presented which permits regularization priors to be specified in a way that avoids this unintentional re-confounding. The new model is illustrated on synthetic and empirical data.
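
One way to write down the kind of two-equation setup this abstract refers to is sketched below. The notation is an assumption made here for illustration (treatment $Z$, controls $X$, outcome $Y$, treatment effect $\alpha$); the paper's own notation and priors may differ.

```latex
\begin{aligned}
Z &= X\gamma + \varepsilon, \\
Y &= \alpha Z + X\beta + \nu \\
  &= \alpha\,(Z - X\gamma) + X(\beta + \alpha\gamma) + \nu .
\end{aligned}
```

In the second line, shrinking $\beta$ toward zero while $Z$ is correlated with $X$ can push part of the controls' effect onto $\alpha$. The third line is an algebraically equivalent form in which the treatment enters only through its residual $Z - X\gamma$, which suggests how a prior could regularize the control coefficients without re-confounding $\alpha$.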

preprint · 2016 · arXiv

Sparse Mean-Variance Portfolios: A Penalized Utility Approach

This paper considers mean-variance optimization under uncertainty, specifically when one desires a sparsified set of optimal portfolio weights. From the standpoint of a Bayesian investor, our approach produces a small portfolio from many potential assets while acknowledging uncertainty in asset returns and parameter estimates. We demonstrate the procedure using static and dynamic models for asset returns.

preprint · 2015 · arXiv

Optimal ETF Selection for Passive Investing

This paper considers the problem of isolating a small number of exchange traded funds (ETFs) that suffice to capture the fundamental dimensions of variation in U.S. financial markets. First, the data is fit to a vector-valued Bayesian regression model, which is a matrix-variate generalization of the well known stochastic search variable selection (SSVS) of George and McCulloch (1993). ETF selection is then performed using the decoupled shrinkage and selection (DSS) procedure described in Hahn and Carvalho (2015), adapted in two ways: to the vector-response setting and to incorporate stochastic covariates. The selected set of ETFs is obtained under a number of different penalty and modeling choices. Optimal portfolios are constructed from selected ETFs by maximizing the Sharpe ratio posterior mean, and they are compared to the (unknown) optimal portfolio based on the full Bayesian model. We compare our selection results to popular ETF advisor Wealthfront.com. Additionally, we consider selecting ETFs by modeling a large set of mutual funds.
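
The criterion mentioned in this abstract, the posterior mean of the Sharpe ratio, can be sketched as follows. This is an illustrative assumption-laden example, not the paper's procedure: the function names are hypothetical, the two "posterior draws" of mean and covariance are made up, returns are assumed to be excess returns, and the search is over a fixed list of candidate weight vectors rather than a full optimization.

```python
import math

def sharpe_ratio(weights, mean, cov):
    """Portfolio Sharpe ratio for one draw of (mean, covariance).

    ASSUMPTION: `mean` holds excess returns, so no risk-free rate is subtracted.
    """
    port_mean = sum(w * m for w, m in zip(weights, mean))
    port_var = sum(
        wi * cov[i][j] * wj
        for i, wi in enumerate(weights)
        for j, wj in enumerate(weights)
    )
    return port_mean / math.sqrt(port_var)

def posterior_mean_sharpe(weights, draws):
    """Average the Sharpe ratio over posterior draws of (mean, covariance)."""
    return sum(sharpe_ratio(weights, m, c) for m, c in draws) / len(draws)

def best_candidate(candidates, draws):
    """Pick the candidate weight vector with the largest posterior mean Sharpe ratio."""
    return max(candidates, key=lambda w: posterior_mean_sharpe(w, draws))

# Made-up posterior draws for two assets (illustration only).
draws = [
    ([0.10, 0.20], [[0.04, 0.00], [0.00, 0.04]]),
    ([0.12, 0.18], [[0.05, 0.01], [0.01, 0.03]]),
]
candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
best = best_candidate(candidates, draws)
```

Averaging the ratio over draws, rather than plugging in posterior mean parameters, keeps parameter uncertainty in the comparison between candidate portfolios.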

preprint · 2014 · arXiv

Decoupling shrinkage and selection in Bayesian linear models: a posterior summary perspective

Selecting a subset of variables for linear models remains an active area of research. This paper reviews many of the recent contributions to the Bayesian model selection and shrinkage prior literature. A posterior variable selection summary is proposed, which distills a full posterior distribution over regression coefficients into a sequence of sparse linear predictors.
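
The idea of distilling a posterior into a sequence of sparse linear predictors can be sketched in a deliberately simplified setting. The example below assumes an orthonormal design (X'X = I), in which minimizing 0.5 * ||b_bar - g||^2 + lam * ||g||_1 over g has the closed-form soft-thresholding solution. The function names and the posterior mean values are hypothetical, and the paper's general procedure is not restricted to this special case.

```python
def soft_threshold(value, penalty):
    """Shrink `value` toward zero by `penalty`, returning 0.0 inside the threshold."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def sparse_summary_path(posterior_mean, penalties):
    """One sparse coefficient vector per penalty level.

    ASSUMPTION: orthonormal design, so the penalized fit to the posterior
    mean predictor reduces to coordinate-wise soft-thresholding.
    """
    return [[soft_threshold(b, lam) for b in posterior_mean] for lam in penalties]

# Hypothetical posterior mean coefficients and an increasing penalty grid.
posterior_mean = [2.0, -0.5, 0.1]
path = sparse_summary_path(posterior_mean, [0.0, 0.3, 1.0])
model_sizes = [sum(1 for g in row if g != 0.0) for row in path]
```

Larger penalties give smaller models, so the path is a sequence of progressively sparser summaries of the same posterior rather than a set of separately fitted models.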

preprint · 2013 · arXiv

A Tractable State-Space Model for Symmetric Positive-Definite Matrices

Bayesian analysis of state-space models includes computing the posterior distribution of the system's parameters as well as filtering, smoothing, and predicting the system's latent states. When the latent states wander around $\mathbb{R}^n$ there are several well-known modeling components and computational tools that may be profitably combined to achieve these tasks. However, there are scenarios, like tracking an object in a video or tracking a covariance matrix of financial asset returns, when the latent states are restricted to a curve within $\mathbb{R}^n$ and these models and tools do not immediately apply. Within this constrained setting, most work has focused on filtering and less attention has been paid to the other aspects of Bayesian state-space inference, which tend to be more challenging. To that end, we present a state-space model whose latent states take values on the manifold of symmetric positive-definite matrices and for which one may easily compute the posterior distribution of the latent states and the system's parameters, in addition to filtered distributions and one-step ahead predictions. Deploying the model within the context of finance, we show how one can use realized covariance matrices as data to predict latent time-varying covariance matrices. This approach out-performs factor stochastic volatility.

preprint · 2013 · arXiv

Efficient Data Augmentation in Dynamic Models for Binary and Count Data

Dynamic linear models with Gaussian observations and Gaussian states lead to closed-form formulas for posterior simulation. However, these closed-form formulas break down when the response or state evolution ceases to be Gaussian. Dynamic, generalized linear models exemplify a class of models for which this is the case, and include, amongst other models, dynamic binomial logistic regression and dynamic negative binomial regression. Finding and appraising posterior simulation techniques for these models is important since modeling temporally correlated categories or counts is useful in a variety of disciplines, including ecology, economics, epidemiology, medicine, and neuroscience. In this paper, we present one such technique, Pólya-Gamma data augmentation, and compare it against two competing methods. We find that the Pólya-Gamma approach works well for dynamic logistic regression and for dynamic negative binomial regression when the count sizes are small. Supplementary files are provided for replicating the benchmarks.

preprint · 2010 · arXiv

Particle Learning and Smoothing

Particle learning (PL) provides state filtering, sequential parameter learning and smoothing in a general class of state space models. Our approach extends existing particle methods by incorporating the estimation of static parameters via a fully-adapted filter that utilizes conditional sufficient statistics for parameters and/or states as particles. State smoothing in the presence of parameter uncertainty is also solved as a by-product of PL. In a number of examples, we show that PL outperforms existing particle filtering alternatives and proves to be a competitor to MCMC.
