Source author record

Jason Roy

Jason Roy appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, Applications, Machine Learning, stat.OT

Catalog footprint

What is connected

6 works

4 topics

4 close collaborators

Preprint, 2023, arXiv

Hierarchical Bayesian Bootstrap for Heterogeneous Treatment Effect Estimation

A major focus of causal inference is the estimation of heterogeneous average treatment effects (HTE) - average treatment effects within strata of another variable of interest such as levels of a biomarker, education, or age strata. Inference involves estimating a stratum-specific regression and integrating it over the distribution of confounders in that stratum - which itself must be estimated. Standard practice involves estimating these stratum-specific confounder distributions independently (e.g. via the empirical distribution or Rubin's Bayesian bootstrap), which becomes problematic for sparsely populated strata with few observed confounder vectors. In this paper, we develop a nonparametric hierarchical Bayesian bootstrap (HBB) prior over the stratum-specific confounder distributions for HTE estimation. The HBB partially pools the stratum-specific distributions, thereby allowing principled borrowing of confounder information across strata when sparsity is a concern. We show that posterior inference under the HBB can yield efficiency gains over standard marginalization approaches while avoiding strong parametric assumptions about the confounder distribution. We use our approach to estimate the adverse event risk of proton versus photon chemoradiotherapy across various cancer types.
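The "standard practice" baseline named in this abstract (Rubin's Bayesian bootstrap applied independently within one stratum) can be sketched as follows. This is a minimal illustration, not the paper's hierarchical Bayesian bootstrap: the regression `toy_mu`, the confounder values, and all function names are hypothetical, and no pooling across strata is performed.

```python
import random

def dirichlet_weights(n, rng):
    """Draw Dirichlet(1, ..., 1) weights by normalizing Exp(1) variates."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def bb_stratum_effect(confounders, mu, n_draws=2000, seed=0):
    """Posterior draws of a stratum-level effect under Rubin's Bayesian bootstrap.

    confounders: confounder values observed in a single stratum
    mu: hypothetical fitted regression, mu(a, x) -> expected outcome
    """
    rng = random.Random(seed)
    # Treated-minus-control contrast at each observed confounder value.
    contrasts = [mu(1, x) - mu(0, x) for x in confounders]
    draws = []
    for _ in range(n_draws):
        # Each draw reweights the observed confounder vectors.
        w = dirichlet_weights(len(contrasts), rng)
        draws.append(sum(wi * ci for wi, ci in zip(w, contrasts)))
    return draws

def toy_mu(a, x):
    # Hypothetical regression: the treatment effect grows with x.
    return 1.0 + 0.5 * x + a * (2.0 + x)

sparse_stratum = [0.1, 0.9, 1.4]  # only three observed confounder values
draws = bb_stratum_effect(sparse_stratum, toy_mu)
posterior_mean = sum(draws) / len(draws)
```

With only three support points, every posterior draw is a reweighting of the same three contrasts. That limitation in sparse strata is what motivates borrowing confounder information across strata.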

Preprint, 2022, arXiv

A Bayesian nonparametric approach for causal inference with multiple mediators

Mediation analysis with contemporaneously observed multiple mediators is an important area of causal inference. Recent approaches for multiple mediators are often based on parametric models and thus may suffer from model misspecification. Also, much of the existing literature either allows estimation of only the joint mediation effect or estimates the joint mediation effect as the sum of individual mediator effects, which is often not a reasonable assumption. In this paper, we propose a methodology which overcomes the two aforementioned drawbacks. Our method is based on a novel Bayesian nonparametric (BNP) approach, wherein the joint distribution of the observed data (outcome, mediators, treatment, and confounders) is modeled flexibly using an enriched Dirichlet process mixture with three levels: the first level characterizing the conditional distribution of the outcome given the mediators, treatment and the confounders, the second level corresponding to the conditional distribution of each of the mediators given the treatment and the confounders, and the third level corresponding to the distribution of the treatment and the confounders. We use standardization (g-computation) to compute causal mediation effects under three uncheckable assumptions that allow identification of the individual and joint mediation effects. The efficacy of our proposed method is demonstrated with simulations. We apply our proposed method to analyze data from a study of ventilator-associated pneumonia (VAP) co-infected patients, where the effect of the abundance of Pseudomonas on VAP infection is suspected to be mediated through antibiotics.
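The standardization (g-computation) step for mediation can be illustrated in a much simpler setting than the paper's: a single binary mediator with known plug-in models. The sketch below assumes hypothetical models `outcome_mean` and `mediator_prob` and the usual identification assumptions; it does not implement the enriched Dirichlet process mixture or handle multiple mediators.

```python
def mediation_g_formula(a, a_star, confounders, outcome_mean, mediator_prob):
    """Estimate E[Y(a, M(a_star))] by averaging over observed confounders.

    outcome_mean(a, m, x): hypothetical model for E[Y | A=a, M=m, X=x]
    mediator_prob(a, x):   hypothetical model for P(M=1 | A=a, X=x)
    """
    total = 0.0
    for x in confounders:
        p1 = mediator_prob(a_star, x)  # mediator distribution under a_star
        total += outcome_mean(a, 1, x) * p1 + outcome_mean(a, 0, x) * (1.0 - p1)
    return total / len(confounders)

def outcome_mean(a, m, x):
    # Hypothetical linear outcome model.
    return 1.0 + 2.0 * a + 3.0 * m + 0.5 * x

def mediator_prob(a, x):
    # Hypothetical mediator model: treatment raises P(M=1) from 0.2 to 0.7.
    return 0.2 + 0.5 * a

xs = [0.0, 1.0, 2.0, 3.0]
y11 = mediation_g_formula(1, 1, xs, outcome_mean, mediator_prob)
y10 = mediation_g_formula(1, 0, xs, outcome_mean, mediator_prob)
y00 = mediation_g_formula(0, 0, xs, outcome_mean, mediator_prob)

nde = y10 - y00           # natural direct effect
nie = y11 - y10           # natural indirect effect
total_effect = y11 - y00  # equals nde + nie
```

In this toy model the direct effect is the treatment coefficient (2.0) and the indirect effect is the mediator coefficient times the shift in mediator probability (3.0 × 0.5 = 1.5), up to floating-point rounding.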

Preprint, 2022, arXiv

Addressing Positivity Violations in Causal Effect Estimation using Gaussian Process Priors

In observational studies, causal inference relies on several key identifying assumptions. One identifiability condition is the positivity assumption, which requires the probability of treatment be bounded away from 0 and 1. That is, for every covariate combination, it should be possible to observe both treated and control subjects, i.e., the covariate distributions should overlap between treatment arms. If the positivity assumption is violated, population-level causal inference necessarily involves some extrapolation. Ideally, a greater amount of uncertainty about the causal effect estimate should be reflected in such situations. With that goal in mind, we construct a Gaussian process model for estimating treatment effects in the presence of practical violations of positivity. Advantages of our method include minimal distributional assumptions, a cohesive model for estimating treatment effects, and more uncertainty associated with areas in the covariate space where there is less overlap. We assess the performance of our approach with respect to bias and efficiency using simulation studies. The method is then applied to a study of critically ill female patients to examine the effect of undergoing right heart catheterization.
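For discrete covariates, the positivity condition described in this abstract can be checked empirically by looking for covariate strata in which nearly everyone (or no one) is treated. The sketch below is only such a diagnostic, with a hypothetical threshold `eps` and made-up records; it is not the Gaussian process model the abstract proposes.

```python
from collections import defaultdict

def positivity_violations(records, eps=0.05):
    """Return strata whose empirical treatment probability is within eps of 0 or 1.

    records: iterable of (covariate_stratum, treated) pairs, treated in {0, 1}
    eps: hypothetical tolerance for "bounded away from 0 and 1"
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for stratum, treated in records:
        counts[stratum][0] += treated
        counts[stratum][1] += 1
    flagged = {}
    for stratum, (n_treated, n_total) in counts.items():
        p = n_treated / n_total
        if p < eps or p > 1.0 - eps:
            flagged[stratum] = p
    return flagged

# Made-up data: every "old" subject is treated, so no controls overlap there.
records = [
    ("young", 1), ("young", 0), ("young", 1),
    ("old", 1), ("old", 1), ("old", 1),
]
flagged = positivity_violations(records)
```

Here only the "old" stratum is flagged. Any effect estimate for that stratum would rest on extrapolation from the "young" controls.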

Preprint, 2020, arXiv

Bayesian Nonparametric Cost-Effectiveness Analyses: Causal Estimation and Adaptive Subgroup Discovery

Cost-effectiveness analyses (CEAs) are at the center of health economic decision making. While these analyses help policy analysts and economists determine coverage, inform policy, and guide resource allocation, they are statistically challenging for several reasons. Cost and effectiveness are correlated and follow complex joint distributions which are difficult to capture parametrically. Effectiveness (often measured as increased survival time) and accumulated cost tend to be right-censored in many applications. Moreover, CEAs are often conducted using observational data with non-random treatment assignment. Policy-relevant causal estimation therefore requires robust confounding control. Finally, current CEA methods do not address cost-effectiveness heterogeneity in a principled way - often presenting population-averaged estimates even though significant effect heterogeneity may exist. Motivated by these challenges, we develop a nonparametric Bayesian model for joint cost-survival distributions in the presence of censoring. Our approach utilizes a joint Enriched Dirichlet Process prior on the covariate effects of cost and survival time, while using a Gamma Process prior on the baseline survival time hazard. Causal CEA estimands, with policy-relevant interpretations, are identified and estimated via a Bayesian nonparametric g-computation procedure. Finally, we outline how the induced clustering of the Enriched Dirichlet Process can be used to adaptively detect the presence of subgroups with different cost-effectiveness profiles. We outline an MCMC procedure for full posterior inference and evaluate frequentist properties via simulations. We use our model to assess the cost-efficacy of chemotherapy versus radiation adjuvant therapy for treating endometrial cancer in the SEER-Medicare database.

Preprint, 2020, arXiv

Estimating the impact of treatment compliance over time on smoking cessation using data from ecological momentary assessments (EMA)

The Wisconsin Smoker's Health Study (WSHS2) was a longitudinal trial conducted to compare the effectiveness of two commonly used smoking cessation treatments, varenicline and combination nicotine replacement therapy (cNRT), with the less intense standard of care, nicotine patch. The main finding of the WSHS2 study was that all three treatments had equivalent treatment effects. However, the compliance data collected via ecological momentary assessment (EMA) were not analyzed in depth. Compliance with the treatment regimens may represent a confounder, as varenicline and cNRT are more intense treatments and would likely have larger treatment effects if all subjects complied. In order to estimate the causal compliance effect, we view the counterfactual, the outcome that would have been observed if the subject had been allocated to the treatment counter to the fact, as a missing data problem and proceed to impute the counterfactual. Our contribution to the methodological literature lies in the extension of this idea to a more general analytic approach that includes mediators and confounders of the mediator-outcome relationship. Simulation results suggest that our method works well, and application to the WSHS2 data suggests that the treatment effects of nicotine patch, varenicline, and cNRT are equivalent after accounting for differences in treatment compliance.

Preprint, 2019, arXiv

A Bayesian Nonparametric Model for Zero-Inflated Outcomes: Prediction, Clustering, and Causal Estimation

Researchers are often interested in predicting outcomes, conducting clustering analysis to detect distinct subgroups of their data, or computing causal treatment effects. Pathological data distributions that exhibit skewness and zero-inflation complicate these tasks - requiring highly flexible, data-adaptive modeling. In this paper, we present a fully nonparametric Bayesian generative model for continuous, zero-inflated outcomes that simultaneously predicts structural zeros, captures skewness, and clusters patients with similar joint data distributions. The flexibility of our approach yields predictions that capture the joint data distribution better than commonly used zero-inflated methods. Moreover, we demonstrate that our model can be coherently incorporated into a standardization procedure for computing causal effect estimates that are robust to such data pathologies. Uncertainty at all levels of this model flows through to the causal effect estimates of interest - allowing easy point estimation, interval estimation, and posterior predictive checks verifying positivity, a required causal identification assumption. Our simulation results show point estimates to have low bias and interval estimates to have close to nominal coverage under complicated data settings. Under simpler settings, these results hold while incurring lower efficiency loss than comparator methods. Lastly, we use our proposed method to analyze zero-inflated inpatient medical costs among endometrial cancer patients receiving either chemotherapy or radiation therapy in the SEER-Medicare database.
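How a zero-inflated outcome model feeds into standardization can be sketched with a simple two-part mean: the expected outcome is the probability of a non-zero value times the mean among non-zero values, averaged over the observed confounders under each treatment. The models `zero_prob` and `positive_mean` below are hypothetical plug-ins with made-up coefficients; the paper's nonparametric Bayesian model and its posterior uncertainty are not reproduced.

```python
def two_part_mean(a, x, zero_prob, positive_mean):
    """E[Y | A=a, X=x] for a zero-inflated outcome.

    zero_prob(a, x):     hypothetical model for P(Y = 0 | A=a, X=x)
    positive_mean(a, x): hypothetical model for E[Y | Y > 0, A=a, X=x]
    """
    return (1.0 - zero_prob(a, x)) * positive_mean(a, x)

def standardized_effect(confounders, zero_prob, positive_mean):
    """Average treated-minus-control contrast over the observed confounders."""
    diffs = [
        two_part_mean(1, x, zero_prob, positive_mean)
        - two_part_mean(0, x, zero_prob, positive_mean)
        for x in confounders
    ]
    return sum(diffs) / len(diffs)

def zero_prob(a, x):
    # Hypothetical: treatment lowers the chance of a zero cost from 0.5 to 0.3.
    return 0.5 - 0.2 * a

def positive_mean(a, x):
    # Hypothetical: mean cost among subjects with non-zero cost.
    return 100.0 + 10.0 * x + 50.0 * a

xs = [0.0, 1.0, 2.0]
effect = standardized_effect(xs, zero_prob, positive_mean)
```

Treatment acts through both parts here: it changes how often the outcome is zero and how large it is when positive, and the standardized contrast combines the two.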