Source author record

David Benkeser

David Benkeser appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology, Machine Learning, math.ST, Statistics Theory, Applications

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint, 2022, arXiv

A Huber loss-based super learner with applications to healthcare expenditures

The complex distribution of healthcare expenditures poses challenges to statistical modeling via a single model. Super learning, an ensemble method that combines a range of candidate models, is a promising alternative for cost estimation and has shown benefits over a single model. However, standard approaches to super learning may perform poorly in settings where extreme values are present, such as healthcare expenditure data. We propose a super learner based on the Huber loss, a "robust" loss function that combines squared error loss with absolute loss to down-weight the influence of outliers. We derive oracle inequalities that establish bounds on the finite-sample and asymptotic performance of the method. We show that the proposed method can be used both directly to optimize Huber risk and in finite-sample settings where optimizing mean squared error is the ultimate goal. For this latter scenario, we provide two methods for performing a grid search for values of the robustification parameter indexing the Huber loss. Simulations and real data analysis demonstrate appreciable finite-sample gains in cost prediction and causal effect estimation using our proposed method.
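
As an illustration of the loss described in this abstract, the following is a minimal sketch of the textbook Huber loss, which is quadratic for small residuals and linear for large ones. The function names and the parameter name `delta` (standing in for the robustification parameter) are assumptions made for this example; the scaling convention used in the paper itself may differ, and this is not the authors' super learner implementation.

```python
from typing import Sequence


def huber_loss(residual: float, delta: float) -> float:
    """Textbook Huber loss for a single residual.

    Quadratic (squared-error-like) when |residual| <= delta,
    linear (absolute-error-like) beyond that threshold.
    `delta` is a hypothetical name for the robustification parameter.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    # The linear branch is offset so the two pieces meet at |residual| == delta.
    return delta * (r - 0.5 * delta)


def mean_huber_risk(y_true: Sequence[float], y_pred: Sequence[float], delta: float) -> float:
    """Empirical Huber risk: the average Huber loss over paired observations."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(huber_loss(y - p, delta) for y, p in zip(y_true, y_pred))
    return total / len(y_true)
```

With `delta = 1.0`, a residual of 0.5 falls in the quadratic region, while a residual of 3.0 is penalized linearly, so a single extreme cost contributes far less to the risk than it would under squared error.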

preprint, 2022, arXiv

Efficient estimation of modified treatment policy effects based on the generalized propensity score

Continuous treatments have posed a significant challenge for causal inference, both in the formulation and identification of scientifically meaningful effects and in their robust estimation. Traditionally, focus has been placed on techniques applicable to binary or categorical treatments with few levels, allowing for the application of propensity score-based methodology with relative ease. Efforts to accommodate continuous treatments introduced the generalized propensity score, yet estimators of this nuisance parameter commonly utilize parametric regression strategies that sharply limit the robustness and efficiency of inverse probability weighted estimators of causal effect parameters. We formulate and investigate a novel, flexible estimator of the generalized propensity score based on a nonparametric function estimator that provably converges at a suitably fast rate to the target functional so as to facilitate statistical inference. With this estimator, we demonstrate the construction of nonparametric inverse probability weighted estimators of a class of causal effect estimands tailored to continuous treatments. To ensure the asymptotic efficiency of our proposed estimators, we outline several non-restrictive selection procedures for utilizing a sieve estimation framework to undersmooth estimators of the generalized propensity score. We provide the first characterization of such inverse probability weighted estimators achieving the nonparametric efficiency bound in a setting with continuous treatments, demonstrating this in numerical experiments. We further evaluate the higher-order efficiency of our proposed estimators by deriving and numerically examining the second-order remainder of the corresponding efficient influence function in the nonparametric model. Open source software implementing our proposed estimation techniques, the haldensify R package, is briefly discussed.

preprint, 2021, arXiv

Inference for natural mediation effects under case-cohort sampling with applications in identifying COVID-19 vaccine correlates of protection

Combating the SARS-CoV-2 pandemic will require the fast development of effective preventive vaccines. Regulatory agencies may open accelerated approval pathways for vaccines if an immunological marker can be established as a mediator of a vaccine's protection. Large-scale efficacy trials of COVID-19 vaccines, where immune responses are measured subject to a case-cohort sampling design, are a rich source of information for identifying such correlates. We propose two approaches to estimation of mediation parameters in the context of case-cohort sampling designs. We establish the theoretical large-sample efficiency of our proposed estimators and evaluate them in a realistic simulation to understand whether they can be employed in the analysis of COVID-19 vaccine efficacy trials.

preprint, 2020, arXiv

Nonparametric inference for interventional effects with multiple mediators

Understanding the pathways whereby an intervention has an effect on an outcome is a common scientific goal. A rich body of literature provides various decompositions of the total intervention effect into pathway-specific effects. Interventional direct and indirect effects provide one such decomposition. Existing estimators of these effects are based on parametric models with confidence interval estimation facilitated via the nonparametric bootstrap. We provide theory that allows for more flexible, possibly machine learning-based, estimation techniques to be considered. In particular, we establish weak convergence results that facilitate the construction of closed-form confidence intervals and hypothesis tests. Finally, we demonstrate multiple robustness properties of the proposed estimators. Simulations show that inference based on large-sample theory has adequate small-sample performance. Our work thus provides a means of leveraging modern statistical learning techniques in estimation of interventional mediation effects.

preprint, 2019, arXiv

Design and analysis considerations for a sequentially randomized HIV prevention trial

TechStep is a randomized trial of mobile health interventions targeted toward transgender adolescents. The interventions include a short message system, a mobile-optimized web application, and electronic counseling. The primary outcomes are self-reported sexual risk behaviors and uptake of HIV-preventing medication. To evaluate the efficacy of several different combinations of interventions, the trial has a sequentially randomized design. We use a causal framework to formalize the estimands of the primary and key secondary analyses of the TechStep trial data. Targeted minimum loss-based estimators of these quantities are described and studied in simulation.