Source author record

Alejandro Schuler

Alejandro Schuler appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Machine Learning Applications

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

preprint | 2020 | arXiv

A Causal Machine Learning Framework for Predicting Preventable Hospital Readmissions

Clinical predictive algorithms are increasingly being used to form the basis for optimal treatment policies--that is, to enable interventions to be targeted to the patients who will presumably benefit most. Despite taking advantage of recent advances in supervised machine learning, these algorithms remain, in a sense, blunt instruments--often being developed and deployed without a full accounting of the causal aspects of the prediction problems they are intended to solve. Indeed, in many settings, including among patients at risk of readmission, the riskiest patients may derive less benefit from a preventative intervention compared to those at lower risk. Moreover, targeting an intervention to a population, rather than limiting it to a small group of high-risk patients, may lead to far greater overall utility if the patients with the most modifiable (or preventable) outcomes across the population could be identified. Based on these insights, we introduce a causal machine learning framework that decouples this prediction problem into causal and predictive parts, which clearly delineates the complementary roles of causal inference and prediction in this problem. We estimate treatment effects using causal forests, and characterize treatment effect heterogeneity across levels of predicted risk using these estimates. Furthermore, we show how these effect estimates could be used in concert with the modeled "payoffs" associated with successful prevention of individual readmissions to maximize overall utility. Based on data taken from before and after the implementation of a readmissions prevention intervention at Kaiser Permanente Northern California, our results suggest that nearly four times as many readmissions could be prevented annually with this approach compared to targeting this intervention using predicted risk.
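The contrast between targeting by predicted risk and targeting by expected benefit can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each patient already has a predicted readmission risk, an estimated absolute risk reduction under the intervention (for example, from a causal forest), and a modeled payoff per prevented readmission. The `Patient` fields, the `COST` value, and all numbers are hypothetical toy values.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    pid: str
    predicted_risk: float  # predicted P(readmission) without intervention
    effect: float          # estimated absolute risk reduction if treated (hypothetical)
    payoff: float          # modeled value of one prevented readmission (hypothetical)


def expected_utility(p: Patient, cost: float) -> float:
    """Expected net benefit of treating one patient."""
    return p.effect * p.payoff - cost


def target_by_risk(patients, k):
    """Treat the k patients with the highest predicted risk."""
    return sorted(patients, key=lambda p: p.predicted_risk, reverse=True)[:k]


def target_by_utility(patients, cost):
    """Treat every patient whose expected net benefit is positive."""
    return [p for p in patients if expected_utility(p, cost) > 0]


def expected_prevented(selected):
    """Expected number of readmissions prevented among treated patients."""
    return sum(p.effect for p in selected)


# Toy cohort: the riskiest patients (A, B) have small estimated effects.
COST = 500.0
patients = [
    Patient("A", 0.60, 0.02, 10_000.0),
    Patient("B", 0.45, 0.03, 10_000.0),
    Patient("C", 0.25, 0.10, 10_000.0),
    Patient("D", 0.20, 0.08, 10_000.0),
    Patient("E", 0.10, 0.01, 10_000.0),
]

by_utility = target_by_utility(patients, COST)
by_risk = target_by_risk(patients, k=len(by_utility))

print("risk-based:   ", [p.pid for p in by_risk], expected_prevented(by_risk))
print("utility-based:", [p.pid for p in by_utility], expected_prevented(by_utility))
```

In this toy cohort both policies treat the same number of patients, but the utility-based policy selects the patients with the most modifiable outcomes rather than the highest baseline risk. The size of the gap here is an artifact of the made-up numbers and says nothing about the paper's reported results.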

preprint | 2020 | arXiv

NGBoost: Natural Gradient Boosting for Probabilistic Prediction

We present Natural Gradient Boosting (NGBoost), an algorithm for generic probabilistic prediction via gradient boosting. Typical regression models return a point estimate, conditional on covariates, but probabilistic regression models output a full probability distribution over the outcome space, conditional on the covariates. This allows for predictive uncertainty estimation -- crucial in applications like healthcare and weather forecasting. NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm. Furthermore, we show how the Natural Gradient is required to correct the training dynamics of our multiparameter boosting approach. NGBoost can be used with any base learner, any family of distributions with continuous parameters, and any scoring rule. NGBoost matches or exceeds the performance of existing methods for probabilistic prediction while offering additional benefits in flexibility, scalability, and usability. An open-source implementation is available at github.com/stanfordmlgroup/ngboost.
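The role of the natural gradient can be illustrated with a small standalone sketch. This is not the NGBoost implementation and does not use the `ngboost` package: it assumes a Normal distribution parameterized by `(mu, log_sigma)`, uses the negative log-likelihood as the scoring rule, and fits a single constant distribution by natural-gradient descent, with no base learners. The function names are hypothetical. For this parameterization the Fisher information is `diag(1/sigma^2, 2)`, so the natural gradient is the ordinary gradient rescaled by its inverse.

```python
import math


def nll_gradient(y, mu, log_sigma):
    """Ordinary gradient of the Normal negative log-likelihood
    with respect to (mu, log_sigma)."""
    sigma2 = math.exp(2.0 * log_sigma)
    z2 = (y - mu) ** 2 / sigma2
    return ((mu - y) / sigma2, 1.0 - z2)


def natural_gradient(y, mu, log_sigma):
    """Ordinary gradient premultiplied by the inverse Fisher information,
    which is diag(sigma^2, 1/2) for this parameterization."""
    g_mu, g_log_sigma = nll_gradient(y, mu, log_sigma)
    sigma2 = math.exp(2.0 * log_sigma)
    return (g_mu * sigma2, g_log_sigma / 2.0)


def fit_constant(ys, steps=200, lr=0.1):
    """Fit one (mu, sigma) shared by all observations using
    natural-gradient descent on the mean negative log-likelihood."""
    mu, log_sigma = 0.0, 0.0
    n = len(ys)
    for _ in range(steps):
        grads = [natural_gradient(y, mu, log_sigma) for y in ys]
        mu -= lr * sum(g[0] for g in grads) / n
        log_sigma -= lr * sum(g[1] for g in grads) / n
    return mu, math.exp(log_sigma)


mu_hat, sigma_hat = fit_constant([1.0, 2.0, 3.0, 4.0, 5.0])
print(mu_hat, sigma_hat)
```

Note that the natural-gradient step for `mu` is simply `mu - y`, independent of the current `sigma`, whereas the ordinary gradient shrinks as `sigma` grows. That rescaling is one way to see why the natural gradient stabilizes joint updates of several distribution parameters.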

preprint | 2020 | arXiv

Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model

Clinical researchers often select among and evaluate risk prediction models using standard machine learning metrics based on confusion matrices. However, if these models are used to allocate interventions to patients, standard metrics calculated from retrospective data are only related to model utility (in terms of reductions in outcomes) under certain assumptions. When predictions are delivered repeatedly throughout time (e.g. in a patient encounter), the relationship between standard metrics and utility is further complicated. Several kinds of evaluations have been used in the literature, but it has not been clear what the target of estimation is in each evaluation. We synthesize these approaches, determine what is being estimated in each of them, and discuss under what assumptions those estimates are valid. We demonstrate our insights using simulated data as well as real data used in the design of an early warning system. Our theoretical and empirical results show that evaluations without interventional data either do not estimate meaningful quantities, require strong assumptions, or are limited to estimating best-case scenario bounds.
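A toy example can make the gap between confusion-matrix metrics and utility concrete. The sketch below is an illustration under strong simplifying assumptions, not the paper's analysis: outcomes are deterministic, and a hypothetical `preventable` column records whether the intervention would have averted each outcome, which is exactly the counterfactual information that retrospective data lacks. All names and values are invented for illustration.

```python
def confusion(flags, outcomes):
    """Return (TP, FP, FN, TN) for binary flags against observed outcomes."""
    tp = sum(1 for f, o in zip(flags, outcomes) if f == 1 and o == 1)
    fp = sum(1 for f, o in zip(flags, outcomes) if f == 1 and o == 0)
    fn = sum(1 for f, o in zip(flags, outcomes) if f == 0 and o == 1)
    tn = sum(1 for f, o in zip(flags, outcomes) if f == 0 and o == 0)
    return tp, fp, fn, tn


def outcomes_prevented(flags, outcomes, preventable):
    """Count outcomes averted if every flagged patient receives the intervention.
    `preventable` is a hypothetical counterfactual label, unobservable in practice."""
    return sum(
        1
        for f, o, p in zip(flags, outcomes, preventable)
        if f == 1 and o == 1 and p == 1
    )


# Observed outcomes without intervention, and the hypothetical counterfactual.
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]
preventable = [1, 1, 0, 0, 0, 0, 0, 0]

# Two hypothetical models that flag different patients.
model_a = [1, 1, 0, 0, 1, 0, 0, 0]
model_b = [0, 0, 1, 1, 1, 0, 0, 0]

print("A:", confusion(model_a, outcomes), outcomes_prevented(model_a, outcomes, preventable))
print("B:", confusion(model_b, outcomes), outcomes_prevented(model_b, outcomes, preventable))
```

Both models have identical confusion matrices, so any metric derived from them (sensitivity, precision, and so on) is identical, yet the number of outcomes the intervention would avert differs. Telling the two models apart requires information about treatment effects that standard retrospective metrics do not contain.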
