Source author record

Danielle Braun

Danielle Braun appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Applications · Methodology

Catalog footprint

What is connected

5 works

2 topics

4 close collaborators

preprint · 2023 · arXiv

Evaluation of Model-Based PM$_{2.5}$ Estimates for Exposure Assessment During Wildfire Smoke Episodes in the Western U.S.

Investigating the health impacts of wildfire smoke requires data on people's exposure to fine particulate matter (PM$_{2.5}$) across space and time. In recent years, it has become common to use machine learning models to fill gaps in monitoring data. However, it remains unclear how well these models capture spikes in PM$_{2.5}$ during and across wildfire events. Here, we evaluate the accuracy of two high-coverage, high-resolution, machine-learning-derived PM$_{2.5}$ data sets created by Di et al. (2021) and Reid et al. (2021). In general, the Reid estimates are more accurate than the Di estimates when compared to independent validation data from mobile smoke monitors deployed by the US Forest Service. However, both models tend to severely under-predict PM$_{2.5}$ on high-pollution days. Our findings complement other recent studies calling for increased air pollution monitoring in the western US and support the inclusion of wildfire-specific monitoring observations and predictor variables in model-based estimates of PM$_{2.5}$. Lastly, we call for more rigorous error quantification of machine-learning-derived exposure data sets, with special attention to extreme events.
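The kind of error quantification the abstract calls for can be illustrated with a minimal sketch that reports bias and RMSE separately for low- and high-pollution days. This is not the authors' evaluation code: the function names, the 35 µg/m³ cutoff, and the five data points are all hypothetical and chosen only for illustration.

```python
import math


def error_summary(observed, predicted):
    """Mean bias (predicted minus observed) and RMSE for paired values."""
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"n": n, "bias": bias, "rmse": rmse}


def stratified_errors(observed, predicted, threshold=35.0):
    """Split days by observed PM2.5 (assumed cutoff) and summarise each group."""
    groups = {"low": [], "high": []}
    for o, p in zip(observed, predicted):
        groups["high" if o >= threshold else "low"].append((o, p))
    summary = {}
    for name, pairs in groups.items():
        if pairs:  # skip empty strata to avoid dividing by zero
            obs, pred = zip(*pairs)
            summary[name] = error_summary(obs, pred)
    return summary


# Synthetic daily values in µg/m³ (made up for this example, not study data).
observed = [8.0, 12.0, 20.0, 60.0, 150.0]
predicted = [9.0, 11.0, 22.0, 40.0, 90.0]
result = stratified_errors(observed, predicted)
```

In this toy input the overall fit looks acceptable on ordinary days, while the high stratum shows a large negative bias, which is the pattern of under-prediction on extreme days that a pooled metric would hide.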

preprint · 2023 · arXiv

Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information

Environmental Protection Agency (EPA) air quality (AQ) monitors, the gold standard for measuring air pollutants, are sparsely positioned across the US due to their costliness. Low-cost sensors (LCS) are increasingly being used by the public to fill in the gaps in AQ monitoring; however, LCS are not as accurate as EPA monitors. In this work, we investigate factors impacting the differences between an individual's true (unobserved) exposure to fine particulate matter (PM2.5) and the exposure reported by their nearest AQ instrument, which could be either an EPA monitor or an LCS. Three factors contributing to these differences are (1) distance to the nearest AQ instrument, (2) local variability in AQ, and (3) device measurement error. We examine the contributions of each component to the overall error in reported AQ using simulations based on California data. The simulations explore different combinations of hypothetical LCS placement strategies (at schools, near major roads, and in environmentally and socioeconomically marginalized census tracts) for different numbers of LCS, with varying plausible amounts of LCS device measurement error. For each scenario, we evaluate the accuracy of daily AQ information available from individuals' nearest AQ instrument with respect to absolute errors and misclassifications of the Air Quality Index, stratified by socioeconomic and demographic characteristics. We illustrate how real-time AQ reporting could be improved (or, in some cases, worsened) by using LCS, both for the population overall and for marginalized communities specifically. This work has implications for the integration of LCS into real-time AQ reporting platforms.

preprint · 2022 · arXiv

Assessing the causal effects of a stochastic intervention in time series data: Are heat alerts effective in preventing deaths and hospitalizations?

The methodological development of this paper is motivated by the need to address the following scientific question: does the issuance of heat alerts prevent adverse health effects? Our goal is to address this question within a causal inference framework in the context of time series data. A key challenge is that causal inference methods require the overlap assumption to hold: each unit (i.e., a day) must have a positive probability of receiving the treatment (i.e., issuing a heat alert on that day). In our motivating example, the overlap assumption is often violated: the probability of issuing a heat alert on a cooler day is zero. To overcome this challenge, we propose a stochastic intervention for time series data which is implemented via an incremental time-varying propensity score (ItvPS). The ItvPS intervention is executed by multiplying the probability of issuing a heat alert on day $t$ -- conditional on past information up to day $t$ -- by an odds ratio $\delta_t$. First, we introduce a new class of causal estimands that relies on the ItvPS intervention. We provide theoretical results to show that these causal estimands can be identified and estimated under a weaker version of the overlap assumption. Second, we propose nonparametric estimators based on the ItvPS and derive an upper bound for the variances of these estimators. Third, we extend this framework to multi-site time series using a spatial meta-analysis approach. Fourth, we show that the proposed estimators perform well in terms of bias and root mean squared error via simulations. Finally, we apply our proposed approach to estimate the causal effects of increasing the probability of issuing heat alerts on each warm-season day in reducing deaths and hospitalizations among Medicare enrollees in 2,837 U.S. counties.
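A minimal sketch of the odds-scale shift described above, under one common reading of an incremental propensity score intervention: the odds of treatment, rather than the raw probability, are multiplied by the odds ratio. The abstract does not give the formula, so the expression below and the function name `shift_propensity` are assumptions, not the paper's definition.

```python
def shift_propensity(p, delta):
    """Return the probability whose odds equal delta times the odds of p.

    Assumed form: q = delta * p / (delta * p + 1 - p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if delta <= 0.0:
        raise ValueError("delta must be a positive odds ratio")
    return delta * p / (delta * p + 1.0 - p)


# A day with zero alert probability stays at zero for any delta.
cool_day = shift_propensity(0.0, 2.0)
# delta = 1 leaves the observed propensity unchanged.
unchanged = shift_propensity(0.5, 1.0)
# Doubling the odds of 0.2 (odds 0.25) gives odds 0.5, i.e. probability 1/3.
warm_day = shift_propensity(0.2, 2.0)
```

Under this form, days on which an alert is impossible are left untouched by the intervention, which is consistent with the abstract's claim that identification needs only a weaker version of the overlap assumption.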

preprint · 2022 · arXiv

Statistical methods for Mendelian models with multiple genes and cancers

Risk evaluation to identify individuals who are at greater risk of cancer as a result of heritable pathogenic variants is a valuable component of individualized clinical management. Using principles of Mendelian genetics, Bayesian probability theory, and variant-specific knowledge, Mendelian models derive the probability of carrying a pathogenic variant and developing cancer in the future, based on family history. Existing Mendelian models are widely employed, but are generally limited to specific genes and syndromes. However, the upsurge of multi-gene panel germline testing has spurred the discovery of many new gene-cancer associations that are not presently accounted for in these models. We have developed PanelPRO, a flexible, efficient Mendelian risk prediction framework that can incorporate an arbitrary number of genes and cancers, overcoming the computational challenges that arise because of the increased model complexity. We implement an eleven-gene, eleven-cancer model, the largest Mendelian model created thus far, based on this framework. Using simulations and a clinical cohort with germline panel testing data, we evaluate model performance, validate the reverse-compatibility of our approach with existing Mendelian models, and illustrate its usage. Our implementation is freely available for research use in the PanelPRO R package.

preprint · 2020 · arXiv

Combining Breast Cancer Risk Prediction Models

Accurate risk stratification is key to reducing cancer morbidity through targeted screening and preventative interventions. Numerous breast cancer risk prediction models have been developed, but they often give predictions with conflicting clinical implications. Integrating information from different models may improve the accuracy of risk predictions, which would be valuable for both clinicians and patients. BRCAPRO and BCRAT are two widely used models based on largely complementary sets of risk factors. BRCAPRO is a Bayesian model that uses detailed family history information to estimate the probability of carrying a BRCA1/2 mutation, as well as future risk of breast and ovarian cancer, based on mutation prevalence and penetrance (age-specific probability of developing cancer given genotype). BCRAT uses a relative hazard model based on first-degree family history and non-genetic risk factors. We consider two approaches for combining BRCAPRO and BCRAT: 1) modifying the penetrance functions in BRCAPRO using relative hazard estimates from BCRAT, and 2) training an ensemble model that takes as input BRCAPRO and BCRAT predictions. We show that the combination models achieve performance gains over BRCAPRO and BCRAT in simulations and data from the Cancer Genetics Network.