Researcher profile

Brian J Reich

Brian J Reich contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

A penalized complexity prior for deep Bayesian transfer learning with application to materials informatics

A key task in the emerging field of materials informatics is to use machine learning to predict a material's properties and functions. A fast and accurate predictive model allows researchers to more efficiently identify or construct a material with desirable properties. As in many fields, deep learning is one of the state-of-the art approaches, but fully training a deep learning model is not always feasible in materials informatics due to limitations on data availability, computational resources, and time. Accordingly, there is a critical need in the application of deep learning to materials informatics problems to develop efficient transfer learning algorithms. The Bayesian framework is natural for transfer learning because the model trained from the source data can be encoded in the prior distribution for the target task of interest. However, the Bayesian perspective on transfer learning is relatively unaccounted for in the literature, and is complicated for deep learning because the parameter space is large and the interpretations of individual parameters are unclear. Therefore, rather than subjective prior distributions for individual parameters, we propose a new Bayesian transfer learning approach based on the penalized complexity prior on the Kullback-Leibler divergence between the predictive models of the source and target tasks. We show via simulations that the proposed method outperforms other transfer learning methods across a variety of settings. The new method is then applied to a predictive materials science problem where we show improved precision for estimating the band gap of a material based on its structural properties.

preprint2022arXiv

DeepKriging: Spatially Dependent Deep Neural Networks for Spatial Prediction

In spatial statistics, a common objective is to predict values of a spatial process at unobserved locations by exploiting spatial dependence. Kriging provides the best linear unbiased predictor using covariance functions and is often associated with Gaussian processes. However, when considering non-linear prediction for non-Gaussian and categorical data, the Kriging prediction is no longer optimal, and the associated variance is often overly optimistic. Although deep neural networks (DNNs) are widely used for general classification and prediction, they have not been studied thoroughly for data with spatial dependence. In this work, we propose a novel DNN structure for spatial prediction, where the spatial dependence is captured by adding an embedding layer of spatial coordinates with basis functions. We show in theory and simulation studies that the proposed DeepKriging method has a direct link to Kriging in the Gaussian case, and it has multiple advantages over Kriging for non-Gaussian and non-stationary data, i.e., it provides non-linear predictions and thus has smaller approximation errors, it does not require operations on covariance matrices and thus is scalable for large datasets, and with sufficiently many hidden neurons, it provides the optimal prediction in terms of model capacity. We further explore the possibility of quantifying prediction uncertainties based on density prediction without assuming any data distribution. Finally, we apply the method to predicting PM2.5 concentrations across the continental United States.

preprint2020arXiv

A review of spatial causal inference methods for environmental and epidemiological applications

The scientific rigor and computational methods of causal inference have had great impacts on many disciplines, but have only recently begun to take hold in spatial applications. Spatial casual inference poses analytic challenges due to complex correlation structures and interference between the treatment at one location and the outcomes at others. In this paper, we review the current literature on spatial causal inference and identify areas of future work. We first discuss methods that exploit spatial structure to account for unmeasured confounding variables. We then discuss causal analysis in the presence of spatial interference including several common assumptions used to reduce the complexity of the interference patterns under consideration. These methods are extended to the spatiotemporal case where we compare and contrast the potential outcomes framework with Granger causality, and to geostatistical analyses involving spatial random fields of treatments and responses. The methods are introduced in the context of observational environmental and epidemiological studies, and are compared using both a simulation study and analysis of the effect of ambient air pollution on COVID-19 mortality rate. Code to implement many of the methods using the popular Bayesian software OpenBUGS is provided.

preprint2019arXiv

A comparison of statistical and machine learning methods for creating national daily maps of ambient PM$_{2.5}$ concentration

A typical problem in air pollution epidemiology is exposure assessment for individuals for which health data are available. Due to the sparsity of monitoring sites and the limited temporal frequency with which measurements of air pollutants concentrations are collected (for most pollutants, once every 3 or 6 days), epidemiologists have been moving away from characterizing ambient air pollution exposure solely using measurements. In the last few years, substantial research efforts have been placed in developing statistical methods or machine learning techniques to generate estimates of air pollution at finer spatial and temporal scales (daily, usually) with complete coverage. Some of these methods include: geostatistical techniques, such as kriging; spatial statistical models that use the information contained in air quality model outputs (statistical downscaling models); linear regression modeling approaches that leverage the information in GIS covariates (land use regression); or machine learning methods that mine the information contained in relevant variables (neural network and deep learning approaches). Although some of these exposure modeling approaches have been used in several air pollution epidemiological studies, it is not clear how much the predicted exposures generated by these methods differ, and which method generates more reliable estimates. In this paper, we aim to address this gap by evaluating a variety of exposure modeling approaches, comparing their predictive performance and computational difficulty. Using PM$_{2.5}$ in year 2011 over the continental U.S. as case study, we examine the methods' performances across seasons, rural vs urban settings, and levels of PM$_{2.5}$ concentrations (low, medium, high).