Researcher profile

Andrew D. White

Andrew D. White contributes to research discovery and scholarly infrastructure.

Published work

7 published item(s)

Preprint, 2022, arXiv

Iterative Symbolic Regression for Learning Transport Equations

Computational fluid dynamics (CFD) analysis is widely used in engineering. Although CFD calculations are accurate, the computational cost associated with complex systems makes it difficult to obtain empirical equations between system variables. Here we combine active learning (AL) and symbolic regression (SR) to get a symbolic equation for system variables from CFD simulations. Gaussian process regression-based AL allows for automated selection of variables by selecting the most instructive points from the available range of possible parameters. The results from these experiments are then passed to SR to find empirical symbolic equations for CFD models. This approach is scalable and applicable to any desired number of CFD design parameters. To demonstrate its effectiveness, we apply this method to two model systems. We recover an empirical equation for the pressure drop in a bent pipe and a new equation for predicting backflow in a heart valve under aortic insufficiency.
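
The point-selection step described in this abstract can be illustrated with a minimal sketch of uncertainty sampling under a Gaussian process. This is not the authors' code: the RBF kernel, the length scale of 0.3, the noise term, and the helper names `rbf`, `solve`, `predictive_variance`, and `select_points` are all assumptions made for illustration. With fixed kernel hyperparameters the predictive variance depends only on where points have been sampled, so the sketch needs no simulator; in practice each selected point would be sent to a CFD run and the results passed on to symbolic regression.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel (assumed; length scale is illustrative)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def predictive_variance(x_star, X, noise=1e-6):
    """GP posterior variance at x_star given sampled inputs X."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k_star = [rbf(x_star, a) for a in X]
    v = solve(K, k_star)  # v = K^{-1} k_star
    return rbf(x_star, x_star) - sum(ks * vi for ks, vi in zip(k_star, v))

def select_points(candidates, initial, n_queries):
    """Greedily pick the unsampled candidate with the largest variance."""
    X = list(initial)
    picks = []
    for _ in range(n_queries):
        pool = [c for c in candidates if c not in X]
        best = max(pool, key=lambda c: predictive_variance(c, X))
        X.append(best)
        picks.append(best)
    return picks

# Hypothetical one-dimensional design parameter scaled to [0, 1].
candidates = [i / 10 for i in range(11)]
picks = select_points(candidates, initial=[0.0, 1.0], n_queries=4)
print(picks)
```

Starting from the two endpoints, the first query lands at the midpoint 0.5, where the model is least certain; later queries fill the remaining gaps.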

Preprint, 2022, arXiv

Natural Language Processing Models That Automate Programming Will Transform Chemistry Research and Teaching

Natural language processing models have emerged that can generate usable software and automate a number of programming tasks with high fidelity. These tools have yet to have an impact on the chemistry community. Yet, our initial testing demonstrates that this form of Artificial Intelligence is poised to transform chemistry and chemical engineering research. Here, we review developments that brought us to this point, examine applications in chemistry, and give our perspective on how this may fundamentally alter research and teaching.

Preprint, 2022, arXiv

Physics is the New Data

The rapid development of machine learning (ML) methods has fundamentally affected numerous applications ranging from computer vision, biology, and medicine to accounting and text analytics. Until now, it was the availability of large and often labeled data sets that enabled significant breakthroughs. However, the adoption of these methods in classical physical disciplines has been relatively slow, a tendency that can be traced to the intrinsic differences between correlative approaches of purely data-based ML and the causal hypothesis-driven nature of physical sciences. Furthermore, anomalous behaviors of classical ML necessitate addressing issues such as explainability and fairness of ML. We also note that the sequence in which deep learning became mainstream in different scientific disciplines, starting from medicine and biology, then moving to theoretical chemistry, and only after that to physics, is rooted in the progressively more complex level of descriptors, constraints, and causal structures available for incorporation in ML architectures. Here we put forth that over the next decade, physics will become the new data, and this will continue the transition from the dot-coms and scientific computing concepts of the 1990s to the big data of 2000-2010 to the deep learning of 2010-2020 to physics-enabled scientific ML.

Preprint, 2021, arXiv

City-wide modeling of Vehicle-to-Grid Economics to Understand Effects of Battery Performance

Vehicle-to-grid (V2G) is a promising approach to solving the grid-level mismatch between intermittent supply and demand caused by renewable energy resources, because it uses the existing resource of electric vehicle (EV) batteries as the energy storage medium. EV battery design, together with an emphasis on profitability for participating EV owners, is pivotal for V2G success. To better understand which battery device parameters are most important for V2G adoption, we model the economics of the V2G process under realistic conditions. Most previous studies that perform V2G economic analysis assume ideal driving conditions, use linear battery degradation models, or only consider V2G for ancillary services. Our model accounts for realistic battery degradation, empirical charging efficiencies, randomness in commute behavior, and historic hourly electricity prices in six cities in the United States. We model user behavior with Bayesian optimization to provide a best-case scenario for V2G. Across all cities, we find that charging rate and efficiency are the most important factors that determine EV users' profits. Surprisingly, EV battery cost, and thus degradation due to cycling, has little effect. These findings should help focus research on figures of merit that better reflect real usage of batteries in a V2G economy.
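
To make the role of efficiency concrete, here is a toy calculation of the profit from a single charge and discharge cycle. It is a simplified sketch, not the model from the paper: the function `cycle_profit`, the flat per-kWh wear cost, and every number below are hypothetical, and the paper's model additionally covers realistic degradation, commute randomness, and hourly prices.

```python
def cycle_profit(energy_kwh, buy_price, sell_price,
                 charge_eff, discharge_eff, wear_cost_per_kwh):
    """Profit (in currency units) from storing energy_kwh and selling it back.

    Prices are per kWh at the grid side; efficiencies are fractions in (0, 1].
    """
    grid_energy_bought = energy_kwh / charge_eff    # losses while charging
    grid_energy_sold = energy_kwh * discharge_eff   # losses while discharging
    cost = grid_energy_bought * buy_price
    revenue = grid_energy_sold * sell_price
    wear = energy_kwh * wear_cost_per_kwh           # assumed linear wear cost
    return revenue - cost - wear

# Illustrative inputs only: 10 kWh cycled, bought at 0.10 and sold at 0.30 per kWh.
high_eff = cycle_profit(10.0, 0.10, 0.30, 0.9, 0.9, 0.02)
low_eff = cycle_profit(10.0, 0.10, 0.30, 0.8, 0.8, 0.02)
print(round(high_eff, 2), round(low_eff, 2))  # about 1.39 and 0.95
```

In this toy setting, lowering both efficiencies from 0.9 to 0.8 cuts the per-cycle profit by roughly a third, because efficiency losses are paid twice: once on the energy bought and once on the energy sold.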

Preprint, 2021, arXiv

Graph Neural Network Based Coarse-Grained Mapping Prediction

The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation. What is optimal for this choice remains an open question, and there is a need for theory. The current state-of-the-art approach is manual selection of mapping operators by experts. In this work, we demonstrate an automated approach by viewing this problem as supervised learning where we seek to reproduce the mapping operators produced by experts. We present a graph neural network based CG mapping predictor called Deep Supervised Graph Partitioning Model (DSGPM) that treats the selection of mapping operators as a graph segmentation problem. DSGPM is trained on a novel dataset, Human-annotated Mappings (HAM), consisting of 1,206 molecules with expert-annotated mapping operators. HAM can be used to facilitate further research in this area. Our model uses a novel metric learning objective to produce high-quality atomic features that are used in spectral clustering. The results show that DSGPM outperforms state-of-the-art methods in the field of graph segmentation. Finally, we find that predicted CG mapping operators indeed result in good CG MD models when used in simulation.

Preprint, 2021, arXiv

Inferring Spatial Source of Disease Outbreaks using Maximum Entropy

Mathematical modeling of disease outbreaks can infer the future trajectory of an epidemic, which can inform policy decisions. Another task is inferring the origin of a disease, which is relatively difficult with current mathematical models. Such frameworks, across varying levels of complexity, are typically sensitive to input data on epidemic parameters, case counts, and mortality rates, which are generally noisy and incomplete. To alleviate these limitations, we propose a maximum entropy framework that fits epidemiological models, provides calibrated infection-origin probabilities, and is robust to noise owing to a prior belief model. Maximum entropy is agnostic to the parameters or model structure used and allows for flexible use when faced with sparse data conditions and incomplete knowledge in the dynamical phase of disease spread, providing more reliable modeling at early stages of outbreaks. We evaluate the performance of our model by predicting future disease trajectories in synthetic graph networks and the real mobility network of New York state. In addition, unlike existing approaches, we demonstrate that the method can be used to infer the origin of the outbreak with accurate confidence. Indeed, despite the prevalent belief that contact tracing is feasible only in the initial stages of an outbreak, we report the possibility of reconstructing early disease dynamics, including the epidemic seed, at advanced stages.
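
The general idea of maximum entropy fitting can be sketched as reweighting an ensemble of simulated trajectories so that their weighted average matches an observation while staying as close as possible to uniform prior weights. This is a generic illustration under stated assumptions, not the framework from the paper: the function `maxent_weights`, the bisection search, the single scalar constraint, and the example values are all hypothetical. The maximum entropy solution has weights proportional to exp(lam * g_i), where g_i is the simulated quantity for trajectory i and lam is chosen so the constraint holds.

```python
import math

def maxent_weights(values, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Weights w_i proportional to exp(lam * values[i]) whose weighted mean
    equals target. Requires min(values) < target < max(values)."""
    def weighted_mean(lam):
        shift = max(lam * v for v in values)          # avoid overflow in exp
        w = [math.exp(lam * v - shift) for v in values]
        z = sum(w)
        w = [wi / z for wi in w]
        return sum(wi * v for wi, v in zip(w, values)), w

    # The weighted mean increases monotonically with lam, so bisection works.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean, w = weighted_mean(mid)
        if abs(mean - target) < tol:
            break
        if mean < target:
            lo = mid
        else:
            hi = mid
    return w

# Hypothetical case counts at one observation time from four simulated outbreaks.
simulated = [1.0, 2.0, 3.0, 4.0]
weights = maxent_weights(simulated, target=3.0)   # observed value (assumed)
fitted_mean = sum(w * v for w, v in zip(weights, simulated))
print([round(w, 3) for w in weights], round(fitted_mean, 6))
```

Because the observed value lies above the uniform average of 2.5, the weights shift toward trajectories with higher counts. Any other property of the trajectories, such as where each simulated outbreak was seeded, can then be averaged with the same weights.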