Researcher profile

Alpha A. Lee

Alpha A. Lee contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
8 topics
4 close collaborators

Published work

9 published items

preprint, 2022, arXiv

Achieving Robustness to Aleatoric Uncertainty with Heteroscedastic Bayesian Optimisation

Bayesian optimisation is a sample-efficient search methodology that holds great promise for accelerating drug and materials discovery programs. A frequently overlooked modelling consideration in Bayesian optimisation strategies, however, is the representation of heteroscedastic aleatoric uncertainty. In many practical applications it is desirable to identify inputs with low aleatoric noise, an example of which might be a material composition which consistently displays robust properties in response to a noisy fabrication process. In this paper, we propose a heteroscedastic Bayesian optimisation scheme capable of representing and minimising aleatoric noise across the input space. Our scheme employs a heteroscedastic Gaussian process (GP) surrogate model in conjunction with two straightforward adaptations of existing acquisition functions. First, we extend the augmented expected improvement (AEI) heuristic to the heteroscedastic setting and second, we introduce the aleatoric noise-penalised expected improvement (ANPEI) heuristic. Both methodologies are capable of penalising aleatoric noise in the suggestions and yield improved performance relative to homoscedastic Bayesian optimisation and random sampling on toy problems as well as on two real-world scientific datasets. Code is available at: https://github.com/Ryan-Rhys/Heteroscedastic-BO
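To make the idea of penalising aleatoric noise in an acquisition function concrete, here is a minimal Python sketch. The abstract does not state the ANPEI formula, so the weighted-sum form below (expected improvement minus a noise penalty, traded off by `beta`) is an assumption, and the function names `expected_improvement` and `noise_penalised_ei` are hypothetical. The expected improvement itself is the standard closed form for a Gaussian posterior under maximisation.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Closed-form expected improvement (maximisation) for a Gaussian posterior."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


def noise_penalised_ei(mean, std, best, noise_std, beta=0.5):
    """Assumed scalarisation: reward expected improvement, penalise predicted noise.

    noise_std is the surrogate's predicted aleatoric noise (standard deviation)
    at the candidate input; beta in [0, 1] sets the trade-off.
    """
    return beta * expected_improvement(mean, std, best) - (1.0 - beta) * noise_std


# Two candidates with identical posterior mean and epistemic uncertainty:
quiet = noise_penalised_ei(mean=1.0, std=0.5, best=0.8, noise_std=0.1)
noisy = noise_penalised_ei(mean=1.0, std=0.5, best=0.8, noise_std=0.9)
print(quiet > noisy)
```

Under this assumed form, a candidate with lower predicted noise is preferred whenever the two candidates are otherwise equal, which is the behaviour the abstract describes.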

preprint, 2022, arXiv

Inferring global dynamics from local structure in liquid electrolytes

Ion transport in concentrated electrolytes plays a fundamental role in electrochemical systems such as lithium ion batteries. Nonetheless, the mechanism of transport amid strong ion-ion interactions remains enigmatic. A key question is whether the dynamics of ion transport can be predicted by the local static structure alone, and if so, what are the key structural motifs that determine transport. In this paper, we show that machine learning can successfully decompose global conductivity into the spatio-temporal average of local, instantaneous ionic contributions, and relate this "local molar conductivity" field to the local ionic environment. Our machine learning model accurately predicts the molar conductivity of electrolyte systems that were not part of the training set, suggesting that the dynamics of ion transport is predictable from local static structure. Further, through analysing this machine-learned local conductivity field, we observe that fluctuations in local conductivity at high concentration are negatively correlated with total molar conductivity. Surprisingly, these fluctuations arise due to a long-tailed distribution of low conductivity ions, rather than distinct ion pairs, and are spatially correlated through both like- and unlike-charge interactions. More broadly, our approach shows how machine learning can aid the understanding of complex soft matter systems, by learning a function that attributes global collective properties to local, atomistic contributions.

preprint, 2022, arXiv

Rapid Discovery of Stable Materials by Coordinate-free Coarse Graining

A fundamental challenge in materials science pertains to elucidating the relationship between stoichiometry, stability, structure, and property. Recent advances have shown that machine learning can be used to learn such relationships, allowing the stability and functional properties of materials to be accurately predicted. However, most of these approaches use atomic coordinates as input and are thus bottlenecked by crystal structure identification when investigating novel materials. Our approach solves this bottleneck by coarse-graining the infinite search space of atomic coordinates into a combinatorially enumerable search space. The key idea is to use Wyckoff representations, coordinate-free sets of symmetry-related positions in a crystal, as the input to a machine learning model. Our model demonstrates exceptionally high precision in discovering new theoretically stable materials, identifying 1,569 materials that lie below the known convex hull of previously calculated materials from just 5,675 ab initio calculations. Our approach opens up fundamental advances in computational materials discovery.

preprint, 2021, arXiv

Machine learnt approximations to the bridge function yield improved closures for the Ornstein-Zernike equation

A key challenge for soft materials design and coarse-graining simulations is determining interaction potentials between components that give rise to desired condensed-phase structures. In theory, the Ornstein-Zernike equation provides an elegant framework for solving this inverse problem. Pioneering work in liquid state theory derived analytical closures for the framework. However, these analytical closures are approximations, valid only for specific classes of interaction potentials. In this work, we combine the physics of liquid state theory with machine learning to infer a closure directly from simulation data. The resulting closure is more accurate than commonly used closures across a broad range of interaction potentials. We show for two examples of a prototypical inverse design problem, fitting a coarse-grained simulation potential, that our approach leads to improved one-step inversion.
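For readers unfamiliar with the framework, the standard textbook form of the Ornstein-Zernike equation and its closure relation is sketched below. The notation is an assumption, since the abstract defines no symbols: $h(r)$ is the total correlation function, $c(r)$ the direct correlation function, $\rho$ the number density, $u(r)$ the pair potential, $\beta = 1/k_B T$, and $B(r)$ the bridge function.

```latex
% Ornstein-Zernike equation for a homogeneous, isotropic fluid
h(r) = c(r) + \rho \int c\left(|\mathbf{r} - \mathbf{r}'|\right) h(r') \, \mathrm{d}\mathbf{r}'

% Formally exact closure, written in terms of the bridge function B(r)
h(r) + 1 = \exp\left[ -\beta u(r) + h(r) - c(r) + B(r) \right]

% Hypernetted-chain (HNC) approximation: neglect the bridge function
B(r) \approx 0
```

Analytical closures such as HNC amount to a particular approximation for $B(r)$. As the title of this preprint indicates, the work instead approximates the bridge function with a model learnt from simulation data.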

preprint, 2020, arXiv

Materials Graph Transformer predicts the outcomes of inorganic reactions with reliable uncertainties

A common bottleneck for materials discovery is synthesis. While recent methodological advances have resulted in major improvements in the ability to predictively design novel materials, researchers often still rely on trial-and-error approaches for determining synthesis procedures. In this work, we develop a model that predicts the major product of solid-state reactions. The cardinal feature of this approach is the construction of fixed-length, learned representations of reactions. Precursors are represented as nodes on a 'reaction graph', and message-passing operations between nodes are used to embody the interactions between precursors in the reaction mixture. Through an ablation study, it is shown that this framework not only outperforms less physically-motivated baseline methods but also more reliably assesses the uncertainty in its predictions.

preprint, 2020, arXiv

Predicting materials properties without crystal structure: Deep representation learning from stoichiometry

Machine learning has the potential to accelerate materials discovery by accurately predicting materials properties at a low computational cost. However, the model inputs remain a key stumbling block. Current methods typically use descriptors constructed from knowledge of either the full crystal structure (and are therefore only applicable to materials with already characterised structures) or structure-agnostic fixed-length representations hand-engineered from the stoichiometry. We develop a machine learning approach that takes only the stoichiometry as input and automatically learns appropriate and systematically improvable descriptors from data. Our key insight is to treat the stoichiometric formula as a dense weighted graph between elements. Compared to the state of the art for structure-agnostic methods, our approach achieves lower errors with less data.
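The "dense weighted graph between elements" can be illustrated with a short Python sketch. This covers only the graph construction, not the learning model, and several choices are assumptions not given in the abstract: nodes are weighted by fractional composition, every ordered pair of distinct elements gets an edge, and each edge carries the fractional weight of the neighbouring element. The helper names `parse_formula` and `stoichiometry_graph` are hypothetical, and the parser handles only simple formulas without parentheses.

```python
import re
from itertools import permutations


def parse_formula(formula):
    """Parse a simple formula such as 'Fe2O3' into {element: amount}."""
    counts = {}
    for element, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[element] = counts.get(element, 0.0) + (float(amount) if amount else 1.0)
    return counts


def stoichiometry_graph(formula):
    """Build a fully connected, weighted element graph from a formula.

    Returns (node_weights, edges): node_weights maps each element to its
    fractional amount; edges maps each ordered pair (a, b) of distinct
    elements to the fractional amount of the neighbour b (an assumed choice).
    """
    counts = parse_formula(formula)
    total = sum(counts.values())
    node_weights = {element: n / total for element, n in counts.items()}
    edges = {(a, b): node_weights[b] for a, b in permutations(node_weights, 2)}
    return node_weights, edges


nodes, edges = stoichiometry_graph("LiFePO4")
print(nodes)
print(len(edges), "directed edges")
```

Because the graph depends only on the formula, it can be built for any composition, including materials whose crystal structure has not been characterised.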

preprint, 2019, arXiv

Validating the Validation: Reanalyzing a large-scale comparison of Deep Learning and Machine Learning models for bioactivity prediction

Machine learning methods may have the potential to significantly accelerate drug discovery. However, the increasing rate of new methodological approaches being published in the literature raises the fundamental question of how models should be benchmarked and validated. We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction and arrive at a somewhat different conclusion. We show that the performance of support vector machines is competitive with that of deep learning methods. Additionally, using a series of numerical experiments, we question the relevance of area under the receiver operating characteristic curve as a metric in virtual screening, and instead suggest that area under the precision-recall curve should be used in conjunction with the receiver operating characteristic. Our numerical experiments also highlight challenges in estimating the uncertainty in model performance via scaffold-split nested cross validation.
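The concern about ROC AUC in virtual screening can be illustrated with a toy calculation. The Python sketch below uses a synthetic ranking invented for illustration, not data from the paper: 1,000 compounds, 10 of them active, with the actives ranked just below 10 inactive compounds. The function names are hypothetical, and average precision is used here as one common estimate of the area under the precision-recall curve (ties in scores are not treated specially).

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random active outscores a random inactive."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each active compound."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Synthetic screen: higher score means ranked earlier; actives sit at ranks 11-20.
scores = [1000 - i for i in range(1000)]
labels = [1 if 10 <= i < 20 else 0 for i in range(1000)]

print(f"ROC AUC:           {roc_auc(labels, scores):.3f}")
print(f"Average precision: {average_precision(labels, scores):.3f}")
```

In this toy case the ROC AUC is close to 1 while the average precision is roughly one third, because only half of the top 20 picks are active. With heavily imbalanced data the two metrics can tell quite different stories, which is consistent with the abstract's suggestion to report them together.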

preprint, 2018, arXiv

Geometry of energy landscapes and the optimizability of deep neural networks

Deep neural networks are workhorse models in machine learning with multiple layers of non-linear functions composed in series. Their loss function is highly non-convex, yet empirically even gradient descent minimisation is sufficient to arrive at accurate and predictive models. It is hitherto unknown why deep neural networks are so easily optimizable. We analyze the energy landscape of a spin glass model of deep neural networks using random matrix theory and algebraic geometry. We analytically show that the multilayered structure holds the key to optimizability: when the number of parameters is fixed and the network depth is increased, the number of stationary points in the loss function decreases, minima become more clustered in parameter space, and the tradeoff between the depth and width of minima becomes less severe. Our analytical results are numerically verified through comparison with neural networks trained on a set of classical benchmark datasets. Our model uncovers generic design principles of machine learning models.

preprint, 2017, arXiv

Fluctuation Spectra and Force Generation in Non-equilibrium Systems

Many biological systems are appropriately viewed as passive inclusions immersed in an active bath: from proteins on active membranes to microscopic swimmers confined by boundaries. The non-equilibrium forces exerted by the active bath on the inclusions or boundaries often regulate function, and such forces may also be exploited in artificial active materials. Nonetheless, the general phenomenology of these active forces remains elusive. We show that the fluctuation spectrum of the active medium, the partitioning of energy as a function of wavenumber, controls the phenomenology of force generation. We find that for a narrow, unimodal spectrum, the force exerted by a non-equilibrium system on two embedded walls depends on the width and the position of the peak in the fluctuation spectrum, and oscillates between repulsion and attraction as a function of wall separation. We examine two apparently disparate examples: the Maritime Casimir effect and recent simulations of active Brownian particles. A key implication of our work is that important non-equilibrium interactions are encoded within the fluctuation spectrum. In this sense the noise becomes the signal.