Researcher profile

Alexander Shapeev

Alexander Shapeev contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2021arXiv

Free energy of (CoxMn1-x)3O4 mixed phases from machine-learning-enhanced ab initio calculations

(CoxMn1-x)3O4 is a promising candidate material for solar thermochemical energy storage. A high-temperature model for this system would provide a valuable tool for evaluating its potential. However, predicting phase diagrams of complex systems with ab initio calculations is challenging due to the varied sources affecting the free energy, and with the prohibitive amount of configurations needed in the configurational entropy calculation. In this work, we compare three different machine learning (ML) approaches for sampling the configuration space of (CoxMn1-x)3O4, including a simpler ML approach, which would be suitable for application in high-throughput studies. We use experimental data for a feature of the phase diagram to assess the accuracy of model predictions. We find that with some methods, data pre-treatment is needed to obtain accurate predictions due to inherently composition-imbalanced training data for a mixed phase. We highlight that the important entropy contributions depend on the physical regimes of the system under investigation and that energy predictions with ML models are more challenging at compositions where there are energetically competing ground state crystal structures. Similar methods to those outlined here can be used to screen other candidate materials for thermochemical energy storage

preprint2020arXiv

In operando active learning of interatomic interaction during large-scale simulations

A well-known drawback of state-of-the-art machine-learning interatomic potentials is their poor ability to extrapolate beyond the training domain. For small-scale problems with tens to hundreds of atoms this can be solved by using active learning which is able to select atomic configurations on which a potential attempts extrapolation and add them to the ab initio-computed training set. In this sense an active learning algorithm can be viewed as an on-the-fly interpolation of an ab initio model. For large-scale problems, possibly involving tens of thousands of atoms, this is not feasible because one cannot afford even a single density functional theory computation with such a large number of atoms. This work marks a new milestone toward fully automatic ab initio-accurate large-scale atomistic simulations. We develop an active learning algorithm that identifies local subregions of the simulation region where the potential extrapolates. Then the algorithm constructs periodic configurations out of these local, non-periodic subregions, sufficiently small to be computable with plane-wave density functional theory codes, in order to obtain accurate ab initio energies. We benchmark our algorithm on the problem of the screw dislocation motion in bcc tungsten and show that our algorithm reaches ab initio accuracy, down to typical magnitudes of numerical noise in DFT codes. We show that our algorithm reproduces material properties such as core structure, Peierls barrier, and Peierls stress. This unleashes new capabilities for computational materials science toward applications which have currently been out of scope if approached solely by ab initio methods.

preprint2019arXiv

Deeper Connections between Neural Networks and Gaussian Processes Speed-up Active Learning

Active learning methods for neural networks are usually based on greedy criteria which ultimately give a single new design point for the evaluation. Such an approach requires either some heuristics to sample a batch of design points at one active learning iteration, or retraining the neural network after adding each data point, which is computationally inefficient. Moreover, uncertainty estimates for neural networks sometimes are overconfident for the points lying far from the training sample. In this work we propose to approximate Bayesian neural networks (BNN) by Gaussian processes, which allows us to update the uncertainty estimates of predictions efficiently without retraining the neural network, while avoiding overconfident uncertainty prediction for out-of-sample points. In a series of experiments on real-world data including large-scale problems of chemical and physical modeling, we show superiority of the proposed approach over the state-of-the-art methods.

preprint2018arXiv

Dropout-based Active Learning for Regression

Active learning is relevant and challenging for high-dimensional regression models when the annotation of the samples is expensive. Yet most of the existing sampling methods cannot be applied to large-scale problems, consuming too much time for data processing. In this paper, we propose a fast active learning algorithm for regression, tailored for neural network models. It is based on uncertainty estimation from stochastic dropout output of the network. Experiments on both synthetic and real-world datasets show comparable or better performance (depending on the accuracy metric) as compared to the baselines. This approach can be generalized to other deep learning architectures. It can be used to systematically improve a machine-learning model as it offers a computationally efficient way of sampling additional data.