Researcher profile

Felix A. Faber

Felix A. Faber contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

Rapid Discovery of Stable Materials by Coordinate-free Coarse Graining

A fundamental challenge in materials science pertains to elucidating the relationship between stoichiometry, stability, structure, and property. Recent advances have shown that machine learning can be used to learn such relationships, allowing the stability and functional properties of materials to be accurately predicted. However, most of these approaches use atomic coordinates as input and are thus bottle-necked by crystal structure identification when investigating novel materials. Our approach solves this bottleneck by coarse-graining the infinite search space of atomic coordinates into a combinatorially enumerable search space. The key idea is to use Wyckoff representations -- coordinate-free sets of symmetry-related positions in a crystal -- as the input to a machine learning model. Our model demonstrates exceptionally high precision in discovering new theoretically stable materials, identifying 1,569 materials that lie below the known convex hull of previously calculated materials from just 5,675 ab-initio calculations. Our approach opens up fundamental advances in computational materials discovery.

preprint2020arXiv

An assessment of the structural resolution of various fingerprints commonly used in machine learning

Atomic environment fingerprints are widely used in computational materials science, from machine learning potentials to the quantification of similarities between atomic configurations. Many approaches to the construction of such fingerprints, also called structural descriptors, have been proposed. In this work, we compare the performance of fingerprints based on the Overlap Matrix(OM), the Smooth Overlap of Atomic Positions (SOAP), Behler-Parrinello atom-centered symmetry functions (ACSF), modified Behler-Parrinello symmetry functions (MBSF) used in the ANI-1ccx potential and the Faber-Christensen-Huang-Lilienfeld (FCHL) fingerprint under various aspects. We study their ability to resolve differences in local environments and in particular examine whether there are certain atomic movements that leave the fingerprints exactly or nearly invariant. For this purpose, we introduce a sensitivity matrix whose eigenvalues quantify the effect of atomic displacement modes on the fingerprint. Further, we check whether these displacements correlate with the variation of localized physical quantities such as forces. Finally, we extend our examination to the correlation between molecular fingerprints obtained from the atomic fingerprints and global quantities of entire molecules.

preprint2020arXiv

FCHL revisited: faster and more accurate quantum machine learning

We introduce the FCHL19 representation for atomic environments in molecules or condensed-phase systems. Machine learning models based on FCHL19 are able to yield predictions of atomic forces and energies of query compounds with chemical accuracy on the scale of milliseconds. FCHL19 is a revision of our previous work [Faber et al. 2018] where the representation is discretized and the individual features are rigorously optimized using Monte Carlo optimization. Combined with a Gaussian kernel function that incorporates elemental screening, chemical accuracy is reached for energy learning on the QM7b and QM9 datasets after training for minutes and hours, respectively. The model also shows good performance for non-bonded interactions in the condensed phase for a set of water clusters with an MAE binding energy error of less than 0.1 kcal/mol/molecule after training on 3,200 samples. For force learning on the MD17 dataset, our optimized model similarly displays state-of-the-art accuracy with a regressor based on Gaussian process regression. When the revised FCHL19 representation is combined with the operator quantum machine learning regressor, forces and energies can be predicted in only a few milliseconds per atom. The model presented herein is fast and lightweight enough for use in general chemistry problems as well as molecular dynamics simulations.

preprint2017arXiv

Machine learning prediction errors better than DFT accuracy

We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of thirteen electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to $\sim$117k distinct molecules. Molecular structures and properties at hybrid density functional theory (DFT) level of theory used for training and testing come from the QM9 database [Ramakrishnan et al, {\em Scientific Data} {\bf 1} 140022 (2014)] and include dipole moment, polarizability, HOMO/LUMO energies and gap, electronic spatial extent, zero point vibrational energy, enthalpies and free energies of atomization, heat capacity and the highest fundamental vibrational frequency. Various representations from the literature have been studied (Coulomb matrix, bag of bonds, BAML and ECFP4, molecular graphs (MG)), as well as newly developed distribution based variants including histograms of distances (HD), and angles (HDA/MARAD), and dihedrals (HDAD). Regressors include linear models (Bayesian ridge regression (BR) and linear regression with elastic net regularization (EN)), random forest (RF), kernel ridge regression (KRR) and two types of neural net works, graph convolutions (GC) and gated graph networks (GG). We present numerical evidence that ML model predictions deviate from DFT less than DFT deviates from experiment for all properties. Furthermore, our out-of-sample prediction errors with respect to hybrid DFT reference are on par with, or close to, chemical accuracy. Our findings suggest that ML models could be more accurate than hybrid DFT if explicitly electron correlated quantum (or experimental) data was available.