Source author record

Heather J. Kulik

Heather J. Kulik appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

physics.chem-ph cond-mat.mtrl-sci Machine Learning Biomolecules cond-mat.soft

Catalog footprint

What is connected

14 works

5 topics

4 close collaborators

preprint, 2022, arXiv

A Transferable Recommender Approach for Selecting the Best Density Functional Approximations in Chemical Discovery

Approximate density functional theory (DFT) has become indispensable owing to its cost-accuracy trade-off in comparison to more computationally demanding but accurate correlated wavefunction theory. To date, however, no single density functional approximation (DFA) with universal accuracy has been identified, leading to uncertainty in the quality of data generated from DFT. With electron density fitting and transfer learning, we build a DFA recommender that selects the DFA with the lowest expected error with respect to gold standard but cost-prohibitive coupled cluster theory in a system-specific manner. We demonstrate this recommender approach on vertical spin-splitting energy evaluation for challenging transition metal complexes. Our recommender predicts top-performing DFAs and yields excellent accuracy (ca. 2 kcal/mol) for chemical discovery, outperforming both individual transfer learning models and the single best functional in a set of 48 DFAs. We demonstrate the transferability of the DFA recommender to experimentally synthesized compounds with distinct chemistry.
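The selection step described above (pick the DFA with the lowest expected error for each system) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `recommend_dfa` and the error values are hypothetical, and the per-functional error predictions are assumed to come from separately trained models.

```python
from typing import Dict


def recommend_dfa(predicted_abs_errors: Dict[str, float]) -> str:
    """Return the functional with the lowest predicted absolute error.

    `predicted_abs_errors` maps a functional name to the error (kcal/mol)
    that a model predicts for one specific complex.
    """
    if not predicted_abs_errors:
        raise ValueError("need at least one candidate functional")
    return min(predicted_abs_errors, key=predicted_abs_errors.get)


# Hypothetical predictions for a single complex (values are made up).
predicted = {"B3LYP": 4.1, "PBE0": 2.7, "SCAN": 6.3}
best = recommend_dfa(predicted)
print(best)
```

Because the choice is made per complex, two complexes in the same screen may be assigned different functionals.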

preprint, 2022, arXiv

Active Learning Exploration of Transition Metal Complexes to Discover Method-Insensitive and Synthetically Accessible Chromophores

Transition metal chromophores with earth-abundant transition metals are an important design target for their applications in lighting and non-toxic bioimaging, but their design is challenged by the scarcity of complexes that simultaneously have optimal target absorption energies in the visible region as well as well-defined ground states. Machine learning (ML) accelerated discovery could overcome such challenges by enabling screening of a larger space, but is limited by the fidelity of the data used in ML model training, which is typically from a single approximate density functional. To address this limitation, we search for consensus in predictions among 23 density functional approximations across multiple rungs of Jacob's ladder. To accelerate the discovery of complexes with absorption energies in the visible region while minimizing multireference (MR) character, we use 2D efficient global optimization to sample candidate low-spin chromophores from multi-million complex spaces. Despite the scarcity (i.e., approx. 0.01%) of potential chromophores in this large chemical space, we identify candidates with high likelihood (i.e., > 10%) of computational validation as the ML models improve during active learning, representing a 1,000-fold acceleration in discovery. Absorption spectra of promising chromophores from time-dependent density functional theory verify that 2/3 of candidates have the desired excited state properties. The observation that constituent ligands from our leads have demonstrated interesting optical properties in the literature exemplifies the effectiveness of our construction of a realistic design space and active learning approach.
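One simple way to picture the search for consensus among functionals is to count how many of them place a candidate's absorption energy inside a target window. The sketch below is an assumption-laden illustration: the function name, the window bounds (1.5 to 3.5 eV as a rough stand-in for the visible region), and the example energies are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def consensus_fraction(energies_ev: Sequence[float], low: float, high: float) -> float:
    """Fraction of functionals whose predicted energy lies in [low, high]."""
    if not energies_ev:
        raise ValueError("need at least one prediction")
    hits = sum(1 for energy in energies_ev if low <= energy <= high)
    return hits / len(energies_ev)


# Hypothetical absorption energies (eV) from four functionals for one complex.
predictions = [2.0, 2.5, 4.0, 1.0]
# The window bounds are an assumed approximation of the visible region.
print(consensus_fraction(predictions, low=1.5, high=3.5))
```

A candidate with a fraction near 1 would be considered method-insensitive in this simplified picture.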

preprint, 2022, arXiv

Exploiting Ligand Additivity for Transferable Machine Learning of Multireference Character Across Known Transition Metal Complex Ligands

Accurate virtual high-throughput screening (VHTS) of transition metal complexes (TMCs) remains challenging due to the possibility of high multi-reference (MR) character that complicates property evaluation. We compute MR diagnostics for over 5,000 ligands present in previously synthesized transition metal complexes in the Cambridge Structural Database (CSD). To accomplish this task, we introduce an iterative approach for consistent ligand charge assignment for ligands in the CSD. Across this set, we observe that MR character correlates linearly with the inverse value of the averaged bond order over all bonds in the molecule. We then demonstrate that ligand additivity of MR character holds in TMCs, which suggests that the TMC MR character can be inferred from the sum of the MR character of the ligands. Encouraged by this observation, we leverage ligand additivity and develop a ligand-derived machine learning representation to train neural networks to predict the MR character of TMCs from properties of the constituent ligands. This approach yields models with excellent performance and superior transferability to unseen ligand chemistry and compositions.
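The additivity statement above (the MR character of a complex inferred from the sum over its ligands) can be written as a short sketch. The function names and the diagnostic values are hypothetical; which MR diagnostic is summed, and any metal-dependent offset, are left out as simplifying assumptions.

```python
from typing import Sequence


def additive_mr_estimate(ligand_mr: Sequence[float]) -> float:
    """Estimate a complex's MR diagnostic as the sum of ligand contributions."""
    return float(sum(ligand_mr))


def additivity_deviation(complex_mr: float, ligand_mr: Sequence[float]) -> float:
    """Difference between a computed complex value and the additive estimate."""
    return complex_mr - additive_mr_estimate(ligand_mr)


# Hypothetical per-ligand diagnostic values for a six-coordinate complex.
ligands = [0.10, 0.10, 0.10, 0.10, 0.25, 0.25]
print(additive_mr_estimate(ligands))
print(additivity_deviation(0.95, ligands))
```

A small deviation for known complexes is what would justify building a ligand-derived representation on top of this assumption.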

preprint, 2022, arXiv

Ligand Additivity and Divergent Trends in Two Types of Delocalization Errors from Approximate Density Functional Theory

Despite its widespread use, the predictive accuracy of density functional theory (DFT) is hampered by delocalization errors, especially for correlated systems such as transition-metal complexes. Two complementary tuning strategies have been developed to reduce delocalization error: eliminating the global curvature with respect to charge addition or removal, and computing a linear response Hubbard U as a measure of local curvature at the metal center at fixed charge and applying it to the transition-metal complex in a DFT+U framework. We investigate the relationship between the two measures of delocalization error as we manipulate the ligand field strength by varying the number of strong-field ligands in a series of heteroleptic complexes or by geometrically constraining the metal-ligand bond length in homoleptic octahedral complexes. We show that across these sets of complexes with varying ligand fields, an inverse relationship generally exists between global and local curvatures. We find that effects of ligand substitution on both measures of delocalization are typically additive, but the two quantities seldom coincide. The observation of ligand additivity suggests opportunities for evaluating errors on homoleptic complexes to infer corrections for lower-symmetry complexes.
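If ligand effects are additive, a quantity for a heteroleptic complex can be estimated by interpolating linearly between the two homoleptic limits. The sketch below assumes an octahedral complex with `n_a` ligands of type A and the rest of type B; the function name and the numerical values are hypothetical illustrations rather than results from the paper.

```python
def additive_interpolation(value_a: float, value_b: float, n_a: int, n_total: int = 6) -> float:
    """Interpolate between homoleptic values for a complex with n_a ligands of type A.

    `value_a` and `value_b` are the quantity of interest (for example a
    curvature or a Hubbard U) computed on the homoleptic A and B complexes.
    """
    if not 0 <= n_a <= n_total:
        raise ValueError("n_a must lie between 0 and n_total")
    weight_a = n_a / n_total
    return weight_a * value_a + (1.0 - weight_a) * value_b


# Hypothetical homoleptic values; estimate the complex with 3 A and 3 B ligands.
print(additive_interpolation(6.0, 0.0, n_a=3))
```

The abstract notes that additivity typically holds for both measures of delocalization error, so the same interpolation could be applied to each measure separately even though the two seldom coincide.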

preprint, 2022, arXiv

Machine learning models predict calculation outcomes with the transferability necessary for computational catalysis

Virtual high throughput screening (VHTS) and machine learning (ML) have greatly accelerated the design of single-site transition-metal catalysts. VHTS of catalysts, however, is often accompanied by a high calculation failure rate and wasted computational resources due to the difficulty of simultaneously converging all mechanistically relevant reactive intermediates to expected geometries and electronic states. We demonstrate a dynamic classifier approach, i.e., a convolutional neural network that monitors geometry optimization on the fly, and exploit its good performance and transferability for catalyst design. We show that the dynamic classifier performs well on all reactive intermediates in the representative catalytic cycle of the radical rebound mechanism for methane-to-methanol despite being trained on only one reactive intermediate. The dynamic classifier also generalizes to chemically distinct intermediates and metal centers absent from the training data without loss of accuracy or model confidence. We attribute this superior model transferability to the use of on-the-fly electronic structure and geometric information generated from density functional theory calculations and to the convolutional layer in the dynamic classifier. Combined with model uncertainty quantification, the dynamic classifier saves more than half of the computational resources that would have been wasted on unsuccessful calculations for all reactive intermediates being considered.

preprint, 2022, arXiv

Putting Density Functional Theory to the Test in Machine-Learning-Accelerated Materials Discovery

Accelerated discovery with machine learning (ML) has begun to provide the advances in efficiency needed to overcome the combinatorial challenge of computational materials design. Nevertheless, ML-accelerated discovery both inherits the biases of training data derived from density functional theory (DFT) and leads to many attempted calculations that are doomed to fail. Many compelling functional materials and catalytic processes involve strained chemical bonds, open-shell radicals and diradicals, or metal-organic bonds to open-shell transition-metal centers. Although promising targets, these materials present unique challenges for electronic structure methods and combinatorial challenges for their discovery. In this Perspective, we describe the advances needed in accuracy, efficiency, and approach beyond what is typical in conventional DFT-based ML workflows. These challenges have begun to be addressed through ML models trained to predict the results of multiple methods or the differences between them, enabling quantitative sensitivity analysis. For DFT to be trusted for a given data point in a high-throughput screen, it must pass a series of tests. ML models that predict the likelihood of calculation success and detect the presence of strong correlation will enable rapid diagnoses and adaptation strategies. These "decision engines" represent the first steps toward autonomous workflows that avoid the need for expert determination of the robustness of DFT-based materials discoveries.

preprint, 2022, arXiv

Two Wrongs Can Make a Right: A Transfer Learning Approach for Chemical Discovery with Chemical Accuracy

Appropriately identifying and treating molecules and materials with significant multi-reference (MR) character is crucial for achieving high data fidelity in virtual high throughput screening (VHTS). Nevertheless, most VHTS is carried out with approximate density functional theory (DFT) using a single functional. Despite development of numerous MR diagnostics, the extent to which a single value of such a diagnostic indicates MR effect on chemical property prediction is not well established. We evaluate MR diagnostics of over 10,000 transition metal complexes (TMCs) and compare to those in organic molecules. We reveal that only some MR diagnostics are transferable across these materials spaces. By studying the influence of MR character on chemical properties (i.e., MR effect) that involve multiple potential energy surfaces (i.e., adiabatic spin splitting, $\Delta E_\mathrm{H-L}$, and ionization potential, IP), we observe that cancellation in MR effect outweighs accumulation. Differences in MR character are more important than the total degree of MR character in predicting MR effect in property prediction. Motivated by this observation, we build transfer learning models to directly predict CCSD(T)-level adiabatic $\Delta E_\mathrm{H-L}$ and IP from lower levels of theory. By combining these models with uncertainty quantification and multi-level modeling, we introduce a multi-pronged strategy that accelerates data acquisition by at least a factor of three while achieving chemical accuracy (i.e., 1 kcal/mol) for robust VHTS.

preprint, 2021, arXiv

Molecular orbital projectors in non-empirical jmDFT recover exact conditions in transition metal chemistry

Low-cost, non-empirical corrections to semi-local density functional theory are essential for accurately modeling transition metal chemistry. Here, we demonstrate the judiciously-modified density functional theory (jmDFT) approach with non-empirical U and J parameters obtained directly from frontier orbital energetics on a series of transition metal complexes. We curate a set of nine representative Ti(III) and V(IV) $d^1$ transition metal complexes and evaluate their flat plane errors along the fractional spin and charge lines. We demonstrate that while jmDFT improves upon both DFT+U and semi-local DFT with the standard atomic orbital projectors (AOPs), it does so inefficiently. We rationalize these inefficiencies by quantifying hybridization in the relevant frontier orbitals for both the case of fractional spins and fractional charges. To overcome these limitations, we introduce a procedure for computing a molecular orbital projector (MOP) basis for use with jmDFT. We demonstrate this single set of $d^1$ MOPs to be suitable for nearly eliminating all energetic delocalization error and static correlation error. In all cases, the MOP jmDFT outperforms AOP jmDFT, and it eliminates most flat plane errors at non-empirical values. Unlike widely employed DFT+U or hybrid functionals, jmDFT nearly eliminates energetic delocalization error and static correlation error within a non-empirical framework.

preprint, 2021, arXiv

Representations and Strategies for Transferable Machine Learning Models in Chemical Discovery

Strategies for machine-learning(ML)-accelerated discovery that are general across materials composition spaces are essential, but demonstrations of ML have been primarily limited to narrow composition variations. By addressing the scarcity of data in promising regions of chemical space for challenging targets like open-shell transition-metal complexes, general representations and transferable ML models that leverage known relationships in existing data will accelerate discovery. Over a large set (ca. 1000) of isovalent transition-metal complexes, we quantify evident relationships for different properties (i.e., spin-splitting and ligand dissociation) between rows of the periodic table (i.e., 3d/4d metals and 2p/3p ligands). We demonstrate an extension to graph-based revised autocorrelation (RAC) representation (i.e., eRAC) that incorporates the effective nuclear charge alongside the nuclear charge heuristic that otherwise overestimates dissimilarity of isovalent complexes. To address the common challenge of discovery in a new space where data is limited, we introduce a transfer learning approach in which we seed models trained on a large amount of data from one row of the periodic table with a small number of data points from the additional row. We demonstrate the synergistic value of the eRACs alongside this transfer learning strategy to consistently improve model performance. Analysis of these models highlights how the approach succeeds by reordering the distances between complexes to be more consistent with the periodic table, a property we expect to be broadly useful for other materials domains.

preprint, 2016, arXiv

How large should the QM region be in QM/MM calculations? The case of catechol O-methyltransferase

Hybrid quantum mechanical-molecular mechanical (QM/MM) simulations are widely used in studies of enzymatic catalysis. Until recently, it has been cost prohibitive to determine the asymptotic limit of key energetic and structural properties with respect to increasingly large QM regions. Leveraging recent advances in electronic structure efficiency and accuracy, we investigate catalytic properties in catechol O-methyltransferase, a representative example of a methyltransferase critical to human health. Using QM regions ranging in size from reactants-only (64 atoms) to nearly one-third of the entire protein (940 atoms), we show that properties such as the activation energy approach within chemical accuracy of the large-QM asymptotic limits rather slowly, requiring approximately 500-600 atoms if the QM residues are chosen simply by distance from the substrate. This slow approach to asymptotic limit is due to charge transfer from protein residues to the reacting substrates. Our large QM/MM calculations enable identification of charge separation for fragments in the transition state as a key component of enzymatic methyl transfer rate enhancement. We introduce charge shift analysis that reveals the minimum number of protein residues (ca. 11-16 residues or 200-300 atoms for COMT) needed for quantitative agreement with large-QM simulations. The identified residues are not those that would be typically selected using criteria such as chemical intuition or proximity. These results provide a recipe for a more careful determination of QM region sizes in future QM/MM studies of enzymes.
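The convergence question posed above (how large the QM region must be before a property stays within chemical accuracy of the large-QM limit) can be sketched as a small search. The function name and the activation energies below are hypothetical; only the region sizes of 64 and 940 atoms come from the abstract, and the largest region is assumed to stand in for the asymptotic limit.

```python
from typing import Sequence


def smallest_converged_region(sizes: Sequence[int], values: Sequence[float], tol: float = 1.0) -> int:
    """Return the smallest QM region size from which the property stays within
    `tol` of the value at the largest region for all larger regions."""
    pairs = sorted(zip(sizes, values))
    reference = pairs[-1][1]          # largest region used as the reference
    converged_size = pairs[-1][0]
    for size, value in reversed(pairs):
        if abs(value - reference) > tol:
            break                     # first region (from the top) that is off
        converged_size = size
    return converged_size


# Hypothetical activation energies (kcal/mol) for growing QM regions.
region_sizes = [64, 200, 500, 940]
barriers = [20.0, 17.5, 15.6, 15.0]
print(smallest_converged_region(region_sizes, barriers, tol=1.0))
```

Walking down from the largest region, rather than up from the smallest, avoids declaring convergence at a size that only agrees with the reference by coincidence.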

preprint, 2016, arXiv

Where Does the Density Localize? Convergent Behavior for Global Hybrids, Range Separation, and DFT+U

Approximate density functional theory (DFT) suffers from many-electron self-interaction error, otherwise known as delocalization error, that may be diagnosed and then corrected through elimination of the deviation from exact piecewise linear behavior between integer electron numbers. Although paths to correction of energetic delocalization error are well-established, the impact of these corrections on the electron density is less well-studied. Here, we compare the effect on density delocalization of DFT+U, global hybrid tuning, and range-separated hybrid tuning on a diverse test set of 32 transition metal complexes and observe the three methods to have qualitatively equivalent effects on the ground state density. Regardless of valence orbital diffuseness (i.e., from 2p to 5p), ligand electronegativity (i.e., from Al to O), basis set (i.e., plane wave versus localized basis set), metal (i.e., Ti, Fe, Ni) and spin state, or tuning method, we consistently observe substantial charge loss at the metal and gain at ligand atoms (ca. 0.3-0.5 e or more). This charge loss at the metal is preferentially from the minority spin, leading to increasing magnetic moment as well. Using accurate wavefunction theory references, we observe that a minimum error in partial charges and magnetic moments occurs at higher tuning parameters than typically employed to eliminate energetic delocalization error. These observations motivate the need to develop multi-faceted approximate-DFT error correction approaches that separately treat density delocalization and energetic errors in order to recover both correct density and magnetization properties.
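The diagnostic mentioned above, deviation from piecewise linearity between integer electron numbers, can be sketched as the difference between fractional-charge energies and the straight line joining the integer endpoints. The function name and the energies are hypothetical; the fractional-charge energies are assumed to be available from separate calculations.

```python
from typing import List, Sequence


def linearity_deviation(fractions: Sequence[float], energies: Sequence[float]) -> List[float]:
    """Deviation of E(q) from the line connecting E(q=0) and E(q=1).

    `fractions` must run from 0.0 to 1.0 inclusive; negative deviations
    correspond to the convex curvature associated with delocalization error.
    """
    if len(fractions) != len(energies) or len(fractions) < 2:
        raise ValueError("need matching sequences with at least two points")
    if fractions[0] != 0.0 or fractions[-1] != 1.0:
        raise ValueError("fractions must start at 0.0 and end at 1.0")
    e_start, e_end = energies[0], energies[-1]
    return [e - ((1.0 - q) * e_start + q * e_end) for q, e in zip(fractions, energies)]


# Hypothetical energies (arbitrary units) at fractional electron addition.
q_values = [0.0, 0.5, 1.0]
e_values = [0.0, -2.0, -3.0]
print(linearity_deviation(q_values, e_values))
```

Tuning a correction so that these deviations vanish addresses the energetic error; the abstract's point is that the density may still require a different tuning parameter.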

preprint, 2015, arXiv

Direct Observation of Early-stage Quantum Dot Growth Mechanisms with High-temperature Ab Initio Molecular Dynamics

Colloidal quantum dots (QDs) exhibit highly desirable size- and shape-dependent properties for applications from electronic devices to imaging. Indium phosphide QDs have emerged as a primary candidate to replace the more toxic CdSe QDs, but production of InP QDs with the desired properties lags behind other QD materials due to a poor understanding of how to tune the growth process. Using high-temperature ab initio molecular dynamics (AIMD) simulations, we report the first direct observation of the early stage intermediates and subsequent formation of an InP cluster from separated indium and phosphorus precursors. In our simulations, indium agglomeration precedes formation of In-P bonds. We observe a predominantly intercomplex pathway in which In-P bonds form between one set of precursor copies while the carboxylate ligand of a second indium precursor in the agglomerated indium abstracts a ligand from the phosphorus precursor. This process produces an indium-rich cluster with structural properties comparable to those in bulk zinc-blende InP crystals. Minimum energy pathway characterization of the AIMD-sampled reaction events confirms these observations and identifies that In-carboxylate dissociation energetics solely determine the barrier along the In-P bond formation pathway, which is lower for intercomplex (13 kcal/mol) than intracomplex (21 kcal/mol) mechanisms. The phosphorus precursor chemistry, on the other hand, controls the thermodynamics of the reaction. Our observations of the differing roles of precursors in controlling QD formation strongly suggest that the challenges thus far encountered in InP QD synthesis optimization may be attributed to an overlooked need for a cooperative tuning strategy that simultaneously addresses the chemistry of both indium and phosphorus precursors.
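To give a feel for what the 13 versus 21 kcal/mol barriers imply, the sketch below estimates a relative rate from an Arrhenius-type expression. It assumes equal prefactors for both pathways and an illustrative temperature of 500 K; neither assumption comes from the abstract, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL_PER_MOL_K = 1.987204e-3


def rate_ratio(barrier_low: float, barrier_high: float, temperature_k: float) -> float:
    """Ratio k_low / k_high for two barriers (kcal/mol), assuming equal prefactors."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp((barrier_high - barrier_low) / (R_KCAL_PER_MOL_K * temperature_k))


# Barriers from the abstract; the temperature is an assumed illustrative value.
print(rate_ratio(13.0, 21.0, temperature_k=500.0))
```

Under these assumptions the lower-barrier intercomplex pathway is favored by several orders of magnitude, and the preference grows as the temperature is lowered.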

preprint, 2015, arXiv

Quantum Chemistry for Solvated Molecules on Graphical Processing Units (GPUs) using Polarizable Continuum Models

The conductor-like polarization model (C-PCM) with switching/Gaussian smooth discretization is a widely used implicit solvation model in chemical simulations. However, its application in quantum mechanical calculations of large-scale biomolecular systems can be limited by computational expense of both the gas phase electronic structure and the solvation interaction. We have previously used graphical processing units (GPUs) to accelerate the first of these steps. Here, we extend the use of GPUs to accelerate electronic structure calculations including C-PCM solvation. Implementation on the GPU leads to significant acceleration of the generation of the required integrals for C-PCM. We further propose two strategies to improve the solution of the required linear equations: a dynamic convergence threshold and a randomized block-Jacobi preconditioner. These strategies are not specific to GPUs and are expected to be beneficial for both CPU and GPU implementations. We benchmark the performance of the new implementation using over 20 small proteins in a solvent environment. Using a single GPU, our method evaluates the C-PCM related integrals and their derivatives more than 10X faster than a conventional CPU-based implementation. Our improvements to the linear solver provide a further 3X acceleration. The overall calculations including C-PCM solvation require typically 20-40% more effort than their gas phase counterparts for moderate basis set and molecule surface discretization level. The relative cost of the C-PCM solvation correction decreases as the basis sets and/or cavity radii increase. Therefore, description of solvation with this model should be routine. We also discuss applications to the study of the conformational landscape of an amyloid fibril.

preprint, 2015, arXiv

Towards quantifying the role of exact exchange in predictions of transition metal complex properties

We estimate the prediction sensitivity with respect to Hartree-Fock exchange in approximate density functionals for representative Fe(II) and Fe(III) octahedral complexes. Based on the observation that the range of parameters spanned by the most widely-employed functionals is relatively narrow, we compute electronic structure properties and spin-state orderings across a relatively broad range of Hartree-Fock exchange (0-50%) ratios. For the entire range considered, we consistently observe linear relationships between spin-state ordering and Hartree-Fock exchange that differ only based on the element of the direct ligand and thus may be broadly employed as measures of functional sensitivity in predictions of organometallic compounds. Hartree-Fock exchange in hybrid functionals is often assumed to correct self-interaction error-driven electron delocalization (e.g. from transition metal centers to neighboring ligands). Surprisingly, we instead observe that increasing Hartree-Fock exchange reduces charge on iron centers, corresponding to effective delocalization of charge to ligands, thus challenging notions of the role of Hartree-Fock exchange in shifting predictions of spin-state ordering.
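Since the abstract reports linear trends in spin-state ordering with Hartree-Fock exchange, the sensitivity can be summarized by the slope of a least-squares line. The sketch below uses a plain ordinary-least-squares fit; the function name and the spin-splitting values are hypothetical, and the exchange fraction is expressed on a 0 to 1 scale as an assumed convention.

```python
from typing import Sequence, Tuple


def exchange_sensitivity(hf_fractions: Sequence[float], splittings: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of spin splitting versus HF exchange fraction."""
    n = len(hf_fractions)
    if n != len(splittings) or n < 2:
        raise ValueError("need at least two matching data points")
    mean_x = sum(hf_fractions) / n
    mean_y = sum(splittings) / n
    sxx = sum((x - mean_x) ** 2 for x in hf_fractions)
    if sxx == 0.0:
        raise ValueError("exchange fractions must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hf_fractions, splittings))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical spin-splitting energies (kcal/mol) at 0%, 25% and 50% exchange.
slope, intercept = exchange_sensitivity([0.0, 0.25, 0.5], [10.0, 0.0, -10.0])
print(slope, intercept)
```

The slope, in kcal/mol per unit of exchange fraction, is one candidate for the functional-sensitivity measure the abstract refers to.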
