Source author record

Yixiao Chen

Yixiao Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

physics.chem-ph physics.comp-ph Artificial Intelligence cond-mat.mtrl-sci cond-mat.stat-mech Distributed, Parallel, and Cluster Computing Machine Learning

Catalog footprint

What is connected

7 works

7 topics

4 close collaborators

preprint · 2026 · arXiv

Agentic Discovery of Exchange-Correlation Density Functionals

The development of accurate exchange-correlation (XC) functionals remains a longstanding challenge in density functional theory (DFT). The vast majority of XC functionals have been hand-designed by human researchers combining physical insight, exact constraints, and empirical fitting. Recent advances in large language models enable a systematic, automated alternative to this human-driven design loop. This report presents an agentic search system in which an LLM proposes structured functional-form changes guided by evolutionary history. The system attempts to improve functional performance through an iterative plan-execute-summarize loop, where improvements are measured by optimizing functional parameters against a standard thermochemistry dataset, then evaluating performance on a held-out subset. The strongest discovered functional, SAFS26-a (Seed Agentic Functional Search 2026), improves upon the gold-standard ωB97M-V baseline by ~9%. These results also surface a cautionary lesson for AI-assisted science: models powerful enough to discover genuine improvements are equally capable of exploiting unphysical shortcuts to game the benchmark; domain expertise translated into explicitly enforced constraints remains essential to keeping results scientifically grounded.
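As a rough illustration of the plan-execute-summarize loop described in this abstract, the sketch below runs a toy search. Everything in it is hypothetical and not taken from the SAFS26 system: `propose` stands in for the LLM, the "functional form" is only a polynomial degree, and the benchmark is synthetic data. It shows only the control flow: propose a form from the history, fit its parameters on a training set, and record the held-out error.

```python
import random

def make_data(n, seed):
    """Synthetic stand-in for a benchmark: y = 1 + 2x + 0.5x^2 plus small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 1.0 + 2.0 * x + 0.5 * x * x + rng.gauss(0.0, 0.01)) for x in xs]

def predict(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def mse(coeffs, data):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in data) / len(data)

def fit(degree, data, steps=2000, lr=0.1):
    """Execute step: optimize the parameters of a fixed form by gradient descent."""
    coeffs = [0.0] * (degree + 1)
    for _ in range(steps):
        grads = [0.0] * len(coeffs)
        for x, y in data:
            err = predict(coeffs, x) - y
            for k in range(len(coeffs)):
                grads[k] += 2.0 * err * x ** k / len(data)
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]
    return coeffs

def propose(history):
    """Plan step (hypothetical stand-in for the LLM): extend the best form so far."""
    best = min(history, key=lambda h: h["heldout"])
    return best["degree"] + 1

def search(rounds=3):
    train, heldout = make_data(40, seed=0), make_data(20, seed=1)
    history, degree = [], 0
    for _ in range(rounds):
        coeffs = fit(degree, train)
        # Summarize step: record what was tried and how it scored on held-out data.
        history.append({"degree": degree, "heldout": mse(coeffs, heldout)})
        degree = propose(history)
    return history

history = search()
for entry in history:
    print(entry["degree"], entry["heldout"])
```

Scoring on data that the fit never saw is what makes an improvement measurable in this toy. The abstract's caution still applies: a held-out score alone does not rule out unphysical shortcuts, which is why it calls for explicitly enforced constraints.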

preprint · 2025 · arXiv

Ab Initio Melting Properties of Water and Ice from Machine Learning Potentials

Liquid water exhibits several important anomalous properties in the vicinity of the melting temperature ($T_{\mathrm{m}}$) of ice Ih, including a higher density than ice and a density maximum at 4 $^{\circ}$C. Experimentally, an isotope effect on $T_{\mathrm{m}}$ is observed: the melting temperature of H$_2$O is approximately 4 K lower than that of D$_2$O. This difference can only be explained by nuclear quantum effects (NQEs), which can be accurately captured using path integral molecular dynamics (PIMD). Here we run PIMD simulations driven by Deep Potential (DP) models trained on data from density functional theory (DFT) based on SCAN, revPBE0-D3, SCAN0, and revPBE-D3 and a DP model trained on the MB-pol potential. We calculate the $T_{\mathrm{m}}$ of ice, the density discontinuity at melting, and the temperature of density maximum ($T_{\mathrm{dm}}$) of the liquid. We find that the model based on MB-pol agrees well with experiment. The models based on DFT incorrectly predict that NQEs increase $T_{\mathrm{m}}$. For the density discontinuity, SCAN and SCAN0 predict values close to the experimental result, while revPBE-D3 and revPBE0-D3 significantly underestimate it. Additionally, the models based on SCAN and SCAN0 correctly predict that the $T_{\mathrm{dm}}$ is higher than $T_{\mathrm{m}}$, while those based on revPBE-D3 and revPBE0-D3 predict the opposite. We attribute the deviations of the DFT-based models from experiment to the overestimation of hydrogen bond strength. Our results set the stage for more accurate simulations of aqueous systems grounded on DFT.

preprint · 2025 · arXiv

Assessment of First-Principles Methods in Modeling the Melting Properties of Water

First-principles simulations have played a crucial role in deepening our understanding of the thermodynamic properties of water, and machine learning potentials (MLPs) trained on these first-principles data widen the range of accessible properties. However, the capabilities of different first-principles methods are not yet fully understood due to the lack of systematic benchmarks, the underestimation of the uncertainties introduced by MLPs, and the neglect of nuclear quantum effects (NQEs). Here, we systematically assess first-principles methods by calculating key melting properties using path integral molecular dynamics (PIMD) driven by Deep Potential (DP) models trained on data from density functional theory (DFT) with SCAN, revPBE0-D3, SCAN0 and revPBE-D3 functionals, as well as from the MB-pol potential. We find that MB-pol is in qualitatively good agreement with the experiment in all properties tested, whereas the four DFT functionals incorrectly predict that NQEs increase the melting temperature. SCAN and SCAN0 slightly underestimate the density change between water and ice upon melting, but revPBE-D3 and revPBE0-D3 severely underestimate it. Moreover, SCAN and SCAN0 correctly predict that the maximum liquid density occurs at a temperature higher than the melting point, while revPBE-D3 and revPBE0-D3 predict the opposite behavior. Our results highlight limitations in widely used first-principles methods and call for a reassessment of their predictive power in aqueous systems.

preprint · 2022 · arXiv

DeePKS+ABACUS as a Bridge between Expensive Quantum Mechanical Models and Machine Learning Potentials

Recently, the development of machine learning (ML) potentials has made it possible to perform large-scale and long-time molecular simulations with the accuracy of quantum mechanical (QM) models. However, for high-level QM methods, such as density functional theory (DFT) at the meta-GGA level and/or with exact exchange, quantum Monte Carlo, etc., generating a sufficient amount of data for training an ML potential has remained computationally challenging due to their high cost. In this work, we demonstrate that this issue can be largely alleviated with Deep Kohn-Sham (DeePKS), an ML-based DFT model. DeePKS employs a computationally efficient neural network-based functional model to construct a correction term added to a cheap DFT model. After training, DeePKS yields energies and forces that closely match those of high-level QM methods, but the amount of training data required is orders of magnitude smaller than that required for training a reliable ML potential. As such, DeePKS can serve as a bridge between expensive QM models and ML potentials: one can generate a decent amount of high-accuracy QM data to train a DeePKS model, and then use the DeePKS model to label a much larger number of configurations to train an ML potential. This scheme for periodic systems is implemented in the DFT package ABACUS, which is open-source and ready for use in various applications.

preprint · 2022 · arXiv

DP Compress: a Model Compression Scheme for Generating Efficient Deep Potential Models

Machine-learning-based interatomic potential energy surface (PES) models are revolutionizing the field of molecular modeling. However, although much faster than electronic structure schemes, these models suffer from costly computations via deep neural networks to predict the energy and atomic forces, resulting in lower running efficiency compared to typical empirical force fields. Herein, we report a model compression scheme for boosting the performance of the Deep Potential (DP) model, a deep learning based PES model. This scheme, which we call DP Compress, is an efficient post-processing step after the training of DP models (DP Train). DP Compress combines several DP-specific compression techniques, which typically speed up DP-based molecular dynamics simulations by an order of magnitude and consume an order of magnitude less memory. We demonstrate that DP Compress is sufficiently accurate by testing a variety of physical properties of Cu, H2O, and Al-Cu-Mg systems. DP Compress applies to both CPU and GPU machines and is publicly available online.

preprint · 2022 · arXiv

Extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms

High-performance computing, together with a neural network model trained from data generated with first-principles methods, has greatly boosted applications of ab initio molecular dynamics in terms of spatial and temporal scales on modern supercomputers. Previous state-of-the-art can achieve 1-2 nanoseconds molecular dynamics simulation per day for 100-million atoms on the entire Summit supercomputer. In this paper, we have significantly reduced the memory footprint and computational time by a comprehensive approach with both algorithmic and system innovations. The neural network model is compressed by model tabulation, kernel fusion, and redundancy removal. Then optimizations such as acceleration of customized kernel, tabulation of activation function, MPI+OpenMP parallelization are implemented on GPU and ARM architectures. Testing results of the copper system show that the optimized code can scale up to the entire machine of both Fugaku and Summit, and the corresponding system size can be extended by a factor of 134 to an unprecedented 17 billion atoms. The strong scaling of a 13.5-million atom copper system shows that the time-to-solution can be 7 times faster, reaching 11.2 nanoseconds per day. This work opens the door for unprecedentedly large-scale molecular dynamics simulations based on ab initio accuracy and can be potentially utilized in studying more realistic applications such as mechanical properties of metals, semiconductor devices, batteries, etc. The optimization techniques detailed in this paper also provide insight for relevant high-performance computing applications.
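The abstract names tabulation of the activation function as one of its optimizations but does not describe the scheme. The sketch below shows the general idea only, under assumptions that are not from the paper: a uniform grid, linear interpolation, and clamping outside the tabulated range. The helper names `build_table` and `lookup` are hypothetical.

```python
import math

def build_table(f, lo, hi, n):
    """Precompute f at n + 1 evenly spaced points over [lo, hi]."""
    step = (hi - lo) / n
    return [f(lo + i * step) for i in range(n + 1)], lo, step

def lookup(table, x):
    """Evaluate the tabulated function by linear interpolation.

    Inputs outside the tabulated range are clamped to the nearest endpoint.
    """
    values, lo, step = table
    t = (x - lo) / step
    i = min(max(int(math.floor(t)), 0), len(values) - 2)
    frac = min(max(t - i, 0.0), 1.0)
    return values[i] + frac * (values[i + 1] - values[i])

# Tabulate tanh once, then compare table lookups against direct evaluation.
tanh_table = build_table(math.tanh, -8.0, 8.0, 4096)
max_err = max(
    abs(lookup(tanh_table, x / 100.0) - math.tanh(x / 100.0))
    for x in range(-1000, 1001)
)
print(max_err)
```

The trade-off is memory for arithmetic: a finer grid or a higher-order interpolant lowers the error and enlarges the table. A production kernel would choose both to fit the accuracy target and the hardware cache.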

preprint · 2020 · arXiv

Ground state energy functional with Hartree-Fock efficiency and chemical accuracy

We introduce the Deep Post-Hartree-Fock (DeePHF) method, a machine learning based scheme for constructing accurate and transferable models for the ground-state energy of electronic structure problems. DeePHF predicts the energy difference between results of highly accurate models such as the coupled cluster method and low accuracy models such as the Hartree-Fock (HF) method, using the ground-state electronic orbitals as the input. It preserves all the symmetries of the original high accuracy model. The added computational cost is less than that of the reference HF or DFT and scales linearly with respect to system size. We examine the performance of DeePHF on organic molecular systems using publicly available datasets and obtain state-of-the-art performance, particularly on large datasets.
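In symbols, the correction scheme described in this abstract can be sketched as follows. The notation is illustrative and does not come from the abstract: $E_{\mathrm{high}}$ is the energy from the accurate reference method (for example coupled cluster), $E_{\mathrm{HF}}$ is the Hartree-Fock energy, $\{\varphi_i\}$ are the ground-state HF orbitals, and $\Delta E_{\theta}$ is the learned model with parameters $\theta$.

$$
E_{\mathrm{high}} \approx E_{\mathrm{HF}} + \Delta E_{\theta}\left(\{\varphi_i\}\right)
$$

Under this reading, training would fit $\theta$ so that $\Delta E_{\theta}$ reproduces the difference $E_{\mathrm{high}} - E_{\mathrm{HF}}$ on reference data, so only the correction has to be learned and the HF calculation supplies the rest.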

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint