Source author record

Yingzhou Li

Yingzhou Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.NA Numerical Analysis physics.comp-ph physics.chem-ph Machine Learning cond-mat.mtrl-sci quant-ph cond-mat.str-el Distributed, Parallel, and Cluster Computing math-ph math.MP math.PR

Catalog footprint

What is connected

18works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

On the Continuity of Schur-Horn Mapping

The Schur-Horn theorem is a well-known result that characterizes the relationship between the diagonal elements and eigenvalues of a symmetric (Hermitian) matrix. In this paper, we extend this theorem by exploring the eigenvalue perturbation of a symmetric (Hermitian) matrix with fixed diagonals, which is referred to as the continuity of the Schur-Horn mapping. We introduce a concept called strong Schur-Horn continuity, characterized by minimal constraints on the perturbation. We demonstrate that several categories of matrices exhibit strong Schur-Horn continuity. Leveraging this notion, along with a majorization constraint on the perturbation, we prove the Schur-Horn continuity for general symmetric (Hermitian) matrices. The Schur-Horn continuity finds applications in oblique manifold optimization related to quantum computing.

preprint2022arXiv

Global Convergence of Triangularized Orthogonalization-free Method

This paper proves the global convergence of a triangularized orthogonalization-free method (TriOFM). TriOFM, in general, applies a triangularization idea to the gradient of an objective function and removes the rotation invariance in minimizers. More precisely, in this paper, the TriOFM works as an eigensolver for sizeable sparse matrices and obtains eigenvectors without any orthogonalization step. Due to the triangularization, the iteration is a discrete-time flow in a non-conservative vector field. The global convergence relies on the stable manifold theorem, whereas the convergence to stationary points is proved in detail in this paper. We provide two proofs inspired by the noisy power method and the noisy optimization method, respectively.

preprint2022arXiv

KSSOLV 2.0: An efficient MATLAB toolbox for solving the Kohn-Sham equations with plane-wave basis set

KSSOLV (Kohn-Sham Solver) is a MATLAB toolbox for performing Kohn-Sham density functional theory (DFT) calculations with a plane-wave basis set. KSSOLV 2.0 preserves the design features of the original KSSOLV software to allow users and developers to easily set up a problem and perform ground-state calculations as well as to prototype and test new algorithms. Furthermore, it includes new functionalities such as new iterative diagonalization algorithms, k-point sampling for electron band structures, geometry optimization and advanced algorithms for performing DFT calculations with local, semi-local, and hybrid exchange-correlation functionals. It can be used to study the electronic structures of both molecules and solids. We describe these new capabilities in this work through a few use cases. We also demonstrate the numerical accuracy and computational efficiency of KSSOLV on a variety of examples.

preprint2022arXiv

Quantum Orbital Minimization Method for Excited States Calculation on Quantum Computer

We propose a quantum-classical hybrid variational algorithm, the quantum orbital minimization method (qOMM), for obtaining the ground state and low-lying excited states of a Hermitian operator. Given parameterized ansatz circuits representing eigenstates, qOMM implements quantum circuits to represent the objective function in the orbital minimization method and adopts classical optimizer to minimize the objective function with respect to parameters in ansatz circuits. The objective function has orthogonality implicitly embedded, which allows qOMM to apply a different ansatz circuit to each reference state. We carry out numerical simulations that seek to find excited states of the $\text{H}_{2}$, $\text{LiH}$, and a toy model consisting of 4 hydrogen atoms arranged in a square lattice in the STO-3G basis and UCCSD ansatz circuits. Comparing the numerical results with existing excited states methods, qOMM is less prone to getting stuck in local minima and can achieve convergence with more shallow ansatz circuits.

preprint2022arXiv

Universal approximation of symmetric and anti-symmetric functions

We consider universal approximations of symmetric and anti-symmetric functions, which are important for applications in quantum physics, as well as other scientific and engineering computations. We give constructive approximations with explicit bounds on the number of parameters with respect to the dimension and the target accuracy $ε$. While the approximation still suffers from the curse of dimensionality, to the best of our knowledge, these are the first results in the literature with explicit error bounds for functions with symmetry or anti-symmetry constraints.

preprint2020arXiv

Butterfly-Net: Optimal Function Representation Based on Convolutional Neural Networks

Deep networks, especially convolutional neural networks (CNNs), have been successfully applied in various areas of machine learning as well as to challenging problems in other scientific and engineering fields. This paper introduces Butterfly-Net, a low-complexity CNN with structured and sparse cross-channel connections, together with a Butterfly initialization strategy for a family of networks. Theoretical analysis of the approximation power of Butterfly-Net to the Fourier representation of input data shows that the error decays exponentially as the depth increases. Combining Butterfly-Net with a fully connected neural network, a large class of problems are proved to be well approximated with network complexity depending on the effective frequency bandwidth instead of the input dimension. Regular CNN is covered as a special case in our analysis. Numerical experiments validate the analytical results on the approximation of Fourier kernels and energy functionals of Poisson's equations. Moreover, all experiments support that training from Butterfly initialization outperforms training from random initialization. Also, adding the remaining cross-channel connections, although significantly increase the parameter number, does not much improve the post-training accuracy and is more sensitive to data distribution.

preprint2020arXiv

Butterfly-Net2: Simplified Butterfly-Net and Fourier Transform Initialization

Structured CNN designed using the prior information of problems potentially improves efficiency over conventional CNNs in various tasks in solving PDEs and inverse problems in signal processing. This paper introduces BNet2, a simplified Butterfly-Net and inline with the conventional CNN. Moreover, a Fourier transform initialization is proposed for both BNet2 and CNN with guaranteed approximation power to represent the Fourier transform operator. Experimentally, BNet2 and the Fourier transform initialization strategy are tested on various tasks, including approximating Fourier transform operator, end-to-end solvers of linear and nonlinear PDEs, and denoising and deblurring of 1D signals. On all tasks, under the same initialization, BNet2 achieves similar accuracy as CNN but has fewer parameters. And Fourier transform initialized BNet2 and CNN consistently improve the training and testing accuracy over the randomly initialized CNN.

preprint2020arXiv

Distributed-memory $\mathcal{H}$-matrix Algebra I: Data Distribution and Matrix-vector Multiplication

We introduce a data distribution scheme for $\mathcal{H}$-matrices and a distributed-memory algorithm for $\mathcal{H}$-matrix-vector multiplication. Our data distribution scheme avoids an expensive $Ω(P^2)$ scheduling procedure used in previous work, where $P$ is the number of processes, while data balancing is well-preserved. Based on the data distribution, our distributed-memory algorithm evenly distributes all computations among $P$ processes and adopts a novel tree-communication algorithm to reduce the latency cost. The overall complexity of our algorithm is $O\Big(\frac{N \log N}{P} + α\log P + β\log^2 P \Big)$ for $\mathcal{H}$-matrices under weak admissibility condition, where $N$ is the matrix size, $α$ denotes the latency, and $β$ denotes the inverse bandwidth. Numerically, our algorithm is applied to address both two- and three-dimensional problems of various sizes among various numbers of processes. On thousands of processes, good parallel efficiency is still observed.

preprint2020arXiv

ELSI -- An Open Infrastructure for Electronic Structure Solvers

Routine applications of electronic structure theory to molecules and periodic systems need to compute the electron density from given Hamiltonian and, in case of non-orthogonal basis sets, overlap matrices. System sizes can range from few to thousands or, in some examples, millions of atoms. Different discretization schemes (basis sets) and different system geometries (finite non-periodic vs. infinite periodic boundary conditions) yield matrices with different structures. The ELectronic Structure Infrastructure (ELSI) project provides an open-source software interface to facilitate the implementation and optimal use of high-performance solver libraries covering cubic scaling eigensolvers, linear scaling density-matrix-based algorithms, and other reduced scaling methods in between. In this paper, we present recent improvements and developments inside ELSI, mainly covering (1) new solvers connected to the interface, (2) matrix layout and communication adapted for parallel calculations of periodic and/or spin-polarized systems, (3) routines for density matrix extrapolation in geometry optimization and molecular dynamics calculations, and (4) general utilities such as parallel matrix I/O and JSON output. The ELSI interface has been integrated into four electronic structure code projects (DFTB+, DGDFT, FHI-aims, SIESTA), allowing us to rigorously benchmark the performance of the solvers on an equal footing. Based on results of a systematic set of large-scale benchmarks performed with Kohn-Sham density-functional theory and density-functional tight-binding theory, we identify factors that strongly affect the efficiency of the solvers, and propose a decision layer that assists with the solver selection process. Finally, we describe a reverse communication interface encoding matrix-free iterative solver strategies that are amenable, e.g., for use with planewave basis sets.

preprint2020arXiv

Interior Eigensolver for Sparse Hermitian Definite Matrices Based on Zolotarev's Functions

This paper proposes an efficient method for computing selected generalized eigenpairs of a sparse Hermitian definite matrix pencil $(A,B)$. Based on Zolotarev's best rational function approximations of the signum function and conformal mapping techniques, we construct the best rational function approximation of a rectangular function supported on an arbitrary interval via function compositions with partial fraction representations. This new best rational function approximation can be applied to construct spectrum filters of $(A,B)$ with a smaller number of poles than a direct construction without function compositions. Combining fast direct solvers and the shift-invariant generalized minimal residual method, a hybrid fast algorithm is proposed to apply spectral filters efficiently. Compared to the state-of-the-art algorithm FEAST, the proposed rational function approximation is more efficient when sparse matrix factorizations are required to solve multi-shift linear systems in the eigensolver, since the smaller number of matrix factorizations is needed in our method. The efficiency and stability of the proposed method are demonstrated by numerical examples from computational chemistry.

preprint2020arXiv

Optimal Orbital Selection for Full Configuration Interaction (OptOrbFCI): Pursuing the Basis Set Limit under a Budget

Full configuration interaction (FCI) solvers are limited to small basis sets due to their expensive computational costs. An optimal orbital selection for FCI (OptOrbFCI) is proposed to boost the power of existing FCI solvers to pursue the basis set limit under a computational budget. The optimization problem coincides with that of the complete active space SCF method (CASSCF), while OptOrbFCI is algorithmically quite different. OptOrbFCI effectively finds an optimal rotation matrix via solving a constrained optimization problem directly to compress the orbitals of large basis sets to one with a manageable size, conducts FCI calculations only on rotated orbital sets, and produces a variational ground-state energy and its wave function. Coupled with coordinate descent full configuration interaction (CDFCI), we demonstrate the efficiency and accuracy of the method on the carbon dimer and nitrogen dimer under basis sets up to cc-pV5Z. We also benchmark the binding curve of the nitrogen dimer under the cc-pVQZ basis set with 28 selected orbitals, which provide consistently lower ground-state energies than the FCI results under the cc-pVDZ basis set. The dissociation energy in this case is found to be of higher accuracy.

preprint2020arXiv

Tensor Ring Decomposition: Optimization Landscape and One-loop Convergence of Alternating Least Squares

In this work, we study the tensor ring decomposition and its associated numerical algorithms. We establish a sharp transition of algorithmic difficulty of the optimization problem as the bond dimension increases: On one hand, we show the existence of spurious local minima for the optimization landscape even when the tensor ring format is much over-parameterized, i.e., with bond dimension much larger than that of the true target tensor. On the other hand, when the bond dimension is further increased, we establish one-loop convergence for alternating least square algorithm for tensor ring decomposition. The theoretical results are complemented by numerical experiments for both local minimum and one-loop convergence for the alternating least square algorithm.

preprint2020arXiv

The CECAM Electronic Structure Library and the modular software development paradigm

First-principles electronic structure calculations are very widely used thanks to the many successful software packages available. Their traditional coding paradigm is monolithic, i.e., regardless of how modular its internal structure may be, the code is built independently from others, from the compiler up, with the exception of linear-algebra and message-passing libraries. This model has been quite successful for decades. The rapid progress in methodology, however, has resulted in an ever increasing complexity of those programs, which implies a growing amount of replication in coding and in the recurrent re-engineering needed to adapt to evolving hardware architecture. The Electronic Structure Library (\esl) was initiated by CECAM (European Centre for Atomic and Molecular Calculations) to catalyze a paradigm shift away from the monolithic model and promote modularization, with the ambition to extract common tasks from electronic structure programs and redesign them as free, open-source libraries. They include "heavy-duty" ones with a high degree of parallelisation, and potential for adaptation to novel hardware within them, thereby separating the sophisticated computer science aspects of performance optimization and re-engineering from the computational science done by scientists when implementing new ideas. It is a community effort, undertaken by developers of various successful codes, now facing the challenges arising in the new model. This modular paradigm will improve overall coding efficiency and enable specialists (computer scientists or computational scientists) to use their skills more effectively. It will lead to a more sustainable and dynamic evolution of software as well as lower barriers to entry for new developers.

preprint2019arXiv

A stochastic version of Stein Variational Gradient Descent for efficient sampling

We propose in this work RBM-SVGD, a stochastic version of Stein Variational Gradient Descent (SVGD) method for efficiently sampling from a given probability measure and thus useful for Bayesian inference. The method is to apply the Random Batch Method (RBM) for interacting particle systems proposed by Jin et al to the interacting particle systems in SVGD. While keeping the behaviors of SVGD, it reduces the computational cost, especially when the interacting kernel has long range. Numerical examples verify the efficiency of this new version of SVGD.

preprint2019arXiv

Coordinate-wise descent methods for leading eigenvalue problem

Leading eigenvalue problems for large scale matrices arise in many applications. Coordinate-wise descent methods are considered in this work for such problems based on a reformulation of the leading eigenvalue problem as a non-convex optimization problem. The convergence of several coordinate-wise methods is analyzed and compared. Numerical examples of applications to quantum many-body problems demonstrate the efficiency and provide benchmarks of the proposed coordinate-wise descent methods.

preprint2017arXiv

Bold Diagrammatic Monte Carlo in the Lens of Stochastic Iterative Methods

This work aims at understanding of bold diagrammatic Monte Carlo (BDMC) methods for stochastic summation of Feynman diagrams from the angle of stochastic iterative methods. The convergence enhancement trick of the BDMC is investigated from the analysis of condition number and convergence of the stochastic iterative methods. Numerical experiments are carried out for model systems to compare the BDMC with related stochastic iterative approaches.

preprint2015arXiv

A Multiscale Butterfly Algorithm for Multidimensional Fourier Integral Operators

This paper presents an efficient multiscale butterfly algorithm for computing Fourier integral operators (FIOs) of the form $(\mathcal{L} f)(x) = \int_{R^d}a(x,ξ) e^{2πıΦ(x,ξ)}\hat{f}(ξ) dξ$, where $Φ(x,ξ)$ is a phase function, $a(x,ξ)$ is an amplitude function, and $f(x)$ is a given input. The frequency domain is hierarchically decomposed into a union of Cartesian coronas. The integral kernel $a(x,ξ) e^{2πıΦ(x,ξ)}$ in each corona satisfies a special low-rank property that enables the application of a butterfly algorithm on the Cartesian phase-space grid. This leads to an algorithm with quasi-linear operation complexity and linear memory complexity. Different from previous butterfly methods for the FIOs, this new approach is simple and reduces the computational cost by avoiding extra coordinate transformations. Numerical examples in two and three dimensions are provided to demonstrate the practical advantages of the new algorithm.

preprint2015arXiv

Butterfly Factorization

The paper introduces the butterfly factorization as a data-sparse approximation for the matrices that satisfy a complementary low-rank property. The factorization can be constructed efficiently if either fast algorithms for applying the matrix and its adjoint are available or the entries of the matrix can be sampled individually. For an $N \times N$ matrix, the resulting factorization is a product of $O(\log N)$ sparse matrices, each with $O(N)$ non-zero entries. Hence, it can be applied rapidly in $O(N\log N)$ operations. Numerical results are provided to demonstrate the effectiveness of the butterfly factorization and its construction algorithms.

Yingzhou Li

What is connected

Connect this record

See the researcher in context

Building this map preview

18 published item(s)

On the Continuity of Schur-Horn Mapping

Global Convergence of Triangularized Orthogonalization-free Method

KSSOLV 2.0: An efficient MATLAB toolbox for solving the Kohn-Sham equations with plane-wave basis set

Quantum Orbital Minimization Method for Excited States Calculation on Quantum Computer

Universal approximation of symmetric and anti-symmetric functions

Butterfly-Net: Optimal Function Representation Based on Convolutional Neural Networks

Butterfly-Net2: Simplified Butterfly-Net and Fourier Transform Initialization

Distributed-memory $\mathcal{H}$-matrix Algebra I: Data Distribution and Matrix-vector Multiplication

ELSI -- An Open Infrastructure for Electronic Structure Solvers

Interior Eigensolver for Sparse Hermitian Definite Matrices Based on Zolotarev's Functions

Optimal Orbital Selection for Full Configuration Interaction (OptOrbFCI): Pursuing the Basis Set Limit under a Budget

Tensor Ring Decomposition: Optimization Landscape and One-loop Convergence of Alternating Least Squares

The CECAM Electronic Structure Library and the modular software development paradigm

A stochastic version of Stein Variational Gradient Descent for efficient sampling

Coordinate-wise descent methods for leading eigenvalue problem

Bold Diagrammatic Monte Carlo in the Lens of Stochastic Iterative Methods

A Multiscale Butterfly Algorithm for Multidimensional Fourier Integral Operators

Butterfly Factorization