Source author record

Yunfeng Cai

Yunfeng Cai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Biological Physics math.NA Biomolecules Computation and Language cond-mat.mtrl-sci Numerical Analysis physics.comp-ph quant-ph

Catalog footprint

What is connected

8works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Convergence analysis of a block preconditioned steepest descent eigensolver with implicit deflation

Gradient-type iterative methods for solving Hermitian eigenvalue problems can be accelerated by using preconditioning and deflation techniques. A preconditioned steepest descent iteration with implicit deflation (PSD-id) is one of such methods. The convergence behavior of the PSD-id is recently investigated based on the pioneering work of Samokish on the preconditioned steepest descent method (PSD). The resulting non-asymptotic estimates indicate a superlinear convergence of the PSD-id under strong assumptions on the initial guess. The present paper utilizes an alternative convergence analysis of the PSD by Neymeyr under much weaker assumptions. We embed Neymeyr's approach into the analysis of the PSD-id using a restricted formulation of the PSD-id. More importantly, we extend the new convergence analysis of the PSD-id to a practically preferred block version of the PSD-id, or BPSD-id, and show the cluster robustness of the BPSD-id. Numerical examples are provided to validate the theoretical estimates.

preprint2022arXiv

On the Power-Law Hessian Spectrums in Deep Learning

It is well-known that the Hessian of deep loss landscape matters to optimization, generalization, and even robustness of deep learning. Recent works empirically discovered that the Hessian spectrum in deep learning has a two-component structure that consists of a small number of large eigenvalues and a large number of nearly-zero eigenvalues. However, the theoretical mechanism or the mathematical behind the Hessian spectrum is still largely under-explored. To the best of our knowledge, we are the first to demonstrate that the Hessian spectrums of well-trained deep neural networks exhibit simple power-law structures. Inspired by the statistical physical theories and the spectral analysis of natural proteins, we provide a maximum-entropy theoretical interpretation for explaining why the power-law structure exist and suggest a spectral parallel between protein evolution and training of deep neural networks. By conducing extensive experiments, we further use the power-law spectral framework as a useful tool to explore multiple novel behaviors of deep learning.

preprint2022arXiv

SpaceE: Knowledge Graph Embedding by Relational Linear Transformation in the Entity Space

Translation distance based knowledge graph embedding (KGE) methods, such as TransE and RotatE, model the relation in knowledge graphs as translation or rotation in the vector space. Both translation and rotation are injective; that is, the translation or rotation of different vectors results in different results. In knowledge graphs, different entities may have a relation with the same entity; for example, many actors starred in one movie. Such a non-injective relation pattern cannot be well modeled by the translation or rotation operations in existing translation distance based KGE methods. To tackle the challenge, we propose a translation distance-based KGE method called SpaceE to model relations as linear transformations. The proposed SpaceE embeds both entities and relations in knowledge graphs as matrices and SpaceE naturally models non-injective relations with singular linear transformations. We theoretically demonstrate that SpaceE is a fully expressive model with the ability to infer multiple desired relation patterns, including symmetry, skew-symmetry, inversion, Abelian composition, and non-Abelian composition. Experimental results on link prediction datasets illustrate that SpaceE substantially outperforms many previous translation distance based knowledge graph embedding methods, especially on datasets with many non-injective relations. The code is available based on the PaddlePaddle deep learning platform https://www.paddlepaddle.org.cn.

preprint2020arXiv

An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem

This paper considers the sparse generalized eigenvalue problem (SGEP), which aims to find the leading eigenvector with at most $k$ nonzero entries. SGEP naturally arises in many applications in machine learning, statistics, and scientific computing, for example, the sparse principal component analysis (SPCA), the sparse discriminant analysis (SDA), and the sparse canonical correlation analysis (SCCA). In this paper, we focus on the development of a three-stage algorithm named {\em inverse-free truncated Rayleigh-Ritz method} ({\em IFTRR}) to efficiently solve SGEP. In each iteration of IFTRR, only a small number of matrix-vector products is required. This makes IFTRR well-suited for large scale problems. Particularly, a new truncation strategy is proposed, which is able to find the support set of the leading eigenvector effectively. Theoretical results are developed to explain why IFTRR works well. Numerical simulations demonstrate the merits of IFTRR.

preprint2020arXiv

Magnetic Noise Enabled Biocompass

The discovery of magnetic protein provides a new understanding of a biocompass at the molecular level. However, the mechanism by which magnetic protein enables a biocompass is still under debate, mainly because of the absence of permanent magnetism in the magnetic protein at room temperature. Here, based on a widely accepted radical pair model of a biocompass, we propose a microscopic mechanism that allows the biocompass to operate without a finite magnetization of the magnetic protein in a biological environment. With the structure of the magnetic protein, we show that the magnetic fluctuation, rather than the permanent magnetism, of the magnetic protein can enable geomagnetic field sensing. An analysis of the quantum dynamics of our microscopic model reveals the necessary conditions for optimal sensitivity. Our work clarifies the mechanism by which magnetic protein enables a biocompass.

preprint2020arXiv

Solving the Robust Matrix Completion Problem via a System of Nonlinear Equations

We consider the problem of robust matrix completion, which aims to recover a low rank matrix $L_*$ and a sparse matrix $S_*$ from incomplete observations of their sum $M=L_*+S_*\in\mathbb{R}^{m\times n}$. Algorithmically, the robust matrix completion problem is transformed into a problem of solving a system of nonlinear equations, and the alternative direction method is then used to solve the nonlinear equations. In addition, the algorithm is highly parallelizable and suitable for large scale problems. Theoretically, we characterize the sufficient conditions for when $L_*$ can be approximated by a low rank approximation of the observed $M_*$. And under proper assumptions, it is shown that the algorithm converges to the true solution linearly. Numerical simulations show that the simple method works as expected and is comparable with state-of-the-art methods.

preprint2016arXiv

Convergence analysis of a locally accelerated preconditioned steepest descent method for Hermitian-definite generalized eigenvalue problems

By extending the classical analysis techniques due to Samokish, Faddeev and Faddeeva, and Longsine and McCormick among others, we prove the convergence of preconditioned steepest descent with implicit deflation (PSD-id) method for solving Hermitian-definite generalized eigenvalue problems. Furthermore, we derive a nonasymptotic estimate of the rate of convergence of the \psdid method. We show that with the proper choice of the shift, the indefinite shift-and-invert preconditioner is a locally accelerated preconditioner, and is asymptotically optimal that leads to superlinear convergence. Numerical examples are presented to verify the theoretical results on the convergence behavior of the \psdid method for solving ill-conditioned Hermitian-definite generalized eigenvalue problems arising from electronic structure calculations. While rigorous and full-scale convergence proofs of preconditioned block steepest descent methods in practical use still largely eludes us, we believe the theoretical results presented in this paper sheds light on an improved understanding of the convergence behavior of these block methods.

preprint2013arXiv

Hybrid preconditioning for iterative diagonalization of ill-conditioned generalized eigenvalue problems in electronic structure calculations

The iterative diagonalization of a sequence of large ill-conditioned generalized eigenvalue problems is a computational bottleneck in quantum mechanical methods employing a nonorthogonal basis for {\em ab initio} electronic structure calculations. We propose a hybrid preconditioning scheme to effectively combine global and locally accelerated preconditioners for rapid iterative diagonalization of such eigenvalue problems. In partition-of-unity finite-element (PUFE) pseudopotential density-functional calculations, employing a nonorthogonal basis, we show that the hybrid preconditioned block steepest descent method is a cost-effective eigensolver, outperforming current state-of-the-art global preconditioning schemes, and comparably efficient for the ill-conditioned generalized eigenvalue problems produced by PUFE as the locally optimal block preconditioned conjugate-gradient method for the well-conditioned standard eigenvalue problems produced by planewave methods.

Yunfeng Cai

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

Convergence analysis of a block preconditioned steepest descent eigensolver with implicit deflation

On the Power-Law Hessian Spectrums in Deep Learning

SpaceE: Knowledge Graph Embedding by Relational Linear Transformation in the Entity Space

An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem

Magnetic Noise Enabled Biocompass

Solving the Robust Matrix Completion Problem via a System of Nonlinear Equations

Convergence analysis of a locally accelerated preconditioned steepest descent method for Hermitian-definite generalized eigenvalue problems

Hybrid preconditioning for iterative diagonalization of ill-conditioned generalized eigenvalue problems in electronic structure calculations