Researcher profile

Huiyuan Li

Huiyuan Li contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
6 topics
4 close collaborators

Published work

6 published items

Preprint | 2026 | arXiv

Bridging the Gap between Sparse Matrix Reordering and Factorization: A Deep Learning Framework for Fill-in Reduction

Sparse matrix reordering can significantly reduce the fill-in during matrix factorization, thereby decreasing the computational and storage requirements in sparse matrix computations. Finding a minimal fill-in ordering is known to be an NP-hard problem. Moreover, there is a paradox: matrix reordering is applied before matrix factorization, but the fill-ins that reordering methods aim to reduce are generated during factorization. To bridge the gap between reordering and factorization, we propose a deep learning framework to minimize a fill-in surrogate function based on spectral embedding. First, we employ a multi-grid-like GNN architecture to learn to approximate the eigenvectors associated with the smallest eigenvalues of the matrix's graph Laplacian, i.e., the spectral embedding, and to capture the global structural information of the matrix. Then, another multi-grid-like GNN architecture is used to minimize the potential space where fill-in can occur based on the rank distribution. Experimental results indicate that our approach achieves competitive performance compared with traditional graph-theoretic algorithms and deep learning methods.

Preprint | 2026 | arXiv

Learning Fill-in Reduction Ordering via Graph Policy Optimization for Sparse Matrices

Matrix reordering in large sparse solvers seeks a permutation that minimizes factorization fill-in to reduce memory and computation. Because the minimum fill-in ordering problem is NP-complete and fill-in is implicit in the sparsity pattern, graph-theoretic heuristics are used. Existing reinforcement learning methods either ignore sparsity patterns (missing the global fill-in) or lack local exact fill-in feedback. We propose a graph policy optimization method that models fill-ins from global and local views: both the policy and value networks use a multi-hop graph neural backbone to embed global fill-in; the policy further interacts with symbolic factorization over graphs to extract local, step-level fill-ins, and the resulting feedback is aligned with the value network via an adaptive saturation function to improve convergence. On the SuiteSparse Matrix Collection, our method achieves mean reductions of 29.3 in fill-ins and 31.3 in peak memory usage over state-of-the-art baselines.

Preprint | 2026 | arXiv

Self-Supervised Learning for Sparse Matrix Reordering

Rearranging the rows or columns of a sparse matrix using an appropriate ordering can significantly reduce fill-ins, i.e., new nonzeros introduced during matrix factorization, decreasing memory usage and runtime. However, finding an ordering that minimizes fill-ins is NP-complete. Existing approaches, including graph-theoretic and deep learning methods, rely on surrogate objectives without theoretical guarantees. The Fill-Path Theorem reveals a direct and intrinsic relationship between fill-in generation and the sparse structure of the matrix as path triplet inequalities. Here we first employ a multigrid graph network to capture structural information for each vertex. We then derive a triplet sampling strategy based on inequalities. Finally, we introduce an end-max chain loss function to reduce the number of triplets whose predicted scores satisfy these inequalities. Experimental evaluations on the publicly available SuiteSparse matrix collection demonstrate the superiority of the proposed method in terms of both fill-in reduction and speedup in LU factorization time.
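
To make the notion of fill-in used in the abstracts above concrete, here is a minimal sketch of the classical elimination game on an undirected graph, which counts the new edges (fill-ins) that a given elimination order creates for a symmetric sparsity pattern. It is an illustration only and is not taken from any of the listed papers; the function name `count_fill_in` and the star-graph example are assumptions made for this sketch.

```python
from itertools import combinations


def count_fill_in(edges, order):
    """Count fill edges created when vertices are eliminated in `order`.

    `edges` is an iterable of undirected (u, v) pairs describing the
    off-diagonal sparsity pattern of a symmetric matrix; `order` lists
    every vertex exactly once. (Hypothetical helper for illustration.)
    """
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    fill = 0
    for v in order:
        # Eliminate v: remove it from the graph ...
        nbrs = adj.pop(v)
        for a in nbrs:
            adj[a].discard(v)
        # ... and connect its remaining neighbors pairwise.
        # Every edge that did not exist before is one fill-in.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill += 1
    return fill


# Star graph ("arrow" matrix): vertex 0 is connected to 1, 2, 3, 4.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
center_first = count_fill_in(star, [0, 1, 2, 3, 4])
center_last = count_fill_in(star, [1, 2, 3, 4, 0])
print(center_first, center_last)
```

Eliminating the hub first turns its four neighbors into a clique, while eliminating it last creates no new edges, which is why the choice of ordering matters before factorization begins.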

Preprint | 2026 | arXiv

Shifting the Sweet Spot: High-Performance Matrix-Free Method for High-Order Elasticity

In high-order finite element analysis for elasticity, matrix-free Partial Assembly (PA) methods are a key technology for overcoming the memory bottleneck of traditional Full Assembly (FA). However, existing implementations fail to fully exploit the special structure of modern CPU architectures and tensor-product elements, causing their performance "sweet spot" to anomalously remain at the low order of $p \approx 2$, which severely limits the potential of high-order methods. To address this challenge, we design and implement a highly optimized PA operator within the MFEM framework, deeply integrated with a Geometric Multigrid (GMG) preconditioner. Our multi-level optimization strategy includes replacing the original $O(p^6)$ generic algorithm with an efficient $O(p^4)$ one based on tensor factorization, exploiting Voigt symmetry to reduce redundant computations for the elasticity problem, and employing macro-kernel fusion to enhance data locality and break the memory bandwidth bottleneck. Extensive experiments on mainstream x86 and ARM architectures demonstrate that our method successfully shifts the performance "sweet spot" to the higher-order region of $p \ge 6$. Compared to the MFEM baseline, the optimized core operator (kernel) achieves speedups of 7x to 83x, which translates to a 3.6x to 16.8x end-to-end performance improvement in the complete solution process. This paper provides a validated and efficient practical path for conducting large-scale, high-order elasticity simulations on mainstream CPU hardware.

Preprint | 2020 | arXiv

Vectorial ball Prolate spheroidal wave functions with the divergence free constraint

In this paper, we introduce a family of vectorial prolate spheroidal wave functions of real order $\alpha > -1$ on the unit ball in $\mathbb{R}^3$ that satisfy the divergence free constraint and are thus termed divergence free vectorial ball PSWFs. They are vectorial eigenfunctions of an integral operator related to the finite Fourier transform, and solve the divergence free constrained maximum concentration problem in three dimensions, i.e., to what extent can the total energy of a band-limited divergence free vectorial function be concentrated on the unit ball? Interestingly, any optimally concentrated divergence free vectorial function, when represented as a series in vector spherical harmonics, is also concentrated in one of the three vectorial spherical harmonics modes. Moreover, divergence free ball PSWFs are exactly the vectorial eigenfunctions of the second order Sturm-Liouville differential operator which defines the scalar ball PSWFs. Indeed, the divergence free vectorial ball PSWFs possess a simple and close relation with the scalar ball PSWFs such that they share the same merits. Simultaneously, it turns out that the divergence free ball PSWFs solve another second order Sturm-Liouville eigen equation defined through the curl operator $\nabla\times$ instead of the gradient operator $\nabla$.

Preprint | 2008 | arXiv

Discrete Fourier analysis on fundamental domain of $A_d$ lattice and on simplex in $d$-variables

A discrete Fourier analysis on the fundamental domain $\Omega_d$ of the $d$-dimensional lattice of type $A_d$ is studied, where $\Omega_2$ is the regular hexagon and $\Omega_3$ is the rhombic dodecahedron, and analogous results on the $d$-dimensional simplex are derived by considering invariant and anti-invariant elements. Our main results include Fourier analysis in trigonometric functions, interpolation and cubature formulas on these domains. In particular, a trigonometric Lagrange interpolation on the simplex is shown to satisfy an explicit compact formula and the Lebesgue constant of the interpolation is shown to be of the order of $(\log n)^d$. The basic trigonometric functions on the simplex can be identified with Chebyshev polynomials in several variables that have already appeared in the literature. We study common zeros of these polynomials and show that they are nodes for a family of Gaussian cubature formulas, which provides only the second known example of such formulas.