Researcher profile

Jian-Feng Cai

Jian-Feng Cai contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
12works
0followers
18topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2026arXiv

AdaPreLoRA: Adafactor Preconditioned Low-Rank Adaptation

Low-Rank Adaptation (LoRA) reparameterizes a weight update as a product of two low-rank factors, but the Jacobian $J_{G}$ of the generator mapping the factors to the weight matrix is rank-deficient, so the factor-space preconditioner $J_{G}^* {F}_t J_{G}$ induced by any ${W}$-space preconditioner ${F}_t$ is singular, and consequently the standard chain rule cannot be uniquely inverted to map a preconditioned ${W}$-space direction back to a factor-space update. We cast existing LoRA optimizers in a unified framework parameterized by two choices: (i) which invertible surrogate for $J_{G}^* {F}_t J_{G}$ to use, and (ii) which ${F}_t$ on ${W}$ to use. Existing methods occupy four families along these axes: factor-space adaptive updates, block-diagonal surrogates for $J_{G}^* J_{G}$, Frobenius-residual pseudoinverse methods, and Riemannian manifold constraint. Within this design space, a gradient-statistics-aware ${F}_t$ paired with a closed-form factor-space solve at ${O}((m+n)r)$ memory remains underexplored. We propose \textbf{AdaPreLoRA}, which fills this gap by adopting the Adafactor diagonal Kronecker preconditioner ${H}_t$ on ${W}$ and selecting from the resulting factor-space solution family the element minimizing an ${H}_t$-weighted imbalance between the two factor contributions; by construction, the resulting factor update is the closest LoRA approximation to the preconditioned ${W}$-space direction under the ${H}_t$-weighted norm. Across GPT-2 (E2E), Mistral-7B and Qwen2-7B (GLUE, ARC, GSM8K), and diffusion-model personalization, AdaPreLoRA is competitive with or improves over a representative set of LoRA optimizers while keeping peak GPU memory at the LoRA optimizer level.

preprint2024arXiv

Interlacing Polynomial Method for the Column Subset Selection Problem

This paper investigates the spectral norm version of the column subset selection problem. Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a positive integer $k\leq\text{rank}(\mathbf{A})$, the objective is to select exactly $k$ columns of $\mathbf{A}$ that minimize the spectral norm of the residual matrix after projecting $\mathbf{A}$ onto the space spanned by the selected columns. We use the method of interlacing polynomials introduced by Marcus-Spielman-Srivastava to derive a new upper bound on the minimal approximation error. This new bound is asymptotically sharp when the matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ obeys a spectral power-law decay. The relevant expected characteristic polynomials can be written as an extension of the expected polynomial for the restricted invertibility problem, incorporating two extra variable substitution operators. Finally, we propose a deterministic polynomial-time algorithm that achieves this error bound up to a computational error.

preprint2022arXiv

Approximation Theory of Total Variation Minimization for Data Completion

Total variation (TV) minimization is one of the most important techniques in modern signal/image processing, and has wide range of applications. While there are numerous recent works on the restoration guarantee of the TV minimization in the framework of compressed sensing, there are few works on the restoration guarantee of the restoration from partial observations. This paper is to analyze the error of TV based restoration from random entrywise samples. In particular, we estimate the error between the underlying original data and the approximate solution that interpolates (or approximates with an error bound depending on the noise level) the given data that has the minimal TV seminorm among all possible solutions. Finally, we further connect the error estimate for the discrete model to the sparse gradient restoration problem and to the approximation to the underlying function from which the underlying true data comes.

preprint2022arXiv

Approximation Theory of Wavelet Frame Based Image Restoration

In this paper, we analyze the error estimate of a wavelet frame based image restoration method from degraded and incomplete measurements. We present the error between the underlying original discrete image and the approximate solution which has the minimal $\ell_1$-norm of the canonical wavelet frame coefficients among all possible solutions. Then we further connect the error estimate for the discrete model to the approximation to the underlying function from which the underlying image comes.

preprint2022arXiv

Generalized Low-rank plus Sparse Tensor Estimation by Fast Riemannian Optimization

We investigate a generalized framework to estimate a latent low-rank plus sparse tensor, where the low-rank tensor often captures the multi-way principal components and the sparse tensor accounts for potential model mis-specifications or heterogeneous signals that are unexplainable by the low-rank part. The framework is flexible covering both linear and non-linear models, and can easily handle continuous or categorical variables. We propose a fast algorithm by integrating the Riemannian gradient descent and a novel gradient pruning procedure. Under suitable conditions, the algorithm converges linearly and can simultaneously estimate both the low-rank and sparse tensors. The statistical error bounds of final estimates are established in terms of the gradient of loss function. The error bounds are generally sharp under specific statistical models, e.g., the robust tensor PCA and the community detection in hypergraph networks with outlier vertices. Moreover, our method achieves non-trivial error bounds for heavy-tailed tensor PCA whenever the noise has a finite $2+\varepsilon$ moment. We apply our method to analyze the international trade flow dataset and the statistician hypergraph co-authorship network, both yielding new and interesting findings.

preprint2022arXiv

Provable Tensor-Train Format Tensor Completion by Riemannian Optimization

The tensor train (TT) format enjoys appealing advantages in handling structural high-order tensors. The recent decade has witnessed the wide applications of TT-format tensors from diverse disciplines, among which tensor completion has drawn considerable attention. Numerous fast algorithms, including the Riemannian gradient descent (RGrad), have been proposed for the TT-format tensor completion. However, the theoretical guarantees of these algorithms are largely missing or sub-optimal, partly due to the complicated and recursive algebraic operations in TT-format decomposition. Moreover, existing results established for the tensors of other formats, for example, Tucker and CP, are inapplicable because the algorithms treating TT-format tensors are substantially different and more involved. In this paper, we provide, to our best knowledge, the first theoretical guarantees of the convergence of RGrad algorithm for TT-format tensor completion, under a nearly optimal sample size condition. The RGrad algorithm converges linearly with a constant contraction rate that is free of tensor condition number without the necessity of re-conditioning. We also propose a novel approach, referred to as the sequential second-order moment method, to attain a warm initialization under a similar sample size requirement. As a byproduct, our result even significantly refines the prior investigation of RGrad algorithm for matrix completion. Lastly, statistically (near) optimal rate is derived for RGrad algorithm if the observed entries consist of random sub-Gaussian noise. Numerical experiments confirm our theoretical discovery and showcase the computational speedup gained by the TT-format decomposition.

preprint2022arXiv

Two Stage Continuous Domain Regularization for Piecewise Constant Image Restoration

The finite-rate-of-innovation (FRI) framework which corresponds a signal/image to a structured low-rank matrix is emerging as an alternative to the traditional sparse regularization. This is because such an off-the-grid approach is able to alleviate the basis mismatch between the true support in the continuous domain and the discrete grid. In this paper, we propose a two-stage off-the-grid regularization model for the image restoration. Given that the discontinuities/edges of the image lie in the zero level set of a band-limited periodic function, we can derive that the Fourier samples of the gradient of the image satisfy an annihilation relation, resulting in a low-rank two-fold Hankel matrix. In addition, since the singular value decomposition of a low-rank Hankel matrix corresponds to an adaptive tight frame system which can represent the image with sparse canonical coefficients, our approach consists of the following two stages. The first stage learns the tight wavelet frame system from a given measurement, and the second stage restores the image via the analysis approach based sparse regularization. The numerical results are presented to demonstrate that the proposed approach is compared favorably against several popular discrete regularization approaches and structured low-rank matrix approaches.

preprint2021arXiv

Accelerated Structured Alternating Projections for Robust Spectrally Sparse Signal Recovery

Consider a spectrally sparse signal $\boldsymbol{x}$ that consists of $r$ complex sinusoids with or without damping. We study the robust recovery problem for the spectrally sparse signal under the fully observed setting, which is about recovering $\boldsymbol{x}$ and a sparse corruption vector $\boldsymbol{s}$ from their sum $\boldsymbol{z}=\boldsymbol{x}+\boldsymbol{s}$. In this paper, we exploit the low-rank property of the Hankel matrix formed by $\boldsymbol{x}$, and formulate the problem as the robust recovery of a corrupted low-rank Hankel matrix. We develop a highly efficient non-convex algorithm, coined Accelerated Structured Alternating Projections (ASAP). The high computational efficiency and low space complexity of ASAP are achieved by fast computations involving structured matrices, and a subspace projection method for accelerated low-rank approximation. Theoretical recovery guarantee with a linear convergence rate has been established for ASAP, under some mild assumptions on $\boldsymbol{x}$ and $\boldsymbol{s}$. Empirical performance comparisons on both synthetic and real-world data confirm the advantages of ASAP, in terms of computational efficiency and robustness aspects.

preprint2021arXiv

Provable Near-Optimal Low-Multilinear-Rank Tensor Recovery

We consider the problem of recovering a low-multilinear-rank tensor from a small amount of linear measurements. We show that the Riemannian gradient algorithm initialized by one step of iterative hard thresholding can reconstruct an order-$d$ tensor of size $n\times\ldots\times n$ and multilinear rank $(r,\ldots,r)$ with high probability from only $O(nr^2 + r^{d+1})$ measurements, assuming $d$ is a constant. This sampling complexity is optimal in $n$, compared to existing results whose sampling complexities are all unnecessarily large in $n$. The analysis relies on the tensor restricted isometry property (TRIP) and the geometry of the manifold of all tensors with a fixed multilinear rank. High computational efficiency of our algorithm is also achieved by doing higher order singular value decomposition on intermediate small tensors of size only $2r\times \ldots\times 2r$ rather than on tensors of size $n\times \ldots\times n$ as usual.

preprint2020arXiv

Data Driven Tight Frame for Compressed Sensing MRI Reconstruction via Off-the-Grid Regularization

Recently, the finite-rate-of-innovation (FRI) based continuous domain regularization is emerging as an alternative to the conventional on-the-grid sparse regularization for the compressed sensing (CS) due to its ability to alleviate the basis mismatch between the true support of the shape in the continuous domain and the discrete grid. In this paper, we propose a new off-the-grid regularization for the CS-MRI reconstruction. Following the recent works on two dimensional FRI, we assume that the discontinuities/edges of the image are localized in the zero level set of a band-limited periodic function. This assumption induces the linear dependencies among the Fourier samples of the gradient of the image, which leads to a low rank two-fold Hankel matrix. We further observe that the singular value decomposition of a low rank Hankel matrix corresponds to an adaptive tight frame system which can represent the image with sparse canonical coefficients. Based on this observation, we propose a data driven tight frame based off-the-grid regularization model for the CS-MRI reconstruction. To solve the nonconvex and nonsmooth model, a proximal alternating minimization algorithm with a guaranteed global convergence is adopted. Finally, the numerical experiments show that our proposed data driven tight frame based approach outperforms the existing approaches.

preprint2020arXiv

Fast Cadzow's Algorithm and a Gradient Variant

The Cadzow's algorithm is a signal denoising and recovery method which was designed for signals corresponding to low rank Hankel matrices. In this paper we first introduce a Fast Cadzow's algorithm which is developed by incorporating a novel subspace projection to reduce the high computational cost of the SVD in the Cadzow's algorithm. Then a Gradient method and a Fast Gradient method are proposed to address the non-decreasing MSE issue when applying the Cadzow's or Fast Cadzow's algorithm for signal denoising. Extensive empirical performance comparisons demonstrate that the proposed algorithms can complete the denoising and recovery tasks more efficiently and effectively.

preprint2017arXiv

Hankel Matrix Nuclear Norm Regularized Tensor Completion for $N$-dimensional Exponential Signals

Signals are generally modeled as a superposition of exponential functions in spectroscopy of chemistry, biology and medical imaging. For fast data acquisition or other inevitable reasons, however, only a small amount of samples may be acquired and thus how to recover the full signal becomes an active research topic. But existing approaches can not efficiently recover $N$-dimensional exponential signals with $N\geq 3$. In this paper, we study the problem of recovering N-dimensional (particularly $N\geq 3$) exponential signals from partial observations, and formulate this problem as a low-rank tensor completion problem with exponential factor vectors. The full signal is reconstructed by simultaneously exploiting the CANDECOMP/PARAFAC structure and the exponential structure of the associated factor vectors. The latter is promoted by minimizing an objective function involving the nuclear norm of Hankel matrices. Experimental results on simulated and real magnetic resonance spectroscopy data show that the proposed approach can successfully recover full signals from very limited samples and is robust to the estimated tensor rank.