Source author record

Shenglong Zhou

Shenglong Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning math.OC Computer Vision Information Theory math.IT eess.IV math.ST Statistics Theory

Catalog footprint

What is connected

9 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

Preconditioned Inexact Stochastic ADMM for Deep Model

The recent advancement of foundation models (FMs) has brought about a paradigm shift, revolutionizing various sectors worldwide. The popular optimizers used to train these models are stochastic gradient descent-based algorithms, which face inherent limitations, such as slow convergence and stringent assumptions for convergence. In particular, data heterogeneity arising from distributed settings poses significant challenges to their theoretical and numerical performance. This paper develops an algorithm, PISA (Preconditioned Inexact Stochastic Alternating Direction Method of Multipliers). Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz continuity of the gradient on a bounded region, thereby removing the need for other conditions commonly imposed by stochastic methods. This capability enables the proposed algorithm to tackle the challenge of data heterogeneity effectively. Moreover, the algorithmic architecture enables scalable parallel computing and supports various preconditions, such as second-order information, second moment, and orthogonalized momentum by Newton-Schulz iterations. Incorporating the latter two preconditions in PISA yields two computationally efficient variants: SISA and NSISA. Comprehensive experimental evaluations for training or fine-tuning diverse deep models, including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate superior numerical performance of SISA and NSISA compared to various state-of-the-art optimizers.

preprint · 2022 · arXiv

Communication-Efficient ADMM-based Federated Learning

Federated learning has advanced considerably over the last few years but still faces many challenges, such as how algorithms can save communication resources, how they can reduce computational costs, and whether they converge. To address these issues, this paper proposes exact and inexact ADMM-based federated learning algorithms. They are not only communication-efficient but also converge linearly under very mild conditions: they require no convexity and do not depend on the data distributions. Moreover, the inexact version has low computational complexity, which significantly alleviates the computational burden.
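To make the client/server structure of ADMM-based federated learning concrete, here is a minimal sketch of textbook scaled-form consensus ADMM on a toy problem. It is not the exact or inexact algorithm proposed in the paper. The function name `consensus_admm`, the quadratic client losses 0.5 * (x - a_i)^2, and the parameter values are all assumptions chosen for illustration.

```python
def consensus_admm(targets, rho=1.0, iterations=60):
    """Textbook consensus ADMM for min_x sum_i 0.5 * (x - a_i)^2 (toy example)."""
    n = len(targets)
    z = 0.0            # global variable held by the server
    x = [0.0] * n      # local variables, one per client
    u = [0.0] * n      # scaled dual variables, one per client
    for _ in range(iterations):
        # Local step: each client minimizes its loss plus a proximal term (closed form here).
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Aggregation step: the server averages local variables plus scaled duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: each client accumulates its consensus residual.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z


client_targets = [1.0, 2.0, 6.0]   # hypothetical per-client data
estimate = consensus_admm(client_targets)
print(estimate)
```

For this toy objective the minimizer is the mean of the client targets, so the returned value should approach 3.0. Only the averaged quantity crosses the network in each round, which is the part that communication-efficient variants aim to economize.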

preprint · 2022 · arXiv

Retinal Vessel Segmentation with Pixel-wise Adaptive Filters

Accurate retinal vessel segmentation is challenging because of the complex texture of retinal vessels and low imaging contrast. Previous methods generally refine segmentation results by cascading multiple deep networks, which is time-consuming and inefficient. In this paper, we propose two novel methods to address these challenges. First, we devise a lightweight module, named multi-scale residual similarity gathering (MRSG), to generate pixel-wise adaptive filters (PA-Filters). Unlike cascading multiple deep networks, a single PA-Filter layer is enough to improve the segmentation results. Second, we introduce a response cue erasing (RCE) strategy to enhance the segmentation accuracy. Experimental results on the DRIVE, CHASE_DB1, and STARE datasets demonstrate that our proposed method outperforms state-of-the-art methods while maintaining a compact structure. Code is available at https://github.com/Limingxing00/Retinal-Vessel-Segmentation-ISBI20222.

preprint · 2022 · arXiv

Test-time Batch Normalization

Deep neural networks often suffer from data distribution shift between training and testing, and the batch statistics are observed to reflect the shift. In this paper, aiming to alleviate distribution shift at test time, we revisit batch normalization (BN) in the training process and reveal two key insights benefiting test-time optimization: (i) preserving the same gradient backpropagation form as training, and (ii) using dataset-level statistics for robust optimization and inference. Based on these two insights, we propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing the entropy loss. We verify the effectiveness of our method in two typical settings with distribution shift, i.e., domain generalization and robustness tasks. GpreBN significantly improves test-time performance and achieves state-of-the-art results.
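The entropy loss mentioned in the abstract can be illustrated with a short sketch. The code below computes the mean Shannon entropy of softmax predictions over a batch of logits, which is the usual form of an entropy objective in test-time adaptation. It does not implement the GpreBN layer itself, and the helper names `softmax` and `mean_entropy` are assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_entropy(batch_logits):
    """Average Shannon entropy (in nats) of the softmax predictions in a batch."""
    total = 0.0
    for logits in batch_logits:
        probs = softmax(logits)
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(batch_logits)


uniform = mean_entropy([[0.0, 0.0, 0.0, 0.0]])      # maximally uncertain prediction
confident = mean_entropy([[20.0, 0.0, 0.0, 0.0]])   # nearly one-hot prediction
print(uniform, confident)
```

A uniform prediction over four classes gives the maximum entropy log(4), while a confident prediction gives a value close to zero, so minimizing this quantity pushes the model toward confident predictions on the test batch.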

preprint · 2021 · arXiv

Computing One-bit Compressive Sensing via Double-Sparsity Constrained Optimization

One-bit compressive sensing has gained popularity in signal processing and communications due to its low storage costs and low hardware complexity. However, recovering the signal from only the one-bit (sign) information remains challenging. In this paper, we formulate one-bit compressive sensing as a double-sparsity constrained optimization problem. The first-order optimality conditions for this nonconvex and discontinuous problem are established via the newly introduced $τ$-stationarity, based on which a gradient projection subspace pursuit (GPSP) algorithm is developed. It is proven that GPSP converges globally and terminates within finitely many steps. Numerical experiments demonstrate its excellent performance in terms of high accuracy and fast computational speed.

preprint · 2014 · arXiv

Exact Recovery for Sparse Signal via Weighted $l_1$ Minimization

Numerical experiments in the literature on compressed sensing have indicated that reweighted $l_1$ minimization performs exceptionally well in recovering sparse signals. In this paper, we develop exact recovery conditions and an algorithm for sparse signals via weighted $l_1$ minimization, building on the classical NSP (null space property) and RIC (restricted isometry constant) bounds. We first introduce the concept of WNSP (weighted null space property) and show that it is a necessary and sufficient condition for exact recovery. We then prove that the RIC bound for weighted $l_1$ minimization is $δ_{ak}<\sqrt{\frac{a-1}{a-1+γ^2}}$, where $a>1$ and $0<γ\leq1$ is determined by an optimization problem over the null space. When $γ<1$, this bound is greater than the bound $\sqrt{\frac{a-1}{a}}$ from $l_1$ minimization. In addition, we establish a bound on $δ_k$ and show that it can be larger than the sharp bound 1/3 via $l_1$ minimization and also greater than 0.4343 via weighted $l_1$ minimization in some mild cases. Finally, we propose a modified iterative reweighted $l_1$ minimization (MIRL1) algorithm based on our weight selection principle, and numerical experiments demonstrate that it performs much better than $l_1$ minimization and the iterative reweighted $l_1$ minimization (IRL1) algorithm.
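The comparison between the two RIC bounds follows from a one-line monotonicity argument, sketched here using only the symbols $a$ and $γ$ from the abstract. The numerical instance ($a=2$, $γ=1/2$) is an arbitrary choice for illustration.

```latex
% For a > 1 and 0 < \gamma < 1 we have \gamma^2 < 1, hence
\[
a-1+\gamma^{2} < a
\;\Longrightarrow\;
\frac{a-1}{a-1+\gamma^{2}} > \frac{a-1}{a}
\;\Longrightarrow\;
\sqrt{\frac{a-1}{a-1+\gamma^{2}}} > \sqrt{\frac{a-1}{a}} .
\]
% For \gamma = 1 the two bounds coincide.
% Example with a = 2 and \gamma = 1/2:
\[
\sqrt{\frac{1}{1+\tfrac14}} = \sqrt{0.8} \approx 0.894
\quad\text{versus}\quad
\sqrt{\frac{1}{2}} \approx 0.707 .
\]
```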

preprint · 2014 · arXiv

Gradient Support Projection Algorithm for Affine Feasibility Problem with Sparsity and Nonnegativity

Let $A$ be a real $M \times N$ measurement matrix and $b\in \mathbb{R}^M$ be an observation vector. The affine feasibility problem with sparsity and nonnegativity ($AFP_{SN}$ for short) is to find a sparse and nonnegative vector $x\in \mathbb{R}^N$ with $Ax=b$ if such an $x$ exists. In this paper, we focus on establishing an optimization approach to solving the $AFP_{SN}$. By discussing the tangent cone and normal cone of the sparsity constraint, we give first-order necessary optimality conditions ($α$-Stability, T-Stability and N-Stability) and second-order necessary and sufficient optimality conditions for the related minimization problems with the $AFP_{SN}$. By adopting an Armijo-type stepsize rule, we present a framework of gradient support projection algorithms for the $AFP_{SN}$ and prove its full convergence when the matrix $A$ is $s$-regular. Numerical experiments show the excellent performance of the new algorithm for the $AFP_{SN}$ with and without noise.
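Gradient projection methods for this kind of problem rely on projecting a point onto the set of nonnegative vectors with at most $s$ nonzero entries. A minimal sketch of that projection follows: keep the $s$ largest strictly positive entries and set everything else to zero. The function name `project_sparse_nonnegative` is hypothetical, and the sketch does not reproduce the paper's full algorithm or its Armijo-type stepsize rule.

```python
def project_sparse_nonnegative(x, s):
    """Project x onto {v : v >= 0, at most s nonzeros} by keeping the s largest positive entries."""
    # Indices of strictly positive entries, largest value first.
    positive = sorted(
        (i for i, value in enumerate(x) if value > 0),
        key=lambda i: x[i],
        reverse=True,
    )
    keep = set(positive[:s])
    # Negative entries and small positive entries are zeroed out.
    return [value if i in keep else 0.0 for i, value in enumerate(x)]


point = [3.0, -1.0, 2.0, 0.5]          # hypothetical iterate after a gradient step
projected = project_sparse_nonnegative(point, 2)
print(projected)
```

When several entries tie for the $s$-th largest value, the projection is not unique; this sketch simply keeps the tied entries that come first in index order.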

preprint · 2014 · arXiv

Sparse and Low-Rank Covariance Matrices Estimation

This paper aims to obtain a simultaneously sparse and low-rank estimator of semidefinite population covariance matrices. We first formulate a convex optimization problem that uses an $l_1$-norm penalty to encourage sparsity and a nuclear norm penalty to favor the low-rank property. For the proposed estimator, we then prove that, with high probability, the estimation error in the Frobenius norm can be of order $O(\sqrt{s(\log{r})/n})$ under mild conditions, where $s$ and $r$ denote the number of nonzero entries and the rank of the population covariance matrix, respectively, and $n$ denotes the sample size. Finally, an efficient alternating direction method of multipliers with global convergence is proposed to solve this problem, and the merits of the approach are illustrated by numerical simulations.
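When an $l_1$-norm penalty is handled inside an ADMM scheme, the corresponding subproblem typically reduces to entrywise soft-thresholding. A minimal sketch of that operator is given below. Whether the paper's algorithm uses exactly this step is an assumption, and the names `soft_threshold` and `soft_threshold_matrix` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value|: shrink toward zero by threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def soft_threshold_matrix(matrix, threshold):
    """Apply soft-thresholding to every entry of a matrix given as nested lists."""
    return [[soft_threshold(v, threshold) for v in row] for row in matrix]


sample_covariance = [[1.0, -0.2], [-0.2, 0.5]]   # hypothetical 2 x 2 input
shrunk = soft_threshold_matrix(sample_covariance, 0.3)
print(shrunk)
```

Entries whose magnitude is below the threshold become exactly zero, which is how the $l_1$ penalty produces sparsity. The nuclear norm penalty is handled analogously by thresholding singular values, which is omitted here.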

preprint · 2013 · arXiv

New RIC Bounds via l_q-minimization with 0<q<=1 in Compressed Sensing

The restricted isometry constants (RICs) play an important role in the exact recovery theory of sparse signals via $l_q$ ($0<q\leq1$) relaxations in compressed sensing. Recently, Cai and Zhang [6] obtained a sharp bound $δ_{tk}<\sqrt{1-1/t}$ for $t\geq4/3$ that guarantees the exact recovery of $k$-sparse signals through $l_1$ minimization. This paper aims to establish new RIC bounds via $l_q$ ($0<q\leq1$) relaxation. Based on a key inequality on the $l_q$ norm, we show that (i) exact recovery succeeds via $l_{1/2}$ and $l_1$ minimization if $δ_{tk}<\sqrt{1-1/t}$ for any $t>1$; (ii) several sufficient conditions can be derived, such as $δ_{2k}<0.5547$ when $k\geq2$ for any $0<q<1/2$, and $δ_{2k}<0.6782$ when $k\geq1$ for any $1/2<q<1$; (iii) a bound on $δ_k$ is given as well for any $0<q\leq1$; in particular, for $q=1/2$ and $q=1$, we obtain $δ_k<1/3$ when $k$ ($\geq2$) is even or $δ_k<0.3203$ when $k$ ($\geq3$) is odd.
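To get a feel for the bound $\sqrt{1-1/t}$ quoted in the abstract, the short sketch below evaluates it at a few values of $t$. The function name `ric_bound` and the sample values of $t$ are assumptions for illustration.

```python
import math


def ric_bound(t):
    """Evaluate sqrt(1 - 1/t), the threshold on delta_{tk} quoted in the abstract, for t > 1."""
    if t <= 1:
        raise ValueError("the bound is stated for t > 1")
    return math.sqrt(1.0 - 1.0 / t)


# Tabulate the threshold for a few hypothetical choices of t.
for t in (4 / 3, 2.0, 3.0):
    print(f"t = {t:.4f}: delta_tk < {ric_bound(t):.4f}")
```

The threshold equals 1/2 at $t=4/3$ and increases toward 1 as $t$ grows, so the condition is stated for a weaker requirement on a higher-order constant $δ_{tk}$.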
