Source author record

Jiahuan Wang

Jiahuan Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Information Theory; math.IT; Artificial Intelligence; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

6 works

5 topics

4 close collaborators

Preprint · 2026 · arXiv

Local Gradient Regulation Stabilizes Federated Learning under Client Heterogeneity

Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data, yet its stability is fundamentally challenged by statistical heterogeneity in realistic deployments. Here, we show that client heterogeneity destabilizes FL primarily by distorting local gradient dynamics during client-side optimization, causing systematic drift that accumulates across communication rounds and impedes global convergence. This observation highlights local gradients as a key regulatory lever for stabilizing heterogeneous FL systems. Building on this insight, we develop a general client-side perspective that regulates local gradient contributions without incurring additional communication overhead. Inspired by swarm intelligence, we instantiate this perspective through Exploratory-Convergent Gradient Re-aggregation (ECGR), which balances well-aligned and misaligned gradient components to preserve informative updates while suppressing destabilizing effects. Theoretical analysis and extensive experiments, including evaluations on the LC25000 medical imaging dataset, demonstrate that regulating local gradient dynamics consistently stabilizes federated learning across state-of-the-art methods under heterogeneous data distributions.

Preprint · 2026 · arXiv

Revealing Modular Gradient Noise Imbalance in LLMs: Calibrating Adam via Signal-to-Noise Ratio

The impressive performance of large language models (LLMs) arises from their massive scale and heterogeneous module composition. However, this structural heterogeneity introduces additional optimization challenges. While adaptive optimizers such as Adam(W) provide per-parameter adaptivity, they do not explicitly account for module-level gradient heterogeneity, resulting in slower convergence, suboptimal performance, or training instability. Existing approaches typically rely on manually tuned module-specific learning rates or specific optimization strategies, which are computationally costly and difficult to generalize across tasks or models. To establish a more principled approach, we first analyze the noise-damping behavior of Adam in high-noise modules and introduce Module-wise Learning Rate Scaling via SNR (MoLS). MoLS estimates module-level signal-to-noise ratios (SNRs) to scale Adam updates, allowing automated module-wise learning rate allocation without manual tuning. Empirical results across multiple LLM training benchmarks demonstrate that MoLS improves convergence speed and generalization, achieving performance comparable to carefully tuned module-specific learning rates, while remaining compatible with memory-efficient training algorithms.

Preprint · 2026 · arXiv

Stability and Generalization for Decentralized Markov SGD

Stochastic gradient methods are central to large-scale learning, yet their generalization theory typically relies on independent sampling assumptions. In many practical applications, data are generated by Markov chains and learning is performed in a decentralized manner, which introduces significant analytical challenges. In this work, we investigate the stability and generalization of decentralized stochastic gradient descent (SGD) and stochastic gradient descent ascent (SGDA) under Markov chain sampling. Leveraging a stability-based framework, we characterize how Markovian dependence and decentralized communication jointly influence generalization behavior. Our analysis captures the effects of network topology, Markov chain mixing properties, and primal-dual dynamics. We establish non-asymptotic generalization bounds for both algorithms, extending existing results on Markov stochastic gradient methods to decentralized and minimax settings.
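As a rough illustration of what "Markov chain sampling" means in the abstract above, the following minimal sketch runs single-node SGD where consecutive samples come from a random walk over the dataset rather than from independent draws. It is not the paper's algorithm: the function name `markov_sgd_mean`, the lazy ring walk, the squared-loss objective, and the 1/t step size are all assumptions chosen for illustration.

```python
import random

def markov_sgd_mean(data, steps=20000, seed=0):
    """SGD on 0.5 * (x - z)^2 where samples z follow a Markov chain (hypothetical sketch)."""
    rng = random.Random(seed)
    n = len(data)
    idx = 0   # current state of the chain (an index into data)
    x = 0.0   # model parameter
    for t in range(1, steps + 1):
        # Lazy random walk on a ring: stay, step left, or step right with equal probability.
        idx = (idx + rng.choice((-1, 0, 1))) % n
        # Gradient of the squared loss at the dependent sample data[idx].
        grad = x - data[idx]
        # Step size 1/t turns the iterate into a running average of visited samples.
        x -= grad / t
    return x

data = [0.0, 1.0, 2.0, 3.0, 4.0]
estimate = markov_sgd_mean(data)
print(estimate)
```

Because this walk has a uniform stationary distribution, the iterate drifts toward the dataset mean, but more slowly than with independent sampling; how quickly depends on how fast the chain mixes.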

Preprint · 2026 · arXiv

Unveiling High-Probability Generalization in Decentralized SGD

Decentralized stochastic gradient descent (D-SGD) is an efficient method for large-scale distributed learning. Existing generalization studies mainly address expected results, achieving rates limited to $\mathcal{O}\left(\frac{1}{\delta\sqrt{mn}}\right)$, where $\delta$ is the confidence parameter, $m$ the number of workers, and $n$ the sample size. When $m=1$, D-SGD reduces to traditional SGD, whose optimal high-probability generalization bound is $\mathcal{O}\left(\frac{1}{\sqrt{n}}\log (1/\delta)\right)$. This discrepancy reveals a gap between high-probability guarantees for SGD and those for D-SGD. To close this gap, we develop a high-probability learning theory for D-SGD, aiming for the optimal $\mathcal{O}\left(\frac{1}{\sqrt{mn}}\log (1/\delta)\right)$ rate. We refine bounds for D-SGD using pointwise uniform stability in distributed learning (a weaker notion than uniform stability) and analyze them across convex, strongly convex, and non-convex settings. We also provide high-probability results for gradient-based measures in non-convex cases where only local minima exist, and derive optimization error and excess risk bounds. Finally, accounting for communication overhead, we analyze generalization bounds for local models within time-varying frameworks.
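For readers unfamiliar with D-SGD, the sketch below shows one common form of the update: each worker averages its iterate with its neighbours (a gossip step) and then applies a local stochastic gradient step. This is a toy under stated assumptions, not the setting analyzed in the paper: the ring topology with equal 1/3 weights, the quadratic local losses, the Gaussian gradient noise, and the name `dsgd_ring` are all illustrative choices.

```python
import random

def dsgd_ring(targets, steps=300, lr=0.1, noise_std=0.01, seed=0):
    """Toy D-SGD on a ring; worker i minimizes 0.5 * (x - targets[i])^2 (hypothetical sketch)."""
    rng = random.Random(seed)
    m = len(targets)  # number of workers; the ring weights below assume m >= 3
    x = [0.0] * m     # one scalar model per worker
    for _ in range(steps):
        # Gossip step: average own iterate with both ring neighbours (weights 1/3 each).
        mixed = [(x[(i - 1) % m] + x[i] + x[(i + 1) % m]) / 3.0 for i in range(m)]
        # Noisy local gradient evaluated at each worker's current iterate.
        grads = [(x[i] - targets[i]) + rng.gauss(0.0, noise_std) for i in range(m)]
        x = [mixed[i] - lr * grads[i] for i in range(m)]
    return x

local_models = dsgd_ring([1.0, 2.0, 3.0, 4.0])
network_average = sum(local_models) / len(local_models)
print(local_models, network_average)
```

In this toy the network average settles near the minimizer of the summed losses (2.5), while individual local models retain a small step-size-dependent disagreement, which is one reason bounds for local models are analyzed separately.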

Preprint · 2021 · arXiv

Complementary Waveforms for Range-Doppler Sidelobe Suppression Based on a Null Space Approach

While Doppler resilient complementary waveforms have previously been considered to suppress range sidelobes within a Doppler interval of interest in radar systems, their capability of Doppler resilience has not been fully utilized. In this paper, a new construction of Doppler resilient complementary waveforms based on a null space is proposed. With this new construction, one can flexibly include a specified Doppler interval of interest or even an overall Doppler interval into a term which results in range sidelobes. Forcing this term to zero gives a condition that can be solved to obtain a null space. From the null space, the characteristic vector that controls the transmission of basic Golay waveforms and the coefficients of the receiver filter for the Golay complementary waveforms can be extracted. In addition, based on the derived null space, two challenging non-convex optimization problems are formulated and solved for maximizing the signal-to-noise ratio (SNR). Moreover, the coefficients of the receiver filter and the characteristic vector can be applied to fully polarimetric radar systems to achieve nearly perfect Doppler resilient performance, and hence fully suppress the inter-antenna interference.
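As background for the Golay waveforms mentioned above, the following minimal sketch checks the defining property of a Golay complementary pair: at zero Doppler, the aperiodic auto-correlations of the two sequences sum to zero at every nonzero lag, so range sidelobes cancel. It uses the standard concatenation construction and does not implement the null space method of the paper; the function names are illustrative.

```python
def golay_pair(n_doublings):
    """Build a binary Golay complementary pair of length 2**n_doublings."""
    a, b = [1], [1]
    for _ in range(n_doublings):
        # Standard concatenation rule: (a, b) -> (a|b, a|-b).
        a, b = a + b, a + [-v for v in b]
    return a, b

def aperiodic_autocorr(seq, lag):
    """Aperiodic auto-correlation of a real sequence at a non-negative lag."""
    return sum(seq[i] * seq[i + lag] for i in range(len(seq) - lag))

def autocorr_sum(a, b):
    """Sum of the two auto-correlations at lags 0 .. len(a) - 1."""
    return [aperiodic_autocorr(a, k) + aperiodic_autocorr(b, k) for k in range(len(a))]

a, b = golay_pair(3)
sums = autocorr_sum(a, b)
print(sums)  # [16, 0, 0, 0, 0, 0, 0, 0]
```

A Doppler shift between the two transmissions breaks this exact cancellation, which is the problem Doppler resilient designs address.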

Preprint · 2020 · arXiv

Quasi-Orthogonal Z-Complementary Pairs and Their Applications in Fully Polarimetric Radar Systems

One objective of this paper is to propose a novel class of sequence pairs, called "Quasi-orthogonal Z-complementary pairs (QOZCPs)", each exhibiting the Z-complementary property for its aperiodic auto-correlation sums and also having a zero correlation zone when its aperiodic cross-correlation is considered. Construction of QOZCPs based on Successively Distributed Algorithms under Majorization Minimization (SDAMM) is presented. Another objective of this paper is to apply the proposed QOZCPs in fully polarimetric radar systems and analyse the corresponding ambiguity functions. It turns out that QOZCP waveforms are much more Doppler resilient than the known Golay complementary waveforms.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint