Researcher profile

Yang Lu

Yang Lu contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
7 works
0 followers
11 topics
4 close collaborators

Published work

7 published items

preprint | 2026 | arXiv

Decision Boundary-aware Generation for Long-tailed Learning

Long-tailed data bias decision boundaries toward head classes and degrade tail class accuracy. Diffusion-based generative augmentation addresses this problem by generating additional data, while head-to-tail transfer further mitigates the generator bias inherited from the long-tailed dataset. However, we show that while head-to-tail transfer helps balance the decision space of the classifier, it also induces latent non-local feature mixing that entangles inter-class features, causing decision boundary overlap and tail class distribution shift. To address this, we first identify the problem of boundary ambiguity and then propose the Decision Boundary-aware Generation (DBG) framework, which promotes near-boundary representation learning by generating informative near-boundary samples. Overall, DBG rebalances the long-tailed dataset while yielding a more separable decision space for long-tailed learning. Across standard long-tailed benchmarks, DBG consistently improves tail class and overall accuracy with less inter-class overlap. The code of DBG is available at https://github.com/keepdigitalabc-svg/DBG.

preprint | 2026 | arXiv

Estimation of the intercept parameter in integrated Galton-Watson processes

We study estimation of the intercept parameter in an integrated Galton-Watson process, a basic building block for many count-valued time series models. In this unit root setting, the ordinary least squares estimator is inconsistent, whereas an existing weighted least squares (WLS) estimator is consistent only in the case where the process is transient, a condition that depends on the unknown intercept parameter. We propose an alternative WLS estimator based on the new weight function of $1/t$, and show that it is consistent regardless of whether the process is transient or null recurrent, with a convergence rate of $\sqrt{\ln n}$.

preprint | 2026 | arXiv

Fine-Tuning Impairs the Balancedness of Foundation Models in Long-tailed Personalized Federated Learning

Personalized federated learning (PFL) with foundation models has emerged as a promising paradigm enabling clients to adapt to heterogeneous data distributions. However, real-world scenarios often face the co-occurrence of non-IID data and long-tailed class distributions, presenting unique challenges that remain underexplored in PFL. In this paper, we investigate this long-tailed personalized federated learning setting and observe that current methods suffer from two limitations: (i) fine-tuning degrades performance below zero-shot baselines due to the erosion of inherent class balance in foundation models; (ii) conventional personalization techniques further transfer this bias to local models through parameter or feature-level fusion. To address these challenges, we propose Federated Learning via Gradient Purification and Residual Learning (FedPuReL), which preserves balanced knowledge in the global model while enabling unbiased personalization. Specifically, we purify local gradients using zero-shot predictions to maintain a class-balanced global model, and model personalization as residual correction atop the frozen global model. Extensive experiments demonstrate that FedPuReL consistently outperforms state-of-the-art methods, achieving superior performance on both global and personalized models across diverse long-tailed scenarios. The code is available at https://github.com/shihaohou/FedPuReL.

preprint | 2026 | arXiv

One-Shot Hierarchical Federated Clustering

Driven by the growth of Web-scale decentralized services, Federated Clustering (FC) aims to extract knowledge from heterogeneous clients in an unsupervised manner while preserving the clients' privacy, which has emerged as a significant challenge due to the lack of label guidance and the Non-Independent and Identically Distributed (non-IID) nature of clients. In real scenarios such as personalized recommendation and cross-device user profiling, the global cluster may be fragmented and distributed among different clients, and the clusters may exist at different granularities or even be nested. Although Hierarchical Clustering (HC) is considered promising for exploring such distributions, the sophisticated recursive clustering process makes it more computationally expensive and vulnerable to privacy exposure, and it thus remains relatively unexplored under the federated learning scenario. This paper introduces an efficient one-shot hierarchical FC framework that performs client-end distribution exploration and server-end distribution aggregation through one-way prototype-level communication from clients to the server. A fine partition mechanism is developed to generate successive clusterlets to describe the complex landscape of the clients' clusters. Then, a multi-granular learning mechanism on the server is proposed to fuse the clusterlets, even when they have inconsistent granularities generated from different clients. It turns out that the complex cluster distributions across clients can be efficiently explored, and extensive experiments comparing against state-of-the-art methods on ten public datasets demonstrate the superiority of the proposed method.

preprint | 2026 | arXiv

Reasoning emerges from constrained inference manifolds in large language models

Reasoning in large language models is predominantly evaluated through labeled benchmarks, conflating task performance with the quality of internal inference. Here we study reasoning as an intrinsic dynamical process by examining the evolution of internal representations during inference. We find that inference-time dynamics consistently self-organize into low-dimensional manifolds embedded within high-dimensional representation spaces. We find that such geometric compression, although pervasive, is not sufficient for stable or reliable reasoning. Instead, effective reasoning dynamics emerge within a constrained structural regime characterized by three conditions: adequate representational expressivity, spontaneous manifold compression, and preservation of non-degenerate information volume within the compressed subspace. Models outside this regime exhibit characteristic pathological inference dynamics. Based on these insights, we introduce a unified, label-free diagnostic computed solely from internal dynamics. These findings suggest that reasoning in LLMs is fundamentally governed by geometric and informational constraints, offering a complementary framework to benchmark-centric assessment.

preprint | 2026 | arXiv

Using Directed Acyclic Graphs to Illustrate Common Biases in Diagnostic Test Accuracy Studies

Background: Diagnostic test accuracy (DTA) studies, like etiological studies, are susceptible to various biases including reference standard error bias, partial verification bias, spectrum effect, confounding, and bias from misassumption of conditional independence. While directed acyclic graphs (DAGs) are widely used in etiological research to identify and illustrate bias structures, they have not been systematically applied to DTA studies. Methods: We developed DAGs to illustrate the causal structures underlying common biases in DTA studies. For each bias, we present the corresponding DAG structure and demonstrate the parallel with equivalent biases in etiological studies. We use real-world examples to illustrate each bias mechanism. Results: We demonstrate that five major biases in DTA studies can be represented using DAGs with clear structural parallels to etiological studies: reference standard error bias corresponds to exposure misclassification, misassumption of conditional independence creates spurious correlations similar to unmeasured confounding, spectrum effect parallels effect modification, confounding operates through backdoor paths in both settings, and partial verification bias mirrors selection bias. These DAG representations reveal the causal mechanisms underlying each bias and suggest appropriate correction strategies. Conclusions: DAGs provide a valuable framework for understanding bias structures in DTA studies and should complement existing quality assessment tools like STARD and QUADAS-2. We recommend incorporating DAGs during study design to prospectively identify potential biases and during reporting to enhance transparency. DAG construction requires interdisciplinary collaboration and sensitivity analyses under alternative causal structures.

preprint | 2026 | arXiv

zkRansomware: Proof-of-Data Recoverability and Multi-round Game Theoretic Modeling of Ransomware Decisions

Ransomware is still one of the most serious cybersecurity threats. Victims often pay but fail to regain access to their data, while also facing the danger of losing data privacy. These uncertainties heavily shape the attacker-victim dynamics in decision-making. In this paper, we introduce and analyze zkRansomware. This new ransomware model integrates zero-knowledge proofs to enable verifiable data recovery and uses smart contracts to enforce multi-round payments while mitigating the risk of data disclosure and privacy loss. We show that zkRansomware is technically feasible using existing cryptographic and blockchain tools and, perhaps counterintuitively, can align incentives between the attacker and the victim. Finally, we develop a theoretical decision-making framework for zkRansomware that distinguishes it from known ransomware decision models and discusses its implications for ransomware risk analysis and response decision support.