Source author record

Yanjie Dong

Yanjie Dong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Information Theory, math.IT, Computer Science and Game Theory, Networking and Internet Architecture

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint · 2025 · arXiv

FedSM: Robust Semantics-Guided Feature Mixup for Bias Reduction in Federated Learning with Long-Tail Data

Federated Learning (FL) enables collaborative model training across decentralized clients without sharing private data. However, FL suffers from biased global models due to non-IID and long-tail data distributions. We propose FedSM, a novel client-centric framework that mitigates this bias through semantics-guided feature mixup and lightweight classifier retraining. FedSM uses a pretrained image-text-aligned model to compute category-level semantic relevance, guiding the category selection of local features to mix-up with global prototypes to generate class-consistent pseudo-features. These features correct classifier bias, especially when data are heavily skewed. To address the concern of potential domain shift between the pretrained model and the data, we propose probabilistic category selection, enhancing feature diversity to effectively mitigate biases. All computations are performed locally, requiring minimal server overhead. Extensive experiments on long-tail datasets with various imbalanced levels demonstrate that FedSM consistently outperforms state-of-the-art methods in accuracy, with high robustness to domain shift and computational efficiency.
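As a rough illustration of the mixup step described in this abstract, the Python sketch below samples a source category in proportion to a semantic relevance score and blends one of its local features with the target category's global prototype. It is not the authors' implementation: the function names, the softmax sampling rule, the fixed mixing weight `lam`, and the dictionary layout are all assumptions made for this example.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw relevance scores into sampling probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_with_prototype(local_feature, prototype, lam):
    """Convex combination: lam * prototype + (1 - lam) * local feature."""
    return [lam * p + (1.0 - lam) * f for f, p in zip(local_feature, prototype)]


def make_pseudo_feature(target_class, local_features, prototypes, relevance,
                        lam=0.7, rng=random):
    """Build one pseudo-feature labelled with target_class (hypothetical helper).

    local_features: dict mapping class name -> list of feature vectors
    prototypes:     dict mapping class name -> global prototype vector
    relevance:      dict mapping target class -> {source class: score}
    """
    candidates = sorted(local_features)
    probs = softmax([relevance[target_class][c] for c in candidates])
    # Probabilistic (not arg-max) selection keeps some diversity in the sources.
    source = rng.choices(candidates, weights=probs, k=1)[0]
    feature = rng.choice(local_features[source])
    return source, mix_with_prototype(feature, prototypes[target_class], lam)
```

Sampling from the softmax instead of always taking the most relevant category is one simple way to realize a probabilistic selection; the relevance scores themselves would come from a pretrained image-text model, which this sketch does not include.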

preprint · 2024 · arXiv

LMaaS: Exploring Pricing Strategy of Large Model as a Service for Communication

The next generation of communication is envisioned to be intelligent communication that can replace traditional symbolic communication, where highly condensed semantic information considering both source and channel will be extracted and transmitted with high efficiency. Recent popular large models such as GPT-4 and rapidly advancing learning techniques lay a solid foundation for intelligent communication and prompt its practical deployment in the near future. Given the "training once and widely use" characteristic of those multimodal large language models, we argue that a pay-as-you-go service mode will be suitable in this context, referred to as Large Model as a Service (LMaaS). However, the trading and pricing problem is quite complex with heterogeneous and dynamic customer environments, making the pricing optimization problem challenging to solve with readily available methods. In this paper, we aim to fill this gap and formulate the LMaaS market trading as a Stackelberg game with two steps. In the first step, we optimize the seller's pricing decision and propose an Iterative Model Pricing (IMP) algorithm that optimizes the prices of large models iteratively by reasoning about customers' future rental decisions, which is able to achieve a near-optimal pricing solution. In the second step, we optimize customers' selection decisions by designing a robust selecting and renting (RSR) algorithm, which is guaranteed to be optimal with rigorous theoretical proof. Extensive experiments confirm the effectiveness and robustness of our algorithms.
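The leader-follower structure of a Stackelberg game can be shown with a toy example. The Python sketch below is not the IMP or RSR algorithm from the paper: the single-price market, the valuation threshold rule, and all names are assumptions chosen only to show a seller anticipating customers' best responses.

```python
def follower_best_response(price, valuation):
    """A customer rents only if the service is worth at least its price."""
    return valuation >= price


def leader_best_price(valuations, candidate_prices):
    """Seller picks the price that maximizes revenue, anticipating followers.

    Backward induction: for each candidate price, first compute every
    customer's best response, then evaluate the seller's revenue.
    Ties are broken in favor of the earlier candidate price.
    """
    best_price, best_revenue = None, float("-inf")
    for price in candidate_prices:
        renters = sum(follower_best_response(price, v) for v in valuations)
        revenue = price * renters
        if revenue > best_revenue:
            best_price, best_revenue = price, revenue
    return best_price, best_revenue
```

With valuations `[1.0, 2.0, 3.0, 4.0]` and the same values as candidate prices, prices 2.0 and 3.0 both yield revenue 6.0, and the tie-breaking rule returns 2.0.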

preprint · 2020 · arXiv

Communication-Efficient Robust Federated Learning Over Heterogeneous Datasets

This work investigates fault-resilient federated learning when the data samples are non-uniformly distributed across workers, and the number of faulty workers is unknown to the central server. In the presence of adversarially faulty workers who may strategically corrupt datasets, the local messages exchanged (e.g., local gradients and/or local model parameters) can be unreliable, and thus the vanilla stochastic gradient descent (SGD) algorithm is not guaranteed to converge. Recently developed algorithms improve upon vanilla SGD by providing robustness to faulty workers at the price of slowing down convergence. To remedy this limitation, the present work introduces a fault-resilient proximal gradient (FRPG) algorithm that relies on Nesterov's acceleration technique. To reduce the communication overhead of FRPG, a local (L) FRPG algorithm is also developed to allow for intermittent server-workers parameter exchanges. For strongly convex loss functions, FRPG and LFRPG have provably faster convergence rates than a benchmark robust stochastic aggregation algorithm. Moreover, LFRPG converges faster than FRPG while using the same communication rounds. Numerical tests performed on various real datasets confirm the accelerated convergence of FRPG and LFRPG over the robust stochastic aggregation benchmark and competing alternatives.
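To make the combination of robust aggregation and Nesterov acceleration concrete, here is a minimal Python sketch. It is not FRPG or LFRPG: the coordinate-wise median is a stand-in aggregation rule, and the quadratic per-worker loss, the step size, the momentum value, and the way faulty workers misbehave are all assumptions for illustration.

```python
import statistics


def worker_gradient(x, target):
    """Gradient of the local loss 0.5 * ||x - target||^2."""
    return [xi - ti for xi, ti in zip(x, target)]


def coordinate_median(vectors):
    """Aggregate worker messages by taking the median of each coordinate."""
    return [statistics.median(column) for column in zip(*vectors)]


def robust_nesterov(targets, faulty, steps=100, lr=0.5, momentum=0.5):
    """Nesterov-style accelerated descent on median-aggregated gradients.

    targets: one local optimum per worker (stands in for heterogeneous data)
    faulty:  set of worker indices that send corrupted gradients
    """
    x = [0.0] * len(targets[0])
    x_prev = list(x)
    for _ in range(steps):
        # Look-ahead point used by Nesterov's method.
        y = [xi + momentum * (xi - pi) for xi, pi in zip(x, x_prev)]
        messages = []
        for i, target in enumerate(targets):
            g = worker_gradient(y, target)
            if i in faulty:
                g = [1e6 for _ in g]  # assumed attack: a huge bogus gradient
            messages.append(g)
        aggregate = coordinate_median(messages)
        x_prev = x
        x = [yi - lr * gi for yi, gi in zip(y, aggregate)]
    return x
```

With a plain average in place of the median, a single faulty worker sending `1e6` would dominate every update; the median ignores it as long as faulty workers are a minority.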

preprint · 2020 · arXiv

Cross-Layer Scheduling and Beamforming in Smart-Grid Powered Cellular Networks With Heterogeneous Energy Coordination

User scheduling, beamforming and energy coordination are investigated in smart-grid powered cellular networks (SGPCNs), where the base stations are powered by a smart grid and natural renewable energy sources. Heterogeneous energy coordination is considered in SGPCNs, namely energy merchandizing with the smart grid and energy exchanging among the base stations. A long-term grid-energy expenditure minimization problem with proportional-rate constraints is formulated for SGPCNs. Since user scheduling is coupled with the beamforming vectors, the formulated problem is challenging to handle via standard convex optimization methods. In practice, the beamforming vectors need to be updated over each slot according to the channel variations. User scheduling needs to be updated over several slots (frame) since the frequent scheduling of user equipment can cause reliability issues. Therefore, the Lyapunov optimization method is used to decouple the problem. A practical two-scale algorithm is proposed to schedule users at each frame, and obtain the beamforming vectors and amount of exchanged natural renewable energy at each slot. We prove that the proposed two-scale algorithm can asymptotically achieve the optimal solutions via tuning a control parameter. Numerical results verify the performance of the proposed two-scale algorithm.

preprint · 2016 · arXiv

Fronthauling for 5G LTE-U Ultra Dense Cloud Small Cell Networks

Ultra dense cloud small cell network (UDCSNet), which combines cloud computing and massive deployment of small cells, is a promising technology for the fifth-generation (5G) LTE-U mobile communications because it can accommodate the anticipated explosive growth of mobile users' data traffic. As a result, fronthauling becomes a challenging problem in 5G LTE-U UDCSNet. In this article, we present an overview of the challenges and requirements of the fronthaul technology in 5G LTE-U UDCSNets. We survey the advantages and challenges for various candidate fronthaul technologies such as optical fiber, millimeter-wave based unlicensed spectrum, Wi-Fi based unlicensed spectrum, sub-6 GHz based licensed spectrum, and free-space optical based unlicensed spectrum.