Source author record

Zhonghao Lyu

Zhonghao Lyu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Distributed, Parallel, and Cluster Computing Information Theory math.IT

Catalog footprint

What is connected

2works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

FlexSpec: Frozen Drafts Meet Evolving Targets in Edge-Cloud Collaborative LLM Speculative Decoding

Deploying large language models (LLMs) in mobile and edge computing environments is constrained by limited on-device resources, scarce wireless bandwidth, and frequent model evolution. Although edge-cloud collaborative inference with speculative decoding (SD) can reduce end-to-end latency by executing a lightweight draft model at the edge and verifying it with a cloud-side target model, existing frameworks fundamentally rely on tight coupling between the two models. Consequently, repeated model synchronization introduces excessive communication overhead, increasing end-to-end latency, and ultimately limiting the scalability of SD in edge environments. To address these limitations, we propose FlexSpec, a communication-efficient collaborative inference framework tailored for evolving edge-cloud systems. The core design of FlexSpec is a shared-backbone architecture that allows a single and static edge-side draft model to remain compatible with a large family of evolving cloud-side target models. By decoupling edge deployment from cloud-side model updates, FlexSpec eliminates the need for edge-side retraining or repeated model downloads, substantially reducing communication and maintenance costs. Furthermore, to accommodate time-varying wireless conditions and heterogeneous device constraints, we develop a channel-aware adaptive speculation mechanism that dynamically adjusts the speculative draft length based on real-time channel state information and device energy budgets. Extensive experiments demonstrate that FlexSpec achieves superior performance compared to conventional SD approaches in terms of inference efficiency.

preprint2022arXiv

An Overview on Over-the-Air Federated Edge Learning

Over-the-air federated edge learning (Air-FEEL) has emerged as a promising solution to support edge artificial intelligence (AI) in future beyond 5G (B5G) and 6G networks. In Air-FEEL, distributed edge devices use their local data to collaboratively train AI models while preserving data privacy, in which the over-the-air model/gradient aggregation is exploited for enhancing the learning efficiency. This article provides an overview on the state of the art of Air-FEEL. First, we present the basic principle of Air-FEEL, and introduce the technical challenges for Air-FEEL design due to the over-the-air aggregation errors, as well as the resource and data heterogeneities at edge devices. Next, we present the fundamental performance metrics for Air-FEEL, and review resource management solutions and design considerations for enhancing the Air-FEEL performance. Finally, several interesting research directions are pointed out to motivate future work.