Researcher profile

Yulin Shao

Yulin Shao contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
11 works
0 followers
11 topics
4 close collaborators

Published work

11 published items

preprint · 2026 · arXiv

Hierarchical Online-Scheduling for Energy-Efficient Split Inference with Progressive Transmission

Device-edge collaborative inference with Deep Neural Networks (DNNs) faces fundamental trade-offs among accuracy, latency and energy consumption. Current scheduling exhibits two drawbacks: a granularity mismatch between coarse, task-level decisions and fine-grained, packet-level channel dynamics, and insufficient awareness of per-task complexity. Consequently, scheduling solely at the task level leads to inefficient resource utilization. This paper proposes a novel ENergy-ACcuracy Hierarchical optimization framework for split Inference, named ENACHI, that jointly optimizes task- and packet-level scheduling to maximize accuracy under energy and delay constraints. A two-tier Lyapunov-based framework is developed for ENACHI, with a progressive transmission technique further integrated to enhance adaptivity. At the task level, an outer drift-plus-penalty loop makes online decisions for DNN partitioning and bandwidth allocation, and establishes a reference power budget to manage the long-term energy-accuracy trade-off. At the packet level, an uncertainty-aware progressive transmission mechanism is employed to adaptively manage per-sample task complexity. This is integrated with a nested inner control loop implementing a novel reference-tracking policy, which dynamically adjusts per-slot transmit power to adapt to fluctuating channel conditions. Experiments on the ImageNet dataset demonstrate that ENACHI outperforms state-of-the-art benchmarks under varying deadlines and bandwidths, achieving a 43.12% gain in inference accuracy with a 62.13% reduction in energy consumption under stringent deadlines, and exhibits high scalability by maintaining stable energy consumption in congested multi-user scenarios.
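The drift-plus-penalty idea mentioned in this abstract can be illustrated with a generic textbook sketch. This is not the ENACHI algorithm: the Rayleigh-fading channel model, the log2(1 + gain * power) reward, the discrete power levels, and every function and parameter name below are assumptions made purely for illustration. A virtual queue tracks how far cumulative power use has run ahead of a long-term budget, and each slot picks the power that maximizes V * reward minus queue * power.

```python
import math
import random


def drift_plus_penalty(num_slots, budget, v, power_levels, seed=0):
    """Generic drift-plus-penalty power control (illustrative, not ENACHI)."""
    rng = random.Random(seed)
    queue = 0.0        # virtual queue: accumulated power use above the budget
    total_power = 0.0
    total_rate = 0.0
    for _ in range(num_slots):
        gain = rng.expovariate(1.0)  # assumed Rayleigh-fading power gain
        # Per-slot decision: trade reward (weighted by V) against queue pressure.
        best = max(
            power_levels,
            key=lambda p: v * math.log2(1.0 + gain * p) - queue * p,
        )
        total_power += best
        total_rate += math.log2(1.0 + gain * best)
        # Queue grows when the slot spends more than the budget, and is floored at 0.
        queue = max(queue + best - budget, 0.0)
    return total_power / num_slots, total_rate / num_slots, queue


avg_power, avg_rate, final_queue = drift_plus_penalty(
    num_slots=2000, budget=0.5, v=5.0, power_levels=[0.0, 0.25, 0.5, 1.0, 2.0]
)
print(avg_power, avg_rate, final_queue)
```

Because the queue update never undercounts overspending, the average power is at most the budget plus the final queue length divided by the number of slots. A larger V favors reward at the cost of a longer queue and slower convergence to the budget.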

preprint · 2024 · arXiv

Point Cloud in the Air

Acquisition and processing of point clouds (PCs) is a crucial enabler for many emerging applications reliant on 3D spatial data, such as robot navigation, autonomous vehicles, and augmented reality. In most scenarios, PCs acquired by remote sensors must be transmitted to an edge server for fusion, segmentation, or inference. Wireless transmission of PCs not only puts an increased burden on the already congested wireless spectrum, but also confronts a unique set of challenges arising from the irregular and unstructured nature of PCs. In this paper, we meticulously delineate these challenges and offer a comprehensive examination of existing solutions while candidly acknowledging their inherent limitations. In response to these intricacies, we proffer four pragmatic solution frameworks, spanning advanced techniques, hybrid schemes, and distributed data aggregation approaches. In doing so, our goal is to chart a path toward efficient, reliable, and low-latency wireless PC transmission.

preprint · 2022 · arXiv

Channel-Adaptive Wireless Image Transmission with OFDM

We present a learning-based channel-adaptive joint source and channel coding (CA-JSCC) scheme for wireless image transmission over multipath fading channels. The proposed method is an end-to-end autoencoder architecture with a dual-attention mechanism employing orthogonal frequency division multiplexing (OFDM) transmission. Unlike the previous works, our approach is adaptive to channel-gain and noise-power variations by exploiting the estimated channel state information (CSI). Specifically, with the proposed dual-attention mechanism, our model can learn to map the features and allocate transmission-power resources judiciously based on the estimated CSI. Extensive numerical experiments verify that CA-JSCC achieves state-of-the-art performance among existing JSCC schemes. In addition, CA-JSCC is robust to varying channel conditions and can better exploit the limited channel resources by transmitting critical features over better subchannels.

preprint · 2022 · arXiv

Dynamic gNodeB Sleep Control for Energy-Conserving 5G Radio Access Network

5G radio access network (RAN) is consuming much more energy than legacy RAN due to the denser deployments of gNodeBs (gNBs) and higher single-gNB power consumption. In an effort to achieve an energy-conserving RAN, this paper develops a dynamic on-off switching paradigm, where the ON/OFF states of gNBs can be dynamically configured according to the evolvements of the associated users. We formulate the dynamic sleep control for a cluster of gNBs as a Markov decision process (MDP) and analyze various switching policies to reduce the energy expenditure. The optimal policy of the MDP that minimizes the energy expenditure can be derived from dynamic programming, but the computation is expensive. To circumvent this issue, this paper puts forth a greedy policy and an index policy for gNB sleep control. When there is no constraint on the number of gNBs that can be turned off, we prove the dual-threshold structure of the greedy policy and analyze its connections with the optimal policy. Inspired by the dual-threshold structure and Whittle index, we develop an index policy by decoupling the original MDP into multiple one-dimensional MDPs -- the indexability of the decoupled MDP is proven and an algorithm to compute the index is proposed. Extensive simulation results verify that the index policy exhibits close-to-optimal performance in terms of the energy expenditure of the gNB cluster. As far as the computational complexity is concerned, on the other hand, the index policy is much more efficient than the optimal policy, which is computationally prohibitive when the number of gNBs is large.

preprint · 2022 · arXiv

Efficient FFT Computation in IFDMA Transceivers

Interleaved Frequency Division Multiple Access (IFDMA) has the salient advantage of lower Peak-to-Average Power Ratio (PAPR) than its competitors like Orthogonal FDMA (OFDMA). A recent research effort put forth a new IFDMA transceiver design significantly less complex than conventional IFDMA transceivers. The new IFDMA transceiver design reduces the complexity by exploiting a certain correspondence between the IFDMA signal processing and the Cooley-Tukey IFFT/FFT algorithmic structure so that IFDMA streams can be inserted/extracted at different stages of an IFFT/FFT module according to the sizes of the streams. Although the prior work has laid down the theoretical foundation for the new IFDMA transceiver's structure, the practical realization of the transceiver on specific hardware with resource constraints has not been carefully investigated. This paper is an attempt to fill the gap. Specifically, this paper puts forth a heuristic algorithm called multi-priority scheduling (MPS) to schedule the execution of the butterfly computations in the IFDMA transceiver with the constraint of a limited number of hardware processors. The resulting FFT computation, referred to as MPS-FFT, has a much lower computation time than conventional FFT computation when applied to the IFDMA signal processing. Importantly, we derive a lower bound for the optimal IFDMA FFT computation time to benchmark MPS-FFT. Our experimental results indicate that when the number of hardware processors is a power of two: 1) MPS-FFT has near-optimal computation time; 2) MPS-FFT incurs less than 44.13% of the computation time of the conventional pipelined FFT.

preprint · 2022 · arXiv

Federated Spatial Reuse Optimization in Next-Generation Decentralized IEEE 802.11 WLANs

As wireless standards evolve, more complex functionalities are introduced to address the increasing requirements in terms of throughput, latency, security, and efficiency. To unleash the potential of such new features, artificial intelligence (AI) and machine learning (ML) are currently being exploited for deriving models and protocols from data, rather than by hand-programming. In this paper, we explore the feasibility of applying ML in next-generation wireless local area networks (WLANs). More specifically, we focus on the IEEE 802.11ax spatial reuse (SR) problem and predict its performance through federated learning (FL) models. The set of FL solutions overviewed in this work is part of the 2021 International Telecommunication Union (ITU) AI for 5G Challenge.

preprint · 2022 · arXiv

Phase Code Discovery for Pulse Compression Radar: A Genetic Algorithm Approach

Discovering sequences with desired properties has long been an interesting intellectual pursuit. In pulse compression radar (PCR), discovering phase codes with low aperiodic autocorrelations is essential for a good estimation performance. The design of phase code, however, is mathematically non-trivial as the aperiodic autocorrelation properties of a sequence are intractable to characterize. In this paper, we put forth a genetic algorithm (GA) approach to discover new phase codes for PCR with the mismatched filter (MMF) receiver. The developed GA, dubbed GASeq, discovers better phase codes than the state of the art. At a code length of 59, the sequence discovered by GASeq achieves a signal-to-clutter ratio (SCR) of 50.84, while the best-known sequence has an SCR of 45.16. In addition, the efficiency and scalability of GASeq enable us to search phase codes with a longer code length, which thwarts existing deep learning-based approaches. At a code length of 100, the best phase code discovered by GASeq exhibits an SCR of 63.23.
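To make "low aperiodic autocorrelation" concrete, the sketch below computes the aperiodic autocorrelation of a binary phase code and its peak sidelobe. It uses the well-known length-13 Barker code as the example and a plain matched-filter sidelobe measure. This is a simpler stand-in and not the mismatched-filter SCR objective or the GASeq search described in the abstract; the function names are illustrative.

```python
def aperiodic_autocorrelation(code):
    """Return the aperiodic autocorrelation of a real-valued code for lags 0..n-1."""
    n = len(code)
    return [
        sum(code[i] * code[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


def peak_sidelobe(code):
    """Largest autocorrelation magnitude at any nonzero lag."""
    return max(abs(value) for value in aperiodic_autocorrelation(code)[1:])


# Length-13 Barker code: mainlobe 13, every sidelobe has magnitude at most 1.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

print(aperiodic_autocorrelation(BARKER_13)[0])  # mainlobe energy
print(peak_sidelobe(BARKER_13))                 # peak sidelobe of the Barker code
print(peak_sidelobe([1] * 13))                  # a poor code for comparison
```

A search procedure such as a genetic algorithm would use a measure of this kind (or the SCR under a mismatched filter) as its fitness function when scoring candidate codes.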

preprint · 2022 · arXiv

Semantic Communications with Discrete-time Analog Transmission: A PAPR Perspective

Recent progress in deep learning (DL)-based joint source-channel coding (DeepJSCC) has led to a new paradigm of semantic communications. Two salient features of DeepJSCC-based semantic communications are the exploitation of semantic-aware features directly from the source signal, and the discrete-time analog transmission (DTAT) of these features. Compared with traditional digital communications, semantic communications with DeepJSCC provide superior reconstruction performance at the receiver and graceful degradation with diminishing channel quality, but also exhibit a large peak-to-average power ratio (PAPR) in the transmitted signal. An open question has been whether the gains of DeepJSCC come from the additional freedom brought by the high-PAPR continuous-amplitude signal. In this paper, we address this question by exploring three PAPR reduction techniques in the application of image transmission. We confirm that the superior image reconstruction performance of DeepJSCC-based semantic communications can be retained while the transmitted PAPR is suppressed to an acceptable level. This observation is an important step towards the implementation of DeepJSCC in practical semantic communication systems.

preprint · 2022 · arXiv

Uncertainty-of-Information Scheduling: A Restless Multi-armed Bandit Framework

This paper proposes using the uncertainty of information (UoI), measured by Shannon's entropy, as a metric for information freshness. We consider a system in which a central monitor observes multiple binary Markov processes through a communication channel. The UoI of a Markov process corresponds to the monitor's uncertainty about its state. At each time step, only one Markov process can be selected to update its state to the monitor; hence there is a tradeoff among the UoIs of the processes that depend on the scheduling policy used to select the process to be updated. The age of information (AoI) of a process corresponds to the time since its last update. In general, the associated UoI can be a non-increasing function, or even an oscillating function, of its AoI, making the scheduling problem particularly challenging. This paper investigates scheduling policies that aim to minimize the average sum-UoI of the processes over the infinite time horizon. We formulate the problem as a restless multi-armed bandit (RMAB) problem, and develop a Whittle index policy that is near-optimal for the RMAB after proving its indexability. We further provide an iterative algorithm to compute the Whittle index for the practical deployment of the policy. Although this paper focuses on UoI scheduling, our results apply to a general class of RMABs for which the UoI scheduling problem is a special case. Specifically, this paper's Whittle index policy is valid for any RMAB in which the bandits are binary Markov processes and the penalty is a concave function of the belief state of the Markov process. Numerical results demonstrate the excellent performance of the Whittle index policy for this class of RMABs.
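The relationship between UoI and AoI described in this abstract can be sketched for a single binary Markov process. The code below is an illustration under stated assumptions, not the paper's scheduling policy: `p01` and `p10` are hypothetical names for the 0-to-1 and 1-to-0 transition probabilities, the monitor is assumed to learn the state exactly at each update, and the function names are invented here. The belief that the process is in state 1 is propagated one step per slot, and the UoI is the binary entropy of that belief.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def uoi_versus_age(p01, p10, max_age, last_state=1):
    """UoI at ages 0..max_age after the monitor last observed `last_state`."""
    belief = float(last_state)  # P(state = 1), known exactly at age 0
    uoi = []
    for _ in range(max_age + 1):
        uoi.append(binary_entropy(belief))
        # One-step belief propagation through the Markov chain.
        belief = belief * (1.0 - p10) + (1.0 - belief) * p01
    return uoi


# Asymmetric chain: the belief starts at 1, passes through 0.5, and settles
# at the stationary value 0.25, so the UoI rises and then falls with age.
for age, value in enumerate(uoi_versus_age(p01=0.1, p10=0.3, max_age=6)):
    print(age, round(value, 4))
```

With these assumed parameters the UoI peaks around age 2 and then declines, which shows why UoI need not grow monotonically with AoI.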

preprint · 2020 · arXiv

New Transceiver Designs for Interleaved Frequency Division Multiple Access

This paper puts forth a class of new transceiver designs for interleaved frequency division multiple access (IFDMA) systems. These transceivers are significantly less complex than conventional IFDMA transceivers. The simple new designs are founded on a key observation that multiplexing and demultiplexing of IFDMA data streams of different sizes are coincident with the IFFTs and FFTs of different sizes embedded within the Cooley-Tukey recursive FFT decomposition scheme. For flexible resource allocation, this paper puts forth a new IFDMA resource allocation framework called Multi-IFDMA, in which a user can be allocated multiple IFDMA streams. Our new transceivers are unified designs in that they can be used in conventional IFDMA as well as Multi-IFDMA systems. Two other well-known multiple-access schemes are localized FDMA (LFDMA) and orthogonal FDMA (OFDMA). In terms of flexibility in resource allocation, Multi-IFDMA, LFDMA, and OFDMA are on an equal footing. With our new transceiver designs, however, IFDMA has the following advantages (besides other known advantages not due to our new transceiver designs): 1) IFDMA/Multi-IFDMA transceivers are significantly less complex than LFDMA transceivers; in addition, IFDMA/Multi-IFDMA has better Peak-to-Average Power Ratio (PAPR) than LFDMA; 2) IFDMA/Multi-IFDMA transceivers and OFDMA transceivers are comparable in complexity; but IFDMA/Multi-IFDMA has significantly better PAPR than OFDMA.
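The PAPR advantage of IFDMA mentioned in this abstract follows from a standard property: DFT-precoding a block of symbols and mapping it onto equally spaced subcarriers yields a time-domain signal that is a scaled repetition of the original symbols. The sketch below checks this numerically with a naive DFT. It is a textbook illustration, not the transceiver design proposed in the paper; the function names, the QPSK block, and the single-user setup with the stream starting at subcarrier 0 are assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) DFT; the inverse transform includes the 1/n scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
    return [value / n for value in out] if inverse else out


def papr(signal):
    """Peak-to-average power ratio (linear scale)."""
    powers = [abs(s) ** 2 for s in signal]
    return max(powers) / (sum(powers) / len(powers))


def ifdma_time_signal(symbols, num_subcarriers):
    """DFT-precode the symbols, then map them onto equally spaced subcarriers."""
    spacing = num_subcarriers // len(symbols)
    spectrum = [0j] * num_subcarriers
    for k, value in enumerate(dft(symbols)):
        spectrum[k * spacing] = value
    return dft(spectrum, inverse=True)


def ofdma_time_signal(symbols, num_subcarriers):
    """Same subcarrier mapping, but without DFT precoding."""
    spacing = num_subcarriers // len(symbols)
    spectrum = [0j] * num_subcarriers
    for k, value in enumerate(symbols):
        spectrum[k * spacing] = value
    return dft(spectrum, inverse=True)


scale = 2 ** -0.5
QPSK_BLOCK = [scale * (1 + 1j), scale * (1 - 1j), scale * (-1 + 1j), scale * (-1 - 1j)]

print(papr(ifdma_time_signal(QPSK_BLOCK, 16)))  # constant envelope, PAPR close to 1
print(papr(ofdma_time_signal(QPSK_BLOCK, 16)))  # noticeably larger PAPR
```

For constant-modulus symbols such as QPSK, the IFDMA signal keeps a constant envelope, while the signal without precoding is a superposition of subcarriers and peaks above its average power.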

preprint · 2020 · arXiv

Sporadic Ultra-Time-Critical Crowd Messaging in V2X

Life-critical warning message, abbreviated as warning message, is a special event-driven message that carries emergency warning information in Vehicle-to-Everything (V2X). Three important characteristics that distinguish warning messages from ordinary vehicular messages are sporadicity, crowding, and ultra-time-criticality. In other words, warning messages come only once in a while in a sporadic manner; however, when they come, they tend to come as a crowd and they need to be delivered in short order. This paper puts forth a medium-access control (MAC) protocol for warning messages. To circumvent potential inefficiency arising from sporadicity, we propose an override network architecture whereby warning messages are delivered on the spectrum of the ordinary vehicular messages. Specifically, a vehicle with a warning message first sends an interrupt signal to pre-empt the transmission of ordinary messages, so that the warning message can use the wireless spectrum originally allocated to ordinary messages. In this way, no exclusive spectrum resources need to be pre-allocated to the sporadic warning messages. To meet the crowding and ultra-time-criticality aspects, we use advanced channel access techniques to ensure highly reliable delivery of warning messages within an ultra-short time in the order of 10 ms. In short, the overall MAC protocol operates by means of interrupt-and-access.