Source author record

Yao Ge

Yao Ge appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

eess.SP, Information Theory, math.IT, Computation and Language, Information Retrieval, Machine Learning, Artificial Intelligence

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators

preprint, 2026, arXiv

Achievable Rate and Coding Principle for MIMO Multicarrier Systems With Cross-Domain MAMP Receiver Over Doubly Selective Channels

The integration of multicarrier modulation and multiple-input multiple-output (MIMO) is critical for reliable transmission of wireless signals in complex environments and significantly improves spectrum efficiency. Existing studies have shown that the popular orthogonal time frequency space (OTFS) and affine frequency division multiplexing (AFDM) schemes offer significant advantages over orthogonal frequency division multiplexing (OFDM) in uncoded doubly selective channels. However, it remains uncertain whether these benefits extend to coded systems. Meanwhile, the information-theoretic limit analysis of coded MIMO multicarrier systems and the corresponding low-complexity receiver design remain unclear. To overcome these challenges, this paper proposes a multi-slot cross-domain memory approximate message passing (MS-CD-MAMP) receiver and develops its information-theoretic limit (i.e., achievable rate) and optimal coding principle for MIMO multicarrier modulation (e.g., OFDM, OTFS, and AFDM) systems. The proposed MS-CD-MAMP receiver exploits not only the time-domain channel sparsity for low complexity but also the corresponding symbol-domain constellation constraints for performance enhancement. Because the high-dimensional complex state evolution (SE) is a limiting factor, a simplified single-input single-output variational SE is proposed to derive the achievable rate of MS-CD-MAMP and the optimal coding principle that maximizes the achievable rate. Numerical results show that coded MIMO-OFDM/OTFS/AFDM systems with MS-CD-MAMP achieve the same maximum achievable rate in doubly selective channels. Their finite-length performance with practical optimized low-density parity-check (LDPC) codes is only 0.5 to 1.8 dB away from the associated theoretical limit and shows a 0.8 to 4.4 dB gain over well-designed point-to-point LDPC codes.

preprint, 2026, arXiv

MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

Evaluating large language models (LLMs) in the biomedical domain requires benchmarks that can distinguish reasoning from pattern matching and remain discriminative as model capabilities improve. Existing biomedical question answering (QA) benchmarks are limited in this respect. Multiple-choice formats can allow models to succeed through answer elimination rather than inference, while widely circulated exam-style datasets are increasingly vulnerable to performance saturation and training data contamination. Multi-hop reasoning, defined as the ability to integrate information across multiple sources to derive an answer, is central to clinically meaningful tasks such as diagnostic support, literature-based discovery, and hypothesis generation, yet remains underrepresented in current biomedical QA benchmarks. MedHopQA is a disease-centered multi-hop reasoning benchmark consisting of 1,000 expert-curated question-answer pairs introduced as a shared task at BioCreative IX. Each question requires synthesis of information across two distinct Wikipedia articles, and answers are provided in an open-ended free-text format. Gold annotations are augmented with ontology-grounded synonym sets from MONDO, NCBI Gene, and NCBI Taxonomy to support both lexical and concept-level evaluation. MedHopQA was constructed through a structured process combining human annotation, triage, iterative verification, and LLM-as-a-judge validation. To reduce leaderboard gaming and contamination risk, the 1,000 scored questions are embedded within a publicly downloadable set of 10,000 questions, with answers withheld, on a CodaBench leaderboard. MedHopQA provides both a benchmark and a reusable framework for constructing future biomedical QA datasets that prioritize compositional reasoning, saturation resistance, and contamination resistance as core design constraints.

preprint, 2024, arXiv

Message Feedback Interference Cancellation Aided UAMP Iterative Detector for OTFS Systems

Designing efficient signal detectors is important yet challenging for orthogonal time frequency space (OTFS) systems in high-mobility scenarios. In this letter, we develop an efficient message feedback interference cancellation aided unitary approximate message passing (denoted as UAMP-MFIC) iterative detector, where the latest feedback messages from variable nodes are utilized for more reliable interference cancellation and performance improvement. A fast recursive scheme is leveraged in the proposed UAMP-MFIC detector to prevent an increase in complexity. To further alleviate error propagation and improve the receiver performance, we also develop bidirectional symbol detection structures, where a Turbo UAMP-MFIC detector and an iterative weight UAMP-MFIC detector are proposed to efficiently fuse the estimation results of the forward and backward UAMP-MFIC detectors. Finally, simulation results are provided to demonstrate the performance improvement of our proposed detectors over existing detectors.

preprint, 2022, arXiv

Energy Efficiency for Proactive Eavesdropping in Cooperative Cognitive Radio Networks

This paper investigates a distant proactive eavesdropping system in cooperative cognitive radio (CR) networks. Specifically, an amplify-and-forward (AF) full-duplex (FD) secondary transmitter helps relay the received signal from suspicious users to the legitimate monitor for wireless information surveillance. In return, the secondary transmitter is allowed to share the spectrum belonging to the suspicious users for its own information transmission. To improve the eavesdropping, the transmitted secondary user's signal can also be used as a jamming signal to moderate the data rate of the suspicious link. We consider two cases, i.e., non-negligible processing delay (NNPD) and negligible processing delay (NPD) at the secondary transmitter. Our target is to maximize network energy efficiency (NEE) by jointly optimizing the AF relay matrix and precoding vector at the secondary transmitter, as well as the receiver combining vector at the monitor, subject to the maximum power constraint at the secondary transmitter and the minimum data rate requirement of the secondary user. We also guarantee that the achievable data rate of the eavesdropping link is no less than that of the suspicious link for efficient surveillance. Due to the non-convexity of the formulated NEE maximization problem, we develop an efficient path-following algorithm and a robust alternating optimization (AO) method as solutions under perfect and imperfect channel state information (CSI) conditions, respectively. We also analyze the convergence and computational complexity of the proposed schemes. Numerical results are provided to validate the effectiveness of our proposed schemes.

preprint, 2022, arXiv

Few-shot learning for medical text: A systematic review

Objective: Few-shot learning (FSL) methods require small numbers of labeled instances for training. As many medical topics have limited annotated textual data in practical settings, FSL-based natural language processing (NLP) methods hold substantial promise. We aimed to conduct a systematic review to explore the state of FSL methods for medical NLP. Materials and Methods: We searched for articles published between January 2016 and August 2021 using PubMed/Medline, Embase, ACL Anthology, and IEEE Xplore Digital Library. To identify the latest relevant methods, we also searched other sources such as preprint servers (e.g., medRxiv) via Google Scholar. We included all articles that involved FSL and any type of medical text. We abstracted articles based on data source(s), aim(s), training set size(s), primary method(s)/approach(es), and evaluation method(s). Results: 31 studies met our inclusion criteria, all published after 2018; 22 (71%) since 2020. Concept extraction/named entity recognition was the most frequently addressed task (13/31; 42%), followed by text classification (10/31; 32%). Twenty-one (68%) studies reconstructed existing datasets to create few-shot scenarios synthetically, and MIMIC-III was the most frequently used dataset (7/31; 23%). Common methods included FSL with attention mechanisms (12/31; 39%), prototypical networks (8/31; 26%), and meta-learning (6/31; 19%). Discussion: Despite the potential for FSL in biomedical NLP, progress has been limited compared to domain-independent FSL. This may be due to the paucity of standardized, public datasets, and the relative underperformance of FSL methods on biomedical topics. Creation and release of specialized datasets for biomedical FSL may aid method development by enabling comparative analyses.

preprint, 2022, arXiv

How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation

Graph neural networks (GNNs), as a group of powerful tools for representation learning on irregular data, have manifested superiority in various downstream tasks. With unstructured texts represented as concept maps, GNNs can be exploited for tasks like document retrieval. Intrigued by how GNNs can help document retrieval, we conduct an empirical study on a large-scale multi-discipline dataset, CORD-19. Results show that instead of the complex structure-oriented GNNs such as GINs and GATs, our proposed semantics-oriented graph functions achieve better and more stable performance based on the BM25 retrieved candidates. Our insights in this case study can serve as a guideline for future work to develop effective GNNs with appropriate semantics-oriented inductive biases for textual reasoning tasks like document retrieval and classification. All code for this case study is available at https://github.com/HennyJie/GNN-DocRetrieval.

preprint, 2022, arXiv

Joint Channel Estimation and Data Detection for Hybrid RIS aided Millimeter Wave OTFS Systems

For high-mobility communication scenarios, the recently emerged orthogonal time frequency space (OTFS) modulation introduces a new delay-Doppler domain signal space and can provide better communication performance than traditional orthogonal frequency division multiplexing systems. This article focuses on joint channel estimation and data detection (JCEDD) for hybrid reconfigurable intelligent surface (HRIS) aided millimeter wave (mmWave) OTFS systems. Firstly, a new transmission structure is designed. Within the pilot durations of the designed structure, partial HRIS elements are alternately activated. The time domain channel model is then presented. Secondly, the received signal models for both the HRIS over the time domain and the base station over the delay-Doppler domain are studied. Thirdly, by utilizing channel parameters acquired at the HRIS, an HRIS beamforming design strategy is proposed. For the OTFS transmission, we propose a JCEDD scheme over the delay-Doppler domain. In this scheme, a message passing (MP) algorithm is designed to simultaneously obtain the equivalent channel gain and the data symbols. The channel parameters, i.e., the Doppler shift, the channel sparsity, and the channel variance, are in turn updated through an expectation-maximization (EM) algorithm. By iteratively executing the MP and EM algorithms, both the channel and the unknown data symbols can be accurately acquired. Finally, simulation results are provided to validate the effectiveness of our proposed JCEDD scheme.

preprint, 2021, arXiv

OTFS Signaling for Uplink NOMA of Heterogeneous Mobility Users

We investigate a coded uplink non-orthogonal multiple access (NOMA) configuration in which groups of co-channel users are modulated in accordance with orthogonal time frequency space (OTFS). We take advantage of OTFS characteristics to achieve NOMA spectrum sharing in the delay-Doppler domain between stationary and mobile users. We develop an efficient iterative turbo receiver based on the principle of successive interference cancellation (SIC) to overcome the co-channel interference (CCI). We propose two turbo detector algorithms: orthogonal approximate message passing with linear minimum mean squared error (OAMP-LMMSE) and Gaussian approximate message passing with expectation propagation (GAMP-EP). The interactive OAMP-LMMSE detector and GAMP-EP detector are assigned to the reception of the stationary and mobile users, respectively. We analyze the convergence performance of our proposed iterative SIC turbo receiver by utilizing a customized extrinsic information transfer (EXIT) chart and simplify the corresponding detector algorithms to further reduce receiver complexity. Our proposed iterative SIC turbo receiver demonstrates performance improvement over existing receivers and robustness against an imperfect SIC process and channel state information uncertainty.

preprint, 2021, arXiv

Receiver Design for OTFS with Fractionally Spaced Sampling Approach

The recently emerged orthogonal time frequency space (OTFS) modulation is a novel PHY-layer mechanism that is more suitable for high-mobility wireless communication scenarios than traditional orthogonal frequency division multiplexing (OFDM). Although multiple studies have analyzed OTFS performance using theoretical and ideal baseband pulse shapes, a challenging and open problem is the development of effective receivers for practical OTFS systems that must rely on non-ideal pulse shapes for transmission. This work focuses on the design of practical receivers for OTFS. We consider a fractionally spaced sampling (FSS) receiver in which the sampling rate is an integer multiple of the symbol rate. For rectangular pulses used in OTFS transmission, we derive a general channel input-output relationship of OTFS in the delay-Doppler domain without the common reliance on impractical assumptions such as ideal bi-orthogonal pulses and on-the-grid delay/Doppler shifts. We propose two equalization algorithms for symbol detection, iterative combining message passing (ICMP) and turbo message passing (TMP), which exploit delay-Doppler channel sparsity and the frequency diversity gain obtained via FSS. We analyze the convergence performance of the TMP receiver and propose simplified message passing (MP) receivers to further reduce complexity. Our FSS receivers demonstrate stronger performance than traditional receivers and robustness to imperfect channel state information.
