Source author record

Weijun Zeng

Weijun Zeng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Theory, math.IT, Artificial Intelligence, Computation and Language, Computer Vision, Robotics

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators


preprint, 2026, arXiv

CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence

Multimodal Large Language Models (MLLMs) have significantly advanced document understanding, yet current Doc-VQA evaluations score only the final answer and leave the supporting evidence unchecked. This answer-only approach masks a critical failure mode: a model can land on the correct answer while grounding it in the wrong passage -- a serious risk in high-stakes domains like law, finance, and medicine, where every conclusion must be traceable to a specific source region. To address this, we introduce CiteVQA, a benchmark that requires models to return element-level bounding-box citations alongside each answer, evaluating both jointly. CiteVQA comprises 1,897 questions across 711 PDFs spanning seven domains and two languages, averaging 40.6 pages per document. To ensure fidelity and scalability, the ground-truth citations are generated by an automated pipeline -- which identifies crucial evidence via masking ablation -- and are subsequently validated through expert review. At the core of our evaluation is Strict Attributed Accuracy (SAA), which credits a prediction only when the answer and the cited region are both correct. Auditing 20 MLLMs reveals a pervasive Attribution Hallucination: models frequently produce the right answer while citing the wrong region. The strongest system (Gemini-3.1-Pro-Preview) achieves an SAA of only 76.0, and the strongest open-source MLLM reaches just 22.5. Ultimately, towards trustworthy document intelligence, CiteVQA exposes a reliability gap that answer-only evaluations overlook, providing the instrumentation needed to close it. Our repository is available at https://github.com/opendatalab/CiteVQA.
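The joint scoring rule behind SAA can be illustrated with a minimal sketch. The abstract does not say how a cited region is judged correct, so the intersection-over-union test, the 0.5 threshold, the single box per prediction, and every name below (`Prediction`, `iou`, `strict_attributed_accuracy`) are assumptions made for illustration, not the benchmark's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical box format: (x0, y0, x1, y1) in page coordinates.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Prediction:
    answer_correct: bool  # assumed to be judged elsewhere
    cited_box: Box        # region the model cited
    gold_box: Box         # ground-truth evidence region


def strict_attributed_accuracy(preds: list[Prediction],
                               iou_threshold: float = 0.5) -> float:
    """Percentage of predictions with BOTH a correct answer and a matching citation.

    The IoU threshold is an assumed matching rule, not taken from the paper.
    """
    if not preds:
        return 0.0
    hits = sum(
        1 for p in preds
        if p.answer_correct and iou(p.cited_box, p.gold_box) >= iou_threshold
    )
    return 100.0 * hits / len(preds)


gold = (0.0, 0.0, 10.0, 10.0)
elsewhere = (50.0, 50.0, 60.0, 60.0)
preds = [
    Prediction(True, gold, gold),        # right answer, right region: credited
    Prediction(True, elsewhere, gold),   # right answer, wrong region: not credited
    Prediction(False, gold, gold),       # wrong answer, right region: not credited
    Prediction(False, elsewhere, gold),  # wrong on both counts
]
answer_only = 100.0 * sum(p.answer_correct for p in preds) / len(preds)
saa = strict_attributed_accuracy(preds)
```

In this toy set, answer-only accuracy is 50.0 while the strict joint score is 25.0; the gap comes from the second prediction, which is the right-answer, wrong-citation case the abstract calls Attribution Hallucination.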

preprint, 2026, arXiv

GFM4GA: Graph Foundation Model for Group Anomaly Detection

Group anomaly detection is crucial in many network applications, but faces challenges due to diverse anomaly patterns. Motivated by the success of large language models (LLMs) in natural language processing, graph foundation models (GFMs) have been proposed to handle few-shot learning tasks with less labeling effort. GFMs have been successfully applied to the detection of individual anomalies but cannot be generalized to group anomalies, as group anomaly patterns must be detected as a whole and individuals in an abnormal group can look rather normal. Therefore, we propose GFM4GA, a novel graph foundation model for group anomaly detection. The pipeline is pretrained via dual-level contrastive learning based on feature-based estimation and group extraction, to capture potential group anomaly structure and feature inconsistencies. In the downstream tasks, the pipeline is finetuned in parameter-constrained and group-anomaly-proportion weighted few-shot settings, and its ability to adapt to unseen group anomalies is expanded via group contexts determined by labeled anomaly neighbors. Experiments show that GFM4GA surpasses group anomaly detectors and GFMs for individual anomalies, achieving average improvements of 2.85% in AUROC and 2.55% in AUPRC.

preprint, 2026, arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Autonomous systems are increasingly deployed in open and dynamic environments -- from city streets to aerial and indoor spaces -- where perception models must remain reliable under sensor noise, environmental variation, and platform shifts. However, even state-of-the-art methods often degrade under unseen conditions, highlighting the need for robust and generalizable robot sensing. The RoboSense 2025 Challenge is designed to advance robustness and adaptability in robot perception across diverse sensing scenarios. It unifies five complementary research tracks spanning language-grounded decision making, socially compliant navigation, sensor configuration generalization, cross-view and cross-modal correspondence, and cross-platform 3D perception. Together, these tasks form a comprehensive benchmark for evaluating real-world sensing reliability under domain shifts, sensor failures, and platform discrepancies. RoboSense 2025 provides standardized datasets, baseline models, and unified evaluation protocols, enabling large-scale and reproducible comparison of robust perception methods. The challenge attracted 143 teams from 85 institutions across 16 countries, reflecting broad community engagement. By consolidating insights from 23 winning solutions, this report highlights emerging methodological trends, shared design principles, and open challenges across all tracks, marking a step toward building robots that can sense reliably, act robustly, and adapt across platforms in real-world environments.

preprint, 2013, arXiv

Turbo DPSK in Bi-directional Relaying

In this paper, an iterative differential phase-shift keying (DPSK) demodulation and channel decoding scheme is investigated for the Joint Channel decoding and physical layer Network Coding (JCNC) approach in two-way relaying systems. The Bahl, Cocke, Jelinek, and Raviv (BCJR) algorithm for both coherent and noncoherent detection is derived for soft-in soft-out decoding of DPSK signalling over the two-user multiple-access channel with Rayleigh fading. Then, we propose a pragmatic approach with the JCNC scheme for iteratively exploiting the extrinsic information of the outer code. With coherent detection, we show that DPSK can be well concatenated with simple convolutional codes to achieve excellent coding gain just like in traditional point-to-point communication scenarios. The proposed noncoherent detection, which essentially requires that the channel remains constant over two consecutive symbols, can work without explicit channel estimation. Simulation results show that the iterative processing converges very fast and most of the coding gain is obtained within two iterations.
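The remark that noncoherent detection needs only a channel that stays constant over two consecutive symbols can be illustrated with a simplified sketch. It covers plain single-user binary DPSK over a noiseless channel with an unknown complex gain; it is not the paper's two-user BCJR detector, and the function names and the gain value are hypothetical.

```python
import cmath


def dpsk_encode(bits: list[int]) -> list[complex]:
    """Differentially encode bits: a 1 flips the phase by pi, a 0 keeps it."""
    symbols = [1.0 + 0j]  # reference symbol, carries no information
    for b in bits:
        symbols.append(symbols[-1] * (-1 if b else 1))
    return symbols


def dpsk_detect_noncoherent(received: list[complex]) -> list[int]:
    """Detect bits from the phase difference of consecutive received samples.

    No channel estimate is used: if the gain h is the same for samples k-1 and k,
    then r_k * conj(r_{k-1}) = |h|^2 * s_k * conj(s_{k-1}), so h's phase cancels.
    """
    bits = []
    for prev, cur in zip(received, received[1:]):
        metric = (cur * prev.conjugate()).real
        bits.append(1 if metric < 0 else 0)
    return bits


bits = [1, 0, 1, 1, 0, 0, 1]
h = 0.7 * cmath.exp(1j * 2.1)  # unknown gain and phase (illustrative value)
received = [h * s for s in dpsk_encode(bits)]
decoded = dpsk_detect_noncoherent(received)
```

Here `decoded` equals `bits` even though the receiver never learns `h`, because the unknown phase appears in both samples of each product and cancels.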

preprint, 2011, arXiv

Joint Network and LDPC Coding for Bi-directional Relaying

In this paper, we consider joint network and LDPC coding for practically implementing the denoise-and-forward protocol over bi-directional relaying. Closed-form expressions for computing the log-likelihood ratios of the network-coded codewords are derived for both real and complex multiple-access channels. It is revealed that the equivalent channel observed at the relay is an asymmetrical channel, where the channel input is the XOR of the two source nodes' codewords.
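As an illustration of what a log-likelihood ratio for the XOR bit can look like, here is a minimal sketch for the simplest real-valued case. It assumes both sources send BPSK (bit 0 as +1, bit 1 as -1) with unit channel gains and equiprobable bits, and that the relay observes their sum plus Gaussian noise. This is a textbook special case rather than the paper's derived expressions, and `xor_llr` is a hypothetical name.

```python
import math


def xor_llr(y: float, sigma: float) -> float:
    """log P(xor=0 | y) / P(xor=1 | y) for y = x_a + x_b + n, n ~ N(0, sigma^2).

    xor = 0 means x_a == x_b, so the noiseless sum is +2 or -2.
    xor = 1 means x_a != x_b, so the noiseless sum is 0 (two equally likely ways).
    """
    s2 = 2.0 * sigma ** 2
    a = -((y - 2.0) ** 2) / s2  # exponent for sum = +2
    b = -((y + 2.0) ** 2) / s2  # exponent for sum = -2
    c = -(y ** 2) / s2          # exponent for sum = 0
    m = max(a, b)               # log-sum-exp for numerical stability
    log_num = m + math.log(math.exp(a - m) + math.exp(b - m))
    log_den = math.log(2.0) + c
    return log_num - log_den


llr_near_two = xor_llr(2.0, 0.5)   # observation near +2 favours xor = 0
llr_near_zero = xor_llr(0.0, 0.5)  # observation near 0 favours xor = 1
```

The two XOR hypotheses map to different signal sets at the relay ({+2, -2} versus {0}), so their likelihoods are not mirror images of each other. This is one way to see why the equivalent channel for the XOR bit is asymmetric.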
