Source author record

Yifan Lu

Yifan Lu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Computation and Language Computer Vision cs.CY eess.SY Systems and Control

Catalog footprint

What is connected

6works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Efficient Context Scaling with LongCat ZigZag Attention

We introduce LongCat ZigZag Attention (LoZA), which is a sparse attention scheme designed to transform any existing full-attention models into sparse versions with rather limited compute budget. In long-context scenarios, LoZA can achieve significant speed-ups both for prefill-intensive (e.g., retrieval-augmented generation) and decode-intensive (e.g., tool-integrated reasoning) cases. Specifically, by applying LoZA to LongCat-Flash during mid-training, we serve LongCat-Flash-Exp as a long-context foundation model that can swiftly process up to 1 million tokens, enabling efficient long-term reasoning and long-horizon agentic capabilities.

preprint2026arXiv

Evidence-Grounded Multi-Agent Planning Support for Urban Carbon Governance via RAG

Urban carbon governance requires planners to integrate heterogeneous evidence -- emission inventories, statistical yearbooks, policy texts, technical measures, and academic findings -- into actionable, cross-departmental plans. Large Language Models (LLMs) can assist planning workflows, yet their factual reliability and evidential traceability remain critical barriers in professional use. This paper presents an evidence-grounded multi-agent planning support system for urban carbon governance built upon standard text-based Retrieval-Augmented Generation (RAG) (without GraphRAG). We align the system with the typical planning workflow by decomposing tasks into four specialized agents: (i) evidence Q\&A for fact checking and compliance queries, (ii) emission status assessment for diagnostic analysis, (iii) planning recommendation for generating multi-sector governance pathways, and (iv) report integration for producing planning-style deliverables. We evaluate the system in two task families: factual retrieval and comprehensive planning generation. On factual retrieval tasks, introducing RAG increases the average score from below 6 to above 90, and dramatically improves key-field extraction (e.g., region and numeric values near 100\% detection). A real-city case study (Ningbo, China) demonstrates end-to-end report generation with strong relevance, coverage, and coherence in expert review, while also highlighting boundary inconsistencies across data sources as a practical limitation.

preprint2026arXiv

Mitigating Context-Memory Conflicts in LLMs through Dynamic Cognitive Reconciliation Decoding

Large language models accumulate extensive parametric knowledge through pre-training. However, knowledge conflicts occur when outdated or incorrect parametric knowledge conflicts with external knowledge in the context. Existing methods address knowledge conflicts through contrastive decoding, but in conflict-free scenarios, static approaches disrupt output distribution. Other dynamic decoding methods attempt to measure the degree of conflict but still struggle with complex real-world situations. In this paper, we propose a two-stage decoding method called Dynamic Cognitive Reconciliation Decoding (DCRD), to predict and mitigate context-memory conflicts. DCRD first analyzes the attention map to assess context fidelity and predict potential conflicts. Based on this prediction, the input is directed to one of two decoding paths: (1) greedy decoding, or (2) context fidelity-based dynamic decoding. This design enables DCRD to handle conflicts efficiently while maintaining high accuracy and decoding efficiency in conflict-free cases. Additionally, to simulate scenarios with frequent knowledge updates, we constructed ConflictKG, a knowledge conflict QA benchmark. Experiments on four LLMs across six QA datasets show that DCRD outperforms all baselines, achieving state-of-the-art performance.

preprint2023arXiv

Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Diverse video captioning aims to generate a set of sentences to describe the given video in various aspects. Mainstream methods are trained with independent pairs of a video and a caption from its ground-truth set without exploiting the intra-set relationship, resulting in low diversity of generated captions. Different from them, we formulate diverse captioning into a semantic-concept-guided set prediction (SCG-SP) problem by fitting the predicted caption set to the ground-truth set, where the set-level relationship is fully captured. Specifically, our set prediction consists of two synergistic tasks, i.e., caption generation and an auxiliary task of concept combination prediction providing extra semantic supervision. Each caption in the set is attached to a concept combination indicating the primary semantic content of the caption and facilitating element alignment in set prediction. Furthermore, we apply a diversity regularization term on concepts to encourage the model to generate semantically diverse captions with various concept combinations. These two tasks share multiple semantics-specific encodings as input, which are obtained by iterative interaction between visual features and conceptual queries. The correspondence between the generated captions and specific concept combinations further guarantees the interpretability of our model. Extensive experiments on benchmark datasets show that the proposed SCG-SP achieves state-of-the-art (SOTA) performance under both relevance and diversity metrics.

preprint2022arXiv

PIC 4th Challenge: Semantic-Assisted Multi-Feature Encoding and Multi-Head Decoding for Dense Video Captioning

The task of Dense Video Captioning (DVC) aims to generate captions with timestamps for multiple events in one video. Semantic information plays an important role for both localization and description of DVC. We present a semantic-assisted dense video captioning model based on the encoding-decoding framework. In the encoding stage, we design a concept detector to extract semantic information, which is then fused with multi-modal visual features to sufficiently represent the input video. In the decoding stage, we design a classification head, paralleled with the localization and captioning heads, to provide semantic supervision. Our method achieves significant improvements on the YouMakeup dataset under DVC evaluation metrics and achieves high performance in the Makeup Dense Video Captioning (MDVC) task of PIC 4th Challenge.

preprint2020arXiv

System Integration and Control Design of a Maglev Platform for Space Vibration Isolation

Micro-vibration has been a dominant factor impairing the performance of scientific experiments which are expected to be deployed in a micro-gravity environment such as Spacelab. The micro-vibration has a serious impact on scientific experiments requiring a quasi-static environment. Therefore, we proposed a maglev vibration isolation platform (MVIP) operating in six degrees of freedom (DOF) to fulfill the environmental requirements. In view of non-contact and large stroke requirements for micro-vibration isolation, an optimization method was utilized to design the actuator. Mathematical models of the actuator's remarkable nonlinearity were established so that its output can be compensated according to floater's varying position and the system's performance may be satisfied. Furthermore, aiming to adapt to an energy-limited environment such as Spacelab, an optimum allocation scheme was put forward. Considering the actuator's nonlinearity, accuracy and minimum energy-consumption can be obtained simultaneously. In view of operating in six DOF, methods for nonlinear compensation and system decoupling were discussed, the necessary controller was also presented. Simulation and experiments validate the system's performance. With a movement range of 10x10x8 mm and rotations of 200 mrad, the decay ratio of -40 dB/Dec between 1-10 Hz was obtained under closed-loop control.

Yifan Lu

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

Efficient Context Scaling with LongCat ZigZag Attention

Evidence-Grounded Multi-Agent Planning Support for Urban Carbon Governance via RAG

Mitigating Context-Memory Conflicts in LLMs through Dynamic Cognitive Reconciliation Decoding

Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

PIC 4th Challenge: Semantic-Assisted Multi-Feature Encoding and Multi-Head Decoding for Dense Video Captioning

System Integration and Control Design of a Maglev Platform for Space Vibration Isolation