Researcher profile

Jia Li

Jia Li contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
7 topics
4 close collaborators

Published work

9 published items

preprint, 2026, arXiv

Accelerated Topological Pumping in Photonic Waveguides Based on Global Adiabatic Criteria

Adiabatic topological pumping enables robust transport of energy and information, yet its operational speed is fundamentally constrained by the instantaneous adiabatic condition, which necessitates prohibitively slow parameter variations. Here, we propose a paradigm shift from instantaneous to global adiabaticity. We derive a global adiabatic criterion (GAC) that establishes an absolute fidelity bound by controlling the root-mean-square nonadiabaticity. Building on this framework, we introduce a fluctuation-suppression acceleration criterion to minimize spatial inhomogeneity, allowing for a safe increase in mean nonadiabaticity without compromising fidelity. We experimentally demonstrate this principle in femtosecond-laser-written photonic Su-Schrieffer-Heeger waveguide arrays via scalable power-law coupling modulation. Our accelerated topological pumping achieves a fidelity of >0.95 with a fivefold reduction in device length compared to conventional schemes, exhibits the predicted linear scaling with system size, and maintains robust performance across a bandwidth exceeding 400 nm. This GAC framework provides a universal design rule for fast, compact, and robust adiabatic devices across both quantum and classical topological platforms.
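As background for the abstract above, here is a minimal sketch of the edge state that such a pump transports. It uses the textbook Su-Schrieffer-Heeger chain with an odd number of sites, where the zero-energy mode can be written down in closed form. This is not the authors' waveguide design or their power-law modulation; the function names and the coupling values `v` and `w` are illustrative assumptions.

```python
import math


def ssh_hamiltonian(n_sites, v, w):
    """Tight-binding SSH chain: couplings alternate v, w, v, w, ... along the chain."""
    h = [[0.0] * n_sites for _ in range(n_sites)]
    for i in range(n_sites - 1):
        t = v if i % 2 == 0 else w
        h[i][i + 1] = t
        h[i + 1][i] = t
    return h


def zero_mode(n_sites, v, w):
    """Normalized zero-energy mode of an odd-length chain.

    The mode lives on even-indexed sites only, and its amplitude changes
    by a factor of -v/w from one such site to the next.
    """
    if n_sites % 2 == 0:
        raise ValueError("this closed form assumes an odd number of sites")
    psi = [0.0] * n_sites
    amp = 1.0
    for i in range(0, n_sites, 2):
        psi[i] = amp
        amp *= -v / w
    norm = math.sqrt(sum(a * a for a in psi))
    return [a / norm for a in psi]


def apply(h, psi):
    """Matrix-vector product H psi, used to check that psi has zero energy."""
    return [sum(row[j] * psi[j] for j in range(len(psi))) for row in h]


if __name__ == "__main__":
    # v < w localizes the mode at the left edge; v > w moves it to the right edge.
    for v, w in [(0.2, 1.0), (1.0, 0.2)]:
        psi = zero_mode(9, v, w)
        print(v, w, round(psi[0] ** 2, 3), round(psi[-1] ** 2, 3))
```

Sweeping the ratio `v/w` slowly from below 1 to above 1 carries this mode from one end of the chain to the other, which is the transport process whose speed the abstract addresses.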

preprint, 2026, arXiv

AlgBench: To What Extent Do Large Reasoning Models Understand Algorithms?

Reasoning ability has become a central focus in the advancement of Large Reasoning Models (LRMs). Although notable progress has been achieved on several reasoning benchmarks such as MATH500 and LiveCodeBench, existing benchmarks for algorithmic reasoning remain limited, failing to answer a critical question: Do LRMs truly master algorithmic reasoning? To answer this question, we propose AlgBench, an expert-curated benchmark that evaluates LRMs under an algorithm-centric paradigm. AlgBench consists of over 3,000 original problems spanning 27 algorithms, constructed by ACM algorithmic experts and organized under a comprehensive taxonomy, including Euclidean-structured, non-Euclidean-structured, non-optimized, local-optimized, global-optimized, and heuristic-optimized categories. Empirical evaluations on leading LRMs (e.g., Gemini-3-Pro, DeepSeek-v3.2-Speciale and GPT-o3) reveal substantial performance heterogeneity: while models perform well on non-optimized tasks (up to 92%), accuracy drops sharply to around 49% on globally optimized algorithms such as dynamic programming. Further analysis uncovers strategic over-shifts, wherein models prematurely abandon correct algorithmic designs due to necessary low-entropy tokens. These findings expose fundamental limitations of problem-centric reinforcement learning and highlight the necessity of an algorithm-centric training paradigm for robust algorithmic reasoning.

preprint, 2026, arXiv

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability remains under-evaluated, largely because existing benchmarks do not provide a systematic way to assess memory mechanisms. In this paper, we study agent memory from a self-evolving perspective and introduce EvoMemBench, a unified benchmark organized along two axes: memory scope (in-episode vs. cross-episode) and memory content (knowledge-oriented vs. execution-oriented). We compare 15 representative memory methods with strong long-context baselines under a standardized protocol. Results show that current memory systems are still far from a general solution: long-context baselines remain highly competitive, memory helps most when the current context is insufficient or tasks are difficult, and no single memory form works consistently across all settings. Retrieval-based methods remain strong for knowledge-intensive settings, whereas procedural and long-term memory methods are more effective for execution-oriented tasks when their stored experience matches the task structure. We hope EvoMemBench facilitates future research on more effective memory systems for LLM-based agents. Our code is available at https://github.com/DSAIL-Memory/EvoMemBench.

preprint, 2026, arXiv

Incentivizing In-depth Reasoning over Long Contexts with Process Advantage Shaping

Reinforcement Learning with Verifiable Rewards (RLVR) has proven effective in enhancing LLMs' short-context reasoning, but its performance degrades in long-context scenarios that require both precise grounding and robust long-range reasoning. We identify the "almost-there" phenomenon in long-context reasoning, where trajectories are largely correct but fail at the final step, and attribute this failure to two factors: (1) the lack of high reasoning density in long-context QA data that push LLMs beyond mere grounding toward sophisticated multi-hop reasoning; and (2) the loss of valuable learning signals during long-context RL training due to the indiscriminate penalization of partially correct trajectories with incorrect outcomes. To overcome this bottleneck, we propose DeepReasonQA, a KG-driven synthesis framework that controllably constructs high-difficulty, multi-hop long-context QA pairs with inherent reasoning chains. Building on this, we introduce Long-context Process Advantage Shaping (LongPAS), a simple yet effective method that performs fine-grained credit assignment by evaluating reasoning steps along Validity and Relevance dimensions, which captures critical learning signals from "almost-there" trajectories. Experiments on three long-context reasoning benchmarks show that our approach substantially outperforms RLVR baselines and matches frontier LLMs while using far fewer parameters. Further analysis confirms the effectiveness of our methods in strengthening long-context reasoning while maintaining stable RL training.

preprint, 2026, arXiv

Residue Theorem, Regularization and Parity Theorem

In this paper, we employ contour integration and residue calculus to derive explicit parity formulas for (cyclotomic) multiple zeta values (MZVs). A key innovation lies in applying double shuffle regularization to the contour integrals, which leads to two distinct regularized parity formulas, one via shuffle and one via stuffle regularization. Notably, this demonstrates for the first time that the contour integral method can be extended to the regularized setting (including the case $k_r=1$), thereby overcoming a limitation of previous approaches. Our results not only provide explicit parity relations at arbitrary depths but also lay the groundwork for extending this technique to other variants of multiple zeta values.
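For readers unfamiliar with parity relations, the simplest classical instance is Euler's identity ζ(2,1) = ζ(3), which expresses a depth-2 value of weight 3 through a depth-1 value. The sketch below checks it numerically with truncated sums. It assumes the convention ζ(2,1) = Σ_{m>n≥1} 1/(m² n), which may differ from the ordering used in the paper, and it does not reproduce the paper's contour-integral method; the function names are illustrative.

```python
def zeta(s, terms):
    """Truncated Riemann zeta sum: sum of 1/n^s for n = 1..terms."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


def zeta_2_1(terms):
    """Truncated double sum over m > n >= 1 of 1/(m^2 * n).

    The inner sum over n is the harmonic number H_{m-1}, accumulated on the fly.
    """
    total = 0.0
    harmonic = 0.0  # H_{m-1}, starting from H_0 = 0
    for m in range(1, terms + 1):
        total += harmonic / m**2
        harmonic += 1.0 / m
    return total


if __name__ == "__main__":
    n = 100_000
    # The double sum converges slowly (error of order log(n)/n),
    # so the two values agree only to a few decimal places here.
    print(zeta_2_1(n), zeta(3, n))
```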

preprint, 2026, arXiv

Self-regulated emergence of heavy-tailed weight distributions in evolving complex network architectures

Synaptic plasticity typically produces heavy-tailed distributions of synaptic strengths, consisting of a few strong connections among many weaker ones. Meanwhile, structural plasticity relies on distinct signaling cascades to reshape network topology. We propose a model in which both types of plasticity adhere to the Hebbian principle while operating within homeostatically regulated activity. Synaptic plasticity alone generates heavy-tailed weight distributions, but only when any activity spreading beyond neighboring units is discarded. However, when combined with Hebbian structural plasticity, i.e., adaptive rewiring, heavy-tailed weight distributions also arise with more extensive activity flow. Furthermore, adaptive rewiring provides complex network structures with convergent-divergent circuits similar to those that facilitate signal transmission throughout the nervous system. Having adaptive weight adjustment and rewiring driven by the same homeostatic dynamics gives our model a parsimonious and robust framework that simultaneously produces heavy-tailed weight distributions and convergent-divergent units under a wide range of dynamical regimes. Consequently, the model accounts for key connectivity structures in both C. elegans and the mouse, suggesting that its principles are shared across species of different complexities.

preprint, 2026, arXiv

TableCache: Primary Foreign Key Guided KV Cache Precomputation for Low Latency Text-to-SQL

In Text-to-SQL tasks, existing LLM-based methods often include extensive database schemas in prompts, leading to long context lengths and increased prefilling latency. While user queries typically focus on recurrent table sets, offering an opportunity for KV cache sharing across queries, current inference engines, such as SGLang and vLLM, generate redundant prefix cache copies when processing user queries with varying table orders. To address this inefficiency, we propose precomputing table representations as KV caches offline and querying the required ones online. A key aspect of our approach is the computation of table caches while preserving primary foreign key relationships between tables. Additionally, we construct a Table Trie structure to facilitate efficient KV cache lookups during inference. To enhance cache performance, we introduce a cache management system with a query reranking strategy to improve cache hit rates and a computation loading pipeline for parallelizing model inference and cache loading. Experimental results show that our proposed TableCache achieves up to a 3.62x speedup in Time to First Token (TTFT) with negligible performance degradation.
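The abstract does not describe the Table Trie in detail, so the following is only a rough sketch of how a trie keyed by table names could make cache lookups insensitive to table order. The class name, the `cache_id` strings, and the choice of alphabetical sorting as the canonical order are all assumptions made for illustration; the paper orders tables using primary foreign key relationships, which is not modeled here.

```python
class TableTrie:
    """Hypothetical trie mapping a set of table names to a precomputed cache id."""

    def __init__(self):
        self.children = {}
        self.cache_id = None  # set only on nodes that end a stored table set

    def insert(self, tables, cache_id):
        """Store cache_id under the canonical (sorted) sequence of table names."""
        node = self
        for name in sorted(tables):  # assumption: alphabetical canonical order
            node = node.children.setdefault(name, TableTrie())
        node.cache_id = cache_id

    def lookup(self, tables):
        """Return the cache id stored for this table set, or None on a miss."""
        node = self
        for name in sorted(tables):
            node = node.children.get(name)
            if node is None:
                return None
        return node.cache_id


if __name__ == "__main__":
    trie = TableTrie()
    trie.insert(["orders", "customers"], "kv-0")
    # The same table set in a different order resolves to the same entry.
    print(trie.lookup(["customers", "orders"]))
    print(trie.lookup(["customers", "items"]))
```

Because both `insert` and `lookup` canonicalize the table order, two queries that mention the same tables in different orders share one cache entry instead of producing separate prefix copies.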

preprint, 2026, arXiv

Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval

Graph-based Retrieval-Augmented Generation (GraphRAG) has become an important paradigm for enhancing Large Language Models (LLMs) with external knowledge. However, existing approaches are constrained by their reliance on high-quality knowledge graphs: manually built ones are not scalable, while automatically extracted ones are limited by the performance of LLM extractors, especially when using smaller, locally deployed models. To address this, we introduce Think-on-Graph 3.0 (ToG-3), a novel framework featuring a Multi-Agent Context Evolution and Retrieval (MACER) mechanism. Its core contribution is the dynamic construction and iterative refinement of a Chunk-Triplets-Community heterogeneous graph index, powered by a Dual-Evolution process that adaptively evolves both the query and the retrieved sub-graph during reasoning. ToG-3 dynamically builds a targeted graph index tailored to the query, enabling precise evidence retrieval and reasoning even with lightweight LLMs. Extensive experiments demonstrate that ToG-3 outperforms the compared baselines on both deep and broad reasoning benchmarks, and ablation studies confirm the efficacy of the components of the MACER framework. The source code is available at https://github.com/DataArcTech/ToG-3.

preprint, 2026, arXiv

Unlocking the Potentials of Retrieval-Augmented Generation for Diffusion Language Models

Diffusion Language Models (DLMs) have recently demonstrated remarkable capabilities in natural language processing tasks. However, the potential of Retrieval-Augmented Generation (RAG), which has shown great success in enhancing large language models (LLMs), has not been well explored for DLMs, due to the fundamental difference between LLM and DLM decoding. To fill this critical gap, we systematically test the performance of DLMs within the RAG framework. Our findings reveal that DLMs coupled with RAG show promising potential and a stronger dependency on contextual information, but suffer from limited generation precision. We identify a key underlying issue: Response Semantic Drift (RSD), where the generated answer progressively deviates from the query's original semantics, leading to low-precision content. We trace this problem to the denoising strategies in DLMs, which fail to maintain semantic alignment with the query throughout the iterative denoising process. To address this, we propose Semantic-Preserving REtrieval-Augmented Diffusion (SPREAD), a novel framework that introduces a query-relevance-guided denoising strategy. By actively guiding the denoising trajectory, SPREAD ensures the generation remains anchored to the query's semantics and effectively suppresses drift. Experimental results demonstrate that SPREAD significantly enhances the precision of generated answers and effectively mitigates RSD within the RAG framework.