Researcher profile

Yizhi Li

Yizhi Li contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

preprint | 2026 | arXiv

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Large Language Models (LLMs) apply uniform computation to all tokens, despite language exhibiting highly non-uniform information density. This token-uniform regime wastes capacity on locally predictable spans while under-allocating computation to semantically critical transitions. We propose Dynamic Large Concept Models (DLCM), a hierarchical language modeling framework that learns semantic boundaries from latent representations and shifts computation from tokens to a compressed concept space where reasoning is more efficient. DLCM discovers variable-length concepts end-to-end without relying on predefined linguistic units. Hierarchical compression fundamentally changes scaling behavior. We introduce the first compression-aware scaling law, which disentangles token-level capacity, concept-level reasoning capacity, and compression ratio, enabling principled compute allocation under fixed FLOPs. To stably train this heterogeneous architecture, we further develop a decoupled μP parametrization that supports zero-shot hyperparameter transfer across widths and compression regimes. At a practical setting (R = 4, corresponding to an average of four tokens per concept), DLCM reallocates roughly one-third of inference compute into a higher-capacity reasoning backbone, achieving a +2.69% average improvement across 12 zero-shot benchmarks under matched inference FLOPs.
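
To make the idea of variable-length concepts and a compression ratio concrete, here is a minimal sketch. It is not the DLCM method: the abstract says boundaries are learned end-to-end, whereas this example swaps in a fixed cosine-similarity threshold between adjacent hidden states, followed by mean pooling. The function names, the threshold value, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def segment(hidden, threshold=0.5):
    """Group a non-empty list of token states into variable-length segments.

    Assumption: a boundary is placed wherever adjacent states are
    dissimilar (cosine below `threshold`). A learned boundary predictor
    would replace this heuristic.
    """
    segments = [[hidden[0]]]
    for prev, cur in zip(hidden, hidden[1:]):
        if cosine(prev, cur) < threshold:
            segments.append([cur])       # start a new concept
        else:
            segments[-1].append(cur)     # extend the current concept
    return segments


def pool(seg):
    """Mean-pool one segment of token states into a single concept vector."""
    n = len(seg)
    return [sum(column) / n for column in zip(*seg)]


def compression_ratio(hidden, segments):
    """Average number of tokens per concept (the R of the abstract)."""
    return len(hidden) / len(segments)


# Toy example: four 2-d token states that fall into two groups.
hidden = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
segments = segment(hidden)
concepts = [pool(seg) for seg in segments]
ratio = compression_ratio(hidden, segments)
```

In this toy input the four token states collapse into two concept vectors, so the ratio is 2; the setting reported in the abstract corresponds to an average of four tokens per concept.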

preprint | 2026 | arXiv

Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements

Benchmarks play a crucial role in tracking the rapid advancement of large language models (LLMs) and identifying their capability boundaries. However, existing benchmarks predominantly curate questions at the question level, suffering from three fundamental limitations: vulnerability to data contamination, restriction to single-knowledge-point assessment, and reliance on costly domain expert annotation. We propose Encyclo-K, a statement-based benchmark that rethinks benchmark construction from the ground up. Our key insight is that knowledge statements, not questions, can serve as the unit of curation, and questions can then be constructed from them. We extract standalone knowledge statements from authoritative textbooks and dynamically compose them into evaluation questions through random sampling at test time. This design directly addresses all three limitations: the combinatorial space is too vast to memorize, and model rankings remain stable across dynamically generated question sets, enabling reliable periodic dataset refresh; each question aggregates 8-10 statements for comprehensive multi-knowledge assessment; annotators only verify formatting compliance without requiring domain expertise, substantially reducing annotation costs. Experiments on over 50 LLMs demonstrate that Encyclo-K poses substantial challenges with strong discriminative power. Even the top-performing OpenAI-GPT-5.1 achieves only 62.07% accuracy, and model performance displays a clear gradient distribution--reasoning models span from 16.04% to 62.07%, while chat models range from 9.71% to 50.40%. These results validate the challenges introduced by dynamic evaluation and multi-statement comprehensive understanding. These findings establish Encyclo-K as a scalable framework for dynamic evaluation of LLMs' comprehensive understanding over multiple fine-grained disciplinary knowledge statements.
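
The statement-based construction described above can be sketched as follows. This is an illustrative sketch, not the Encyclo-K pipeline: the abstract only states that standalone statements are randomly sampled at test time and that each question aggregates 8-10 statements. The split into correct and incorrect statements, the "which statements are correct" answer format, and all function names are assumptions.

```python
import math
import random


def compose_question(correct, incorrect, k=8, seed=None):
    """Compose one question by sampling k statements from the pool.

    Assumption: the pool holds statements labeled correct or incorrect,
    and the answer is the set of positions holding correct statements.
    """
    rng = random.Random(seed)
    pool = [(text, True) for text in correct] + [(text, False) for text in incorrect]
    picked = rng.sample(pool, k)  # sampling without replacement
    statements = [text for text, _ in picked]
    answer = [i for i, (_, is_correct) in enumerate(picked) if is_correct]
    return statements, answer


def question_space_size(pool_size, k):
    """Number of distinct k-statement subsets that can be drawn from the pool."""
    return math.comb(pool_size, k)


# Hypothetical pool of 12 placeholder statements.
correct = [f"correct statement {i}" for i in range(6)]
incorrect = [f"incorrect statement {i}" for i in range(6)]
statements, answer = compose_question(correct, incorrect, k=8, seed=0)
```

The size of the combinatorial space is what makes memorization impractical: under these assumptions, a pool of only 1,000 statements already yields more than 10^19 distinct 8-statement subsets (`question_space_size(1000, 8)`).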

preprint | 2026 | arXiv

High-Fidelity Universal Quantum Gate Compilation for Non-semisimple Ising Anyons via Genetic Algorithm-Optimized Solovay-Kitaev Decomposition

We present a systematic numerical construction of a universal quantum gate set for topological quantum computation based on the non-semisimple Ising anyon model. By employing a Genetic Algorithm-enhanced Solovay-Kitaev Algorithm (GA-enhanced SKA), we achieve high-fidelity approximations of standard single-qubit gates (Hadamard H-gate and phase T-gate) with a recursion level of just three, meeting the fidelity requirements for fault-tolerant quantum computation. Our numerical results demonstrate that for the critical parameter range α ∈ [2.001, 2.022], a few braiding operations can approximate the local equivalence class [CNOT] with high precision. Specifically, at α = 2.012, 2.015, 2.020, and 2.022, we successfully construct a universal gate set {H, T, CNOT} with two-qubit gate leakage errors below 0.07, 0.08, 0.09, and 0.10, respectively. This work establishes a new pathway towards universal quantum computation using non-semisimple Ising anyons, overcoming the limitations of traditional Ising models through optimized braiding sequences and Genetic Algorithm-driven compilation.

preprint | 2025 | arXiv

OmniBench: Towards The Future of Universal Omni-Language Models

Recent advancements in multimodal large language models (MLLMs) have aimed to integrate and interpret data across diverse modalities. However, the capacity of these models to concurrently process and reason about multiple modalities remains underexplored, partly due to the lack of comprehensive modality-wise benchmarks. We introduce OmniBench, a novel benchmark designed to rigorously evaluate models' ability to recognize, interpret, and reason across visual, acoustic, and textual inputs simultaneously. We define language models capable of such tri-modal processing as omni-language models (OLMs). OmniBench is distinguished by high-quality human annotations, ensuring that accurate responses require integrated understanding and reasoning across all three modalities. Our main findings reveal that: i) open-source OLMs exhibit critical limitations in instruction-following and reasoning capabilities within tri-modal contexts; and ii) most baseline models perform poorly (below 50% accuracy) even when provided with alternative textual representations of images and/or audio. These results suggest that the ability to construct a consistent context from text, image, and audio is often overlooked in existing MLLM training paradigms. To address this gap, we curate an instruction tuning dataset of 84.5K training samples, OmniInstruct, for training OLMs to adapt to tri-modal contexts. We advocate for future research to focus on developing more robust tri-modal integration techniques and training strategies to enhance OLMs. Code, data, and a live leaderboard can be found at https://m-a-p.ai/OmniBench.

preprint | 2024 | arXiv

Nearest-Neighboring Pairing of Monolayer NbSe2 Facilitates the Emergence of Topological Superconducting States

NbSe2, which simultaneously exhibits superconductivity and spin-orbit coupling, is anticipated to pave the way for topological superconductivity and unconventional electron pairing. In this paper, we systematically study topological superconducting (TSC) phases in monolayer NbSe2 by mixing on-site s-wave pairing (ps) with nearest-neighbor pairing (psA1) based on a tight-binding model. We observe rich phases with both fixed and sensitive Chern numbers (CNs) depending on the chemical potential (μ) and out-of-plane magnetic field (Vz). As psA1 increases, the TSC phase manifests matching and mismatching features according to whether there is a bulk-boundary correspondence (BBC). Strikingly, the introduction of mixed-wave pairing significantly reduces the critical Vz needed to form TSC phases compared with pure s-wave pairing. Moreover, the TSC phase can be modulated even at Vz = 0 under appropriate μ and psA1, which is identified by the robust topological edge states (TESs) of ribbons. Additionally, the mixed pairing influences the hybridization of bulk and edge states, resulting in a matching/mismatching BBC with localized/oscillating TESs on the ribbon. Our findings are helpful for the experimental realization of TSC states, as well as for designing and regulating TSC materials.

preprint | 2023 | arXiv

CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation

As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which requires the models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.

preprint | 2023 | arXiv

Sam-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning

With the development of multimodality and large language models, the deep learning-based technique for medical image captioning holds the potential to offer valuable diagnostic recommendations. However, current generic text and image pre-trained models do not yield satisfactory results when it comes to describing intricate details within medical images. In this paper, we present a novel medical image captioning method guided by the segment anything model (SAM) to enable enhanced encoding with both general and detailed feature extraction. In addition, our approach employs a distinctive pre-training strategy with mixed semantic learning to simultaneously capture both the overall information and finer details within medical images. We demonstrate the effectiveness of this approach, as it outperforms the pre-trained BLIP2 model on various evaluation metrics for generating descriptions of medical images.