Source author record

Yang Yu

Yang Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, Computer Vision, Machine Learning, quant-ph, cs.CY, hep-ex, hep-lat, hep-ph, nucl-ex, nucl-th

Catalog footprint

What is connected

8 works

10 topics

4 close collaborators

Preprint, 2026, arXiv

Demonstration of Discrete-Time Quantum Walks and Observation of Topological Edge States in a Superconducting Qutrit Chain

Quantum walks serve as a versatile tool for universal quantum computing and algorithmic research. However, the implementation of discrete-time quantum walks (DTQWs) with superconducting circuits is still constrained by limitations such as operation precision, circuit depth and connectivity. With improved hardware efficiency by using superconducting qutrits (three-level systems), we experimentally demonstrate a scalable DTQW in a superconducting circuit, observing the ballistic spreading of the quantum walk in a qutrit chain. The use of qutrits in our implementation allows hardware-efficient encoding of the walker position and the coin degree of freedom. By exploiting the flexibility and intrinsic symmetries of qutrit-based DTQWs, we successfully prepare two topological phases in the chain. For the first time, particle-hole-symmetry-protected edge states, bound at the interface between these two topological phases, are observed in the superconducting platform. Measured parameter dependencies further validate the properties of edge states. The scalability and gate-control compatibility of the demonstrated DTQWs enable a versatile tool for superconducting quantum computing and quantum simulation.
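
The ballistic spreading mentioned in this abstract can be illustrated with a textbook simulation. The sketch below is not the paper's qutrit encoding or its hardware protocol: it assumes a standard two-state Hadamard coin on an infinite line, and the function names `hadamard_walk` and `spread` are hypothetical. It only shows that the walker's standard deviation grows roughly linearly with the number of steps, compared with the square-root growth of a classical random walk.

```python
import math

def hadamard_walk(steps):
    """Position probabilities after `steps` of a Hadamard-coin walk on a line.

    Assumption: textbook two-state coin, not the qutrit encoding of the paper.
    Index x in the returned list corresponds to position x - steps.
    """
    size = 2 * steps + 1
    h = 1 / math.sqrt(2)
    # amp[c][x]: amplitude of coin state c (0 moves left, 1 moves right) at site x
    amp = [[0j] * size for _ in range(2)]
    # Symmetric initial coin state (|0> + i|1>)/sqrt(2) at the origin
    amp[0][steps] = h
    amp[1][steps] = 1j * h
    for _ in range(steps):
        new = [[0j] * size for _ in range(2)]
        for x in range(size):
            a0, a1 = amp[0][x], amp[1][x]
            if a0 == 0 and a1 == 0:
                continue  # nothing to propagate from an empty site
            # Coin toss: apply the Hadamard matrix to the coin state
            c0 = h * (a0 + a1)
            c1 = h * (a0 - a1)
            # Conditional shift: coin 0 steps left, coin 1 steps right
            new[0][x - 1] += c0
            new[1][x + 1] += c1
        amp = new
    return [abs(amp[0][x]) ** 2 + abs(amp[1][x]) ** 2 for x in range(size)]

def spread(steps):
    """Standard deviation of the walker position after `steps` steps."""
    probs = hadamard_walk(steps)
    mean = sum((x - steps) * p for x, p in enumerate(probs))
    var = sum((x - steps - mean) ** 2 * p for x, p in enumerate(probs))
    return math.sqrt(var)

if __name__ == "__main__":
    for t in (25, 50, 100):
        print(t, round(spread(t), 2), "classical:", round(math.sqrt(t), 2))
```

Running the script prints the quantum spread next to the classical value sqrt(t) for a few step counts, which makes the difference in scaling visible.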

Preprint, 2026, arXiv

ELMM: Efficient Lightweight Multimodal Large Language Models for Multimodal Knowledge Graph Completion

Multimodal Knowledge Graphs (MKGs) extend traditional knowledge graphs by incorporating visual and textual modalities, enabling richer and more expressive entity representations. However, existing MKGs often suffer from incompleteness, which hinders their effectiveness in downstream tasks. Therefore, the multimodal knowledge graph completion (MKGC) task is receiving increasing attention. While large language models (LLMs) have shown promise for knowledge graph completion (KGC), their application to the multimodal setting remains underexplored. Moreover, applying Multimodal Large Language Models (MLLMs) to the task of MKGC introduces significant challenges: (1) the large number of image tokens per entity leads to semantic noise and modality conflicts, and (2) processing large token inputs incurs a high computational cost. To address these issues, we propose Efficient Lightweight Multimodal Large Language Models (ELMM) for MKGC. ELMM introduces a Multi-view Visual Token Compressor (MVTC) based on a multi-head attention mechanism, which adaptively compresses image tokens from both textual and visual views, thereby effectively reducing redundancy while retaining necessary information and avoiding modality conflicts. Additionally, we design an attention pruning strategy to remove redundant attention layers from MLLMs, thereby significantly reducing the inference cost. We further introduce a linear projection to compensate for the performance degradation caused by pruning. Extensive experiments on four benchmark datasets demonstrate that ELMM achieves state-of-the-art performance.

Preprint, 2026, arXiv

Exploring Reliable Spatiotemporal Dependencies for Efficient Visual Tracking

Recent advances in transformer-based lightweight object tracking have established new standards across benchmarks, leveraging the global receptive field and powerful feature extraction capabilities of attention mechanisms. Despite these achievements, existing methods universally employ sparse sampling during training--utilizing only one template and one search image per sequence--which fails to comprehensively explore spatiotemporal information in videos. This limitation constrains performance and causes the gap between lightweight and high-performance trackers. To bridge this divide while maintaining real-time efficiency, we propose STDTrack, a framework that pioneers the integration of reliable spatiotemporal dependencies into lightweight trackers. Our approach implements dense video sampling to maximize spatiotemporal information utilization. We introduce a temporally propagating spatiotemporal token to guide per-frame feature extraction. To ensure comprehensive target state representation, we design the Multi-frame Information Fusion Module (MFIFM), which augments current dependencies using historical context. The MFIFM operates on features stored in our constructed Spatiotemporal Token Maintainer (STM), where a quality-based update mechanism ensures information reliability. Considering the scale variation among tracking targets, we develop a multi-scale prediction head to dynamically adapt to objects of different sizes. Extensive experiments demonstrate state-of-the-art results across six benchmarks. Notably, on GOT-10k, STDTrack rivals certain high-performance non-real-time trackers (e.g., MixFormer) while operating at 192 FPS (GPU) and 41 FPS (CPU).

Preprint, 2026, arXiv

Helicity Dependent Distribution Functions of the Proton and $Λ$ and $Σ^0$ Baryons

Using continuum Schwinger function methods, a coherent set of predictions for proton, $Λ$ and $Σ^0$ distribution functions (DFs) has been made available -- both helicity dependent and unpolarised. The results and comparisons between them reveal impacts of diquark correlations and SU$(3)$-flavour symmetry breaking, some of which are highlighted in this contribution. For instance: in-proton ratios of helicity-dependent/unpolarised valence-quark DFs are presented; it is highlighted that, were it not for the presence of axialvector diquarks in the $Σ^0$, the valence strange quark would carry none of the $Σ^0$ spin; and the sign and size of polarised gluon DFs are discussed -- at a scale typical of modern measurements, gluon partons carry roughly 40% of each octet baryon's spin.

Preprint, 2026, arXiv

MedHorizon: Towards Long-context Medical Video Understanding in the Wild

Medical multimodal large language models (MLLMs) have advanced image understanding and short-video analysis, but real clinical review often requires full-procedure video understanding. Unlike general long videos, medical procedures contain highly redundant anatomical views, while decisive evidence is temporally sparse, spatially subtle, and context dependent. Existing benchmarks often assume this evidence has already been localized through images, short clips, or pre-segmented videos, leaving the retrieval-before-reasoning problem under-tested. We introduce MedHorizon, an in-the-wild benchmark for long-context medical video understanding. MedHorizon preserves 759 hours of full-length clinical procedures and provides 1,253 evidence-grounded multiple-choice questions that jointly evaluate sparse evidence understanding and multi-hop clinical reasoning. Its evidence is extremely sparse, with only 0.166% evidence frames on average, requiring models to search noisy procedural streams before interpreting and aggregating findings. We evaluate representative general-domain, medical-domain, and long-video MLLMs. The best model reaches only 41.1% accuracy, showing that current systems remain far from robust full-procedure understanding. Further analysis yields four key findings: performance does not scale reliably with more frames; evidence retrieval and clinical interpretation remain primary bottlenecks; these bottlenecks are rooted in weak procedural reasoning and attention drift under redundancy; and generic sampling methods only partially balance local detail with global coverage. MedHorizon provides a rigorous testbed for MLLMs that retrieve sparse evidence and reason over complete clinical workflows.

Preprint, 2026, arXiv

Towards Public Administration Research Based on Interpretable Machine Learning

Causal relationships play a pivotal role in research within the field of public administration. Ensuring reliable causal inference requires validating the predictability of these relationships, which is a crucial precondition. However, prediction has not garnered adequate attention within the realm of quantitative research in public administration and the broader social sciences. The advent of interpretable machine learning presents a significant opportunity to integrate prediction into quantitative research conducted in public administration. This article delves into the fundamental principles of interpretable machine learning while also examining its current applications in social science research. Building upon this foundation, the article further expounds upon the implementation process of interpretable machine learning, encompassing key aspects such as dataset construction, model training, model evaluation, and model interpretation. Lastly, the article explores the disciplinary value of interpretable machine learning within the field of public administration, highlighting its potential to enhance the generalization of inference, facilitate the selection of optimal explanations for phenomena, stimulate the construction of theoretical hypotheses, and provide a platform for the translation of knowledge. As a complement to traditional causal inference methods, interpretable machine learning ushers in a new era of credibility in quantitative research within the realm of public administration.

Preprint, 2026, arXiv

Using Subgraph GNNs for Node Classification: an Overlooked Potential Approach

Previous studies have demonstrated the strong performance of Graph Neural Networks (GNNs) in node classification. However, most existing GNNs adopt a node-centric perspective and rely on global message passing, leading to high computational and memory costs that hinder scalability. To mitigate these challenges, subgraph-based methods have been introduced, leveraging local subgraphs as approximations of full computational trees. While this approach improves efficiency, it often suffers from performance degradation due to the loss of global contextual information, limiting its effectiveness compared to global GNNs. To address this trade-off between scalability and classification accuracy, we reformulate the node classification task as a subgraph classification problem and propose SubGND (Subgraph GNN for NoDe). This framework introduces a differentiated zero-padding strategy and an Ego-Alter subgraph representation method to resolve label conflicts while incorporating an Adaptive Feature Scaling Mechanism to dynamically adjust feature contributions based on dataset-specific dependencies. Experimental results on six benchmark datasets demonstrate that SubGND achieves performance comparable to or surpassing global message-passing GNNs, particularly in heterophilic settings, highlighting its effectiveness and scalability as a promising solution for node classification.
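
The reformulation of node classification as subgraph classification starts from a local subgraph around each target node. The sketch below shows one generic way to extract such a subgraph, a k-hop ego network found by breadth-first search. It is an illustration under stated assumptions, not SubGND itself: the abstract does not specify how subgraphs are built, and the zero-padding, Ego-Alter representation and feature-scaling steps are not modelled here. The function name `ego_subgraph` and the adjacency-dict input format are hypothetical.

```python
from collections import deque

def ego_subgraph(adj, center, hops):
    """Return (nodes, edges) of the `hops`-hop ego network around `center`.

    Assumption: `adj` maps each node to an iterable of its neighbours in an
    undirected graph, with comparable node labels (for example integers).
    """
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == hops:
            continue  # do not expand beyond the hop limit
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nodes = sorted(dist)
    # Keep each undirected edge once, and only if both endpoints were reached
    edges = sorted(
        (u, v) for u in nodes for v in adj.get(u, ()) if v in dist and u < v
    )
    return nodes, edges

if __name__ == "__main__":
    # A path graph 0-1-2-3-4
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(ego_subgraph(path, 2, 1))
```

In a subgraph-classification setup, each extracted subgraph would inherit the label of its center node and be passed to a graph-level classifier in place of full-graph message passing.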

Preprint, 2025, arXiv

Tunable Hybrid-Mode Coupler Enabling Strong Interactions between Transmons at Centimeter-Scale Distance

The transmon, a fabrication-friendly superconducting qubit, remains a leading candidate for scalable quantum computing. Recent advances in tunable couplers have accelerated progress toward high-performance quantum processors. However, extending coherent interactions beyond millimeter scales to enhance quantum connectivity presents a critical challenge. Here, we introduce a hybrid-mode coupler exploiting resonator-transmon hybridization to simultaneously engineer the two lowest-frequency modes, enabling high-contrast coupling between centimeter-scale transmons. For a 1-cm coupler, our framework predicts flux-tunable $XX$ and $ZZ$ coupling strengths reaching 23 MHz and 100 MHz, with modulation contrasts exceeding $10^2$ and $10^4$, respectively, demonstrating quantitative agreement with an effective two-channel model. This work provides an efficient pathway to mitigate the inherent connectivity constraints imposed by short-range interactions, enabling transmon-based architectures compatible with hardware-efficient quantum tasks.
