Source author record

Lu Qi

Lu Qi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, quant-ph, cond-mat.mes-hall, Artificial Intelligence, Machine Learning, math.AC, math.AG, math.DG

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

Report of the 5th PVUW Challenge: Towards More Diverse Modalities in Pixel-Level Understanding

This report summarizes the objectives, datasets, and top-performing methodologies of the 2026 Pixel-level Video Understanding in the Wild (PVUW) Challenge, hosted at CVPR 2026, which evaluates state-of-the-art models under highly unconstrained conditions. To provide a comprehensive assessment, the 2026 edition features three specialized tracks: the MOSE track for tracking objects within densely cluttered and severely occluded scenarios; the MeViS-Text track for localizing targets via motion-focused linguistic expressions; and the newly inaugurated MeViS-Audio track, which pioneers acoustic-driven object segmentation. By introducing previously unreleased challenging data and analyzing the cutting-edge, multimodal solutions submitted by participants, this report highlights the community's latest technical advancements and charts promising future directions for robust video scene comprehension.

preprint · 2026 · arXiv

Video Prediction Transformers without Recurrence or Convolution

Video prediction has witnessed the emergence of RNN-based models led by ConvLSTM, and CNN-based models led by SimVP. Following the significant success of ViT, recent works have integrated ViT into both RNN and CNN frameworks, achieving improved performance. While we appreciate these prior approaches, we raise a fundamental question: Is there a simpler yet more effective solution that can eliminate the high computational cost of RNNs while addressing the limited receptive fields and poor generalization of CNNs? How far can it go with a simple pure transformer model for video prediction? In this paper, we propose PredFormer, a framework entirely based on Gated Transformers. We provide a comprehensive analysis of 3D Attention in the context of video prediction. Extensive experiments demonstrate that PredFormer delivers state-of-the-art performance across four standard benchmarks. The significant improvements in both accuracy and efficiency highlight the potential of PredFormer as a strong baseline for real-world video prediction applications. The source code and trained models will be released at https://github.com/yyyujintang/PredFormer.

preprint · 2022 · arXiv

ACC for local volumes and boundedness of singularities

The ACC conjecture for local volumes predicts that the set of local volumes of klt singularities $x\in (X,Δ)$ satisfies the ACC if the coefficients of $Δ$ belong to a DCC set. In this paper, we prove the ACC conjecture for local volumes under the assumption that the ambient germ is analytically bounded. We introduce another related conjecture, which predicts the existence of $δ$-plt blow-ups of a klt singularity whose local volume has a positive lower bound. We show that the latter conjecture also holds when the ambient germ is analytically bounded. Moreover, we prove that both conjectures hold in dimension 2 as well as for 3-dimensional terminal singularities.

preprint · 2022 · arXiv

Automatically Discovering Novel Visual Categories with Self-supervised Prototype Learning

This paper tackles the problem of novel category discovery (NCD), which aims to discriminate unknown categories in large-scale image collections. The NCD task is challenging due to the closeness to the real-world scenarios, where we have only encountered some partial classes and images. Unlike other works on the NCD, we leverage the prototypes to emphasize the importance of category discrimination and alleviate the issue of missing annotations of novel classes. Concretely, we propose a novel adaptive prototype learning method consisting of two main stages: prototypical representation learning and prototypical self-training. In the first stage, we obtain a robust feature extractor, which could serve for all images with base and novel categories. This ability of instance and category discrimination of the feature extractor is boosted by self-supervised learning and adaptive prototypes. In the second stage, we utilize the prototypes again to rectify offline pseudo labels and train a final parametric classifier for category clustering. We conduct extensive experiments on four benchmark datasets and demonstrate the effectiveness and robustness of the proposed method with state-of-the-art performance.

preprint · 2022 · arXiv

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation

To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data. We show that these two approaches, at the two extreme ends of the task-specificity spectrum, are suboptimal for the task performance. Utilizing too few task-specific training signals causes underfitting to the ground-truth labels of downstream tasks, while the opposite causes overfitting to the ground-truth labels. To this end, we propose a novel Class-Agnostic Semi-Supervised Learning (CA-SSL) framework to achieve a more favorable task-specificity balance in extracting training signals from unlabeled data. CA-SSL has three training stages that act on either ground-truth labels (labeled data) or pseudo labels (unlabeled data). This decoupling strategy avoids the complicated scheme in traditional SSL methods that balances the contributions from both data types. In particular, we introduce a warmup training stage to achieve a better balance in task specificity by ignoring class information in the pseudo labels, while preserving localization training signals. As a result, our warmup model can better avoid underfitting/overfitting when fine-tuned on the ground-truth labels in detection and segmentation tasks. Using 3.6M unlabeled data, we achieve a significant performance gain of 4.7% over ImageNet-pretrained baseline on FCOS object detection. In addition, our warmup model demonstrates excellent transferability to other detection and segmentation frameworks.

preprint · 2022 · arXiv

MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Recent studies have shown the importance of modeling long-range interactions in the inpainting problem. To achieve this goal, existing approaches exploit either standalone attention techniques or transformers, but usually under a low resolution in consideration of computational cost. In this paper, we present a novel transformer-based model for large hole inpainting, which unifies the merits of transformers and convolutions to efficiently process high-resolution images. We carefully design each component of our framework to guarantee the high fidelity and diversity of recovered images. Specifically, we customize an inpainting-oriented transformer block, where the attention module aggregates non-local information only from partial valid tokens, indicated by a dynamic mask. Extensive experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets. Code is released at https://github.com/fenglinglwb/MAT.

preprint · 2022 · arXiv

Quantum transport in a one-dimensional quasicrystal with mobility edges

Quantum transport in a one-dimensional (1D) quasiperiodic lattice with mobility edges is explored. We first investigate the adiabatic pumping between left and right edge modes by resorting to two edge-bulk-edge channels and demonstrate that the success or failure of the adiabatic pumping depends on whether the corresponding bulk subchannel undergoes a localization-delocalization transition. Compared with the paradigmatic Aubry-André (AA) model, the introduction of mobility edges triggers an opposite outcome for successful pumping in the two channels, showing a discrepancy of critical condition, and facilitates the robustness of the adiabatic pumping against quasidisorder. We also consider the transfer between excitations at both boundaries of the lattice and an anomalous phenomenon characterized by the enhanced quasidisorder contributing to the excitation transfer is found. Furthermore, there exists a parametric regime where a nonreciprocal effect emerges in the presence of mobility edges, which leads to a unidirectional transport for the excitation transfer and enables potential applications in the engineering of quantum diodes.

preprint · 2020 · arXiv

Dissipation-induced topological phase transition and periodic-driving-induced photonic topological state transfer in a small optomechanical lattice

We propose a scheme to investigate the topological phase transition and the topological state transfer based on a small optomechanical lattice in a realistic parameter regime. We find that the optomechanical lattice can be made equivalent to a topologically nontrivial Su-Schrieffer-Heeger (SSH) model by designing the effective optomechanical coupling. In particular, the optomechanical lattice undergoes a phase transition between the topologically nontrivial SSH phase and the topologically trivial SSH phase when the decay of the cavity field and the optomechanical coupling are controlled. We stress that the topological phase transition is mainly induced by the decay of the cavity field, which is counter-intuitive since dissipation is usually detrimental to the system. We also investigate the photonic state transfer between the two cavity fields via the topologically protected edge channel based on the small optomechanical lattice. We find that the quantum state transfer assisted by the topological zero energy mode can be achieved by applying external lasers with periodic driving amplitudes to the cavity fields. Our scheme provides fundamental and insightful explanations toward the mapping of the photonic topological insulator based on the micro-nano optomechanical quantum optical platform.

preprint · 2020 · arXiv

Engineering of topological state transfer and topological beam splitter in an even-size Su-Schrieffer-Heeger chain

The usual Su-Schrieffer-Heeger model with an even number of lattice sites possesses two degenerate zero energy modes. The degeneracy of the zero energy modes leads to the mixing between the topological left and right edge states, which makes it difficult to implement the state transfer via the topological edge channel. Here, inspired by Rice-Mele topological pumping, we find that staggered periodic next-nearest-neighbor hoppings can also separate the initially mixed edge states, which ensures the state transfer between the topological left and right edge states. Significantly, we construct a unique topological state transfer channel by simultaneously introducing staggered periodic on-site potentials and periodic next-nearest-neighbor hoppings added only on the odd sites, and find that the state initially prepared at the last site can be transferred to the first two sites with the same probability distribution. This special topological state transfer channel is expected to realize a topological beam splitter, whose function is to make the initial photon at one position appear at two different positions with the same probability. Further, we demonstrate the feasibility of implementing the topological beam splitter based on the circuit quantum electrodynamic lattice. Our scheme opens up a new way for the realization of topological quantum information processing and provides a new path towards the engineering of new types of quantum optical devices.

preprint · 2020 · arXiv

MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution

Video super-resolution (VSR) aims to utilize multiple low-resolution frames to generate a high-resolution prediction for each frame. In this process, inter- and intra-frames are the key sources for exploiting temporal and spatial information. However, there are a couple of limitations for existing VSR methods. First, optical flow is often used to establish temporal correspondence. But flow estimation itself is error-prone and affects recovery results. Second, similar patterns existing in natural images are rarely exploited for the VSR task. Motivated by these findings, we propose a temporal multi-correspondence aggregation strategy to leverage similar patches across frames, and a cross-scale nonlocal-correspondence aggregation scheme to explore self-similarity of images across scales. Based on these two new modules, we build an effective multi-correspondence aggregation network (MuCAN) for VSR. Our method achieves state-of-the-art results on multiple benchmark datasets. Extensive experiments justify the effectiveness of our method.

preprint · 2020 · arXiv

Robust interface-state laser in non-Hermitian micro-resonator arrays

We propose a scheme to achieve an analogous interface-state laser by means of the interface between two intermediate-resonator-coupled non-Hermitian resonator chains. We find that, after introducing the couplings between the two resonator chains and the intermediate resonator at the interface, the photons of the system mainly gather in the three resonators near the intermediate resonator. This gathering of photons toward certain resonators is expected to enable photon storage and even a laser generator. We reveal that the phenomenon is induced by the joint effect of the isolated intermediate resonator and two kinds of non-Hermitian skin effects. Specifically, we investigate the interface-state laser in a topologically trivial non-Hermitian resonator array in detail. We find that the pulsed interface-state laser can be achieved, accompanied by the intermittent proliferation of photons at the intermediate resonator, when an arbitrary resonator is excited. We also reveal that the pulsed interface-state laser in the topologically trivial non-Hermitian resonator array is immune to on-site defects in some cases, a robustness that is mainly induced by the nonreciprocal couplings rather than by topological protection. Our scheme provides a promising platform to investigate the interface-state laser in micro-resonator arrays.

preprint · 2019 · arXiv

Controllable double quantum state transfers by one topological channel in a frequency-modulated optomechanical array

We propose a scheme to achieve quantum state transfer via the topologically protected edge channel based on a one-dimensional frequency-modulated optomechanical array. We find that the optomechanical array can be mapped into a Su-Schrieffer-Heeger model after eliminating the counter-rotating-wave terms via frequency modulations. By means of the edge channel of the Su-Schrieffer-Heeger model, we show that the quantum state transfer between the photonic left edge state and the photonic right edge state can be achieved with high fidelity. In particular, our scheme can also achieve another, phononic quantum state transfer based on the same channel by controlling the next-nearest-neighbor interactions between the cavity fields, which differs from previous investigations that achieve only one kind of quantum state transfer. Our scheme provides a novel path to switch between two different kinds of quantum state transfer in a controllable way.

preprint · 2019 · arXiv

Topological and nontopological edge states induced by qubit-assisted coupling potentials

In the usual Su-Schrieffer-Heeger (SSH) chain, the topology of the energy spectrum is divided into two categories in different parameter regions. Here we study the topological and nontopological edge states induced by qubit-assisted coupling potentials in a circuit quantum electrodynamics (QED) lattice system modelled as an SSH chain. We find that, when the coupling potential added on only one end of the system rises to a certain extent, the strong coupling potential induces a new topologically nontrivial phase accompanied by the appearance of a nontopological edge state in the whole parameter region, and the novel phase transition leads directly to the inversion of the odd-even effect in the system. Furthermore, we also study the topological properties as well as phase transitions when two unbalanced coupling potentials are injected into both ends of the circuit QED lattice system, and find that the system exhibits three distinct phases in the process of multiple flips of energy bands. These phases are significantly different from the previous phase induced via a unilateral coupling potential, which is reflected by the existence of a pair of nontopological edge states in the strong coupling potential regime. Our scheme provides a feasible and visible method to induce a variety of topological and nontopological edge states by controlling the qubit-assisted coupling potentials in the circuit QED lattice system, both in experiment and theory.

preprint · 2019 · arXiv

Topological phase induced by distinguishing parameter regimes in cavity optomechanical system with multiple mechanical resonators

We propose two kinds of distinguishing parameter regimes to induce the topological Su-Schrieffer-Heeger (SSH) phase in a one-dimensional (1D) multi-resonator cavity optomechanical system by modulating the frequencies of both cavity fields and resonators. The introduction of the frequency modulations allows us to eliminate the Stokes heating process for the mapping of the tight-binding Hamiltonian without the usual rotating wave approximation, which is totally different from the traditional mapping of the topological tight-binding model. We find that the tight-binding Hamiltonian can be mapped into a topological SSH phase by modifying the Bessel function originating from the frequency modulations of cavity fields and resonators, and the induced SSH phase is independent of the effective optomechanical coupling strength. On the other hand, the insensitivity of the system to the effective optomechanical coupling provides another path to induce the topological SSH phase based on the present 1D cavity optomechanical system. We show that the system can exhibit a topological SSH phase by varying the effective optomechanical coupling strength in an alternating way, which is much easier to achieve in experiment. Furthermore, we also construct an analogous bosonic Kitaev model with trivial topology by keeping the Stokes heating processes. Our scheme provides a steerable platform to investigate the effects of next-nearest-neighbor interactions on the topology of the system.

preprint · 2016 · arXiv

A Novel Biologically Mechanism-Based Visual Cognition Model--Automatic Extraction of Semantics, Formation of Integrated Concepts and Re-selection Features for Ambiguity

Integration between biology and information science benefits both fields. Many related models have been proposed, such as computational visual cognition models, computational motor control models, integrations of both and so on. In general, the robustness and precision of recognition is one of the key problems for object recognition models. In this paper, inspired by features of the human recognition process and their biological mechanisms, a new integrated and dynamic framework is proposed to mimic the semantic extraction, concept formation and feature re-selection in human visual processing. The main contributions of the proposed model are as follows: (1) Semantic feature extraction: Local semantic features are learnt from episodic features that are extracted from raw images through a deep neural network; (2) Integrated concept formation: Concepts are formed with local semantic information and structural information learnt through the network; (3) Feature re-selection: When ambiguity is detected during the recognition process, distinctive features according to the difference between ambiguous candidates are re-selected for recognition. Experimental results on hand-written digits and a facial shape dataset show that, compared with other methods, the newly proposed model exhibits higher robustness and precision for visual recognition, especially when input samples are semantically ambiguous. Meanwhile, the introduced biological mechanisms further strengthen the interaction between neuroscience and information science.
