Source author record

Quan Zhang

Quan Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher Unclaimed source record

quant-ph, Computer Vision, Artificial Intelligence, cond-mat.soft, cond-mat.dis-nn, Cryptography and Security, eess.SP, Computation and Language, cond-mat.mtrl-sci, eess.SY, Information Theory, Machine Learning, math.IT, Systems and Control

Catalog footprint

What is connected

20 works

14 topics

4 close collaborators

preprint 2026 arXiv

Don't Guess, Just Ask: Resolving Ambiguity in Referring Segmentation via Multi-turn Clarification

Referring segmentation aims to segment the target objects in images or videos based on the textual query. Despite remarkable progress over the past years, existing works always assume that the user-provided queries are already precise and clear. However, this assumption is impractical. In real-world scenarios, it is unrealistic to expect all users to thoroughly review their visual content and carefully ensure their queries are unique and unambiguous. When encountering such cases, existing segmentation models tend to arbitrarily guess the user preferences, often resulting in undesired outcomes. To address this limitation, we propose IC-Seg, a novel agentic framework that proactively clarifies user intent through multi-turn conversation before segmentation. To effectively incentivize this capability, we further introduce Hi-GRPO, a new hierarchical optimization strategy that injects dense and informative supervision signals at the trajectory, turn, and step levels. This strategy encourages efficient intent clarification, effectively eliminating redundant interactions and improving overall dialogue quality. For evaluation, we establish Ambi-RVOS, a referring video object segmentation benchmark with ambiguous user queries. Extensive experiments demonstrate that IC-Seg not only outperforms existing methods by a large margin in resolving ambiguous queries, but also maintains state-of-the-art performance on standard reasoning segmentation benchmarks. Code and data will be released at https://github.com/iSEE-Laboratory/IC-Seg.

preprint 2026 arXiv

Beyond Perceptual Shortcuts: Causal-Inspired Debiasing Optimization for Generalizable Video Reasoning in Lightweight MLLMs

Although reinforcement learning (RL) has significantly advanced reasoning capabilities in large multimodal language models (MLLMs), its efficacy remains limited for lightweight models essential for edge deployments. To address this issue, we leverage causal analysis and experiment to reveal the underlying phenomenon of perceptual bias, demonstrating that RL-based fine-tuning compels lightweight models to preferentially adopt perceptual shortcuts induced by data biases, rather than developing genuine reasoning abilities. Motivated by this insight, we propose VideoThinker, a causal-inspired framework that cultivates robust reasoning in lightweight models through a two-stage debiasing process. First, the Bias Aware Training stage forges a dedicated "bias model" to embody these shortcut behaviors. Then, the Causal Debiasing Policy Optimization (CDPO) algorithm fine-tunes the primary model, employing an innovative repulsive objective to actively push it away from the bias model's flawed logic while simultaneously pulling it toward correct, generalizable solutions. Our model, VideoThinker-R1, establishes a new state-of-the-art in video reasoning efficiency. For same-scale comparison, requiring no Supervised Fine-Tuning (SFT) and using only 1 of the training data for RL, it surpasses VideoRFT-3B with a 3.2% average gain on widely-used benchmarks and a 7% lead on VideoMME. For cross-scale comparison, it outperforms the larger Video-UTR-7B model on multiple benchmarks, including a 2.1% gain on MVBench and a 3.8% gain on TempCompass. Code is available at https://github.com/falonss703/VideoThinker.

preprint 2026 arXiv

Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval

Zero-shot composed image retrieval (ZS-CIR) retrieves a target image from a reference image and a text modification without human-annotated CIR triplets. Projection-based ZS-CIR methods are attractive because they do not rely on LLMs at inference and remain lightweight, but they often underperform LLM-based approaches on complex semantic modifications. This gap reflects a semantic transition bottleneck in projection-based ZS-CIR: endpoint-level matching can let the edit text act as a target-side attribute cue rather than grounding it as a source-conditioned semantic transition. We further show that adding semantic transition supervision to the same text adapter creates an endpoint-transition conflict between endpoint alignment and semantic transition alignment. To address this conflict, DeCIR decouples endpoint and transition learning. It constructs paired forward/reverse edit tuples from image-caption pairs, trains separate low-rank text adapter branches for endpoint alignment and semantic transition alignment, and merges them with Low-Rank Directional Merge (LRDM) into one deployable adapter. Extensive experiments on CIRR, CIRCO, FashionIQ, and GeneCIS demonstrate that DeCIR consistently improves projection-based ZS-CIR without increasing inference complexity.

preprint 2026 arXiv

Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization

Multi-Hop Fact Verification (MHFV) necessitates complex reasoning across disparate evidence, posing significant challenges for Large Language Models (LLMs), which often suffer from hallucinations and fractured logical chains. Existing methods, while improving transparency via Chain-of-Thought (CoT), lack explicit modeling of the causal dependencies between evidence and claims. In this work, we introduce a novel framework that grounds reasoning in a Structural Causal Model (SCM), treating verification as a constructive causal inference process. We empirically identify an "inverted U-shaped" correlation between reasoning chain length and accuracy, revealing that excessive structural complexity degrades performance. To address this, we propose a Rule-based Reinforcement Learning strategy using Group Relative Policy Optimization (GRPO). This approach dynamically optimizes the trade-off between structural depth and conciseness. Extensive experiments on HoVer and EX-FEVER demonstrate that our SCM-GRPO framework significantly outperforms state-of-the-art baselines, offering a reliable and interpretable solution for complex fact verification.
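
As a rough illustration of the group-relative idea behind GRPO, the sketch below normalizes the rewards of several sampled responses against their own group. The abstract does not describe the reward rules, so the reward values, the function name `group_relative_advantages`, and the use of the population standard deviation are assumptions made for illustration, not the authors' implementation.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own sampling group.

    Each advantage is (reward - group mean) / (group std + eps), so responses
    that beat the group average get a positive advantage.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std: an assumed choice
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rule-based rewards for four sampled reasoning chains.
rewards = [1.0, 0.0, 0.5, 1.0]
advantages = group_relative_advantages(rewards)
```

Because the normalization is done per group, no separate value network is needed to produce a baseline; the group mean plays that role.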

preprint 2026 arXiv

View-Aware Semantic Alignment for Aerial-Ground Person Re-Identification

Aerial-Ground Person Re-Identification (AGPReID) remains highly challenging due to drastic viewpoint variations between drones and fixed cameras. Existing methods typically follow a view-invariant paradigm, aligning shared features across views to achieve robustness. However, the view-invariant paradigm inherently enforces part-level alignment, which ignores view-specific cues and discriminative identity information. To this end, this work proposes ViSA (View-aware Semantic Alignment), a view-aware framework that achieves cross-view semantic consistency and comprises an Expert-driven Token Generation Module (ETGM) and a Dual-branch Local Fusion Module (DLFM). Technically, the former constructs a set of view-aware experts to generate adaptive semantic queries that perceive viewpoint-specific patterns, while the latter leverages graph reasoning to extract and align local regions responsive to different experts. Extensive experiments on three AGPReID benchmarks, AG-ReID.v2, CARGO and LAGPeR, demonstrate that ViSA consistently achieves superior performance, with a notable 10.06% mAP improvement on the challenging CARGO cross-view protocol. The code is available at https://github.com/Cat-Zero/ViSA.

preprint 2022 arXiv

Distribution-Aware Graph Representation Learning for Transient Stability Assessment of Power System

Real-time transient stability assessment (TSA) plays a critical role in the secure operation of the power system. Although the classic numerical integration method, i.e., time-domain simulation (TDS), has been widely used in industry practice, it inevitably incurs high computational complexity due to the high latitude sophistication of the power system. In this work, a data-driven power system estimation method is proposed to quickly predict the stability of the power system before TDS reaches the end of the simulation time window, which can reduce the average simulation time of stability assessment without loss of accuracy. As the topology of the power system takes the form of a graph, graph neural network based representation learning is naturally suitable for learning the status of the power system. Motivated by observing the distribution information of crucial active power and reactive power on the power system's bus nodes, we propose a distribution-aware learning (DAL) module to explore an informative graph representation vector for describing the status of a power system. Then, TSA is re-defined as a binary classification task, and the stability of the system is determined directly from the resulting graph representation without numerical integration. Finally, we apply our method to the online TSA task. The case studies on the IEEE 39-bus system and the Polish 2383-bus system demonstrate the effectiveness of our proposed method.

preprint 2022 arXiv

On the design of Massive MIMO-QAM detector via $\ell_2$-Box ADMM approach

In this letter, we develop an $\ell_2$-box maximum likelihood (ML) formulation for massive multiple-input multiple-output (MIMO) quadrature amplitude modulation (QAM) signal detection and customize an alternating direction method of multipliers (ADMM) algorithm to solve the nonconvex optimization model. In the $\ell_2$-box ADMM implementation, all variables are solved analytically. Moreover, several theoretical results related to convergence, iteration complexity, and computational complexity are presented. Simulation results demonstrate the effectiveness of the proposed $\ell_2$-box ADMM detector in comparison with state-of-the-art approaches.

preprint 2021 arXiv

A Low-Complexity ADMM-based Massive MIMO Detectors via Deep Neural Networks

Alternating direction method of multipliers (ADMM)-based detectors can achieve good performance in both small and large-scale multiple-input multiple-output (MIMO) systems. However, due to the difficulty of choosing the optimal penalty parameters, their performance is limited. This paper presents a deep neural network (DNN)-based massive MIMO detection method that overcomes this limitation. It exploits the unfolding technique and learns to estimate the penalty parameters. Additionally, a computationally cheaper detector is proposed. The proposed methods can handle higher-order modulation signals. Numerical results are presented to demonstrate the performance of the proposed methods compared with existing works.

preprint 2021 arXiv

RNN-Test: Towards Adversarial Testing for Recurrent Neural Network Systems

While massive effort has been invested in adversarial testing of convolutional neural networks (CNNs), testing for recurrent neural networks (RNNs) is still limited and leaves threats for vast sequential application domains. In this paper, we propose RNN-Test, an adversarial testing framework for RNN systems that focuses on the main sequential domains, not only classification tasks. First, we design a novel search methodology customized for RNN models by maximizing the inconsistency of RNN states to produce adversarial inputs. Next, we introduce two state-based coverage metrics according to the distinctive structure of RNNs to explore more inference logic. Finally, RNN-Test solves the joint optimization problem to maximize state inconsistency and state coverage, and crafts adversarial inputs for various tasks with different kinds of inputs. For evaluation, we apply RNN-Test to three sequential models of common RNN structures. On the tested models, RNN-Test is demonstrated to be competitive in generating adversarial inputs, outperforming FGSM-based and DLFuzz-based methods by reducing model performance more sharply, with a 2.78% to 32.5% higher success (or generation) rate. RNN-Test could also achieve a 52.65% to 66.45% higher adversary rate on the MNIST-LSTM model than the related work testRNN. Compared with neuron coverage, the proposed state coverage metrics as guidance excel with a 4.17% to 97.22% higher success (or generation) rate.

preprint 2014 arXiv

Preserving Privacy of Mobile Reader Holders in Server-less RFID Authentication and Searching Protocols

Along with the development of internet of things and pervasive computing, researchers are increasingly focusing on server-less RFID authentication and searching protocols, which utilize mobile RFID readers. However, revealing privacy of mobile reader holders is a widely neglected problem in current research. This paper concentrates on preserving privacy of mobile reader holders in server-less RFID authentication and searching protocols. We propose a detailed requirement as a principle for future protocol designs, and a scheme to enhance most current protocols. We apply our scheme to two classical protocols. The comparisons between the original and our enhanced protocols show that our scheme is secure and effective.

preprint 2014 arXiv

RFID Authentication Against an Unsecure Backend Server

This paper addresses a new problem in RFID authentication research for the first time. Existing RFID authentication schemes generally assume that the backend server is absolutely secure; however, this assumption is rarely tenable in practical conditions. It prevents existing RFID authentication protocols from being safely applied to a real-life scenario in which the backend server is actually vulnerable, compromised or even malicious itself. We propose an RFID authentication scheme against an unsecure backend server. It is based on a hash chain, searching over encrypted data, and coprivacy, defending against privacy disclosure to the backend server. The proposed scheme is scalable, resistant to desynchronization attacks, and provides mutual authentication in only three frontend communication steps. Moreover, it is the first scheme meeting the special security and privacy requirement for a cloud-based RFID authentication scenario in which the backend server is untrustworthy to readers held by cloud clients.
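
For readers unfamiliar with the hash-chain building block named above, here is a minimal sketch of the general idea. The abstract does not say which hash function is used or how the chain is embedded in the protocol, so SHA-256, the seed value, and the helper names `hash_chain` and `is_successor` are all assumptions for illustration only.

```python
import hashlib


def hash_chain(seed, length):
    """Build a chain where each link is the SHA-256 digest of the previous one."""
    chain = [hashlib.sha256(seed).digest()]
    for _ in range(length - 1):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


def is_successor(previous, candidate):
    """Check that `candidate` is the next link after `previous`."""
    return hashlib.sha256(previous).digest() == candidate


# Hypothetical tag secret; a real tag would hold its own provisioned value.
chain = hash_chain(b"hypothetical-tag-secret", 4)
```

Moving forward along the chain is cheap, while recovering an earlier link from a later one requires inverting the hash, which is the one-way property such schemes typically rely on.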

preprint 2013 arXiv

Smectic order, pinning, and phase transition in a smectic liquid crystal cell with a random substrate

We study smectic-liquid-crystal order in a cell with a heterogeneous substrate imposing surface random positional and orientational pinnings. Proposing a minimal random elastic model, we demonstrate that, for a thick cell, the smectic state without a rubbed substrate is always unstable at long scales and, for weak random pinning, is replaced by a smectic glass state. We compute the statistics of the associated substrate-driven distortions and the characteristic smectic domain size on the heterogeneous substrate and in the bulk. We find that for weak disorder, the system exhibits a three-dimensional temperature-controlled phase transition between weakly and strongly pinned smectic glass states akin to the Cardy-Ostlund phase transition. We explore experimental implications of the predicted phenomenology and suggest that it provides a plausible explanation for the experimental observations on polarized light microscopy and x-ray scattering.

preprint 2011 arXiv

Security proof of Counterfactual Quantum Cryptography against General Intercept-resend Attacks and Its Vulnerability

Counterfactual quantum cryptography (CQC), recently proposed by Noh, features no transmission of signal particles. This gives an evident security advantage, such as immunity to the well-known PNS attack. In this paper, the theoretical security of the CQC protocol against general intercept-resend attacks is proved by bounding the information of an eavesdropper Eve more tightly than in Yin's proposal [Phys. Rev. A 82, 042335 (2010)]. It is also shown that practical CQC implementations may be vulnerable when equipped with imperfect apparatuses, by proving that a negative key rate can be achieved when Eve launches a time-shift attack based on imperfect detector efficiency.

preprint 2011 arXiv

Semiquantum key distribution using entangled states

Recently, Boyer et al. presented a novel semiquantum key distribution protocol [M. Boyer, D. Kenigsberg, and T. Mor, Phys. Rev. Lett. 99, 140501 (2007)] that uses four quantum states, each randomly prepared in the Z basis or the X basis. Here we present a semiquantum key distribution protocol using entangled states, in which quantum Alice shares a secret key with classical Bob. We also show the protocol is secure against eavesdropping.

preprint 2011 arXiv

Semiquantum secret sharing using two-particle entangled state

Recently, Boyer et al. presented a novel semiquantum key distribution protocol [M. Boyer, D. Kenigsberg, and T. Mor, Phys. Rev. Lett. 99, 140501 (2007)], in which quantum Alice shares a secret key with classical Bob. Li et al. proposed two semiquantum secret sharing protocols [Q. Li, W. H. Chan, and D. Y. Long, Phys. Rev. A 82, 022303 (2010)] by using maximally entangled Greenberger-Horne-Zeilinger states. In this paper, we present a semiquantum secret sharing protocol by using two-particle entangled states in which quantum Alice shares a secret key with two classical parties, Bob and Charlie. Classical Bob and Charlie are restricted to performing measurement in the computational basis, preparing a particle in the computational basis, or reflecting the particles. None of them can acquire the secret unless they collaborate. We also show the protocol is secure against eavesdropping.
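
The property that neither party can recover the secret alone has a simple classical analogue, sketched below with XOR shares. This is not the semiquantum protocol from the abstract, which relies on two-particle entangled states; the function names and the sample secret are hypothetical and serve only to illustrate the "must collaborate" requirement.

```python
import secrets


def split_secret(secret):
    """Split `secret` into two XOR shares, one for Bob and one for Charlie."""
    share_bob = secrets.token_bytes(len(secret))  # uniformly random mask
    share_charlie = bytes(s ^ b for s, b in zip(secret, share_bob))
    return share_bob, share_charlie


def combine_shares(share_bob, share_charlie):
    """Recover the secret; both shares are required."""
    return bytes(b ^ c for b, c in zip(share_bob, share_charlie))


secret = b"hypothetical key"
share_bob, share_charlie = split_secret(secret)
recovered_secret = combine_shares(share_bob, share_charlie)
```

Each share on its own is uniformly random and independent of the secret, so a single party learns nothing without the other share.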

preprint 2010 arXiv

Plugs in rough capillary tubes: enhanced dependence of motion on plug length

We discuss the creeping motion of plugs of negligible viscosity in rough capillary tubes filled with carrier fluids. This extends Bretherton's research work on the infinite-length bubble motion in a cylindrical or smooth tube for small capillary numbers Ca. We first derive the asymptotic dependence of the plug speed on the finite length in the smooth tube case. This dependence on length is exponentially small, with a decay length much shorter than the tube radius R. Then we discuss the effect of azimuthal roughness of the tube on the plug speed. The tube roughness leads to an unbalanced capillary pressure and a carrier fluid flux in the azimuthal plane. This flux controls the relaxation of the plug shape to its infinite-length limit. For long-wavelength roughness, we find that the above decay length is much longer in the rough tube, and even becomes comparable to the tube radius R in some cases. This implies a much-enhanced dependence of the plug speed on the plug length. This mechanism may explain the catch-up effect seen experimentally.

preprint 2010 arXiv

Stability and distortions of liquid crystal order in a cell with a heterogeneous substrate

We study stability and distortions of liquid crystal nematic order in a cell with a random heterogeneous substrate. Modeling this system as a bulk xy model with quenched disorder confined to a surface, we find that nematic order is marginally unstable to such surface pinning. We compute the length scale beyond which nematic distortions become large and calculate orientational correlation functions using the functional renormalization-group and matching methods, finding universal logarithmic and double-logarithmic distortions in two and three dimensions, respectively. We extend these results to a finite-thickness liquid crystal cell with a second homogeneous substrate, detailing crossovers as a function of random pinning strength and cell thickness. We conclude with an analysis of experimental signatures of these distortions in conventional crossed-polarizer-analyzer light microscopy.

preprint 2006 arXiv

Quantum secure communication scheme with W state

Recently, Cao et al. proposed a new quantum secure direct communication scheme using the W state. In their scheme, the error rate introduced by an eavesdropper who mounts an intercept-resend attack is only 8.3%. In fact, their scheme is just a quantum key distribution scheme, because the communicating parties first create a shared key and then encrypt the secret message using a one-time pad. We then present a quantum secure communication scheme using the three-qubit W state. In our scheme, the error rate is raised to 25%, and it is not necessary to use alternative measurements or Bell basis measurements. We also show our scheme is unconditionally secure.
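
The classical one-time pad step mentioned above (encrypting a message with an already shared key) can be sketched as follows. Here the key is drawn locally with Python's `secrets` module purely as a stand-in for a key established by the quantum stage; the function name `otp_xor` and the sample message are assumptions for illustration.

```python
import secrets


def otp_xor(data, key):
    """XOR `data` with `key`; the same call both encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must match the message length")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"secret message"
key = secrets.token_bytes(len(message))  # stand-in for the shared key
ciphertext = otp_xor(message, key)
recovered = otp_xor(ciphertext, key)
```

The pad is only secure if the key is as long as the message, uniformly random, and never reused, which is why the key-establishment stage carries the security burden.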

preprint 2006 arXiv

Quantum secure direct communication with pure entangled states

We present a quantum secure direct communication protocol where the channels are not maximally entangled states. The communicating parties utilize decoy photons to check for eavesdropping. After ensuring the security of the quantum channel, the sender encodes the secret message and transmits it to the receiver by using a Controlled-NOT operation and a von Neumann measurement. The protocol is simple and realizable with present technology. We also show the protocol is secure for a noisy quantum channel.

preprint 2005 arXiv

Quantum signature scheme with single photons

Quantum digital signature combines quantum theory with classical digital signature. The main goal of this field is to take advantage of quantum effects to provide unconditionally secure signatures. We present a quantum signature scheme with message recovery that does not use entanglement. The most important property of the proposed scheme is that it does not need Greenberger-Horne-Zeilinger states. The scheme utilizes single photons for signing and verification. Its security relies on the quantum one-time pad and quantum key distribution. The efficiency analysis shows that the proposed scheme is efficient.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.
