Source author record

Qi Shen

Qi Shen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Information Retrieval, physics.ins-det, quant-ph, physics.optics, Artificial Intelligence, Computation and Language, Computer Science and Game Theory, Computer Vision, Hardware Architecture, hep-ex, Multiagent Systems, physics.med-ph, Software Engineering

Catalog footprint

What is connected

14 works

13 topics

4 close collaborators

preprint, 2026, arXiv

BARE: Towards Bias-Aware and Reasoning-Enhanced One-Tower Visual Grounding

Visual Grounding (VG), which aims to locate the specific region referred to by an expression, is a fundamental yet challenging task in multimodal understanding. While recent grounding transfer works have advanced the field through one-tower architectures, they still suffer from two primary limitations: (1) over-entangled multimodal representations that exacerbate deceptive modality biases, and (2) insufficient semantic reasoning that hinders the comprehension of referential cues. In this paper, we propose BARE, a bias-aware and reasoning-enhanced framework for one-tower visual grounding. BARE introduces a mechanism that preserves modality-specific features and constructs referential semantics through three novel modules: (i) language salience modulator, (ii) visual bias correction, and (iii) referential relationship enhancement, which jointly mitigate multimodal distractions and enhance referential comprehension. Extensive experimental results on five benchmarks demonstrate that BARE not only achieves state-of-the-art performance but also delivers superior computational efficiency compared to existing approaches. The code is publicly accessible at https://github.com/Marloweeee/BARE.

preprint, 2025, arXiv

BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

Strategic dialogue requires agents to execute distinct dialogue acts, for which belief estimation is essential. While prior work often estimates beliefs accurately, it lacks a principled mechanism to use those beliefs during generation. We bridge this gap by first formalizing two core acts, Adversarial and Alignment, and then operationalizing them via probabilistic constraints on what an agent may generate. We instantiate this idea in BEDA, a framework that consists of the world set, the belief estimator for belief estimation, and the conditional generator that selects acts and realizes utterances consistent with the inferred beliefs. Across three settings, Conditional Keeper Burglar (CKBG, adversarial), Mutual Friends (MF, cooperative), and CaSiNo (negotiation), BEDA consistently outperforms strong baselines: on CKBG it improves success rate by at least 5.0 points across backbones and by 20.6 points with GPT-4.1-nano; on Mutual Friends it achieves an average improvement of 9.3 points; and on CaSiNo it achieves the optimal deal relative to all baselines. These results indicate that casting belief estimation as constraints provides a simple, general mechanism for reliable strategic dialogue.

preprint, 2022, arXiv

Heterogeneous Global Graph Neural Networks for Personalized Session-based Recommendation

Predicting the next interaction of a short-term interaction session is a challenging task in session-based recommendation. Almost all existing works rely on item transition patterns and neglect the impact of user historical sessions when modeling user preference, which often leads to non-personalized recommendation. Additionally, existing personalized session-based recommenders capture user preference based only on the sessions of the current user, and ignore the useful item-transition patterns in other users' historical sessions. To address these issues, we propose novel Heterogeneous Global Graph Neural Networks (HG-GNN) to exploit the item transitions over all sessions in a subtle manner, for better inferring user preference from the current and historical sessions. To effectively exploit the item transitions over all sessions from users, we propose a novel heterogeneous global graph that contains item transitions of sessions, user-item interactions, and global co-occurrence items. Moreover, to capture user preference from sessions comprehensively, we propose to learn two levels of user representations from the global graph via two graph-augmented preference encoders. Specifically, we design a novel heterogeneous graph neural network (HGNN) on the heterogeneous global graph to learn the long-term user preference and item representations with rich semantics. Based on the HGNN, we propose the Current Preference Encoder and the Historical Preference Encoder to capture the different levels of user preference from the current and historical sessions, respectively. To achieve personalized recommendation, we integrate the representations of the user's current preference and historical interests to generate the final user preference representation. Extensive experimental results on three real-world datasets show that our model outperforms other state-of-the-art methods.

preprint, 2022, arXiv

Intention Adaptive Graph Neural Network for Category-aware Session-based Recommendation

Session-based recommendation (SBR) is proposed to recommend items within short sessions, given that user profiles are invisible in many scenarios nowadays, such as e-commerce and short video recommendation. A common scenario is that the user specifies a target category of items as a global filter; however, previous SBR settings mainly consider the item sequence and overlook the rich target category information. Therefore, we define a new task called Category-aware Session-Based Recommendation (CSBR), focusing on the above scenario, in which the user-specified category can be efficiently utilized by the recommendation system. To address the challenges of the proposed task, we develop a novel method called Intention Adaptive Graph Neural Network (IAGNN), which takes advantage of the relationship between items and their categories to achieve accurate recommendation results. Specifically, we construct a category-aware graph with both item and category nodes to represent the complex transition information in the session. An intention-adaptive graph neural network on the category-aware graph is utilized to capture user intention by transferring the historical interaction information to the user-specified category domain. Extensive experiments on three real-world datasets show that IAGNN outperforms the state-of-the-art baselines on the new task.

preprint, 2022, arXiv

Label free visualization of amyloid plaques in Alzheimer's disease with polarization-sensitive photoacoustic Mueller matrix tomography

The formation of amyloid plaques in the cortical and hippocampal brain regions, caused by abnormal deposition of extracellular amyloid β-protein (Aβ), is a characteristic pathological hallmark of early Alzheimer's disease (AD), while label-free graphic rendering of diseased amyloid plaques in vivo is still a highly challenging task. Herein, by ingeniously extracting the polarization-sensitive optical absorption of amyloid plaques via the photoacoustic (PA) technique, a novel PA Mueller matrix (PAMM) tomography that is capable of providing three new conformational parameters of molecules is developed to realize depth-resolved label-free imaging of amyloid plaques. Whole-brain PAMM imaging of APP/PS1 transgenic AD mice at different stages has been performed to demonstrate its ability for in situ/in vivo quantitative three-dimensional (3D) detection of amyloid plaques and its great potential for monitoring early AD pathological development without labeling.

preprint, 2022, arXiv

Multiple Choice Questions based Multi-Interest Policy Learning for Conversational Recommendation

A conversational recommendation system (CRS) can obtain fine-grained and dynamic user preferences through interactive dialogue. Previous CRS work assumes that the user has a clear target item. However, many users who resort to a CRS might not have a clear idea about what they really like. Specifically, the user may have a clear single preference for some attribute types (e.g. color) of items, while for other attribute types the user may have multiple preferences or even no clear preference, which leads to multiple acceptable attribute instances (e.g. black and red) of one attribute type. Therefore, users could show their preferences over items under multiple combinations of attribute instances rather than over a single item with a unique combination of all attribute instances. We therefore first propose a more realistic CRS learning setting, namely Multi-Interest Multi-round Conversational Recommendation, where users may have multiple interests in attribute instance combinations and accept multiple items with partially overlapping combinations of attribute instances. To cope effectively with the new CRS learning setting, we propose a novel learning framework, namely Multi-Choice questions based Multi-Interest Policy Learning. To obtain user preferences more efficiently, the agent generates multi-choice questions rather than binary yes/no questions on a specific attribute instance. In addition, we propose a union set strategy to select candidate items, instead of the existing intersection set strategy, to avoid over-filtering items during the conversation. Finally, we design a Multi-Interest Policy Learning module, which uses the captured multiple interests of the user to decide the next action, either asking about attribute instances or recommending items. Extensive experimental results on four datasets verify the superiority of our method for the proposed setting.
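
The contrast between the union set and intersection set strategies mentioned in the abstract above can be illustrated with a toy example. This is only a minimal sketch under assumptions: the item catalog, the attribute names, and both helper functions are hypothetical, and the paper's actual candidate selection is not described in the abstract.

```python
# Toy catalog: item name -> set of attribute instances (all names hypothetical).
ITEMS = {
    "jacket_a": {"black", "leather"},
    "jacket_b": {"red", "leather"},
    "jacket_c": {"blue", "denim"},
}


def intersection_candidates(items, accepted):
    """Keep only items that carry every accepted attribute instance."""
    return {name for name, attrs in items.items() if accepted <= attrs}


def union_candidates(items, accepted):
    """Keep items that carry at least one accepted attribute instance."""
    return {name for name, attrs in items.items() if accepted & attrs}


# The user accepts two instances of the same attribute type (color).
accepted = {"black", "red"}
strict = intersection_candidates(ITEMS, accepted)
relaxed = union_candidates(ITEMS, accepted)

print("intersection:", sorted(strict))
print("union:", sorted(relaxed))
```

In this toy case no item is both black and red, so the intersection filter discards every candidate, while the union filter keeps both jackets the user would accept. That is the over-filtering effect the abstract refers to.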

preprint, 2021, arXiv

Temporal aware Multi-Interest Graph Neural Network For Session-based Recommendation

Session-based recommendation (SBR) is a challenging task that aims to recommend the next items based on anonymous interaction sequences. Despite the superior performance of existing SBR methods, several limitations remain: (i) almost all existing works concentrate on single-interest extraction and fail to disentangle the multiple interests of a user, which easily results in suboptimal representations for SBR; (ii) previous methods also ignore multi-form temporal information, which is a significant signal for obtaining the current intention in SBR. To address these limitations, we propose a novel method called Temporal aware Multi-Interest Graph Neural Network (TMI-GNN), which disentangles multiple interests and yields refined intention representations by injecting two levels of temporal information. Specifically, by appending multiple interest nodes, we construct a multi-interest graph for the current session and adopt GNNs to model the item-item relation to capture adjacent item transitions, the item-interest relation to disentangle the multiple interests, and the interest-item relation to refine the item representation. Meanwhile, we incorporate item-level time interval signals to guide the item information propagation, and interest-level time distribution information to assist the scattering of interest information. Experiments on three benchmark datasets demonstrate that TMI-GNN consistently outperforms other state-of-the-art methods.

preprint, 2020, arXiv

From API to NLI: A New Interface for Library Reuse

Developers frequently reuse APIs from existing libraries to implement certain functionality. However, learning APIs is difficult due to their large scale and complexity. In this paper, we design an abstract framework NLI2Code to ease the reuse process. Under the framework, users can reuse library functionalities with a high-level, automatically-generated NLI (Natural Language Interface) instead of the detailed API elements. The framework consists of three components: a functional feature extractor to summarize the frequently-used library functions in natural language form, a code pattern miner to give a code template for each functional feature, and a synthesizer to complete code patterns into well-typed snippets. From the perspective of a user, a reuse task under NLI2Code starts from choosing a functional feature and our framework will guide the user to synthesize the desired solution. We instantiated the framework as a tool to reuse Java libraries. The evaluation shows our tool can generate a high-quality natural language interface and save half of the coding time for newcomers to solve real-world programming tasks.

preprint, 2020, arXiv

Towards satellite-based quantum-secure time transfer

High-precision time synchronization for remote clocks plays an important role in fundamental science and real-life applications. However, the current time synchronization techniques have been shown to be vulnerable to sophisticated adversaries. There is a compelling need for fundamentally new methods to distribute high-precision time information securely. Here we propose a satellite-based quantum-secure time transfer (QSTT) scheme based on two-way quantum key distribution (QKD) in free-space, and experimentally verify the key technologies of the scheme via the Micius quantum satellite. In QSTT, a quantum signal (e.g., single photon) is used as the carrier for both the time transfer and the secret-key generation, offering quantum-enhanced security for transferring time signal and time information. We perform a satellite-to-ground time synchronization using single-photon-level signals and achieve a quantum bit error rate of less than 1%, a time data rate of 9 kHz and a time-transfer precision of 30 ps. These results offer possibilities towards an enhanced infrastructure of time-transfer network, whose security stems from quantum physics.

preprint, 2019, arXiv

Spaceborne low-noise single-photon detection for satellite-based quantum communications

Single-photon detectors (SPDs) play important roles in highly sensitive detection applications, such as fluorescence spectroscopy, remote sensing and ranging, deep space optical communications, elementary particle detection, and quantum communications. However, the adverse conditions in space, such as the increased radiation flux and thermal vacuum, severely limit their noise performance, reliability, and lifetime. Herein, we present the first example of spaceborne, low-noise, high-reliability SPDs based on commercial off-the-shelf (COTS) silicon avalanche photodiodes (APDs). Given the high noise-radiation sensitivity of silicon APDs, we developed special shielding structures, multistage cooling technologies, and configurable driver electronics that significantly improved the COTS APD reliability and mitigated the SPD noise-radiation sensitivity. This led to a reduction of the expected in-orbit radiation-induced dark count rate (DCR) from ~219 counts per second (cps) per day to ~0.76 cps/day. During continuous in-orbit operation spanning 1029 days, the SPD DCR was maintained below 1000 cps; the actual in-orbit radiation-induced DCR increment rate was ~0.54 cps/day, two orders of magnitude lower than that of previous technologies, while the photon detection efficiency was > 45%. Our spaceborne, low-noise SPDs established a feasible satellite-based up-link quantum communication that was validated on the quantum experiment science satellite platform. Moreover, our SPDs open new windows of opportunity for space research and applications in deep-space optical communications, single-photon laser ranging, as well as for testing the fundamental principles of physics in space.
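
The dark count figures quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes that the radiation-induced DCR grows linearly over the mission, and that the "two orders of magnitude" comparison is against the ~219 cps/day figure; neither assumption is stated explicitly in the abstract, and the constant names are illustrative.

```python
# Figures quoted in the abstract (cps = counts per second).
DAYS_IN_ORBIT = 1029
EXPECTED_RATE_BEFORE = 219.0  # cps/day, expected without the mitigations
EXPECTED_RATE_AFTER = 0.76    # cps/day, expected with the mitigations
MEASURED_RATE = 0.54          # cps/day, measured in orbit

# Assumption: linear growth of the radiation-induced DCR over the mission.
accumulated_cps = MEASURED_RATE * DAYS_IN_ORBIT

# Improvement factors relative to the unmitigated expectation.
design_reduction = EXPECTED_RATE_BEFORE / EXPECTED_RATE_AFTER
measured_reduction = EXPECTED_RATE_BEFORE / MEASURED_RATE

print(f"accumulated radiation-induced DCR: {accumulated_cps:.0f} cps")
print(f"expected reduction factor: {design_reduction:.0f}x")
print(f"measured reduction factor: {measured_reduction:.0f}x")
```

Under these assumptions the accumulated radiation-induced DCR is about 556 cps, consistent with the reported bound of 1000 cps, and both reduction factors fall between 100 and 1000, that is, two orders of magnitude.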

preprint, 2014, arXiv

A Compact PCI-based Measurement and Control System for Satellite-Ground Quantum Communication

Since the 1990s, there has been dramatic interest in quantum communication. Free-space quantum communication is being developed toward ultra-long-distance quantum experiments, which require higher electronics performance in terms of time measurement precision, data-transfer rate, and system integration density. As part of the ground station of the quantum experiment satellite that will be launched in 2016, we designed a compact PCI-based multi-channel electronics system with high time resolution and high data-transfer rate. The electronics performance of this system was tested. The time bin size is 23.9 ps and the root-mean-square (RMS) time precision is less than 24 ps for 16 channels. The dead time is 30 ns. The data transfer rate to the local computer is up to 35 MBps, and the count rate is up to 30 M/s. The system has been shown to perform well and operate stably in a test with a free-space quantum key distribution (QKD) experiment.

preprint, 2014, arXiv

A Multi-chain Measurements Averaging TDC Implemented in a 40 nm FPGA

A high-precision, high-resolution time-to-digital converter (TDC) implemented in a Virtex-6 FPGA fabricated in a 40 nm process is presented in this paper. The multi-chain measurements averaging architecture is used to overcome the resolution limitation determined by the intrinsic cell delay of the plain single tapped-delay chain. The resolution and precision are both improved with this architecture. In such a TDC, the input signal is connected to multiple tapped-delay chains simultaneously (the chain number is M), and there is a fixed delay cell between every two adjacent chains. Each tapped-delay chain is just a plain TDC and generates a TDC time for a hit input signal, so in total M TDC time values are obtained for a hit signal. After averaging, the final TDC time is obtained. A TDC with 3 ps resolution (i.e. bin size) and 6.5 ps precision (i.e. RMS) has been implemented using 8 parallel tapped-delay chains. By comparison, the plain TDC with a single tapped-delay chain yields 24 ps resolution and 18 ps precision.
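
The averaging idea in the abstract above can be sketched with an idealized numerical model. It assumes perfectly uniform 24 ps bins, noise-free chains, and inter-chain delays of exactly one eighth of a bin; these are simplifications chosen for illustration, not the delay values or behavior of the actual FPGA design. All function names are hypothetical.

```python
import math

BIN_PS = 24.0  # single-chain bin size quoted in the abstract
M = 8          # number of parallel chains quoted in the abstract


def single_chain_estimate(t_ps, offset_ps=0.0, bin_ps=BIN_PS):
    """Quantize an arrival time with one delay chain and return the bin centre,
    corrected for the known offset of that chain."""
    code = math.floor((t_ps + offset_ps) / bin_ps)
    return (code + 0.5) * bin_ps - offset_ps


def multi_chain_estimate(t_ps, m=M, bin_ps=BIN_PS):
    """Average the estimates of m chains whose inputs are staggered by
    bin_ps / m (an idealized assumption about the fixed delay cells)."""
    offsets = [k * bin_ps / m for k in range(m)]
    estimates = [single_chain_estimate(t_ps, o, bin_ps) for o in offsets]
    return sum(estimates) / m


def worst_error(estimator, times):
    """Largest absolute quantization error over a set of test times."""
    return max(abs(estimator(t) - t) for t in times)


times = [0.37 + 0.731 * i for i in range(200)]  # arbitrary test times in ps
single_err = worst_error(single_chain_estimate, times)
multi_err = worst_error(multi_chain_estimate, times)

print(f"worst single-chain error: {single_err:.2f} ps")
print(f"worst averaged error:     {multi_err:.2f} ps")
```

In this idealized model the averaged estimate takes values spaced 24 / 8 = 3 ps apart, so the quantization error is bounded by 1.5 ps instead of 12 ps. That matches the 3 ps versus 24 ps bin sizes quoted in the abstract. The model does not reproduce the RMS precision figures, which depend on noise and bin non-uniformity in real hardware.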

preprint, 2013, arXiv

A Fast Improved Fat Tree Encoder for Wave Union TDC in an FPGA

Up to the present, the wave union method achieves the best timing performance among FPGA-based TDC designs. However, such a structure must guarantee that the non-thermometer code to binary code (NTH2B) encoding process finishes within just one system clock cycle, so implementing the NTH2B encoder is quite challenging given the high speed requirement. In addition, a high-resolution wave union TDC requires the encoder to convert an ultra-wide input code to a binary code. We present a fast improved fat tree encoder (IFTE) to fulfill these requirements, in which bubble error suppression is also integrated. With this encoder scheme, a wave union TDC with 7.7 ps RMS and 3.8 ps effective bin size was implemented in an FPGA from the Xilinx Virtex 5 family. An encoding time of 8.33 ns was achieved for converting a 276-bit non-thermometer code to a 9-bit binary code. We conducted a series of tests on the oscillating period of the wave union launcher, as well as on the overall performance of the TDC; test results indicate that the IFTE works well. The implementation of this encoder required no manual routing or special constraints; therefore, this IFTE structure could also be applied in other delay-chain-based FPGA TDCs.

preprint, 2013, arXiv

Direct and full-scale experimental verifications towards ground-satellite quantum key distribution

Quantum key distribution (QKD) provides the only intrinsically unconditionally secure method for communication based on the principles of quantum mechanics. Compared with fiber-based demonstrations, free-space links could provide the most appealing solution for much larger distances. Despite significant efforts, all realizations so far rely on stationary sites. Justifications are therefore extremely crucial for applications via a typical Low Earth Orbit Satellite (LEOS). To achieve direct and full-scale verifications, we demonstrate here three independent experiments with a decoy-state QKD system overcoming all the demanding conditions. The system is operated on a moving platform through a turntable, on a floating platform through a hot-air balloon, and over a huge-loss channel, respectively, to substantiate performance under rapid motion, attitude change, vibration, and random movement of satellites, and in the high-loss regime. The experiments cover expanded ranges for all the leading parameters of LEOS. Our results pave the way towards ground-satellite QKD and a global quantum communication network.
