Source author record

Yifan Zhu

Yifan Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence; Databases; Computer Vision; Information Retrieval; physics.app-ph; Robotics; Computation and Language; Distributed, Parallel, and Cluster Computing; Machine Learning; physics.med-ph; physics.optics; Populations and Evolution; quant-ph; Social and Information Networks; Software Engineering

Catalog footprint

What is connected

18 works

15 topics

4 close collaborators

Preprint, 2026, arXiv

All-in-one Graph-based Indexing for Hybrid Search on GPUs

Hybrid search has emerged as a promising paradigm that combines lexical and semantic retrieval, enhancing accuracy for applications such as recommendations, information retrieval, and Retrieval-Augmented Generation. However, existing methods are constrained by a trilemma: they sacrifice flexibility for efficiency, suffer from accuracy degradation, or incur prohibitive storage overhead for flexible combinations of retrieval paths. This paper introduces Allan-Poe, a novel all-in-one graph index accelerated by GPUs for efficient hybrid search. We first analyze the limitations of existing retrieval paradigms and extract key design principles for an effective hybrid index. Guided by the principles, we architect a unified graph-based index that flexibly integrates three retrieval paths (dense vector, sparse vector, and full-text) within a single, cohesive structure. To enable efficient construction, we design a GPU-accelerated pipeline featuring a warp-level hybrid distance kernel, RNG-IP joint pruning, and keyword-aware neighbor recycling. For query processing, we introduce a dynamic fusion framework that supports any combination of retrieval paths and weights without index reconstruction, flexibly leveraging logical structures from the knowledge graph to resolve complex multi-hop queries. Extensive experiments on 6 real-world datasets demonstrate that Allan-Poe achieves superior end-to-end query accuracy and outperforms state-of-the-art methods by 1.5x-186.4x in throughput, while significantly reducing storage overhead.

Preprint, 2026, arXiv

Branch, or Layer? Zeroth-Order Optimization for Continual Learning of Vision-Language Models

Vision-Language Continual Learning (VLCL) has attracted significant research attention for its robust capabilities, and the adoption of Parameter-Efficient Fine-Tuning (PEFT) strategies is enabling these models to achieve competitive performance with substantially reduced resource consumption. However, the dominant First-Order (FO) optimization is prone to trapping models in suboptimal local minima, especially in the limited exploration subspace within PEFT. To overcome this challenge, this paper pioneers a systematic exploration of adopting Zeroth-Order (ZO) optimization for PEFT-based VLCL. We first identify the incompatibility of naive full-ZO adoption in VLCL due to optimization process instability. We then investigate the application of ZO optimization at granularities ranging from modality branch-wise to fine-grained layer-wise across various training units to identify an optimal strategy. In addition, a key theoretical insight reveals that the vision modality exhibits higher variance than its language counterpart in VLCL during the ZO optimization process, and we propose a modality-aware ZO strategy, which adopts gradient sign normalization in ZO and constrains vision modality perturbation to further improve performance. Benefiting from the adoption of ZO optimization, PEFT-based VLCL is better able to escape local minima during the optimization process. Extensive experiments on four benchmarks demonstrate that our method achieves state-of-the-art results.
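
For readers unfamiliar with zeroth-order optimization, the following is a minimal sketch of a generic two-point gradient estimator driving plain gradient descent. It is not the paper's modality-aware method: the function names, the unit-sphere perturbation, the step size, and the toy quadratic loss are all assumptions chosen for illustration, and the gradient sign normalization mentioned in the abstract is deliberately left out.

```python
import math
import random


def two_point_zo_gradient(loss, x, rng, mu=1e-3):
    """Estimate the gradient of `loss` at `x` using only two loss evaluations."""
    # Draw a random direction on the unit sphere (an assumed perturbation scheme).
    u = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    # A central finite difference along u approximates the directional derivative.
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    slope = (loss(x_plus) - loss(x_minus)) / (2.0 * mu)
    # Project the directional slope back onto the sampled direction.
    return [slope * ui for ui in u]


def zo_sgd(loss, x0, steps=200, lr=0.1, seed=0):
    """Gradient descent that never calls a backward pass (hypothetical helper)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = two_point_zo_gradient(loss, x, rng)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy loss standing in for a real training objective."""
    return sum(xi * xi for xi in x)


start = [1.0, -2.0, 3.0]
end = zo_sgd(quadratic, start)
```

Each update needs only forward evaluations of the loss, which is why estimators of this kind are attractive when backpropagation is expensive. The price is a noisier update direction, the variance issue the abstract refers to.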

Preprint, 2026, arXiv

DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization

Recent advances in Emotional Support Conversation (ESC) have improved emotional support generation by fine-tuning Large Language Models (LLMs) via Supervised Fine-Tuning (SFT). However, common psychological errors still persist. While Direct Preference Optimization (DPO) shows promise in reducing such errors through pairwise preference learning, its effectiveness in ESC tasks is limited by two key challenges: (1) Entangled data structure: Existing ESC data inherently entangles psychological strategies and response content, making it difficult to construct high-quality preference pairs; and (2) Optimization ambiguity: Applying vanilla DPO to such entangled pairwise data leads to ambiguous training objectives. To address these issues, we introduce Inferential Preference Mining (IPM) to construct high-quality preference data, forming the IPM-PrefDial dataset. Building upon this data, we propose a Decoupled ESC framework inspired by Gross's Extended Process Model of Emotion Regulation, which decomposes the ESC task into two sequential subtasks: strategy planning and empathic response generation. Each was trained via SFT and subsequently enhanced by DPO to align with the psychological preference. Extensive experiments demonstrate that our Decoupled ESC framework outperforms joint optimization baselines, reducing preference bias and improving response quality.

Preprint, 2026, arXiv

Frequency-Aware Graph Construction and Search for Dynamic Vector Databases

Approximate Nearest Neighbor Search (ANNS) is a crucial operation in databases and artificial intelligence. While graph-based ANNS methods like HNSW and NSG excel in performance, they assume uniform query distribution. However, in real-world scenarios, user preferences and temporal dynamics often result in certain data points being queried more frequently than others, and these query patterns can change over time. To better leverage such characteristics, we propose DQF, a novel Dual-Index Query Framework. This framework features a dual-layer index structure and a dynamic search strategy based on a decision tree. The dual-layer index includes a hot index for high-frequency nodes and a full index covering the entire dataset, allowing for the separate management of hot and cold queries. Furthermore, we propose a dynamic search strategy that employs a decision tree to determine whether a query is of the high-frequency type, avoiding unnecessary searches in the full index through early termination. Additionally, to address fluctuations in query frequency, we design an update mechanism to manage the hot index. New high-frequency nodes will be inserted into the hot index, which is periodically rebuilt when its size exceeds a predefined threshold, removing outdated low-frequency nodes. Experiments on four real-world datasets demonstrate that the Dual-Index Query Framework achieves a significant speedup of 2.0-5.7x over state-of-the-art algorithms while maintaining a 95% recall rate. Importantly, it avoids full index reconstruction even as query distributions change, underscoring its efficiency and practicality in dynamic query distribution scenarios.
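
The dual-index idea can be sketched in a few lines, under heavy simplifying assumptions. A linear scan stands in for the graph indexes, a fixed distance threshold (`early_stop_radius`) stands in for the decision tree, and a hit counter with a capacity-triggered rebuild stands in for the update mechanism. The class name and all parameters are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


class DualIndex:
    """Toy hot/full index pair with early termination (illustrative only)."""

    def __init__(self, points, hot_capacity=2, early_stop_radius=0.5, promote_after=2):
        self.points = dict(points)      # full index: id -> vector
        self.hot_ids = set()            # hot index: ids of frequently returned points
        self.hits = Counter()           # how often each id has been returned
        self.hot_capacity = hot_capacity
        self.early_stop_radius = early_stop_radius
        self.promote_after = promote_after

    def _nearest(self, ids, query):
        # Linear scan; a graph index such as HNSW would replace this in practice.
        return min(ids, key=lambda i: math.dist(self.points[i], query))

    def search(self, query):
        """Return (nearest_id, used_full_index)."""
        if self.hot_ids:
            best = self._nearest(self.hot_ids, query)
            # Assumed stand-in for the decision tree: accept a close hot hit.
            if math.dist(self.points[best], query) <= self.early_stop_radius:
                self._record(best)
                return best, False
        best = self._nearest(self.points, query)
        self._record(best)
        return best, True

    def _record(self, point_id):
        self.hits[point_id] += 1
        if self.hits[point_id] >= self.promote_after:
            self.hot_ids.add(point_id)
        if len(self.hot_ids) > self.hot_capacity:
            # Rebuild the hot index, keeping only the most frequently returned ids.
            keep = [i for i, _ in self.hits.most_common(self.hot_capacity)]
            self.hot_ids = set(keep)


index = DualIndex({"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (10.0, 0.0)})
r1 = index.search((0.1, 0.0))   # cold: answered by the full index
r2 = index.search((0.0, 0.1))   # second hit promotes "a" to the hot index
r3 = index.search((0.1, 0.1))   # hot: answered without touching the full index
r4 = index.search((5.0, 5.1))   # far from every hot point, falls back to full
```

Only the hot index is rebuilt when query frequencies drift, and the full index is left untouched. That is the property the abstract highlights.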

Preprint, 2026, arXiv

HEDP: A Hybrid Energy-Distance Prompt-based Framework for Domain Incremental Learning

Domain Incremental Learning is a critical scenario that requires models to continuously adapt to new data domains without retraining. However, domain shifts often cause severe performance degradation. To address this, we propose Hybrid Energy-Distance Prompt (HEDP), a domain-incremental framework inspired by Helmholtz free energy. HEDP introduces an energy regularization loss to enhance the separability of domain representations and a hybrid energy-distance weighted mechanism that fuses energy-based and distance-based cues to improve domain selection and generalization. Experiments on multiple benchmarks, including CORe50, show that HEDP achieves superior performance on unseen domains with a 2.57% accuracy gain, effectively mitigating catastrophic forgetting and enhancing open-world adaptability. Our code is available at https://github.com/dannis97500/HEDP/.

Preprint, 2026, arXiv

NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

Although precise recall is a core objective in Retrieval-Augmented Generation (RAG), a critical oversight persists in the field: improvements in retrieval performance do not consistently translate to commensurate gains in downstream reasoning. To diagnose this gap, we propose the Recall Conversion Rate (RCR), a novel evaluation metric to quantify the contribution of retrieval to reasoning accuracy. Our quantitative analysis of mainstream RAG methods reveals that as Recall@5 improves, the RCR exhibits a near-linear decay. We identify the neglect of retrieval quality in these methods as the underlying cause. In contrast, approaches that focus solely on quality optimization often suffer from inferior recall performance. Both categories lack a comprehensive understanding of retrieval quality optimization, resulting in a trade-off dilemma. To address these challenges, we propose comprehensive retrieval quality optimization criteria and introduce the NeocorRAG framework. This framework achieves holistic retrieval quality optimization by systematically mining and utilizing Evidence Chains. Specifically, NeocorRAG first employs an innovative activated search algorithm to obtain a refined candidate space. Then it ensures precise evidence chain generation through constrained decoding. Finally, the retrieved set of evidence chains guides the retrieval optimization process. Evaluated on benchmarks including HotpotQA, 2WikiMultiHopQA, MuSiQue, and NQ, NeocorRAG achieves SOTA performance on both 3B and 70B parameter models, while consuming less than 20% of tokens used by comparable methods. This study presents an efficient, training-free paradigm for RAG enhancement that effectively optimizes retrieval quality while maintaining high recall. Our code is released at https://github.com/BUPT-Reasoning-Lab/NeocorRAG.

Preprint, 2026, arXiv

Prior Diffusiveness and Regret in the Linear-Gaussian Bandit

We prove that Thompson sampling exhibits $\tilde{O}(\sigma d \sqrt{T} + d r \sqrt{\mathrm{Tr}(\Sigma_0)})$ Bayesian regret in the linear-Gaussian bandit with a $\mathcal{N}(\mu_0, \Sigma_0)$ prior distribution on the coefficients, where $d$ is the dimension, $T$ is the time horizon, $r$ is the maximum $\ell_2$ norm of the actions, and $\sigma^2$ is the noise variance. In contrast to existing regret bounds, this shows that to within logarithmic factors, the prior-dependent "burn-in" term $d r \sqrt{\mathrm{Tr}(\Sigma_0)}$ decouples additively from the minimax (long run) regret $\sigma d \sqrt{T}$. Previous regret bounds exhibit a multiplicative dependence on these terms. We establish these results via a new "elliptical potential" lemma, and also provide a lower bound indicating that the burn-in term is unavoidable.

Preprint, 2026, arXiv

Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions

The efficacy of deep neural networks is heavily reliant on the design of non-linear activation functions, yet existing approaches often struggle to balance optimization stability with computational efficiency. While piecewise linear functions offer inference speed, they suffer from optimization instability due to non-differentiability at the origin, whereas smooth counterparts typically incur significant computational overhead through their reliance on transcendental operations. To address these limitations, this paper proposes a general smoothing framework based on constructive approximation theory and introduces the Bernstein Linear Unit (BerLU). This novel activation function utilizes Bernstein polynomials to construct a differentiable quadratic transition region that effectively eliminates singularities while maintaining a piecewise linear structure. Theoretical analysis demonstrates that the proposed method guarantees strictly continuous differentiability and a non-expansive Lipschitz constant of one, which ensures stable gradient propagation and prevents the gradient explosion problems common in deep architectures. Comprehensive empirical evaluations across representative Vision Transformer and Convolutional Neural Network architectures confirm that this approach consistently outperforms state-of-the-art baselines on standard image classification benchmarks while delivering superior computational and memory efficiency.
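
To make the idea of a Bernstein-polynomial transition region concrete, here is a hedged sketch that smooths ReLU with a degree-2 Bernstein (quadratic Bézier) segment on [-eps, eps]. The abstract does not give BerLU's exact definition, so the control points (0, 0, eps), the parameter `eps`, and all function names below are assumptions. The construction only shows how such a segment yields a continuously differentiable, 1-Lipschitz, piecewise function without transcendental operations.

```python
from math import comb


def bernstein(n, k, t):
    """Bernstein basis polynomial B_{k,n}(t) on [0, 1]."""
    return comb(n, k) * t**k * (1.0 - t) ** (n - k)


def transition_via_bernstein(x, eps=1.0):
    """Quadratic transition on [-eps, eps] with assumed control values 0, 0, eps."""
    t = (x + eps) / (2.0 * eps)          # map [-eps, eps] onto [0, 1]
    controls = (0.0, 0.0, eps)
    return sum(c * bernstein(2, k, t) for k, c in enumerate(controls))


def smoothed_relu(x, eps=1.0):
    """Linear outside the transition region, quadratic inside it."""
    if x <= -eps:
        return 0.0
    if x >= eps:
        return float(x)
    # Closed form of the Bernstein expression above: (x + eps)^2 / (4 * eps).
    return (x + eps) ** 2 / (4.0 * eps)


def smoothed_relu_grad(x, eps=1.0):
    """Derivative rises linearly from 0 to 1 across the transition region."""
    if x <= -eps:
        return 0.0
    if x >= eps:
        return 1.0
    return (x + eps) / (2.0 * eps)
```

Because the derivative stays within [0, 1] and matches the linear pieces at both joins, this sketch is continuously differentiable with Lipschitz constant one. Those are the two properties the abstract attributes to the proposed activation.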

Preprint, 2022, arXiv

AMinerGNN: Heterogeneous Graph Neural Network for Paper Click-through Rate Prediction with Fusion Query

Paper recommendation with user-generated keywords aims to suggest papers that simultaneously meet the user's interests and are relevant to the input keyword. This is a recommendation task with two queries, namely the user ID and the keyword. However, existing methods focus on recommendation according to one query, namely the user ID, and are not applicable to this problem. In this paper, we propose a novel click-through rate (CTR) prediction model with a heterogeneous graph neural network, called AMinerGNN, to recommend papers with two queries. Specifically, AMinerGNN constructs a heterogeneous graph to project users, papers, and keywords into the same embedding space by graph representation learning. To process two queries, a novel query attentive fusion layer is designed to recognize their importance dynamically and then fuse them as one query to build a unified and end-to-end recommender system. Experimental results on our proposed dataset and online A/B tests prove the superiority of AMinerGNN.

Preprint, 2022, arXiv

Automated Heart and Lung Auscultation in Robotic Physical Examinations

This paper presents the first implementation of autonomous robotic auscultation of heart and lung sounds. To select auscultation locations that generate high-quality sounds, a Bayesian Optimization (BO) formulation leverages visual anatomical cues to predict where high-quality sounds might be located, while using auditory feedback to adapt to patient-specific anatomical qualities. Sound quality is estimated online using machine learning models trained on a database of heart and lung stethoscope recordings. Experiments on 4 human subjects show that our system autonomously captures heart and lung sounds of similar quality compared to tele-operation by a human trained in clinical auscultation. Surprisingly, one of the subjects exhibited a previously unknown cardiac pathology that was first identified using our robot, which demonstrates the potential utility of autonomous robotic auscultation for health screening.

Preprint, 2022, arXiv

Excavation Reinforcement Learning Using Geometric Representation

Excavation of irregular rigid objects in clutter, such as fragmented rocks and wood blocks, is very challenging due to their complex interaction dynamics and highly variable geometries. In this paper, we adopt reinforcement learning (RL) to tackle this challenge and learn policies to plan a sequence of excavation trajectories for irregular rigid objects, given point clouds of excavation scenes. Moreover, we separately learn a compact representation of the point cloud on geometric tasks that do not require human labeling. We show that using the representation reduces training time for RL, while achieving similar asymptotic performance compared to an end-to-end RL algorithm. When using a policy trained in simulation directly on a real scene, we show that the policy trained with the representation outperforms end-to-end RL. To the best of our knowledge, this paper presents the first application of RL to plan a sequence of excavation trajectories of irregular rigid objects in clutter.

Preprint, 2022, arXiv

Hybrid integration of deterministic quantum dots-based single-photon sources with CMOS-compatible silicon carbide photonics

Thin film 4H-silicon carbide (4H-SiC) is emerging as a contender for realizing large-scale optical quantum circuits due to its high CMOS technology compatibility and large optical nonlinearities. However, challenges remain in producing wafer-scale 4H-SiC thin film on insulator (4H-SiCOI) for dense integration of photonic circuits, and in efficient coupling of deterministic quantum emitters that are essential for scalable quantum photonics. Here we demonstrate hybrid integration of self-assembled InGaAs quantum dot (QD) based single-photon sources (SPSs) with wafer-scale 4H-SiC photonic chips prepared by the ion slicing technique. By designing a bilayer vertical coupler, we realize generation and highly efficient routing of single-photon emission in the hybrid quantum photonic chip. Furthermore, we realize a chip-integrated beamsplitter operation for triggered single photons by fabricating a 1x2 multi-mode interferometer (MMI) with a symmetric power splitting ratio of 50:50. The successful demonstration of heterogeneously integrating QD-based SPSs on a 4H-SiC photonic chip prepared by the ion slicing technique constitutes an important step toward CMOS-compatible, fast reconfigurable quantum photonic circuits with deterministic SPSs.

Preprint, 2022, arXiv

Indexing Metric Spaces for Exact Similarity Search

With the continued digitization of societal processes, we are seeing an explosion in available data. This is referred to as big data. In a research setting, three aspects of the data are often viewed as the main sources of challenges when attempting to enable value creation from big data: volume, velocity, and variety. Many studies address volume or velocity, while fewer studies concern variety. Metric spaces are ideal for addressing variety because they can accommodate any data as long as it can be equipped with a distance notion that satisfies the triangle inequality. To accelerate search in metric spaces, a collection of indexing techniques for metric data has been proposed. However, existing surveys offer limited coverage, and a comprehensive empirical study has yet to be reported. We offer a comprehensive survey of existing metric indexes that support exact similarity search: we summarize existing partitioning, pruning, and validation techniques used by metric indexes to support exact similarity search; we provide the time and space complexity analyses of index construction; and we offer an empirical comparison of their query processing performance. Empirical studies are important when evaluating metric indexing performance, because performance can depend highly on the effectiveness of available pruning and validation as well as on the data distribution, which means that complexity analyses often offer limited insights. This article aims at revealing strengths and weaknesses of different indexing techniques to offer guidance on selecting an appropriate indexing technique for a given setting, and to provide directions for future research on metric indexing.

Preprint, 2021, arXiv

Systematic design and experimental demonstration of transmission-type multiplexed acoustic meta-holograms

Acoustic holograms have promising applications in sound-field reconstruction, particle manipulation, ultrasonic haptics and therapy. This paper reports on the theoretical, numerical, and experimental investigation of multiplexed acoustic holograms at both audio and ultrasonic frequencies via a rationally designed transmission-type acoustic metamaterial. The proposed meta-hologram is composed of two Fabry-Perot resonant channels per unit cell, which enables the simultaneous modulation of the transmitted amplitude and phase at two desired frequencies. In contrast to conventional acoustic metamaterial-based holograms, the design strategy proposed here provides a new degree of freedom (frequency) that can actively tailor holograms that are otherwise completely passive, and hence significantly enhances the information encoded in acoustic metamaterials. To demonstrate the multiplexed acoustic metamaterial, we first show the projection of two different high-quality meta-holograms at 14 kHz and 17 kHz, with the patterns of the letters N and S. We then demonstrate two-channel ultrasound focusing and annular beam generation for the incident ultrasonic frequencies of 35 kHz and 42.5 kHz. These multiplexed acoustic meta-holograms offer a technical advance to tackle the rising challenges in the fields of acoustic metamaterials, architectural acoustics, and medical ultrasound.

Preprint, 2020, arXiv

A Deep Learning Method for Complex Human Activity Recognition Using Virtual Wearable Sensors

Sensor-based human activity recognition (HAR) is now a research hotspot in multiple application areas. With the rise of smart wearable devices equipped with inertial measurement units (IMUs), researchers have begun to utilize IMU data for HAR. By employing machine learning algorithms, early IMU-based research for HAR can achieve accurate classification results on classical HAR datasets, containing only simple and repetitive daily activities. However, these datasets rarely display the rich diversity of information found in real scenes. In this paper, we propose a novel method based on deep learning for complex HAR in real scenes. Specifically, in the off-line training stage, the AMASS dataset, containing abundant human poses and virtual IMU data, is adopted to enhance variety and diversity. Moreover, a deep convolutional neural network with an unsupervised penalty is proposed to automatically extract the features of AMASS and improve robustness. In the on-line testing stage, by leveraging the advantages of transfer learning, we obtain the final result by fine-tuning part of the neural network (optimizing the parameters in the fully-connected layers) using the real IMU data. The experimental results show that the proposed method converges in a few iterations and achieves an accuracy of 91.15% on a real IMU dataset, demonstrating the efficiency and effectiveness of the proposed method.

Preprint, 2019, arXiv

Extreme Low-Frequency Ultrathin Acoustic Absorbing Metasurface

We introduce a multi-coiled acoustic metasurface providing quasi-perfect absorption (reaching 99.99% in experiments) at an extremely low frequency of 50 Hz, while featuring an ultrathin thickness down to λ/527 (1.3 cm). In contrast to the state of the art, this originally conceived multi-coiled metasurface offers additional degrees of freedom capable of tuning the acoustic impedance effectively without increasing the total thickness. We provide analytical derivation, numerical simulation and experimental demonstrations for this unique absorber concept, and discuss its physical mechanism, which breaks the quarter-wavelength resonator theory. Furthermore, based on the same conceptual approach, we propose a broadband low-frequency metasurface absorber by coupling unit cells exhibiting different properties.

Preprint, 2010, arXiv

Increasing risk behavior can outweigh the benefits of anti-retroviral drug treatment on the HIV incidence among men-having-sex-with-men in Amsterdam

The transmission through contacts among MSM (men who have sex with men) is one of the dominating contributors to HIV prevalence in industrialized countries. In Amsterdam, the capital of the Netherlands, the MSM risk group has been traced for decades. This has motivated studies which provide detailed information about MSM's risk behavior statistically, psychologically and sociologically. Despite the era of potent antiretroviral therapy, the incidence of HIV among MSM increases. In the long term the contradictory effects of risk behavior and effective therapy are still poorly understood. Using a previously presented Complex Agent Network model, we describe steady and casual partnerships to predict the HIV spreading among MSM. Behavior-related parameters and values, inferred from studies on Amsterdam MSM, are fed into the model; we validate the model using historical yearly incidence data. Subsequently, we study scenarios to assess the contradictory effects of risk behavior and effective therapy, by varying corresponding values of parameters. Finally, we conduct quantitative analysis based on the resulting incidence data. The simulated incidence reproduces the ACS historical incidence well and helps to predict the HIV epidemic among MSM in Amsterdam. Our results show that in the long run the positive influence of effective therapy can be outweighed by an increase in risk behavior of at least 30% for MSM. Conclusion: We recommend, based on the model predictions, that lowering risk behavior is the prominent control mechanism of HIV incidence even in the presence of effective therapy.

Preprint, 2010, arXiv

Service-Oriented Simulation Framework: An Overview and Unifying Methodology

The prevailing net-centric environment demands and enables modeling and simulation to combine efforts from numerous disciplines. Software techniques and methodology, in particular service-oriented architecture, provide such an opportunity. Service-oriented simulation has been an emerging paradigm following on from object- and process-oriented methods. However, the ad-hoc frameworks proposed so far generally focus on specific domains or systems and each has its pros and cons. They are capable of addressing different issues within service-oriented simulation from different viewpoints. It is increasingly important to describe and evaluate the progress of numerous frameworks. In this paper, we propose a novel three-dimensional reference model for a service-oriented simulation paradigm. The model can be used as a guideline or an analytic means to find the potential and possible future directions of the current simulation frameworks. In particular, the model inspects the crossover between the disciplines of modeling and simulation, service-orientation, and software/systems engineering. Based on the model, we present a comprehensive survey on several classical service-oriented simulation frameworks, including formalism-based, model-driven, interoperability protocol based, eXtensible Modeling and Simulation Framework (XMSF), and Open Grid Services Architecture (OGSA) based frameworks, among others. A comparison of these frameworks is also performed. Finally, the significance in both academia and practice is presented and future directions are pointed out.
