Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
20works
0followers
21topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

20 published item(s)

preprint2026arXiv

AIConfigurator: Lightning-Fast Configuration Optimization for Multi-Framework LLM Serving

Optimizing Large Language Model (LLM) inference in production systems is increasingly difficult due to dynamic workloads, stringent latency/throughput targets, and a rapidly expanding configuration space. This complexity spans not only distributed parallelism strategies (tensor/pipeline/expert) but also intricate framework-specific runtime parameters such as those concerning the enablement of CUDA graphs, available KV-cache memory fractions, and maximum token capacity, which drastically impact performance. The diversity of modern inference frameworks (e.g., TRT-LLM, vLLM, SGLang), each employing distinct kernels and execution policies, makes manual tuning both framework-specific and computationally prohibitive. We present AIConfigurator, a unified performance-modeling system that enables rapid, framework-agnostic inference configuration search without requiring GPU-based profiling. AIConfigurator combines (1) a methodology that decomposes inference into analytically modelable primitives - GEMM, attention, communication, and memory operations while capturing framework-specific scheduling dynamics; (2) a calibrated kernel-level performance database for these primitives across a wide range of hardware platforms and popular open-weights models (GPT-OSS, Qwen, DeepSeek, LLama, Mistral); and (3) an abstraction layer that automatically resolves optimal launch parameters for the target backend, seamlessly integrating into production-grade orchestration systems. Evaluation on production LLM serving workloads demonstrates that AIConfigurator identifies superior serving configurations that improve performance by up to 40% for dense models (e.g., Qwen3-32B) and 50% for MoE architectures (e.g., DeepSeek-V3), while completing searches within 30 seconds on average. Enabling the rapid exploration of vast design spaces - from cluster topology down to engine specific flags.

preprint2026arXiv

Red-Teaming Agent Execution Contexts: Open-World Security Evaluation on OpenClaw

Agentic language-model systems increasingly rely on mutable execution contexts, including files, memory, tools, skills, and auxiliary artifacts, creating security risks beyond explicit user prompts. This paper presents DeepTrap, an automated framework for discovering contextual vulnerabilities in OpenClaw. DeepTrap formulates adversarial context manipulation as a black-box trajectory-level optimization problem that balances risk realization, benign-task preservation, and stealth. It combines risk-conditioned evaluation, multi-objective trajectory scoring, reward-guided beam search, and reflection-based deep probing to identify high-value compromised contexts. We construct a 42-case benchmark spanning six vulnerability classes and seven operational scenarios, and evaluate nine target models using attack and utility grading scores. Results show that contextual compromise can induce substantial unsafe behavior while preserving user-facing task completion, demonstrating that final-response evaluation is insufficient. The findings highlight the need for execution-centric security evaluation of agentic AI systems. Our code is released at: https://github.com/ZJUICSR/DeepTrap

preprint2026arXiv

SLASH the Sink: Sharpening Structural Attention Inside LLMs

Large Language Models (LLMs) show remarkable semantic understanding but often struggle with structural understanding when processing graph topologies in a serialized format. Existing solutions rely on training external graph-based adapters or fine-tuning, which incur high costs and lost generalizability. In this work, we investigate the internal mechanisms of LLMs and present a critical finding: LLMs spontaneously reconstruct the graph's topology internally, evidenced by a distinct "sawtooth" pattern in their attention maps that structurally aligns with the "token-level adjacency matrix". However, this intrinsic structural understanding is diluted by the attention sink. We theoretically formalize this dilution as a representation bottleneck, stemming from a fundamental conflict: the model's anisotropic bias, essential for language tasks, suppresses the topology-aware local aggregation required for graph reasoning. To address this, we propose a training-free solution, named StructuraL Attention SHarpening (SLASH), which amplifies this internal structural understanding via a plug-and-play attention redistribution. Experiments on pure graph tasks and molecular prediction validate that SLASH delivers significant and consistent performance gains across diverse LLMs.

preprint2026arXiv

XSearch: Explainable Code Search via Concept-to-Code Alignment

Semantic code search has been widely adopted in both academia and industry. These approaches embed natural-language queries and code snippets into a shared embedding space and retrieve results based on vector similarity. Despit strong performance on benchmark datasets, they often suffer from poor explainability and generalization. Retrieved code may appear semantically similar yet miss critical functional requirements of the query, while providing no explanation of why the result was retrieved. Moreover, such failures become more severe under distribution shift, where models struggle to generalize to unseen benchmarks. In this work, we propose XSearch, an intrinsically explainable code search framework. Our key insight is that by relying on global embedding similarity, existing retrievers inherently take an inductive view. They learn statistical patterns rather than truly understanding the query's functional requirements. We address this problem by reformulating code search as a deductive concept alignment problem. XSearch (i) identifies functional concepts in the query and (ii) explicitly aligns them with corresponding code statements. This explain-then-predict design produces inherent concept-level explanations and mitigates shortcut learning that harms out-of-distribution generalization. We train an encoder with explicit concept-alignment objectives and perform retrieval through explicit matching between query concepts and code statements. Experiments show that, trained on CodeSearchNet using GraphCodeBERT (125M parameters), XSearch improves performance on out-of-distribution benchmarks from 0.02 to 0.33 (15x) over eight state-of-the-art retrievers, and consistently outperforms both encoder- and decoder-based baselines with up to 7B parameters. A user study demonstrates that concept-alignment explanations enable users to evaluate retrieved results faster and more accurately.

preprint2023arXiv

Flexible Alignment Super-Resolution Network for Multi-Contrast MRI

Magnetic resonance imaging plays an essential role in clinical diagnosis by acquiring the structural information of biological tissue. Recently, many multi-contrast MRI super-resolution networks achieve good effects. However, most studies ignore the impact of the inappropriate foreground scale and patch size of multi-contrast MRI, which probably leads to inappropriate feature alignment. To tackle this problem, we propose the Flexible Alignment Super-Resolution Network (FASR-Net) for multi-contrast MRI Super-Resolution. The Flexible Alignment module of FASR-Net consists of two modules for feature alignment. (1) The Single-Multi Pyramid Alignment(S-A) module solves the situation where low-resolution (LR) images and reference (Ref) images have different scales. (2) The Multi-Multi Pyramid Alignment(M-A) module solves the situation where LR and Ref images have the same scale. Besides, we propose the Cross-Hierarchical Progressive Fusion (CHPF) module aiming at fusing the features effectively, further improving the image quality. Compared with other state-of-the-art methods, FASR-net achieves the most competitive results on FastMRI and IXI datasets. Our code will be available at \href{https://github.com/yimingliu123/FASR-Net}{https://github.com/yimingliu123/FASR-Net}.

preprint2022arXiv

Active Phase-Encode Selection for Slice-Specific Fast MR Scanning Using a Transformer-Based Deep Reinforcement Learning Framework

Purpose: Long scan time in phase encoding for forming complete K-space matrices is a critical drawback of MRI, making patients uncomfortable and wasting important time for diagnosing emergent diseases. This paper aims to reducing the scan time by actively and sequentially selecting partial phases in a short time so that a slice can be accurately reconstructed from the resultant slice-specific incomplete K-space matrix. Methods: A transformer based deep reinforcement learning framework is proposed for actively determining a sequence of partial phases according to reconstruction-quality based Q-value (a function of reward), where the reward is the improvement degree of reconstructed image quality. The Q-value is efficiently predicted from binary phase-indicator vectors, incomplete K-space matrices and their corresponding undersampled images with a light-weight transformer so that the sequential information of phases and global relationship in images can be used. The inverse Fourier transform is employed for efficiently computing the undersampled images and hence gaining the rewards of selecting phases. Results: Experimental results on the fastMRI dataset with original K-space data accessible demonstrate the efficiency and accuracy superiorities of proposed method. Compared with the state-of-the-art reinforcement learning based method proposed by Pineda et al., the proposed method is roughly 150 times faster and achieves significant improvement in reconstruction accuracy. Conclusions: We have proposed a light-weight transformer based deep reinforcement learning framework for generating high-quality slice-specific trajectory consisting of a small number of phases. The proposed method, called TITLE (Transformer Involved Trajectory LEarning), has remarkable superiority in phase-encode selection efficiency and image reconstruction accuracy.

preprint2022arXiv

Beamforming Design and Performance Evaluation for Reconfigurable Intelligent Surface Assisted Wireless Communication Systems With Non-Ideal Hardware

Reconfigurable intelligent surface (RIS) can effectively control the wavefront of the impinging signals and has emerged as a cost-effective promising solution to improve the spectrum and energy efficiency of wireless systems. Most existing researches on RIS assume that the hardware operations are perfect. However, both physical transceiver and RIS suffer from inevitable hardware impairments in practice, which can lead to severe system performance degradation and increase the complexity of beamforming optimization. Consequently, the existing researches on RIS, including channel estimation, beamforming optimization, spectrum and energy efficiency analysis, etc., cannot directly apply to the case of hardware impairments. In this paper, by taking hardware impairments into consideration, we conduct the joint transmit and reflect beamforming optimization, and reevaluate the system performance. First, we characterize the closed-form estimators of direct and cascaded channels in both single-user and multi-user cases, and analyze the impact of hardware impairments on channel estimation accuracy. Then, the optimal transmit beamforming solution is derived, and a gradient descent method-based algorithm is also proposed to optimize the reflect beamforming. Moreover, we analyze the three types of asymptotic channel capacities with respect to the transmit power, the antenna number, and the reflecting element number. Finally, in terms of the system energy consumption, we analyze the power scaling law and the energy efficiency. Our experimental results also reveal an encouraging phenomenon that the RIS-assisted wireless system with massive reflecting elements can achieve both high spectrum and energy efficiency without the need for massive antennas and without allocating too many resources to optimize the reflect beamforming.

preprint2022arXiv

Energy-efficient Caching and Task offloading for Timely Status Updates in UAV-assisted VANETs

Intelligent edge network is maturing to enable smart and efficient transportation systems. In this letter, we consider unmanned aerial vehicle (UAV)-assisted vehicular networks where UAVs provide caching and computing services in complement with base station (BS). One major challenge is that vehicles need to obtain timely situational awareness via orchestration of ubiquitous caching and computing resources. Note that cached data for vehicles' perception tasks contains time-varying context information, thus freshness of cached data should be considered in conjunction with task execution to guarantee timeliness of obtained status updates. To this end, we propose a two-stage performance metric to quantify the impact of cache refreshing and computation offloading decisions on the age of status updates. We formulate an energy minimization problem by jointly considering cache refreshing, computation offloading and aging of status updates. To facilitate online decision making, we propose a deep deterministic policy gradient(DDPG)-based solution procedure and incorporate differentiated experience replay mechanism to accelerate convergence. Simulation results show that the performance of proposed solution is competitive in terms of energy consumption for obtaining fresh status updates.

preprint2022arXiv

Exploring fundamental laws of classical mechanics via predicting the orbits of planets based on neural networks

Neural networks have provided powerful approaches to solve various scientific problems. Many of them are even difficult for human experts who are good at accessing the physical laws from experimental data. We investigate whether neural networks can assist us in exploring the fundamental laws of classical mechanics from data of planetary motion. Firstly, we predict the orbits of planets in the geocentric system using the gate recurrent unit, one of the common neural networks. We find that the precision of the prediction is obviously improved when the information of the Sun is included in the training set. This result implies that the Sun is particularly important in the geocentric system without any prior knowledge, which inspires us to gain Copernicus' heliocentric theory. Secondly, we turn to the heliocentric system and make successfully mutual predictions between the position and velocity of planets. We hold that the successful prediction is due to the existence of enough conserved quantities (such as conservations of mechanical energy and angular momentum) in the system. Our research provides a new way to explore the existence of conserved quantities in mechanics system based on neural networks.

preprint2022arXiv

FORCE: A Framework of Rule-Based Conversational Recommender System

The conversational recommender systems (CRSs) have received extensive attention in recent years. However, most of the existing works focus on various deep learning models, which are largely limited by the requirement of large-scale human-annotated datasets. Such methods are not able to deal with the cold-start scenarios in industrial products. To alleviate the problem, we propose FORCE, a Framework Of Rule-based Conversational Recommender system that helps developers to quickly build CRS bots by simple configuration. We conduct experiments on two datasets in different languages and domains to verify its effectiveness and usability.

preprint2022arXiv

Robust PCA Unrolling Network for Super-resolution Vessel Extraction in X-ray Coronary Angiography

Although robust PCA has been increasingly adopted to extract vessels from X-ray coronary angiography (XCA) images, challenging problems such as inefficient vessel-sparsity modelling, noisy and dynamic background artefacts, and high computational cost still remain unsolved. Therefore, we propose a novel robust PCA unrolling network with sparse feature selection for super-resolution XCA vessel imaging. Being embedded within a patch-wise spatiotemporal super-resolution framework that is built upon a pooling layer and a convolutional long short-term memory network, the proposed network can not only gradually prune complex vessel-like artefacts and noisy backgrounds in XCA during network training but also iteratively learn and select the high-level spatiotemporal semantic information of moving contrast agents flowing in the XCA-imaged vessels. The experimental results show that the proposed method significantly outperforms state-of-the-art methods, especially in the imaging of the vessel network and its distal vessels, by restoring the intensity and geometry profiles of heterogeneous vessels against complex and dynamic backgrounds.

preprint2022arXiv

The role of haptic communication in dyadic collaborative object manipulation tasks

Intuitive and efficient physical human-robot collaboration relies on the mutual observability of the human and the robot, i.e. the two entities being able to interpret each other's intentions and actions. This is remedied by a myriad of methods involving human sensing or intention decoding, as well as human-robot turn-taking and sequential task planning. However, the physical interaction establishes a rich channel of communication through forces, torques and haptics in general, which is often overlooked in industrial implementations of human-robot interaction. In this work, we investigate the role of haptics in human collaborative physical tasks, to identify how to integrate physical communication in human-robot teams. We present a task to balance a ball at a target position on a board either bimanually by one participant, or dyadically by two participants, with and without haptic information. The task requires that the two sides coordinate with each other, in real-time, to balance the ball at the target. We found that with training the completion time and number of velocity peaks of the ball decreased, and that participants gradually became consistent in their braking strategy. Moreover we found that the presence of haptic information improved the performance (decreased completion time) and led to an increase in overall cooperative movements. Overall, our results show that humans can better coordinate with one another when haptic feedback is available. These results also highlight the likely importance of haptic communication in human-robot physical interaction, both as a tool to infer human intentions and to make the robot behaviour interpretable to humans.

preprint2021arXiv

Blockchain-empowered Data-driven Networks: A Survey and Outlook

The paths leading to future networks are pointing towards a data-driven paradigm to better cater to the explosive growth of mobile services as well as the increasing heterogeneity of mobile devices, many of which generate and consume large volumes and variety of data. These paths are also hampered by significant challenges in terms of security, privacy, services provisioning, and network management. Blockchain, which is a technology for building distributed ledgers that provide an immutable log of transactions recorded in a distributed network, has become prominent recently as the underlying technology of cryptocurrencies and is revolutionizing data storage and processing in computer network systems. For future data-driven networks (DDNs), blockchain is considered as a promising solution to enable the secure storage, sharing, and analytics of data, privacy protection for users, robust, trustworthy network control, and decentralized routing and resource managements. However, many important challenges and open issues remain to be addressed before blockchain can be deployed widely to enable future DDNs. In this article, we present a survey on the existing research works on the application of blockchain technologies in computer networks, and identify challenges and potential solutions in the applications of blockchains in future DDNs. We identify application scenarios in which future blockchain-empowered DDNs could improve the efficiency and security, and generally the effectiveness of network services.

preprint2021arXiv

Hierarchical Reinforcement Learning for Relay Selection and Power Optimization in Two-Hop Cooperative Relay Network

Cooperative communication is an effective approach to improve spectrum utilization. In order to reduce outage probability of communication system, most studies propose various schemes for relay selection and power allocation, which are based on the assumption of channel state information (CSI). However, it is difficult to get an accurate CSI in practice. In this paper, we study the outage probability minimizing problem subjected to a total transmission power constraint in a two-hop cooperative relay network. We use reinforcement learning (RL) methods to learn strategies for relay selection and power allocation, which do not need any prior knowledge of CSI but simply rely on the interaction with communication environment. It is noted that conventional RL methods, including most deep reinforcement learning (DRL) methods, cannot perform well when the search space is too large. Therefore, we first propose a DRL framework with an outage-based reward function, which is then used as a baseline. Then, we further propose a hierarchical reinforcement learning (HRL) framework and training algorithm. A key difference from other RL-based methods in existing literatures is that, our proposed HRL approach decomposes relay selection and power allocation into two hierarchical optimization objectives, which are trained in different levels. With the simplification of search space, the HRL approach can solve the problem of sparse reward, while the conventional RL method fails. Simulation results reveal that compared with traditional DRL method, the HRL training algorithm can reach convergence 30 training iterations earlier and reduce the outage probability by 5% in two-hop relay network with the same outage threshold.

preprint2021arXiv

Integrating Pre-trained Model into Rule-based Dialogue Management

Rule-based dialogue management is still the most popular solution for industrial task-oriented dialogue systems for their interpretablility. However, it is hard for developers to maintain the dialogue logic when the scenarios get more and more complex. On the other hand, data-driven dialogue systems, usually with end-to-end structures, are popular in academic research and easier to deal with complex conversations, but such methods require plenty of training data and the behaviors are less interpretable. In this paper, we propose a method to leverages the strength of both rule-based and data-driven dialogue managers (DM). We firstly introduce the DM of Carina Dialog System (CDS, an advanced industrial dialogue system built by Microsoft). Then we propose the "model-trigger" design to make the DM trainable thus scalable to scenario changes. Furthermore, we integrate pre-trained models and empower the DM with few-shot capability. The experimental results demonstrate the effectiveness and strong few-shot capability of our method.

preprint2021arXiv

Metaknowledge Extraction Based on Multi-Modal Documents

The triple-based knowledge in large-scale knowledge bases is most likely lacking in structural logic and problematic of conducting knowledge hierarchy. In this paper, we introduce the concept of metaknowledge to knowledge engineering research for the purpose of structural knowledge construction. Therefore, the Metaknowledge Extraction Framework and Document Structure Tree model are presented to extract and organize metaknowledge elements (titles, authors, abstracts, sections, paragraphs, etc.), so that it is feasible to extract the structural knowledge from multi-modal documents. Experiment results have proved the effectiveness of metaknowledge elements extraction by our framework. Meanwhile, detailed examples are given to demonstrate what exactly metaknowledge is and how to generate it. At the end of this paper, we propose and analyze the task flow of metaknowledge applications and the associations between knowledge and metaknowledge.

preprint2020arXiv

Channel Estimation and Power Scaling Law of Large Reflecting Surface with Non-Ideal Hardware

Large reflecting surface (LRS) has emerged as a new solution to improve the energy and spectrum efficiency of wireless communication system. Most existing studies were conducted with an assumption of ideal hardware, and the impact of hardware impairments receives little attention. However, the non-negligible hardware impairments should be taken into consideration when we evaluate the system performance. In this paper, we consider an LRS assisted communication system with hardware impairments, and focus on the channel estimation study and the power scaling law analysis. First, with linear minimum mean square error estimation, we theoretically characterize the relationship between channel estimation performance and impairment level, number of reflecting elements, and pilot power. After that, we analyze the power scaling law and reveal that if the base station (BS) has perfect channel state information, the transmit power of user can be made inversely proportional to the number of BS antennas and the square of the number of reflecting elements with no reduction in performance; If the BS has imperfectly estimated channel state information, to achieve the same performance, the transmit power of user can be made inversely proportional to the square-root of the number of BS antennas and the square of the number of reflecting elements.

preprint2020arXiv

Energy Efficiency Analysis of Intelligent Reflecting Surface System with Hardware Impairments

Recently, as explosive growth of mobile data traffic, the performance of wireless communication systems requires to be enhanced significantly in future. Intelligent reflecting surface (IRS) can be used as a promising way to improve the energy efficiency of wireless communications with less complexity and hardware cost. Most existing studies consider the cases with ideal hardware, however, both physical transceiver and IRS suffer from non-negligible hardware impairments which may greatly degrade the system performance. In this paper, by taking hardware impairments into consideration, we focus on the energy efficiency analysis of IRS system. Our first contribution is to derive the optimal receive combining and transmit beamforming vectors. After that, we characterize the asymptotic channel capacity. With the derived asymptotic channel capacity and the power consumption model, the analytical upper and lower bounds of energy efficiency are provided. Our results show that an IRS system can achieve both high spectral efficiency and high energy efficiency with moderate number of antennas. This observation is encouraging for that there is no need to cost a lot on expensive high-quality antennas, which corresponds to the requirements of new communication paradigms.

preprint2020arXiv

Investigating Bottom-Quark Yukawa Interaction at Higgs Factory

Measuring the fermion Yukawa coupling constants is important for understanding the origin of the fermion masses and its relationship to the spontaneously electroweak symmetry breaking. On the other hand, some new physics models will change the Lorentz structure of the Yukawa interactions between the standard model (SM) fermions and the SM-like Higgs boson even in their decoupling limit. Thus the precisely measurement of the fermion Yukawa interactions is a powerful tool of new physics searching in the decoupling limit. In this work, we show the possibility of investigating the Lorentz structure of the bottom-quark Yukawa interaction with the 125GeV SM-like Higgs boson at future $e^+e^-$ colliders.

preprint2020arXiv

Reconfigurable Intelligent Surface Aided Wireless Localization

The advantages of millimeter-wave and large antenna arrays technologies for accurate wireless localization received extensive attentions recently. However, how to further improve the accuracy of wireless localization, even in the case of obstructed line-of-sight, is largely undiscovered. In this paper, the reconfigurable intelligent surface (RIS) is introduced into the system to make the positioning more accurate. First, we establish the three-dimensional RIS-assisted wireless localization channel model. After that, we derive the Fisher information matrix and the Cramer-Rao lower bound for evaluating the estimation of absolute mobile station position. Finally, we propose an alternative optimization method and a gradient decent method to optimize the reflect beamforming, which aims to minimize the Cramer-Rao lower bound to obtain a more accurate estimation. Our results show that the proposed methods significantly improve the accuracy of positioning, and decimeter-level or even centimeter-level positioning can be achieved by utilizing the RIS with a large number of reflecting elements.