Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
24 works
0 followers
18 topics
4 close collaborators

Published work

24 published items

preprint · 2026 · arXiv

Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling

Despite rigorous safety alignment, Large Language Models (LLMs) remain vulnerable to jailbreak attacks. Existing black-box methods often rely on heuristic templates or exhaustive trials, lacking mechanistic interpretability and query efficiency. In this study, we investigate an intrinsic vulnerability in the safety mechanisms of LLMs, where safety alignment relies on a small set of sparsely distributed attention heads, leaving much of the representational space weakly monitored. We formalize this phenomenon with a mathematical jailbreaking model that characterizes the delicate boundary of effective text obfuscation and analytically explains observed jailbreak behaviors. Guided by this model, we propose Babel, an efficient black-box attack framework that exploits the identified safety gap through systematic obfuscation sampling with iterative, feedback-driven distribution refinement, enabling reliable and high-success jailbreak attacks without access to model internals. Comprehensive evaluations on frontier commercial models demonstrate that Babel achieves state-of-the-art attack success rates and superior query efficiency. Specifically, compared to state-of-the-art methods, Babel increases the attack success rate on GPT-4o from 41.33% to 82.67% and on Claude-3-5-haiku from 38.33% to 78.33% within an average of 40 queries, providing a robust red-teaming methodology for LLM safety research.

preprint · 2026 · arXiv

Backpropagation-Free Test-Time Adaptation for Lightweight EEG-Based Brain-Computer Interfaces

Electroencephalogram (EEG)-based brain-computer interfaces (BCIs) face significant deployment challenges due to inter-subject variability, signal non-stationarity, and computational constraints. While test-time adaptation (TTA) mitigates distribution shifts under online data streams without per-use calibration sessions, existing TTA approaches heavily rely on explicitly defined loss objectives that require backpropagation for updating model parameters, which incurs computational overhead, privacy risks, and sensitivity to noisy data streams. This paper proposes Backpropagation-Free Transformations (BFT), a TTA approach for EEG decoding that eliminates such issues. BFT applies multiple sample-wise transformations of knowledge-guided augmentations or approximate Bayesian inference to each test trial, generating multiple prediction scores for a single test sample. A learning-to-rank module enhances the weighting of these predictions, enabling robust aggregation for uncertainty suppression during inference under theoretical justifications. Extensive experiments on five EEG datasets of motor imagery classification and driver drowsiness regression tasks demonstrate the effectiveness, versatility, robustness, and efficiency of BFT. This research enables lightweight plug-and-play BCIs on resource-constrained devices, broadening the real-world deployment of decoding algorithms for EEG-based BCI.

preprint · 2026 · arXiv

Imprints of Dark Photons on Gravitational Wave Polarizations

We study conversion processes between gravitons and dark photons and reveal the effects of dark photons on the polarization of gravitational waves. Considering cosmological dark magnetic fields, we investigate the evolution of the intensity and polarization of gravitational waves through the conversion. Specifically, we demonstrate that for minimal coupling between gravitons and dark photons, the intensity, circular polarization, and linear polarization evolve separately. We derive explicit formulas for the statistical mean and variance of the intensity and polarization when the gravitational waves pass through magnetic fields with random orientation. The formulas capture how the initial polarization of dark photons will be imprinted on the observed gravitational wave background.

preprint · 2026 · arXiv

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Structuring latent representations in a hierarchical manner enables models to learn patterns at multiple levels of abstraction. However, most prevalent image understanding models focus on visual similarity, and learning visual hierarchies is relatively unexplored. In this work, for the first time, we introduce a learning paradigm that can encode user-defined multi-level complex visual hierarchies in hyperbolic space without requiring explicit hierarchical labels. As a concrete example, first, we define a part-based image hierarchy using object-level annotations within and across images. Then, we introduce an approach to enforce the hierarchy using contrastive loss with pairwise entailment metrics. Finally, we discuss new evaluation metrics to effectively measure hierarchical image retrieval. Encoding these complex relationships ensures that the learned representations capture semantic and structural information that transcends mere visual similarity. Experiments in part-based image retrieval show significant improvements in hierarchical retrieval tasks, demonstrating the capability of our model in capturing visual hierarchies.

preprint · 2025 · arXiv

Dual radar-guided glide path error correction based on the Izhikevich neuron model

Aiming at the ranging and angle measurement errors caused by target reflection characteristics and system noise in dual radar tracking, this paper proposes a dual radar track error correction method based on the Izhikevich neural model. The network uses the dynamic differential equation of the Izhikevich model to simulate the discharge characteristics of biological neurons. Its input layer integrates the coordinate measurement data of the dual radar, and the output layer represents the error compensation amount through the pulse emission frequency. The spike-timing-dependent plasticity (STDP) is used to adjust the neuron connection weights dynamically, and the trajectory distortion caused by system noise and radar ranging and angle measurement errors can be effectively suppressed.
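As a rough illustration of the neuron dynamics the abstract refers to, the sketch below integrates the standard Izhikevich equations with forward Euler steps and counts spikes, so that the firing frequency can be read as an output signal. The function name `simulate_izhikevich`, the step size, and the regular-spiking parameter values are illustrative assumptions; the radar input layer, the STDP weight updates, and the error-compensation mapping described in the paper are not modelled here.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.25,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Count spikes of a single Izhikevich neuron driven by a constant current.

    Standard model equations:
        dv/dt = 0.04*v^2 + 5*v + 140 - u + I
        du/dt = a*(b*v - u)
        if v >= 30 mV: v <- c, u <- u + d
    The default a, b, c, d are the commonly used regular-spiking values
    (an assumption, not taken from the paper).
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spikes = 0
    for _ in range(int(duration_ms / dt)):
        # Forward Euler step for the membrane potential and recovery variable.
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            # Spike: reset the potential and bump the recovery variable.
            v = c
            u += d
            spikes += 1
    return spikes


if __name__ == "__main__":
    # A stronger input current should produce a higher firing frequency.
    for current in (0.0, 10.0, 20.0):
        print(current, simulate_izhikevich(current))
```

With no input the neuron settles to rest, and the spike count grows with the input current, which is the rate-coding behaviour the output layer in the abstract relies on.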

preprint · 2024 · arXiv

Nearfield Vortex Dynamics of Supercell Bloch Modes

Densely arranged optical vortices are natural solutions of high-symmetry Bloch modes in photonic crystals. However, strict symmetry constraints limit the potential spatial configurations of nearfield vortices, restricting the control over light-matter interaction. Here, we demonstrate a nearfield vortex dynamic within a supercell photonic crystal. By introducing paired rotations of triangular structures, we achieve high-quality-factor Bloch mode transition from evanescent valley modes, to quasi-bound states in the continuum, frustrated modes, and quasi-valleys. Each stage exhibits distinct nearfield vortex distributions, nonlinear overlap properties, and quality factors, revealing diverse physical behaviors for tailoring light-matter interaction. Notably, the asymmetric vortex configuration of frustrated modes enhances second harmonic generation, driven by an optimized nonlinear overlap factor. Our paired-rotation strategy offers a versatile design framework for creating supercell photonic crystals with unique nearfield vortex properties, presenting promising applications in lasing, nonlinear optics and optical forces.

preprint · 2022 · arXiv

A Linear Comb Filter for Event Flicker Removal

Event cameras are bio-inspired sensors that capture per-pixel asynchronous intensity change rather than the synchronous absolute intensity frames captured by a classical camera sensor. Such cameras are ideal for robotics applications since they have high temporal resolution, high dynamic range and low latency. However, due to their high temporal resolution, event cameras are particularly sensitive to flicker such as from fluorescent or LED lights. During every cycle from bright to dark, pixels that image a flickering light source generate many events that provide little or no useful information for a robot, swamping the useful data in the scene. In this paper, we propose a novel linear filter to preprocess event data to remove unwanted flicker events from an event stream. The proposed algorithm achieves over 4.6 times relative improvement in the signal-to-noise ratio when compared to the raw event stream due to the effective removal of flicker from fluorescent lighting. Thus, it is ideally suited to robotics applications that operate in indoor settings or scenes illuminated by flickering light sources.

preprint · 2022 · arXiv

An Asynchronous Kalman Filter for Hybrid Event Cameras

Event cameras are ideally suited to capture HDR visual information without blur but perform poorly on static or slowly changing scenes. Conversely, conventional image sensors measure absolute intensity of slowly changing scenes effectively but do poorly on high dynamic range or quickly changing scenes. In this paper, we present an event-based video reconstruction pipeline for High Dynamic Range (HDR) scenarios. The proposed algorithm includes a frame augmentation pre-processing step that deblurs and temporally interpolates frame data using events. The augmented frame and event data are then fused using a novel asynchronous Kalman filter under a unifying uncertainty model for both sensors. Our experimental results are evaluated on both publicly available datasets with challenging lighting conditions and fast motions and our new dataset with HDR reference. The proposed algorithm outperforms state-of-the-art methods in both absolute intensity error (48% reduction) and image similarity indexes (average 11% improvement).

preprint · 2022 · arXiv

InvisibiliTee: Angle-agnostic Cloaking from Person-Tracking Systems with a Tee

After surveying the privacy concerns raised by person-tracking systems, we propose InvisibiliTee, a black-box adversarial attack method against state-of-the-art human detection models. The method learns printable adversarial patterns for T-shirts that cloak wearers in the physical world in front of person-tracking systems. We design an angle-agnostic learning scheme which utilizes segmentation of the fashion dataset and a geometric warping process so the adversarial patterns generated are effective in fooling person detectors from all camera angles and for unseen black-box detection models. Empirical results in both digital and physical environments show that with the InvisibiliTee on, person-tracking systems' ability to detect the wearer drops significantly.

preprint · 2022 · arXiv

Shap-CAM: Visual Explanations for Convolutional Neural Networks based on Shapley Value

Explaining deep convolutional neural networks has been recently drawing increasing attention since it helps to understand the networks' internal operations and why they make certain decisions. Saliency maps, which emphasize salient regions largely connected to the network's decision-making, are one of the most common ways for visualizing and analyzing deep networks in the computer vision community. However, saliency maps generated by existing methods cannot represent authentic information in images due to the unproven proposals about the weights of activation maps which lack solid theoretical foundation and fail to consider the relations between each pixel. In this paper, we develop a novel post-hoc visual explanation method called Shap-CAM based on class activation mapping. Unlike previous gradient-based approaches, Shap-CAM gets rid of the dependence on gradients by obtaining the importance of each pixel through Shapley value. We demonstrate that Shap-CAM achieves better visual performance and fairness for interpreting the decision making process. Our approach outperforms previous methods on both recognition and localization tasks.

preprint · 2022 · arXiv

Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search

In this paper, we propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search. Differentiable architecture search (DARTS) acquires the optimal architectures by optimizing the architecture parameters with gradient descent, which significantly reduces the search cost. However, the magnitude of architecture parameters updated by gradient descent fails to reveal the actual operation importance to the task performance and therefore harms the effectiveness of obtained architectures. By contrast, we propose to evaluate the direct influence of operations on validation accuracy. To deal with the complex relationships between supernet components, we leverage Shapley value to quantify their marginal contributions by considering all possible combinations. Specifically, we iteratively optimize the supernet weights and update the architecture parameters by evaluating operation contributions via Shapley value, so that the optimal architectures are derived by selecting the operations that contribute significantly to the tasks. Since the exact computation of Shapley value is NP-hard, the Monte-Carlo sampling based algorithm with early truncation is employed for efficient approximation, and the momentum update mechanism is adopted to alleviate fluctuation of the sampling process. Extensive experiments on various datasets and various search spaces show that our Shapley-NAS outperforms the state-of-the-art methods by a considerable margin with light search cost. The code is available at https://github.com/Euphoria16/Shapley-NAS.git
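The Monte-Carlo approximation with early truncation mentioned in the abstract can be sketched generically as permutation sampling. This is a minimal illustration, not the authors' implementation (which is in the repository linked above): the function name `monte_carlo_shapley`, the `value_fn` callback standing in for validation accuracy, and the truncation tolerance `tol` are all assumptions, and the momentum update is omitted.

```python
import random


def monte_carlo_shapley(players, value_fn, num_samples=200, tol=1e-3, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players.

    value_fn takes a frozenset of players and returns a number
    (a hypothetical stand-in for a validation metric).
    """
    rng = random.Random(seed)
    players = list(players)
    full_value = value_fn(frozenset(players))
    empty_value = value_fn(frozenset())
    totals = {p: 0.0 for p in players}

    for _ in range(num_samples):
        order = players[:]
        rng.shuffle(order)
        coalition = set()
        prev = empty_value
        for p in order:
            # Early truncation: once the coalition is already worth about as
            # much as the full set, treat remaining marginals as zero.
            if abs(full_value - prev) < tol:
                break
            coalition.add(p)
            cur = value_fn(frozenset(coalition))
            totals[p] += cur - prev
            prev = cur

    return {p: totals[p] / num_samples for p in players}


if __name__ == "__main__":
    # Toy additive game: each player's Shapley value equals its own weight.
    weights = {"conv3x3": 3.0, "skip": 1.0, "pool": 2.0}
    estimate = monte_carlo_shapley(weights, lambda s: sum(weights[p] for p in s))
    print(estimate)
```

Sampling orderings avoids enumerating all coalitions, and truncation saves evaluations of `value_fn`, which would be the expensive step if each call required evaluating a network.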

preprint · 2022 · arXiv

Smart Explorer: Recognizing Objects in Dense Clutter via Interactive Exploration

Recognizing objects in dense clutter accurately plays an important role in a wide variety of robotic manipulation tasks including grasping, packing, rearranging and many others. However, conventional visual recognition models usually miss objects because of the significant occlusion among instances and make incorrect predictions due to the visual ambiguity caused by high object crowdedness. In this paper, we propose an interactive exploration framework called Smart Explorer for recognizing all objects in dense clutter. Our Smart Explorer physically interacts with the clutter to maximize the recognition performance while minimizing the number of motions, so that false positives and negatives can be alleviated effectively with the optimal accuracy-efficiency trade-off. Specifically, we first collect multi-view RGB-D images of the clutter and reconstruct the corresponding point cloud. By aggregating the instance segmentation of RGB images across views, we acquire the instance-wise point cloud partition of the clutter, through which the existing classes and the number of objects for each class are predicted. The pushing actions for effective physical interaction are generated to sizably reduce the recognition uncertainty, which consists of the instance segmentation entropy and multi-view object disagreement. Therefore, the optimal accuracy-efficiency trade-off of object recognition in dense clutter is achieved via iterative instance prediction and physical interaction. Extensive experiments demonstrate that our Smart Explorer achieves promising recognition accuracy with only a few actions and outperforms random pushing by a large margin.

preprint · 2022 · arXiv

Smart Visual Beacons with Asynchronous Optical Communications using Event Cameras

Event cameras are bio-inspired dynamic vision sensors that respond to changes in image intensity with a high temporal resolution, high dynamic range and low latency. These sensor characteristics are ideally suited to enable visual target tracking in concert with a broadcast visual communication channel for smart visual beacons with applications in distributed robotics. Visual beacons can be constructed by high-frequency modulation of Light Emitting Diodes (LEDs) such as vehicle headlights, Internet of Things (IoT) LEDs, smart building lights, etc., that are already present in many real-world scenarios. The high temporal resolution characteristic of the event cameras allows them to capture visual signals at far higher data rates compared to classical frame-based cameras. In this paper, we propose a novel smart visual beacon architecture with both LED modulation and event camera demodulation algorithms. We quantitatively evaluate the relationship between LED transmission rate, communication distance and the message transmission accuracy for the smart visual beacon communication system that we prototyped. The proposed method achieves up to 4 kbps in an indoor environment and lossless transmission over a distance of 100 meters, at a transmission rate of 500 bps, in full sunlight, demonstrating the potential of the technology in an outdoor environment.

preprint · 2022 · arXiv

Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception

Stereo camera systems play an important role in robotics applications to perceive the 3D world. However, conventional cameras have drawbacks such as low dynamic range, motion blur and latency due to the underlying frame-based mechanism. Event cameras address these limitations as they report the brightness changes of each pixel independently with a fine temporal resolution, but they are unable to acquire absolute intensity information directly. Although integrated hybrid event-frame sensors (e.g., DAVIS) are available, the quality of data is compromised by coupling at the pixel level in the circuit fabrication of such cameras. This paper proposes a stereo hybrid event-frame (SHEF) camera system that offers a sensor modality with separate high-quality pure event and pure frame cameras, overcoming the limitations of each separate sensor and allowing for stereo depth estimation. We provide a SHEF dataset targeted at evaluating disparity estimation algorithms and introduce a stereo disparity estimation algorithm that uses edge information extracted from the event stream correlated with the edge detected in the frame data. Our disparity estimation outperforms the state-of-the-art stereo matching algorithm on the SHEF dataset.

preprint · 2021 · arXiv

Enhanced Modality Transition for Image Captioning

Image captioning is a cross-modality knowledge discovery task that aims to automatically describe an image with an informative and coherent sentence. To generate captions, previous encoder-decoder frameworks directly forward the visual vectors to the recurrent language model, forcing the recurrent units to generate a sentence based on the visual features. Although these sentences are generally readable, they still lack details and highlights because the substantial gap between the image and text modalities is not sufficiently addressed. In this work, we explicitly build a Modality Transition Module (MTM) to transfer visual features into semantic representations before forwarding them to the language model. During the training phase, the modality transition network is optimised by the proposed modality loss, which compares the generated preliminary textual encodings with the target sentence vectors from a pre-trained text auto-encoder. In this way, the visual vectors are transferred into the textual subspace for more contextual and precise language generation. The novel MTM can be incorporated into most existing methods. Extensive experiments on the MS-COCO dataset demonstrate the effectiveness of the proposed framework, which improves performance by 3.4% compared to the state of the art.

preprint · 2021 · arXiv

OpenQA: Hybrid QA System Relying on Structured Knowledge Base as well as Non-structured Data

Search engines based on keyword retrieval return keyword-related Internet pages and can therefore no longer keep up with the way information is acquired in the era of the intelligent Internet of Things. How to quickly, accurately and effectively obtain the information users need from massive Internet data has become a key issue that urgently needs to be solved. We propose an intelligent question-answering system based on a structured KB and unstructured data, called OpenQA, in which users pose questions and the model quickly returns accurate answers. We integrate into OpenQA both KBQA structured question answering, based on semantic parsing and deep representation learning, and two-stage unstructured question answering, based on retrieval and neural machine reading comprehension, and return the final answer with the highest probability through the Transformer answer selection module in OpenQA. We carry out preliminary experiments on our constructed dataset, and the experimental results demonstrate the effectiveness of the proposed intelligent question answering system. The core technology of each module of the OpenQA platform remains at the forefront of academic research, and the theory behind OpenQA will be further explored and enriched along these research directions.

preprint · 2020 · arXiv

BiDet: An Efficient Binarized Object Detector

In this paper, we propose a binarized neural network learning method called BiDet for efficient object detection. Conventional network binarization methods directly quantize the weights and activations in one-stage or two-stage detectors with constrained representational capacity, so that the information redundancy in the networks causes numerous false positives and degrades the performance significantly. On the contrary, our BiDet fully utilizes the representational capacity of the binary neural networks for object detection by redundancy removal, through which the detection precision is enhanced with alleviated false positives. Specifically, we generalize the information bottleneck (IB) principle to object detection, where the amount of information in the high-level feature maps is constrained and the mutual information between the feature maps and object detection is maximized. Meanwhile, we learn sparse object priors so that the posteriors are concentrated on informative detection prediction with false positive elimination. Extensive experiments on the PASCAL VOC and COCO datasets show that our method outperforms the state-of-the-art binary neural networks by a sizable margin.

preprint · 2020 · arXiv

Ekpyrotic Cosmology with a Zero-Shear S-Brane

In a recent paper we proposed a mechanism for a continuous transition between a contracting Ekpyrotic phase and the Standard Big Bang phase of expansion: the bounce is generated by an S-brane which represents the effects of higher mass string states in the low energy effective field theory. We showed that gravitational waves on cosmological scales obtain a nearly scale-invariant spectrum. Here, we study the cosmological fluctuations in this setup, assuming that the S-brane has zero shear. We find a nearly scale-invariant spectrum of cosmological perturbations with a slight red tilt. The scenario yields two consistency relations for cosmological observations, the first one relating the tensor to scalar ratio with the scalar spectral tilt, the second relating the tensor tilt to the scalar tilt. The predicted tensor to scalar ratio is within the reach of upcoming CMB observations. The tensor tilt is blue.

preprint · 2020 · arXiv

Light Dark Photon Dark Matter from Inflation

We discuss the possibility of producing a light dark photon dark matter through a coupling between the dark photon field and the inflaton. The dark photon with a large wavelength is efficiently produced due to the inflaton motion during inflation and becomes non-relativistic before the time of matter-radiation equality. We compute the amount of production analytically. The correct relic abundance is realized with a dark photon mass extending down to $10^{-21} \, \rm eV$.

preprint · 2020 · arXiv

Nonsingular Ekpyrotic Cosmology with a Nearly Scale-Invariant Spectrum of Cosmological Perturbations and Gravitational Waves

We propose a mechanism borrowed from string theory which yields a non-singular transition from a phase of Ekpyrotic contraction to the expanding phase of Standard Big Bang cosmology. The same mechanism converts the initial vacuum spectrum of cosmological fluctuations before the bounce into a scale-invariant one, and also changes the spectrum of gravitational waves into an almost scale-invariant one. The scalar and tensor tilts are predicted to be the same, in contrast to the predictions from the "String Gas Cosmology" scenario. The amplitude of the gravitational wave spectrum depends on the ratio of the string scale to the Planck scale and may be in reach of upcoming experiments.

preprint · 2020 · arXiv

ORD: Object Relationship Discovery for Visual Dialogue Generation

With the rapid advancement of image captioning and visual question answering at single-round level, the question of how to generate multi-round dialogue about visual content has not yet been well explored. Existing visual dialogue methods encode the image into a fixed feature vector directly, concatenated with the question and history embeddings to predict the response. Some recent methods tackle the co-reference resolution problem using co-attention mechanism to cross-refer relevant elements from the image, history, and the target question. However, it remains challenging to reason visual relationships, since the fine-grained object-level information is omitted before co-attentive reasoning. In this paper, we propose an object relationship discovery (ORD) framework to preserve the object interactions for visual dialogue generation. Specifically, a hierarchical graph convolutional network (HierGCN) is proposed to retain the object nodes and neighbour relationships locally, and then refines the object-object connections globally to obtain the final graph embeddings. A graph attention is further incorporated to dynamically attend to this graph-structured representation at the response reasoning stage. Extensive experiments have proved that the proposed method can significantly improve the quality of dialogue by utilising the contextual information of visual relationships. The model achieves superior performance over the state-of-the-art methods on the Visual Dialog dataset, increasing MRR from 0.6222 to 0.6447, and recall@1 from 48.48% to 51.22%.

preprint · 2020 · arXiv

Reheating after S-Brane Ekpyrosis

In recent work, two of us proposed a nonsingular Ekpyrotic cosmology making use of an S-brane which forms at the end of the phase of Ekpyrotic contraction. This S-brane mediates a transition between contraction and expansion. Gravitational waves passing through the S-brane acquire a roughly scale-invariant spectrum, and if the S-brane has zero shear, then a roughly scale-invariant spectrum of cosmological perturbations results. Here, we study the production of gauge field fluctuations driven by the decay of the S-brane, and we show that the reheating process via gauge field production will be efficient, leading to a radiation-dominated expanding phase.

preprint · 2020 · arXiv

The prescribed time sliding mode control for attitude tracking of spacecraft

As space missions develop, many of them demand prescribed-time convergence. However, combining the prescribed-time method with sliding mode control remains difficult because of the infinite gain of the prescribed-time method as time approaches the prescribed time and because of the two phases of sliding mode control. In this paper, a new prescribed-time sliding mode control method is proposed for general systems with matched disturbances, from second-order to high-order systems. A novel sliding mode variable with an explicit time term is designed to achieve prescribed-time convergence. More importantly, the singularity of the control input is avoided as time approaches the prescribed time. Finally, this paper presents a disturbance-observer-based prescribed-time sliding mode control method for attitude tracking of spacecraft, and the efficiency of this method has been verified through numerical simulations.

preprint · 2019 · arXiv

Tunable High-Quality Fano Resonance in Coupled Terahertz Whispering-Gallery-Mode Resonators

Fano resonance is widely discussed in designing novel terahertz components, such as sensors, filters, modulators, and group delay modules. Usually, high quality (Q) factor and flexible tunability of Fano resonance are key requirements for these applications. Here, we present tunable terahertz Fano resonance with a Q factor of 2095 at 0.439 THz in coupled terahertz whispering-gallery-mode resonators (WGMRs). Coupling between a relatively low Q (578) quartz ring and a high Q (2095) silicon ring is employed to generate the Fano resonance. The resonant frequency of the Fano resonance can be actively manipulated by tuning the resonant frequency of the high Q WGMR, which is achieved through an electrical thermo-optic tuning method; meanwhile, the resonance intensity of the Fano resonance can be engineered by adjusting the coupling strength between the two WGMRs. This coupled-WGMR scheme delivers high Q tunable Fano resonance and may contribute to the design of high-performance configurable terahertz devices.