Source author record

Renjing Xu

Renjing Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, Unclaimed source record

cond-mat.mtrl-sci, Artificial Intelligence, Computer Vision, cond-mat.mes-hall, Robotics, Computation and Language, eess.AS, Machine Learning, Multimedia, physics.atom-ph, physics.optics, Sound

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators

preprint 2026 arXiv

Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models

Large Language Models (LLMs) demonstrate impressive zero-shot performance across a wide range of natural language processing tasks. Integrating various modality encoders further expands their capabilities, giving rise to Multimodal Large Language Models (MLLMs) that process not only text but also visual and auditory modality inputs. However, these advanced capabilities may also pose significant safety problems, as models can be exploited to generate harmful or inappropriate content through jailbreak attacks. While prior work has extensively explored how manipulating textual or visual modality inputs can circumvent safeguards in LLMs and MLLMs, the vulnerability of audio-specific jailbreak on Large Audio-Language Models (LALMs) remains largely underexplored. To address this gap, we introduce Jailbreak-AudioBench, which consists of the Toolbox, curated Dataset, and comprehensive Benchmark. The Toolbox supports not only text-to-audio conversion but also various editing techniques for injecting audio hidden semantics. The curated Dataset provides diverse explicit and implicit jailbreak audio examples in both original and edited forms. Utilizing this dataset, we evaluate multiple state-of-the-art LALMs and establish the most comprehensive Jailbreak benchmark to date for audio modality. Finally, Jailbreak-AudioBench establishes a foundation for advancing future research on LALMs safety alignment by enabling the in-depth exposure of more powerful jailbreak threats, such as query-based audio editing, and by facilitating the development of effective defense mechanisms.

preprint 2026 arXiv

Spiking Neural Networks Need High Frequency Information

Spiking Neural Networks promise brain-inspired and energy-efficient computation by transmitting information through binary (0/1) spikes. Yet, their performance still lags behind that of artificial neural networks, a gap often assumed to result from information loss caused by sparse and binary activations. In this work, we challenge this long-standing assumption and reveal a previously overlooked frequency bias: spiking neurons inherently suppress high-frequency components and preferentially propagate low-frequency information. This frequency-domain imbalance, we argue, is the root cause of degraded feature representation in SNNs. Empirically, on Spiking Transformers, adopting Avg-Pooling (low-pass) for token mixing lowers performance to 76.73% on CIFAR-100, whereas replacing it with Max-Pool (high-pass) pushes the top-1 accuracy to 79.12%. Accordingly, we introduce Max-Former that restores high-frequency signals through two frequency-enhancing operators: (1) extra Max-Pool in patch embedding, and (2) Depth-Wise Convolution in place of self-attention. Notably, Max-Former attains 82.39% top-1 accuracy on ImageNet using only 63.99M parameters, surpassing Spikformer (74.81%, 66.34M) by +7.58%. Extending our insight beyond transformers, our Max-ResNet-18 achieves state-of-the-art performance on convolution-based benchmarks: 97.17% on CIFAR-10 and 83.06% on CIFAR-100. We hope this simple yet effective solution inspires future research to explore the distinctive nature of spiking neural networks. Code is available at https://github.com/bic-L/MaxFormer.
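The abstract above labels average pooling as a low-pass operation. The following is a minimal sketch of that one point only, assuming a 1D signal and a stride-1 moving average; it is not taken from the authors' repository, it says nothing about max pooling or spiking neurons, and the function names are hypothetical.

```python
import cmath
import math


def moving_average_gain(window: int, omega: float) -> float:
    """Magnitude response of a `window`-tap moving average.

    `omega` is the angular frequency in radians per sample
    (0 = constant signal, pi = fastest alternating signal).
    """
    h = sum(cmath.exp(-1j * omega * n) for n in range(window)) / window
    return abs(h)


def avg_pool_1d(signal, window):
    """Stride-1 average pooling over a 1D sequence (illustrative helper)."""
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]


# A constant signal passes through unchanged ...
dc_gain = moving_average_gain(2, 0.0)
# ... while the fastest alternating pattern is cancelled by a 2-tap average.
nyquist_gain = moving_average_gain(2, math.pi)
alternating = avg_pool_1d([1, -1, 1, -1], 2)
```

The gain falls from 1 at zero frequency to (numerically) zero at the highest representable frequency, which is the sense in which averaging discards high-frequency content.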

preprint 2026 arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Autonomous systems are increasingly deployed in open and dynamic environments -- from city streets to aerial and indoor spaces -- where perception models must remain reliable under sensor noise, environmental variation, and platform shifts. However, even state-of-the-art methods often degrade under unseen conditions, highlighting the need for robust and generalizable robot sensing. The RoboSense 2025 Challenge is designed to advance robustness and adaptability in robot perception across diverse sensing scenarios. It unifies five complementary research tracks spanning language-grounded decision making, socially compliant navigation, sensor configuration generalization, cross-view and cross-modal correspondence, and cross-platform 3D perception. Together, these tasks form a comprehensive benchmark for evaluating real-world sensing reliability under domain shifts, sensor failures, and platform discrepancies. RoboSense 2025 provides standardized datasets, baseline models, and unified evaluation protocols, enabling large-scale and reproducible comparison of robust perception methods. The challenge attracted 143 teams from 85 institutions across 16 countries, reflecting broad community engagement. By consolidating insights from 23 winning solutions, this report highlights emerging methodological trends, shared design principles, and open challenges across all tracks, marking a step toward building robots that can sense reliably, act robustly, and adapt across platforms in real-world environments.

preprint 2026 arXiv

What Limits Vision-and-Language Navigation?

Vision-and-Language Navigation (VLN) is a cornerstone of embodied intelligence. However, current agents often suffer from significant performance degradation when transitioning from simulation to real-world deployment, primarily due to perceptual instability (e.g., lighting variations and motion blur) and under-specified instructions. While existing methods attempt to bridge this gap by scaling up model size and training data, we argue that the bottleneck lies in the lack of robust spatial grounding and cross-domain priors. In this paper, we propose StereoNav, a robust Vision-Language-Action framework designed to enhance real-world navigation consistency. To address the inherent gap between synthetic training and physical execution, we introduce Target-Location Priors as a persistent bridge. These priors provide stable visual guidance that remains invariant across domains, effectively grounding the agent even when instructions are vague. Furthermore, to mitigate visual disturbances like motion blur and illumination shifts, StereoNav leverages stereo vision to construct a unified representation of semantics and geometry, enabling precise action prediction through enhanced depth awareness. Extensive experiments on R2R-CE and RxR-CE demonstrate that StereoNav achieves state-of-the-art egocentric RGB performance, with SR and SPL scores of 81.1% and 68.3%, and 67.5% and 52.0%, respectively, while using significantly fewer parameters and less training data than prior scaling-based approaches. More importantly, real-world robotic deployments confirm that StereoNav substantially improves navigation reliability in complex, unstructured environments. Project page: https://yunheng-wang.github.io/stereonav-public.github.io.
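The abstract above reports SR (success rate) and SPL (success weighted by path length). As a rough sketch of how these two metrics are commonly computed, assuming the standard SPL definition used in embodied-navigation benchmarks: the `Episode` record and its field names are hypothetical and not taken from the StereoNav code, and the demo episodes are made-up illustrative values, not results from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    success: bool         # did the agent stop close enough to the goal?
    shortest_path: float  # shortest start-to-goal path length
    agent_path: float     # length of the path the agent actually took


def success_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of episodes that ended in success."""
    return sum(e.success for e in episodes) / len(episodes)


def spl(episodes: Sequence[Episode]) -> float:
    """Success weighted by path length.

    Each successful episode contributes shortest / max(taken, shortest);
    failed episodes contribute zero.
    """
    total = 0.0
    for e in episodes:
        if e.success:
            total += e.shortest_path / max(e.agent_path, e.shortest_path)
    return total / len(episodes)


# Hypothetical episodes for illustration only.
demo = [
    Episode(success=True, shortest_path=10.0, agent_path=10.0),   # optimal
    Episode(success=True, shortest_path=10.0, agent_path=20.0),   # detour
    Episode(success=False, shortest_path=10.0, agent_path=5.0),   # failure
]
```

Because SPL discounts successful but inefficient trajectories, it can never exceed SR, which is consistent with the SR/SPL pairs quoted in the abstract.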

preprint 2016 arXiv

Exciton Brightening in Monolayer Phosphorene via Dimensionality Modification

Two-dimensional (2D) monolayer phosphorene, a 2D system with quasi-one-dimensional (quasi-1D) excitons, provides a unique 2D platform for investigating the dynamics of excitons in reduced dimensions and fundamental many-body interactions. However, the quasi-1D excitonic nature can limit the luminescence quantum yield significantly. Here, we report exciton brightening in monolayer phosphorene achieved via the dimensionality modification of excitons from quasi-1D to zero-dimensional (0D), through the transfer of monolayer phosphorene samples onto a defect-rich oxide substrate deposited by plasma-enhanced chemical vapor deposition (PECVD). The resultant interfacial luminescent local states lead to exciton localization and trigger efficient photon emissions at a new wavelength of ~920 nm. The luminescence quantum yield of 0D-like localized excitons is measured to be at least 33.6 times larger than that of intrinsic quasi-1D free excitons in monolayer phosphorene. This is primarily due to the reduction of the non-radiative decay rate and the possibly enhanced radiative recombination probability. Owing to the large trapping energy, this new photon emission from the localized excitons in monolayer phosphorene can be observed at elevated temperature, which contrasts markedly with defect-induced photon emission from transition metal dichalcogenide (TMD) semiconductor monolayers that can only be observed at cryogenic temperatures. Our findings introduce new avenues for the development of novel photonic devices based on monolayer phosphorene, such as near-infrared lighting devices that are operable at elevated temperature. More importantly, 2D phosphorene with quasi-1D free excitons and 0D-like localized excitons provides a unique platform to investigate the fundamental phenomena in the ideal 2D-1D-0D hybrid system.
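For readers more used to energies than wavelengths, the ~920 nm emission quoted above can be converted to a photon energy with E = hc/λ. This is a small convenience sketch, not part of the paper; the function and constant names are hypothetical.

```python
# Planck constant times the speed of light, expressed in eV*nm.
HC_EV_NM = 1239.841984


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in eV for a vacuum wavelength given in nanometres."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


# The ~920 nm emission line corresponds to roughly 1.35 eV.
energy_920 = photon_energy_ev(920.0)
```

That places the localized-exciton emission in the near infrared, in line with the near-infrared device applications the abstract mentions.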

preprint 2015 arXiv

Layer-dependent surface potential of phosphorene and anisotropic/layer-dependent charge transfer in phosphorene-gold hybrid system

The surface potential and the efficiency of interfacial charge transfer are extremely important for designing future semiconductor devices based on the emerging two-dimensional (2D) phosphorene. Here, we directly measured the strongly layer-dependent surface potential of mono- and few-layer phosphorene on gold, which agrees with the reported theoretical prediction. At the same time, we used an optical method, photoluminescence (PL) spectroscopy, to probe the charge transfer in the phosphorene-gold hybrid system. We observed, for the first time, highly anisotropic and layer-dependent PL quenching in the phosphorene-gold hybrid system, which is attributed to the highly anisotropic/layer-dependent interfacial charge transfer.

preprint 2015 arXiv

Optical Tuning of Exciton and Trion Emissions in Monolayer Phosphorene

Monolayer phosphorene provides a unique two-dimensional (2D) platform to investigate the fundamental dynamics of excitons and trions (charged excitons) in reduced dimensions. However, owing to its high instability, unambiguous identification of monolayer phosphorene has been elusive. Consequently, many important fundamental properties, such as exciton dynamics, remain underexplored. We report a rapid, noninvasive, and highly accurate approach based on optical interferometry to determine the layer number of phosphorene, and confirm the results with reliable photoluminescence measurements. Furthermore, we successfully probed the dynamics of excitons and trions in monolayer phosphorene by controlling the photo-carrier injection in a relatively low excitation power range. Based on our measured optical gap and the previously measured electronic energy gap, we determined the exciton binding energy to be ~0.3 eV for the monolayer phosphorene on SiO2/Si substrate, which agrees well with theoretical predictions. A huge trion binding energy of ~100 meV was first observed in monolayer phosphorene, which is around five times higher than that in transition metal dichalcogenide (TMD) monolayer semiconductor, such as MoS2. The carrier lifetime of exciton emission in monolayer phosphorene was measured to be ~220 ps, which is comparable to those in other 2D TMD semiconductors. Our results open new avenues for exploring fundamental phenomena and novel optoelectronic applications using monolayer phosphorene.
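The binding energies quoted above follow from simple energy differences. As a hedged reminder of the usual definitions (the symbols are chosen here for illustration and are not taken from the paper), the exciton binding energy is the electronic gap minus the optical gap, and the trion binding energy is the separation between the exciton and trion emission peaks:

```latex
\begin{align}
  E_b^{X} &= E_g^{\text{el}} - E_g^{\text{opt}}, \\
  E_b^{T} &= E_{X} - E_{T},
\end{align}
% E_g^{el}:  electronic (quasiparticle) energy gap
% E_g^{opt}: optical gap, i.e. the exciton emission energy
% E_X, E_T:  exciton and trion emission peak energies
```

With these definitions, the reported values correspond to $E_b^{X} \approx 0.3$ eV and $E_b^{T} \approx 100$ meV for monolayer phosphorene on SiO2/Si.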

preprint 2014 arXiv

Atomically Thin Optical Lenses and Gratings

Two-dimensional (2D) materials have emerged as promising candidates for miniaturized optoelectronic devices, due to their strong inelastic interactions with light. On the other hand, a miniaturized optical system also requires strong elastic light-matter interactions to control the flow of light. Here, we report a giant optical path length (OPL) from single-layer molybdenum disulfide (MoS2), which is around one order of magnitude larger than that from single-layer graphene. Using such giant OPL to engineer the phase front of optical beams, we demonstrated, to the best of our knowledge, the world's thinnest optical lens consisting of a few layers of MoS2 less than 6.3 nm thick. Moreover, we show that MoS2 has a much better dielectric response than good conductors (like gold) and other dielectric materials (like Si, SiO2 or graphene). By taking advantage of the giant elastic scattering efficiency in ultra-thin high-index 2D materials, we demonstrated high-efficiency gratings based on single- or few-layer MoS2. The capability of manipulating the flow of light in 2D materials opens an exciting avenue towards unprecedented miniaturization of optical components and the integration of advanced optical functionalities.

preprint 2014 arXiv

Extraordinarily bound quasi-one-dimensional trions in two-dimensional phosphorene atomic semiconductors

The anisotropic nature of the new two-dimensional (2D) material phosphorene, in contrast to other 2D materials such as graphene and transition metal dichalcogenide (TMD) semiconductors, allows excitons to be confined in a quasi-one-dimensional (1D) space, as predicted in theory, leading to remarkable phenomena arising from the reduced dimensionality and screening. Here, we report a trion (charged exciton) binding energy of 190 meV in few-layer phosphorene at room temperature, which is nearly one to two orders of magnitude larger than those in 2D TMD semiconductors (20-30 meV) and quasi-2D quantum wells (1-5 meV). Such a large binding energy has only been observed in truly 1D materials such as carbon nanotubes, whose optoelectronic applications have been severely hindered by their intrinsically small optical cross-sections. Phosphorene offers an elegant way to overcome this hurdle by enabling quasi-1D excitonic and trionic behaviors in a large 2D area, allowing optoelectronic integration. We experimentally validated the quasi-1D nature of excitonic and trionic dynamics in phosphorene by demonstrating completely linearly polarized light emission from excitons and trions. The implications of the extraordinarily large trion binding energy in a higher-than-one-dimensional material are far-reaching. It provides a room-temperature 2D platform to observe the fundamental many-body interactions in the quasi-1D region. The strong photoluminescence emission in phosphorene has been electrically tuned over a large spectral range at room temperature, which opens a new route for tunable light sources.

preprint 2014 arXiv

Extraordinary Photoluminescence and Strong Temperature/Angle-Dependent Raman Responses in Few-Layer Phosphorene

Phosphorene is a new family member of two-dimensional materials. We observed strong and highly layer-dependent photoluminescence in few-layer phosphorene (2 to 5 layers). The results confirmed the theoretical prediction that few-layer phosphorene has a direct and layer-sensitive band gap. We also demonstrated that few-layer phosphorene is more sensitive to temperature modulation than graphene and MoS2 in Raman scattering. The anisotropic Raman response in few-layer phosphorene has enabled us to use an optical method to quickly determine the crystalline orientation without a transmission electron microscope (TEM) or a scanning tunneling microscope (STM). Our results provide much needed experimental information about the band structures and exciton nature in few-layer phosphorene.

preprint 2014 arXiv

Review of Strong Field Approximation and Investigating Semiclassical Evolution Approach

This paper theoretically analyzes the behavior of an atom driven by a strong electromagnetic field. Besides the traditional quantum-mechanical method, we also investigate semiclassical approaches to this problem. We first apply the strong field approximation to a system of an atom driven by a strong electromagnetic field in the velocity gauge. Our simulation result is consistent with theory and close to experiment, apart from some reasonable differences caused by different parameters and by the bound and final states omitted in the transition amplitude. Next, a new semiclassical approach is used to solve for the Volkov wave function. We show that the semiclassical approximation works well in predicting particle evolution in the quantum world, especially for a system in a strong electromagnetic field with low frequency. Finally, we briefly illustrate how to use the semiclassical approximation to get the same results as the strong field approximation, and partly test the viability of the semiclassical approach for the Teller potential. This part needs future work to complete. Still, the semiclassical approximation provides a potentially new method for solving complicated systems, which might be more effective than the traditional quantum-mechanical recipe.

preprint 2014 arXiv

Unambiguous identification of monolayer phosphorene by phase-shifting interferometry

Monolayer phosphorene provides a unique two-dimensional (2D) platform to investigate the fundamental many-body interactions. However, owing to its high instability, unambiguous identification of monolayer phosphorene has been elusive. Consequently, many important fundamental properties, such as exciton dynamics, remain underexplored. We report a rapid, noninvasive, and highly accurate approach based on optical interferometry to determine the layer number of phosphorene, and confirm the results with reliable photoluminescence measurements. Based on the measured optical gap and the calculated electronic energy gap, we determined the exciton binding energy to be ~0.4 eV for the monolayer phosphorene on SiO2/Si substrate, which agrees well with theoretical predictions. Our results open new avenues for exploring fundamental phenomena and novel optoelectronic applications using monolayer phosphorene.
