Researcher profile

Xue Yang

Xue Yang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
21 works
0 followers
15 topics
4 close collaborators

Published work

21 published item(s)

preprint, 2026, arXiv

ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdependent, and Large-Scale Tool Sandbox

Current LLM agents are proficient at calling isolated APIs but struggle with the "last mile" of commercial software automation. In real-world scenarios, tools are not independent; they are atomic, interdependent, and prone to environmental noise. We introduce ComplexMCP, a benchmark designed to evaluate agents in these rigorous conditions. Built on the Model Context Protocol (MCP), ComplexMCP provides over 300 meticulously tested tools derived from 7 stateful sandboxes, ranging from office suites to financial systems. Unlike existing datasets, our benchmark utilizes a seed-driven architecture to simulate dynamic environment states and unpredictable API failures, ensuring a deterministic yet diverse evaluation. We evaluate various LLMs across full-context and RAG paradigms, revealing a stark performance gap: even top-tier models fail to exceed a 60% success rate, far below the human performance of 90%. Granular trajectory analysis identifies three fundamental bottlenecks: (1) tool retrieval saturation as action spaces scale; (2) over-confidence, where agents skip essential environment verifications; and (3) strategic defeatism, a tendency to rationalize failure rather than pursue recovery. These findings underscore the insufficiency of current agents for interdependent workflows, positioning ComplexMCP as a critical testbed for the next generation of resilient autonomous systems.
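
The abstract describes a seed-driven design in which API failures look unpredictable to the agent yet replay identically for a given seed. The sketch below illustrates that general idea only; it is not ComplexMCP code. The class `FlakyTool`, the helper `run_episode`, the tool name `send_invoice`, and the `failure_rate` parameter are all hypothetical.

```python
import random


class FlakyTool:
    """Hypothetical tool wrapper whose failures are driven by a seed."""

    def __init__(self, name, seed, failure_rate=0.3):
        self.name = name
        self.failure_rate = failure_rate
        # A private generator seeded from (seed, name): the same seed always
        # replays the same failure pattern, while different seeds vary it.
        self._rng = random.Random(f"{seed}:{name}")

    def call(self, payload):
        # Draw once per call; fail when the draw falls below the failure rate.
        if self._rng.random() < self.failure_rate:
            return {"ok": False, "error": "simulated transient failure"}
        return {"ok": True, "result": payload}


def run_episode(seed, n_calls=10):
    """Call the tool n_calls times and record which calls succeeded."""
    tool = FlakyTool("send_invoice", seed)
    return [tool.call(i)["ok"] for i in range(n_calls)]
```

Running `run_episode(7)` twice yields the same success pattern, which is what makes a noisy evaluation reproducible.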

preprint, 2026, arXiv

DPN-LE: Dual Personality Neuron Localization and Editing for Large Language Models

With the widespread adoption of large language models (LLMs), understanding their personality representation mechanisms has become critical. As a novel paradigm in Personality Editing, most existing methods employ neuron-editing to locate and modify LLM neurons, requiring changes to numerous neurons and leading to significant performance degradation. This raises a fundamental question: Are all modified neurons directly related to personality representation? In this work, we investigate and quantify this specificity through assessments of general capability impact and representation-level patterns. We find that: 1) Current methods can change personalities but reduce overall performance. 2) Neurons are multifunctional, connecting personality traits and general knowledge. 3) Opposing personality traits demonstrate distinctly mutually exclusive representation patterns. Motivated by these findings, we propose DPN-LE (Dual Personality Neuron Localization and Editing), which identifies personality-specific neurons by contrasting MLP activations between high-trait and low-trait samples. DPN-LE constructs layer-wise steering vectors and applies dual-criterion filtering based on Cohen's $d$ effect size and activation magnitude to isolate mutually exclusive neuron subsets. Sparse linear intervention on these neurons enables precise personality control at inference time. Using only 1,000 contrastive sample pairs per trait, DPN-LE intervenes on about 0.5% of neurons while achieving competitive personality control and substantially better capability preservation across reasoning tasks. Experiments on LLaMA-3-8B-Instruct and Qwen2.5-7B-Instruct demonstrate the effectiveness and generalizability of our approach.
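
The dual-criterion filtering mentioned in the abstract can be illustrated with a small sketch. It assumes the two criteria are a minimum absolute Cohen's $d$ and a minimum gap between mean activations; the thresholds `d_min` and `mag_min`, the function names, and the exact form of the magnitude criterion are assumptions, not details taken from the paper.

```python
import math
from statistics import mean, variance


def cohens_d(high, low):
    """Cohen's d between two samples, using the pooled standard deviation."""
    n1, n2 = len(high), len(low)
    pooled_var = ((n1 - 1) * variance(high) + (n2 - 1) * variance(low)) / (n1 + n2 - 2)
    if pooled_var == 0:
        return 0.0
    return (mean(high) - mean(low)) / math.sqrt(pooled_var)


def select_neurons(high_acts, low_acts, d_min=0.8, mag_min=0.5):
    """Split neurons into high-trait and low-trait subsets.

    high_acts / low_acts: lists of samples, each a list of per-neuron activations.
    A neuron is kept only if it passes both criteria (effect size and mean gap);
    the sign of d decides which subset it joins, so the subsets never overlap.
    """
    high_set, low_set = [], []
    for j in range(len(high_acts[0])):
        h = [sample[j] for sample in high_acts]
        l = [sample[j] for sample in low_acts]
        d = cohens_d(h, l)
        gap = abs(mean(h) - mean(l))
        if abs(d) >= d_min and gap >= mag_min:
            (high_set if d > 0 else low_set).append(j)
    return high_set, low_set
```

Because each neuron is assigned by the sign of its effect size, the two returned index lists are mutually exclusive by construction.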

preprint, 2026, arXiv

SafeSteer: A Decoding-level Defense Mechanism for Multimodal Large Language Models

Multimodal large language models (MLLMs) are gaining increasing attention. Due to the heterogeneity of their input features, they face significant challenges in terms of jailbreak defenses. Current defense methods rely on costly fine-tuning or inefficient post-hoc interventions, limiting their ability to address novel attacks and involving performance trade-offs. To address these issues, we explore the inherent safety capabilities within MLLMs and quantify their intrinsic ability to discern harmfulness at the decoding stage. We observe that 1) MLLMs can distinguish harmful from harmless inputs during the decoding process, and 2) image-based attacks are more stealthy. Based on these insights, we introduce SafeSteer, a decoding-level defense mechanism for MLLMs. Specifically, it includes a Decoding-Probe, a lightweight probe for detecting and correcting harmful output during decoding, which iteratively steers the decoding process toward safety. Furthermore, a modal semantic alignment vector is integrated to transfer the strong textual safety alignment to the vision modality. Experiments on multiple MLLMs demonstrate that SafeSteer can improve MLLMs' safety by up to 33.40% without fine-tuning. Notably, it can maintain the effectiveness of MLLMs, ensuring a balance between their helpfulness and harmlessness.

preprint, 2022, arXiv

1T-FeS$_2$: a new type of two-dimensional metallic ferromagnet

Discovery of intrinsic two-dimensional (2D) magnetic materials is crucial for understanding the fundamentals of 2D magnetism and realizing next-generation magnetoelectronic and magneto-optical devices. Although significant efforts have been devoted to identifying 2D magnetism by exfoliating bulk magnetic layered materials, few studies have directly synthesized ultra-thin magnetic materials from non-layered magnetic materials. Here, we report the successful synthesis of a new type of theoretically proposed 2D metallic ferromagnet, 1T-FeS$_2$, through the molten-salt-assisted chemical vapor deposition (CVD) method. The long-range 2D ferromagnetic order is confirmed by the observation of a large anomalous Hall effect (AHE) and a hysteretic magnetoresistance. The experimentally detected out-of-plane ferromagnetic ordering is theoretically supported by the Stoner criterion. Our findings open up new possibilities for finding novel 2D ferromagnets in non-layered compounds and offer opportunities for realizing practical ultra-thin spintronic devices.

preprint, 2022, arXiv

A new entanglement measure based dual entropy

Quantum entropy is an important measure for describing the uncertainty of a quantum state; more uncertainty in subsystems implies stronger quantum entanglement between subsystems. Our goal in this work is to quantify bipartite entanglement using both von Neumann entropy and its complementary dual. We first propose a type of dual entropy from Shannon entropy. We define $S^{t}$-entropy entanglement based on von Neumann entropy and its complementary dual. This implies an analytic formula for two-qubit systems. We show that the monogamy properties of the $S^{t}$-entropy entanglement and the entanglement of formation are inequivalent for high-dimensional systems. We finally prove a new type of entanglement polygon inequality in terms of $S^{t}$-entropy entanglement for quantum entangled networks. These results show new features of multipartite entanglement in quantum information processing.

preprint, 2022, arXiv

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Speaker-independent speech separation has achieved remarkable performance in recent years with the development of deep neural network (DNN). Various network architectures, from traditional convolutional neural network (CNN) and recurrent neural network (RNN) to advanced transformer, have been designed sophistically to improve separation performance. However, the state-of-the-art models usually suffer from several flaws related to the computation, such as large model size, huge memory consumption and computational complexity. To find the balance between the performance and computational efficiency and to further explore the modeling ability of traditional network structure, we combine RNN and a newly proposed variant of convolutional network to cope with speech separation problem. By embedding two RNNs into basic block of this variant with the help of dual-path strategy, the proposed network can effectively learn the local information and global dependency. Besides, a four-staged structure enables the separation procedure to be performed gradually at finer and finer scales as the feature dimension increases. The experimental results on various datasets have proven the effectiveness of the proposed method and shown that a trade-off between the separation performance and computational efficiency is well achieved.

preprint, 2022, arXiv

Entanglement polygon inequality in qudit systems

Entanglement is one of the important resources for quantum communication tasks. Most results focus on qubit entanglement. Our goal in this work is to characterize multipartite high-dimensional entanglement. We first derive an entanglement polygon inequality for the $q$-concurrence, which manifests the relationship among all the "one-to-group" marginal entanglements in any multipartite qudit system. This implies lower and upper bounds for the marginal entanglement of any three-qudit system. We further extend to general entanglement distribution inequalities for high-dimensional entanglement in terms of the unified-$(r, s)$ entropy entanglement, including Tsallis entropy, Rényi entropy, and von Neumann entropy entanglement as special cases. These results provide new insights into characterizing bipartite high-dimensional entanglement in quantum information processing.

preprint, 2022, arXiv

Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Existing rotated object detectors are mostly inherited from the horizontal detection paradigm, as the latter has evolved into a well-developed area. However, these detectors struggle to perform well in high-precision detection due to the limitations of current regression loss design, especially for objects with large aspect ratios. Taking the perspective that horizontal detection is a special case of rotated object detection, in this paper we are motivated to change the design of the rotation regression loss from an induction paradigm to a deduction methodology, in terms of the relation between rotated and horizontal detection. We show that one essential challenge is how to modulate the coupled parameters in the rotation regression loss so that the estimated parameters can influence each other during the dynamic joint optimization in an adaptive and synergetic way. Specifically, we first convert the rotated bounding box into a 2-D Gaussian distribution, and then calculate the Kullback-Leibler Divergence (KLD) between the Gaussian distributions as the regression loss. By analyzing the gradient of each parameter, we show that KLD (and its derivatives) can dynamically adjust the parameter gradients according to the characteristics of the object. It adjusts the importance (gradient weight) of the angle parameter according to the aspect ratio. This mechanism can be vital for high-precision detection, as a slight angle error would cause a serious accuracy drop for objects with large aspect ratios. More importantly, we have proved that KLD is scale invariant. We further show that the KLD loss can degenerate into the popular $l_{n}$-norm loss for horizontal detection. Experimental results on seven datasets using different detectors show its consistent superiority, and codes are available at https://github.com/yangxue0827/RotationDetection and https://github.com/open-mmlab/mmrotate.
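
The two steps named in the abstract, converting a rotated box to a 2-D Gaussian and taking the KL divergence between Gaussians, can be sketched in plain Python. This is a minimal illustration, not the code from the linked repositories. It assumes the common convention that a box $(c_x, c_y, w, h, \theta)$ maps to mean $(c_x, c_y)$ and covariance $R\,\mathrm{diag}(w^2/4, h^2/4)\,R^\top$, and it returns the raw divergence without any further transform a training loss might apply. The function names are hypothetical.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance) with Sigma = R diag(w^2/4, h^2/4) R^T.

    The symmetric 2x2 covariance is returned as the tuple (sxx, sxy, syy).
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    return (cx, cy), (a * c * c + b * s * s, (a - b) * c * s, a * s * s + b * c * c)


def kl_divergence(p, q):
    """KL(N_p || N_q) between two 2-D Gaussians in the format above."""
    (mpx, mpy), (pxx, pxy, pyy) = p
    (mqx, mqy), (qxx, qxy, qyy) = q
    det_p = pxx * pyy - pxy * pxy
    det_q = qxx * qyy - qxy * qxy
    # Closed-form inverse of the symmetric 2x2 covariance of q.
    ixx, ixy, iyy = qyy / det_q, -qxy / det_q, qxx / det_q
    trace = ixx * pxx + 2.0 * ixy * pxy + iyy * pyy
    dx, dy = mqx - mpx, mqy - mpy
    mahalanobis = dx * dx * ixx + 2.0 * dx * dy * ixy + dy * dy * iyy
    return 0.5 * (trace + mahalanobis - 2.0 + math.log(det_q / det_p))
```

Scaling both boxes by the same factor leaves the value unchanged, which matches the scale invariance the abstract states; swapping width and height while rotating by 90 degrees produces the same Gaussian.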

preprint, 2022, arXiv

MMRotate: A Rotated Object Detection Benchmark using PyTorch

We present an open-source toolbox, named MMRotate, which provides a coherent algorithm framework of training, inferring, and evaluation for the popular rotated object detection algorithm based on deep learning. MMRotate implements 18 state-of-the-art algorithms and supports the three most frequently used angle definition methods. To facilitate future research and industrial applications of rotated object detection-related problems, we also provide a large number of trained models and detailed benchmarks to give insights into the performance of rotated object detection. MMRotate is publicly released at https://github.com/open-mmlab/mmrotate.

preprint, 2022, arXiv

On the Arbitrary-Oriented Object Detection: Classification based Approaches Revisited

Arbitrary-oriented object detection has been a building block for rotation-sensitive tasks. We first show that the boundary problem suffered by existing dominant regression-based rotation detectors is caused by angular periodicity or corner ordering, depending on the parameterization protocol. We also show that the root cause is that the ideal predictions can be out of the defined range. Accordingly, we transform the angular prediction task from a regression problem to a classification one. For the resulting circularly distributed angle classification problem, we first devise a Circular Smooth Label technique to handle the periodicity of angle and increase the error tolerance to adjacent angles. To reduce the excessive model parameters introduced by Circular Smooth Label, we further design Densely Coded Labels, which greatly reduce the length of the encoding. Finally, we further develop an object heading detection module, which can be useful when the exact heading orientation information is needed, e.g. for ship and plane heading detection. We release our OHD-SJTU dataset and OHDet detector for heading detection. Extensive experimental results on three large-scale public datasets for aerial images, i.e. DOTA, HRSC2016, and OHD-SJTU, and the face dataset FDDB, as well as the scene text datasets ICDAR2015 and MLT, show the effectiveness of our approach.
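
The Circular Smooth Label idea, a classification target that wraps around the angle range and tolerates near misses, can be sketched as follows. The Gaussian window, the 180-bin resolution, and the `radius` and `sigma` defaults are assumptions chosen for illustration; the function name is hypothetical.

```python
import math


def circular_smooth_label(angle_bin, num_bins=180, radius=6, sigma=2.0):
    """Build a soft classification target centred on angle_bin.

    Distance between bins is measured on a circle, so the last bin is a
    neighbour of the first. Bins farther than `radius` from the centre get 0.
    """
    label = []
    for i in range(num_bins):
        d = abs(i - angle_bin)
        d = min(d, num_bins - d)  # circular distance handles the periodicity
        if d <= radius:
            label.append(math.exp(-(d * d) / (2.0 * sigma * sigma)))
        else:
            label.append(0.0)
    return label
```

With the target centred on bin 0, bins 1 and 179 receive the same weight, so a prediction just across the angular boundary is penalised no more than one just inside it.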

preprint, 2022, arXiv

Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss

Boundary discontinuity and its inconsistency with the final detection metric have been the bottleneck for rotated detection regression loss design. In this paper, we propose a novel regression loss based on the Gaussian Wasserstein distance as a fundamental approach to solve the problem. Specifically, the rotated bounding box is converted to a 2-D Gaussian distribution, which makes it possible to approximate the non-differentiable rotational-IoU-induced loss by the Gaussian Wasserstein distance (GWD), which can be learned efficiently by gradient back-propagation. GWD can still be informative for learning even when there is no overlap between two rotated bounding boxes, which is often the case for small object detection. Thanks to its three unique properties, GWD can also elegantly solve the boundary discontinuity and square-like problems regardless of how the bounding box is defined. Experiments on five datasets using different detectors show the effectiveness of our approach. Codes are available at https://github.com/yangxue0827/RotationDetection and https://github.com/open-mmlab/mmrotate.
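
A short sketch of the squared 2-Wasserstein distance between the Gaussians of two rotated boxes makes the non-overlap property concrete. It assumes the box-to-Gaussian convention $\Sigma = R\,\mathrm{diag}(w^2/4, h^2/4)\,R^\top$ and uses the standard closed form for Gaussians, with the 2x2 identity $\mathrm{Tr}(M^{1/2}) = \sqrt{\mathrm{Tr}(M) + 2\sqrt{\det M}}$ to avoid a matrix square root. It is not the code from the linked repositories, it returns the raw distance rather than a normalised loss, and the function names are hypothetical.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance) with Sigma = R diag(w^2/4, h^2/4) R^T."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    return (cx, cy), (a * c * c + b * s * s, (a - b) * c * s, a * s * s + b * c * c)


def gwd_squared(p, q):
    """Squared 2-Wasserstein distance between two 2-D Gaussians."""
    (mpx, mpy), (pxx, pxy, pyy) = p
    (mqx, mqy), (qxx, qxy, qyy) = q
    centre_term = (mpx - mqx) ** 2 + (mpy - mqy) ** 2
    # Tr(Sigma_p Sigma_q) and det(Sigma_p) * det(Sigma_q) for symmetric 2x2 matrices.
    trace_pq = pxx * qxx + 2.0 * pxy * qxy + pyy * qyy
    det_pq = (pxx * pyy - pxy * pxy) * (qxx * qyy - qxy * qxy)
    cross = math.sqrt(max(trace_pq + 2.0 * math.sqrt(max(det_pq, 0.0)), 0.0))
    shape_term = pxx + pyy + qxx + qyy - 2.0 * cross
    # Clamp tiny negative values caused by floating-point rounding.
    return centre_term + max(shape_term, 0.0)
```

Two identical boxes placed 10 units apart have no overlap, yet the distance is 100 and still varies smoothly with the offset, whereas an IoU-based measure would be flat at zero. A square box yields the same Gaussian at any rotation, so its distance to a rotated copy of itself is zero.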

preprint, 2022, arXiv

SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing

Small and cluttered objects are common in the real world and are challenging to detect. The difficulty is further pronounced when the objects are rotated, as traditional detectors routinely locate the objects with horizontal bounding boxes, such that the region of interest is contaminated with background or nearby interleaved objects. In this paper, we first innovatively introduce the idea of denoising to object detection. Instance-level denoising on the feature map is performed to enhance the detection of small and cluttered objects. To handle the rotation variation, we also add a novel IoU constant factor to the smooth L1 loss to address the long-standing boundary problem, which, according to our analysis, is mainly caused by the periodicity of angular (PoA) and exchangeability of edges (EoE). By combining these two features, our proposed detector is termed SCRDet++. Extensive experiments are performed on the large public aerial image datasets DOTA, DIOR, and UCAS-AOD, as well as the natural image dataset COCO, the scene text dataset ICDAR2015, the small traffic light dataset BSTLD, and S$^2$TLD, which we release with this paper. The results show the effectiveness of our approach. The released dataset S$^2$TLD is made publicly available and contains 5,786 images with 14,130 traffic light instances across five categories.

preprint, 2022, arXiv

Threshold solutions for nonlocal reaction diffusion equations

We study the Cauchy problem for nonlocal reaction diffusion equations with bistable nonlinearity in 1D spatial domain and investigate the asymptotic behaviors of solutions with a one-parameter family of monotonically increasing and compactly supported initial data. We show that for small values of the parameter the corresponding solutions decay to 0, while for large values the related solutions converge to 1 uniformly on compacts. Moreover, we prove that the transition from extinction (converging to 0) to propagation (converging to 1) is sharp. Numerical results are provided to verify the theoretical results.

preprint, 2021, arXiv

General spin systems without genuinely multipartite nonlocality

There are multipartite entangled states in many-body systems which may be potential resources in various quantum applications. There are lots of methods to witness specific entangled systems. However, no efficient method is available to explore many-body systems without multipartite entanglement. It may provide necessary restrictions for experimental preparations of multipartite entanglement. Our goal is to solve this problem for spin systems. The maximal effective velocity with propagation of information is bounded in quantum spin systems with short-range interactions from Lieb-Robinson's inequalities. This implies two clustering theorems for ground states and thermal states. With these propagation relations, we show that both the gapped ground state and thermal state at an upper-bounded inverse temperature have no genuine multipartite nonlocality when disjoint regions are far away from each other. The present $n$-particle system shows only biseparable quantum correlations when the propagation relations show exponential decay. Similar result holds for spin systems with product states as initial states. These results show interesting features of quantum many-body systems with exponential decay of correlations.

preprint, 2021, arXiv

Strong entanglement distribution of quantum networks

Large-scale quantum networks have been employed to overcome practical constraints of transmissions and storage for single entangled systems. Our goal in this article is to explore the strong entanglement distribution of quantum networks. We firstly show any connected network consisting of generalized EPR states and GHZ states satisfies strong CKW monogamy inequality in terms of bipartite entanglement measure. This reveals interesting feature of high-dimensional entanglement with local tensor decomposition going beyond qubit entanglement. We then apply the new entanglement distribution relation in entangled networks for getting quantum max-flow min-cut theorem in terms of von Neumann entropy and Rényi-$α$ entropy. We finally classify entangled quantum networks by distinguishing network configurations under local unitary operations. These results provide new insights into characterizing quantum networks in quantum information processing.

preprint, 2020, arXiv

Comparison Network for One-Shot Conditional Object Detection

The current advances in object detection depend on large-scale datasets to get good performance. However, there may not always be sufficient samples in many scenarios, which has led to research on few-shot detection as well as its extreme variant, one-shot detection. In this paper, one-shot detection is formulated as a conditional probability problem. With this insight, a novel one-shot conditional object detection (OSCD) framework, referred to as Comparison Network (ComparisonNet), is proposed. Specifically, query and target image features are extracted through a Siamese network as mapped metrics of marginal probabilities. A two-stage detector for OSCD is introduced to compare the extracted query and target features with the learnable metric to approach the optimized non-linear conditional probability. Once trained, ComparisonNet can detect objects of both seen and unseen classes without further training, and it has the advantages of being class-agnostic, training-free for unseen classes, and free of catastrophic forgetting. Experiments show that the proposed approach achieves state-of-the-art performance on the proposed datasets of Fashion-MNIST and PASCAL VOC.

preprint, 2020, arXiv

Multinomial Random Forest: Toward Consistency and Privacy-Preservation

Despite the impressive performance of random forests (RF), their theoretical properties have not been thoroughly understood. In this paper, we propose a novel RF framework, dubbed multinomial random forest (MRF), to analyze consistency and privacy-preservation. Instead of a deterministic greedy split rule or one with simple randomness, the MRF adopts two impurity-based multinomial distributions to randomly select a split feature and a split value, respectively. Theoretically, we prove the consistency of the proposed MRF and analyze its privacy-preservation within the framework of differential privacy. We also demonstrate with multiple datasets that its performance is on par with the standard RF. To the best of our knowledge, MRF is the first consistent RF variant that has comparable performance to the standard RF.
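
The contrast between a greedy split and a split sampled from an impurity-based multinomial distribution can be sketched briefly. The abstract does not say how impurity scores become probabilities; the softmax used here, the `temperature` parameter, and the function names are assumptions made for illustration only.

```python
import math
import random


def split_probabilities(gains, temperature=1.0):
    """Turn per-candidate impurity decreases into multinomial probabilities.

    Assumption: a softmax over the gains. Subtracting the maximum keeps the
    exponentials numerically stable without changing the result.
    """
    top = max(gains)
    weights = [math.exp((g - top) / temperature) for g in gains]
    total = sum(weights)
    return [w / total for w in weights]


def sample_split(gains, rng, temperature=1.0):
    """Sample a candidate index instead of always taking the best gain."""
    probs = split_probabilities(gains, temperature)
    return rng.choices(range(len(gains)), weights=probs, k=1)[0]
```

A greedy rule would always return the index of the largest gain; sampling keeps better candidates more likely while leaving every candidate a nonzero chance, and the same routine can be applied first to features and then to split values.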

preprint, 2020, arXiv

Novel hydrogen clathrate hydrate

We report a new hydrogen clathrate hydrate synthesized at 1.2 GPa and 298 K documented by single-crystal X-ray diffraction, Raman spectroscopy, and first-principles calculations. The oxygen sublattice of the new clathrate hydrate matches that of ice II, while hydrogen molecules are in the ring cavities, which results in the trigonal R3c or R-3c space group (proton ordered or disordered, respectively) and the composition of (H2O)6H2. Raman spectroscopy and theoretical calculations reveal a hydrogen disordered nature of the new phase C1', distinct from the well-known ordered C1 clathrate, to which this new structure transforms upon compression and/or cooling. This new clathrate phase can be viewed as a realization of a disordered ice II, unobserved before, in contrast to all other ordered ice structures.

preprint, 2020, arXiv

Rectified Decision Trees: Exploring the Landscape of Interpretable and Effective Machine Learning

Interpretability and effectiveness are two essential and indispensable requirements for adopting machine learning methods in reality. In this paper, we propose a knowledge distillation based decision trees extension, dubbed rectified decision trees (ReDT), to explore the possibility of fulfilling those requirements simultaneously. Specifically, we extend the splitting criteria and the ending condition of the standard decision trees, which allows training with soft labels while preserving the deterministic splitting paths. We then train the ReDT based on the soft label distilled from a well-trained teacher model through a novel jackknife-based method. Accordingly, ReDT preserves the excellent interpretable nature of the decision trees while having a relatively good performance. The effectiveness of adopting soft labels instead of hard ones is also analyzed empirically and theoretically. Surprisingly, experiments indicate that the introduction of soft labels also reduces the model size compared with the standard decision trees from the aspect of the total nodes and rules, which is an unexpected gift from the `dark knowledge' distilled from the teacher model.

preprint, 2020, arXiv

Response solutions for wave equations with variable wave speed and periodic forcing

We consider a model of nonlinear wave equations with periodically varying wave speed and periodic external forcing. By imposing non-resonance conditions on the frequency, we establish the existence of the response solutions (i.e., periodic solutions with the same frequency as the forcing) for such a model in a Cantor set of asymptotically full measure. The proof relies on a Lyapunov-Schmidt reduction together with the Nash-Moser iteration.