Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
22 works
0 followers
23 topics
4 close collaborators

Published work

22 published items

preprint, 2026, arXiv

A general formula for walk determinants of rooted products with applications to DGS-graph constructions

For an $n$-vertex graph $G$, and a rooted graph $H^{(v)}$ with $v$ as the root, the rooted product graph $G\circ H^{(v)}$ is obtained from $G$ and $n$ copies of $H$ by identifying the root of the $i$th copy of $H$ with the $i$th vertex of $G$ for each $i$. As a refinement of the controllability criterion of $G\circ H^{(v)}$ obtained recently by Shan and Liu (2025), we obtain an explicit formula for the determinant of the walk matrix of $G\circ H^{(v)}$. Furthermore, for an important family of graphs $\mathcal{F}$ that are determined by their generalized spectrum (DGS), we introduce the concept of $\mathcal{F}$-preservers and provide a sufficient condition for a rooted graph to be an $\mathcal{F}$-preserver. A list of $\mathcal{F}$-preservers of small order is provided, which leads to many new infinite families of DGS-graphs using rooted products.

preprint, 2026, arXiv

A Unified Masked Jigsaw Puzzle Framework for Vision and Language Models

In federated learning, the Transformer, a popular architecture, faces critical challenges in defending against gradient attacks and improving model performance in both Computer Vision (CV) and Natural Language Processing (NLP) tasks. It has been revealed that the gradient of the Position Embeddings (PEs) in a Transformer contains enough information to reconstruct the input data. To mitigate this issue, we introduce a Masked Jigsaw Puzzle (MJP) framework. MJP starts with random token shuffling to break the token order, and then a learnable unknown (unk) position embedding is used to mask out the PEs of the shuffled tokens. In this manner, the local spatial information which is encoded in the position embeddings is disrupted, and the models are forced to learn feature representations that are less reliant on the local spatial information. Notably, with the careful use of MJP, we can not only improve models' robustness against gradient attacks, but also boost their performance in both vision and text application scenarios, such as classification for images (e.g., ImageNet-1K) and sentiment analysis for text (e.g., Yelp and Amazon). Experimental results suggest that MJP is a unified framework for different Transformer-based models in both vision and language tasks. Code is publicly available via https://github.com/ywxsuperstar/transformerattack

preprint, 2026, arXiv

Advancing Adaptive Multi-Stage Video Anomaly Reasoning: A Benchmark Dataset and Method

Recent progress in reasoning capabilities of Multimodal Large Language Models (MLLMs) has highlighted their potential for performing complex video understanding tasks. However, in the domain of Video Anomaly Detection and Understanding (VAD&U), existing MLLM-based methods are largely limited to anomaly localization or post-hoc description, lacking explicit reasoning processes, risk awareness, and decision-oriented interpretation. To address this gap, we define a new task termed Video Anomaly Reasoning (VAR), which elevates video anomaly analysis from descriptive understanding to structured, multi-stage reasoning. VAR explicitly requires models to perform progressive reasoning over anomalous events before answering anomaly-related questions, encompassing visual perception, causal interpretation, and risk-aware decision making. To support this task, we present a new dataset with 8,641 videos, where each video is annotated with diverse question types corresponding to different reasoning depths, totaling more than 50,000 samples, making it one of the largest datasets for video anomaly. The annotations are based on a structured Perception-Cognition-Action Chain-of-Thought (PerCoAct-CoT), which formalizes domain-specific reasoning priors for video anomaly understanding. This design enables systematic evaluation of multi-stage and adaptive anomaly reasoning. In addition, we propose Anomaly-Aware Group Relative Policy Optimization to further enhance reasoning reliability under weak supervision. Building upon the proposed task and dataset, we develop an end-to-end MLLM-based VAR model termed Vad-R1-Plus, which supports adaptive hierarchical reasoning and risk-aware decision making. Extensive experiments demonstrate that the proposed benchmark and method effectively advance the reasoning capabilities of MLLMs on VAR tasks, outperforming both open-source and proprietary baselines.

preprint, 2026, arXiv

BIDO: An Out-Of-Distribution Resistant Image-based Malware Detector

While image-based detectors have shown promise in Android malware detection, they often struggle to maintain their performance and interpretability when encountering out-of-distribution (OOD) samples. Specifically, OOD samples generated by code obfuscation and concept drift exhibit distributions that significantly deviate from the detector's training data. Such shifts not only severely undermine the generalisation of detectors to OOD samples but also compromise the reliability of their associated interpretations. To address these challenges, we propose BIDO, a novel generative classifier that reformulates malware detection as a likelihood estimation task. Unlike conventional discriminative methods, BIDO jointly produces classification results and interpretations by explicitly modeling class-conditional distributions, thereby resolving the long-standing separation between detection and explanation. Empirical results demonstrate that BIDO substantially enhances robustness against extreme obfuscation and concept drift while achieving reliable interpretation without sacrificing performance. The source code is available at https://github.com/whatishope/BIDO/.

preprint, 2026, arXiv

CoSER: A Comprehensive Literary Dataset and Framework for Training and Evaluating LLM Role-Playing and Persona Simulation

Role-playing language agents (RPLAs) have emerged as promising applications of large language models (LLMs). However, simulating established characters presents a challenging task for RPLAs, due to the lack of authentic character datasets and nuanced evaluation methods using such data. In this paper, we present CoSER, a collection of a high-quality dataset, open models, and an evaluation protocol towards effective RPLAs of established characters. The CoSER dataset covers 17,966 characters from 771 renowned books. It provides authentic dialogues with real-world intricacies, as well as diverse data types such as conversation setups, character experiences and internal thoughts. Drawing from acting methodology, we introduce given-circumstance acting for training and evaluating role-playing LLMs, where LLMs sequentially portray multiple characters in book scenes. Using our dataset, we develop CoSER 8B and CoSER 70B, i.e., advanced open role-playing LLMs built on LLaMA-3.1 models. Extensive experiments demonstrate the value of the CoSER dataset for RPLA training, evaluation and retrieval. Moreover, CoSER 70B exhibits state-of-the-art performance surpassing or matching GPT-4o on our evaluation and three existing benchmarks, i.e., achieving 75.80% and 93.47% accuracy on the InCharacter and LifeChoice benchmarks respectively.

preprint, 2026, arXiv

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

Autonomous agents act through sandboxed containers and microVMs whose state spans filesystems, processes, and runtime artifacts. Checkpoint and restore (C/R) of this state is needed for fault tolerance, spot execution, RL rollout branching, and safe rollback, yet existing approaches fall into two extremes: application-level recovery preserves chat history but misses OS-side effects, while full per-turn checkpointing is correct but too expensive under dense co-location. The root cause is an agent-OS semantic gap: agent frameworks see tool calls but not their OS effects; the OS sees state changes but lacks turn-level context to judge recovery relevance. This gap hides massive sparsity: over 75% of agent turns produce no recovery-relevant state, so most checkpoints are unnecessary. Crab (Checkpoint-and-Restore for Agent SandBoxes) is a transparent host-side runtime that bridges this gap without modifying agents or C/R backends. An eBPF-based inspector classifies each turn's OS-visible effects to decide checkpoint granularity; a coordinator aligns checkpoints with turn boundaries and overlaps C/R with LLM wait time; and a host-scoped engine schedules checkpoint traffic across co-located sandboxes. On shell-intensive and code-repair workloads, Crab raises recovery correctness from 8% (chat-only) to 100%, cuts checkpoint traffic by up to 87%, and stays within 1.9% of fault-free execution time.

preprint, 2026, arXiv

Fabry-Pérot Metacavities with Single-Layered Dielectric Metamirrors

The Fabry-Pérot resonator is a cornerstone of photonics and wave physics, providing a universal mechanism for spectral confinement and resonant enhancement of wave-matter interactions. In this work, we establish an analytically tractable class of Fabry-Pérot metacavities in which the reflecting elements are realized by single-layer periodic arrays of circular dielectric cylinders acting as metamirrors. Both the reflection efficiency and reflection phase of such metamirrors are obtained in closed form and shown to be widely and independently tunable, encompassing ideal electric and magnetic mirror limits with unit reflectivity. Building on these results, we derive explicit analytical expressions that fully describe the optical responses of Fabry-Pérot cavities composed of two such parallel metamirrors. Our combined analytical and numerical investigations reveal that these metamirrors provide exceptional flexibility for tailoring Fabry-Pérot resonances across a broad spectral range, enabling precise control over resonance positions and quality factors. In particular, the framework naturally predicts the emergence of Fabry-Pérot bound states in the continuum with formally infinite Q-factors. These results establish dielectric-metamirror-based Fabry-Pérot cavities as a versatile and fundamentally transparent platform for engineering high-Q optical resonances.

preprint, 2026, arXiv

FitText: Evolving Agent Tool Ecologies via Memetic Retrieval

A semantic gap separates how users describe tasks from how tools are documented. As API ecosystems scale to tens of thousands of endpoints, static retrieval from the initial query alone cannot bridge this gap: the agent's understanding of what it needs evolves during execution, but its tool set does not. We introduce FitText, a training-free framework that makes retrieval dynamic by embedding it directly in the agent's reasoning loop. FitText generates natural-language pseudo-tool descriptions as retrieval probes, refines them iteratively using retrieval feedback, and explores diverse alternatives through stochastic generation. Memetic Retrieval adds evolutionary selection pressure over candidate descriptions, guided by a tool memory that avoids redundant search. On ToolRet (43k tools, 4 domains), FitText improves average retrieval rank from 8.81 to 2.78; on StableToolBench (16,464 APIs), it achieves a 0.73 average pass rate, a 24-point absolute gain over static query retrieval. The gains transfer across base models capable of acting as competent semantic operators; under weaker base models, Memetic's evolutionary search inverts, amplifying noise rather than refining signal, which surfaces model capacity as a prerequisite for evolutionary tool exploration.

preprint, 2026, arXiv

From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition

Managing extensive context remains a critical bottleneck for Large Language Models (LLMs), particularly in applications like long-document question answering and autonomous agents where lengthy inputs incur high computational costs and introduce noise. Existing compression techniques often disrupt local coherence through discrete token removal or rely on implicit latent encoding that suffers from positional bias and incompatibility with closed-source APIs. To address these limitations, we introduce the EDU-based Context Compressor, a novel explicit compression framework designed to preserve both global structure and fine-grained details. Our approach reformulates context compression as a structure-then-select process. First, our LingoEDU transforms linear text into a structural relation tree of Elementary Discourse Units (EDUs) which are anchored strictly to source indices to eliminate hallucination. Second, a lightweight ranking module selects query-relevant sub-trees for linearization. To rigorously evaluate structural understanding, we release StructBench, a manually annotated dataset of 248 diverse documents. Empirical results demonstrate that our method achieves state-of-the-art structural prediction accuracy and significantly outperforms frontier LLMs while reducing costs. Furthermore, our structure-aware compression substantially enhances performance across downstream tasks ranging from long-context tasks to complex Deep Search scenarios.

preprint, 2026, arXiv

GPS-Synchronized Monitoring of Core-collapse Supernova Bursts with PandaX-4T via Coherent Elastic Neutrino Nuclear Scattering

The landmark detection of neutrinos from SN1987A marked the dawn of neutrino astrophysics. The neutrino burst provided essential insights into fundamental properties of neutrinos, and served as a key probe of stellar evolution and supernova dynamics. Recent advances in coherent elastic neutrino-nucleus scattering enable the detection of core-collapse supernova burst neutrinos using tonne-scale liquid xenon detectors originally designed for dark matter direct detection. Leveraging this capability, we developed and deployed an online supernova monitoring system for the PandaX-4T experiment. This system features a GPS module with millisecond-level timing precision, a low false-alarm rate, and high sensitivity to galactic core-collapse supernova explosion events. The methodology is robust, directly scalable, and planned for implementation in the next-generation PandaX-20T experiment.

preprint, 2026, arXiv

ISCS: Parameter-Guided Feature Pruning for Resource-Constrained Embodied Perception

Prior studies in embodied AI consistently show that robust perception is critical for human-robot interaction, yet deploying high-fidelity visual models on resource-constrained agents remains challenging due to limited on-device computation power and transmission latency. Exploiting the redundancy in latent representations could improve system efficiency, yet existing approaches often rely on costly dataset-specific ablation tests or heavy entropy models unsuitable for real-time edge-robot collaboration. We propose a generalizable, dataset-agnostic method to identify and selectively transmit structure-critical channels in pretrained encoders. Instead of brute-force empirical evaluations, our approach leverages intrinsic parameter statistics (weight variances and biases) to estimate channel importance. This analysis reveals a consistent organizational structure, termed the Invariant Salient Channel Space (ISCS), where Salient-Core channels capture dominant structures while Salient-Auxiliary channels encode fine visual details. Building on ISCS, we introduce a deterministic static pruning strategy that enables lightweight split-computing. Experiments across different datasets demonstrate that our method achieves a deterministic, ultra-low latency pipeline by bypassing heavy entropy modeling. Our method reduces end-to-end latency, providing a critical speed-accuracy trade-off for resource-constrained human-aware embodied systems.

preprint, 2026, arXiv

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Traditional workflow-based agents exhibit limited intelligence when addressing real-world problems requiring tool invocation. Tool-integrated reasoning (TIR) agents capable of autonomous reasoning and tool invocation are rapidly emerging as a powerful approach for complex decision-making tasks involving multi-step interactions with external environments. In this work, we introduce MindWatcher, a TIR agent integrating interleaved thinking and multimodal chain-of-thought (CoT) reasoning. MindWatcher can autonomously decide whether and how to invoke diverse tools and coordinate their use, without relying on human prompts or workflows. The interleaved thinking paradigm enables the model to switch between thinking and tool calling at any intermediate stage, while its multimodal CoT capability allows manipulation of images during reasoning to yield more precise search results. We implement automated data auditing and evaluation pipelines, complemented by manually curated high-quality datasets for training, and we construct a benchmark, called MindWatcher-Evaluate Bench (MWE-Bench), to evaluate its performance. MindWatcher is equipped with a comprehensive suite of auxiliary reasoning tools, enabling it to address broad-domain multimodal problems. A large-scale, high-quality local image retrieval database, covering eight categories including cars, animals, and plants, endows the model with robust object recognition despite its small size. Finally, we design a more efficient training infrastructure for MindWatcher, enhancing training speed and hardware utilization. Experiments not only demonstrate that MindWatcher matches or exceeds the performance of larger or more recent models through superior tool invocation, but also uncover critical insights for agent training, such as the genetic inheritance phenomenon in agentic RL.

preprint, 2026, arXiv

On Exact Editing of Flow-Based Diffusion Models

Recent methods in flow-based diffusion editing have enabled direct transformations between source and target image distribution without explicit inversion. However, the latent trajectories in these methods often exhibit accumulated velocity errors, leading to semantic inconsistency and loss of structural fidelity. We propose Conditioned Velocity Correction (CVC), a principled framework that reformulates flow-based editing as a distribution transformation problem driven by a known source prior. CVC rethinks the role of velocity in inter-distribution transformation by introducing a dual-perspective velocity conversion mechanism. This mechanism explicitly decomposes the latent evolution into two components: a structure-preserving branch that remains consistent with the source trajectory, and a semantically-guided branch that drives a controlled deviation toward the target distribution. The conditional velocity field exhibits an absolute velocity error relative to the true underlying distribution trajectory, which inherently introduces potential instability and trajectory drift in the latent space. To address this quantifiable deviation and maintain fidelity to the true flow, we apply a posterior-consistent update to the resulting conditional velocity field. This update is derived from Empirical Bayes Inference and Tweedie correction, which ensures a mathematically grounded error compensation over time. Our method yields stable and interpretable latent dynamics, achieving faithful reconstruction alongside smooth local semantic conversion. Comprehensive experiments demonstrate that CVC consistently achieves superior fidelity, better semantic alignment, and more reliable editing behavior across diverse tasks.

preprint, 2026, arXiv

On octonionic Monge-Ampère equation and pluripotential theory associated to octonionic plurisubharmonic functions of two variables

Several aspects of pluripotential theory are generalized to octonionic plurisubharmonic (OPSH) functions of two variables. We prove the comparison principle for continuous OPSH functions and the quasicontinuity of locally bounded ones. An important tool is an integration-by-parts formula for the mixed octonionic Monge-Ampère operator. Various useful properties of octonionic relative extremal functions and octonionic capacity are established. The main difficulty is the non-associativity of octonions. However, some weak form of associativity can be used to overcome this difficulty. Another important ingredient in pluripotential theory is the solution to the Dirichlet problem for the homogeneous octonionic Monge-Ampère equation on the unit ball, for which we show the $C_{loc}^{1,1}$-regularity by applying Bedford-Taylor's method. The obstacle to doing so is that an OPSH function is usually not OPSH under automorphisms of the unit ball. This issue can be solved by finding a weighted transformation formula for OPSH functions.

preprint, 2026, arXiv

Reinforcement Learning for Tool-Integrated Interleaved Thinking towards Cross-Domain Generalization

Recent advances in large language models (LLMs) have demonstrated remarkable capabilities in reasoning and tool utilization. However, the generalization of tool-augmented reinforcement learning (RL) across diverse domains remains a significant challenge. Standard paradigms often treat tool usage as a linear or isolated event, which becomes brittle when transferring skills from restricted domains (e.g., mathematics) to open-ended tasks. In this work, we investigate the cross-domain generalization of an LLM agent trained exclusively on mathematical problem-solving. To facilitate robust skill transfer, we propose Reinforcement Learning for Interleaved Tool Execution (RITE). Unlike traditional methods, RITE enforces a continuous "Plan-Action-Reflection" cycle, allowing the model to ground its reasoning in intermediate tool outputs and self-correct during long-horizon tasks. To effectively train this complex interleaved policy, we introduce Dr. GRPO, a robust optimization objective that utilizes token-level loss aggregation with importance sampling to mitigate reward sparsity and high-variance credit assignment. Furthermore, we employ a dual-component reward system and dynamic curriculum via online rollout filtering to ensure structural integrity and sample efficiency. Extensive experiments reveal that our approach, despite being trained solely on math tasks, achieves state-of-the-art performance across diverse reasoning domains, demonstrating high token efficiency and strong generalization capabilities.

preprint, 2026, arXiv

Testing a Linear Relation: Short-Range Correlations and the EMC Effect for Gluons and Quarks in Nuclei

In this work, we focus on the possible linear relation between short-range correlations (SRCs) and the EMC effect for partons in nuclei. First, we test a linear relationship pertaining to gluons in bound nuclei; it is manifested as a correlation between the slope of the reduced cross section ratio in deep inelastic scattering (DIS) and the cross section of sub-threshold $J/\psi$ photoproduction. For comparison, the results from four different global analysis groups of nuclear parton distribution functions (nPDFs) are utilized. These results show a good linear correlation between the gluons in bound nuclei and the slope of the reduced cross section ratio, consistent with the possible presence of nuclear effects in the gluon distributions. Second, we investigate the linear relationship of quarks in the proton-induced Drell-Yan process. The corresponding results for quarks show strong sensitivity to the parameterization forms adopted by the different groups. These findings enhance our understanding of the substructure in bound nuclei and provide a valuable reference for future global fitting of nPDFs.

preprint, 2026, arXiv

Unraveling Year-Long Radial Velocity Variations in Red Clump Region -- I: Comprehensive analysis of a K0 Giant star, 2 Draconis

Slow-rotating evolved stars frequently exhibit radial velocity (RV) variations on annual timescales, complicated by instrumental systematics and aliasing in the one-year regime. Here we investigate the origin of the near-yearly periodicity in 2 Dra, a star located in the red-clump region, assessing possible causes among stellar activity, instrumental profile (IP) effects, sampling alias, and planetary companions. We applied two independent approaches: (1) constraining diagnostic signals and performing a correlation analysis ($r$) between period-confined signals, and (2) evaluating phase stability by partitioning Keplerian fits. These methods enabled us to examine the physical connections and phase coherence among stellar activity indicators, RV measurements, and IP diagnostics. Our analysis suggests a stellar rotation period of $\simeq 270\text{--}320$ d for 2 Dra. The 340-d RV signal does not appear to originate from stellar activity in this chromospherically quiet star ($|r| \lesssim 0.33$), nor from instrumental systematics near the annual period ($|r| \lesssim 0.1$). This conclusion is supported by contrasting phase behavior: the RV and stellar activity phases remain stable, whereas the IP phases do not. We therefore propose that the 340-d variation likely arises from either small-amplitude intrinsic variability or a tentative gas giant companion with potential weak activity-induced modulation. The case of 2 Dra provides a framework for distinguishing the origins of $\sim$1-yr RV variations in other evolved stars.

preprint, 2026, arXiv

User-Centric Requirements Prioritization in mHealth Applications: Insights from a Discrete Choice Experiment

Mobile health (mHealth) applications are widely used for chronic disease management, but usability and accessibility challenges persist due to the diverse needs of users. Adaptive User Interfaces (AUIs) offer a personalized solution to enhance user experience, yet barriers to adoption remain. Understanding user preferences and trade-offs is essential to ensure widespread acceptance of adaptation designs. This study identifies key factors influencing user preferences and trade-offs in mHealth adaptation design. A Discrete Choice Experiment (DCE) was conducted with 186 participants who have chronic diseases and use mHealth applications. Participants were asked to select preferred adaptation designs from choices featuring six attributes with varying levels. A mixed logit model was used to analyze preference heterogeneity and determine the factors most likely influencing adoption. Additionally, subgroup analyses were performed to explore differences by age, gender, health conditions, and coping mechanisms. Maintaining usability while ensuring controllability over adaptations, infrequent adaptations, and small-scale changes are key factors that facilitate the adoption of adaptive mHealth app designs. In contrast, frequently used functions and caregiver involvement can diminish the perceived value of such adaptations. This study employs a data-driven approach to quantify user preferences, identify key trade-offs, and reveal variations across demographic and behavioral subgroups through preference heterogeneity modeling. Furthermore, our results offer valuable guidance for developing future adaptive mHealth applications and lay the groundwork for continued exploration into requirements prioritization within the field of software engineering.

preprint, 2025, arXiv

Introduction to the Chinese Space Station Survey Telescope (CSST)

The Chinese Space Station Survey Telescope (CSST) is an upcoming Stage-IV sky survey telescope, distinguished by its large field of view (FoV), high image quality, and multi-band observation capabilities. It can simultaneously conduct precise measurements of the Universe by performing multi-color photometric imaging and slitless spectroscopic surveys. The CSST is equipped with five scientific instruments, i.e. Multi-band Imaging and Slitless Spectroscopy Survey Camera (SC), Multi-Channel Imager (MCI), Integral Field Spectrograph (IFS), Cool Planet Imaging Coronagraph (CPI-C), and THz Spectrometer (TS). Using these instruments, CSST is expected to make significant contributions and discoveries across various astronomical fields, including cosmology, galaxies and active galactic nuclei (AGN), the Milky Way and nearby galaxies, stars, exoplanets, Solar System objects, astrometry, and transients and variable sources. This review aims to provide a comprehensive overview of the CSST instruments, observational capabilities, data products, and scientific potential.

preprint, 2025, arXiv

Reliable and Resilient Collective Communication Library for LLM Training and Serving

Modern ML training and inference now span tens to tens of thousands of GPUs, where network faults can waste 10-15% of GPU hours due to slow recovery. Common network errors and link fluctuations trigger timeouts that often terminate entire jobs, forcing expensive checkpoint rollback during training and request reprocessing during inference. We present R$^2$CCL, a fault-tolerant communication library that provides lossless, low-overhead failover by exploiting multi-NIC hardware. R$^2$CCL performs rapid connection migration, bandwidth-aware load redistribution, and resilient collective algorithms to maintain progress under failures. We evaluate R$^2$CCL on two 8-GPU H100 InfiniBand servers and via large-scale ML simulators modeling hundreds of GPUs with diverse failure patterns. Experiments show that R$^2$CCL is highly robust to NIC failures, incurring less than 1% training and less than 3% inference overheads. R$^2$CCL outperforms baselines AdapCC and DejaVu by 12.18$\times$ and 47$\times$, respectively.

preprint, 2025, arXiv

RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress

Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To accommodate the growing number of experts in practice, modern inference systems commonly adopt expert parallelism to distribute experts across devices. However, the absence of explicit load balancing constraints during inference allows adversarial inputs to trigger severe routing concentration. We demonstrate that out-of-distribution prompts can manipulate the routing strategy such that all tokens are consistently routed to the same set of top-$k$ experts, which creates computational bottlenecks on certain devices while forcing others to idle. This converts an efficiency mechanism into a denial-of-service attack vector, leading to violations of service-level agreements for time to first token. We propose RepetitionCurse, a low-cost black-box strategy to exploit this vulnerability. By identifying a universal flaw in MoE router behavior, RepetitionCurse constructs adversarial prompts using simple repetitive token patterns in a model-agnostic manner. On widely deployed MoE models like Mixtral-8x7B, our method increases end-to-end inference latency by 3.063x, degrading service availability significantly.

preprint, 2021, arXiv

Pressure-induced structural transition, metallization, and topological superconductivity in PdSSe

Pressure not only provides a powerful way to tune the crystal structure of transition metal dichalcogenides (TMDCs) but also promotes the discovery of exotic electronic states and intriguing phenomena. Structural transitions from the quasi-two-dimensional layered orthorhombic phase to the three-dimensional cubic pyrite phase, metallization, and superconductivity under high pressure have been observed experimentally in the TMDC materials PdS2 and PdSe2. Here, we report a theoretical prediction of the pressure-induced evolution of the crystal structure and electronic structure of PdSSe, an isomorphous intermediate material of the orthorhombic PdS2 and PdSe2. A series of pressure-induced structural phase transitions from the layered orthorhombic structure into an intermediate phase, then to a cubic phase, is revealed. The intermediate phase features the same structural symmetry as the ambient orthorhombic phase, except for drastically collapsed interlayer distances and striking changes of the coordination polyhedron. Furthermore, the structural phase transitions are accompanied by electronic structure variations from semiconductor to semimetal, which are attributed to bandwidth broadening and orbital-selective mechanisms. In particular, cubic-phase PdSSe differs from cubic PdS2 and PdSe2 by breaking inversion and mirror-plane symmetries, yet shows similar superconductivity under high pressure, which originates from strong electron-phonon coupling concomitant with topologically nontrivial Weyl and high-fold fermions. The intricate interplay between lattice, charge, and orbital degrees of freedom, as well as the topologically nontrivial states in these compounds, will further stimulate wide interest in exploring the exotic physics of TMDC materials.