Researcher profile

Sibei Yang

Sibei Yang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
8 topics
4 close collaborators

Published work

12 published items

preprint, 2026, arXiv

EVA: Editing for Versatile Alignment against Jailbreaks

Large Language Models (LLMs) and Vision Language Models (VLMs) have demonstrated impressive capabilities but remain vulnerable to jailbreaking attacks, where adversaries exploit textual or visual triggers to bypass safety guardrails. Recent defenses typically rely on safety fine-tuning or external filters to reduce the model's likelihood of producing harmful content. While effective to some extent, these methods often incur significant computational overheads and suffer from the safety utility trade-off, degrading the model's performance on benign tasks. To address these challenges, we propose EVA (Editing for Versatile Alignment against Jailbreaks), a novel framework that pioneers the application of direct model editing for safety alignment. EVA reframes safety alignment as a precise knowledge correction task. Instead of retraining massive parameters, EVA identifies and surgically edits specific neurons responsible for the model's susceptibility to harmful instructions, while leaving the vast majority of the model unchanged. By localizing the updates, EVA effectively neutralizes harmful behaviors without compromising the model's general reasoning capabilities. Extensive experiments demonstrate that EVA outperforms baselines in mitigating jailbreaks across both LLMs and VLMs, offering a precise and efficient solution for post-deployment safety alignment.

preprint, 2026, arXiv

GPO-V: Jailbreak Diffusion Vision Language Model by Global Probability Optimization

Diffusion Vision-Language Models (dVLMs), built upon the non-causal foundations of Diffusion Large Language Models (dLLMs), have demonstrated remarkable efficacy in multimodal tasks by departing from the traditional autoregressive generation paradigm. While dVLMs appear inherently robust against conventional jailbreak tactics, which we categorize as Fixed Prefix Optimization (FPO) (e.g., anchoring responses with "Sure, here is"), this perceived resilience is deceptive. Our investigation into the safety landscape of dVLMs reveals a unique refusal pattern: Immediate Refusal and Progressive Refusal. We find that while FPO-based attacks often fail by triggering the latter, the progressive refinement process itself uncovers a novel, latent attack surface. To exploit this vulnerability, we propose Global Probability Optimization (GPO), a general jailbreak paradigm designed specifically for the denoising trajectory of masked diffusion models. Unlike prefix-based methods, GPO manipulates the global generative dynamics to bypass guardrails in diffusion language models. Building on this, we introduce GPO-V, the first visual-modality jailbreak framework tailored for dVLMs. Empirical results demonstrate that GPO-V produces stealthy perturbations with exceptional cross-model transferability, revealing a critical security gap in non-sequential generative architectures. Our findings underscore the critical urgency of addressing safety alignment in dVLMs. These results necessitate an immediate and fundamental re-evaluation of current defense paradigms to mitigate the unique risks of diffusion-based generation. Our code is available at: https://anonymous.4open.science/r/GPO-V-0250.

preprint, 2026, arXiv

Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats

Despite their impressive performance across a wide range of tasks, Large Vision-Language Models (LVLMs) remain prone to hallucination. In this study, we propose a comprehensive intervention framework aligned with the transformer's causal architecture in LVLMs, integrating the effects of different intervention paths on hallucination. We find that hallucinations in LVLMs do not arise from a single causal path, but rather from the interplay among image-to-input-text, image-to-output-text, and text-to-text pathways. For the first time, we also find that LVLMs rely on different pathways depending on the question-answer alignment format. Building on these insights, we propose simple yet effective methods to identify and intervene on critical hallucination heads within each pathway, tailored to discriminative and generative formats. Experiments across multiple benchmarks demonstrate that our approach consistently reduces hallucinations across diverse alignment types.

preprint, 2026, arXiv

MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning

Compositional Zero-Shot Learning (CZSL) aims to learn semantic primitives (attributes and objects) from seen compositions and recognize unseen attribute-object compositions. Existing CZSL datasets focus on single attributes, neglecting the fact that objects naturally exhibit multiple interrelated attributes. Their narrow attribute scope and single-attribute labeling introduce annotation biases, misleading the learning of attributes and causing inaccurate evaluation. To address these issues, we introduce the Multi-Attribute Composition (MAC) dataset, encompassing 22,838 images and 17,627 compositions with comprehensive and representative attribute annotations. MAC exhibits complex relationships between attributes and objects, with each attribute type linked to an average of 82.2 object types, and each object type associated with 31.4 attribute types. Based on MAC, we propose multi-attribute compositional zero-shot learning that requires deeper semantic understanding and advanced attribute associations, establishing a more realistic and challenging benchmark for CZSL. We also propose Multi-attribute Visual-Primitive Integrator (MVP-Integrator), a robust baseline for multi-attribute CZSL, which disentangles semantic primitives and performs effective visual-primitive association. Experimental results demonstrate that MVP-Integrator significantly outperforms existing CZSL methods on MAC with improved inference efficiency.

preprint, 2023, arXiv

PCRLv2: A Unified Visual Information Preservation Framework for Self-supervised Pre-training in Medical Image Analysis

Recent advances in self-supervised learning (SSL) in computer vision are primarily comparative, whose goal is to preserve invariant and discriminative semantics in latent representations by comparing siamese image views. However, the preserved high-level semantics do not contain enough local information, which is vital in medical image analysis (e.g., image-based diagnosis and tumor segmentation). To mitigate the locality problem of comparative SSL, we propose to incorporate the task of pixel restoration for explicitly encoding more pixel-level information into high-level semantics. We also address the preservation of scale information, a powerful aid to image understanding that has not drawn much attention in SSL. The resulting framework can be formulated as a multi-task optimization problem on the feature pyramid. Specifically, we conduct multi-scale pixel restoration and siamese feature comparison in the pyramid. In addition, we propose non-skip U-Net to build the feature pyramid and develop sub-crop to replace multi-crop in 3D medical imaging. The proposed unified SSL framework (PCRLv2) surpasses its self-supervised counterparts on various tasks, including brain tumor segmentation (BraTS 2018), chest pathology identification (ChestX-ray, CheXpert), pulmonary nodule detection (LUNA), and abdominal organ segmentation (LiTS), sometimes outperforming them by large margins with limited annotations.

preprint, 2022, arXiv

Global Gradient Estimates for Dirichlet Problems of Elliptic Operators with a BMO Anti-Symmetric Part

Let $n\ge2$ and $Ω\subset\mathbb{R}^n$ be a bounded NTA domain. In this article, the authors investigate (weighted) global gradient estimates for Dirichlet boundary value problems of second order elliptic equations of divergence form with an elliptic symmetric part and a BMO anti-symmetric part in $Ω$. More precisely, for any given $p\in(2,\infty)$, the authors prove that a weak reverse Hölder inequality with exponent $p$ implies the global $W^{1,p}$ estimate and the global weighted $W^{1,q}$ estimate, with $q\in[2,p]$ and some Muckenhoupt weights, of solutions to Dirichlet boundary value problems. As applications, the authors establish some global gradient estimates for solutions to Dirichlet boundary value problems of second order elliptic equations of divergence form with small $\mathrm{BMO}$ symmetric part and small $\mathrm{BMO}$ anti-symmetric part, respectively, on bounded Lipschitz domains, quasi-convex domains, Reifenberg flat domains, $C^1$ domains, or (semi-)convex domains, in weighted Lebesgue spaces. Furthermore, as further applications, the authors obtain the global gradient estimate, respectively, in (weighted) Lorentz spaces, (Lorentz--)Morrey spaces, (Musielak--)Orlicz spaces, and variable Lebesgue spaces. Even on global gradient estimates in Lebesgue spaces, the results obtained in this article improve the known results via weakening the assumption on the coefficient matrix.

preprint, 2022, arXiv

Heat Kernels and Hardy Spaces on Non-Tangentially Accessible Domains with Applications to Global Regularity of Inhomogeneous Dirichlet Problems

Let $n\ge2$ and $Ω$ be a bounded non-tangentially accessible domain (for short, NTA domain) of $\mathbb{R}^n$. Assume that $L_D$ is a second-order divergence form elliptic operator having real-valued, bounded, measurable coefficients on $L^2(Ω)$ with the Dirichlet boundary condition. The main aim of this article is threefold. First, the authors prove that the heat kernels $\{K_t^{L_D}\}_{t>0}$ generated by $L_D$ are Hölder continuous. Second, for any $p\in(0,1]$, the authors introduce the `geometrical' Hardy space $H^p_r(Ω)$ by restricting any element of the Hardy space $H^p(\mathbb{R}^n)$ to $Ω$, and show that, when $p\in(\frac{n}{n+δ_0},1]$, $H^p_r(Ω)=H^p(Ω)=H^p_{L_D}(Ω)$ with equivalent quasi-norms, where $H^p(Ω)$ and $H^p_{L_D}(Ω)$ respectively denote the Hardy space on $Ω$ and the Hardy space associated with $L_D$, and $δ_0\in(0,1]$ is the critical index of the Hölder continuity for the kernels $\{K_t^{L_D}\}_{t>0}$. Third, as applications, the authors obtain the global gradient estimates in both $L^p(Ω)$, with $p\in(1,p_0)$, and $H^p_z(Ω)$, with $p\in(\frac{n}{n+1},1]$, for the inhomogeneous Dirichlet problem of second-order divergence form elliptic equations on bounded NTA domains, where $p_0\in(2,\infty)$ is a constant depending only on $n$, $Ω$, and the coefficient matrix of $L_D$. It is worth pointing out that the range $p\in(1,p_0)$ for the global gradient estimate in the scale of Lebesgue spaces $L^p(Ω)$ is sharp and the above results are established without any additional assumptions on both the coefficient matrix of $L_D$, and the domain $Ω$.

preprint, 2022, arXiv

Maximal Function and Riesz Transform Characterizations of Hardy Spaces Associated with Homogeneous Higher Order Elliptic Operators and Ball Quasi-Banach Function Spaces

Let $L$ be a homogeneous divergence form higher order elliptic operator with complex bounded measurable coefficients on $\mathbb{R}^n$ and $X$ a ball quasi-Banach function space on $\mathbb{R}^n$ satisfying some mild assumptions. Denote by $H_{X,\, L}(\mathbb{R}^n)$ the Hardy space, associated with both $L$ and $X$, which is defined via the Lusin area function related to the semigroup generated by $L$. In this article, the authors establish both the maximal function and the Riesz transform characterizations of $H_{X,\, L}(\mathbb{R}^n)$. The results obtained in this article have a wide range of generality and can be applied to the weighted Hardy space, the variable Hardy space, the mixed-norm Hardy space, the Orlicz--Hardy space, the Orlicz-slice Hardy space, and the Morrey--Hardy space, associated with $L$. In particular, even when $L$ is a second order divergence form elliptic operator, both the maximal function and the Riesz transform characterizations of the mixed-norm Hardy space, the Orlicz-slice Hardy space, and the Morrey--Hardy space, associated with $L$, obtained in this article, are totally new.

preprint, 2022, arXiv

Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts

Preserving maximal information is one of the principles of designing self-supervised learning methodologies. To reach this goal, contrastive learning adopts an implicit approach, contrasting image pairs. However, we believe it is not fully optimal to simply use the contrastive estimation for preservation. Moreover, it is necessary and complementary to introduce an explicit solution to preserve more information. From this perspective, we introduce Preservational Learning to reconstruct diverse image contexts in order to preserve more information in learned representations. Together with the contrastive loss, we present Preservational Contrastive Representation Learning (PCRL) for learning self-supervised medical representations. PCRL provides very competitive results under the pretraining-finetuning protocol, substantially outperforming both self-supervised and supervised counterparts in 5 classification/segmentation tasks.

preprint, 2020, arXiv

Graph-Structured Referring Expression Reasoning in The Wild

Grounding referring expressions aims to locate in an image an object referred to by a natural language expression. The linguistic structure of a referring expression provides a layout of reasoning over the visual contents, and it is often crucial to align and jointly understand the image and the referring expression. In this paper, we propose a scene graph guided modular network (SGMN), which performs reasoning over a semantic graph and a scene graph with neural modules under the guidance of the linguistic structure of the expression. In particular, we model the image as a structured semantic graph, and parse the expression into a language scene graph. The language scene graph not only decodes the linguistic structure of the expression, but also has a consistent representation with the image semantic graph. In addition to exploring structured solutions to grounding referring expressions, we also propose Ref-Reasoning, a large-scale real-world dataset for structured referring expression reasoning. We automatically generate referring expressions over the scene graphs of images using diverse expression templates and functional programs. This dataset is equipped with real-world visual contents as well as semantically rich expressions with different reasoning layouts. Experimental results show that our SGMN not only significantly outperforms existing state-of-the-art algorithms on the new Ref-Reasoning dataset, but also surpasses state-of-the-art structured methods on commonly used benchmark datasets. It can also provide interpretable visual evidences of reasoning. Data and code are available at https://github.com/sibeiyang/sgmn

preprint, 2020, arXiv

Relationship-Embedded Representation Learning for Grounding Referring Expressions

Grounding referring expressions in images aims to locate the object instance in an image described by a referring expression. It involves a joint understanding of natural language and image content, and is essential for a range of visual tasks related to human-computer interaction. As a language-to-vision matching task, the core of this problem is to not only extract all the necessary information (i.e., objects and the relationships among them) in both the image and referring expression, but also make full use of context information to align cross-modal semantic concepts in the extracted information. Unfortunately, existing work on grounding referring expressions fails to accurately extract multi-order relationships from the referring expression and associate them with the objects and their related contexts in the image. In this paper, we propose a Cross-Modal Relationship Extractor (CMRE) to adaptively highlight objects and relationships (spatial and semantic relations) related to the given expression with a cross-modal attention mechanism, and represent the extracted information as a language-guided visual relation graph. In addition, we propose a Gated Graph Convolutional Network (GGCN) to compute multimodal semantic contexts by fusing information from different modes and propagating multimodal information in the structured relation graph. Experimental results on three common benchmark datasets show that our Cross-Modal Relationship Inference Network, which consists of CMRE and GGCN, significantly surpasses all existing state-of-the-art methods. Code is available at https://github.com/sibeiyang/sgmn/tree/master/lib/cmrin_models

preprint, 2020, arXiv

Weighted Global Regularity Estimates for Elliptic Problems with Robin Boundary Conditions in Lipschitz Domains

Let $n\ge2$ and $Ω$ be a bounded Lipschitz domain in $\mathbb{R}^n$. In this article, the authors investigate global (weighted) estimates for the gradient of solutions to Robin boundary value problems of second order elliptic equations of divergence form with real-valued, bounded, measurable coefficients in $Ω$. More precisely, let $p\in(n/(n-1),\infty)$. Using a real-variable argument, the authors obtain two necessary and sufficient conditions for $W^{1,p}$ estimates of solutions to Robin boundary value problems, respectively, in terms of a weak reverse Hölder inequality with exponent $p$ or weighted $W^{1,q}$ estimates of solutions with $q\in(n/(n-1),p]$ and some Muckenhoupt weights. As applications, the authors establish some global regularity estimates for solutions to Robin boundary value problems of second order elliptic equations of divergence form with small $\mathrm{BMO}$ coefficients, respectively, on bounded Lipschitz domains, $C^1$ domains or (semi-)convex domains, in the scale of weighted Lebesgue spaces, via a quite subtle approach which is different from the existing ones and, even when $n=3$ in the case of bounded $C^1$ domains, also gives an alternative correct proof of some known result. By this and some techniques from harmonic analysis, the authors further obtain the global regularity estimates, respectively, in Morrey spaces, (Musielak--)Orlicz spaces and variable Lebesgue spaces.