Researcher profile

Deyue Zhang

Deyue Zhang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) | Verification L1 | Unclaimed author
8 works
0 followers
6 topics
4 close collaborators

Published work

8 published items

Preprint, 2026, arXiv

DIVER: Dynamic Iterative Visual Evidence Reasoning for Multimodal Fake News Detection

Multimodal fake news detection is crucial for mitigating adversarial misinformation. Existing methods, relying on static fusion or LLMs, face computational redundancy and hallucination risks due to weak visual foundations. To address this, we propose DIVER (Dynamic Iterative Visual Evidence Reasoning), a framework grounded in a progressive, evidence-driven reasoning paradigm. DIVER first establishes a strong text-based baseline through language analysis, leveraging intra-modal consistency to filter unreliable or hallucinated claims. Only when textual evidence is insufficient does the framework introduce visual information, where inter-modal alignment verification adaptively determines whether deeper visual inspection is necessary. For samples exhibiting significant cross-modal semantic discrepancies, DIVER selectively invokes fine-grained visual tools (e.g., OCR and dense captioning) to extract task-relevant evidence, which is iteratively aggregated via uncertainty-aware fusion to refine multimodal reasoning. Experiments on Weibo, Weibo21, and GossipCop demonstrate that DIVER outperforms state-of-the-art baselines by an average of 2.72%, while improving inference efficiency with a reduced latency of 4.12 s.
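The progressive control flow described in this abstract (text first, then alignment checking, then fine-grained tools only when needed) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the thresholds `decisive` and `max_discrepancy`, and the certainty-weighted averaging used as a stand-in for uncertainty-aware fusion are hypothetical and are not taken from the DIVER paper.

```python
from typing import Callable, Iterable, Tuple


def certainty(p: float) -> float:
    """Distance of a probability from 0.5, rescaled to [0, 1]."""
    return abs(2.0 * p - 1.0)


def fuse(p_a: float, p_b: float) -> float:
    """Certainty-weighted average of two probabilities (illustrative rule only)."""
    w_a, w_b = certainty(p_a), certainty(p_b)
    if w_a + w_b == 0.0:
        return 0.5  # both inputs are maximally uncertain
    return (w_a * p_a + w_b * p_b) / (w_a + w_b)


def progressive_decision(
    text_stage: Callable[[], float],
    visual_stage: Callable[[], Tuple[float, float]],
    tool_stages: Iterable[Callable[[], float]],
    decisive: float = 0.8,         # hypothetical certainty needed to stop
    max_discrepancy: float = 0.5,  # hypothetical cross-modal discrepancy limit
) -> Tuple[str, float]:
    """Return (last stage used, estimated probability that the item is fake).

    Stages are passed as callables so that later, costlier stages run only
    when the earlier ones are not decisive.
    """
    # Stage 1: text only. Stop early if the text evidence is decisive.
    p = text_stage()
    if certainty(p) >= decisive:
        return "text", p

    # Stage 2: coarse visual evidence plus a cross-modal discrepancy score.
    p_visual, discrepancy = visual_stage()
    p = fuse(p, p_visual)
    if discrepancy <= max_discrepancy:
        return "alignment", p

    # Stage 3: invoke fine-grained tools one at a time, fusing as we go.
    for tool in tool_stages:
        p = fuse(p, tool())
        if certainty(p) >= decisive:
            break
    return "tools", p
```

In this sketch a call such as `progressive_decision(lambda: 0.95, visual, tools)` returns after the text stage without ever calling `visual`, which is the kind of saving the abstract attributes to the progressive design.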

Preprint, 2026, arXiv

DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs

Multimodal Large Language Models (MLLMs) are vulnerable to jailbreak attacks, which can elicit harmful responses from MLLMs. Many MLLMs support multi-image inputs, inadvertently introducing new vulnerabilities because less effort has gone into multi-image safety alignment. Previous MLLM jailbreak methods use only a single image, which restricts the attack space: they cannot distribute harmful requests across multiple images, carry abundant information, or exploit additional visual reasoning tasks to distract MLLMs. To address these limitations, in this paper, we propose a compositional jailbreak framework, DMN, which leverages Distributed instruction, Multimodal evidence and a Number chain task to fully enhance the jailbreak performance. Extensive experiments show that DMN is highly effective for MLLM jailbreaking, e.g., achieving attack success rates of over 90% on GPT-4o, Gemini-2.5-pro and Claude Sonnet 4, surpassing other baselines by a large margin. This compositional, multi-image jailbreak strategy reveals fundamental weaknesses in these models' safety mechanisms.

Preprint, 2026, arXiv

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

With the rapid evolution of foundation models, Large Language Model (LLM) agents have demonstrated increasingly powerful tool-use capabilities. However, this proficiency introduces significant security risks, as malicious actors can manipulate agents into executing tools to generate harmful content. While existing defensive mechanisms are effective, they frequently suffer from the over-refusal problem, where increased safety strictness compromises the agent's utility on benign tasks. To mitigate this trade-off, we propose SafeHarbor, a novel framework designed to establish precise decision boundaries for LLM agents. Unlike static guidelines, SafeHarbor extracts context-aware defense rules through enhanced adversarial generation. We design a local hierarchical memory system for dynamic rule injection, offering a training-free, efficient, and plug-and-play solution. Furthermore, we introduce an information entropy-based self-evolution mechanism that continuously optimizes the memory structure through dynamic node splitting and merging. Extensive experiments demonstrate that SafeHarbor achieves state-of-the-art performance on both ambiguous benign tasks and explicit malicious attacks, notably attaining a peak benign utility of 63.6% on GPT-4o while maintaining a robust refusal rate exceeding 93% against harmful requests. The source code is publicly available at https://github.com/ljj-cyber/SafeHarbor.
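To make the idea of entropy-driven node splitting and merging concrete, here is a minimal sketch of one way such a criterion could look. The abstract does not say what the entropy is computed over, so the choice of per-node outcome labels, the thresholds, and the function names below are all hypothetical and should not be read as SafeHarbor's actual mechanism.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(labels: Sequence[str]) -> float:
    """Shannon entropy, in bits, of the empirical label distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def should_split(labels: Sequence[str], threshold: float = 0.9) -> bool:
    """Assumed rule: split a memory node whose stored cases are too mixed."""
    return shannon_entropy(labels) > threshold


def should_merge(
    labels_a: Sequence[str],
    labels_b: Sequence[str],
    threshold: float = 0.3,
) -> bool:
    """Assumed rule: merge two nodes if their union would still be nearly pure."""
    return shannon_entropy(list(labels_a) + list(labels_b)) < threshold
```

Under these assumptions, a node holding an even mix of `"benign"` and `"harmful"` cases has an entropy of one bit and would be split, while two nodes that both hold only `"harmful"` cases would be merged.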

Preprint, 2026, arXiv

SAPL: Semantic-Agnostic Prompt Learning in CLIP for Weakly Supervised Image Manipulation Localization

Malicious image manipulation threatens public safety and requires efficient localization methods. Fully supervised approaches depend on costly pixel-level annotations, which make training expensive. Existing weakly supervised methods rely only on image-level binary labels and focus on global classification, often overlooking local edge cues that are critical for precise localization. We observe that feature variations at manipulated boundaries are substantially larger than in interior regions. To address this gap, we propose Semantic-Agnostic Prompt Learning (SAPL) in CLIP, which learns text prompts that intentionally encode non-semantic, boundary-centric cues so that CLIP's multimodal similarity highlights manipulation edges rather than high-level object semantics. SAPL combines two complementary modules, Edge-aware Contextual Prompt Learning (ECPL) and Hierarchical Edge Contrastive Learning (HECL), to exploit edge information in both textual and visual spaces. The proposed ECPL leverages edge-enhanced image features to generate learnable textual prompts via an attention mechanism, embedding semantic-irrelevant information into text features to guide CLIP to focus on manipulation edges. The proposed HECL extracts genuine and manipulated edge patches and uses contrastive learning to sharpen the discrimination between them. Finally, we predict the manipulated regions from the similarity map after processing. Extensive experiments on multiple public benchmarks demonstrate that SAPL significantly outperforms existing approaches, achieving state-of-the-art localization performance.

Preprint, 2022, arXiv

A direct imaging method for the exterior and interior inverse scattering problems

This paper is concerned with inverse acoustic scattering problems for an obstacle or a cavity with a sound-soft or a sound-hard boundary. A direct imaging method relying on the boundary conditions is proposed for reconstructing the shape of the obstacle or cavity. First, the scattered fields are approximated by Fourier-Bessel functions using measurements on a closed curve. Then, the indicator functions are established by superpositions of the total fields, or their derivatives, corresponding to the incident point sources. We prove that the indicator functions vanish only on the boundary of the obstacle or cavity. Numerical examples are also included to demonstrate the effectiveness of the method.

Preprint, 2020, arXiv

Reconstruction of acoustic sources from multi-frequency phaseless far-field data

We consider the inverse source problem of determining an acoustic source from multi-frequency phaseless far-field data. By supplementing the inverse source model with some reference point sources, we develop a novel strategy for recovering the phase information of far-field data. This reference source technique leads to an easy-to-implement phase retrieval formula. Mathematically, the stability of the phase retrieval approach is rigorously justified. Then we employ the Fourier method to deal with the multi-frequency inverse source problem with recovered phase information. Finally, some two- and three-dimensional numerical results are presented to demonstrate the viability and effectiveness of the proposed method.
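The general reason that adding known reference fields can make phase retrieval explicit is elementary algebra, and a small sketch may help. It recovers a complex number z from the three moduli |z|, |z + c1| and |z + c2|, where c1 and c2 are known. This is a generic illustration only: treating c1 and c2 as the far-field values of reference point sources is an assumption, and the paper's actual formula, choice of reference sources, and stability analysis are not reproduced here.

```python
def recover_complex(m0: float, m1: float, m2: float,
                    c1: complex, c2: complex) -> complex:
    """Recover z from m0 = |z|, m1 = |z + c1|, m2 = |z + c2| with known c1, c2.

    Uses |z + c|^2 = |z|^2 + |c|^2 + 2 Re(z * conj(c)), which is linear in
    (x, y) for z = x + iy once |z| is known:
        Re(c) * x + Im(c) * y = (|z + c|^2 - |z|^2 - |c|^2) / 2
    """
    r1 = (m1 ** 2 - m0 ** 2 - abs(c1) ** 2) / 2.0
    r2 = (m2 ** 2 - m0 ** 2 - abs(c2) ** 2) / 2.0

    # Solve the 2x2 linear system by Cramer's rule.
    det = c1.real * c2.imag - c1.imag * c2.real
    if abs(det) < 1e-12:
        raise ValueError("c1 and c2 must not be real multiples of each other")
    x = (r1 * c2.imag - r2 * c1.imag) / det
    y = (c1.real * r2 - c2.real * r1) / det
    return complex(x, y)


if __name__ == "__main__":
    z = 0.3 - 1.7j                 # unknown value, used only to make the data
    c1, c2 = 2.0 + 0.0j, 1.5j      # hypothetical known reference values
    z_rec = recover_complex(abs(z), abs(z + c1), abs(z + c2), c1, c2)
    print(abs(z_rec - z))          # reconstruction error
```

The determinant check reflects the requirement that the two reference values carry independent information. If c1 and c2 are real multiples of each other, the two linear equations coincide and the phase cannot be fixed.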

Preprint, 2019, arXiv

Uniqueness in inverse cavity scattering problems with phaseless near-field data

This paper is concerned with uniqueness in the inverse acoustic scattering problem for cavities from the modulus of the near-fields. With the aid of the reference ball technique and the superpositions of two point sources as the incident waves, we rigorously prove that the location and shape of the cavity as well as its boundary condition can be uniquely determined by the modulus of near-fields at an admissible surface. To our knowledge, this is the first uniqueness result in inverse cavity scattering problems with phaseless near-field data. In this paper, we make use of the phaseless near-field data generated by the cavity and the point sources, and thus the configuration is more feasible in practice.

Preprint, 2018, arXiv

Uniqueness in phaseless inverse scattering problems with superposition of incident point sources

This paper is concerned with uniqueness in inverse acoustic scattering problems with the modulus of the far-field patterns co-produced by the obstacle (resp. medium) and the point sources. Based on the superposition of point sources as the incident waves, we overcome the difficulty of translation invariance induced by a single incident plane wave, and rigorously prove that the location and shape of the obstacle as well as its boundary condition or the refractive index can be uniquely determined by the modulus of far-field patterns. This work is different from our previous work on phaseless inverse scattering problems [2018 Inverse Problems 34, 085002], in which the reference ball technique and the superposition of incident waves were used, and the phaseless far-field data generated only by the scatterer were considered. In this paper, the phaseless far-field data co-produced by the scatterer and the point sources are used, thus the configuration is practically more feasible. Moreover, since the reference ball is not needed, the justification of uniqueness is much clearer and more concise.
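The translation invariance mentioned in this abstract can be stated in one line. The notation is assumed here and does not come from the abstract: k is the wavenumber, d the incident direction, x̂ the observation direction, D the obstacle and D_h = D + h its translate. For a single incident plane wave, shifting the obstacle changes the far-field pattern only by a unimodular factor, so the modulus alone cannot locate the obstacle.

```latex
% Assumed notation: incident plane wave u^i(x) = e^{\mathrm{i}k x\cdot d},
% obstacle D, shifted obstacle D_h = D + h, far-field pattern u^\infty.
\[
  u^\infty_{D_h}(\hat{x}, d)
    = e^{\mathrm{i}k\,h\cdot(d-\hat{x})}\,u^\infty_{D}(\hat{x}, d)
  \quad\Longrightarrow\quad
  \bigl|u^\infty_{D_h}(\hat{x}, d)\bigr| = \bigl|u^\infty_{D}(\hat{x}, d)\bigr|.
\]
```

Superposing point sources as incident waves, as the abstract describes, is a way of breaking this symmetry.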