Source author record

Wenbo Guo

Wenbo Guo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Machine Learning, Computation and Language, Cryptography and Security, eess.IV, Human-Computer Interaction, Programming Languages, quant-ph

Catalog footprint

What is connected

6 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due to their high capability and flexibility, such agents raise significant security and safety concerns. A growing number of real-world incidents have shown that adversaries can easily manipulate agents into performing harmful actions, such as leaking API keys, deleting user data, or initiating unauthorized transactions. Evaluating agent security is inherently challenging, as agents operate in dynamic, untrusted environments involving external tools, heterogeneous data sources, and frequent user interactions. However, realistic, controllable, and reproducible environments for large-scale risk assessment remain largely underexplored. To address this gap, we introduce the DecodingTrust-Agent Platform (DTap), the first controllable and interactive red-teaming platform for AI agents, spanning 14 real-world domains and over 50 simulation environments that replicate widely used systems such as Google Workspace, PayPal, and Slack. To scale the risk assessment of agents in DTap, we further propose DTap-Red, the first autonomous red-teaming agent that systematically explores diverse injection vectors (e.g., prompt, tool, skill, environment, combinations) and autonomously discovers effective attack strategies tailored to varying malicious goals. Using DTap-Red, we curate DTap-Bench, a large-scale red-teaming dataset comprising high-quality instances across domains, each paired with a verifiable judge to automatically validate attack outcomes. Through DTap, we conduct large-scale evaluations of popular AI agents built on various backbone models, spanning security policies, risk categories, and attack strategies, revealing systematic vulnerability patterns and providing valuable insights for developing secure next-generation agents.

preprint · 2026 · arXiv

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks?

AI agents are rapidly gaining capabilities that could significantly reshape cybersecurity, making rigorous evaluation urgent. A critical capability is exploitation: turning a vulnerability, which is not yet an attack, into a concrete security impact, such as unauthorized file access or code execution. Exploitation is a particularly challenging task because it requires low-level program reasoning (e.g., about memory layout), runtime adaptation, and sustained progress over long horizons. Meanwhile, it is inherently dual-use, supporting defensive workflows while lowering the barrier for offense. Despite its importance and diagnostic value, exploitation remains under-evaluated. To address this gap, we introduce ExploitGym, a large-scale, diverse, realistic benchmark on the exploitation capabilities of AI agents. Given a program input that triggers a vulnerability, ExploitGym tasks agents with progressively extending it into a working exploit. The benchmark comprises 898 instances sourced from real-world vulnerabilities across three domains, including userspace programs, Google's V8 JavaScript engine, and the Linux kernel. We vary the security protections applied to each instance, isolating their impact on agent performance. All configurations are packaged in reproducible containerized environments. Our evaluation shows that while exploitation remains challenging, frontier models can successfully exploit a non-trivial fraction of vulnerabilities. For example, the strongest configurations are Anthropic's latest model Claude Mythos Preview and OpenAI's GPT-5.5, which produce working exploits for 157 and 120 instances, respectively. Notably, even with widely used defenses enabled, models retain non-trivial success rates. These results establish ExploitGym as an effective testbed for exploitation and highlight the growing cybersecurity risks posed by increasingly capable AI agents.

preprint · 2022 · arXiv

Are Shortest Rationales the Best Explanations for Human Understanding?

Existing self-explaining models typically favor extracting the shortest possible rationales - snippets of an input text "responsible for" the corresponding output - to explain the model prediction, with the assumption that shorter rationales are more intuitive to humans. However, this assumption has yet to be validated. Is the shortest rationale indeed the most human-understandable? To answer this question, we design a self-explaining model, LimitedInk, which allows users to extract rationales at any target length. Compared to existing baselines, LimitedInk achieves comparable end-task performance and human-annotated rationale agreement, making it a suitable representation of the recent class of self-explaining models. We use LimitedInk to conduct a user study on the impact of rationale length, where we ask human judges to predict the sentiment label of documents based only on LimitedInk-generated rationales of different lengths. We show that rationales that are too short do not help humans predict labels better than randomly masked text, suggesting the need for more careful design of the best human rationales.

preprint · 2022 · arXiv

QPanda: high-performance quantum computing framework for multiple application scenarios

With the birth of Noisy Intermediate-Scale Quantum (NISQ) devices and the verification of "quantum supremacy" in random number sampling and boson sampling, more and more fields hope to use quantum computers to solve specific problems, such as aerodynamic design, route allocation, financial option prediction, quantum chemical simulation to find new materials, and the challenge that quantum cryptography poses to automotive industry security. However, these fields still need to keep exploring quantum algorithms suited to current NISQ machines, so a quantum programming framework that can handle multiple scenarios and application needs is required. Therefore, this paper proposes QPanda, an application scenario-oriented quantum programming framework with high-performance simulation. For example, it can be used to design quantum chemical simulation algorithms for exploring new materials or to build a quantum machine learning framework that serves finance. This framework implements high-performance simulation of quantum circuits, a configuration of the fusion processing backend of quantum computers and supercomputers, and compilation and optimization methods of quantum programs for NISQ machines. Finally, experiments show that quantum jobs can be executed with high fidelity on the quantum processor using the quantum circuit compilation and optimization interface, and that the framework achieves better simulation performance.

preprint · 2020 · arXiv

High-speed and high-efficiency three-dimensional shape measurement based on Gray-coded light

Fringe projection profilometry has been increasingly sought and applied in dynamic three-dimensional (3D) shape measurement. In this work, a robust and high-efficiency 3D measurement method based on Gray-coded light is proposed. Unlike the traditional method, a novel tripartite phase unwrapping method is proposed to avoid the jump errors on the boundary of code words, which are mainly caused by the defocusing of the projector and the motion of the tested object. Subsequently, a time-overlapping coding strategy is presented to greatly increase the coding efficiency, decreasing the number of projected patterns in each group, e.g. from 7 (3 + 4) to 4 (3 + 1) for one restored 3D frame. Combining the two proposed techniques makes it possible to reconstruct a pixel-wise and unambiguous 3D geometry of dynamic scenes with strong noise using every 4 projected patterns. The presented techniques preserve the high anti-noise ability of Gray-code-based methods while overcoming the drawbacks of jump errors and low coding efficiency. Experiments have demonstrated that the proposed method can achieve robust and high-efficiency 3D shape measurement of high-speed dynamic scenes, even when polluted by strong noise.

preprint · 2016 · arXiv

Using Non-invertible Data Transformations to Build Adversarial-Robust Neural Networks

Deep neural networks have proven to be quite effective in a wide variety of machine learning tasks, ranging from improved speech recognition systems to advancing the development of autonomous vehicles. However, despite their superior performance in many applications, these models have recently been shown to be susceptible to a particular type of attack made possible by generating synthetic examples referred to as adversarial samples. These samples are constructed by manipulating real examples from the training data distribution in order to "fool" the original neural model, resulting in misclassification (with high confidence) of previously correctly classified samples. Addressing this weakness is of utmost importance if deep neural architectures are to be applied to critical applications, such as those in the domain of cybersecurity. In this paper, we present an analysis of this fundamental flaw lurking in all neural architectures to uncover limitations of previously proposed defense mechanisms. More importantly, we present a unifying framework for protecting deep neural models using a non-invertible data transformation, developing two adversary-resilient architectures utilizing both linear and nonlinear dimensionality reduction. Empirical results indicate that our framework provides better robustness compared to state-of-the-art solutions while having negligible degradation in accuracy.