Researcher profile

Chen

Chen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2026arXiv

IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection

Web-browsing AI agents are increasingly deployed in enterprise settings under strict whitelists of approved domains, yet adversaries can still influence them by embedding hidden instructions in the HTML pages those domains serve. Existing red-teaming resources fall short of this scenario: prompt-injection benchmarks ship pre-built adversarial pages that whitelisted agents cannot reach, and generic LLM scanners probe the model API rather than its retrieved content. We present IPI-proxy, an open-source toolkit for red-teaming web-browsing agents against indirect prompt injection (IPI). At its core is an intercepting proxy that rewrites real HTTP responses from whitelisted domains in flight, embedding payloads drawn from a unified library of 820 deduplicated attack strings extracted from six published benchmarks (BIPIA, InjecAgent, AgentDojo, Tensor Trust, WASP, and LLMail-Inject). A YAML-driven test harness independently parameterizes the payload set, the embedding technique (HTML comment, invisible CSS, or LLM-generated semantic prose), and the HTML insertion point (6 locations from \icode{head\_meta} to \icode{script\_comment}), enabling parameter-sweep evaluation without mock pages or sandboxed environments. A companion exfiltration tracker logs successful callbacks. This paper describes the threat model, situates IPI-proxy among contemporary IPI benchmarks and red-teaming tools, and details its architecture, design decisions, and configuration interface. By bridging static benchmarks and live deployment, IPI-proxy gives AI security teams a reproducible substrate for measuring and hardening web-browsing agents against indirect prompt injection on the same retrieval surface attackers exploit in production.

preprint2022arXiv

Can An Image Classifier Suffice For Action Recognition?

We explore a new perspective on video understanding by casting the video recognition problem as an image recognition task. Our approach rearranges input video frames into super images, which allow for training an image classifier directly to fulfill the task of action recognition, in exactly the same way as image classification. With such a simple idea, we show that transformer-based image classifiers alone can suffice for action recognition. In particular, our approach demonstrates strong and promising performance against SOTA methods on several public datasets including Kinetics400, Moments In Time, Something-Something V2 (SSV2), Jester and Diving48. We also experiment with the prevalent ResNet image classifiers in computer vision to further validate our idea. The results on both Kinetics400 and SSV2 are comparable to some of the best-performed CNN approaches based on spatio-temporal modeling. Our source codes and models are available at https://github.com/IBM/sifar-pytorch.

preprint2022arXiv

MSTGD:A Memory Stochastic sTratified Gradient Descent Method with an Exponential Convergence Rate

The fluctuation effect of gradient expectation and variance caused by parameter update between consecutive iterations is neglected or confusing by current mainstream gradient optimization algorithms.Using this fluctuation effect, combined with the stratified sampling strategy, this paper designs a novel \underline{M}emory \underline{S}tochastic s\underline{T}ratified Gradient Descend(\underline{MST}GD) algorithm with an exponential convergence rate. Specifically, MSTGD uses two strategies for variance reduction: the first strategy is to perform variance reduction according to the proportion p of used historical gradient, which is estimated from the mean and variance of sample gradients before and after iteration, and the other strategy is stratified sampling by category. The statistic \ $\bar{G}_{mst}$\ designed under these two strategies can be adaptively unbiased, and its variance decays at a geometric rate. This enables MSTGD based on $\bar{G}_{mst}$ to obtain an exponential convergence rate of the form $λ^{2(k-k_0)}$($λ\in (0,1)$,k is the number of iteration steps,$λ$ is a variable related to proportion p).Unlike most other algorithms that claim to achieve an exponential convergence rate, the convergence rate is independent of parameters such as dataset size N, batch size n, etc., and can be achieved at a constant step size.Theoretical and experimental results show the effectiveness of MSTGD

preprint2022arXiv

RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation

Source code authorship attribution is an important problem often encountered in applications such as software forensics, bug fixing, and software quality analysis. Recent studies show that current source code authorship attribution methods can be compromised by attackers exploiting adversarial examples and coding style manipulation. This calls for robust solutions to the problem of code authorship attribution. In this paper, we initiate the study on making Deep Learning (DL)-based code authorship attribution robust. We propose an innovative framework called Robust coding style Patterns Generation (RoPGen), which essentially learns authors' unique coding style patterns that are hard for attackers to manipulate or imitate. The key idea is to combine data augmentation and gradient augmentation at the adversarial training phase. This effectively increases the diversity of training examples, generates meaningful perturbations to gradients of deep neural networks, and learns diversified representations of coding styles. We evaluate the effectiveness of RoPGen using four datasets of programs written in C, C++, and Java. Experimental results show that RoPGen can significantly improve the robustness of DL-based code authorship attribution, by respectively reducing 22.8% and 41.0% of the success rate of targeted and untargeted attacks on average.

preprint2022arXiv

Temporal Relevance Analysis for Video Action Models

In this paper, we provide a deep analysis of temporal modeling for action recognition, an important but underexplored problem in the literature. We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models based on layer-wise relevance propagation. We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected by various factors such as dataset, network architecture, and input frames. With this, we further study some important questions for action recognition that lead to interesting findings. Our analysis shows that there is no strong correlation between temporal relevance and model performance; and action models tend to capture local temporal information, but less long-range dependencies. Our codes and models will be publicly available.

preprint2021arXiv

Single-Source SIE for Two-Dimensional Arbitrarily Connected Penetrable and PEC Objects with Nonconformal Meshes

We proposed a simple and efficient modular single-source surface integral equation (SS-SIE) formulation for electromagnetic analysis of arbitrarily connected penetrable and perfectly electrical conductor (PEC) objects in two-dimensional space. In this formulation, a modular equivalent model for each penetrable object consisting of the composite structure is first independently constructed through replacing it by the background medium, no matter whether it is surrounded by the background medium, other media, or partially connected objects, and enforcing an equivalent electric current density on the boundary to remain fields in the exterior region unchanged. Then, by combining all the modular models and any possible PEC objects together, an equivalent model for the composite structure can be derived. The troublesome junction handling techniques are not needed and non-conformal meshes are intrinsically supported. The proposed SS-SIE formulation is simple to implement, efficient, and flexible, which shows significant performance improvement in terms of CPU time compared with the original SS-SIE formulation and the Poggio-Miller-Chang-Harrington-Wu-Tsai (PMCHWT) formulation. Several numerical examples including the coated dielectric cuboid, the large lossy objects, the planar layered dielectric structure, and the partially connected dielectric and PEC structure are carried out to validate its accuracy, efficiency and robustness.

preprint2020arXiv

Establishing Secrecy Region for Directional Modulation Scheme with Random Frequency Diverse Array

Random frequency diverse array (RFDA) based directional modulation (DM) was proposed as a promising technology in secure communications to achieve a precise transmission of confidential messages, and artificial noise (AN) was considered as an important helper in RFDA-DM. Compared with previous works that only focus on the spot of the desired receiver, in this work, we investigate a secrecy region around the desired receiver, that is, a specific range and angle resolution around the desired receiver. Firstly, the minimum number of antennas and the bandwidth needed to achieve a secrecy region are derived. Moreover, based on the lower bound of the secrecy capacity in RFDA-DM-AN scheme, we investigate the performance impact of AN on the secrecy capacity. From this work, we conclude that: 1) AN is not always beneficial to the secure transmission. Specifically, when the number of antennas is sufficiently large and the transmit power is smaller than a specified value, AN will reduce secrecy capacity due to the consumption of limited transmit power. 2) Increasing bandwidth will enlarge the set for randomly allocating frequencies and thus lead to a higher secrecy capacity. 3) The minimum number of antennas increases as the predefined secrecy transmission rate increases.

preprint2020arXiv

Future Physics Programme of BESIII

There has recently been a dramatic renewal of interest in the subjects of hadron spectroscopy and charm physics. This renaissance has been driven in part by the discovery of a plethora of charmonium-like $XYZ$ states at BESIII and $B$ factories, and the observation of an intriguing proton-antiproton threshold enhancement and the possibly related $X(1835)$ meson state at BESIII, as well as the threshold measurements of charm mesons and charm baryons. We present a detailed survey of the important topics in tau-charm physics and hadron physics that can be further explored at BESIII over the remaining lifetime of BEPCII operation. This survey will help in the optimization of the data-taking plan over the coming years, and provides physics motivation for the possible upgrade of BEPCII to higher luminosity.

preprint2019arXiv

Whole-Slide Image Focus Quality: Automatic Assessment and Impact on AI Cancer Detection

Digital pathology enables remote access or consults and powerful image analysis algorithms. However, the slide digitization process can create artifacts such as out-of-focus (OOF). OOF is often only detected upon careful review, potentially causing rescanning and workflow delays. Although scan-time operator screening for whole-slide OOF is feasible, manual screening for OOF affecting only parts of a slide is impractical. We developed a convolutional neural network (ConvFocus) to exhaustively localize and quantify the severity of OOF regions on digitized slides. ConvFocus was developed using our refined semi-synthetic OOF data generation process, and evaluated using real whole-slide images spanning 3 different tissue types and 3 different stain types that were digitized by two different scanners. ConvFocus's predictions were compared with pathologist-annotated focus quality grades across 514 distinct regions representing 37,700 35x35 $μ$m image patches, and 21 digitized "z-stack" whole-slide images that contain known OOF patterns. When compared to pathologist-graded focus quality, ConvFocus achieved Spearman rank coefficients of 0.81 and 0.94 on two scanners, and reproduced the expected OOF patterns from z-stack scanning. We also evaluated the impact of OOF on the accuracy of a state-of-the-art metastatic breast cancer detector and saw a consistent decrease in performance with increasing OOF. Comprehensive whole-slide OOF categorization could enable rescans prior to pathologist review, potentially reducing the impact of digitization focus issues on the clinical workflow. We show that the algorithm trained on our semi-synthetic OOF data generalizes well to real OOF regions across tissue types, stains, and scanners. Finally, quantitative OOF maps can flag regions that might otherwise be misclassified by image analysis algorithms, preventing OOF-induced errors.

preprint2018arXiv

Development and Validation of a Deep Learning Algorithm for Improving Gleason Scoring of Prostate Cancer

For prostate cancer patients, the Gleason score is one of the most important prognostic factors, potentially determining treatment independent of the stage. However, Gleason scoring is based on subjective microscopic examination of tumor morphology and suffers from poor reproducibility. Here we present a deep learning system (DLS) for Gleason scoring whole-slide images of prostatectomies. Our system was developed using 112 million pathologist-annotated image patches from 1,226 slides, and evaluated on an independent validation dataset of 331 slides, where the reference standard was established by genitourinary specialist pathologists. On the validation dataset, the mean accuracy among 29 general pathologists was 0.61. The DLS achieved a significantly higher diagnostic accuracy of 0.70 (p=0.002) and trended towards better patient risk stratification in correlations to clinical follow-up data. Our approach could improve the accuracy of Gleason scoring and subsequent therapy decisions, particularly where specialist expertise is unavailable. The DLS also goes beyond the current Gleason system to more finely characterize and quantitate tumor morphology, providing opportunities for refinement of the Gleason system itself.