Source author record

Chen

Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision; Machine Learning; Artificial Intelligence; Cryptography and Security; Software Engineering; Computational Engineering, Finance, and Science; hep-ex; hep-ph; Information Theory; math.IT; math.OC; Neurons and Cognition

Catalog footprint

What is connected

14 works

12 topics

4 close collaborators


preprint, 2026, arXiv

IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection

Web-browsing AI agents are increasingly deployed in enterprise settings under strict whitelists of approved domains, yet adversaries can still influence them by embedding hidden instructions in the HTML pages those domains serve. Existing red-teaming resources fall short of this scenario: prompt-injection benchmarks ship pre-built adversarial pages that whitelisted agents cannot reach, and generic LLM scanners probe the model API rather than its retrieved content. We present IPI-proxy, an open-source toolkit for red-teaming web-browsing agents against indirect prompt injection (IPI). At its core is an intercepting proxy that rewrites real HTTP responses from whitelisted domains in flight, embedding payloads drawn from a unified library of 820 deduplicated attack strings extracted from six published benchmarks (BIPIA, InjecAgent, AgentDojo, Tensor Trust, WASP, and LLMail-Inject). A YAML-driven test harness independently parameterizes the payload set, the embedding technique (HTML comment, invisible CSS, or LLM-generated semantic prose), and the HTML insertion point (6 locations from head_meta to script_comment), enabling parameter-sweep evaluation without mock pages or sandboxed environments. A companion exfiltration tracker logs successful callbacks. This paper describes the threat model, situates IPI-proxy among contemporary IPI benchmarks and red-teaming tools, and details its architecture, design decisions, and configuration interface. By bridging static benchmarks and live deployment, IPI-proxy gives AI security teams a reproducible substrate for measuring and hardening web-browsing agents against indirect prompt injection on the same retrieval surface attackers exploit in production.
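The parameter sweep described in this abstract can be pictured as a Cartesian product over the three independent axes. The following is a minimal sketch only, not IPI-proxy's actual configuration schema: the dictionary keys, the technique identifiers, the idea of selecting payloads by source benchmark, and the function name expand_sweep are all assumptions made for illustration. Only the two insertion points named in the abstract are listed, because the other four are not given there.

```python
from itertools import product

# Hypothetical sweep configuration mirroring the three axes in the abstract.
# Key names and technique identifiers are illustrative assumptions.
SWEEP = {
    "payload_sets": ["BIPIA", "InjecAgent", "AgentDojo",
                     "Tensor Trust", "WASP", "LLMail-Inject"],
    "techniques": ["html_comment", "invisible_css", "semantic_prose"],
    # The abstract mentions 6 insertion points but names only these two.
    "insertion_points": ["head_meta", "script_comment"],
}


def expand_sweep(cfg):
    """Return one test case per combination of the three sweep axes."""
    axes = (cfg["payload_sets"], cfg["techniques"], cfg["insertion_points"])
    names = ("payload_set", "technique", "insertion_point")
    return [dict(zip(names, combo)) for combo in product(*axes)]


cases = expand_sweep(SWEEP)
print(len(cases), "test cases")
print(cases[0])
```

Because the axes are independent, adding a value to any one of them multiplies the number of cases rather than requiring new hand-written test pages.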

preprint, 2022, arXiv

Can An Image Classifier Suffice For Action Recognition?

We explore a new perspective on video understanding by casting the video recognition problem as an image recognition task. Our approach rearranges input video frames into super images, which allow for training an image classifier directly to fulfill the task of action recognition, in exactly the same way as image classification. With such a simple idea, we show that transformer-based image classifiers alone can suffice for action recognition. In particular, our approach demonstrates strong and promising performance against SOTA methods on several public datasets including Kinetics400, Moments In Time, Something-Something V2 (SSV2), Jester and Diving48. We also experiment with the prevalent ResNet image classifiers in computer vision to further validate our idea. The results on both Kinetics400 and SSV2 are comparable to some of the best-performing CNN approaches based on spatio-temporal modeling. Our source code and models are available at https://github.com/IBM/sifar-pytorch.
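The idea of rearranging frames into a super image can be illustrated with a small tiling routine. This is a sketch under stated assumptions rather than the authors' implementation: the square grid of side ceil(sqrt(T)), the row-major temporal ordering, the zero padding of unused cells, and the function name make_super_image are all choices made here for illustration. Frames are plain nested lists so the example has no dependencies.

```python
import math


def make_super_image(frames):
    """Tile T equally sized frames into a square grid, in temporal order.

    frames: list of T frames, each a list of H rows of W pixel values.
    Unused grid cells are filled with zeros (an assumption of this sketch).
    """
    t = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    g = math.ceil(math.sqrt(t))          # grid side length
    blank_row = [0] * w
    super_image = []
    for grid_row in range(g):
        for y in range(h):
            row = []
            for grid_col in range(g):
                idx = grid_row * g + grid_col   # row-major frame index
                row.extend(frames[idx][y] if idx < t else blank_row)
            super_image.append(row)
    return super_image


# Eight 2x2 single-channel frames; frame i is filled with the value i.
frames = [[[i] * 2 for _ in range(2)] for i in range(8)]
image = make_super_image(frames)
print(len(image), len(image[0]))
```

The result is an ordinary 2D image, so it can be passed to any image classifier without changes to the model's input interface.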

preprint, 2022, arXiv

MSTGD: A Memory Stochastic sTratified Gradient Descent Method with an Exponential Convergence Rate

The fluctuation in gradient expectation and variance caused by parameter updates between consecutive iterations is neglected or treated confusingly by current mainstream gradient optimization algorithms. Using this fluctuation effect, combined with a stratified sampling strategy, this paper designs a novel Memory Stochastic sTratified Gradient Descent (MSTGD) algorithm with an exponential convergence rate. Specifically, MSTGD uses two strategies for variance reduction: the first performs variance reduction according to the proportion $p$ of historical gradient used, which is estimated from the mean and variance of sample gradients before and after an iteration; the second is stratified sampling by category. The statistic $\bar{G}_{mst}$ designed under these two strategies can be adaptively unbiased, and its variance decays at a geometric rate. This enables MSTGD, based on $\bar{G}_{mst}$, to obtain an exponential convergence rate of the form $λ^{2(k-k_0)}$ ($λ \in (0,1)$, where $k$ is the number of iteration steps and $λ$ is a variable related to the proportion $p$). Unlike most other algorithms that claim to achieve an exponential convergence rate, the convergence rate is independent of parameters such as dataset size $N$ and batch size $n$, and can be achieved at a constant step size. Theoretical and experimental results show the effectiveness of MSTGD.
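To make the second strategy concrete, the sketch below shows the textbook stratified-sampling estimator of a mean gradient, where each category is a stratum weighted by its share of the dataset. It is not the statistic $\bar{G}_{mst}$ from the paper and omits the historical-gradient (memory) component entirely; the scalar gradients, the function name stratified_mean, and the per_stratum parameter are assumptions for illustration.

```python
import random


def stratified_mean(strata, per_stratum, rng):
    """Estimate the population mean from samples drawn within each stratum.

    strata: dict mapping a category label to its list of per-sample
            (scalar) gradients.
    per_stratum: number of samples to draw without replacement per category.
    """
    total = sum(len(values) for values in strata.values())
    estimate = 0.0
    for values in strata.values():
        k = min(per_stratum, len(values))
        sample = rng.sample(values, k)
        # Weight each stratum mean by the stratum's share of the data.
        estimate += (len(values) / total) * (sum(sample) / k)
    return estimate


strata = {"a": [1.0, 3.0], "b": [10.0, 10.0, 10.0, 10.0]}
full = stratified_mean(strata, 4, random.Random(0))
one = stratified_mean(strata, 1, random.Random(0))
print(full, one)
```

In this toy data all variation lies inside category "a", so sampling every stratum keeps the estimate close to the true mean even with one sample per category, which is the variance-reduction effect stratification is meant to provide.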

preprint, 2022, arXiv

RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation

Source code authorship attribution is an important problem often encountered in applications such as software forensics, bug fixing, and software quality analysis. Recent studies show that current source code authorship attribution methods can be compromised by attackers exploiting adversarial examples and coding style manipulation. This calls for robust solutions to the problem of code authorship attribution. In this paper, we initiate the study on making Deep Learning (DL)-based code authorship attribution robust. We propose an innovative framework called Robust coding style Patterns Generation (RoPGen), which essentially learns authors' unique coding style patterns that are hard for attackers to manipulate or imitate. The key idea is to combine data augmentation and gradient augmentation at the adversarial training phase. This effectively increases the diversity of training examples, generates meaningful perturbations to gradients of deep neural networks, and learns diversified representations of coding styles. We evaluate the effectiveness of RoPGen using four datasets of programs written in C, C++, and Java. Experimental results show that RoPGen can significantly improve the robustness of DL-based code authorship attribution, reducing the success rate of targeted and untargeted attacks by 22.8% and 41.0% on average, respectively.

preprint, 2022, arXiv

Temporal Relevance Analysis for Video Action Models

In this paper, we provide a deep analysis of temporal modeling for action recognition, an important but underexplored problem in the literature. We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models based on layer-wise relevance propagation. We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected by various factors such as dataset, network architecture, and input frames. With this, we further study some important questions for action recognition that lead to interesting findings. Our analysis shows that there is no strong correlation between temporal relevance and model performance, and that action models tend to capture local temporal information but fewer long-range dependencies. Our code and models will be publicly available.

preprint, 2021, arXiv

Single-Source SIE for Two-Dimensional Arbitrarily Connected Penetrable and PEC Objects with Nonconformal Meshes

We propose a simple and efficient modular single-source surface integral equation (SS-SIE) formulation for electromagnetic analysis of arbitrarily connected penetrable and perfect electric conductor (PEC) objects in two-dimensional space. In this formulation, a modular equivalent model is first constructed independently for each penetrable object in the composite structure by replacing it with the background medium, no matter whether it is surrounded by the background medium, other media, or partially connected objects, and enforcing an equivalent electric current density on the boundary to keep the fields in the exterior region unchanged. Then, by combining all the modular models and any PEC objects, an equivalent model for the composite structure can be derived. Troublesome junction handling techniques are not needed, and non-conformal meshes are intrinsically supported. The proposed SS-SIE formulation is simple to implement, efficient, and flexible, and shows significant performance improvement in CPU time compared with the original SS-SIE formulation and the Poggio-Miller-Chang-Harrington-Wu-Tsai (PMCHWT) formulation. Several numerical examples, including a coated dielectric cuboid, large lossy objects, a planar layered dielectric structure, and a partially connected dielectric and PEC structure, are carried out to validate its accuracy, efficiency, and robustness.

preprint, 2020, arXiv

Establishing Secrecy Region for Directional Modulation Scheme with Random Frequency Diverse Array

Random frequency diverse array (RFDA) based directional modulation (DM) was proposed as a promising technology in secure communications to achieve a precise transmission of confidential messages, and artificial noise (AN) was considered as an important helper in RFDA-DM. Compared with previous works that only focus on the spot of the desired receiver, in this work, we investigate a secrecy region around the desired receiver, that is, a specific range and angle resolution around the desired receiver. Firstly, the minimum number of antennas and the bandwidth needed to achieve a secrecy region are derived. Moreover, based on the lower bound of the secrecy capacity in RFDA-DM-AN scheme, we investigate the performance impact of AN on the secrecy capacity. From this work, we conclude that: 1) AN is not always beneficial to the secure transmission. Specifically, when the number of antennas is sufficiently large and the transmit power is smaller than a specified value, AN will reduce secrecy capacity due to the consumption of limited transmit power. 2) Increasing bandwidth will enlarge the set for randomly allocating frequencies and thus lead to a higher secrecy capacity. 3) The minimum number of antennas increases as the predefined secrecy transmission rate increases.

preprint, 2020, arXiv

Future Physics Programme of BESIII

There has recently been a dramatic renewal of interest in the subjects of hadron spectroscopy and charm physics. This renaissance has been driven in part by the discovery of a plethora of charmonium-like $XYZ$ states at BESIII and $B$ factories, and the observation of an intriguing proton-antiproton threshold enhancement and the possibly related $X(1835)$ meson state at BESIII, as well as the threshold measurements of charm mesons and charm baryons. We present a detailed survey of the important topics in tau-charm physics and hadron physics that can be further explored at BESIII over the remaining lifetime of BEPCII operation. This survey will help in the optimization of the data-taking plan over the coming years, and provides physics motivation for the possible upgrade of BEPCII to higher luminosity.

preprint, 2019, arXiv

Whole-Slide Image Focus Quality: Automatic Assessment and Impact on AI Cancer Detection

Digital pathology enables remote access or consults and powerful image analysis algorithms. However, the slide digitization process can create artifacts such as out-of-focus (OOF). OOF is often only detected upon careful review, potentially causing rescanning and workflow delays. Although scan-time operator screening for whole-slide OOF is feasible, manual screening for OOF affecting only parts of a slide is impractical. We developed a convolutional neural network (ConvFocus) to exhaustively localize and quantify the severity of OOF regions on digitized slides. ConvFocus was developed using our refined semi-synthetic OOF data generation process, and evaluated using real whole-slide images spanning 3 different tissue types and 3 different stain types that were digitized by two different scanners. ConvFocus's predictions were compared with pathologist-annotated focus quality grades across 514 distinct regions representing 37,700 35x35 $μ$m image patches, and 21 digitized "z-stack" whole-slide images that contain known OOF patterns. When compared to pathologist-graded focus quality, ConvFocus achieved Spearman rank coefficients of 0.81 and 0.94 on two scanners, and reproduced the expected OOF patterns from z-stack scanning. We also evaluated the impact of OOF on the accuracy of a state-of-the-art metastatic breast cancer detector and saw a consistent decrease in performance with increasing OOF. Comprehensive whole-slide OOF categorization could enable rescans prior to pathologist review, potentially reducing the impact of digitization focus issues on the clinical workflow. We show that the algorithm trained on our semi-synthetic OOF data generalizes well to real OOF regions across tissue types, stains, and scanners. Finally, quantitative OOF maps can flag regions that might otherwise be misclassified by image analysis algorithms, preventing OOF-induced errors.

preprint, 2018, arXiv

Development and Validation of a Deep Learning Algorithm for Improving Gleason Scoring of Prostate Cancer

For prostate cancer patients, the Gleason score is one of the most important prognostic factors, potentially determining treatment independent of the stage. However, Gleason scoring is based on subjective microscopic examination of tumor morphology and suffers from poor reproducibility. Here we present a deep learning system (DLS) for Gleason scoring whole-slide images of prostatectomies. Our system was developed using 112 million pathologist-annotated image patches from 1,226 slides, and evaluated on an independent validation dataset of 331 slides, where the reference standard was established by genitourinary specialist pathologists. On the validation dataset, the mean accuracy among 29 general pathologists was 0.61. The DLS achieved a significantly higher diagnostic accuracy of 0.70 (p=0.002) and trended towards better patient risk stratification in correlations to clinical follow-up data. Our approach could improve the accuracy of Gleason scoring and subsequent therapy decisions, particularly where specialist expertise is unavailable. The DLS also goes beyond the current Gleason system to more finely characterize and quantitate tumor morphology, providing opportunities for refinement of the Gleason system itself.

preprint, 2016, arXiv

Neuron's Eye View: Inferring Features of Complex Stimuli from Neural Responses

Experiments that study neural encoding of stimuli at the level of individual neurons typically choose a small set of features present in the world (contrast and luminance for vision, pitch and intensity for sound) and assemble a stimulus set that systematically varies along these dimensions. Subsequent analysis of neural responses to these stimuli typically focuses on regression models, with experimenter-controlled features as predictors and spike counts or firing rates as responses. Unfortunately, this approach requires knowledge in advance about the relevant features coded by a given population of neurons. For domains as complex as social interaction or natural movement, however, the relevant feature space is poorly understood, and an arbitrary a priori choice of features may give rise to confirmation bias. Here, we present a Bayesian model for exploratory data analysis that is capable of automatically identifying the features present in unstructured stimuli based solely on neuronal responses. Our approach is unique within the class of latent state space models of neural activity in that it assumes that firing rates of neurons are sensitive to multiple discrete time-varying features tied to the stimulus, each of which has Markov (or semi-Markov) dynamics. That is, we are modeling neural activity as driven by multiple simultaneous stimulus features rather than intrinsic neural dynamics. We derive a fast variational Bayesian inference algorithm and show that it correctly recovers hidden features in synthetic data, as well as ground-truth stimulus features in a prototypical neural dataset. To demonstrate the utility of the algorithm, we also apply it to cluster neural responses and demonstrate successful recovery of features corresponding to monkeys and faces in the image set.

preprint, 2015, arXiv

Dynamic Pricing in a Dual Market Environment

This paper is concerned with the determination of pricing strategies for a firm that in each period of a finite horizon receives replenishment quantities of a single product which it sells in two markets, e.g., a long-distance market and an on-site market. The key difference between the two markets is that the long-distance market provides for a one-period delay in demand fulfillment. In contrast, on-site orders must be filled immediately as the customer is at the physical on-site location. We model the demands in consecutive periods as independent random variables whose distributions depend on the item's price in accordance with two general stochastic demand functions: additive or multiplicative. The firm uses a single pool of inventory to fulfill demands from both markets. We investigate properties of the structure of the dynamic pricing strategy that maximizes the total expected discounted profit over the finite time horizon, under fixed or controlled replenishment conditions. Further, we provide conditions under which one market may be the preferred sales outlet over the other.

preprint, 2010, arXiv

Software Design Document, Testing, Deployment and Configuration Management of the UUIS--a Team 2 COMP5541-W10 Project Approach

The Software Design Document of UUIS describes the prototype design details of the system architecture, database layer, deployment and configuration details, as well as the test cases produced while working on the design and implementation of the prototype. The requirements specification of UUIS is detailed in arXiv:1005.0783.

preprint, 2010, arXiv

Software Requirements Specification of the IUfA's UUIS -- a Team 2 COMP5541-W10 Project Approach

In the 52-page document, we describe our approach to the Software Requirements Specification of the IUfA's UUIS prototype. This includes the overall system description, functional requirements, non-functional requirements, use cases, the corresponding data dictionary for all entities involved, mock user interface (UI) design, and the overall projected cost estimate. The design specification of UUIS can be found in arXiv:1005.0665.
