Researcher profile

Kaiwen Yang

Kaiwen Yang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
3 topics
4 close collaborators

Published work

2 published items

preprint | 2026 | arXiv

VPD-100K: Towards Generalizable and Fine-grained Visual Privacy Protection

Privacy protection has become a critical requirement in the era of ubiquitous visual data sharing, imposing higher demands on efficient and robust privacy detection algorithms. However, current robust detection models are severely hindered by the lack of comprehensive datasets. Existing privacy-oriented datasets often suffer from limited scale, coarse-grained annotations, and narrow domain coverage, failing to capture the intricate details of sensitive information in real-world environments. To bridge this gap, we present a large-scale, fine-grained Visual Privacy Dataset (VPD-100K), designed to facilitate generalized privacy detection. We establish a holistic taxonomy comprising four primary domains: Human Presence, On-Screen Personally Identifiable Information (PII), Physical Identifiers, and Location Indicators. The dataset contains 100,000 images annotated with 33 fine-grained classes and over 190,000 object instances. Statistical analysis reveals that our dataset features long-tailed distributions, small object scales, and high visual complexity. These characteristics make the dataset particularly valuable for demanding, unconstrained applications such as live streaming, where actors frequently face unintentional, real-time information leakage. Furthermore, we design an effective frequency-enhanced lightweight module, consisting of frequency-domain attention fusion and an adaptive spectral gating mechanism, that breaks the limitations of spatial pixel intensity to better capture the subtle details of sensitive information. Extensive experiments conducted on diverse image and streaming-video benchmarks consistently demonstrate the effectiveness of our VPD-100K dataset and the well-curated frequency mechanism. The code and dataset are available at https://vpd-100k.github.io/.
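
The abstract does not specify how the spectral gating mechanism is built, so the following is only a generic illustration of the idea of gating in the frequency domain, not the authors' module. It transforms a 1-D signal with a plain DFT, scales each frequency bin by a sigmoid gate, and transforms back. The names `spectral_gate` and `gate_logits` are hypothetical, and in a real model the logits would be learned rather than supplied by hand.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_gate(signal, gate_logits):
    """Scale each frequency bin by sigmoid(logit) and return the real part.

    Assumption: gate_logits is symmetric (logit[k] == logit[n - k] for k >= 1),
    so the gated spectrum still corresponds to a real signal. Otherwise the
    discarded imaginary part is not negligible.
    """
    spectrum = dft(signal)
    gates = [1.0 / (1.0 + math.exp(-g)) for g in gate_logits]
    gated = [s * g for s, g in zip(spectrum, gates)]
    return [value.real for value in idft(gated)]


# Gates near 1 everywhere leave the signal essentially unchanged.
passthrough = spectral_gate([1.0, 2.0, 3.0, 4.0], [50.0, 50.0, 50.0, 50.0])

# Keeping only the zero-frequency bin reduces the signal to its mean.
mean_only = spectral_gate([1.0, 2.0, 3.0, 4.0], [50.0, -50.0, -50.0, -50.0])
```

Because the gate acts per frequency rather than per pixel, it can emphasize or suppress fine detail independently of local intensity, which is the general motivation the abstract gives for working in the frequency domain.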

preprint | 2022 | arXiv

Defect Identification, Categorization, and Repair: Better Together

Just-In-Time defect prediction (JIT-DP) models can identify defect-inducing commits at check-in time. Even though previous studies have achieved great progress, they still have the following limitations: 1) useful information (e.g., semantic information and structure information) is not fully used; 2) existing work can only label a commit as buggy or clean, with no further information about the type of defect; 3) a commit may involve changes in many files, which makes it difficult to locate the defect; 4) prior studies treat defect identification and defect repair as separate tasks, and none aims to handle both tasks simultaneously. In this paper, to handle the aforementioned limitations, we propose a comprehensive defect prediction and repair framework named CompDefect, which can identify whether a changed function (a finer-grained level) is defect-prone, categorize the type of defect, and repair such a defect automatically if it falls into several scenarios, e.g., defects with single-statement fixes, or those that match a small set of defect templates. Generally, the first two tasks in CompDefect are treated as a multiclass classification task, while the last one is treated as a sequence generation task. The whole input of CompDefect consists of three parts (illustrated with positive functions): the clean version of a function (i.e., the version before the defect was introduced), the buggy version of the function, and the fixed version of the function. In the multiclass classification task, CompDefect categorizes the type of defect using the information in both the clean version and the buggy version. In the code sequence generation task, CompDefect repairs the defect once identified or keeps the function unchanged.
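
As a rough illustration of the three-part input described above, the sketch below shows how one function sample might feed the two tasks. It is not CompDefect's code: the class `FunctionSample`, its field names, the `"clean"` label, and the defect type string are all hypothetical, and the handling of non-defective functions (no fixed version, target equals the input) is an assumption drawn from the phrase "keeps the function unchanged".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FunctionSample:
    clean: str                   # version before the change was made
    buggy: str                   # changed version under inspection
    fixed: Optional[str]         # repaired version; None if no defect (assumption)
    defect_type: Optional[str]   # hypothetical category label; None if no defect


def classification_example(sample: FunctionSample) -> Tuple[Tuple[str, str], str]:
    """Multiclass task: (clean, buggy) pair mapped to 'clean' or a defect type."""
    label = sample.defect_type if sample.defect_type is not None else "clean"
    return (sample.clean, sample.buggy), label


def generation_target(sample: FunctionSample) -> str:
    """Sequence generation task: the fix, or the function left unchanged."""
    return sample.fixed if sample.fixed is not None else sample.buggy


defective = FunctionSample(
    clean="return a + b",
    buggy="return a - b",
    fixed="return a + b",
    defect_type="wrong_operator",  # hypothetical label for illustration
)
harmless = FunctionSample(
    clean="return a + b",
    buggy="return b + a",
    fixed=None,
    defect_type=None,
)
```

Folding "is it defective" and "which type" into one label set is what lets a single multiclass classifier cover the first two tasks, while the generation target covers repair.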