Researcher profile

Qiyuan An

Qiyuan An contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 11 - Unverified
Verification L1
Unclaimed author
1 work
0 followers
1 topic
4 close collaborators

Published work

1 published item

Preprint, 2022, arXiv

A Privacy-Preserving Unsupervised Domain Adaptation Framework for Clinical Text Analysis

Unsupervised domain adaptation (UDA) generally aligns unlabeled target-domain data to the distribution of the source domain to mitigate distribution shift. Standard UDA requires sharing the source data with the target, which risks leaking private data. To protect the source data's privacy, we first propose sharing the source feature distribution instead of the source data. However, sharing only the source feature distribution may still be vulnerable to membership inference attacks, in which an attacker infers an individual's membership through black-box access to the source model. To resolve this privacy issue, we further study the under-explored problem of privacy-preserving domain adaptation and propose a method with a novel differential privacy training strategy to protect the source data's privacy. We model the source feature distribution with Gaussian Mixture Models (GMMs) under the differential privacy setting and send it to the target client for adaptation. The target client resamples differentially private source features from the GMMs and adapts on target data with several state-of-the-art UDA backbones. With our proposed method, the source data provider can avoid leaking source data during domain adaptation while preserving utility. To evaluate our method's utility and privacy loss, we apply our model to a medical report disease label classification task using two noisy, challenging clinical text datasets. The results show that our proposed method preserves the source data's privacy with only a minor impact on text classification performance.
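
The share-a-distribution-then-resample idea in the abstract can be illustrated with a minimal sketch. This is not the paper's differential privacy training strategy, which the abstract does not specify. As a stand-in, it uses the classic Gaussian mechanism on L2-clipped per-class feature means, with one Gaussian component per class and a fixed, publicly chosen component standard deviation. All function names and parameters (`clip_l2`, `gaussian_sigma`, `private_class_means`, `resample`, `max_norm`, `component_std`) are hypothetical, and the privacy accounting is deliberately simplified: class sizes are treated as public and neighbouring datasets are assumed to differ in one record's features only.

```python
import math
import random


def clip_l2(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism (stated for 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("this bound assumes 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_class_means(features, labels, max_norm, epsilon, delta, rng):
    """Source side: return {label: noisy mean of the clipped feature vectors}.

    Assumption: replacing one clipped record moves a class mean by at most
    2 * max_norm / n in L2 norm, where n is the (public) class size.
    """
    by_label = {}
    for vec, label in zip(features, labels):
        by_label.setdefault(label, []).append(clip_l2(vec, max_norm))
    means = {}
    for label, vecs in by_label.items():
        n = len(vecs)
        sigma = gaussian_sigma(2.0 * max_norm / n, epsilon, delta)
        dim = len(vecs[0])
        means[label] = [
            sum(v[d] for v in vecs) / n + rng.gauss(0.0, sigma)
            for d in range(dim)
        ]
    return means


def resample(means, component_std, per_class, rng):
    """Target side: draw synthetic (feature, label) pairs from the shared model."""
    samples = []
    for label, mean in means.items():
        for _ in range(per_class):
            samples.append(([rng.gauss(m, component_std) for m in mean], label))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy source features: two classes in two dimensions.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labs = [0, 0, 1, 1]
    shared = private_class_means(feats, labs, max_norm=1.0,
                                 epsilon=0.5, delta=1e-5, rng=rng)
    synthetic = resample(shared, component_std=0.1, per_class=3, rng=rng)
    print(len(synthetic), "synthetic source features")
```

In this sketch only the noisy means leave the source side; the target client would feed the resampled features, in place of real source features, into whichever UDA backbone it uses. With so few records per class the added noise dominates the means, so the toy data only demonstrates the data flow, not a useful privacy and utility trade-off.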