Source author record

Keke Chen

Keke Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Cryptography and Security; Artificial Intelligence; Databases; Distributed, Parallel, and Cluster Computing; eess.SY; Information Retrieval; Machine Learning; Systems and Control

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators

preprint, 2026, arXiv

Membership Inference Attacks on Recommender System: A Survey

Recommender systems (RecSys) have been widely applied in areas including e-commerce, finance, healthcare, and social media, and have become increasingly influential in shaping user behavior and decision-making. However, recent studies have shown that RecSys are vulnerable to membership inference attacks (MIAs), which aim to infer whether a user interaction record was used to train a target model. MIAs on RecSys models can directly lead to a privacy breach. For example, by identifying that a purchase record associated with a specific user was used to train a RecSys, an attacker can infer that user's special quirks. In recent years, MIAs have been shown to be effective on other ML tasks, e.g., classification models and natural language processing. However, traditional MIAs are ill-suited for RecSys because the posterior probabilities they rely on are not observable. Although MIAs on RecSys form a newly emerging and rapidly growing research area, there has been no systematic survey on this topic yet. In this article, we conduct the first comprehensive survey on RecSys MIAs. This survey reviews the latest advancements in RecSys MIAs, exploring the design principles, challenges, attacks, and defenses associated with this emerging field. We provide a unified taxonomy that categorizes different RecSys MIAs based on their characterizations and discuss their pros and cons. Based on the limitations and gaps identified in this survey, we point out several promising future research directions for researchers who wish to follow this area. This survey not only serves as a reference for the research community but also provides a clear description for researchers outside this research domain.

preprint, 2022, arXiv

A Comparative Study of Image Disguising Methods for Confidential Outsourced Learning

Large training data and expensive model tweaking are standard features of deep learning for images. As a result, data owners often utilize cloud resources to develop large-scale complex models, which raises privacy concerns. Existing solutions are either too expensive to be practical or do not sufficiently protect the confidentiality of data and models. In this paper, we study and compare novel image disguising mechanisms, DisguisedNets and InstaHide, aiming to achieve a better trade-off among the level of protection for outsourced DNN model training, the expenses, and the utility of data. DisguisedNets are novel combinations of image blocktization, block-level random permutation, and two block-level secure transformations: random multidimensional projection (RMT) and AES pixel-level encryption (AES). InstaHide is an image mixup and random pixel flipping technique [huang20]. We have analyzed and evaluated them under a multi-level threat model. RMT provides a better security guarantee than InstaHide under Level-1 adversarial knowledge, with well-preserved model quality. In contrast, AES provides a security guarantee under Level-2 adversarial knowledge, but it may affect model quality more. The unique features of image disguising also help us to protect models from model-targeted attacks. We have done an extensive experimental evaluation to understand how these methods work in different settings for different datasets.
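
The three ingredients named in the abstract above (blocktization, block-level random permutation, and a block-level random projection) can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names, the use of one Gaussian matrix per block position, and the restriction to a single grayscale image are all assumptions made for illustration, and the AES variant is not shown.

```python
import numpy as np

def _to_blocks(image, block_size):
    """Blocktization: return an array with one flattened block per row."""
    h, w = image.shape
    rows, cols = h // block_size, w // block_size
    return (image.reshape(rows, block_size, cols, block_size)
                 .swapaxes(1, 2)
                 .reshape(rows * cols, block_size * block_size))

def _from_blocks(blocks, shape, block_size):
    """Inverse of _to_blocks: reassemble flattened blocks into an image."""
    h, w = shape
    rows, cols = h // block_size, w // block_size
    return (blocks.reshape(rows, cols, block_size, block_size)
                  .swapaxes(1, 2)
                  .reshape(h, w))

def disguise_image(image, block_size, rng):
    """Hypothetical disguising step: project every block, then shuffle blocks."""
    h, w = image.shape
    if h % block_size or w % block_size:
        raise ValueError("image sides must be multiples of block_size")
    blocks = _to_blocks(image.astype(np.float64), block_size)
    n_blocks, d = blocks.shape
    # Assumed key, part 1: one random d x d projection matrix per block position.
    mats = rng.standard_normal((n_blocks, d, d))
    projected = np.einsum("nij,nj->ni", mats, blocks)
    # Assumed key, part 2: a random permutation of the block positions.
    perm = rng.permutation(n_blocks)
    disguised = _from_blocks(projected[perm], (h, w), block_size)
    return disguised, (mats, perm)

def undisguise_image(disguised, block_size, key):
    """Key holder's inverse: undo the permutation, then undo each projection."""
    mats, perm = key
    blocks = _to_blocks(disguised, block_size)
    unshuffled = np.empty_like(blocks)
    unshuffled[perm] = blocks
    recovered = np.stack([np.linalg.solve(m, b) for m, b in zip(mats, unshuffled)])
    return _from_blocks(recovered, disguised.shape, block_size)
```

In an outsourced-learning setting only the disguised images would leave the data owner; the inverse is included here just to show that the key fully determines the transformation.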

preprint, 2020, arXiv

SGX-MR: Regulating Dataflows for Protecting Access Patterns of Data-Intensive SGX Applications

Intel SGX has been a popular trusted execution environment (TEE) for protecting the integrity and confidentiality of applications running on untrusted platforms such as the cloud. However, the access patterns of SGX-based programs can still be observed by adversaries, which may leak important information for successful attacks. Researchers have been experimenting with Oblivious RAM (ORAM) to address the privacy of access patterns. ORAM is a powerful low-level primitive that provides application-agnostic protection for any I/O operation, but at a high cost. We find that some application-specific access patterns, such as sequential block I/O, do not provide additional information to adversaries. Others, such as sorting, can be replaced with specific oblivious algorithms that are more efficient than ORAM. The challenge is that developers may need to look into all the details of application-specific access patterns to design suitable solutions, which is time-consuming and error-prone. In this paper, we present the lightweight SGX-based MapReduce (SGX-MR) approach that regulates the dataflow of data-intensive SGX applications for easier application-level access-pattern analysis and protection. It uses the MapReduce framework to cover a large class of data-intensive applications, and the entire framework can be implemented with a small memory footprint. With this framework, we have examined the stages of data processing, identified the access patterns that need protection, and designed corresponding efficient protection methods. Our experiments show that SGX-MR based applications are much more efficient than ORAM-based implementations.
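
The abstract above notes that sorting can be replaced with an oblivious algorithm. One well-known example of such an algorithm is a bitonic sorting network, sketched below; whether SGX-MR uses this particular network is not stated in the abstract, so treat the choice as an assumption. The point of the sketch is that the sequence of compared positions depends only on the input length, never on the values. Plain Python gives no constant-time guarantee, so this shows the access pattern only, not a hardened enclave implementation.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network.

    Returns the sorted list and the trace of compared index pairs.
    The trace is a function of len(values) only.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(values)
    trace = []
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    trace.append((i, partner))
                    lo, hi = data[i], data[partner]
                    swap = (lo > hi) == ascending
                    # Both positions are rewritten on every step, so the
                    # write pattern does not reveal whether a swap occurred.
                    data[i], data[partner] = (hi, lo) if swap else (lo, hi)
            j //= 2
        k *= 2
    return data, trace
```

Running the function on two different inputs of the same length yields identical traces, which is the property an observer of memory accesses would see.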

preprint, 2013, arXiv

Building Confidential and Efficient Query Services in the Cloud with RASP Data Perturbation

With the wide deployment of public cloud computing infrastructures, using clouds to host data query services has become an appealing solution for its advantages in scalability and cost savings. However, some data might be so sensitive that the data owner does not want to move them to the cloud unless data confidentiality and query privacy are guaranteed. On the other hand, a secured query service should still provide efficient query processing and significantly reduce the in-house workload to fully realize the benefits of cloud computing. We propose the RASP data perturbation method to provide secure and efficient range query and kNN query services for protected data in the cloud. The RASP data perturbation method combines order-preserving encryption, dimensionality expansion, random noise injection, and random projection to provide strong resilience to attacks on the perturbed data and queries. It also preserves multidimensional ranges, which allows existing indexing techniques to be applied to speed up range query processing. The kNN-R algorithm is designed to work with the RASP range query algorithm to process the kNN queries. We have carefully analyzed the attacks on data and queries under a precisely defined threat model and realistic security assumptions. Extensive experiments have been conducted to show the advantages of this approach on efficiency and security.
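
The four components listed in the abstract above (order-preserving encryption, dimensionality expansion, random noise injection, and random projection) can be composed as in the following sketch. The order of composition, the function names, and the noise range are assumptions for illustration; in particular, `toy_ope` is only a strictly increasing affine map standing in for a real order-preserving encryption scheme and offers no security. Query transformation and the kNN-R algorithm are not shown.

```python
import numpy as np

def toy_ope(x, scale=3.0, shift=7.0):
    """Stand-in for order-preserving encryption: strictly increasing, NOT secure."""
    return scale * x + shift

def toy_ope_inverse(y, scale=3.0, shift=7.0):
    """Inverse of the toy order-preserving map."""
    return (y - shift) / scale

def rasp_like_perturb(record, secret_matrix, rng):
    """Perturb a k-dimensional record into k + 2 dimensions.

    secret_matrix is an assumed invertible (k + 2) x (k + 2) key.
    """
    noise = rng.uniform(1.0, 10.0)                 # random noise injection
    extended = np.concatenate([toy_ope(record),    # order-preserving step
                               [1.0, noise]])      # dimensionality expansion
    return secret_matrix @ extended                # random projection

def rasp_like_recover(perturbed, secret_matrix):
    """Key holder's inverse: undo the projection, drop the two extra dimensions."""
    extended = np.linalg.solve(secret_matrix, perturbed)
    return toy_ope_inverse(extended[:-2])
```

Because fresh noise is drawn for every call, perturbing the same record twice gives two different vectors, while the key holder recovers the same original record from either.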

preprint, 2013, arXiv

Secure Computation of Top-K Eigenvectors for Shared Matrices in the Cloud

With the development of sensor networks, mobile computing, and web applications, data are now collected from many distributed sources to form big datasets. Such datasets can be hosted in the cloud to achieve economical processing. However, these data might be highly sensitive, requiring secure storage and processing. We envision a cloud-based data storage and processing framework that enables users to economically and securely share and handle big datasets. Under this framework, we study the matrix-based data mining algorithms with a focus on the secure top-k eigenvector algorithm. Our approach uses an iterative processing model in which the authorized user interacts with the cloud to achieve the result. In this process, both the source matrix and the intermediate results remain confidential and the client side incurs low costs. The security of this approach is guaranteed by using Paillier Encryption and a random perturbation technique. We carefully analyze its security under a cloud-specific threat model. Our experimental results show that the proposed method is scalable to big matrices while requiring low client-side costs.
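
The abstract above relies on Paillier encryption, whose additive homomorphism lets a server multiply an encrypted matrix by a plaintext vector without decrypting anything. The sketch below shows only that building block with a textbook Paillier scheme (generator fixed to n + 1) and tiny demo primes. It is not the paper's protocol: the random perturbation that hides the iterate from the cloud is omitted, the function names are hypothetical, and keys of this size are insecure.

```python
import math
import random

def keygen(p, q):
    """Textbook Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)        # modular inverse, needs Python 3.8+
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Decrypt ciphertext c with private key (lam, mu)."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def encrypted_matvec(n, enc_matrix, vector):
    """Server side: compute Enc(M @ x) from Enc(M) and a plaintext x.

    Uses Enc(a) * Enc(b) = Enc(a + b) and Enc(a) ** k = Enc(k * a);
    vector entries must be non-negative integers.
    """
    n2 = n * n
    result = []
    for row in enc_matrix:
        acc = 1                 # 1 is a valid encryption of 0
        for c, x in zip(row, vector):
            acc = (acc * pow(c, x, n2)) % n2
        result.append(acc)
    return result
```

For example, encrypting the matrix [[2, 1], [1, 3]] entry by entry, calling `encrypted_matvec` with the vector [1, 2], and decrypting the two results gives [4, 7], the same as the plaintext product. Repeating such a product is the core step of a power-iteration style eigenvector computation.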
