Source author record

Jiale Guo

Jiale Guo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Cryptography and Security, Artificial Intelligence

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

Preprint, 2026, arXiv

Efficient Privacy-Preserving Retrieval Augmented Generation with Distance-Preserving Encryption

Retrieval-augmented generation (RAG) has emerged as a key technique for enhancing the response quality of large language models (LLMs) without high computational cost. In traditional architectures, RAG services are provided by a single entity that hosts the dataset within a trusted local environment. However, individuals or small organizations often lack the resources to maintain data storage servers, leading them to rely on outsourced cloud storage. This dependence on untrusted third-party services introduces privacy risks. Embedding-based retrieval mechanisms, commonly used in RAG systems, are vulnerable to privacy leakage, such as vector-to-text reconstruction attacks and structural leakage via vector analysis. Several privacy-preserving RAG techniques have been proposed, but most existing approaches rely on partially homomorphic encryption, which incurs substantial computational overhead. To address these challenges, we propose an efficient privacy-preserving RAG framework (ppRAG) tailored for untrusted cloud environments that defends against vector-to-text attacks, vector analysis, and query analysis. We propose Conditional Approximate Distance-Comparison-Preserving Symmetric Encryption (CAPRISE), which encrypts embeddings while still allowing the cloud to compute similarity between an encrypted query and the encrypted database embeddings. CAPRISE preserves only the relative distance ordering between the encrypted query and each encrypted database embedding, without exposing inter-database distances, thereby enhancing both privacy and efficiency. To mitigate query analysis, we introduce differential privacy (DP) by perturbing the query embedding prior to encryption, preventing the cloud from inferring sensitive patterns. Experimental results show that ppRAG achieves efficient processing throughput, high retrieval accuracy, and strong privacy guarantees, making it a practical solution for resource-constrained users seeking secure cloud-augmented LLMs.
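The idea of ranking on encrypted embeddings can be illustrated with a deliberately simple stand-in. The sketch below is not CAPRISE: it uses a hypothetical toy key (a secret coordinate permutation, sign flips, a positive scale and a shift), which scales every Euclidean distance by the same factor and therefore keeps the distance ordering intact. Unlike the scheme described in the abstract, this toy transform also preserves distances between database entries and offers no real security. The `perturb` helper adds Gaussian noise to the query before encryption as a placeholder for the DP step; the noise scale `sigma` is an assumption and is not calibrated to any privacy budget.

```python
import math
import random


def keygen(dim, rng):
    """Hypothetical toy key: permutation, sign flips, scale and shift."""
    perm = list(range(dim))
    rng.shuffle(perm)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    scale = rng.uniform(0.5, 2.0)
    shift = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return perm, signs, scale, shift


def encrypt(vec, key):
    """Apply the secret transform; all distances are multiplied by `scale`."""
    perm, signs, scale, shift = key
    return [scale * signs[i] * vec[perm[i]] + shift[i] for i in range(len(vec))]


def perturb(vec, sigma, rng):
    """Add Gaussian noise to the query (uncalibrated stand-in for the DP step)."""
    return [v + rng.gauss(0.0, sigma) for v in vec]


def rank(query, database):
    """Return database indices sorted by Euclidean distance to the query."""
    dists = [math.dist(query, d) for d in database]
    return sorted(range(len(database)), key=dists.__getitem__)


rng = random.Random(7)
database = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.8, 0.3, 0.0], [0.0, 5.0, 0.0]]
query = [1.0, 0.1, 0.0]

key = keygen(3, rng)
enc_db = [encrypt(d, key) for d in database]
# sigma=0.0 disables the noise so that the two rankings can be compared exactly.
enc_query = encrypt(perturb(query, sigma=0.0, rng=rng), key)

plain_order = rank(query, database)      # ranking computed by the data owner
cipher_order = rank(enc_query, enc_db)   # ranking computed by the cloud
```

With the noise disabled, the cloud-side ranking matches the plaintext ranking. Raising `sigma` trades retrieval accuracy for a noisier query, which is the tension the abstract's accuracy and privacy claims refer to.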

Preprint, 2022, arXiv

Efficient Dropout-resilient Aggregation for Privacy-preserving Machine Learning

With the increasing adoption of data-hungry machine learning algorithms, personal data privacy has emerged as one of the key concerns that could hinder the success of digital transformation. As such, Privacy-Preserving Machine Learning (PPML) has received much attention from both academia and industry. However, organizations face a dilemma: on the one hand, they are encouraged to share data to enhance ML performance, but on the other hand, they could be breaching the relevant data privacy regulations. Practical PPML typically allows multiple participants to individually train their ML models, which are then aggregated to construct a global model in a privacy-preserving manner, e.g., based on multi-party computation or homomorphic encryption. Nevertheless, in most important applications of large-scale PPML, e.g., aggregating clients' gradients to update a global model in federated learning for tasks such as consumer behavior modeling in mobile application services, some participants are inevitably resource-constrained mobile devices, which may drop out of the PPML system because of their mobility. Therefore, the resilience of privacy-preserving aggregation has become an important problem to be tackled. In this paper, we propose a scalable privacy-preserving aggregation scheme that can tolerate dropout by participants at any time, and is secure against both semi-honest and active malicious adversaries by setting proper system parameters. By replacing communication-intensive building blocks with a seed-homomorphic pseudo-random generator, and relying on the additive homomorphic property of the Shamir secret sharing scheme, our scheme outperforms state-of-the-art schemes by up to 6.37$\times$ in runtime and provides stronger dropout resilience. The simplicity of our scheme makes it attractive both for implementation and for further improvements.
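The additive homomorphic property of Shamir secret sharing mentioned above can be shown in a few lines. This is a minimal sketch of that one building block only, not the paper's scheme: it omits the seed-homomorphic pseudo-random generator and any defense against malicious parties. The field modulus `PRIME`, the function names, and the parameters (`n` share holders, threshold `t`) are illustrative assumptions. Each client splits its value into shares, every holder adds up the shares it received, and any `t` surviving holders suffice to reconstruct the sum, which is where the tolerance to dropout comes from.

```python
import random

PRIME = 2**61 - 1  # illustrative field modulus (a Mersenne prime)


def share(secret, n, t, rng):
    """Split `secret` into n Shamir shares; any t of them reconstruct it."""
    coeffs = [secret % PRIME] + [rng.randrange(PRIME) for _ in range(t - 1)]
    shares = {}
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares[x] = y
    return shares


def reconstruct(points):
    """Lagrange interpolation at x = 0 from a dict {x: y} of shares."""
    secret = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def aggregate(values, n=5, t=3, surviving=(1, 3, 5), seed=0):
    """Sum client values via share-wise addition, using only surviving holders."""
    # random.Random is NOT cryptographically secure; it only keeps the demo reproducible.
    rng = random.Random(seed)
    per_client = [share(v, n, t, rng) for v in values]
    # Each holder x adds the shares it received; the result is a share of the sum.
    summed = {x: sum(s[x] for s in per_client) % PRIME for x in range(1, n + 1)}
    return reconstruct({x: summed[x] for x in surviving})


total = aggregate([12, 30, 7])  # 49, recovered from holders 1, 3 and 5 only
```

Holders 2 and 4 play the role of dropped participants here. No individual value is ever reconstructed, only the aggregate.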

Preprint, 2022, arXiv

Privacy-Preserving Aggregation in Federated Learning: A Survey

In recent years, with the increasing adoption of Federated Learning (FL) algorithms and growing concerns over personal data privacy, Privacy-Preserving Federated Learning (PPFL) has attracted tremendous attention from both academia and industry. Practical PPFL typically allows multiple participants to individually train their machine learning models, which are then aggregated to construct a global model in a privacy-preserving manner. As such, Privacy-Preserving Aggregation (PPAgg), the key protocol in PPFL, has received substantial research interest. This survey aims to fill the gap between the large number of studies on PPFL, where PPAgg is adopted to provide a privacy guarantee, and the lack of a comprehensive survey of the PPAgg protocols applied in FL systems. In this survey, we review the PPAgg protocols proposed to address privacy and security issues in FL systems. The focus is placed on the construction of PPAgg protocols, with an extensive analysis of the advantages and disadvantages of the selected protocols and solutions. Additionally, we discuss the open-source FL frameworks that support PPAgg. Finally, we highlight important challenges and future research directions for applying PPAgg to FL systems and for combining PPAgg with other technologies to further improve security.
