Source author record

Aleksandra Korolova

Aleksandra Korolova appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Cryptography and Security · Machine Learning · cs.CY · Data Structures and Algorithms · Databases · Social and Information Networks

Catalog footprint

What is connected

6 works

6 topics

4 close collaborators

preprint · 2023 · arXiv

"You Can't Fix What You Can't Measure": Privately Measuring Demographic Performance Disparities in Federated Learning

As in traditional machine learning models, models trained with federated learning may exhibit disparate performance across demographic groups. Model holders must identify these disparities to mitigate undue harm to the groups. However, measuring a model's performance in a group requires access to information about group membership which, for privacy reasons, often has limited availability. We propose novel locally differentially private mechanisms to measure differences in performance across groups while protecting the privacy of group membership. To analyze the effectiveness of the mechanisms, we bound their error in estimating a disparity when optimized for a given privacy budget. Our results show that the error rapidly decreases for realistic numbers of participating clients, demonstrating that, contrary to what prior work suggested, protecting privacy is not necessarily in conflict with identifying performance disparities of federated models.
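
To give a feel for the local model of differential privacy that the abstract relies on, here is a minimal sketch of classic randomized response applied to a single binary group-membership bit. It is not the mechanism proposed in the paper; the function names `randomize_bit` and `estimate_fraction` are hypothetical, and the sketch only shows how a server can recover an aggregate fraction from individually noised reports.

```python
import math
import random


def randomize_bit(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), otherwise flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_fraction(reports, epsilon):
    """Debias the observed share of 1-reports into an estimate of the true share."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = f * p + (1 - f) * (1 - p), solved here for f.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed toy population: 30% of 20,000 clients belong to the group.
    true_bits = [1] * 6000 + [0] * 14000
    reports = [randomize_bit(b, 1.0, rng) for b in true_bits]
    print(round(estimate_fraction(reports, 1.0), 3))
```

The estimation error shrinks as the number of reporting clients grows, which is the general effect the abstract describes for its own mechanisms.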

preprint · 2021 · arXiv

Advances and Open Problems in Federated Learning

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.

preprint · 2020 · arXiv

On the Data Fight Between Cities and Mobility Providers

E-Scooters are changing transportation habits. In an attempt to oversee scooter usage, the Los Angeles Department of Transportation has put forth a specification that requests detailed data on scooter usage from scooter companies. In this work, we first argue that L.A.'s data request for using a new specification is not warranted as proposed use cases can be met by already existing specifications. Second, we show that even the existing specification, that requires companies to publish real-time data of parked scooters, puts the privacy of individuals using the scooters at risk. We then propose an algorithm that enables formal privacy and utility guarantees when publishing parked scooters data, allowing city authorities to meet their use cases while preserving riders' privacy.

preprint · 2020 · arXiv

The Power of The Hybrid Model for Mean Estimation

We explore the power of the hybrid model of differential privacy (DP), in which some users desire the guarantees of the local model of DP and others are content with receiving the trusted-curator model guarantees. In particular, we study the utility of hybrid model estimators that compute the mean of arbitrary real-valued distributions with bounded support. When the curator knows the distribution's variance, we design a hybrid estimator that, for realistic datasets and parameter settings, achieves a constant factor improvement over natural baselines. We then analytically characterize how the estimator's utility is parameterized by the problem setting and parameter choices. When the distribution's variance is unknown, we design a heuristic hybrid estimator and analyze how it compares to the baselines. We find that it often performs better than the baselines, and sometimes almost as well as the known-variance estimator. We then answer the question of how our estimator's utility is affected when users' data are not drawn from the same distribution, but rather from distributions dependent on their trust model preference. Concretely, we examine the implications of the two groups' distributions diverging and show that in some cases, our estimators maintain fairly high utility. We then demonstrate how our hybrid estimator can be incorporated as a sub-component in more complex, higher-dimensional applications. Finally, we propose a new privacy amplification notion for the hybrid model that emerges due to interaction between the groups, and derive corresponding amplification results for our hybrid estimators.
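
As a rough illustration of the hybrid setting, the sketch below combines a trusted-curator mean estimate with a local-model mean estimate using inverse-variance weights. It assumes values in [0, 1], a known variance, and the Laplace mechanism on both sides. The helper names `laplace` and `hybrid_mean` are hypothetical, and the weighting is one natural choice, not necessarily the estimator analyzed in the paper.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def hybrid_mean(curator_values, local_values, epsilon, variance, rng):
    """Combine both groups' noisy means, weighting each by its inverse variance."""
    n_c, n_l = len(curator_values), len(local_values)
    # Trusted-curator group: one noise draw on the exact mean (sensitivity 1 / n_c).
    est_c = sum(curator_values) / n_c + laplace(1.0 / (n_c * epsilon), rng)
    # Local group: every user perturbs their own value (sensitivity 1) before sending it.
    est_l = sum(v + laplace(1.0 / epsilon, rng) for v in local_values) / n_l
    # Variance of each estimate: sampling variance plus Laplace variance (2 * scale^2).
    var_c = variance / n_c + 2.0 / (n_c * epsilon) ** 2
    var_l = variance / n_l + 2.0 / (n_l * epsilon ** 2)
    w = (1.0 / var_c) / (1.0 / var_c + 1.0 / var_l)
    return w * est_c + (1.0 - w) * est_l


if __name__ == "__main__":
    rng = random.Random(1)
    # Assumed toy data: uniform values on [0, 1], so the variance is 1/12.
    curator = [rng.random() for _ in range(500)]
    local = [rng.random() for _ in range(5000)]
    print(hybrid_mean(curator, local, 1.0, 1.0 / 12.0, rng))
```

With these weights the combined estimate is never noisier than the better of the two single-group estimates, which is the intuition behind pooling both trust models.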

preprint · 2011 · arXiv

Personalized Social Recommendations - Accurate or Private?

With the recent surge of social networks like Facebook, new forms of recommendations have become possible - personalized recommendations of ads, content, and even new friend and product connections based on one's social interactions. Since recommendations may use sensitive social information, it is speculated that these recommendations are associated with privacy risks. The main contribution of this work is in formalizing these expected trade-offs between the accuracy and privacy of personalized social recommendations. In this paper, we study whether "social recommendations", or recommendations that are solely based on a user's social network, can be made without disclosing sensitive links in the social graph. More precisely, we quantify the loss in utility when existing recommendation algorithms are modified to satisfy a strong notion of privacy, called differential privacy. We prove lower bounds on the minimum loss in utility for any recommendation algorithm that is differentially private. We adapt two privacy preserving algorithms from the differential privacy literature to the problem of social recommendations, and analyze their performance in comparison to the lower bounds, both analytically and experimentally. We show that good private social recommendations are feasible only for a small subset of the users in the social network or for a lenient setting of privacy parameters.

preprint · 2010 · arXiv

On the (Im)possibility of Preserving Utility and Privacy in Personalized Social Recommendations

With the recent surge of social networks like Facebook, new forms of recommendations have become possible -- personalized recommendations of ads, content, and even new social and product connections based on one's social interactions. In this paper, we study whether "social recommendations", or recommendations that utilize a user's social network, can be made without disclosing sensitive links between users. More precisely, we quantify the loss in utility when existing recommendation algorithms are modified to satisfy a strong notion of privacy called differential privacy. We propose lower bounds on the minimum loss in utility for any recommendation algorithm that is differentially private. We also propose two recommendation algorithms that satisfy differential privacy, analyze their performance in comparison to the lower bound, both analytically and experimentally, and show that good private social recommendations are feasible only for a few users in the social network or for a lenient setting of privacy parameters.