Researcher profile

Jacob Imola

Jacob Imola contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2024arXiv

Communication-Efficient Triangle Counting under Local Differential Privacy

Triangle counting in networks under LDP (Local Differential Privacy) is a fundamental task for analyzing connection patterns or calculating a clustering coefficient while strongly protecting sensitive friendships from a central server. In particular, a recent study proposes an algorithm for this task that uses two rounds of interaction between users and the server to significantly reduce estimation error. However, this algorithm suffers from a prohibitively high communication cost due to a large noisy graph each user needs to download. In this work, we propose triangle counting algorithms under LDP with a small estimation error and communication cost. We first propose two-rounds algorithms consisting of edge sampling and carefully selecting edges each user downloads so that the estimation error is small. Then we propose a double clipping technique, which clips the number of edges and then the number of noisy triangles, to significantly reduce the sensitivity of each user's query. Through comprehensive evaluation, we show that our algorithms dramatically reduce the communication cost of the existing algorithm, e.g., from 6 hours to 8 seconds or less at a 20 Mbps download rate, while keeping a small estimation error.

preprint2022arXiv

Differentially Private Triangle and 4-Cycle Counting in the Shuffle Model

Subgraph counting is fundamental for analyzing connection patterns or clustering tendencies in graph data. Recent studies have applied LDP (Local Differential Privacy) to subgraph counting to protect user privacy even against a data collector in social networks. However, existing local algorithms suffer from extremely large estimation errors or assume multi-round interaction between users and the data collector, which requires a lot of user effort and synchronization. In this paper, we focus on a one-round of interaction and propose accurate subgraph counting algorithms by introducing a recently studied shuffle model. We first propose a basic technique called wedge shuffling to send wedge information, the main component of several subgraphs, with small noise. Then we apply our wedge shuffling to counting triangles and 4-cycles -- basic subgraphs for analyzing clustering tendencies -- with several additional techniques. We also show upper bounds on the estimation error for each algorithm. We show through comprehensive experiments that our one-round shuffle algorithms significantly outperform the one-round local algorithms in terms of accuracy and achieve small estimation errors with a reasonable privacy budget, e.g., smaller than 1 in edge DP.

preprint2022arXiv

Online $k$-means Clustering on Arbitrary Data Streams

We consider online $k$-means clustering where each new point is assigned to the nearest cluster center, after which the algorithm may update its centers. The loss incurred is the sum of squared distances from new points to their assigned cluster centers. The goal over a data stream $X$ is to achieve loss that is a constant factor of $L(X, OPT_k)$, the best possible loss using $k$ fixed points in hindsight. We propose a data parameter, $Λ(X)$, such that for any algorithm maintaining $O(k\text{poly}(\log n))$ centers at time $n$, there exists a data stream $X$ for which a loss of $Ω(Λ(X))$ is inevitable. We then give a randomized algorithm that achieves clustering loss $O(Λ(X) + L(X, OPT_k))$. Our algorithm uses $O(k\text{poly}(\log n))$ memory and maintains $O(k\text{poly}(\log n))$ cluster centers. Our algorithm also enjoys a running time of $O(k\text{poly}(\log n))$ and is the first algorithm to achieve polynomial space and time complexity in this setting. It also is the first to have provable guarantees without making any assumptions on the input data.

preprint2021arXiv

Locally Differentially Private Analysis of Graph Statistics

Differentially private analysis of graphs is widely used for releasing statistics from sensitive graphs while still preserving user privacy. Most existing algorithms however are in a centralized privacy model, where a trusted data curator holds the entire graph. As this model raises a number of privacy and security issues -- such as, the trustworthiness of the curator and the possibility of data breaches, it is desirable to consider algorithms in a more decentralized local model where no server holds the entire graph. In this work, we consider a local model, and present algorithms for counting subgraphs -- a fundamental task for analyzing the connection patterns in a graph -- with LDP (Local Differential Privacy). For triangle counts, we present algorithms that use one and two rounds of interaction, and show that an additional round can significantly improve the utility. For $k$-star counts, we present an algorithm that achieves an order optimal estimation error in the non-interactive local model. We provide new lower-bounds on the estimation error for general graph statistics including triangle counts and $k$-star counts. Finally, we perform extensive experiments on two real datasets, and show that it is indeed possible to accurately estimate subgraph counts in the local differential privacy model.