Source author record

Sanjeev Kulkarni

Sanjeev Kulkarni appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Machine Learning; Artificial Intelligence; Cryptography and Security; Distributed, Parallel, and Cluster Computing; Information Retrieval; Information Theory; math.ST; Networking and Internet Architecture; Systems and Control

Catalog footprint

What is connected

6 works

9 topics

4 close collaborators

preprint, 2026, arXiv

Statistical Unlearning of Distributions: A Hypothesis Testing Approach

Machine learning systems increasingly face requirements to forget not only individual data points, but entire domains of information, such as toxic language, copyrighted corpora, or demographic biases. This raises a fundamental dilemma of statistical-computational tradeoffs: removing all samples from an unwanted domain may be computationally prohibitive, while randomly removing a subset may not provide distribution-level statistical guarantees. We propose a statistical framework for distributional unlearning, in which domains are modeled as probability distributions, and the goal is to remove a carefully chosen subset of samples that reduces the effect of an unwanted distribution while preserving performance on a desired one. We formalize this using a hypothesis test of the edited data with the desired and unwanted domains, leading to an interpretable and robust criterion for selecting samples to remove. Within this statistical framework, we characterize the fundamental region of the allowable edited data distributions and the removal-preservation Pareto frontier for a broad class of distribution families. This includes parametric families such as shifted Gaussians of arbitrary dimension, a one-dimensional location family with log-concave noise, and the one-dimensional Poisson family. It also includes nonparametric families such as the Gaussian white noise model, a canonical model for nonparametric regression. We prove composition rules that describe how distributional unlearning behaves across multimodal unwanted domains, and introduce a central-limit behavior for the removal-preservation baselines when composing a large number of such families. Finally, we provide finite sample guarantees by providing Pareto frontiers for some selection algorithms, and observe an information-computation gap.

preprint, 2014, arXiv

An Upper Bound on the Convergence Time for Quantized Consensus of Arbitrary Static Graphs

We analyze a class of distributed quantized consensus algorithms for arbitrary static networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimates by communicating with their neighbors over a limited-capacity channel in an asynchronous clock setting. Eventually, all nodes reach consensus with quantized precision. We analyze the expected convergence time for the general quantized consensus algorithm proposed by Kashyap et al. We use the theory of electric networks, random walks, and couplings of Markov chains to derive an $O(N^3\log N)$ upper bound for the expected convergence time on an arbitrary graph of size $N$, improving on the state-of-the-art bound of $O(N^5)$ for quantized consensus algorithms. Our result does not depend on graph topology. An example on complete graphs is given to show how to extend the analysis to graphs with a given topology.
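To make the setting concrete, here is a minimal sketch of one update rule in the spirit of this class of algorithms. It is an illustrative assumption, not the exact rule analyzed in the paper: at each tick a random edge is chosen, and its two endpoints split their combined integer value as evenly as possible, trading places when they differ by one. The function names, the ring graph, and the initial values are all hypothetical.

```python
import random

def gossip_step(values, edges, rng):
    """Pick a random edge and rebalance its two endpoints, preserving their sum."""
    i, j = rng.choice(edges)
    total = values[i] + values[j]
    low = total // 2        # floor of the pairwise average
    high = total - low      # ceiling of the pairwise average
    # The endpoint that held the smaller value receives the larger share,
    # so a pair differing by exactly 1 simply swaps values.
    if values[i] <= values[j]:
        values[i], values[j] = high, low
    else:
        values[i], values[j] = low, high

def run_until_consensus(values, edges, rng, max_steps=100_000):
    """Iterate until all values are within 1 of each other (quantized consensus)."""
    values = list(values)
    for step in range(max_steps):
        if max(values) - min(values) <= 1:
            return values, step
        gossip_step(values, edges, rng)
    return values, max_steps

# Hypothetical example: a ring of 6 nodes with integer initial values.
ring = [(k, (k + 1) % 6) for k in range(6)]
final, steps = run_until_consensus([0, 10, 3, 7, 1, 4], ring, random.Random(0))
print(final, steps)
```

Because every step preserves the sum, the final values are the floor and ceiling of the true mean, which is what consensus "with quantized precision" means here.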

preprint, 2014, arXiv

Cooperative Caching based on File Popularity Ranking in Delay Tolerant Networks

Increasing storage sizes and WiFi/Bluetooth capabilities of mobile devices have made them a good platform for opportunistic content sharing. In this work we propose a network model to study this in a setting with two characteristics: 1. delay tolerant; 2. lack of infrastructure. Mobile users generate requests and opportunistically download from other users they meet, via Bluetooth or WiFi. The difference in popularity of different web content induces a non-uniform request distribution, which is usually a Zipf's law distribution. We evaluate the performance of different caching schemes and derive the optimal scheme using convex optimization techniques. The optimal solution is found efficiently using a binary search method. It is shown that as the network mobility increases, the performance of the optimal scheme far exceeds the traditional caching scheme. To the best of our knowledge, our work is the first to consider popularity ranking in performance evaluation.

preprint, 2014, arXiv

The Application of Differential Privacy for Rank Aggregation: Privacy and Accuracy

The potential risk of privacy leakage prevents users from sharing their honest opinions on social platforms. This paper addresses the problem of privacy preservation if the query returns the histogram of rankings. The framework of differential privacy is applied to rank aggregation. The error probability of the aggregated ranking is analyzed as a result of noise added in order to achieve differential privacy. Upper bounds on the error rates for any positional ranking rule are derived under the assumption that profiles are uniformly distributed. Simulation results are provided to validate the probabilistic analysis.
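The pipeline described in this abstract can be sketched as follows, under assumptions that the abstract does not state: noise is drawn from the Laplace mechanism, the positional rule is Borda count, and the histogram is given L1 sensitivity 2 (replacing one voter's ranking changes two counts by one each). All function names are hypothetical.

```python
import itertools
import random
from collections import Counter

def ranking_histogram(profiles, candidates):
    """Count how many voters submitted each possible ranking (permutation)."""
    counts = Counter(tuple(p) for p in profiles)
    return {perm: counts.get(perm, 0)
            for perm in itertools.permutations(candidates)}

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_histogram(hist, epsilon, rng):
    # Assumed L1 sensitivity of 2 (one voter's ranking is replaced).
    scale = 2.0 / epsilon
    return {perm: count + laplace_noise(scale, rng)
            for perm, count in hist.items()}

def borda_ranking(hist, candidates):
    """Aggregate a (possibly noisy) histogram with the Borda positional rule."""
    m = len(candidates)
    scores = {c: 0.0 for c in candidates}
    for perm, weight in hist.items():
        for position, c in enumerate(perm):
            scores[c] += weight * (m - 1 - position)
    # Highest score first; ties broken alphabetically.
    return sorted(candidates, key=lambda c: (-scores[c], c))

candidates = ["a", "b", "c"]
profiles = [("a", "b", "c")] * 5 + [("b", "a", "c")] * 2
hist = ranking_histogram(profiles, candidates)
noisy = private_histogram(hist, epsilon=1.0, rng=random.Random(0))
print(borda_ranking(hist, candidates), borda_ranking(noisy, candidates))
```

An aggregation error occurs when the ranking from the noisy histogram differs from the ranking from the true one; a smaller `epsilon` means more noise and a higher chance of such a flip.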

preprint, 2013, arXiv

Privacy Preserving Recommendation System Based on Groups

Recommendation systems have received considerable attention in recent decades. Yet with the development of information technology and social media, the risk of revealing private data to service providers has become a growing concern for more and more users. Trade-offs between quality and privacy in recommendation systems naturally arise. In this paper, we present a privacy-preserving recommendation framework based on groups. The main idea is to use groups as a natural middleware to preserve users' privacy. A distributed preference exchange algorithm is proposed to ensure the anonymity of data, wherein the effective size of the anonymity set asymptotically approaches the group size with time. We construct a hybrid collaborative filtering model based on Markov random walks to provide recommendations and predictions to group members. Experimental results on the MovieLens and Epinions datasets show that our proposed methods outperform the baseline methods, L+ and ItemRank, two state-of-the-art personalized recommendation algorithms, in both recommendation precision and hit rate despite the absence of personal preference information.

preprint, 2007, arXiv

Probabilistic coherence and proper scoring rules

We provide a self-contained proof of a theorem relating probabilistic coherence of forecasts to their non-domination by rival forecasts with respect to any proper scoring rule. The theorem appears to be new but is closely related to results achieved by other investigators.
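A small numerical illustration of domination may help. It is a hand-picked example using the Brier (quadratic) score, one well-known proper scoring rule, and is not taken from the paper. An incoherent forecast assigns 0.6 to both an event A and its complement, so the probabilities sum to more than 1; the coherent rival (0.5, 0.5) incurs a strictly smaller penalty whichever way A turns out.

```python
def brier_penalty(forecast, outcome):
    """Sum of squared errors between forecast probabilities and 0/1 outcomes."""
    return sum((outcome[event] - p) ** 2 for event, p in forecast.items())

# Hypothetical forecasts for an event A and its complement.
incoherent = {"A": 0.6, "not A": 0.6}   # sums to 1.2, violating the axioms
coherent = {"A": 0.5, "not A": 0.5}     # a valid probability assignment

# The two possible states of the world.
worlds = [{"A": 1, "not A": 0}, {"A": 0, "not A": 1}]

for world in worlds:
    print(world, brier_penalty(incoherent, world), brier_penalty(coherent, world))
```

In both worlds the coherent forecast is penalized less, so the incoherent forecast is dominated in the sense the abstract describes.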