Researcher profile

Sandeep Sen

Sandeep Sen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 (Unverified) | Verification L1 | Unclaimed author
3 works
0 followers
3 topics
4 close collaborators

Published work

3 published items

Preprint, 2012, arXiv

A simple D^2-sampling based PTAS for k-means and other Clustering Problems

Given a set of points $P \subset \mathbb{R}^d$, the $k$-means clustering problem is to find a set of $k$ {\em centers} $C = \{c_1,\ldots,c_k\}$, $c_i \in \mathbb{R}^d$, such that the objective function $\sum_{x \in P} d(x,C)^2$ is minimized, where $d(x,C)$ denotes the distance between $x$ and the closest center in $C$. This is one of the most prominent objective functions studied in clustering. $D^2$-sampling \cite{ArthurV07} is a simple non-uniform sampling technique for choosing points from a set of points. It works as follows: given a set of points $P \subseteq \mathbb{R}^d$, the first point is chosen uniformly at random from $P$. Subsequently, a point from $P$ is chosen as the next sample with probability proportional to the square of its distance to the nearest previously sampled point. $D^2$-sampling has been shown to have nice properties with respect to the $k$-means clustering problem. Arthur and Vassilvitskii \cite{ArthurV07} show that $k$ points chosen as centers from $P$ using $D^2$-sampling give an $O(\log{k})$ approximation in expectation. Ailon et al. \cite{AJMonteleoni09} and Aggarwal et al. \cite{AggarwalDK09} extended the results of \cite{ArthurV07} to show that $O(k)$ points chosen as centers using $D^2$-sampling give an $O(1)$ approximation to the $k$-means objective function with high probability. In this paper, we further demonstrate the power of $D^2$-sampling by giving a simple randomized $(1 + \varepsilon)$-approximation algorithm that uses $D^2$-sampling at its core.
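The $D^2$-sampling procedure described in this abstract can be sketched in a few lines of Python. This is only an illustration of the sampling step and of the $k$-means objective as stated above; it is not the paper's $(1 + \varepsilon)$-approximation algorithm. The function names (`sq_dist`, `d2_sample`, `kmeans_cost`), the tuple representation of points, and the uniform fallback when all remaining weights are zero are assumptions made for this sketch.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def d2_sample(points, k, rng=None):
    """Pick k centers from `points` by D^2-sampling (hypothetical helper).

    The first center is uniform at random; each later center is drawn with
    probability proportional to its squared distance to the nearest center
    chosen so far.
    """
    rng = rng or random.Random()
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption: if every point coincides with a chosen center,
            # fall back to a uniform choice so the loop still terminates.
            nxt = rng.choice(points)
        else:
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the newly added center can lower a point's nearest distance.
        d2 = [min(old, sq_dist(p, nxt)) for old, p in zip(d2, points)]
    return centers


def kmeans_cost(points, centers):
    """The k-means objective: sum over x in P of d(x, C)^2."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

For example, `d2_sample(points, 3)` followed by `kmeans_cost(points, centers)` gives the cost of the sampled centers. Each round costs time linear in the number of points because the nearest-center distances are updated incrementally rather than recomputed.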

Preprint, 2012, arXiv

Efficient cache oblivious algorithms for randomized divide-and-conquer on the multicore model

In this paper we present randomized algorithms for sorting and convex hulls that achieve optimal performance (in speed-up and cache misses) on the multicore model with private caches. Our algorithms are cache oblivious and generalize the randomized divide-and-conquer strategy given by Reischuk and by Reif and Sen. Although that approach yielded optimal speed-up in the PRAM model, we require additional techniques to optimize cache misses in an oblivious setting. Under a mild assumption on the input and the number of processors, our algorithms achieve optimal time and cache misses with high probability. Although similar results have been obtained recently for sorting, we feel that our approach is simpler and more general, and we apply it to obtain an optimal parallel algorithm for 3D convex hulls with similar bounds. We also present a simple randomized processor allocation technique that does not require explicit knowledge of the number of processors and is likely to find additional applications in resource-oblivious environments.

Preprint, 2012, arXiv

Maintaining Approximate Maximum Weighted Matching in Fully Dynamic Graphs

We present a fully dynamic algorithm for maintaining an approximate maximum weight matching in general weighted graphs. The algorithm maintains a matching $\mathcal{M}$ whose weight is at least $\frac{1}{8} M^{*}$, where $M^{*}$ is the weight of the maximum weight matching. The algorithm achieves an expected amortized $O(\log n \log \mathcal{C})$ time per edge insertion or deletion, where $\mathcal{C}$ is the ratio of the weight of the highest-weight edge to that of the smallest-weight edge in the given graph. Using a simple randomized scaling technique, we are able to obtain a matching with expected approximation ratio 4.9108.