Source author record

Sanjeev Khanna

Sanjeev Khanna appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Data Structures and Algorithms; Computer Science and Game Theory; Discrete Mathematics; Databases; Information Theory; Machine Learning; math.IT; physics.soc-ph; q-fin.CP; q-fin.RM; Social and Information Networks

Catalog footprint

What is connected

26 works

11 topics

4 close collaborators


preprint · 2022 · arXiv

New Trade-Offs for Fully Dynamic Matching via Hierarchical EDCS

We study the maximum matching problem in fully dynamic graphs: a graph is undergoing both edge insertions and deletions, and the goal is to efficiently maintain a large matching after each edge update. This problem has received considerable attention in recent years. The known algorithms naturally exhibit a trade-off between the quality of the matching maintained (i.e., the approximation ratio) and the time needed per update. While several interesting results have been obtained, the optimal behavior of this trade-off remains largely unclear. Our main contribution is a new approach to designing fully dynamic approximate matching algorithms that in a unified manner not only (essentially) recovers all previously known trade-offs that were achieved via very different techniques, but reveals some new ones as well. As our main tool to achieve this, we introduce a generalization of the edge-degree constrained subgraph (EDCS) of Bernstein and Stein (2015) that we call the hierarchical EDCS (HEDCS).

preprint · 2022 · arXiv

On Regularity Lemma and Barriers in Streaming and Dynamic Matching

We present a new approach for finding matchings in dense graphs by building on Szemerédi's celebrated Regularity Lemma. This allows us to obtain non-trivial albeit slight improvements over longstanding bounds for matchings in streaming and dynamic graphs. In particular, we establish the following results for $n$-vertex graphs:

* A deterministic single-pass streaming algorithm that finds a $(1-o(1))$-approximate matching in $o(n^2)$ bits of space. This constitutes the first single-pass algorithm for this problem in sublinear space that improves over the $\frac{1}{2}$-approximation of the greedy algorithm.
* A randomized fully dynamic algorithm that with high probability maintains a $(1-o(1))$-approximate matching in $o(n)$ worst-case update time per edge insertion or deletion. The algorithm works even against an adaptive adversary. This is the first $o(n)$ update-time dynamic algorithm with approximation guarantee arbitrarily close to one.

Given the use of the regularity lemma, the improvement obtained by our algorithms over trivial bounds is only by some $(\log^*{n})^{Θ(1)}$ factor. Nevertheless, in each case, they show that the ``right'' answer to the problem is not what is dictated by the previous bounds. Finally, in the streaming model, we also present a randomized $(1-o(1))$-approximation algorithm whose space can be upper bounded by the density of certain Ruzsa-Szemerédi (RS) graphs. While RS graphs by now have been used extensively to prove streaming lower bounds, ours is the first to use them as an upper bound tool for designing improved streaming algorithms.
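For context, the greedy baseline that the abstract above compares against can be sketched in a few lines. This is only an illustration of the standard greedy maximal matching (a $\frac{1}{2}$-approximation), not the regularity-lemma algorithm of the paper; the function name `greedy_streaming_matching` and the edge-tuple input format are assumptions made for this example.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_streaming_matching(edge_stream: Iterable[Edge]) -> List[Edge]:
    """Single pass over the edges: keep an edge iff both endpoints are free.

    The result is a maximal matching, so its size is at least half the
    size of a maximum matching. Only the set of matched vertices is stored.
    """
    matched: Set[Hashable] = set()
    matching: List[Edge] = []
    for u, v in edge_stream:
        # Skip self-loops and edges that touch an already matched vertex.
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.add(u)
            matched.add(v)
    return matching


if __name__ == "__main__":
    # Path a-b-c-d with the middle edge arriving first: greedy keeps one
    # edge, while the maximum matching {a-b, c-d} has two.
    print(greedy_streaming_matching([("b", "c"), ("a", "b"), ("c", "d")]))
```

The path example shows that the factor of one half is attained on some arrival orders, which is the gap the streaming result above narrows.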

preprint · 2022 · arXiv

Sublinear Algorithms for Hierarchical Clustering

Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require linear time/space complexity. In many applications the underlying graph can be massive in size making it computationally challenging to process the graph even using a linear time/space algorithm. As a result, there is a strong interest in designing algorithms that can perform global computation using only sublinear resources. The focus of this work is to study hierarchical clustering for massive graphs under three well-studied models of sublinear computation which focus on space, time, and communication, respectively, as the primary resources to optimize: (1) (dynamic) streaming model where edges are presented as a stream, (2) query model where the graph is queried using neighbor and degree queries, (3) MPC model where the graph edges are partitioned over several machines connected via a communication channel. We design sublinear algorithms for hierarchical clustering in all three models above. At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function. Our main algorithmic contributions are then to show how cut sparsifiers of the desired form can be efficiently constructed in the query model and the MPC model. We complement our algorithmic results by establishing nearly matching lower bounds that rule out the possibility of designing better algorithms in each of these models.

preprint · 2020 · arXiv

An Efficient PTAS for Stochastic Load Balancing with Poisson Jobs

We give the first polynomial-time approximation scheme (PTAS) for the stochastic load balancing problem when the job sizes follow Poisson distributions. This improves upon the 2-approximation algorithm due to Goel and Indyk (FOCS'99). Moreover, our approximation scheme is an efficient PTAS that has a running time double exponential in $1/ε$ but nearly-linear in $n$, where $n$ is the number of jobs and $ε$ is the target error. Previously, a PTAS (not efficient) was only known for jobs that obey exponential distributions (Goel and Indyk, FOCS'99). Our algorithm relies on several probabilistic ingredients including some (seemingly) new results on scaling and the so-called "focusing effect" of the maximum of Poisson random variables which might be of independent interest.

preprint · 2020 · arXiv

Near-linear Size Hypergraph Cut Sparsifiers

Cuts in graphs are a fundamental object of study, and play a central role in the study of graph algorithms. The problem of sparsifying a graph while approximately preserving its cut structure has been extensively studied and has many applications. In a seminal work, Benczúr and Karger (1996) showed that given any $n$-vertex undirected weighted graph $G$ and a parameter $\varepsilon \in (0,1)$, there is a near-linear time algorithm that outputs a weighted subgraph $G'$ of $G$ of size $\tilde{O}(n/\varepsilon^2)$ such that the weight of every cut in $G$ is preserved to within a $(1 \pm \varepsilon)$-factor in $G'$. The graph $G'$ is referred to as a {\em $(1 \pm \varepsilon)$-approximate cut sparsifier} of $G$. A natural question is if such cut-preserving sparsifiers also exist for hypergraphs. Kogan and Krauthgamer (2015) initiated a study of this question and showed that given any weighted hypergraph $H$ where the cardinality of each hyperedge is bounded by $r$, there is a polynomial-time algorithm to find a $(1 \pm \varepsilon)$-approximate cut sparsifier of $H$ of size $\tilde{O}(\frac{nr}{\varepsilon^2})$. Since $r$ can be as large as $n$, in general, this gives a hypergraph cut sparsifier of size $\tilde{O}(n^2/\varepsilon^2)$, which is a factor $n$ larger than the Benczúr-Karger bound for graphs. It has been an open question whether or not the Benczúr-Karger bound is achievable on hypergraphs. In this work, we resolve this question in the affirmative by giving a new polynomial-time algorithm for creating hypergraph sparsifiers of size $\tilde{O}(n/\varepsilon^2)$.

preprint · 2020 · arXiv

Near-Perfect Recovery in the One-Dimensional Latent Space Model

Suppose a graph $G$ is stochastically created by uniformly sampling vertices along a line segment and connecting each pair of vertices with a probability that is a known decreasing function of their distance. We ask if it is possible to reconstruct the actual positions of the vertices in $G$ by only observing the generated unlabeled graph. We study this question for two natural edge probability functions -- one where the probability of an edge decays exponentially with the distance and another where this probability decays only linearly. We initiate our study with the weaker goal of recovering only the order in which vertices appear on the line segment. For a segment of length $n$ and a precision parameter $δ$, we show that for both exponential and linear decay edge probability functions, there is an efficient algorithm that correctly recovers (up to reflection symmetry) the order of all vertices that are at least $δ$ apart, using only $\tilde{O}(\frac{n}{δ^2})$ samples (vertices). Building on this result, we then show that $O(\frac{n^2 \log n}{δ^2})$ vertices (samples) are sufficient to additionally recover the location of each vertex on the line to within a precision of $δ$. We complement this result with an $Ω(\frac{n^{1.5}}{δ})$ lower bound on samples needed for reconstructing positions (even by a computationally unbounded algorithm), showing that the task of recovering positions is information-theoretically harder than recovering the order. We give experimental results showing that our algorithm recovers the positions of almost all points with high accuracy.
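The generative process described in this abstract can be sketched as follows. This is a minimal illustration of the sampling model only, not of the recovery algorithm. The function names and the two concrete decay functions (`exp(-d)` and `max(0, 1 - d)`) are assumptions chosen for the example; the abstract does not give the paper's exact edge probability functions.

```python
import math
import random
from typing import Callable, List, Optional, Tuple


def exponential_decay(distance: float) -> float:
    """Assumed example of an exponentially decaying edge probability."""
    return math.exp(-distance)


def linear_decay(distance: float) -> float:
    """Assumed example of a linearly decaying edge probability."""
    return max(0.0, 1.0 - distance)


def sample_latent_line_graph(
    num_vertices: int,
    length: float,
    edge_prob: Callable[[float], float],
    seed: Optional[int] = None,
) -> Tuple[List[float], List[Tuple[int, int]]]:
    """Place vertices uniformly on [0, length], then add each edge (i, j)
    independently with probability edge_prob(|x_i - x_j|)."""
    rng = random.Random(seed)
    positions = [rng.uniform(0.0, length) for _ in range(num_vertices)]
    edges = []
    for i in range(num_vertices):
        for j in range(i + 1, num_vertices):
            distance = abs(positions[i] - positions[j])
            if rng.random() < edge_prob(distance):
                edges.append((i, j))
    return positions, edges


if __name__ == "__main__":
    positions, edges = sample_latent_line_graph(8, 4.0, exponential_decay, seed=1)
    # A recovery algorithm would see only `edges`, not `positions`.
    print(len(positions), len(edges))
```

In the reconstruction task, only the edge list (with vertex labels carrying no positional information) is observed; the hidden `positions` are what the algorithm tries to recover.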

preprint · 2020 · arXiv

Sublinear Algorithms and Lower Bounds for Metric TSP Cost Estimation

We consider the problem of designing sublinear time algorithms for estimating the cost of a minimum metric traveling salesman (TSP) tour. Specifically, given access to an $n \times n$ distance matrix $D$ that specifies pairwise distances between $n$ points, the goal is to estimate the TSP cost by performing only sublinear (in the size of $D$) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any $\varepsilon > 0$, there exists an $\tilde{O}(n/\varepsilon^{O(1)})$ time algorithm that returns a $(1 + \varepsilon)$-approximate estimate of the MST cost. This result immediately implies an $\tilde{O}(n/\varepsilon^{O(1)})$ time algorithm to estimate the TSP cost to within a $(2 + \varepsilon)$ factor for any $\varepsilon > 0$. However, no $o(n^2)$ time algorithms are known to approximate metric TSP to a factor that is strictly better than $2$. On the other hand, there were also no known barriers that rule out the existence of $(1 + \varepsilon)$-approximate estimation algorithms for metric TSP with $\tilde{O}(n)$ time for any fixed $\varepsilon > 0$. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. We also show that the problem of estimating metric TSP cost is closely connected to the problem of estimating the size of a maximum matching in a graph.
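The reason an MST estimate yields a factor-2 estimate of the TSP cost is the standard sandwich MST ≤ TSP ≤ 2·MST for metric instances. The sketch below illustrates only that relationship: it computes the exact MST weight with Prim's algorithm, which reads the whole matrix and is therefore not sublinear. The function names are assumptions made for this example.

```python
from typing import Sequence, Tuple


def mst_weight(dist: Sequence[Sequence[float]]) -> float:
    """Exact MST weight of a complete graph via Prim's algorithm, O(n^2)."""
    n = len(dist)
    if n == 0:
        return 0.0
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known edge from the tree to each vertex
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return total


def tsp_cost_bounds(dist: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """For a metric instance, return (lower, upper) with
    lower = MST weight <= TSP cost <= 2 * MST weight = upper."""
    w = mst_weight(dist)
    return w, 2 * w


if __name__ == "__main__":
    # Four points on a line at coordinates 0, 1, 2, 3.
    line = [[abs(i - j) for j in range(4)] for i in range(4)]
    print(tsp_cost_bounds(line))  # optimal tour 0-1-2-3-0 has cost 6
```

On the line example the upper bound is tight; the abstract's question is whether a factor strictly better than 2 is achievable without reading all of $D$.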

preprint · 2016 · arXiv

Sensitivity and Computational Complexity in Financial Networks

Modern financial networks exhibit a high degree of interconnectedness and determining the causes of instability and contagion in financial networks is necessary to inform policy and avoid future financial collapse. In the American Economic Review, Elliott, Golub and Jackson proposed a simple model for capturing the dynamics of complex financial networks. In Elliott, Golub and Jackson's model, each institution in the network can buy underlying assets or percentage shares in other institutions (cross-holdings) and if any institution's value drops below a critical threshold value, its value suffers an additional failure cost. This work shows that even in the simple model put forward by Elliott, Golub and Jackson there are fundamental barriers to understanding the risks that are inherent in a network. First, if institutions are not required to maintain a minimum amount of self-holdings, an $ε$ change in investments by a single institution can have an arbitrarily magnified influence on the net worth of the institutions in the system. This sensitivity result shows that if institutions have small self-holdings, then estimating the market value of an institution requires almost perfect information about every cross-holding in the system. Second, we show that even if a regulator has complete information about all cross-holdings in the system, it may be computationally intractable to even estimate the number of failures that could be caused by an arbitrarily small shock to the system. Together, these results show that any uncertainty in the cross-holdings or values of the underlying assets can be amplified by the network to arbitrarily large uncertainty in the valuations of institutions in the network.

preprint · 2016 · arXiv

Strategic Network Formation with Attack and Immunization

Strategic network formation arises where agents receive benefit from connections to other agents, but also incur costs for forming links. We consider a new network formation game that incorporates an adversarial attack, as well as immunization against attack. An agent's benefit is the expected size of her connected component post-attack, and agents may also choose to immunize themselves from attack at some additional cost. Our framework is a stylized model of settings where reachability rather than centrality is the primary concern and vertices vulnerable to attacks may reduce risk via costly measures. In the reachability benefit model without attack or immunization, the set of equilibria is the empty graph and any tree. The introduction of attack and immunization changes the game dramatically; new equilibrium topologies emerge, some more sparse and some more dense than trees. We show that, under a mild assumption on the adversary, every equilibrium network with $n$ agents contains at most $2n-4$ edges for $n\geq 4$. So despite permitting topologies denser than trees, the amount of overbuilding is limited. We also show that attack and immunization don't significantly erode social welfare: every non-trivial equilibrium with respect to several adversaries has welfare at least as high as that of any equilibrium in the attack-free model. We complement our theory with simulations demonstrating fast convergence of a new bounded rationality dynamic which generalizes linkstable best response but is considerably more powerful in our game. The simulations further elucidate the wide variety of asymmetric equilibria and demonstrate topological consequences of the dynamics, e.g., heavy-tailed degree distributions. Finally, we report on a behavioral experiment on our game with over 100 participants, where despite the complexity of the game, the resulting network was surprisingly close to equilibrium.

preprint · 2016 · arXiv

The Ratio Index for Budgeted Learning, with Applications

In the budgeted learning problem, we are allowed to experiment on a set of alternatives (given a fixed experimentation budget) with the goal of picking a single alternative with the largest possible expected payoff. Approximation algorithms for this problem were developed by Guha and Munagala by rounding a linear program that couples the various alternatives together. In this paper we present an index for this problem, which we call the ratio index, which also guarantees a constant factor approximation. Index-based policies have the advantage that a single number (i.e. the index) can be computed for each alternative irrespective of all other alternatives, and the alternative with the highest index is experimented upon. This is analogous to the famous Gittins index for the discounted multi-armed bandit problem. The ratio index has several interesting structural properties. First, we show that it can be computed in strongly polynomial time. Second, we show that with the appropriate discount factor, the Gittins index and our ratio index are constant factor approximations of each other, and hence the Gittins index also gives a constant factor approximation to the budgeted learning problem. Finally, we show that the ratio index can be used to create an index-based policy that achieves an O(1)-approximation for the finite horizon version of the multi-armed bandit problem. Moreover, the policy does not require any knowledge of the horizon (whereas we compare its performance against an optimal strategy that is aware of the horizon). This yields the following surprising result: there is an index-based policy that achieves an O(1)-approximation for the multi-armed bandit problem, oblivious to the underlying discount factor.

preprint · 2016 · arXiv

Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem

We resolve the space complexity of single-pass streaming algorithms for approximating the classic set cover problem. For finding an $α$-approximate set cover (for any $α = o(\sqrt{n})$) using a single-pass streaming algorithm, we show that $Θ(mn/α)$ space is both sufficient and necessary (up to an $O(\log{n})$ factor); here $m$ denotes the number of sets and $n$ denotes the size of the universe. This provides a strong negative answer to the open question posed by Indyk et al. (2015) regarding the possibility of having a single-pass algorithm with a small approximation factor that uses sub-linear space. We further study the problem of estimating the size of a minimum set cover (as opposed to finding the actual sets), and establish that an additional factor of $α$ saving in the space is achievable in this case and that this is the best possible. In other words, we show that $Θ(mn/α^2)$ space is both sufficient and necessary (up to logarithmic factors) for estimating the size of a minimum set cover to within a factor of $α$. Our algorithm in fact works for the more general problem of estimating the optimal value of a covering integer program. On the other hand, our lower bound holds even for set cover instances where the sets are presented in a random order.

preprint · 2015 · arXiv

Algorithms for Provisioning Queries and Analytics

Provisioning is a technique for avoiding repeated expensive computations in what-if analysis. Given a query, an analyst formulates $k$ hypotheticals, each retaining some of the tuples of a database instance, possibly overlapping, and she wishes to answer the query under scenarios, where a scenario is defined by a subset of the hypotheticals that are "turned on". We say that a query admits compact provisioning if given any database instance and any $k$ hypotheticals, one can create a poly-size (in $k$) sketch that can then be used to answer the query under any of the $2^{k}$ possible scenarios without accessing the original instance. In this paper, we focus on provisioning complex queries that combine relational algebra (the logical component), grouping, and statistics/analytics (the numerical component). We first show that queries that compute quantiles or linear regression (as well as simpler queries that compute count and sum/average of positive values) can be compactly provisioned to provide (multiplicative) approximate answers to an arbitrary precision. In contrast, exact provisioning for each of these statistics requires the sketch size to be exponential in $k$. We then establish that for any complex query whose logical component is a positive relational algebra query, as long as the numerical component can be compactly provisioned, the complex query itself can be compactly provisioned. On the other hand, introducing negation or recursion in the logical component again requires the sketch size to be exponential in $k$. While our positive results use algorithms that do not access the original instance after a scenario is known, we prove our lower bounds even for the case when, knowing the scenario, limited access to the instance is allowed.

preprint · 2015 · arXiv

Dynamic Sketching for Graph Optimization Problems with Applications to Cut-Preserving Sketches

In this paper, we introduce a new model for sublinear algorithms called \emph{dynamic sketching}. In this model, the underlying data is partitioned into a large \emph{static} part and a small \emph{dynamic} part and the goal is to compute a summary of the static part (i.e., a \emph{sketch}) such that given any \emph{update} for the dynamic part, one can combine it with the sketch to compute a given function. We say that a sketch is \emph{compact} if its size is bounded by a polynomial function of the length of the dynamic data, (essentially) independent of the size of the static part. A graph optimization problem $P$ in this model is defined as follows. The input is a graph $G(V,E)$ and a set $T \subseteq V$ of $k$ terminals; the edges between the terminals are the dynamic part and the other edges in $G$ are the static part. The goal is to summarize the graph $G$ into a compact sketch (of size poly$(k)$) such that given any set $Q$ of edges between the terminals, one can answer the problem $P$ for the graph obtained by inserting all edges in $Q$ to $G$, using only the sketch. We study the fundamental problem of computing a maximum matching and prove tight bounds on the sketch size. In particular, we show that there exists a (compact) dynamic sketch of size $O(k^2)$ for the matching problem and any such sketch has to be of size $Ω(k^2)$. Our sketch for matchings can be further used to derive compact dynamic sketches for other fundamental graph problems involving cuts and connectivities. Interestingly, our sketch for matchings can also be used to give an elementary construction of a \emph{cut-preserving vertex sparsifier} with space $O(kC^2)$ for $k$-terminal graphs; here $C$ is the total capacity of the edges incident on the terminals. Additionally, we give an improved lower bound (in terms of $C$) of $Ω(C/\log{C})$ on the size of cut-preserving vertex sparsifiers.

preprint · 2015 · arXiv

Fast Convergence in the Double Oral Auction

A classical trading experiment consists of a set of unit demand buyers and unit supply sellers with identical items. Each agent's value or opportunity cost for the item is their private information and preferences are quasi-linear. Trade between agents employs a double oral auction (DOA) in which both buyers and sellers call out bids or offers which an auctioneer recognizes. Transactions resulting from accepted bids and offers are recorded. This continues until there are no more acceptable bids or offers. Remarkably, the experiment consistently terminates in a Walrasian price. The main result of this paper is a mechanism in the spirit of the DOA that converges to a Walrasian equilibrium in a polynomial number of steps, thus providing a theoretical basis for the above-described empirical phenomenon. It is well-known that computation of a Walrasian equilibrium for this market corresponds to solving a maximum weight bipartite matching problem. The uncoordinated but rational responses of agents thus solve in a distributed fashion a maximum weight bipartite matching problem that is encoded by their private valuations. We show, furthermore, that every Walrasian equilibrium is reachable by some sequence of responses. This is in contrast to the well known auction algorithms for this problem which only allow one side to make offers and thus essentially choose an equilibrium that maximizes the surplus for the side making offers. Our results extend to the setting where not every agent pair is allowed to trade with each other.

preprint · 2015 · arXiv

Tight Bounds for Linear Sketches of Approximate Matchings

We resolve the space complexity of linear sketches for approximating the maximum matching problem in dynamic graph streams where the stream may include both edge insertions and deletions. Specifically, we show that for any $ε > 0$, there exists a one-pass streaming algorithm, which only maintains a linear sketch of size $\tilde{O}(n^{2-3ε})$ bits and recovers an $n^ε$-approximate maximum matching in dynamic graph streams, where $n$ is the number of vertices in the graph. In contrast to the extensively studied insertion-only model, to the best of our knowledge, no non-trivial single-pass streaming algorithms were previously known for approximating the maximum matching problem on general dynamic graph streams. Furthermore, we show that our upper bound is essentially tight. Namely, any linear sketch for approximating the maximum matching to within a factor of $O(n^ε)$ has to be of size $n^{2-3ε-o(1)}$ bits. We establish this lower bound by analyzing the corresponding simultaneous number-in-hand communication model, with a combinatorial construction based on Ruzsa-Szemerédi graphs.

preprint · 2014 · arXiv

On $(1,ε)$-Restricted Assignment Makespan Minimization

Makespan minimization on unrelated machines is a classic problem in approximation algorithms. No polynomial time $(2-δ)$-approximation algorithm is known for the problem for constant $δ > 0$. This is true even for certain special cases, most notably the restricted assignment problem where each job has the same load on any machine but can be assigned to one from a specified subset. Recently in a breakthrough result, Svensson [Svensson, 2011] proved that the integrality gap of a certain configuration LP relaxation is upper bounded by $1.95$ for the restricted assignment problem; however, the rounding algorithm is not known to run in polynomial time. In this paper we consider the $(1,\varepsilon)$-restricted assignment problem where each job is either heavy ($p_j = 1$) or light ($p_j = \varepsilon$), for some parameter $\varepsilon > 0$. Our main result is a $(2-δ)$-approximate polynomial time algorithm for the $(1,ε)$-restricted assignment problem for a fixed constant $δ > 0$. Even for this special case, the best polynomial-time approximation factor known so far is 2. We obtain this result by rounding the configuration LP relaxation for this problem. A simple reduction from vertex cover shows that this special case remains NP-hard to approximate to within a factor better than 7/6.

preprint · 2014 · arXiv

Streaming Lower Bounds for Approximating MAX-CUT

We consider the problem of estimating the value of max cut in a graph in the streaming model of computation. At one extreme, there is a trivial $2$-approximation for this problem that uses only $O(\log n)$ space, namely, count the number of edges and output half of this value as the estimate for max cut value. On the other extreme, if one allows $\tilde{O}(n)$ space, then a near-optimal solution to the max cut value can be obtained by storing an $\tilde{O}(n)$-size sparsifier that essentially preserves the max cut. An intriguing question is if poly-logarithmic space suffices to obtain a non-trivial approximation to the max-cut value (that is, beating the factor $2$). It was recently shown that the problem of estimating the size of a maximum matching in a graph admits a non-trivial approximation in poly-logarithmic space. Our main result is that any streaming algorithm that breaks the $2$-approximation barrier requires $\tilde{Ω}(\sqrt{n})$ space even if the edges of the input graph are presented in random order. Our result is obtained by exhibiting a distribution over graphs which are either bipartite or $\frac{1}{2}$-far from being bipartite, and establishing that $\tilde{Ω}(\sqrt{n})$ space is necessary to differentiate between these two cases. Thus as a direct corollary we obtain that $\tilde{Ω}(\sqrt{n})$ space is also necessary to test if a graph is bipartite or $\frac{1}{2}$-far from being bipartite. We also show that for any $ε > 0$, any streaming algorithm that obtains a $(1 + ε)$-approximation to the max cut value when edges arrive in adversarial order requires $n^{1 - O(ε)}$ space, implying that $Ω(n)$ space is necessary to obtain an arbitrarily good approximation to the max cut value.
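The trivial $2$-approximation mentioned at the start of this abstract is short enough to write out. The sketch below keeps a single edge counter and outputs half of it; since every graph with $m$ edges has a cut of size at least $m/2$ and at most $m$, the estimate is within a factor of 2. The helper `brute_force_max_cut` is an assumption added only to check the bound on tiny graphs and plays no role in the streaming algorithm.

```python
from typing import Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def trivial_max_cut_estimate(edge_stream: Iterable[Edge]) -> float:
    """Count edges in one pass and return m / 2.

    Only one counter is stored, i.e. O(log n) bits for an n-vertex graph.
    """
    m = 0
    for _ in edge_stream:
        m += 1
    return m / 2


def brute_force_max_cut(edges: List[Edge]) -> int:
    """Exact max cut by trying every bipartition (exponential time;
    only for checking the estimate on very small graphs)."""
    vertices = sorted({x for e in edges for x in e})
    index = {v: i for i, v in enumerate(vertices)}
    best = 0
    for mask in range(1 << len(vertices)):
        cut = sum(
            1
            for u, v in edges
            if ((mask >> index[u]) & 1) != ((mask >> index[v]) & 1)
        )
        best = max(best, cut)
    return best


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    estimate = trivial_max_cut_estimate(iter(triangle))
    exact = brute_force_max_cut(triangle)
    print(estimate, exact)  # estimate <= exact <= 2 * estimate
```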

preprint · 2013 · arXiv

The Power of Local Information in Social Networks

We study the power of \textit{local information algorithms} for optimization problems on social networks. We focus on sequential algorithms for which the network topology is initially unknown and is revealed only within a local neighborhood of vertices that have been irrevocably added to the output set. The distinguishing feature of this setting is that locality is necessitated by constraints on the network information visible to the algorithm, rather than being desirable for reasons of efficiency or parallelizability. In this sense, changes to the level of network visibility can have a significant impact on algorithm design. We study a range of problems under this model of algorithms with local information. We first consider the case in which the underlying graph is a preferential attachment network. We show that one can find the node of maximum degree in the network in a polylogarithmic number of steps, using an opportunistic algorithm that repeatedly queries the visible node of maximum degree. This addresses an open question of Bollobás and Riordan. In contrast, local information algorithms require a linear number of queries to solve the problem on arbitrary networks. Motivated by problems faced by recruiters in online networks, we also consider network coverage problems such as finding a minimum dominating set. For this optimization problem we show that, if each node added to the output set reveals sufficient information about the set's neighborhood, then it is possible to design randomized algorithms for general networks that nearly match the best approximations possible even with full access to the graph structure. We show that this level of visibility is necessary. We conclude that a network provider's decision of how much structure to make visible to its users can have a significant effect on a user's ability to interact strategically with the network.

preprint · 2012 · arXiv

Mechanism Design and Risk Aversion

We develop efficient algorithms to construct utility maximizing mechanisms in the presence of risk averse players (buyers and sellers) in Bayesian settings. We model risk aversion by a concave utility function, and players play strategically to maximize their expected utility. Bayesian mechanism design has usually focused on maximizing expected revenue in a {\em risk neutral} environment, and no succinct characterization of expected utility maximizing mechanisms is known even for single-parameter multi-unit auctions. We first consider the problem of designing an optimal DSIC mechanism for a risk averse seller in the case of multi-unit auctions, and we give a poly-time computable SPM that is a $(1-1/e-\varepsilon)$-approximation to the expected utility of the seller in an optimal DSIC mechanism. Our result is based on a novel application of a correlation gap bound, along with {\em splitting} and {\em merging} of random variables to redistribute probability mass across buyers. This allows us to reduce our problem to that of checking feasibility of a small number of distinct configurations, each of which corresponds to a covering LP. A feasible solution to the LP gives us the distribution on prices for each buyer to use in a randomized SPM. We next consider the setting when buyers as well as the seller are risk averse, and the objective is to maximize the seller's expected utility. We design a truthful-in-expectation mechanism whose utility is a $(1-1/e-\varepsilon)^3$-approximation to the optimal BIC mechanism under two mild assumptions. Our mechanism consists of multiple rounds that process each buyer in a round with small probability. Lastly, we consider the problem of revenue maximization for a risk neutral seller in the presence of risk averse buyers, and give a poly-time algorithm to design an optimal mechanism for the seller.

preprint · 2011 · arXiv

Delays and the Capacity of Continuous-time Channels

Any physical channel of communication offers two potential reasons why its capacity (the number of bits it can transmit in a unit of time) might be unbounded: (1) Infinitely many choices of signal strength at any given instant of time, and (2) Infinitely many instances of time at which signals may be sent. However, channel noise cancels out the potential unboundedness of the first aspect, leaving typical channels with only a finite capacity per instant of time. The latter source of infinity seems less studied. A potential source of unreliability that might restrict the capacity also from the second aspect is delay: Signals transmitted by the sender at a given point of time may not be received with a predictable delay at the receiving end. Here we examine this source of uncertainty by considering a simple discrete model of delay errors. In our model the communicating parties get to subdivide time as microscopically finely as they wish, but still have to cope with communication delays that are macroscopic and variable. The continuous process becomes the limit of our process as the time subdivision becomes infinitesimal. We taxonomize this class of communication channels based on whether the delays and noise are stochastic or adversarial; and based on how much information each aspect has about the other when introducing its errors. We analyze the limits of such channels and reach somewhat surprising conclusions: The capacity of a physical channel is finitely bounded only if at least one of the two sources of error (signal noise or delay noise) is adversarial. In particular the capacity is finitely bounded only if the delay is adversarial, or the noise is adversarial and acts with knowledge of the stochastic delay. If both error sources are stochastic, or if the noise is adversarial and independent of the stochastic delay, then the capacity of the associated physical channel is infinite.

preprint2011arXiv

Provenance Views for Module Privacy

Scientific workflow systems increasingly store provenance information about the module executions used to produce a data item, as well as the parameter settings and intermediate data items passed between module executions. However, authors/owners of workflows may wish to keep some of this information confidential. In particular, a module may be proprietary, and users should not be able to infer its behavior by seeing mappings between all data inputs and outputs. The problem we address in this paper is the following: given a workflow, abstractly modeled by a relation R, a privacy requirement Γ, and costs associated with data, the owner of the workflow decides which data (attributes) to hide and provides the user with a view R', which is the projection of R over the attributes that have not been hidden. The goal is to minimize the cost of hidden data while guaranteeing that individual modules are Γ-private. We call this the "secureview" problem. We formally define the problem, study its complexity, and offer algorithmic solutions.

preprint2011arXiv

Social Welfare in One-sided Matching Markets without Money

We study social welfare in one-sided matching markets where the goal is to efficiently allocate n items to n agents that each have a complete, private preference list and a unit demand over the items. Our focus is on allocation mechanisms that do not involve any monetary payments. We consider two natural measures of social welfare: the ordinal welfare factor, which measures the number of agents that are at least as happy as in some unknown, arbitrary benchmark allocation, and the linear welfare factor, which assumes an agent's utility decreases linearly down its preference list and measures the ratio of the total utility to that achieved by an optimal allocation. We analyze two matching mechanisms which have been extensively studied by economists. The first mechanism is random serial dictatorship (RSD), where agents are ordered in accordance with a randomly chosen permutation and are successively allocated their best choice among the unallocated items. The second mechanism is the probabilistic serial (PS) mechanism of Bogomolnaia and Moulin [8], which computes a fractional allocation that can be expressed as a convex combination of integral allocations. The welfare factor of a mechanism is the infimum over all instances. For RSD, we show that the ordinal welfare factor is asymptotically 1/2, while the linear welfare factor lies in the interval [.526, 2/3]. For PS, we show that the ordinal welfare factor is also 1/2 while the linear welfare factor is roughly 2/3. To our knowledge, these results are the first non-trivial performance guarantees for these natural mechanisms.
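The RSD mechanism described in this abstract is short enough to sketch directly. The following is a minimal illustration only: the function name `random_serial_dictatorship`, the list-of-rankings input format, and the sample preferences are assumptions made for this example, not taken from the paper.

```python
import random

def random_serial_dictatorship(preferences, rng=None):
    """Run RSD. preferences[a] is agent a's complete ranking of items, best first.

    Returns (order, allocation): the random agent order that was drawn and a
    dict mapping each agent to the item it received.
    """
    rng = rng or random.Random()
    order = list(range(len(preferences)))
    rng.shuffle(order)                 # uniformly random permutation of agents
    taken = set()
    allocation = {}
    for agent in order:
        # The agent takes its most preferred item that is still unallocated.
        for item in preferences[agent]:
            if item not in taken:
                allocation[agent] = item
                taken.add(item)
                break
    return order, allocation

# Hypothetical instance with 3 agents and 3 items.
prefs = [[0, 1, 2], [0, 2, 1], [1, 0, 2]]
order, alloc = random_serial_dictatorship(prefs, random.Random(0))
```

Whatever permutation is drawn, the first agent in `order` receives its top choice and every item is allocated exactly once, since each preference list is complete.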

preprint2010arXiv

Approximability of Capacitated Network Design

In the capacitated survivable network design problem (Cap-SNDP), we are given an undirected multi-graph where each edge has a capacity and a cost. The goal is to find a minimum cost subset of edges that satisfies a given set of pairwise minimum-cut requirements. Unlike its classical special case of SNDP when all capacities are unit, the approximability of Cap-SNDP is not well understood; even in very restricted settings no known algorithm achieves an $o(m)$ approximation, where $m$ is the number of edges in the graph. In this paper, we obtain several new results and insights into the approximability of Cap-SNDP.

preprint2010arXiv

Graph Sparsification via Refinement Sampling

A graph $G'(V,E')$ is an $\epsilon$-sparsification of $G$ for some $\epsilon>0$ if every (weighted) cut in $G'$ is within $(1\pm\epsilon)$ of the corresponding cut in $G$. A celebrated result of Benczúr and Karger shows that for every undirected graph $G$, an $\epsilon$-sparsification with $O(n\log n/\epsilon^2)$ edges can be constructed in $O(m\log^2 n)$ time. Applications to modern massive data sets often constrain algorithms to use computation models that restrict random access to the input. The semi-streaming model, in which the algorithm is constrained to use $\tilde O(n)$ space, has been shown to be a good abstraction for analyzing graph algorithms in applications to large data sets. Recently, a semi-streaming algorithm for graph sparsification was presented by Ahn and Guha; the total running time of their implementation is $\Omega(mn)$, too large for applications where both space and time are important. In this paper, we introduce a new technique for graph sparsification, namely refinement sampling, that gives an $\tilde{O}(m)$-time semi-streaming algorithm for graph sparsification. Specifically, we show that refinement sampling can be used to design a one-pass streaming algorithm for sparsification that takes $O(\log\log n)$ time per edge, uses $O(\log^2 n)$ space per node, and outputs an $\epsilon$-sparsifier with $O(n\log^3 n/\epsilon^2)$ edges. At a slightly increased space and time complexity, we can reduce the sparsifier size to $O(n\log n/\epsilon^2)$ edges, matching the Benczúr-Karger result, while improving upon the Benczúr-Karger runtime for $m=\omega(n\log^3 n)$. Finally, we show that an $\epsilon$-sparsifier with $O(n\log n/\epsilon^2)$ edges can be constructed in two passes over the data and $O(m)$ time whenever $m=\Omega(n^{1+\delta})$ for some constant $\delta>0$. As a by-product of our approach, we also obtain an $O(m\log\log n+n\log n)$ time streaming algorithm to compute a sparse k-connectivity certificate of a graph.
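The cut-preservation condition that defines a sparsifier can be checked by brute force on very small graphs. The sketch below only illustrates the definition; it is not refinement sampling or any algorithm from the paper. The names `cut_weight` and `is_eps_sparsifier`, the `(u, v, weight)` edge format, and the example graphs are assumptions made for this illustration.

```python
from itertools import combinations

def cut_weight(edges, side):
    """Total weight of the edges with exactly one endpoint in `side`."""
    return sum(w for u, v, w in edges if (u in side) != (v in side))

def is_eps_sparsifier(n, g_edges, h_edges, eps):
    """Check every nontrivial cut of an n-vertex graph (exponential time)."""
    # Vertex 0 always stays outside `side`, so each cut is examined once.
    for size in range(1, n):
        for subset in combinations(range(1, n), size):
            side = set(subset)
            g = cut_weight(g_edges, side)
            h = cut_weight(h_edges, side)
            if not ((1 - eps) * g <= h <= (1 + eps) * g):
                return False
    return True

# G: the complete graph on 4 vertices with unit weights.
g_edges = [(u, v, 1.0) for u, v in combinations(range(4), 2)]
# H: a reweighted 4-cycle, a hypothetical candidate sparsifier with fewer edges.
h_edges = [(0, 1, 1.5), (1, 2, 1.5), (2, 3, 1.5), (3, 0, 1.5)]

loose = is_eps_sparsifier(4, g_edges, h_edges, 0.6)
tight = is_eps_sparsifier(4, g_edges, h_edges, 0.2)
```

Here the cut separating {1, 2} from {0, 3} has weight 4 in G but 3 in H, so the cycle passes the check at 0.6 and fails it at 0.2.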

preprint2010arXiv

Optimal Lower Bounds for Universal and Differentially Private Steiner Tree and TSP

Given a metric space on n points, an α-approximate universal algorithm for the Steiner tree problem outputs a distribution over rooted spanning trees such that for any subset X of vertices containing the root, the expected cost of the induced subtree is within an α factor of the optimal Steiner tree cost for X. An α-approximate differentially private algorithm for the Steiner tree problem takes as input a subset X of vertices, and outputs a tree distribution that induces a solution within an α factor of the optimal as before, and satisfies the additional property that for any set X' that differs in a single vertex from X, the tree distributions for X and X' are "close" to each other. Universal and differentially private algorithms for TSP are defined similarly. An α-approximate universal algorithm for the Steiner tree problem or TSP is also an α-approximate differentially private algorithm. It is known that both problems admit O(log n)-approximate universal algorithms, and hence O(log n)-approximate differentially private algorithms as well. We prove an Ω(log n) lower bound on the approximation ratio achievable for the universal Steiner tree problem and the universal TSP, matching the known upper bounds. Our lower bound for the Steiner tree problem holds even when the algorithm is allowed to output a more general solution of a distribution on paths to the root.
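The cost that a universal algorithm is judged on, the subtree of a fixed rooted spanning tree induced by a terminal set X, can be computed by taking the union of the tree paths from each terminal to the root. This is a small sketch of that cost computation for a single tree, not of any algorithm or lower bound from the paper. The function name `induced_subtree_cost`, the parent-pointer representation, and the example tree are assumptions made for this illustration.

```python
def induced_subtree_cost(parent, cost, terminals):
    """Cost of the union of tree paths from each terminal to the root.

    parent[v] is v's parent in the rooted spanning tree (None for the root);
    cost[v] is the length of the tree edge between v and parent[v].
    """
    used = set()  # non-root vertices whose parent edge lies in the subtree
    for v in terminals:
        # Walk towards the root, stopping once we join an already used path.
        while parent[v] is not None and v not in used:
            used.add(v)
            v = parent[v]
    return sum(cost[v] for v in used)

# Hypothetical rooted spanning tree on 5 points with root 0.
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 0}
cost = {1: 2, 2: 1, 3: 5, 4: 3}

cost_a = induced_subtree_cost(parent, cost, {0, 2})
cost_b = induced_subtree_cost(parent, cost, {0, 2, 3})
```

For a distribution over trees, the quantity in the definition is the expectation of this value over the sampled tree.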

preprint2010arXiv

Perfect Matchings in $O(n \log n)$ Time in Regular Bipartite Graphs

In this paper we consider the well-studied problem of finding a perfect matching in a $d$-regular bipartite graph on $2n$ nodes with $m=nd$ edges. The best-known algorithm for general bipartite graphs (due to Hopcroft and Karp) takes time $O(m\sqrt{n})$. In regular bipartite graphs, however, a matching is known to be computable in $O(m)$ time (due to Cole, Ost and Schirra). In a recent line of work by Goel, Kapralov and Khanna the $O(m)$ time algorithm was improved first to $\tilde O(\min\{m, n^{2.5}/d\})$ and then to $\tilde O(\min\{m, n^2/d\})$. It was also shown that the latter algorithm is optimal up to polylogarithmic factors among all algorithms that use non-adaptive uniform sampling to reduce the size of the graph as a first step. In this paper, we give a randomized algorithm that finds a perfect matching in a $d$-regular bipartite graph and runs in $O(n\log n)$ time (both in expectation and with high probability). The algorithm performs an appropriately truncated random walk on a modified graph to successively find augmenting paths. Our algorithm may be viewed as using adaptive uniform sampling, and is thus able to bypass the limitations of (non-adaptive) uniform sampling established in earlier work. We also show that randomization is crucial for obtaining $o(nd)$ time algorithms by establishing an $\Omega(nd)$ lower bound for any deterministic algorithm. Our techniques also give an algorithm that successively finds a matching in the support of a doubly stochastic matrix in expected $O(n\log^2 n)$ time, with $O(m)$ pre-processing time; this gives a simple $O(m+mn\log^2 n)$ time algorithm for finding the Birkhoff-von Neumann decomposition of a doubly stochastic matrix.
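The idea of finding augmenting paths by a truncated random walk can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm or analysis: it assumes a simple (no parallel edges) regular bipartite graph given as left-side adjacency lists, processes unmatched left vertices in index order rather than at random, and uses a step cap of `4 * n + 16` that is a placeholder rather than the paper's truncation bound. The function names are hypothetical.

```python
import random

def find_augmenting_walk(adj, match_left, match_right, start, rng, max_steps):
    """Random alternating walk from an unmatched left vertex.

    Returns {left vertex: last right vertex chosen from it}, or None if the
    walk is truncated after max_steps steps.
    """
    last_exit = {}
    u = start
    for _ in range(max_steps):
        v = rng.choice(adj[u])
        while v == match_left[u]:      # leave u only along a non-matching edge
            v = rng.choice(adj[u])
        last_exit[u] = v
        if match_right[v] is None:     # reached an unmatched right vertex
            return last_exit
        u = match_right[v]             # follow the matching edge back to the left
    return None

def perfect_matching(adj, seed=None, max_steps=None):
    """adj[u] lists the right neighbours of left vertex u in a regular bipartite graph."""
    rng = random.Random(seed)
    n = len(adj)
    max_steps = max_steps or 4 * n + 16   # assumed cap, not the paper's bound
    match_left = [None] * n
    match_right = [None] * n
    for start in range(n):
        last_exit = None
        while last_exit is None:          # restart whenever a walk is truncated
            last_exit = find_augmenting_walk(
                adj, match_left, match_right, start, rng, max_steps)
        # Following last exits from `start` traces a simple augmenting path.
        u = start
        while u is not None:
            v = last_exit[u]
            next_u = match_right[v]       # old partner of v; None at the path's end
            match_left[u], match_right[v] = v, u
            u = next_u
    return match_left

# Hypothetical 3-regular bipartite graph on 6 + 6 vertices (circulant pattern).
n, d = 6, 3
adj = [[(u + k) % n for k in range(d)] for u in range(n)]
matching = perfect_matching(adj, seed=1)
```

Recording only the last exit from each left vertex removes the loops of the walk, so the traced path visits each vertex at most once and flipping it enlarges the matching by one edge.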

26 published items

New Trade-Offs for Fully Dynamic Matching via Hierarchical EDCS

On Regularity Lemma and Barriers in Streaming and Dynamic Matching

Sublinear Algorithms for Hierarchical Clustering

An Efficient PTAS for Stochastic Load Balancing with Poisson Jobs

Near-linear Size Hypergraph Cut Sparsifiers

Near-Perfect Recovery in the One-Dimensional Latent Space Model

Sublinear Algorithms and Lower Bounds for Metric TSP Cost Estimation

Sensitivity and Computational Complexity in Financial Networks

Strategic Network Formation with Attack and Immunization

The Ratio Index for Budgeted Learning, with Applications

Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem

Algorithms for Provisioning Queries and Analytics

Dynamic Sketching for Graph Optimization Problems with Applications to Cut-Preserving Sketches

Fast Convergence in the Double Oral Auction

Tight Bounds for Linear Sketches of Approximate Matchings

On $(1,ε)$-Restricted Assignment Makespan Minimization

Streaming Lower Bounds for Approximating MAX-CUT

The Power of Local Information in Social Networks

Mechanism Design and Risk Aversion

Delays and the Capacity of Continuous-time Channels

Provenance Views for Module Privacy

Social Welfare in One-sided Matching Markets without Money

Approximability of Capacitated Network Design

Graph Sparsification via Refinement Sampling

Optimal Lower Bounds for Universal and Differentially Private Steiner Tree and TSP

Perfect Matchings in $O(n \log n)$ Time in Regular Bipartite Graphs