Researcher profile

Sampath Kannan

Sampath Kannan contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
9 topics
4 close collaborators

Published work

7 published items

Preprint, 2022, arXiv

Quantifying the Burden of Exploration and the Unfairness of Free Riding

We consider the multi-armed bandit setting with a twist. Rather than having just one decision maker deciding which arm to pull in each round, we have $n$ different decision makers (agents). In the simple stochastic setting, we show that a "free-riding" agent observing another "self-reliant" agent can achieve just $O(1)$ regret, as opposed to the regret lower bound of $\Omega(\log t)$ when one decision maker is playing in isolation. This result holds whenever the self-reliant agent's strategy satisfies either one of two assumptions: (1) each arm is pulled at least $\gamma \ln t$ times in expectation for a constant $\gamma$ that we compute, or (2) the self-reliant agent achieves $o(t)$ realized regret with high probability. Both of these assumptions are satisfied by standard zero-regret algorithms. Under the second assumption, we further show that the free rider only needs to observe the number of times each arm is pulled by the self-reliant agent, and not the rewards realized. In the linear contextual setting, each arm has a distribution over parameter vectors, each agent has a context vector, and the reward realized when an agent pulls an arm is the inner product of that agent's context vector with a parameter vector sampled from the pulled arm's distribution. We show that the free rider can achieve $O(1)$ regret in this setting whenever the free rider's context is a small (in $L_2$-norm) linear combination of other agents' contexts and all other agents pull each arm $\Omega(\log t)$ times with high probability. Again, this condition on the self-reliant players is satisfied by standard zero-regret algorithms like UCB. We also prove a number of lower bounds.
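The two-agent setup in this abstract can be made concrete with a small simulation. The sketch below is a toy illustration, not the paper's algorithm: it assumes Bernoulli rewards, a UCB1 self-reliant agent, and a hypothetical free-rider rule that simply copies the arm the self-reliant agent has pulled most often so far, using pull counts only. The function name `simulate` and its parameters are invented for this example.

```python
import math
import random


def simulate(means, horizon, seed=0):
    """Toy run: a UCB1 agent plays while a free rider copies its most-pulled arm.

    Returns (self-reliant pseudo-regret, free-rider pseudo-regret, pull counts).
    """
    rng = random.Random(seed)
    k = len(means)
    best = max(means)
    pulls = [0] * k   # pull counts of the self-reliant agent
    sums = [0.0] * k  # reward totals of the self-reliant agent
    regret_self = 0.0
    regret_free = 0.0

    def ucb_index(arm, t):
        # Empirical mean plus the UCB1 exploration bonus.
        return sums[arm] / pulls[arm] + math.sqrt(2.0 * math.log(t) / pulls[arm])

    for t in range(1, horizon + 1):
        # Free rider (assumed rule): sees only past pull counts, never rewards.
        free_arm = max(range(k), key=lambda a: pulls[a])
        regret_free += best - means[free_arm]

        # Self-reliant agent: pull each arm once, then follow the UCB1 index.
        if t <= k:
            arm = t - 1
        else:
            arm = max(range(k), key=lambda a: ucb_index(a, t))
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli reward
        pulls[arm] += 1
        sums[arm] += reward
        regret_self += best - means[arm]

    return regret_self, regret_free, pulls
```

Calling `simulate([0.2, 0.8], 500, seed=1)` returns both pseudo-regrets and the self-reliant agent's pull counts, so the two agents can be compared over different horizons and seeds.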

Preprint, 2022, arXiv

Reconstructing Ultrametric Trees from Noisy Experiments

The problem of reconstructing evolutionary trees or phylogenies is of great interest in computational biology. A popular model for this problem assumes that we are given the set of leaves (current species) of an unknown binary tree and the results of 'experiments' on triples of leaves (a,b,c), which return the pair with the deepest least common ancestor. If the tree is assumed to be an ultrametric (i.e., all root-leaf paths have the same length), the experiment can be equivalently seen to return the closest pair of leaves. In this model, efficient algorithms are known for tree reconstruction. In reality, since the data on which these 'experiments' are run is itself generated by the stochastic process of evolution, these experiments are noisy. In all reasonable models of evolution, if the branches leading to the leaves in a triple separate from each other at common ancestors that are very close to each other in the tree, the result of the experiment should be close to uniformly random. Motivated by this, we consider a model where the noise on any triple is just dependent on the three pairwise distances (referred to as distance-based noise). Our results are the following: 1. Suppose the length of every edge in the unknown tree is at least an $\tilde{O}(\frac{1}{\sqrt n})$ fraction of the length of a root-leaf path. Then, we give an efficient algorithm to reconstruct the topology of the tree for a broad family of distance-based noise models. Further, we show that if the edges are asymptotically shorter, then topology reconstruction is information-theoretically impossible. 2. Further, for a specific distance-based noise model, which we refer to as the homogeneous noise model, we show that the edge weights can also be approximately reconstructed under the same quantitative lower bound on the edge lengths.
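The noiseless triple experiment described in this abstract is easy to state in code. The sketch below assumes the ultrametric is supplied as a symmetric distance matrix and returns the closest pair in a triple; the function name `triple_experiment` and the four-leaf example matrix are hypothetical, and the noise models studied in the paper are not implemented.

```python
def triple_experiment(dist, a, b, c):
    """Return the pair among leaves a, b, c with the smallest pairwise distance.

    In an ultrametric tree this is also the pair with the deepest least
    common ancestor.
    """
    pairs = [(a, b), (a, c), (b, c)]
    return min(pairs, key=lambda pair: dist[pair[0]][pair[1]])


# Hypothetical ultrametric for the tree ((0,1),(2,3)):
# leaves 0 and 1 are at distance 2, leaves 2 and 3 at distance 3,
# and every pair that crosses the root is at distance 4.
EXAMPLE_DIST = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 3],
    [4, 4, 3, 0],
]
```

With this matrix, `triple_experiment(EXAMPLE_DIST, 0, 1, 2)` picks leaves 0 and 1, while `triple_experiment(EXAMPLE_DIST, 0, 2, 3)` picks leaves 2 and 3.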

Preprint, 2021, arXiv

Best vs. All: Equity and Accuracy of Standardized Test Score Reporting

We study a game-theoretic model of standardized testing for college admissions. Students are of two types: High and Low. There is a college that would like to admit the High type students. Students take a potentially costly standardized exam which provides a noisy signal of their type. The students come from two populations, which are identical in talent (i.e., the type distribution is the same), but differ in their access to resources: the higher resourced population can at their option take the exam multiple times, whereas the lower resourced population can only take the exam once. We study two models of score reporting, which capture existing policies used by colleges. The first policy (sometimes known as "super-scoring") allows students to report the max of the scores they achieve. The other policy requires that all scores be reported. We find in our model that requiring that all scores be reported results in superior outcomes in equilibrium, both from the perspective of the college (the admissions rule is more accurate), and from the perspective of equity across populations: a student's probability of admission is independent of their population, conditional on their type. In particular, the false positive rates and false negative rates are identical in this setting, across the highly and poorly resourced student populations. This is the case despite the fact that the more highly resourced students can, at their option, either report a more accurate signal of their type, or pool with the lower resourced population under this policy.
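A back-of-the-envelope calculation shows why reporting only the maximum score can favor students who retake the exam. The sketch below is not the paper's equilibrium model: it assumes each attempt independently clears a fixed admission cutoff with probability `q`, an assumption introduced here for illustration, and the function name `pass_probability` is hypothetical.

```python
def pass_probability(q, attempts):
    """Chance that the best of `attempts` independent scores clears a cutoff.

    Assumes (for illustration only) that each attempt clears the cutoff
    independently with the same probability q.
    """
    return 1.0 - (1.0 - q) ** attempts
```

Under these assumptions a student who clears the cutoff 20% of the time on a single attempt clears it with probability 1 - 0.8^3 = 0.488 when the best of three attempts is reported.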

Preprint, 2020, arXiv

Fair Prediction with Endogenous Behavior

There is increasing regulatory interest in whether machine learning algorithms deployed in consequential domains (e.g. in criminal justice) treat different demographic groups "fairly." However, there are several proposed notions of fairness, typically mutually incompatible. Using criminal justice as an example, we study a model in which society chooses an incarceration rule. Agents of different demographic groups differ in their outside options (e.g. opportunity for legal employment) and decide whether to commit crimes. We show that equalizing type I and type II errors across groups is consistent with the goal of minimizing the overall crime rate; other popular notions of fairness are not.
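The fairness notion singled out in this abstract, equal type I and type II errors across groups, can be checked directly from labeled decisions. The sketch below assumes binary labels and decisions (1 for the positive class or decision) and that each group contains both classes; the function name `error_rates` and the sample data are hypothetical.

```python
def error_rates(labels, decisions):
    """Return (false positive rate, false negative rate) for 0/1 data.

    Assumes the group contains at least one true negative and one true
    positive, so neither denominator is zero.
    """
    false_pos = sum(1 for y, d in zip(labels, decisions) if y == 0 and d == 1)
    false_neg = sum(1 for y, d in zip(labels, decisions) if y == 1 and d == 0)
    negatives = sum(1 for y in labels if y == 0)
    positives = len(labels) - negatives
    return false_pos / negatives, false_neg / positives


# Hypothetical decisions for two groups of different sizes.
group_a = error_rates([0, 0, 1, 1], [1, 0, 1, 0])
group_b = error_rates([0, 0, 0, 0, 1, 1], [1, 1, 0, 0, 0, 1])
```

Here both groups have a false positive rate and a false negative rate of 0.5, so this decision rule equalizes both error types even though the groups differ in size and base rate.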

Preprint, 2020, arXiv

Near-Perfect Recovery in the One-Dimensional Latent Space Model

Suppose a graph $G$ is stochastically created by uniformly sampling vertices along a line segment and connecting each pair of vertices with a probability that is a known decreasing function of their distance. We ask if it is possible to reconstruct the actual positions of the vertices in $G$ by only observing the generated unlabeled graph. We study this question for two natural edge probability functions: one where the probability of an edge decays exponentially with the distance and another where this probability decays only linearly. We initiate our study with the weaker goal of recovering only the order in which vertices appear on the line segment. For a segment of length $n$ and a precision parameter $\delta$, we show that for both exponential and linear decay edge probability functions, there is an efficient algorithm that correctly recovers (up to reflection symmetry) the order of all vertices that are at least $\delta$ apart, using only $\tilde{O}(\frac{n}{\delta^2})$ samples (vertices). Building on this result, we then show that $O(\frac{n^2 \log n}{\delta^2})$ vertices (samples) are sufficient to additionally recover the location of each vertex on the line to within a precision of $\delta$. We complement this result with an $\Omega(\frac{n^{1.5}}{\delta})$ lower bound on samples needed for reconstructing positions (even by a computationally unbounded algorithm), showing that the task of recovering positions is information-theoretically harder than recovering the order. We give experimental results showing that our algorithm recovers the positions of almost all points with high accuracy.
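The generative model in this abstract can be sketched in a few lines. The code below samples vertex positions uniformly on a segment and connects each pair with probability `exp(-d)`, where `d` is their distance; that specific decay function is an assumption, since the abstract only says the probability decays exponentially. The function name `sample_latent_graph` is hypothetical, and no recovery algorithm is implemented.

```python
import math
import random


def sample_latent_graph(num_vertices, length, seed=0):
    """Sample positions on [0, length] and edges with probability exp(-distance)."""
    rng = random.Random(seed)
    positions = [rng.uniform(0.0, length) for _ in range(num_vertices)]
    edges = set()
    for i in range(num_vertices):
        for j in range(i + 1, num_vertices):
            distance = abs(positions[i] - positions[j])
            # Assumed edge probability: exp(-distance), one exponential-decay choice.
            if rng.random() < math.exp(-distance):
                edges.add((i, j))
    return positions, edges
```

A recovery algorithm would receive only `edges` with vertex labels shuffled; `positions` is returned here so that a reconstruction could be scored against the ground truth.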

Preprint, 2020, arXiv

Pipeline Interventions

We introduce the pipeline intervention problem, defined by a layered directed acyclic graph and a set of stochastic matrices governing transitions between successive layers. The graph is a stylized model for how people from different populations are presented opportunities, eventually leading to some reward. In our model, individuals are born into an initial position (i.e., some node in the first layer of the graph) according to a fixed probability distribution, and then stochastically progress through the graph according to the transition matrices, until they reach a node in the final layer of the graph; each node in the final layer has a reward associated with it. The pipeline intervention problem asks how to best make costly changes to the transition matrices governing people's stochastic transitions through the graph, subject to a budget constraint. We consider two objectives: social welfare maximization, and a fairness-motivated maximin objective that seeks to maximize the value to the population (starting node) with the least expected value. We consider two variants of the maximin objective that turn out to be distinct, depending on whether we demand a deterministic solution or allow randomization. For each objective, we give an efficient approximation algorithm (an additive FPTAS) for constant-width networks. We also tightly characterize the "price of fairness" in our setting: the ratio between the highest achievable social welfare and the highest social welfare consistent with a maximin optimal solution. Finally, we show that for polynomial-width networks, even approximating the maximin objective to any constant factor is NP-hard, even for networks with constant depth. This shows that the restriction on the width in our positive results is essential.
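The two objectives named in this abstract can be evaluated for a fixed pipeline by propagating rewards backwards through the layers. The sketch below only scores a given set of transition matrices; it does not search for interventions or model costs and budgets. The function names and the two-layer example are hypothetical.

```python
def expected_values(transitions, rewards):
    """Expected final reward for each node in the first layer.

    `transitions` is a list of row-stochastic matrices, one per pair of
    successive layers; `rewards` holds the reward of each final-layer node.
    """
    values = list(rewards)
    # Work backwards: a node's value is the average value of its successors.
    for matrix in reversed(transitions):
        values = [sum(p * v for p, v in zip(row, values)) for row in matrix]
    return values


def social_welfare(start_dist, values):
    """Expected reward of an individual drawn from the starting distribution."""
    return sum(w * v for w, v in zip(start_dist, values))


# Hypothetical pipeline with two transition steps and two nodes per layer.
transitions = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.8, 0.2], [0.2, 0.8]],
]
values = expected_values(transitions, rewards=[1.0, 0.0])
welfare = social_welfare([0.5, 0.5], values)
maximin_value = min(values)
```

In this example the two starting nodes have expected values of about 0.8 and 0.5, so the social welfare is about 0.65 and the maximin objective is determined by the second node.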

Preprint, 2020, arXiv

Sublinear Algorithms and Lower Bounds for Metric TSP Cost Estimation

We consider the problem of designing sublinear time algorithms for estimating the cost of a minimum metric traveling salesman (TSP) tour. Specifically, given access to an $n \times n$ distance matrix $D$ that specifies pairwise distances between $n$ points, the goal is to estimate the TSP cost by performing only sublinear (in the size of $D$) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any $\varepsilon > 0$, there exists an $\tilde{O}(n/\varepsilon^{O(1)})$ time algorithm that returns a $(1 + \varepsilon)$-approximate estimate of the MST cost. This result immediately implies an $\tilde{O}(n/\varepsilon^{O(1)})$ time algorithm to estimate the TSP cost to within a $(2 + \varepsilon)$ factor for any $\varepsilon > 0$. However, no $o(n^2)$ time algorithms are known to approximate metric TSP to a factor that is strictly better than $2$. On the other hand, there were also no known barriers that rule out the existence of $(1 + \varepsilon)$-approximate estimation algorithms for metric TSP with $\tilde{O}(n)$ time for any fixed $\varepsilon > 0$. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. We also show that the problem of estimating metric TSP cost is closely connected to the problem of estimating the size of a maximum matching in a graph.
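The factor-2 baseline mentioned in this abstract rests on a standard fact: in a metric space, the optimal TSP tour costs at least the MST weight and at most twice the MST weight. The sketch below computes those two bounds exactly with Prim's algorithm. It reads the whole matrix and so runs in quadratic time; it is not one of the sublinear estimators the abstract refers to. The function names and the example are hypothetical.

```python
def mst_weight(dist):
    """Weight of a minimum spanning tree, via Prim's algorithm on a dense matrix."""
    n = len(dist)
    in_tree = [False] * n
    cheapest = [float("inf")] * n  # cheapest known edge into the tree, per vertex
    cheapest[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Add the vertex outside the tree that is cheapest to attach.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: cheapest[v])
        in_tree[u] = True
        total += cheapest[u]
        for v in range(n):
            if not in_tree[v] and dist[u][v] < cheapest[v]:
                cheapest[v] = dist[u][v]
    return total


def tsp_cost_bounds(dist):
    """Return (lower, upper) bounds on the metric TSP cost: (MST, 2 * MST)."""
    weight = mst_weight(dist)
    return weight, 2.0 * weight


# Four points on a line at coordinates 0, 1, 2, 3.
line_metric = [[abs(i - j) for j in range(4)] for i in range(4)]
lower, upper = tsp_cost_bounds(line_metric)
```

For the four collinear points the MST weighs 3, so the bounds are 3 and 6; the optimal tour 0, 1, 2, 3 and back costs 6, which meets the upper bound.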