Source author record

Divya Padmanabhan

Divya Padmanabhan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Science and Game Theory Machine Learning Artificial Intelligence Human-Computer Interaction

Catalog footprint

What is connected

4works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

A Dominant Strategy Truthful, Deterministic Multi-Armed Bandit Mechanism with Logarithmic Regret

Stochastic multi-armed bandit (MAB) mechanisms are widely used in sponsored search auctions, crowdsourcing, online procurement, etc. Existing stochastic MAB mechanisms with a deterministic payment rule, proposed in the literature, necessarily suffer a regret of $Ω(T^{2/3})$, where $T$ is the number of time steps. This happens because the existing mechanisms consider the worst case scenario where the means of the agents' stochastic rewards are separated by a very small amount that depends on $T$. We make, and, exploit the crucial observation that in most scenarios, the separation between the agents' rewards is rarely a function of $T$. Moreover, in the case that the rewards of the arms are arbitrarily close, the regret contributed by such sub-optimal arms is minimal. Our idea is to allow the center to indicate the resolution, $Δ$, with which the agents must be distinguished. This immediately leads us to introduce the notion of $Δ$-Regret. Using sponsored search auctions as a concrete example (the same idea applies for other applications as well), we propose a dominant strategy incentive compatible (DSIC) and individually rational (IR), deterministic MAB mechanism, based on ideas from the Upper Confidence Bound (UCB) family of MAB algorithms. Remarkably, the proposed mechanism $Δ$-UCB achieves a $Δ$-regret of $O(\log T)$ for the case of sponsored search auctions. We first establish the results for single slot sponsored search auctions and then non-trivially extend the results to the case where multiple slots are to be allocated.

preprint2016arXiv

A Robust UCB Scheme for Active Learning in Regression from Strategic Crowds

We study the problem of training an accurate linear regression model by procuring labels from multiple noisy crowd annotators, under a budget constraint. We propose a Bayesian model for linear regression in crowdsourcing and use variational inference for parameter estimation. To minimize the number of labels crowdsourced from the annotators, we adopt an active learning approach. In this specific context, we prove the equivalence of well-studied criteria of active learning like entropy minimization and expected error reduction. Interestingly, we observe that we can decouple the problems of identifying an optimal unlabeled instance and identifying an annotator to label it. We observe a useful connection between the multi-armed bandit framework and the annotator selection in active learning. Due to the nature of the distribution of the rewards on the arms, we use the Robust Upper Confidence Bound (UCB) scheme with truncated empirical mean estimator to solve the annotator selection problem. This yields provable guarantees on the regret. We further apply our model to the scenario where annotators are strategic and design suitable incentives to induce them to put in their best efforts.

preprint2016arXiv

A Truthful Mechanism with Biparameter Learning for Online Crowdsourcing

We study a problem of allocating divisible jobs, arriving online, to workers in a crowdsourcing setting which involves learning two parameters of strategically behaving workers. Each job is split into a certain number of tasks that are then allocated to workers. Each arriving job has to be completed within a deadline and each task has to be completed satisfying an upper bound on probability of failure. The job population is homogeneous while the workers are heterogeneous in terms of costs, completion times, and times to failure. The job completion time and time to failure of each worker are stochastic with fixed but unknown means. The requester is faced with the challenge of learning two separate parameters of each (strategically behaving) worker simultaneously, namely, the mean job completion time and the mean time to failure. The time to failure of a worker depends on the duration of the task handled by the worker. Assuming non-strategic workers to start with, we solve this biparameter learning problem by applying the Robust UCB algorithm. Then, we non-trivially extend this algorithm to the setting where the workers are strategic about their costs. Our proposed mechanism is dominant strategy incentive compatible and ex-post individually rational with asymptotically optimal regret performance.

preprint2016arXiv

Topic Model Based Multi-Label Classification from the Crowd

Multi-label classification is a common supervised machine learning problem where each instance is associated with multiple classes. The key challenge in this problem is learning the correlations between the classes. An additional challenge arises when the labels of the training instances are provided by noisy, heterogeneous crowdworkers with unknown qualities. We first assume labels from a perfect source and propose a novel topic model where the present as well as the absent classes generate the latent topics and hence the words. We non-trivially extend our topic model to the scenario where the labels are provided by noisy crowdworkers. Extensive experimentation on real world datasets reveals the superior performance of the proposed model. The proposed model learns the qualities of the annotators as well, even with minimal training data.

Divya Padmanabhan

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

A Dominant Strategy Truthful, Deterministic Multi-Armed Bandit Mechanism with Logarithmic Regret

A Robust UCB Scheme for Active Learning in Regression from Strategic Crowds

A Truthful Mechanism with Biparameter Learning for Online Crowdsourcing

Topic Model Based Multi-Label Classification from the Crowd