Researcher profile

Lalit Jain

Lalit Jain contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2022arXiv

Active Learning with Safety Constraints

Active learning methods have shown great promise in reducing the number of samples necessary for learning. As automated learning systems are adopted into real-time, real-world decision-making pipelines, it is increasingly important that such algorithms are designed with safety in mind. In this work we investigate the complexity of learning the best safe decision in interactive environments. We reduce this problem to a constrained linear bandits problem, where our goal is to find the best arm satisfying certain (unknown) safety constraints. We propose an adaptive experimental design-based algorithm, which we show efficiently trades off between the difficulty of showing an arm is unsafe vs suboptimal. To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints. In practice, we demonstrate that this approach performs well on synthetic and real world datasets.

preprint2022arXiv

An Experimental Design Approach for Regret Minimization in Logistic Bandits

In this work we consider the problem of regret minimization for logistic bandits. The main challenge of logistic bandits is reducing the dependence on a potentially large problem dependent constant $κ$ that can at worst scale exponentially with the norm of the unknown parameter $θ_{\ast}$. Abeille et al. (2021) have applied self-concordance of the logistic function to remove this worst-case dependence providing regret guarantees like $O(d\log^2(κ)\sqrt{\dotμT}\log(|\mathcal{X}|))$ where $d$ is the dimensionality, $T$ is the time horizon, and $\dotμ$ is the variance of the best-arm. This work improves upon this bound in the fixed arm setting by employing an experimental design procedure that achieves a minimax regret of $O(\sqrt{d \dotμT\log(|\mathcal{X}|)})$. Our regret bound in fact takes a tighter instance (i.e., gap) dependent regret bound for the first time in logistic bandits. We also propose a new warmup sampling algorithm that can dramatically reduce the lower order term in the regret in general and prove that it can replace the lower order term dependency on $κ$ to $\log^2(κ)$ for some instances. Finally, we discuss the impact of the bias of the MLE on the logistic bandit problem, providing an example where $d^2$ lower order regret (cf., it is $d$ for linear bandits) may not be improved as long as the MLE is used and how bias-corrected estimators may be used to make it closer to $d$.

preprint2021arXiv

Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on $1/κ$, where $κ$ is the minimal variance over all arms' reward distributions. In general, $1/κ$ scales exponentially with the norm of the unknown linear parameter $θ^*$. Instead of relying on this worst-case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits improving upon state-of-the-art performance guarantees. For pure exploration, we also provide a lower bound highlighting a dependence on $1/κ$ for a family of instances.

preprint2020arXiv

A New Perspective on Pool-Based Active Classification and False-Discovery Control

In many scientific settings there is a need for adaptive experimental design to guide the process of identifying regions of the search space that contain as many true positives as possible subject to a low rate of false discoveries (i.e. false alarms). Such regions of the search space could differ drastically from a predicted set that minimizes 0/1 error and accurate identification could require very different sampling strategies. Like active learning for binary classification, this experimental design cannot be optimally chosen a priori, but rather the data must be taken sequentially and adaptively. However, unlike classification with 0/1 error, collecting data adaptively to find a set with high true positive rate and low false discovery rate (FDR) is not as well understood. In this paper we provide the first provably sample efficient adaptive algorithm for this problem. Along the way we highlight connections between classification, combinatorial bandits, and FDR control making contributions to each.

preprint2020arXiv

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. Leveraging ideas from the theory of suprema of empirical processes, we provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms. Unlike previous approaches which sample based on minimizing a worst-case variance (e.g. G-optimal design), we define an experimental design objective based on the Gaussian-width of the underlying arm set. We provide a novel lower bound in terms of this objective that highlights its fundamental role in the sample complexity. The sample complexity of our fixed confidence algorithm matches this lower bound, and in addition is computationally efficient for combinatorial classes, e.g. shortest-path, matchings and matroids, where the arm sets can be exponentially large in the dimension. Finally, we propose the first algorithm for linear bandits in the the fixed budget setting. Its guarantee matches our lower bound up to logarithmic factors.

preprint2020arXiv

Finding All ε-Good Arms in Stochastic Bandits

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although arguably this may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all ε-good candidates. Mathematically, the all-ε-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these and demonstrate their great empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest as well as a dataset testing hundreds of possible cancer drugs.

preprint2020arXiv

Spectral Methods for Ranking with Scarce Data

Given a number of pairwise preferences of items, a common task is to rank all the items. Examples include pairwise movie ratings, New Yorker cartoon caption contests, and many other consumer preferences tasks. What these settings have in common is two-fold: a scarcity of data (it may be costly to get comparisons for all the pairs of items) and additional feature information about the items (e.g., movie genre, director, and cast). In this paper we modify a popular and well studied method, RankCentrality for rank aggregation to account for few comparisons and that incorporates additional feature information. This method returns meaningful rankings even under scarce comparisons. Using diffusion based methods, we incorporate feature information that outperforms state-of-the-art methods in practice. We also provide improved sample complexity for RankCentrality in a variety of sampling schemes.