Source author record

Jayakrishnan Nair

Jayakrishnan Nair appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Machine Learning, math.OC, Data Structures and Algorithms, eess.SY, Human-Computer Interaction, Performance, Systems and Control

Catalog footprint

What is connected

7 works

7 topics

4 close collaborators


Preprint, 2023, arXiv

Constrained regret minimization for multi-criterion multi-armed bandits

We consider a stochastic multi-armed bandit setting and study the problem of constrained regret minimization over a given time horizon. Each arm is associated with an unknown, possibly multi-dimensional distribution, and the merit of an arm is determined by several, possibly conflicting attributes. The aim is to optimize a 'primary' attribute subject to user-provided constraints on other 'secondary' attributes. We assume that the attributes can be estimated using samples from the arms' distributions, and that the estimators enjoy suitable concentration properties. We propose an algorithm called Con-LCB that guarantees a logarithmic regret, i.e., the average number of plays of all non-optimal arms is at most logarithmic in the horizon. The algorithm also outputs a Boolean flag that correctly identifies, with high probability, whether the given instance is feasible/infeasible with respect to the constraints. We also show that Con-LCB is optimal within a universal constant, i.e., that more sophisticated algorithms cannot do much better universally. Finally, we establish a fundamental trade-off between regret minimization and feasibility identification. Our framework finds natural applications, for instance, in financial portfolio optimization, where risk constrained maximization of expected return is meaningful.
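
The abstract does not spell out Con-LCB, so the following is only an illustrative sketch of a constrained lower-confidence-bound rule, not the authors' algorithm. It assumes a single secondary attribute, that both attributes are means estimated by sample averages, that the primary attribute is a cost to be minimized, and that the constraint has the form "secondary mean at most `tau`". The function name `constrained_lcb`, the confidence radius, and the way the feasibility flag is computed are all hypothetical choices made for this example.

```python
import math
import random


def constrained_lcb(arms, tau, horizon, seed=0):
    """Illustrative constrained LCB rule (assumed form, not the paper's Con-LCB).

    arms    : list of callables rng -> (primary_cost, secondary_value)
    tau     : threshold on the mean of the secondary attribute
    horizon : total number of pulls (must be at least len(arms))
    Returns (pull counts per arm, feasibility flag).
    """
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k
    sum_primary = [0.0] * k
    sum_secondary = [0.0] * k

    def pull(i):
        cost, secondary = arms[i](rng)
        counts[i] += 1
        sum_primary[i] += cost
        sum_secondary[i] += secondary

    def lower_bounds(t):
        # Assumed confidence radius sqrt(2 log t / n) for both attributes.
        radius = [math.sqrt(2.0 * math.log(t) / counts[i]) for i in range(k)]
        lcb_p = [sum_primary[i] / counts[i] - radius[i] for i in range(k)]
        lcb_s = [sum_secondary[i] / counts[i] - radius[i] for i in range(k)]
        return lcb_p, lcb_s

    for i in range(k):  # pull every arm once to initialize the estimates
        pull(i)

    for t in range(k + 1, horizon + 1):
        lcb_p, lcb_s = lower_bounds(t)
        # Arms whose optimistic secondary estimate still meets the constraint.
        plausible = [i for i in range(k) if lcb_s[i] <= tau]
        if plausible:
            choice = min(plausible, key=lambda i: lcb_p[i])
        else:
            # No arm looks feasible: pull the one closest to feasibility.
            choice = min(range(k), key=lambda i: lcb_s[i])
        pull(choice)

    # Flag the instance as feasible if some arm is still plausibly feasible.
    _, lcb_s = lower_bounds(horizon)
    feasible = any(value <= tau for value in lcb_s)
    return counts, feasible
```

With noiseless toy arms such as `[lambda rng: (0.2, 0.9), lambda rng: (0.5, 0.1), lambda rng: (0.9, 0.1)]` and `tau=0.5`, the first arm has the lowest cost but violates the constraint, so the sketch concentrates its pulls on the second arm.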

Preprint, 2022, arXiv

Pricing, competition and market segmentation in ride hailing

We analyse a non-cooperative strategic game between two ride-hailing platforms, each of which is modeled as a two-sided queueing system, where drivers (with a certain patience level) are assumed to arrive according to a Poisson process at a fixed rate, while the arrival process of passengers is split across the two providers based on Quality of Service (QoS) considerations. We also consider two monopolistic scenarios: (i) each platform has half the market share, and (ii) the platforms merge into a single entity, serving the entire passenger base using their combined driver resources. The key novelty of our formulation is that the total market share is fixed across the platforms. The game thus captures the competition among the platforms over market share, which is modeled using two different QoS metrics: (i) probability of driver availability, and (ii) probability that an arriving passenger takes a ride. The objective of the platforms is to maximize the profit generated from matching drivers and passengers. In each of the above settings, we analyse the equilibria associated with the game. Interestingly, under the second QoS metric, we show that for a certain range of parameters, no Nash equilibrium exists. Instead, we demonstrate a new solution concept called an equilibrium cycle. Our results highlight the interplay between competition, cooperation, passenger-side price sensitivity, and passenger/driver arrival rates.

Preprint, 2022, arXiv

Statistically Robust, Risk-Averse Best Arm Identification in Multi-Armed Bandits

Traditional multi-armed bandit (MAB) formulations usually make certain assumptions about the underlying arms' distributions, such as bounds on the support or their tail behaviour. Moreover, such parametric information is usually 'baked' into the algorithms. In this paper, we show that specialized algorithms that exploit such parametric information are prone to inconsistent learning performance when the parameter is misspecified. Our key contributions are twofold: (i) We establish fundamental performance limits of statistically robust MAB algorithms under the fixed-budget pure exploration setting, and (ii) We propose two classes of algorithms that are asymptotically near-optimal. Additionally, we consider a risk-aware criterion for best arm identification, where the objective associated with each arm is a linear combination of the mean and the conditional value at risk (CVaR). Throughout, we make a very mild 'bounded moment' assumption, which lets us work with both light-tailed and heavy-tailed distributions within a unified framework.
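
To make the risk-aware criterion concrete, here is a minimal sketch of an objective that linearly combines the sample mean with an empirical CVaR. It uses the plain "average of the worst tail" estimator, which is an assumption for illustration; the abstract does not say which estimators the paper uses. The names `empirical_cvar` and `mean_cvar_objective`, the weights `w_mean` and `w_cvar`, and the convention that samples are losses (larger is worse) are all hypothetical.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst (1 - alpha) fraction of the observed losses."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses, reverse=True)  # largest (worst) losses first
    # The small tolerance guards against floating-point error in (1 - alpha) * n.
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered) - 1e-12))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)


def mean_cvar_objective(losses, alpha, w_mean=1.0, w_cvar=1.0):
    """Linear combination of the sample mean and the empirical CVaR."""
    mean = sum(losses) / len(losses)
    return w_mean * mean + w_cvar * empirical_cvar(losses, alpha)
```

Under this convention, a best-arm rule would compute the objective for each arm's samples and pick the arm with the smallest value. For the losses 1, 2, ..., 10 and `alpha=0.8`, the tail consists of 9 and 10, so the empirical CVaR is 9.5.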

Preprint, 2022, arXiv

Unsupervised Crowdsourcing with Accuracy and Cost Guarantees

We consider the problem of cost-optimal utilization of a crowdsourcing platform for binary, unsupervised classification of a collection of items, given a prescribed error threshold. Workers on the crowdsourcing platform are assumed to be divided into multiple classes, based on their skill, experience, and/or past performance. We model each worker class via an unknown confusion matrix, and a (known) price to be paid per label prediction. For this setting, we propose algorithms for acquiring label predictions from workers, and for inferring the true labels of items. We prove that if the number of (unlabeled) items available is large enough, our algorithms satisfy the prescribed error thresholds, incurring a cost that is near-optimal. Finally, we validate our algorithms, and some heuristics inspired by them, through an extensive case study.

Preprint, 2020, arXiv

Bandit algorithms: Letting go of logarithmic regret for statistical robustness

We study regret minimization in a stochastic multi-armed bandit setting and establish a fundamental trade-off between the regret suffered under an algorithm and its statistical robustness. Considering broad classes of underlying arms' distributions, we show that bandit learning algorithms with logarithmic regret are always inconsistent and that consistent learning algorithms always suffer a super-logarithmic regret. This result highlights the inevitable statistical fragility of all 'logarithmic regret' bandit algorithms available in the literature. For instance, if a UCB algorithm designed for $σ$-subGaussian distributions is used in a subGaussian setting with a mismatched variance parameter, the learning performance could be inconsistent. Next, we show a positive result: statistically robust and consistent learning performance is attainable if we allow the regret to be slightly worse than logarithmic. Specifically, we propose three classes of distribution-oblivious algorithms that achieve an asymptotic regret that is arbitrarily close to logarithmic.
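
To illustrate how a variance parameter gets built into an algorithm, one common textbook form of the UCB index is shown below. It is not necessarily the variant analyzed in the paper. The symbols are assumptions introduced here: $\hat{\mu}_i(t)$ is the empirical mean of arm $i$, $n_i(t)$ its number of pulls so far, and $\delta$ a confidence level.

```latex
\mathrm{UCB}_i(t) \;=\; \hat{\mu}_i(t) \;+\; \sigma \sqrt{\frac{2 \log(1/\delta)}{n_i(t)}}
```

The width of the exploration bonus scales linearly with $\sigma$, so if the assumed $\sigma$ is smaller than the true one, the confidence intervals are too narrow and the algorithm may stop exploring a good arm too early.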

Preprint, 2020, arXiv

Multiple Server SRPT with speed scaling is competitive

Can the popular shortest remaining processing time (SRPT) algorithm achieve a constant competitive ratio on multiple servers, with respect to the flow time plus energy consumption metric, when server speeds are adjustable (speed scaling)? This question has remained open for a while, although a negative result in the absence of speed scaling is well known. The main result of this paper is to show that multi-server SRPT can be constant competitive, with a competitive ratio that only depends on the power-usage function of the servers, but not on the number of jobs/servers or the job sizes (unlike when speed scaling is not allowed). When all job sizes are unity, we show that round-robin routing is optimal and can achieve the same competitive ratio as the best known algorithm for the single server problem. Finally, we show that a class of greedy dispatch policies, including policies that route to the least loaded or the shortest queue, do not admit a constant competitive ratio. When job arrivals are stochastic, with Poisson arrivals and i.i.d. job sizes, we show that random routing and a simple gated-static speed scaling algorithm achieve a constant competitive ratio.
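
For readers unfamiliar with SRPT, the sketch below simulates the basic preemptive rule on a single unit-speed server and reports the total flow time (completion time minus arrival time, summed over jobs). It deliberately leaves out the multi-server dispatch, speed scaling, and energy cost that the paper is about. The function name `srpt_total_flow_time` and the `(arrival_time, size)` job format are assumptions made for this example.

```python
import heapq


def srpt_total_flow_time(jobs):
    """Total flow time under preemptive SRPT on one unit-speed server.

    jobs: list of (arrival_time, size) pairs.
    """
    pending = sorted(jobs)  # process arrivals in time order
    active = []             # min-heap of (remaining_work, arrival_time)
    now = 0.0
    next_job = 0
    total_flow = 0.0
    while next_job < len(pending) or active:
        if not active:
            # Server idle: jump ahead to the next arrival.
            now = max(now, pending[next_job][0])
        while next_job < len(pending) and pending[next_job][0] <= now:
            arrival, size = pending[next_job]
            heapq.heappush(active, (size, arrival))
            next_job += 1
        # Serve the job with the least remaining work.
        remaining, arrival = heapq.heappop(active)
        upcoming = pending[next_job][0] if next_job < len(pending) else float("inf")
        run = min(remaining, upcoming - now)
        now += run
        if run < remaining:
            # A new arrival interrupts; re-queue so it can preempt if shorter.
            heapq.heappush(active, (remaining - run, arrival))
        else:
            total_flow += now - arrival
    return total_flow
```

For example, with a job of size 3 arriving at time 0 and a job of size 1 arriving at time 1, SRPT preempts the long job at time 1, finishes the short one at time 2, and finishes the long one at time 4, for a total flow time of 1 + 4 = 5. First-come-first-served would give 3 + 3 = 6.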

Preprint, 2020, arXiv

Pareto-optimal energy sharing between battery-equipped renewable generators

The inherent intermittency of renewable sources like wind and solar has resulted in a bundling of renewable generators with storage resources (batteries) for increased reliability. In this paper, we consider the problem of energy sharing between two such bundles, each associated with its own demand profile. The demand profiles might, for example, correspond to commitments made by the bundle to the grid. With each bundle seeking to minimize its loss of load rate, we explore the possibility that one bundle can supply energy to the other from its battery at times of deficit, in return for a reciprocal supply from the other when it faces a deficit itself. We show that there always exist mutually beneficial energy sharing arrangements between the two bundles. Moreover, we show that Pareto-optimal arrangements involve at least one bundle transferring energy to the other at the maximum feasible rate at times of deficit. We illustrate the potential gains from such dynamic energy sharing via an extensive case study.
