Source author record

Arpita Biswas

Arpita Biswas appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Artificial Intelligence; Machine Learning; cs.CY; Computer Science and Game Theory; Data Structures and Algorithms; Information Retrieval

Catalog footprint

What is connected

9 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems

We propose Streaming Bandits, a Restless Multi-Armed Bandit (RMAB) framework in which heterogeneous arms may arrive and leave the system after staying on for a finite lifetime. Streaming Bandits naturally capture the health intervention planning problem, where health workers must manage the health outcomes of a patient cohort while new patients join and existing patients leave the cohort each day. Our contributions are as follows: (1) We derive conditions under which our problem satisfies indexability, a precondition that guarantees the existence and asymptotic optimality of the Whittle Index solution for RMABs. We establish the conditions using a polytime reduction of the Streaming Bandit setup to regular RMABs. (2) We further prove a phenomenon that we call index decay, whereby the Whittle index values are low for short residual lifetimes, which drives the intuition underpinning our algorithm. (3) We propose a novel and efficient algorithm to compute the index-based solution for Streaming Bandits. Unlike previous methods, our algorithm does not rely on solving the costly finite horizon problem on each arm of the RMAB, thereby lowering the computational complexity. (4) Finally, we evaluate our approach via simulations run on real-world data sets from a tuberculosis patient monitoring task and an intervention planning task for improving maternal healthcare, in addition to other synthetic domains. Across the board, our algorithm achieves a two-orders-of-magnitude speed-up over existing methods while maintaining the same solution quality.

Preprint, 2022, arXiv

On Achieving Leximin Fairness and Stability in Many-to-One Matchings

The past few years have seen a surge of work on fairness in allocation problems where items must be fairly divided among agents having individual preferences. In comparison, fairness in settings with preferences on both sides, that is, where agents have to be matched to other agents, has received much less attention. Moreover, two-sided matching literature has largely focused on ordinal preferences. This paper initiates the study of fairness in stable many-to-one matchings under cardinal valuations. Motivated by real-world settings, we study leximin optimality over stable many-to-one matchings. We first investigate matching problems with ranked valuations where all agents on each side have the same preference orders or rankings over the agents on the other side (but not necessarily the same valuations). Here, we provide a complete characterisation of the space of stable matchings. This leads to FaSt, a novel and efficient algorithm to compute a leximin optimal stable matching under ranked isometric valuations (where, for each pair of agents, the valuation of one agent for the other is the same). Building upon FaSt, we present an efficient algorithm, FaSt-Gen, that finds the leximin optimal stable matching for a more general ranked setting. When there are exactly two agents on one side who may be matched to many agents on the other, strict preferences are enough to guarantee an efficient algorithm. We next establish that, in the absence of rankings and under strict preferences (with no restriction on the number of agents on either side), finding a leximin optimal stable matching is NP-Hard. Further, with weak rankings, the problem is strongly NP-Hard, even under isometric valuations. In fact, when additivity and non-negativity are the only assumptions, we show that, unless P=NP, no efficient polynomial factor approximation is possible.
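The leximin criterion named in this abstract can be illustrated with a minimal sketch. It is not the FaSt or FaSt-Gen algorithm and it does not check stability; it only compares given vectors of agent valuations, assuming the candidate matchings have already been enumerated elsewhere. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def leximin_key(valuations: Sequence[float]) -> Tuple[float, ...]:
    """Sort valuations ascending so the worst-off agent comes first."""
    return tuple(sorted(valuations))


def leximin_better(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if valuation vector `a` is strictly leximin-preferred to `b`.

    Tuples compare lexicographically, so this first compares the minimum
    valuations, then the second-smallest, and so on.
    """
    return leximin_key(a) > leximin_key(b)


def leximin_best(candidates: Sequence[Sequence[float]]) -> Sequence[float]:
    """Pick a leximin-optimal vector from an explicit list of candidates."""
    return max(candidates, key=leximin_key)
```

For example, the vector [3, 1, 2] is preferred to [1, 1, 5]: both have minimum 1, but the second-smallest value is 2 in the first and 1 in the second.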

Preprint, 2022, arXiv

Ranked Prioritization of Groups in Combinatorial Bandit Allocation

Preventing poaching through ranger patrols protects endangered wildlife, directly contributing to the UN Sustainable Development Goal 15 of life on land. Combinatorial bandits have been used to allocate limited patrol resources, but existing approaches overlook the fact that each location is home to multiple species in varying proportions, so a patrol benefits each species to differing degrees. When some species are more vulnerable, we ought to offer more protection to these animals; unfortunately, existing combinatorial bandit approaches do not offer a way to prioritize important species. To bridge this gap, (1) we propose a novel combinatorial bandit objective that trades off reward maximization against prioritization over species, which we call ranked prioritization. We show this objective can be expressed as a weighted linear sum of Lipschitz-continuous reward functions. (2) We provide RankedCUCB, an algorithm to select combinatorial actions that optimize our prioritization-based objective, and prove that it achieves asymptotic no-regret. (3) We demonstrate empirically that RankedCUCB leads to up to 38% improvement in outcomes for endangered species using real-world wildlife conservation data. Along with adapting to other challenges such as preventing illegal logging and overfishing, our no-regret algorithm addresses the general combinatorial bandit problem with a weighted linear objective.

Preprint, 2022, arXiv

Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning

We introduce robustness in restless multi-armed bandits (RMABs), a popular model for constrained resource allocation among independent stochastic processes (arms). Nearly all RMAB techniques assume stochastic dynamics are precisely known. However, in many real-world settings, dynamics are estimated with significant uncertainty, e.g., via historical data, which can lead to bad outcomes if ignored. To address this, we develop an algorithm to compute minimax regret-robust policies for RMABs. Our approach uses a double oracle framework (oracles for agent and nature), which is often used for single-process robust planning but requires significant new techniques to accommodate the combinatorial nature of RMABs. Specifically, we design a deep reinforcement learning (RL) algorithm, DDLPO, which tackles the combinatorial challenge by learning an auxiliary "λ-network" in tandem with policy networks per arm, greatly reducing sample complexity, with guarantees on convergence. DDLPO, of general interest, implements our reward-maximizing agent oracle. We then tackle the challenging regret-maximizing nature oracle, a non-stationary RL challenge, by formulating it as a multi-agent RL problem between a policy optimizer and adversarial nature. This formulation is of general interest; we solve it for RMABs by creating a multi-agent extension of DDLPO with a shared critic. We show our approaches work well in three experimental domains.

Preprint, 2021, arXiv

Towards Fair Recommendation in Two-Sided Platforms

Many online platforms today (such as Amazon, Netflix, Spotify, LinkedIn, and AirBnB) can be thought of as two-sided markets with producers and customers of goods and services. Traditionally, recommendation services in these platforms have focused on maximizing customer satisfaction by tailoring the results according to the personalized preferences of individual customers. However, our investigation reinforces the fact that such customer-centric design of these services may lead to unfair distribution of exposure to the producers, which may adversely impact their well-being. On the other hand, a pure producer-centric design might become unfair to the customers. As more and more people are depending on such platforms to earn a living, it is important to ensure fairness to both producers and customers. In this work, by mapping a fair personalized recommendation problem to a constrained version of the problem of fairly allocating indivisible goods, we propose to provide fairness guarantees for both sides. Formally, our proposed FairRec algorithm guarantees Maxi-Min Share (α-MMS) of exposure for the producers, and Envy-Free up to One Item (EF1) fairness for the customers. Extensive evaluations over multiple real-world datasets show the effectiveness of FairRec in ensuring two-sided fairness while incurring a marginal loss in overall recommendation quality. Finally, we present a modification of FairRec (named FairRecPlus) that, at the cost of additional computation time, improves the recommendation performance for the customers while maintaining the same fairness guarantees.
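The EF1 notion mentioned in this abstract can be made concrete with a small checker. This is a sketch of the standard definition under assumed additive valuations, not the FairRec algorithm itself; the function names and the data layout (a valuation matrix indexed by agent and item, and one list of item indices per agent) are assumptions made for illustration.

```python
from typing import List, Sequence


def bundle_value(values: Sequence[float], bundle: Sequence[int]) -> float:
    """Additive value of a bundle of item indices (assumed valuation model)."""
    return sum(values[g] for g in bundle)


def is_ef1(valuations: Sequence[Sequence[float]],
           allocation: Sequence[List[int]]) -> bool:
    """Check envy-freeness up to one item.

    Agent i must value its own bundle at least as much as agent j's bundle
    after removing some single item from j's bundle. With additive values,
    removing the item that i values most is the best possible removal.
    """
    n = len(allocation)
    for i in range(n):
        own = bundle_value(valuations[i], allocation[i])
        for j in range(n):
            if i == j or not allocation[j]:
                continue  # an empty bundle cannot be envied
            other = bundle_value(valuations[i], allocation[j])
            best_removed = max(valuations[i][g] for g in allocation[j])
            if own < other - best_removed:
                return False
    return True
```

In a recommendation setting the "agents" would be customers and the "items" recommended products, but that mapping is only suggested by the abstract and is not spelled out here.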

Preprint, 2020, arXiv

COVID-19: Strategies for Allocation of Test Kits

With the increasing spread of COVID-19, it is important to systematically test more and more people. The current strategy for test-kit allocation is mostly rule-based, focusing on individuals having (a) symptoms of COVID-19, (b) travel history or (c) contact history with confirmed COVID-19 patients. Such a testing strategy may miss asymptomatic individuals who were infected via community spread. Thus, it is important to allocate a separate budget of test-kits per day targeted towards preventing community spread and detecting new cases early on. In this report, we consider the problem of allocating test-kits and discuss some solution approaches. We believe that these approaches will be useful to contain community spread and detect new cases early on. Additionally, these approaches would help in collecting unbiased data which can then be used to improve the accuracy of machine learning models trained to predict COVID-19 infections.

Preprint, 2020, arXiv

Ensuring Fairness under Prior Probability Shifts

In this paper, we study the problem of fair classification in the presence of prior probability shifts, where the training set distribution differs from that of the test set. This phenomenon can be observed in the yearly records of several real-world datasets, such as recidivism records and medical expenditure surveys. If unaccounted for, such shifts can cause the predictions of a classifier to become unfair towards specific population subgroups. While the fairness notion called Proportional Equality (PE) accounts for such shifts, a procedure to ensure PE-fairness was unknown. In this work, we propose a method, called CAPE, which provides a comprehensive solution to the aforementioned problem. CAPE makes novel use of prevalence estimation techniques, sampling and an ensemble of classifiers to ensure fair predictions under prior probability shifts. We introduce a metric, called prevalence difference (PD), which CAPE attempts to minimize in order to ensure PE-fairness. We theoretically establish that this metric exhibits several desirable properties. We evaluate the efficacy of CAPE via a thorough empirical evaluation on synthetic datasets. We also compare the performance of CAPE with several popular fair classifiers on real-world datasets like COMPAS (criminal risk assessment) and MEPS (medical expenditure panel survey). The results indicate that CAPE ensures PE-fair predictions, while performing well on other performance metrics.

Preprint, 2016, arXiv

Demand Prediction and Placement Optimization for Electric Vehicle Charging Stations

Effective placement of charging stations plays a key role in Electric Vehicle (EV) adoption. In the placement problem, given a set of candidate sites, an optimal subset needs to be selected with respect to the concerns of both (a) the charging station service provider, such as the demand at the candidate sites and the budget for deployment, and (b) the EV user, such as charging station reachability and short waiting times at the station. This work addresses these concerns, making the following three novel contributions: (i) a supervised multi-view learning framework using Canonical Correlation Analysis (CCA) for demand prediction at candidate sites, using multiple datasets such as points of interest information, traffic density, and the historical usage at existing charging stations; (ii) a mixed-packing-and-covering optimization framework that models competing concerns of the service provider and EV users; (iii) an iterative heuristic to solve these problems by alternately invoking knapsack and set cover algorithms. The performance of the demand prediction model and the placement optimization heuristic is evaluated using real-world data.

Preprint, 2016, arXiv

Managing Overstaying Electric Vehicles in Park-and-Charge Facilities

With the increase in adoption of Electric Vehicles (EVs), proper utilization of the charging infrastructure is an emerging challenge for service providers. Overstaying of an EV after a charging event is a key contributor to low utilization. Since overstaying is easily detectable by monitoring the power drawn from the charger, managing this problem primarily involves designing an appropriate "penalty" during the overstaying period. Higher penalties do discourage overstaying; however, due to uncertainty in parking duration, fewer people would find such penalties acceptable, leading to decreased utilization (and revenue). To analyze this central trade-off, we develop a novel framework that integrates models for realistic user behavior into queueing dynamics to locate the optimal penalty from the points of view of utilization and revenue, for different values of the external charging demand. Next, when the model parameters are unknown, we show how an online learning algorithm, such as UCB, can be adapted to learn the optimal penalty. Our experimental validation, based on charging data from London, shows that an appropriate penalty can increase both utilization and revenue while significantly reducing overstaying.
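The idea of using UCB to learn a penalty can be sketched with plain UCB1 over a discrete grid of penalty levels. This is not the paper's adapted algorithm: the penalty grid, the reward function, and the assumption that rewards are normalized to [0, 1] are all hypothetical and stand in for the queueing and user-behavior models described in the abstract.

```python
import math
import random
from typing import Callable, List

# Hypothetical penalty levels (arbitrary units); each level is one bandit arm.
PENALTY_LEVELS = [0.0, 0.5, 1.0]


def simulated_reward(arm: int, rng: random.Random) -> float:
    """Hypothetical Bernoulli reward per penalty level (illustration only)."""
    success_prob = [0.2, 0.8, 0.5][arm]
    return 1.0 if rng.random() < success_prob else 0.0


def ucb1(reward_fn: Callable[[int, random.Random], float],
         n_arms: int, horizon: int, rng: random.Random) -> List[int]:
    """Run UCB1 for `horizon` rounds and return the pull count of each arm."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialize its estimate
        else:
            # empirical mean plus an exploration bonus that shrinks with pulls
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        counts[arm] += 1
        sums[arm] += reward_fn(arm, rng)
    return counts
```

Calling `ucb1(simulated_reward, len(PENALTY_LEVELS), 2000, random.Random(0))` returns how often each penalty level was tried; the level pulled most often is the learner's current estimate of the best penalty.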
