Source author record

Vidit Saxena

Vidit Saxena appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Networking and Internet Architecture

Catalog footprint

What is connected

2works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Constrained Thompson Sampling for Wireless Link Optimization

Wireless communication systems operate in complex time-varying environments. Therefore, selecting the optimal configuration parameters in these systems is a challenging problem. For wireless links, \emph{rate selection} is used to select the optimal data transmission rate that maximizes the link throughput subject to an application-defined latency constraint. We model rate selection as a stochastic multi-armed bandit (MAB) problem, where a finite set of transmission rates are modeled as independent bandit arms. For this setup, we propose Con-TS, a novel constrained version of the Thompson sampling algorithm, where the latency requirement is modeled by a high-probability linear constraint. We show that for Con-TS, the expected number of constraint violations over T transmission intervals is upper bounded by O(\sqrt{KT}), where K is the number of available rates. Further, the expected loss in cumulative throughput compared to the optimal rate selection scheme (i.e., the egret is also upper bounded by O(\sqrt{KT \log K}). Through numerical simulations, we demonstrate that Con-TS significantly outperforms state-of-the-art bandit schemes for rate selection.

preprint2020arXiv

Thompson Sampling for Linearly Constrained Bandits

We address multi-armed bandits (MAB) where the objective is to maximize the cumulative reward under a probabilistic linear constraint. For a few real-world instances of this problem, constrained extensions of the well-known Thompson Sampling (TS) heuristic have recently been proposed. However, finite-time analysis of constrained TS is challenging; as a result, only O(\sqrt{T}) bounds on the cumulative reward loss (i.e., the regret) are available. In this paper, we describe LinConTS, a TS-based algorithm for bandits that place a linear constraint on the probability of earning a reward in every round. We show that for LinConTS, the regret as well as the cumulative constraint violations are upper bounded by O(\log T) for the suboptimal arms. We develop a proof technique that relies on careful analysis of the dual problem and combine it with recent theoretical work on unconstrained TS. Through numerical experiments on two real-world datasets, we demonstrate that LinConTS outperforms an asymptotically optimal upper confidence bound (UCB) scheme in terms of simultaneously minimizing the regret and the violation.

Vidit Saxena

What is connected

Connect this record

See the researcher in context

Building this map preview

2 published item(s)

Constrained Thompson Sampling for Wireless Link Optimization

Thompson Sampling for Linearly Constrained Bandits