The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling

In this note, we introduce a general version of the well-known elliptical potential lemma, a widely used technique in the analysis of algorithms for sequential learning and decision-making problems. We consider a stochastic linear bandit setting in which a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected reward over a decision-making horizon. The elliptical potential lemma is a key tool for quantifying uncertainty in estimating the parameters of the reward function, but it requires the noise and prior distributions to be Gaussian. Our general elliptical potential lemma relaxes this Gaussian requirement, which is a highly non-trivial extension for several reasons: unlike in the Gaussian case, there is no closed-form solution for the covariance matrix of the posterior distribution, the covariance matrix is not a deterministic function of the actions, and the covariance matrix is not decreasing with respect to the semidefinite inequality. While this result is of broad interest, we showcase an application of it by proving an improved Bayesian regret bound for the well-known Thompson sampling algorithm in stochastic linear bandits with changing action sets, where the prior and noise distributions are general. This bound is minimax optimal up to constants.
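For orientation, the classical (Gaussian-case) elliptical potential lemma that the note generalizes can be checked numerically. The sketch below is not taken from the paper: it assumes the standard textbook statement, in which V_0 = lam * I, V_t = V_{t-1} + x_t x_t^T, and the sum of min(1, ||x_t||^2 in the V_{t-1}^{-1} norm) is at most 2 log(det V_n / det V_0), which is in turn at most 2 d log(1 + n L^2 / (d lam)) when every action has norm at most L. The function name `elliptical_potential`, the regularizer `lam`, and the random actions are illustrative assumptions. In the Gaussian setting, V_t^{-1} is the posterior covariance (for unit-variance noise and prior covariance I / lam), which is why it is a deterministic function of the actions there.

```python
import numpy as np


def elliptical_potential(actions, lam=1.0):
    """Hypothetical helper: compare the potential sum with its classical bounds.

    actions: array of shape (n, d), one action vector per round.
    lam:     ridge regularizer, so V_0 = lam * I (an assumption of this sketch).
    Returns (potential_sum, logdet_bound, closed_form_bound).
    """
    n, d = actions.shape
    V = lam * np.eye(d)
    total = 0.0
    for x in actions:
        # Squared norm of x in the V_{t-1}^{-1} geometry.
        w = x @ np.linalg.solve(V, x)
        total += min(1.0, w)
        # Rank-one update: V_t = V_{t-1} + x x^T.
        V += np.outer(x, x)

    # 2 * log(det V_n / det V_0), using slogdet for numerical stability.
    _, logdet_n = np.linalg.slogdet(V)
    logdet_bound = 2.0 * (logdet_n - d * np.log(lam))

    # 2 * d * log(1 + n L^2 / (d * lam)), with L the largest action norm.
    L = np.max(np.linalg.norm(actions, axis=1))
    closed_form_bound = 2.0 * d * np.log(1.0 + n * L**2 / (d * lam))
    return total, logdet_bound, closed_form_bound


# Illustrative data: 500 random actions in 5 dimensions, clipped to the unit ball.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))

total, logdet_bound, closed_form_bound = elliptical_potential(X)
print(total, logdet_bound, closed_form_bound)
```

The three printed values should appear in non-decreasing order. The sketch covers only the deterministic Gaussian-case quantity; according to the abstract, the general lemma has to handle posterior covariances that have no closed form and depend on the observed rewards, which this code does not attempt to model.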

Preprint, 2022, arXiv, Open access
