Researcher profile

Lucas Baudin

Lucas Baudin contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
1topics
1close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

Best-Response Dynamics and Fictitious Play in Identical-Interest and Zero-Sum Stochastic Games

This paper combines ideas from Q-learning and fictitious play to define three reinforcement learning procedures which converge to the set of stationary mixed Nash equilibria in identical interest discounted stochastic games. First, we analyse three continuous-time systems that generalize the best-response dynamics defined by Leslie et al. for zero-sum discounted stochastic games. Under some assumptions depending on the system, the dynamics are shown to converge to the set of stationary equilibria in identical interest discounted stochastic games. Then, we introduce three analog discrete-time procedures in the spirit of Sayin et al. and demonstrate their convergence to the set of stationary equilibria using our results in continuous time together with stochastic approximation techniques. Some numerical experiments complement our theoretical findings.

preprint2022arXiv

Smooth Fictitious Play in Stochastic Games with Perturbed Payoffs and Unknown Transitions

Recent extensions to dynamic games of the well-known fictitious play learning procedure in static games were proved to globally converge to stationary Nash equilibria in two important classes of dynamic games (zero-sum and identical-interest discounted stochastic games). However, those decentralized algorithms need the players to know exactly the model (the transition probabilities and their payoffs at every stage). To overcome these strong assumptions, our paper introduces regularizations of the systems in (Leslie 2020; Baudin 2022) to construct a family of new decentralized learning algorithms which are model-free (players don't know the transitions and their payoffs are perturbed at every stage). Our procedures can be seen as extensions to stochastic games of the classical smooth fictitious play learning procedures in static games (where the players best responses are regularized, thanks to a smooth strictly concave perturbation of their payoff functions). We prove the convergence of our family of procedures to stationary regularized Nash equilibria in zero-sum and identical-interest discounted stochastic games. The proof uses the continuous smooth best-response dynamics counterparts, and stochastic approximation methods. When there is only one player, our problem is an instance of Reinforcement Learning and our procedures are proved to globally converge to the optimal stationary policy of the regularized MDP. In that sense, they can be seen as an alternative to the well known Q-learning procedure.