Source author record

Xavier Venel

Xavier Venel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC · Computer Science and Game Theory · math.PR

Catalog footprint

What is connected

10 works

3 topics

4 close collaborators


preprint · 2020 · arXiv

Decomposition of games: some strategic considerations

Candogan et al. (2011) provide an orthogonal direct-sum decomposition of finite games into potential, harmonic and nonstrategic components. In this paper we study the issue of decomposing games that are strategically equivalent from a game-theoretical point of view, for instance games obtained via transformations such as duplications of strategies or positive affine mappings of payoffs. We show the need to define classes of decompositions to achieve commutativity of game transformations and decompositions.

preprint · 2020 · arXiv

History-dependent evaluations in POMDPs

We consider POMDPs in which the weight of the stage payoff depends on the past sequence of signals and actions occurring in the infinitely repeated problem. We prove that for all epsilon>0, there exists a strategy that is epsilon-optimal for any sequence of weights satisfying a property that can be interpreted as "the decision-maker is patient enough". This unifies and generalizes several results in the literature, and applies notably to POMDPs with limsup payoffs.

preprint · 2016 · arXiv

Commutative Stochastic Games

We are interested in the convergence of the value of n-stage games as n goes to infinity and the existence of the uniform value in stochastic games with a general set of states and finite sets of actions where the transition is commutative. This means that playing an action profile $a_1$ followed by an action profile $a_2$ leads to the same distribution on states as playing first the action profile $a_2$ and then $a_1$. For example, absorbing games can be reformulated as commutative stochastic games. When there is only one player and the transition function is deterministic, we show that the existence of a uniform value in pure strategies implies the existence of 0-optimal strategies. In the framework of two-player stochastic games, we study a class of games where the set of states is $\mathbb{R}^m$ and the transition is deterministic and 1-Lipschitz for the $L^1$-norm, and prove that these games have a uniform value. A similar proof shows the existence of an equilibrium in the non-zero-sum case. These results remain true if one considers a general model of finite repeated games, where the transition is commutative and the players observe the past actions but not the state.
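On a finite state space, the commutativity condition described above can be illustrated with transition matrices: if each action profile a has a row-stochastic matrix P_a, the condition amounts to P_a P_b = P_b P_a for every pair of profiles. The sketch below is a toy check under that finite-state assumption (the abstract itself allows a general set of states). The function names and the two example transition tables are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def matmul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_commutative(transitions, tol=1e-12):
    """Check that playing a then b gives the same state distribution as b then a.

    transitions: dict mapping each action profile to a row-stochastic matrix,
    where row s is the distribution over next states when starting from s.
    """
    for a, b in combinations(transitions, 2):
        ab = matmul(transitions[a], transitions[b])  # play a first, then b
        ba = matmul(transitions[b], transitions[a])  # play b first, then a
        n = len(ab)
        if any(abs(ab[i][j] - ba[i][j]) > tol
               for i in range(n) for j in range(n)):
            return False
    return True


# Hypothetical toy example: "stay" leaves the state unchanged, "quit" moves to
# state 1, which is absorbing under both actions.
commuting = {
    "stay": [[1.0, 0.0], [0.0, 1.0]],
    "quit": [[0.0, 1.0], [0.0, 1.0]],
}

# Hypothetical counterexample: "x" always sends the state to 1, "y" always to 0,
# so the final state depends on which action is played last.
non_commuting = {
    "x": [[0.0, 1.0], [0.0, 1.0]],
    "y": [[1.0, 0.0], [1.0, 0.0]],
}

print(is_commutative(commuting))
print(is_commutative(non_commuting))
```

The first table passes the check because the identity matrix commutes with every matrix; the second fails because the order of the two actions decides the final state.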

preprint · 2016 · arXiv

On values of repeated games with signals

We study the existence of different notions of value in two-person zero-sum repeated games where the state evolves and players receive signals. We provide some examples showing that the limsup value (and the uniform value) may not exist in general. Then we show the existence of the value for any Borel payoff function if the players observe a public signal including the actions played. We also prove two other positive results without assumptions on the signaling structure: the existence of the $\sup$ value in any game and the existence of the uniform value in recursive games with nonnegative payoffs.

preprint · 2015 · arXiv

Pathwise uniform value in gambling houses and Partially Observable Markov Decision Processes

In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a very robust notion of value for the infinitely repeated problem, namely the pathwise uniform value. This solves two open problems. First, this shows that for any epsilon>0, the decision-maker has a pure strategy sigma which is epsilon-optimal in any n-stage game, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, the strategy sigma can be chosen such that under the long-run average payoff criterion (expectation of the liminf of the average payoffs), the decision-maker has more than lim v(n)-epsilon.

preprint · 2015 · arXiv

Recursive games: Uniform value, Tauberian theorem and the Mertens conjecture "$Maxmin=\lim v_n=\lim v_λ$"

We study two-player zero-sum recursive games with a countable state space and finite action spaces at each state. When the family of $n$-stage values $\{v_n,n\geq 1\}$ is totally bounded for the uniform norm, we prove the existence of the uniform value. Together with a result in Rosenberg and Vieille (2000), we obtain a uniform Tauberian theorem for recursive games: $(v_n)$ converges uniformly if and only if $(v_λ)$ converges uniformly. We apply our main result to finite recursive games with signals (where players observe only signals on the state and on past actions). When the maximizer is more informed than the minimizer, we prove the Mertens conjecture $Maxmin=\lim_{n\to\infty} v_n=\lim_{λ\to 0}v_λ$. Finally, we deduce the existence of the uniform value in finite recursive games with symmetric information.

preprint · 2014 · arXiv

Attainability in Repeated Games with Vector Payoffs

We introduce the concept of attainable sets of payoffs in two-player repeated games with vector payoffs. A set of payoff vectors is called attainable if player 1 can ensure that there is a finite horizon $T$ such that after time $T$ the distance between the set and the cumulative payoff is arbitrarily small, regardless of what strategy player 2 is using. This paper focuses on the case where the attainable set consists of one payoff vector. In this case the vector is called an attainable vector. We study properties of the set of attainable vectors, and characterize when a specific vector is attainable and when every vector is attainable.

preprint · 2013 · arXiv

Existence of the uniform value in repeated games with a more informed controller

We prove that in a general zero-sum repeated game where the first player is more informed than the second player and controls the evolution of information on the state, the uniform value exists. This result extends previous results on Markov decision processes with partial observation (Rosenberg, Solan, Vieille 2002), and repeated games with an informed controller (Renault 2012). Our formal definition of a more informed player is more general than the inclusion of signals, thereby allowing for imperfect monitoring of actions. We construct an auxiliary stochastic game whose state space is the set of second order beliefs of player 2 (beliefs about beliefs of player 1 on the true state variable of the initial game) with perfect monitoring and we prove it has a value by using a result of Renault 2012. A key element in this work is to prove that player 1 can use strategies of the auxiliary game in the initial game in our general framework, which allows us to deduce that the value of the auxiliary game is also the value of our initial repeated game by using classical arguments.

preprint · 2012 · arXiv

A distance for probability spaces, and long-term values in Markov Decision Processes and Repeated Games

Given a finite set $K$, we denote by $X=Δ(K)$ the set of probabilities on $K$ and by $Z=Δ_f(X)$ the set of Borel probabilities on $X$ with finite support. Studying a Markov Decision Process with partial information on $K$ naturally leads to a Markov Decision Process with full information on $X$. We introduce a new metric $d_*$ on $Z$ such that the transitions become 1-Lipschitz from $(X, \|.\|_1)$ to $(Z,d_*)$. In the first part of the article, we define and prove several properties of the metric $d_*$. Notably, $d_*$ satisfies a Kantorovich-Rubinstein type duality formula and can be characterized by using disintegrations. In the second part, we characterize the limit values in several classes of "compact non expansive" Markov Decision Processes. In particular, we use the metric $d_*$ to characterize the limit value in Partial Observation MDP with finitely many states and in Repeated Games with an informed controller with finite sets of states and actions. Moreover, in each case we can prove the existence of a generalized notion of uniform value where we consider not only the Cesàro mean when the number of stages is large enough but any evaluation function $θ\in Δ(\mathbb{N}^*)$ when the impatience $I(θ)=\sum_{t\geq 1} |θ_{t+1}-θ_t|$ is small enough.
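The impatience $I(θ)$ defined at the end of this abstract is easy to compute for an evaluation with finite support. The sketch below is only an illustration of that formula: the function name `impatience` and the two example evaluations are hypothetical, and weights beyond the end of the list are assumed to be zero. For the Cesàro mean over n stages, every difference vanishes except the final drop from 1/n to 0, so the impatience is 1/n and shrinks as n grows.

```python
def impatience(theta):
    """Return I(theta) = sum over t >= 1 of |theta_{t+1} - theta_t|.

    theta: list of stage weights (theta_1, ..., theta_T); weights after
    stage T are assumed to be zero.
    """
    padded = list(theta) + [0.0]  # append theta_{T+1} = 0
    return sum(abs(padded[t + 1] - padded[t]) for t in range(len(theta)))


# Cesaro mean over 4 stages: equal weight 1/4 on each of the first 4 stages.
cesaro_4 = [0.25, 0.25, 0.25, 0.25]

# An evaluation that puts all its weight on the first stage.
first_stage_only = [1.0]

print(impatience(cesaro_4))          # 0.25
print(impatience(first_stage_only))  # 1.0
```

The second evaluation has impatience 1, much larger than that of a long Cesàro mean, which matches the reading of small $I(θ)$ as a patient decision-maker.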

preprint · 2010 · arXiv

Asymptotic Properties of Optimal Trajectories in Dynamic Programming

We prove in a dynamic programming framework that uniform convergence of the finite horizon values implies that asymptotically the average accumulated payoff is constant on optimal trajectories. We analyze and discuss several possible extensions to two-person games.
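As a rough illustration of the finite horizon values mentioned in this abstract, the sketch below computes the normalized n-stage value of a small deterministic dynamic programming problem by backward induction. The two-state problem, the data layout and the function name `n_stage_values` are hypothetical and are not taken from the paper; the sketch only shows how the average payoff over n stages is obtained and does not implement the paper's result on optimal trajectories.

```python
def n_stage_values(transitions, n):
    """Return the normalized n-stage value of each state, for n >= 1.

    transitions: dict mapping a state to a list of (reward, next_state) moves.
    The value is the best total payoff over n stages, divided by n.
    """
    total = {s: 0.0 for s in transitions}  # best total payoff with 0 stages left
    for _ in range(n):
        # One step of backward induction: best immediate reward plus
        # the best continuation payoff computed in the previous pass.
        total = {s: max(r + total[nxt] for r, nxt in moves)
                 for s, moves in transitions.items()}
    return {s: total[s] / n for s in transitions}


# Hypothetical toy problem: from A one can stay (payoff 0) or move to B
# (payoff 0); in B the only move is to stay, with payoff 1 per stage.
toy = {
    "A": [(0.0, "A"), (0.0, "B")],
    "B": [(1.0, "B")],
}

for n in (1, 4, 100):
    print(n, n_stage_values(toy, n))
```

In this toy problem the value from A is (n-1)/n, obtained by moving to B immediately, so the n-stage values converge to 1 from both states.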