Researcher profile

Asuman Ozdaglar

Asuman Ozdaglar contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
12works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2026arXiv

Computing Equilibrium beyond Unilateral Deviation

Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts that provide stability against multilateral deviations (\emph{e.g.}, strong Nash and coalition-proof equilibrium), these generally fail to exist. In this paper, we study an alternative solution concept that minimizes coalitional deviation incentives, rather than requiring them to vanish, and is therefore guaranteed to exist. Specifically, we focus on minimizing the average gain of a deviating coalition, and extend the framework to weighted-average and maximum-within-coalition gains. In contrast, the minimum-gain analogue is shown to be computationally intractable. For the average-gain and maximum-gain objectives, we prove a lower bound on the complexity of computing such an equilibrium and present an algorithm that matches this bound. Finally, we use our framework to solve the \emph{Exploitability Welfare Frontier} (EWF), the maximum attainable social welfare subject to a given exploitability (the maximum gain over all unilateral deviations).

preprint2022arXiv

Convergence Rate of Incremental Gradient and Newton Methods

The incremental gradient method is a prominent algorithm for minimizing a finite sum of smooth convex functions, used in many contexts including large-scale data processing applications and distributed optimization over networks. It is a first-order method that processes the functions one at a time based on their gradient information. The incremental Newton method, on the other hand, is a second-order variant which exploits additionally the curvature information of the underlying functions and can therefore be faster. In this paper, we focus on the case when the objective function is strongly convex and present fast convergence results for the incremental gradient and incremental Newton methods under the constant and diminishing stepsizes. For a decaying stepsize rule $α_k = Θ(1/k^s)$ with $s \in (0,1]$, we show that the distance of the IG iterates to the optimal solution converges at rate ${\cal O}(1/k^{s})$ (which translates into ${\cal O}(1/k^{2s})$ rate in the suboptimality of the objective value). For $s>1/2$, this improves the previous ${\cal O}(1/\sqrt{k})$ results in distances obtained for the case when functions are non-smooth. We show that to achieve the fastest ${\cal O}(1/k)$ rate, incremental gradient needs a stepsize that requires tuning to the strong convexity parameter whereas the incremental Newton method does not. The results are based on viewing the incremental gradient method as a gradient descent method with gradient errors, devising efficient upper bounds for the gradient error to derive inequalities that relate distances of the consecutive iterates to the optimal solution and finally applying Chung's lemmas from the stochastic approximation literature to these inequalities to determine their asymptotic behavior. In addition, we construct examples to show tightness of our rate results.

preprint2022arXiv

Fictitious Play in Markov Games with Single Controller

Certain but important classes of strategic-form games, including zero-sum and identical-interest games, have the fictitious-play-property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for the game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning -- a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result together with the recent results in [Sayin et al. 2020] establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative).

preprint2022arXiv

Fictitious play in zero-sum stochastic games

We present a novel variant of fictitious play dynamics combining classical fictitious play with Q-learning for stochastic games and analyze its convergence properties in two-player zero-sum stochastic games. Our dynamics involves players forming beliefs on the opponent strategy and their own continuation payoff (Q-function), and playing a greedy best response by using the estimated continuation payoffs. Players update their beliefs from observations of opponent actions. A key property of the learning dynamics is that update of the beliefs on Q-functions occurs at a slower timescale than update of the beliefs on strategies. We show both in the model-based and model-free cases (without knowledge of player payoff functions and state transition probabilities), the beliefs on strategies converge to a stationary mixed Nash equilibrium of the zero-sum stochastic game.

preprint2022arXiv

What is a Good Metric to Study Generalization of Minimax Learners?

Minimax optimization has served as the backbone of many machine learning (ML) problems. Although the convergence behavior of optimization algorithms has been extensively studied in the minimax settings, their generalization guarantees in stochastic minimax optimization problems, i.e., how the solution trained on empirical data performs on unseen testing data, have been relatively underexplored. A fundamental question remains elusive: What is a good metric to study generalization of minimax learners? In this paper, we aim to answer this question by first showing that primal risk, a universal metric to study generalization in minimization problems, which has also been adopted recently to study generalization in minimax ones, fails in simple examples. We thus propose a new metric to study generalization of minimax learners: the primal gap, defined as the difference between the primal risk and its minimum over all models, to circumvent the issues. Next, we derive generalization error bounds for the primal gap in nonconvex-concave settings. As byproducts of our analysis, we also solve two open questions: establishing generalization error bounds for primal risk and primal-dual risk, another existing metric that is only well-defined when the global saddle-point exists, in the strong sense, i.e., without strong concavity or assuming that the maximization and expectation can be interchanged, while either of these assumptions was needed in the literature. Finally, we leverage this new metric to compare the generalization behavior of two popular algorithms -- gradient descent-ascent (GDA) and gradient descent-max (GDMax) in stochastic minimax optimization.

preprint2022arXiv

Why Random Reshuffling Beats Stochastic Gradient Descent

We analyze the convergence rate of the random reshuffling (RR) method, which is a randomized first-order incremental algorithm for minimizing a finite sum of convex component functions. RR proceeds in cycles, picking a uniformly random order (permutation) and processing the component functions one at a time according to this order, i.e., at each cycle, each component function is sampled without replacement from the collection. Though RR has been numerically observed to outperform its with-replacement counterpart stochastic gradient descent (SGD), characterization of its convergence rate has been a long standing open question. In this paper, we answer this question by showing that when the component functions are quadratics or smooth and the sum function is strongly convex, RR with iterate averaging and a diminishing stepsize $α_k=Θ(1/k^s)$ for $s\in (1/2,1)$ converges at rate $Θ(1/k^{2s})$ with probability one in the suboptimality of the objective value, thus improving upon the $Ω(1/k)$ rate of SGD. Our analysis draws on the theory of Polyak-Ruppert averaging and relies on decoupling the dependent cycle gradient error into an independent term over cycles and another term dominated by $α_k^2$. This allows us to apply law of large numbers to an appropriately weighted version of the cycle gradient errors, where the weights depend on the stepsize. We also provide high probability convergence rate estimates that shows decay rate of different terms and allows us to propose a modification of RR with convergence rate ${\cal O}(\frac{1}{k^2})$.

preprint2020arXiv

An Optimal Multistage Stochastic Gradient Method for Minimax Problems

In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent (GDA) method with constant stepsize, and show that it converges to a neighborhood of the solution of the minimax problem. We further provide tight bounds on the convergence rate and the size of this neighborhood. Next, we propose a multistage variant of stochastic GDA (M-GDA) that runs in multiple stages with a particular learning rate decay schedule and converges to the exact solution of the minimax problem. We show M-GDA achieves the lower bounds in terms of noise dependence without any assumptions on the knowledge of noise characteristics. We also show that M-GDA obtains a linear decay rate with respect to the error's dependence on the initial error, although the dependence on condition number is suboptimal. In order to improve this dependence, we apply the multistage machinery to the stochastic Optimistic Gradient Descent Ascent (OGDA) algorithm and propose the M-OGDA algorithm which also achieves the optimal linear decay rate with respect to the initial error. To the best of our knowledge, this method is the first to simultaneously achieve the best dependence on noise characteristic as well as the initial error and condition number.

preprint2020arXiv

GANs May Have No Nash Equilibria

Generative adversarial networks (GANs) represent a zero-sum game between two machine players, a generator and a discriminator, designed to learn the distribution of data. While GANs have achieved state-of-the-art performance in several benchmark learning tasks, GAN minimax optimization still poses great theoretical and empirical challenges. GANs trained using first-order optimization methods commonly fail to converge to a stable solution where the players cannot improve their objective, i.e., the Nash equilibrium of the underlying game. Such issues raise the question of the existence of Nash equilibrium solutions in the GAN zero-sum game. In this work, we show through several theoretical and numerical results that indeed GAN zero-sum games may not have any local Nash equilibria. To characterize an equilibrium notion applicable to GANs, we consider the equilibrium of a new zero-sum game with an objective function given by a proximal operator applied to the original objective, a solution we call the proximal equilibrium. Unlike the Nash equilibrium, the proximal equilibrium captures the sequential nature of GANs, in which the generator moves first followed by the discriminator. We prove that the optimal generative model in Wasserstein GAN problems provides a proximal equilibrium. Inspired by these results, we propose a new approach, which we call proximal training, for solving GAN problems. We discuss several numerical experiments demonstrating the existence of proximal equilibrium solutions in GAN minimax problems.

preprint2020arXiv

Graphon games: A statistical framework for network games and interventions

In this paper, we present a unifying framework for analyzing equilibria and designing interventions for large network games sampled from a stochastic network formation process represented by a graphon. We first introduce a new class of infinite population games, termed graphon games, where a continuum of heterogeneous agents interact according to a graphon. After studying properties of equilibria in graphon games, we show that graphon equilibria can approximate equilibria of large network games sampled from the graphon. We next show that, under some regularity assumptions, the graphon approach enables the design of asymptotically optimal interventions via the solution of an optimization problem with much lower dimension than the one based on the entire network structure. We illustrate our framework on a synthetic dataset of rural villages and show that the graphon intervention can be computed efficiently and based solely on aggregated relational data.

preprint2020arXiv

Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems

In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG converge at a rate of $O(1/T)$ (Nemirovski, 2004). In this paper, we show that the last iterate of EG converges at a rate of $O(1/\sqrt{T})$. To the best of our knowledge, this is the first paper to provide a convergence rate guarantee for the last iterate of EG for the smooth convex-concave saddle point problem. Moreover, we show that this rate is tight by proving a lower bound of $Ω(1/\sqrt{T})$ for the last iterate. This lower bound therefore shows a quadratic separation of the convergence rates of ergodic and last iterates in smooth convex-concave saddle point problems.

preprint2020arXiv

On the Convergence Theory of Gradient-Based Model-Agnostic Meta-Learning Algorithms

We study the convergence of a class of gradient-based Model-Agnostic Meta-Learning (MAML) methods and characterize their overall complexity as well as their best achievable accuracy in terms of gradient norm for nonconvex loss functions. We start with the MAML method and its first-order approximation (FO-MAML) and highlight the challenges that emerge in their analysis. By overcoming these challenges not only we provide the first theoretical guarantees for MAML and FO-MAML in nonconvex settings, but also we answer some of the unanswered questions for the implementation of these algorithms including how to choose their learning rate and the batch size for both tasks and datasets corresponding to tasks. In particular, we show that MAML can find an $ε$-first-order stationary point ($ε$-FOSP) for any positive $ε$ after at most $\mathcal{O}(1/ε^2)$ iterations at the expense of requiring second-order information. We also show that FO-MAML which ignores the second-order information required in the update of MAML cannot achieve any small desired level of accuracy, i.e., FO-MAML cannot find an $ε$-FOSP for any $ε>0$. We further propose a new variant of the MAML algorithm called Hessian-free MAML which preserves all theoretical guarantees of MAML, without requiring access to second-order information.

preprint2020arXiv

Optimal dynamic information provision in traffic routing

We consider a two-road dynamic routing game where the state of one of the roads (the "risky road") is stochastic and may change over time. This generates room for experimentation. A central planner may wish to induce some of the (finite number of atomic) agents to use the risky road even when the expected cost of travel there is high in order to obtain accurate information about the state of the road. Since agents are strategic, we show that in order to generate incentives for experimentation the central planner however needs to limit the number of agents using the risky road when the expected cost of travel on the risky road is low. In particular, because of congestion, too much use of the risky road when the state is favorable would make experimentation no longer incentive compatible. We characterize the optimal incentive compatible recommendation system, first in a two-stage game and then in an infinite-horizon setting. In both cases, this system induces only partial, rather than full, information sharing among the agents (otherwise there would be too much exploitation of the risky road when costs there are low).