Researcher profile

Kai Cui

Kai Cui contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
12topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2022arXiv

A Survey on Large-Population Systems and Scalable Multi-Agent Reinforcement Learning

The analysis and control of large-population systems is of great interest to diverse areas of research and engineering, ranging from epidemiology over robotic swarms to economics and finance. An increasingly popular and effective approach to realizing sequential decision-making in multi-agent systems is through multi-agent reinforcement learning, as it allows for an automatic and model-free analysis of highly complex systems. However, the key issue of scalability complicates the design of control and reinforcement learning algorithms particularly in systems with large populations of agents. While reinforcement learning has found resounding empirical success in many scenarios with few agents, problems with many agents quickly become intractable and necessitate special consideration. In this survey, we will shed light on current approaches to tractably understanding and analyzing large-population systems, both through multi-agent reinforcement learning and through adjacent areas of research such as mean-field games, collective intelligence, or complex network theory. These classically independent subject areas offer a variety of approaches to understanding or modeling large-population systems, which may be of great use for the formulation of tractable MARL algorithms in the future. Finally, we survey potential areas of application for large-scale control and identify fruitful future applications of learning algorithms in practical systems. We hope that our survey could provide insight and future directions to junior and senior researchers in theoretical and applied sciences alike.

preprint2022arXiv

Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning

The recent mean field game (MFG) formalism facilitates otherwise intractable computation of approximate Nash equilibria in many-agent settings. In this paper, we consider discrete-time finite MFGs subject to finite-horizon objectives. We show that all discrete-time finite MFGs with non-constant fixed point operators fail to be contractive as typically assumed in existing MFG literature, barring convergence via fixed point iteration. Instead, we incorporate entropy-regularization and Boltzmann policies into the fixed point iteration. As a result, we obtain provable convergence to approximate fixed points where existing methods fail, and reach the original goal of approximate Nash equilibria. All proposed methods are evaluated with respect to their exploitability, on both instructive examples with tractable exact solutions and high-dimensional problems where exact methods become intractable. In high-dimensional scenarios, we apply established deep reinforcement learning methods and empirically combine fictitious play with our approximations.

preprint2022arXiv

Learning Graphon Mean Field Games and Approximate Nash Equilibria

Recent advances at the intersection of dense large graph limits and mean field games have begun to enable the scalable analysis of a broad class of dynamical sequential games with large numbers of agents. So far, results have been largely limited to graphon mean field systems with continuous-time diffusive or jump dynamics, typically without control and with little focus on computational methods. We propose a novel discrete-time formulation for graphon mean field games as the limit of non-linear dense graph Markov games with weak interaction. On the theoretical side, we give extensive and rigorous existence and approximation properties of the graphon mean field solution in sufficiently large systems. On the practical side, we provide general learning schemes for graphon mean field equilibria by either introducing agent equivalence classes or reformulating the graphon mean field system as a classical mean field system. By repeatedly finding a regularized optimal control solution and its generated mean field, we successfully obtain plausible approximate Nash equilibria in otherwise infeasible large dense graph games with many agents. Empirically, we are able to demonstrate on a number of examples that the finite-agent behavior comes increasingly close to the mean field behavior for our computed equilibria as the graph or system size grows, verifying our theory. More generally, we successfully apply policy gradient reinforcement learning in conjunction with sequential Monte Carlo methods.

preprint2022arXiv

Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems

Recent years have seen a great increase in the capacity and parallel processing power of data centers and cloud services. To fully utilize the said distributed systems, optimal load balancing for parallel queuing architectures must be realized. Existing state-of-the-art solutions fail to consider the effect of communication delays on the behaviour of very large systems with many clients. In this work, we consider a multi-agent load balancing system, with delayed information, consisting of many clients (load balancers) and many parallel queues. In order to obtain a tractable solution, we model this system as a mean-field control problem with enlarged state-action space in discrete time through exact discretization. Subsequently, we apply policy gradient reinforcement learning algorithms to find an optimal load balancing solution. Here, the discrete-time system model incorporates a synchronization delay under which the queue state information is synchronously broadcasted and updated at all clients. We then provide theoretical performance guarantees for our methodology in large systems. Finally, using experiments, we prove that our approach is not only scalable but also shows good performance when compared to the state-of-the-art power-of-d variant of the Join-the-Shortest-Queue (JSQ) and other policies in the presence of synchronization delays.

preprint2022arXiv

Mean Field Games on Weighted and Directed Graphs via Colored Digraphons

The field of multi-agent reinforcement learning (MARL) has made considerable progress towards controlling challenging multi-agent systems by employing various learning methods. Numerous of these approaches focus on empirical and algorithmic aspects of the MARL problems and lack a rigorous theoretical foundation. Graphon mean field games (GMFGs) on the other hand provide a scalable and mathematically well-founded approach to learning problems that involve a large number of connected agents. In standard GMFGs, the connections between agents are undirected, unweighted and invariant over time. Our paper introduces colored digraphon mean field games (CDMFGs) which allow for weighted and directed links between agents that are also adaptive over time. Thus, CDMFGs are able to model more complex connections than standard GMFGs. Besides a rigorous theoretical analysis including both existence and convergence guarantees, we provide a learning scheme and illustrate our findings with an epidemics model and a model of the systemic risk in financial markets.

preprint2022arXiv

Motif-based mean-field approximation of interacting particles on clustered networks

Interacting particles on graphs are routinely used to study magnetic behaviour in physics, disease spread in epidemiology, and opinion dynamics in social sciences. The literature on mean-field approximations of such systems for large graphs is limited to cluster-free graphs for which standard approximations based on degrees and pairs are often reasonably accurate. Here, we propose a motif-based mean-field approximation that considers higher-order subgraph structures in large clustered graphs. Numerically, our equations agree with stochastic simulations where existing methods fail.

preprint2022arXiv

Nearest-Neighbor-based Collision Avoidance for Quadrotors via Reinforcement Learning

Collision avoidance algorithms are of central interest to many drone applications. In particular, decentralized approaches may be the key to enabling robust drone swarm solutions in cases where centralized communication becomes computationally prohibitive. In this work, we draw biological inspiration from flocks of starlings (Sturnus vulgaris) and apply the insight to end-to-end learned decentralized collision avoidance. More specifically, we propose a new, scalable observation model following a biomimetic nearest-neighbor information constraint that leads to fast learning and good collision avoidance behavior. By proposing a general reinforcement learning approach, we obtain an end-to-end learning-based approach to integrating collision avoidance with arbitrary tasks such as package collection and formation change. To validate the generality of this approach, we successfully apply our methodology through motion models of medium complexity, modeling momentum and nonetheless allowing direct application to real world quadrotors in conjunction with a standard PID controller. In contrast to prior works, we find that in our sufficiently rich motion model, nearest-neighbor information is indeed enough to learn effective collision avoidance behavior. Our learned policies are tested in simulation and subsequently transferred to real-world drones to validate their real-world applicability.

preprint2022arXiv

Optimal Offloading Strategies for Edge-Computing via Mean-Field Games and Control

The optimal offloading of tasks in heterogeneous edge-computing scenarios is of great practical interest, both in the selfish and fully cooperative setting. In practice, such systems are typically very large, rendering exact solutions in terms of cooperative optima or Nash equilibria intractable. For this purpose, we adopt a general mean-field formulation in order to solve the competitive and cooperative offloading problems in the limit of infinitely large systems. We give theoretical guarantees for the approximation properties of the limiting solution and solve the resulting mean-field problems numerically. Furthermore, we verify our solutions numerically and find that our approximations are accurate for systems with dozens of edge devices. As a result, we obtain a tractable approach to the design of offloading strategies in large edge-computing scenarios with many users.