Researcher profile

Xingru Chen

Xingru Chen contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified · Verification L1 · Unclaimed author
5 works
0 followers
2 topics
4 close collaborators

Published work

5 published items

preprint · 2026 · arXiv

Dynamics of Multi-Agent Actor-Critic Learning in Stochastic Games: from Multistability and Chaos to Stable Cooperation

Achieving robust coordination and cooperation is a central challenge in multi-agent reinforcement learning (MARL). Uncovering the mechanisms underlying such emergent behaviors calls for a dynamical understanding of learning processes. In this work, we investigate the dynamics of actor-critic agents in stochastic games, focusing on the impact of entropy regularization. By leveraging time-scale separation, we derive the system's evolution equations, which are then formally analyzed using dynamical systems theory. We find that in the constant-sum game of Matching Pennies, the system exhibits chaotic behavior. Entropy regularization mitigates this chaos and drives the dynamics toward convergence to fair cooperation. In contrast, in the general-sum game of the Prisoner's Dilemma, the system displays multistability. Interestingly, the three stable equilibria of the system correspond to the well-known ALLC (Always Cooperate), ALLD (Always Defect), and GRIM (Grim Trigger) strategies from evolutionary game theory (EGT). Entropy regularization strengthens system resilience by enlarging the basin of attraction of the cooperative equilibrium. Our findings reveal a close link between the mechanism of direct reciprocity in EGT and how cooperation emerges in MARL, offering insights for designing more robust and collaborative multi-agent systems.

preprint · 2023 · arXiv

Evolutionary Dynamics with Randomly Distributed Benevolent Individuals

Understanding the evolution of cooperation is pivotal in biology and social science. Public resource sharing is a common scenario in the real world. In our study, we explore the evolutionary dynamics of cooperation on a regular graph with degree $k$, introducing a third strategy, namely benevolence: benevolent individuals do not evolve over time but provide a fixed benefit to all their neighbors. We find that the presence of benevolent individuals can foster the development of cooperative behavior, and that it follows a simple rule: $b/c > k - p_S(k-1)$. Our results provide new insights into the evolution of cooperation in structured populations.
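The rule quoted in this abstract can be evaluated directly. The snippet below is only an illustrative sketch: the abstract does not define its symbols, so reading $b$ and $c$ as the benefit and cost of cooperation and $p_S$ as a fraction between 0 and 1 associated with the benevolent individuals is an assumption, and the function names are hypothetical.

```python
def benefit_to_cost_threshold(k: int, p_s: float) -> float:
    """Right-hand side of the quoted rule: k - p_S * (k - 1).

    k   -- degree of the regular graph (from the abstract)
    p_s -- the abstract's p_S; treated here as a fraction in [0, 1] (assumption)
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s is assumed to lie in [0, 1]")
    return k - p_s * (k - 1)


def rule_satisfied(b: float, c: float, k: int, p_s: float) -> bool:
    """Check the quoted condition b/c > k - p_S * (k - 1)."""
    if c <= 0:
        raise ValueError("c is assumed to be a positive cost")
    return b / c > benefit_to_cost_threshold(k, p_s)


if __name__ == "__main__":
    # Sweep p_S for a degree-4 graph to see how the threshold moves.
    for p_s in (0.0, 0.25, 0.5, 1.0):
        print(p_s, benefit_to_cost_threshold(4, p_s))
```

Under this reading, setting `p_s = 0` reduces the threshold to $k$, and the threshold decreases linearly toward 1 as `p_s` approaches 1.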

preprint · 2022 · arXiv

Highly coordinated nationwide massive travel restrictions are central to effective mitigation and control of COVID-19 outbreaks in China

COVID-19, the disease caused by the novel coronavirus SARS-CoV-2, has caused grave woes across the globe since it was first reported in the epicenter Wuhan, Hubei, China, in December 2019. The spread of COVID-19 in China has been successfully curtailed by massive travel restrictions that kept more than 900 million people housebound for more than two months after the lockdown of Wuhan on 23 January 2020, when other provinces in China followed suit. Here, we assess the impact of China's massive lockdowns and travel restrictions as reflected by the changes in mobility patterns before and during the lockdown period. We quantify the synchrony of mobility patterns across provinces and within provinces. Using these mobility data, we calibrate movement flow between provinces in combination with an epidemiological compartment model to quantify the effectiveness of lockdowns and reductions in disease transmission. Our analysis demonstrates that the onset and phase of local community transmission in other provinces depend on the cumulative population outflow received from the epicenter Hubei. As such, infections can propagate further into other interconnected places both near and far, thereby necessitating synchronous lockdowns. Moreover, our data-driven modeling analysis shows that lockdowns and the consequent reductions in mobility take a certain time to have an actual impact on slowing the spread and ultimately putting the epidemic in check. In spite of the vastly heterogeneous demographics and epidemiological characteristics across China, mobility data show that massive travel restrictions have been applied consistently via a top-down approach along with high levels of compliance from the bottom up.

preprint · 2022 · arXiv

Outlearning Extortioners by Fair-minded Unbending Strategies

Recent theory shows that extortioners taking advantage of the zero-determinant (ZD) strategy can unilaterally claim an unfair share of the payoffs in the Iterated Prisoner's Dilemma. It is thus suggested that against a fixed extortioner, any adapting co-player should be subdued into full cooperation as their best response. In contrast, recent experiments demonstrate that human players often choose not to accede to extortion out of concern for fairness, actually causing extortioners to suffer more loss than themselves. In light of this, here we reveal fair-minded strategies that are unbending to extortion such that any payoff-maximizing extortioner will ultimately concede in their own interest by offering a fair split in head-to-head matches. We find and characterize multiple general classes of such unbending strategies, including generous zero-determinant strategies and Win-Stay, Lose-Shift as particular examples. Against fixed unbending players, extortioners incur consequently increasing losses whenever they try to demand a more unfair share. Our analysis also points to the importance of payoff structure in determining the superiority of zero-determinant strategies and in particular their extortion ability. We show that an extortionate ZD player can even be outperformed by, for example, Win-Stay, Lose-Shift, if the total payoff of unilateral cooperation is smaller than that of mutual defection. Unbending strategies can be used to outlearn evolutionary extortioners and catalyze the evolution of Tit-for-Tat-like strategies out of ZD players. Our work has implications for promoting fairness and resisting extortion so as to uphold a just and cooperative society.

preprint · 2022 · arXiv

The Geometry of Zero-Determinant Strategies

The advent of Zero-Determinant (ZD) strategies has reshaped the study of reciprocity and cooperation in iterated Prisoner's Dilemma games. The ramifications of ZD strategies have been demonstrated through their ability to unilaterally enforce a linear relationship between their own average payoff and that of their co-player. Common practice conveniently represents this relationship by a straight line in the parametric plot of pairwise payoffs. Yet little attention has been paid to studying the actual geometry of the strategy space of all admissible ZD strategies. Here, our work offers intuitive geometric relationships between different classes of ZD strategies as well as nontrivial geometric interpretations of their specific parameterizations. Adaptive dynamics of ZD strategies further reveals the unforeseen connection between general ZD strategies and the so-called equalizers that can set any co-player's payoff to a fixed value. We show that the class of equalizers forming a hyperplane is the critical equilibrium manifold, only part of which is stable. The same hyperplane is also a separatrix of the cooperation-enhancing region where the optimum response is to increase cooperation for each of the four payoff outcomes. Our results shed light on the simple but elegant geometry of ZD strategies that was previously overlooked.
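The linear payoff relationship mentioned in this abstract can be checked numerically. The sketch below is not taken from the paper. It assumes the conventional payoff values (R, S, T, P) = (3, 0, 5, 1), the standard extortionate parameterization with extortion factor `chi` and scale `phi`, and hypothetical helper names. It uses exact rational arithmetic so that the relation $s_X - P = \chi (s_Y - P)$ can be tested without rounding error.

```python
from fractions import Fraction as F

# Assumed conventional Prisoner's Dilemma payoffs (not given on this page).
R, S, T, P = F(3), F(0), F(5), F(1)


def extortionate_zd(chi, phi):
    """Memory-one cooperation probabilities after outcomes (CC, CD, DC, DD),
    seen from the ZD player X, for the extortionate ZD family."""
    return (
        1 - phi * (chi - 1) * (R - P),
        1 - phi * ((P - S) + chi * (T - P)),
        phi * ((T - P) + chi * (P - S)),
        F(0),
    )


def stationary(p, q):
    """Stationary distribution over (CC, CD, DC, DD) for X playing p against Y
    playing q. Assumes the Markov chain has a unique stationary distribution."""
    qy = (q[0], q[2], q[1], q[3])  # Y sees outcomes CD and DC swapped
    m = [[p[i] * qy[i], p[i] * (1 - qy[i]),
          (1 - p[i]) * qy[i], (1 - p[i]) * (1 - qy[i])] for i in range(4)]
    # Three balance equations v M = v plus the normalization sum(v) = 1.
    a = [[m[j][i] - (1 if i == j else 0) for j in range(4)] + [F(0)]
         for i in range(3)]
    a.append([F(1)] * 4 + [F(1)])
    for col in range(4):  # Gauss-Jordan elimination in exact arithmetic
        piv = next(r for r in range(col, 4) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        a[col] = [x / a[col][col] for x in a[col]]
        for r in range(4):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][4] for i in range(4)]


def payoffs(p, q):
    """Long-run average payoffs (s_X, s_Y)."""
    v = stationary(p, q)
    s_x = sum(vi * u for vi, u in zip(v, (R, S, T, P)))
    s_y = sum(vi * u for vi, u in zip(v, (R, T, S, P)))
    return s_x, s_y


if __name__ == "__main__":
    chi = F(3)
    p = extortionate_zd(chi, F(1, 26))
    for q in [(F(1), F(1), F(1), F(1)),
              (F(9, 10), F(1, 5), F(7, 10), F(3, 10))]:
        s_x, s_y = payoffs(p, q)
        print(q, s_x - P == chi * (s_y - P))
```

Whatever memory-one strategy `q` the co-player uses, the pair of average payoffs should fall on the same straight line through the mutual-defection point, which is the line the abstract describes in the parametric plot of pairwise payoffs.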