Researcher profile

Wolfram Barfuss

Wolfram Barfuss contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
6 works
0 followers
5 topics
4 close collaborators


Published work

6 published items

preprint · 2026 · arXiv

Dynamics of Multi-Agent Actor-Critic Learning in Stochastic Games: from Multistability and Chaos to Stable Cooperation

Achieving robust coordination and cooperation is a central challenge in multi-agent reinforcement learning (MARL). Uncovering the mechanisms underlying such emergent behaviors calls for a dynamical understanding of learning processes. In this work, we investigate the dynamics of actor-critic agents in stochastic games, focusing on the impact of entropy regularization. By leveraging time-scale separation, we derive the system's evolution equations, which are then formally analyzed using dynamical systems theory. We find that in the constant-sum game of Matching Pennies, the system exhibits chaotic behavior. Entropy regularization mitigates this chaos and drives the dynamics toward convergence to fair cooperation. In contrast, in the general-sum game of the Prisoner's Dilemma, the system displays multistability. Interestingly, the three stable equilibria of the system correspond to the well-known ALLC (Always Cooperate), ALLD (Always Defect), and GRIM (Grim Trigger) strategies from evolutionary game theory (EGT). Entropy regularization strengthens system resilience by enlarging the basin of attraction of the cooperative equilibrium. Our findings reveal a close link between the mechanism of direct reciprocity in EGT and how cooperation emerges in MARL, offering insights for designing more robust and collaborative multi-agent systems.
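For readers unfamiliar with the three strategies named in this abstract, the following is a minimal sketch of ALLC, ALLD and GRIM in an iterated Prisoner's Dilemma. It is not the paper's actor-critic model: the payoff values, the function names and the `play` helper are illustrative assumptions, and the agents here follow fixed rules rather than learning.

```python
from typing import Callable, Dict, List, Tuple

# Assumed textbook payoffs (not taken from the abstract):
# temptation 5 > reward 3 > punishment 1 > sucker 0.
PAYOFF: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# A strategy maps (own history, opponent history) to the next move.
Strategy = Callable[[List[str], List[str]], str]


def allc(own: List[str], other: List[str]) -> str:
    """Always Cooperate."""
    return "C"


def alld(own: List[str], other: List[str]) -> str:
    """Always Defect."""
    return "D"


def grim(own: List[str], other: List[str]) -> str:
    """Grim Trigger: cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in other else "C"


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play a fixed number of rounds and return the cumulative payoffs (a, b)."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    for name_a, strat_a in [("ALLC", allc), ("ALLD", alld), ("GRIM", grim)]:
        for name_b, strat_b in [("ALLC", allc), ("ALLD", alld), ("GRIM", grim)]:
            print(name_a, "vs", name_b, play(strat_a, strat_b))
```

Under these assumed payoffs, GRIM sustains mutual cooperation with ALLC but is exploited by ALLD for only a single round, which is the intuition behind direct reciprocity that the abstract refers to.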

preprint · 2022 · arXiv

Conceptualizing World-Earth System resilience: Exploring transformation pathways towards a safe and just operating space for humanity

We develop a framework within which to conceptualize World-Earth System resilience. Our notion of World-Earth System resilience emphasizes the need to move beyond the basin of attraction notion of resilience as we are not in a basin we can stay in. We are on a trajectory to a new basin and we have to avoid falling into undesirable basins. We thus focus on 'pathway resilience', i.e. the relative number of paths that allow us to move from the transitional operating space we occupy now as we leave the Holocene basin to a safe and just operating space in the Anthropocene. We develop a mathematical model to formalize this conceptualization and demonstrate how interactions between earth system resilience (biophysical processes) and world system resilience (social processes) impact pathway resilience. Our findings show that building earth system resilience is probably our only chance to reach a safe and just operating space. We also illustrate the importance of world system dynamics by showing how the notion of fairness coupled with regional inequality affects pathway resilience.

preprint · 2022 · arXiv

Modeling the effects of environmental and perceptual uncertainty using deterministic reinforcement learning dynamics with partial observability

Assessing the systemic effects of uncertainty that arises from agents' partial observation of the true states of the world is critical for understanding a wide range of scenarios. Yet, previous modeling work on agent learning and decision-making either lacks a systematic way to describe this source of uncertainty or puts the focus on obtaining optimal policies using complex models of the world that would impose an unrealistically high cognitive demand on real agents. In this work we aim to efficiently describe the emergent behavior of biologically plausible and parsimonious learning agents faced with partially observable worlds. Therefore we derive and present deterministic reinforcement learning dynamics where the agents observe the true state of the environment only partially. We showcase the broad applicability of our dynamics across different classes of partially observable agent-environment systems. We find that partial observability creates unintuitive benefits in a number of specific contexts, pointing the way to further research on a general understanding of such effects. For instance, partially observant agents can learn better outcomes faster, in a more stable way and even overcome social dilemmas. Furthermore, our method allows the application of dynamical systems theory to partially observable multiagent learning. In this regard we find the emergence of catastrophic limit cycles, a critical slowing down of the learning processes between reward regimes and the separation of the learning dynamics into fast and slow directions, all caused by partial observability. Therefore, the presented dynamics have the potential to become a formal, yet practical, lightweight and robust tool for researchers in biology, social science and machine learning to systematically investigate the effects of interacting partially observant agents.

preprint · 2019 · arXiv

Deep reinforcement learning in World-Earth system models to discover sustainable management strategies

Increasingly complex, non-linear World-Earth system models are used for describing the dynamics of the biophysical Earth system and the socio-economic and socio-cultural World of human societies and their interactions. Identifying pathways towards a sustainable future in these models for informing policy makers and the wider public, e.g. pathways leading to a robust mitigation of dangerous anthropogenic climate change, is a challenging and widely investigated task in the field of climate research and broader Earth system science. This problem is particularly difficult when constraints on avoiding transgressions of planetary boundaries and social foundations need to be taken into account. In this work, we propose to combine recently developed machine learning techniques, namely deep reinforcement learning (DRL), with classical analysis of trajectories in the World-Earth system. Based on the concept of the agent-environment interface, we develop an agent that is generally able to act and learn in variable manageable environment models of the Earth system. We demonstrate the potential of our framework by applying DRL algorithms to two stylized World-Earth system models. Conceptually, we explore thereby the feasibility of finding novel global governance policies leading into a safe and just operating space constrained by certain planetary and socio-economic boundaries. The artificially intelligent agent learns that the timing of a specific mix of taxing carbon emissions and subsidies on renewables is of crucial relevance for finding World-Earth system trajectories that are sustainable in the long term.

preprint · 2019 · arXiv

Earth system modeling with endogenous and dynamic human societies: the copan:CORE open World-Earth modeling framework

Analysis of Earth system dynamics in the Anthropocene requires explicitly taking into account the increasing magnitude of processes operating in human societies, their cultures, economies and technosphere and their growing feedback entanglement with those in the physical, chemical and biological systems of the planet. However, current state-of-the-art Earth System Models do not represent dynamic human societies and their feedback interactions with the biogeophysical Earth system, and macroeconomic Integrated Assessment Models typically do so only with limited scope. This paper (i) proposes design principles for constructing World-Earth Models (WEM) for Earth system analysis of the Anthropocene, i.e., models of social (World) - ecological (Earth) co-evolution on up to planetary scales, and (ii) presents the copan:CORE open simulation modeling framework for developing, composing and analyzing such WEMs based on the proposed principles. The framework provides a modular structure to flexibly construct and study WEMs. These can contain biophysical (e.g. carbon cycle dynamics), socio-metabolic/economic (e.g. economic growth) and socio-cultural processes (e.g. voting on climate policies or changing social norms) and their feedback interactions, and are based on elementary entity types, e.g., grid cells and social systems. Thereby, copan:CORE enables the epistemic flexibility needed for contributions towards Earth system analysis of the Anthropocene given the large diversity of competing theories and methodologies used for describing socio-metabolic/economic and socio-cultural processes in the Earth system by various fields and schools of thought. To illustrate the capabilities of the framework, we present an exemplary and highly stylized WEM implemented in copan:CORE that illustrates how endogenizing socio-cultural processes and feedbacks could fundamentally change macroscopic model outcomes.

preprint · 2019 · arXiv

The physics of governance networks: critical transitions in contagion dynamics on multilayer adaptive networks with application to the sustainable use of renewable resources

Adaptive networks are a versatile approach to model phenomena such as contagion and spreading dynamics, critical transitions and structure formation that emerge from the dynamic coevolution of complex network structure and node states. Here, we study critical transitions in contagion dynamics on multilayer adaptive networks with dynamic node states and present an application to the governance of sustainable resource use. We focus on a three-layer adaptive network model, where a polycentric governance network interacts with a social network of resource users which in turn interacts with an ecological network of renewable resources. We uncover that sustainability is favored for slow interaction timescales, large homophilic network adaptation rate (as long as it is below the fragmentation threshold) and high taxation rates. Interestingly, we also observe a trade-off between an eco-dictatorship (reduced model with a single governance actor that always taxes unsustainable resource use) and the polycentric governance network of multiple actors. In the latter setup, sustainability is enhanced for low but hindered for high tax rates compared to the eco-dictatorship case. These results highlight mechanisms generating emergent critical transitions in contagion dynamics on multilayer adaptive networks and show how these can be understood and approximated analytically, relevant for understanding complex adaptive systems from various disciplines ranging from physics and epidemiology to sociology and global sustainability science. The paper also provides insights into potential critical intervention points for policy in the form of taxes in the governance of sustainable renewable resource use that can inform more process-detailed social-ecological modeling.