Researcher profile

Thomas Parr

Thomas Parr contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
13works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

13 published item(s)

preprint2022arXiv

Action and Perception as Divergence Minimization

To learn directed behaviors in complex environments, intelligent agents need to optimize objective functions. Various objectives are known for designing artificial agents, including task rewards and intrinsic motivation. However, it is unclear how the known objectives relate to each other, which objectives remain yet to be discovered, and which objectives better describe the behavior of humans. We introduce the Action Perception Divergence (APD), an approach for categorizing the space of possible objective functions for embodied agents. We show a spectrum that reaches from narrow to general objectives. While the narrow objectives correspond to domain-specific rewards as typical in reinforcement learning, the general objectives maximize information with the environment through latent variable models of input sequences. Intuitively, these agents use perception to align their beliefs with the world and use actions to align the world with their beliefs. They infer representations that are informative of past inputs, explore future inputs that are informative of their representations, and select actions or skills that maximally influence future inputs. This explains a wide range of unsupervised objectives from a single principle, including representation learning, information gain, empowerment, and skill discovery. Our findings suggest leveraging powerful world models for unsupervised exploration as a path toward highly adaptive agents that seek out large niches in their environments, rendering task rewards optional.

preprint2022arXiv

Reclaiming saliency: rhythmic precision-modulated action and perception

Computational models of visual attention in artificial intelligence and robotics have been inspired by the concept of a saliency map. These models account for the mutual information between the (current) visual information and its estimated causes. However, they fail to consider the circular causality between perception and action. In other words, they do not consider where to sample next, given current beliefs. Here, we reclaim salience as an active inference process that relies on two basic principles: uncertainty minimisation and rhythmic scheduling. For this, we make a distinction between attention and salience. Briefly, we associate attention with precision control, i.e., the confidence with which beliefs can be updated given sampled sensory data, and salience with uncertainty minimisation that underwrites the selection of future sensory data. Using this, we propose a new account of attention based on rhythmic precision-modulation and discuss its potential in robotics, providing numerical experiments that showcase advantages of precision-modulation for state and noise estimation, system identification and action selection for informative path planning.

preprint2022arXiv

Reward Maximisation through Discrete Active Inference

Active inference is a probabilistic framework for modelling the behaviour of biological and artificial agents, which derives from the principle of minimising free energy. In recent years, this framework has successfully been applied to a variety of situations where the goal was to maximise reward, offering comparable and sometimes superior performance to alternative approaches. In this paper, we clarify the connection between reward maximisation and active inference by demonstrating how and when active inference agents perform actions that are optimal for maximising reward. Precisely, we show the conditions under which active inference produces the optimal solution to the Bellman equation--a formulation that underlies several approaches to model-based reinforcement learning and control. On partially observed Markov decision processes, the standard active inference scheme can produce Bellman optimal actions for planning horizons of 1, but not beyond. In contrast, a recently developed recursive active inference scheme (sophisticated inference) can produce Bellman optimal actions on any finite temporal horizon. We append the analysis with a discussion of the broader relationship between active inference and reinforcement learning.

preprint2020arXiv

Active inference on discrete state-spaces: a synthesis

Active inference is a normative principle underwriting perception, action, planning, decision-making and learning in biological or artificial agents. From its inception, its associated process theory has grown to incorporate complex generative models, enabling simulation of a wide range of complex behaviours. Due to successive developments in active inference, it is often difficult to see how its underlying principle relates to process theories and practical implementation. In this paper, we try to bridge this gap by providing a complete mathematical synthesis of active inference on discrete state-space models. This technical summary provides an overview of the theory, derives neuronal dynamics from first principles and relates this dynamics to biological processes. Furthermore, this paper provides a fundamental building block needed to understand active inference for mixed generative models; allowing continuous sensations to inform discrete representations. This paper may be used as follows: to guide research towards outstanding challenges, a practical guide on how to implement active inference to simulate experimental behaviour, or a pointer towards various in-silico neurophysiological responses that may be used to make empirical predictions.

preprint2020arXiv

Active inference: demystified and compared

Active inference is a first principle account of how autonomous agents operate in dynamic, non-stationary environments. This problem is also considered in reinforcement learning (RL), but limited work exists on comparing the two approaches on the same discrete-state environments. In this paper, we provide: 1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in RL; 2) an explicit discrete-state comparison between active inference and RL on an OpenAI gym baseline. We begin by providing a condensed overview of the active inference literature, in particular viewing the various natural behaviors of active inference agents through the lens of RL. We show that by operating in a pure belief-based setting, active inference agents can carry out epistemic exploration, and account for uncertainty about their environment in a Bayes-optimal fashion. Furthermore, we show that the reliance on an explicit reward signal in RL is removed in active inference, where reward can simply be treated as another observation; even in the total absence of rewards, agent behaviors are learned through preference learning. We make these properties explicit by showing two scenarios in which active inference agents can infer behaviors in reward-free environments compared to both Q-learning and Bayesian model-based RL agents; by placing zero prior preferences over rewards and by learning the prior preferences over the observations corresponding to reward. We conclude by noting that this formalism can be applied to more complex settings if appropriate generative models can be formulated. In short, we aim to demystify the behavior of active inference agents by presenting an accessible discrete state-space and time formulation, and demonstrate these behaviors in a OpenAI gym environment, alongside RL agents.

preprint2020arXiv

Dynamic causal modelling of COVID-19

This technical report describes a dynamic causal model of the spread of coronavirus through a population. The model is based upon ensemble or population dynamics that generate outcomes, like new cases and deaths over time. The purpose of this model is to quantify the uncertainty that attends predictions of relevant outcomes. By assuming suitable conditional dependencies, one can model the effects of interventions (e.g., social distancing) and differences among populations (e.g., herd immunity) to predict what might happen in different circumstances. Technically, this model leverages state-of-the-art variational (Bayesian) model inversion and comparison procedures, originally developed to characterise the responses of neuronal ensembles to perturbations. Here, this modelling is applied to epidemiological populations to illustrate the kind of inferences that are supported and how the model per se can be optimised given timeseries data. Although the purpose of this paper is to describe a modelling protocol, the results illustrate some interesting perspectives on the current pandemic; for example, the nonlinear effects of herd immunity that speak to a self-organised mitigation process.

preprint2020arXiv

Dynamic causal modelling of immune heterogeneity

An interesting inference drawn by some Covid-19 epidemiological models is that there exists a proportion of the population who are not susceptible to infection -- even at the start of the current pandemic. This paper introduces a model of the immune response to a virus. This is based upon the same sort of mean-field dynamics as used in epidemiology. However, in place of the location, clinical status, and other attributes of people in an epidemiological model, we consider the state of a virus, B and T-lymphocytes, and the antibodies they generate. Our aim is to formalise some key hypotheses as to the mechanism of resistance. We present a series of simple simulations illustrating changes to the dynamics of the immune response under these hypotheses. These include attenuated viral cell entry, pre-existing cross-reactive humoral (antibody-mediated) immunity, and enhanced T-cell dependent immunity. Finally, we illustrate the potential application of this sort of model by illustrating variational inversion (using simulated data) of this model to illustrate its use in testing hypotheses. In principle, this furnishes a fast and efficient immunological assay--based on sequential serology--that provides a (i) quantitative measure of latent immunological responses and (ii) a Bayes optimal classification of the different kinds of immunological response (c.f., glucose tolerance tests used to test for insulin resistance). This may be especially useful in assessing SARS-CoV-2 vaccines.

preprint2020arXiv

Effective immunity and second waves: a dynamic causal modelling study

This technical report addresses a pressing issue in the trajectory of the coronavirus outbreak; namely, the rate at which effective immunity is lost following the first wave of the pandemic. This is a crucial epidemiological parameter that speaks to both the consequences of relaxing lockdown and the propensity for a second wave of infections. Using a dynamic causal model of reported cases and deaths from multiple countries, we evaluated the evidence models of progressively longer periods of immunity. The results speak to an effective population immunity of about three months that, under the model, defers any second wave for approximately six months in most countries. This may have implications for the window of opportunity for tracking and tracing, as well as for developing vaccination programmes, and other therapeutic interventions.

preprint2020arXiv

Markov Blankets in the Brain

Recent characterisations of self-organising systems depend upon the presence of a Markov blanket: a statistical boundary that mediates the interactions between what is inside of and outside of a system. We leverage this idea to provide an analysis of partitions in neuronal systems. This is applicable to brain architectures at multiple scales, enabling partitions into single neurons, brain regions, and brain-wide networks. This treatment is based upon the canonical micro-circuitry used in empirical studies of effective connectivity, so as to speak directly to practical applications. This depends upon the dynamic coupling between functional units, whose form recapitulates that of a Markov blanket at each level. The nuance afforded by partitioning neural systems in this way highlights certain limitations of modular perspectives of brain function that only consider a single level of description.

preprint2020arXiv

Parcels and particles: Markov blankets in the brain

At the inception of human brain mapping, two principles of functional anatomy underwrote most conceptions - and analyses - of distributed brain responses: namely functional segregation and integration. There are currently two main approaches to characterising functional integration. The first is a mechanistic modelling of connectomics in terms of directed effective connectivity that mediates neuronal message passing and dynamics on neuronal circuits. The second phenomenological approach usually characterises undirected functional connectivity (i.e., measurable correlations), in terms of intrinsic brain networks, self-organised criticality, dynamical instability, etc. This paper describes a treatment of effective connectivity that speaks to the emergence of intrinsic brain networks and critical dynamics. It is predicated on the notion of Markov blankets that play a fundamental role in the self-organisation of far from equilibrium systems. Using the apparatus of the renormalisation group, we show that much of the phenomenology found in network neuroscience is an emergent property of a particular partition of neuronal states, over progressively larger scales. As such, it offers a way of linking dynamics on directed graphs to the phenomenology of intrinsic brain networks.

preprint2020arXiv

Second waves, social distancing, and the spread of COVID-19 across America

We recently described a dynamic causal model of a COVID-19 outbreak within a single region. Here, we combine several of these (epidemic) models to create a (pandemic) model of viral spread among regions. Our focus is on a second wave of new cases that may result from loss of immunity--and the exchange of people between regions--and how mortality rates can be ameliorated under different strategic responses. In particular, we consider hard or soft social distancing strategies predicated on national (Federal) or regional (State) estimates of the prevalence of infection in the population. The modelling is demonstrated using timeseries of new cases and deaths from the United States to estimate the parameters of a factorial (compartmental) epidemiological model of each State and, crucially, coupling between States. Using Bayesian model reduction, we identify the effective connectivity between States that best explains the initial phases of the outbreak in the United States. Using the ensuing posterior parameter estimates, we then evaluate the likely outcomes of different policies in terms of mortality, working days lost due to lockdown and demands upon critical care. The provisional results of this modelling suggest that social distancing and loss of immunity are the two key factors that underwrite a return to endemic equilibrium.

preprint2020arXiv

Sophisticated Inference

Active inference offers a first principle account of sentient behaviour, from which special and important cases can be derived, e.g., reinforcement learning, active learning, Bayes optimal inference, Bayes optimal design, etc. Active inference resolves the exploitation-exploration dilemma in relation to prior preferences, by placing information gain on the same footing as reward or value. In brief, active inference replaces value functions with functionals of (Bayesian) beliefs, in the form of an expected (variational) free energy. In this paper, we consider a sophisticated kind of active inference, using a recursive form of expected free energy. Sophistication describes the degree to which an agent has beliefs about beliefs. We consider agents with beliefs about the counterfactual consequences of action for states of affairs and beliefs about those latent states. In other words, we move from simply considering beliefs about 'what would happen if I did that' to 'what would I believe about what would happen if I did that'. The recursive form of the free energy functional effectively implements a deep tree search over actions and outcomes in the future. Crucially, this search is over sequences of belief states, as opposed to states per se. We illustrate the competence of this scheme, using numerical simulations of deep decision problems.

preprint2020arXiv

Tracking and tracing in the UK: a dynamic causal modelling study

By equipping a previously reported dynamic causal model of COVID-19 with an isolation state, we modelled the effects of self-isolation consequent on tracking and tracing. Specifically, we included a quarantine or isolation state occupied by people who believe they might be infected but are asymptomatic, and only leave if they test negative. We recovered maximum posteriori estimates of the model parameters using time series of new cases, daily deaths, and tests for the UK. These parameters were used to simulate the trajectory of the outbreak in the UK over an 18-month period. Several clear-cut conclusions emerged from these simulations. For example, under plausible (graded) relaxations of social distancing, a rebound of infections within weeks is unlikely. The emergence of a later second wave depends almost exclusively on the rate at which we lose immunity, inherited from the first wave. There exists no testing strategy that can attenuate mortality rates, other than by deferring or delaying a second wave. A sufficiently powerful tracking and tracing policy--implemented at the time of writing (10th May 2020)--will defer any second wave beyond a time horizon of 18 months. Crucially, this deferment is within current testing capabilities (requiring an efficacy of tracing and tracking of about 20% of asymptomatic infected cases, with less than 50,000 tests per day). These conclusions are based upon a dynamic causal model for which we provide some construct and face validation, using a comparative analysis of the United Kingdom and Germany, supplemented with recent serological studies.