Researcher profile

Mike Steel

Mike Steel contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
10 works
0 followers
9 topics
4 close collaborators

Published work

10 published items

preprint · 2022 · arXiv

Counting and optimising maximum phylogenetic diversity sets

In conservation biology, phylogenetic diversity (PD) provides a way to quantify the impact of the current rapid extinction of species on the evolutionary 'Tree of Life'. This approach recognises that extinction not only removes species but also the branches of the tree on which unique features shared by the extinct species arose. In this paper, we investigate three questions that are relevant to PD. The first asks how many sets of species of given size $k$ preserve the maximum possible amount of PD in a given tree. The number of such maximum PD sets can be very large, even for moderate-sized phylogenies. We provide a combinatorial characterisation of maximum PD sets, focusing on the setting where the branch lengths are ultrametric (e.g. proportional to time). This leads to a polynomial-time algorithm for calculating the number of maximum PD sets of size $k$ by applying a generating function; we also investigate the types of tree shapes that harbour the most (or fewest) maximum PD sets of size $k$. Our second question concerns optimising a linear function on the species (regarded as leaves of the phylogenetic tree) across all the maximum PD sets of a given size. Using the characterisation result from the first question, we show how this optimisation problem can be solved in polynomial time, even though the number of maximum PD sets can grow exponentially. Our third question considers a dual problem: if $k$ species were to become extinct, then what is the largest possible loss of PD in the resulting tree? For this question, we describe a polynomial-time solution based on dynamic programming.
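
To make the quantity in this abstract concrete, here is a minimal Python sketch of PD on a rooted tree, taken as the total length of all branches lying on a path from the root to at least one chosen species. The toy tree, its branch lengths, and the function names are illustrative assumptions, and the brute-force enumeration is for intuition only: it is not the polynomial-time generating-function method the paper describes.

```python
import math
from itertools import combinations
from typing import Dict, Iterable, List, Sequence, Tuple

# child -> (parent, length of the branch joining child to parent).
# Hypothetical ultrametric toy tree; the root "r" has no entry.
Tree = Dict[str, Tuple[str, float]]

TOY_TREE: Tree = {
    "u": ("r", 1.0),
    "a": ("u", 2.0),
    "b": ("u", 2.0),
    "c": ("r", 3.0),
}
LEAVES = ["a", "b", "c"]


def phylogenetic_diversity(tree: Tree, species: Iterable[str]) -> float:
    """Total length of the branches on root-to-leaf paths of the chosen species."""
    covered = set()  # nodes whose branch to the parent is already counted
    for leaf in species:
        node = leaf
        # Walk towards the root, stopping early once a counted branch is reached.
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]
    return sum(tree[node][1] for node in covered)


def max_pd_sets(tree: Tree, leaves: Sequence[str], k: int) -> List[Tuple[str, ...]]:
    """Brute-force list of all size-k leaf sets attaining maximum PD (needs k <= len(leaves))."""
    scored = [(phylogenetic_diversity(tree, s), s) for s in combinations(leaves, k)]
    best = max(pd for pd, _ in scored)
    return [s for pd, s in scored if math.isclose(pd, best)]


if __name__ == "__main__":
    print(phylogenetic_diversity(TOY_TREE, ["a", "b"]))  # 5.0
    print(max_pd_sets(TOY_TREE, LEAVES, 2))  # [('a', 'c'), ('b', 'c')]
```

Even on this three-leaf tree there are two maximum PD sets of size 2, which hints at why counting them on larger trees calls for something better than enumeration.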

preprint · 2021 · arXiv

Combinatorial and stochastic properties of ranked tree-child networks

Tree-child networks are a recently described class of directed acyclic graphs that have risen to prominence in phylogenetics (the study of evolutionary trees and networks). Although these networks have a number of attractive mathematical properties, many combinatorial questions concerning them remain intractable. In this paper, we show that endowing these networks with a biologically relevant ranking structure yields mathematically tractable objects, which we term ranked tree-child networks (RTCNs). We explain how to derive exact and explicit combinatorial results concerning the enumeration and generation of these networks. We also explore probabilistic questions concerning the properties of RTCNs when they are sampled uniformly at random. These questions include the lengths of random walks between the root and leaves (both from the root to the leaves and from a leaf to the root); the distribution of the number of cherries in the network; and sampling RTCNs conditional on displaying a given tree. We also formulate a conjecture regarding the scaling limit of the process that counts the number of lineages in the ancestry of a leaf. The main idea in this paper, namely using ranking as a way to achieve combinatorial tractability, may also extend to other classes of networks.

preprint · 2020 · arXiv

Modeling a Cognitive Transition at the Origin of Cultural Evolution using Autocatalytic Networks

Autocatalytic networks have been used to model the emergence of self-organizing structure capable of sustaining life and undergoing biological evolution. Here, we model the emergence of cognitive structure capable of undergoing cultural evolution. Mental representations of knowledge and experiences play the role of catalytic molecules, and interactions amongst them (e.g., the forging of new associations) play the role of reactions, and result in representational redescription. The approach tags mental representations with their source, i.e., whether they were acquired through social learning, individual learning (of pre-existing information), or creative thought (resulting in the generation of new information). This makes it possible to model how cognitive structure emerges, and to trace lineages of cumulative culture step by step. We develop a formal representation of the cultural transition from Oldowan to Acheulean tool technology using Reflexively Autocatalytic and Food set generated (RAF) networks. Unlike more primitive Oldowan stone tools, the Acheulean hand axe required not only the capacity to envision and bring into being something that did not yet exist, but hierarchically structured thought and action, and the generation of new mental representations: the concepts EDGING, THINNING, SHAPING, and a meta-concept, HAND AXE. We show how this constituted a key transition towards the emergence of semantic networks that were self-organizing, self-sustaining, and autocatalytic, and discuss how such networks replicated through social interaction. The model provides a promising approach to unraveling one of the greatest anthropological mysteries: that of why development of the Acheulean hand axe was followed by over a million years of cultural stasis.

preprint · 2019 · arXiv

A class of phylogenetic networks reconstructable from ancestral profiles

Rooted phylogenetic networks provide an explicit representation of the evolutionary history of a set $X$ of sampled species. In contrast to phylogenetic trees which show only speciation events, networks can also accommodate reticulate processes (for example, hybrid evolution, endosymbiosis, and lateral gene transfer). A major goal in systematic biology is to infer evolutionary relationships, and while phylogenetic trees can be uniquely determined from various simple combinatorial data on $X$, for networks the reconstruction question is much more subtle. Here we ask when a network can be uniquely reconstructed from its 'ancestral profile' (the number of paths from each ancestral vertex to each element in $X$). We show that reconstruction holds (even within the class of all networks) for a class of networks we call 'orchard networks', and we provide a polynomial-time algorithm for reconstructing any orchard network from its ancestral profile. Our approach relies on establishing a structural theorem for orchard networks, which also provides for a fast (polynomial-time) algorithm to test if any given network is of orchard type. Since the class of orchard networks includes tree-sibling tree-consistent networks and tree-child networks, our results generalise reconstruction results from 2008 and 2009. Orchard networks allow for an unbounded number $k$ of reticulation vertices, in contrast to tree-sibling tree-consistent networks and tree-child networks for which $k$ is at most $2|X|-4$ and $|X|-1$, respectively.
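
As an illustration of the data this abstract starts from, the following Python sketch computes path counts in a small rooted network, following the abstract's parenthetical description of an ancestral profile (the number of paths from each ancestral vertex to each element of $X$). The toy network, vertex names, and dictionary layout are assumptions made for the example; the paper's exact formalisation may differ, and the sketch does not attempt the reconstruction algorithm itself.

```python
from functools import lru_cache
from typing import Dict, List

# Hypothetical rooted network as vertex -> list of children.
# "h" is a reticulation vertex (two parents: "u" and "v"); a, b, c are leaves.
NETWORK: Dict[str, List[str]] = {
    "r": ["u", "v"],
    "u": ["a", "h"],
    "v": ["h", "c"],
    "h": ["b"],
    "a": [],
    "b": [],
    "c": [],
}


def ancestral_profile(network: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """For every non-leaf vertex, count the directed paths to each leaf."""
    leaves = sorted(v for v, children in network.items() if not children)

    @lru_cache(maxsize=None)
    def paths(vertex: str, leaf: str) -> int:
        # A vertex reaches itself by exactly one (empty) path; otherwise
        # every path must start by stepping to one of the children.
        if vertex == leaf:
            return 1
        return sum(paths(child, leaf) for child in network[vertex])

    return {
        v: {x: paths(v, x) for x in leaves}
        for v, children in network.items()
        if children
    }


if __name__ == "__main__":
    for vertex, counts in ancestral_profile(NETWORK).items():
        print(vertex, counts)
```

In this toy network the root reaches leaf b by two paths (through u and through v), which is exactly the kind of information a tree cannot carry and a reticulation produces.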

preprint · 2012 · arXiv

Autocatalytic Sets Extended: Dynamics, Inhibition, and a Generalization

Background: Autocatalytic sets are often considered a necessary (but not sufficient) condition for the origin and early evolution of life. Although the idea of autocatalytic sets was already conceived of many years ago, only recently have they gained more interest, following advances in creating them experimentally in the laboratory. In our own work, we have studied autocatalytic sets extensively from a computational and theoretical point of view. Results: We present results from an initial study of the dynamics of self-sustaining autocatalytic sets (RAFs). In particular, simulations of molecular flow on autocatalytic sets are performed, to illustrate the kinds of dynamics that can occur. Next, we present an extension of our (previously introduced) algorithm for finding autocatalytic sets in general reaction networks, which can also handle inhibition. We show that in this case detecting autocatalytic sets is fixed-parameter tractable. Finally, we formulate a generalized version of the algorithm that can also be applied outside the context of chemistry and origin of life, which we illustrate with a toy example from economics. Conclusions: Having shown theoretically (in previous work) that autocatalytic sets are highly likely to exist, we conclude here that also in terms of dynamics such sets are viable and outcompete non-autocatalytic sets. Furthermore, our dynamical results confirm arguments made earlier about how autocatalytic subsets can enable their own growth or give rise to other such subsets coming into existence. Finally, our algorithmic extension and generalization show that more realistic scenarios (e.g., including inhibition) can also be dealt with within our framework, and that it can even be applied to areas outside of chemistry, such as economics.
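
For readers unfamiliar with how autocatalytic (RAF) sets are detected, the Python sketch below shows the commonly described reduction procedure: repeatedly compute which molecules can be built from the food set, then discard any reaction that lacks a reactant or has no available catalyst. It is offered as an assumption about the general shape of the basic algorithm, not as the authors' code, and it deliberately omits the inhibition-aware extension this abstract introduces. The `Reaction` type, molecule names, and toy network are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, NamedTuple, Set


class Reaction(NamedTuple):
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def closure(food: Iterable[str], reactions: Iterable[Reaction]) -> Set[str]:
    """Molecules obtainable from the food set using the given reactions.

    Catalysis is ignored here: a reaction fires once all reactants are available.
    """
    reactions = list(reactions)
    available = set(food)
    changed = True
    while changed:
        changed = False
        for rxn in reactions:
            if rxn.reactants <= available and not rxn.products <= available:
                available |= rxn.products
                changed = True
    return available


def max_raf(food: Iterable[str], reactions: Dict[str, Reaction]) -> Dict[str, Reaction]:
    """Prune reactions until every survivor has its reactants and a catalyst available."""
    food = set(food)
    current = dict(reactions)
    while True:
        available = closure(food, current.values())
        kept = {
            name: rxn
            for name, rxn in current.items()
            if rxn.reactants <= available and rxn.catalysts & available
        }
        if len(kept) == len(current):
            return kept  # fixed point: possibly empty, meaning no RAF exists
        current = kept


# Hypothetical toy network over food molecules "a" and "b".
FOOD = {"a", "b"}
REACTIONS = {
    "r1": Reaction(frozenset({"a", "b"}), frozenset({"ab"}), frozenset({"ab"})),
    "r2": Reaction(frozenset({"ab", "c"}), frozenset({"abc"}), frozenset({"a"})),
    "r3": Reaction(frozenset({"a", "ab"}), frozenset({"aab"}), frozenset({"abc"})),
}

if __name__ == "__main__":
    print(sorted(max_raf(FOOD, REACTIONS)))  # ['r1']
```

Here r2 is dropped because its reactant c can never be produced, and r3 is dropped because its only catalyst abc depends on r2, leaving the self-catalysing reaction r1 as the surviving set.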

preprint · 2012 · arXiv

The Structure of Autocatalytic Sets: Evolvability, Enablement, and Emergence

This paper presents new results from a detailed study of the structure of autocatalytic sets. We show how autocatalytic sets can be decomposed into smaller autocatalytic subsets, and how these subsets can be identified and classified. We then argue how this has important consequences for the evolvability, enablement, and emergence of autocatalytic sets. We end with some speculation on how all this might lead to a generalized theory of autocatalytic sets, which could possibly be applied to entire ecologies or even economies.

preprint · 2011 · arXiv

'Bureaucratic' set systems, and their role in phylogenetics

We say that a collection $\mathcal{C}$ of subsets of $X$ is bureaucratic if every maximal hierarchy on $X$ contained in $\mathcal{C}$ is also maximum. We characterise bureaucratic set systems and show how they arise in phylogenetics. This framework has several useful algorithmic consequences: we generalize some earlier results and derive a polynomial-time algorithm for a parsimony problem arising in phylogenetic networks.

preprint · 2011 · arXiv

The 'Butterfly effect' in Cayley graphs, and its relevance for evolutionary genomics

Suppose a finite set $X$ is repeatedly transformed by a sequence of permutations of a certain type acting on an initial element $x$ to produce a final state $y$. We investigate how 'different' the resulting state $y'$ can be from $y$ if a slight change is made to the sequence, either by deleting one permutation, or replacing it with another. Here the 'difference' between $y$ and $y'$ might be measured by the minimum number of permutations of the permitted type required to transform $y$ to $y'$, or by some other metric. We discuss this first in the general setting of sensitivity to perturbation of walks in Cayley graphs of groups with a specified set of generators. We then investigate some permutation groups and generators arising in computational genomics, and the statistical implications of the findings.

preprint · 2010 · arXiv

Inferring ancestral sequences in taxon-rich phylogenies

Statistical consistency in phylogenetics has traditionally referred to the accuracy of estimating phylogenetic parameters for a fixed number of species as we increase the number of characters. However, as sequences are often of fixed length (e.g. for a gene) although we are often able to sample more taxa, it is useful to consider a dual type of statistical consistency where we increase the number of species, rather than characters. This raises some basic questions: what can we learn about the evolutionary process as we increase the number of species? In particular, does having more species allow us to infer the ancestral state of characters accurately? This question is particularly relevant when sequence site evolution varies in a complex way from character to character, as well as for reconstructing ancestral sequences. In this paper, we assemble a collection of results to analyse various approaches for inferring ancestral information with increasing accuracy as the number of taxa increases.

preprint · 2010 · arXiv

Locating a tree in a phylogenetic network

Phylogenetic trees and networks are leaf-labelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster Containment problem asks whether the given cluster is a cluster of some phylogenetic tree embedded in the network. Both problems are known to be NP-complete in general. In this article, we consider the restriction of these problems to several well-studied classes of phylogenetic networks. We show that Tree Containment is polynomial-time solvable for normal networks, for binary tree-child networks, and for level-$k$ networks. On the other hand, we show that, even for tree-sibling, time-consistent, regular networks, both Tree Containment and Cluster Containment remain NP-complete.