Researcher profile

Barbara R. Holland

Barbara R. Holland contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2020arXiv

The impracticalities of multiplicatively-closed codon models: a retreat to linear alternatives

A matrix Lie algebra is a linear space of matrices closed under the operation $ [A, B] = AB-BA $. The "Lie closure" of a set of matrices is the smallest matrix Lie algebra which contains the set. In the context of Markov chain theory, if a set of rate matrices form a Lie algebra, their corresponding Markov matrices are closed under matrix multiplication; this has been found to be a useful property in phylogenetics. Inspired by previous research involving Lie closures of DNA models, it was hypothesised that finding the Lie closure of a codon model could help to solve the problem of mis-estimation of the non-synonymous/synonymous rate ratio, $ ω$. We propose two different methods of finding a linear space from a model: the first is the \emph{linear closure} which is the smallest linear space which contains the model, and the second is the \emph{linear version} which changes multiplicative constraints in the model to additive ones. For each of these linear spaces we then find the Lie closures of them. Under both methods, it was found that closed codon models would require thousands of parameters, and that any partial solution to this problem that was of a reasonable size violated stochasticity. Investigation of toy models indicated that finding the Lie closure of matrix linear spaces which deviated only slightly from a simple model resulted in a Lie closure that was close to having the maximum number of parameters possible. Given that Lie closures are not practical, we propose further consideration of the two variants of linearly closed models.

preprint2012arXiv

A tensorial approach to the inversion of group-based phylogenetic models

Using a tensorial approach, we show how to construct a one-one correspondence between pattern probabilities and edge parameters for any group-based model. This is a generalisation of the "Hadamard conjugation" and is equivalent to standard results that use Fourier analysis. In our derivation we focus on the connections to group representation theory and emphasize that the inversion is possible because, under their usual definition, group-based models are defined for abelian groups only. We also argue that our approach is elementary in the sense that it can be understood as simple matrix multiplication where matrices are rectangular and indexed by ordered-partitions of varying sizes.

preprint2012arXiv

Low-parameter phylogenetic estimation under the general Markov model

In their 2008 and 2009 papers, Sumner and colleagues introduced the "squangles" - a small set of Markov invariants for phylogenetic quartets. The squangles are consistent with the general Markov model (GM) and can be used to infer quartets without the need to explicitly estimate all parameters. As GM is inhomogeneous and hence non-stationary, the squangles are expected to perform well compared to standard approaches when there are changes in base-composition amongst species. However, GM includes the IID assumption, so the squangles should be confounded by data generated with invariant sites or with rate-variation across sites. Here we implement the squangles in a least-squares setting that returns quartets weighted by either confidence or internal edge lengths; and use these as input into a variety of quartet-based supertree methods. For the first time, we quantitatively investigate the robustness of the squangles to the breaking of IID assumptions on both simulated and real data sets; and we suggest a modification that improves the performance of the squangles in the presence of invariant sites. Our conclusion is that the squangles provide a novel tool for phylogenetic estimation that is complementary to methods that explicitly account for rate-variation across sites, but rely on homogeneous - and hence stationary - models.

preprint2012arXiv

Novel Distances for Dollo Data

We investigate distances on binary (presence/absence) data in the context of a Dollo process, where a trait can only arise once on a phylogenetic tree but may be lost many times. We introduce a novel distance, the Additive Dollo Distance (ADD), which is consistent for data generated under a Dollo model, and show that it has some useful theoretical properties including an intriguing link to the LogDet distance. Simulations of Dollo data are used to compare a number of binary distances including ADD, LogDet, Nei Li and some simple, but to our knowledge previously unstudied, variations on common binary distances. The simulations suggest that ADD outperforms other distances on Dollo data. Interestingly, we found that the LogDet distance performs poorly in the context of a Dollo process, which may have implications for its use in connection with conditioned genome reconstruction. We apply the ADD to two Diversity Arrays Technology (DArT) datasets, one that broadly covers Eucalyptus species and one that focuses on the Eucalyptus series Adnataria. We also reanalyse gene family presence/absence data on bacteria from the COG database and compare the results to previous phylogenies estimated using the conditioned genome reconstruction approach.