Researcher profile

Gil McVean

Gil McVean contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
3 topics
4 close collaborators

Published work

5 published items

Preprint | 2016 | arXiv

Identifying lineage effects when controlling for population structure improves power in bacterial association studies

Bacteria pose unique challenges for genome-wide association studies (GWAS) because of strong structuring into distinct strains and substantial linkage disequilibrium across the genome. While methods developed for human studies can correct for strain structure, this risks considerable loss of power because genetic differences between strains often contribute substantial phenotypic variability. Here we propose a new method that captures lineage-level associations even when locus-specific associations cannot be fine-mapped. We demonstrate its ability to detect genes and genetic variants underlying resistance to 17 antimicrobials in 3144 isolates from four taxonomically diverse clonal and recombining bacteria: Mycobacterium tuberculosis, Staphylococcus aureus, Escherichia coli and Klebsiella pneumoniae. Strong selection, recombination and penetrance confer high power to recover known antimicrobial resistance mechanisms, and reveal a candidate association between the outer membrane porin nmpC and cefazolin resistance in E. coli. Hence our method pinpoints locus-specific effects where possible, and boosts power by detecting lineage-level differences when fine-mapping is intractable.

Preprint | 2015 | arXiv

Predictable patterns of CTL escape and reversion across host populations and viral subtypes in HIV-1 evolution

The twin processes of viral evolutionary escape and reversion in response to host immune pressure, in particular the cytotoxic T-lymphocyte (CTL) response, shape Human Immunodeficiency Virus-1 sequence evolution in infected host populations. The tempo of CTL escape and reversion is known to differ between CTL escape variants in a given host population. Here, we ask: are rates of escape and reversion comparable across infected host populations? For three cohorts taken from three continents, we estimate escape and reversion rates at 23 escape sites in optimally defined Gag epitopes. We find consistent escape rate estimates across the examined cohorts. Reversion rates are also consistent between a Canadian and South African infected host population. Certain Gag escape variants that incur a large replicative fitness cost are known to revert rapidly upon transmission. However, the relationship between escape/reversion rates and viral replicative capacity across a large number of epitopes has not been interrogated. We investigate this relationship by examining in vitro replicative capacities of viral sequences with minimal variation: point escape mutants induced in a lab strain. Remarkably, despite the complexities of epistatic effects exemplified by pathways to escape in famous epitopes, and the diversity of both hosts and viruses, CTL escape mutants which escape rapidly tend to be those with the highest replicative capacity when applied as a single point mutation. Similarly, mutants inducing the greatest costs to viral replicative capacity tend to revert more quickly. These data suggest that escape rates in Gag are consistent across host populations, and that in general these rates are dominated by site-specific effects upon viral replicative capacity.

Preprint | 2014 | arXiv

Demography and the age of rare variants

Large whole-genome sequencing projects have provided access to much of the rare variation in human populations, which is highly informative about population structure and recent demography. Here, we show how the age of rare variants can be estimated from patterns of haplotype sharing and how these ages can be related to historical relationships between populations. We investigate the distribution of the age of variants occurring exactly twice (f2 variants) in a worldwide sample sequenced by the 1000 Genomes Project, revealing enormous variation across populations. The median age of haplotypes carrying f2 variants is 50 to 160 generations across populations within Europe or Asia, and 170 to 320 generations within Africa. Haplotypes shared between continents are much older with median ages for haplotypes shared between Europe and Asia ranging from 320 to 670 generations. The distribution of the ages of f2 haplotypes is informative about their demography, revealing recent bottlenecks, ancient splits, and more modern connections between populations. We see the signature of selection in the observation that functional variants are significantly younger than nonfunctional variants of the same frequency. This approach is relatively insensitive to mutation rate and complements other nonparametric methods for demographic inference.
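As a small illustration of the f2 definition used in this abstract (variants occurring exactly twice in the sample), the sketch below scans a toy haplotype matrix and reports which sites are f2 and which haplotypes carry them. The function names, the 0/1 encoding of ancestral and derived alleles, and the toy data are assumptions made for illustration; this is not the authors' pipeline and it does not estimate variant ages.

```python
from typing import List


def find_f2_variants(haplotypes: List[List[int]]) -> List[int]:
    """Return indices of sites whose derived allele (coded 1) occurs exactly twice.

    `haplotypes` is a hypothetical matrix: one row per haplotype,
    one column per variant site, 0 = ancestral, 1 = derived.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    f2_sites = []
    for site in range(n_sites):
        derived_count = sum(hap[site] for hap in haplotypes)
        if derived_count == 2:
            f2_sites.append(site)
    return f2_sites


def f2_carriers(haplotypes: List[List[int]], site: int) -> List[int]:
    """Return the indices of the haplotypes carrying the derived allele at `site`."""
    return [i for i, hap in enumerate(haplotypes) if hap[site] == 1]


# Toy data: 4 haplotypes, 4 sites (illustrative only).
haplotypes = [
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]

f2_sites = find_f2_variants(haplotypes)
carriers = {site: f2_carriers(haplotypes, site) for site in f2_sites}
```

In this toy matrix only the second site is shared by exactly two haplotypes, so it is the single f2 variant; the pair of carriers is the unit whose shared haplotype length would then be examined.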

Preprint | 2014 | arXiv

Identifying recombination hotspots using population genetic data

Motivation: Recombination rates vary considerably at the fine scale within mammalian genomes, with the majority of recombination occurring within hotspots of ~2 kb in width. We present a method for inferring the location of recombination hotspots from patterns of linkage disequilibrium within samples of population genetic data. Results: Using simulations, we show that our method has hotspot detection power of approximately 50-60%, depending on the magnitude of the hotspot. The false positive rate is between 0.24 and 0.56 false positives per Mb for data typical of humans. Availability: http://github.com/auton1/LDhot
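The two figures quoted in this abstract, detection power and false positives per Mb, can be made concrete with a short sketch of how such metrics might be computed from simulated data. The function name, the interval representation, and the rule that a true hotspot counts as detected when any called interval overlaps it are assumptions for illustration; the paper may define a detection differently, and this is not the LDhot algorithm itself.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_bp, end_bp), half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def evaluate_hotspot_calls(
    true_hotspots: List[Interval],
    called_hotspots: List[Interval],
    region_length_bp: int,
) -> Tuple[float, float]:
    """Return (power, false_positives_per_mb) for one simulated region.

    Assumed rule: a true hotspot is detected if any call overlaps it,
    and a call is a false positive if it overlaps no true hotspot.
    """
    detected = sum(
        1 for t in true_hotspots if any(overlaps(t, c) for c in called_hotspots)
    )
    false_positives = sum(
        1 for c in called_hotspots if not any(overlaps(c, t) for t in true_hotspots)
    )
    power = detected / len(true_hotspots) if true_hotspots else 0.0
    fp_per_mb = false_positives / (region_length_bp / 1_000_000)
    return power, fp_per_mb


# Toy example: two simulated 2 kb hotspots in a 2 Mb region (illustrative only).
true_hotspots = [(100_000, 102_000), (500_000, 502_000)]
called_hotspots = [(100_500, 101_500), (800_000, 802_000)]
power, fp_per_mb = evaluate_hotspot_calls(true_hotspots, called_hotspots, 2_000_000)
```

Here one of the two simulated hotspots is recovered and one call falls outside any true hotspot, giving a power of 0.5 and 0.5 false positives per Mb for this toy region.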

Preprint | 2013 | arXiv

Integrating genealogical and dynamical modelling to infer escape and reversion rates in HIV epitopes

The rates of escape and reversion in response to selection pressure arising from the host immune system, notably the cytotoxic T-lymphocyte (CTL) response, are key factors determining the evolution of HIV. Existing methods for estimating these parameters from cross-sectional population data using ordinary differential equations (ODE) ignore information about the genealogy of sampled HIV sequences, which has the potential to cause systematic bias and over-estimate certainty. Here, we describe an integrated approach, validated through extensive simulations, which combines genealogical inference and epidemiological modelling, to estimate rates of CTL escape and reversion in HIV epitopes. We show that there is substantial uncertainty about rates of viral escape and reversion from cross-sectional data, which arises from the inherent stochasticity in the evolutionary process. By application to empirical data, we find that point estimates of rates from a previously published ODE model and the integrated approach presented here are often similar, but can also differ several-fold depending on the structure of the genealogy. The model-based approach we apply provides a framework for the statistical analysis of escape and reversion in population data and highlights the need for longitudinal and denser cross-sectional sampling to enable accurate estimation of these key parameters.