Researcher profile

Dmitri A. Petrov

Dmitri A. Petrov contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
6 works
0 followers
3 topics
4 close collaborators

Published work

6 published items

preprint · 2014 · arXiv

Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila

In many species, genomic data have revealed pervasive adaptive evolution indicated by the fixation of beneficial alleles. However, when selection pressures are highly variable along a species range or through time, adaptive alleles may persist at intermediate frequencies for long periods. So-called balanced polymorphisms have long been understood to be an important component of standing genetic variation, yet direct evidence of the strength of balancing selection and the stability and prevalence of balanced polymorphisms has remained elusive. We hypothesized that environmental fluctuations between seasons in a North American orchard would impose temporally variable selection on Drosophila melanogaster and consequently maintain allelic variation at polymorphisms adaptively evolving in response to climatic variation. We identified hundreds of polymorphisms whose frequency oscillates among seasons and argue that these loci are subject to strong, temporally variable selection. We show that these polymorphisms respond to acute and persistent changes in climate and are associated in predictable ways with seasonally variable phenotypes. In addition, we show that adaptively oscillating polymorphisms are likely millions of years old, with some likely predating the divergence between D. melanogaster and D. simulans. Taken together, our results demonstrate that rapid temporal fluctuations in climate over generational time promote adaptive genetic diversity at loci affecting polygenic phenotypes.

preprint · 2014 · arXiv

Recent selective sweeps in North American Drosophila melanogaster show signatures of soft sweeps

Rapid adaptation has been observed in numerous organisms in response to selective pressures, such as the application of pesticides and the presence of pathogens. When rapid adaptation is driven by rare alleles from the standing genetic variation or by a high population rate of de novo adaptive mutation, positive selection should commonly generate soft rather than hard selective sweeps. In a soft sweep, multiple adaptive haplotypes sweep through the population simultaneously, in contrast to hard sweeps in which only a single adaptive haplotype rises to high frequency. Current statistical methods were not designed to detect soft sweeps, and are therefore likely to miss these possibly numerous adaptive events. Here, we develop a statistical test (H12) based on haplotype homozygosity that is capable of detecting both hard and soft sweeps with similar power. We use H12 to identify multiple genomic regions that have undergone recent and strong adaptation in a population sample of fully sequenced Drosophila melanogaster strains from the Drosophila Genetic Reference Panel (DGRP). Visual inspection of the top 50 peaks revealed that multiple haplotypes are at high frequency, consistent with signatures of soft sweeps. We developed a second statistic (H2/H1) that is sensitive to signatures common to soft sweeps but not hard sweeps, in order to determine whether sweeps detected by H12 can be more easily generated by hard versus soft sweeps. Surprisingly, we find that the H12 and H2/H1 values for all top 50 peaks are more easily generated by soft sweeps than hard sweeps under several evolutionary scenarios.
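
The abstract names the H12 and H2/H1 statistics but does not spell out their formulas. The sketch below assumes the commonly cited haplotype-homozygosity definitions: with haplotype frequencies sorted as p1 ≥ p2 ≥ ..., H1 is the sum of squared frequencies, H12 pools the two most common haplotypes before squaring, and H2 is H1 with the most common haplotype removed. The function name and the toy samples are hypothetical, and windowing along the genome is left out.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Compute H1, H12 and H2/H1 for one window of sampled haplotypes.

    `haplotypes` is a sequence of hashable haplotype labels (for example,
    strings of alleles). The formulas are assumed definitions, not taken
    from the abstract itself.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one haplotype is required")

    # Haplotype frequencies, most common first.
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0

    h1 = sum(p * p for p in freqs)                        # plain homozygosity
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])  # top two pooled
    h2 = h1 - p1 * p1                                     # top haplotype removed
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


# Hypothetical toy samples: one dominant haplotype vs. two common ones.
hard_like = ["A"] * 9 + ["B"]
soft_like = ["A"] * 5 + ["B"] * 4 + ["C"]
print(haplotype_stats(hard_like))
print(haplotype_stats(soft_like))
```

In these toy samples both windows have high H12 (about 1.0 and 0.82), while H2/H1 is much larger for the two-haplotype sample (about 0.40 versus 0.01), which is the contrast the abstract relies on to separate soft from hard sweeps.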

preprint · 2013 · arXiv

Host species and environmental effects on bacterial communities associated with Drosophila in the laboratory and in the natural environment

The fruit fly Drosophila is a classic model organism to study adaptation as well as the relationship between genetic variation and phenotypes. Although associated bacterial communities might be important for many aspects of Drosophila biology, knowledge about their diversity, composition, and factors shaping them is limited. We used 454-based sequencing of a variable region of the bacterial 16S ribosomal RNA gene to characterize the bacterial communities associated with wild and laboratory Drosophila isolates. In order to specifically investigate effects of food source and host species on bacterial communities, we analyzed samples from wild Drosophila melanogaster and D. simulans collected from a variety of natural substrates, as well as from adults and larvae of nine laboratory-reared Drosophila species. We find no evidence for host species effects in lab-reared flies; instead, lab of origin and stochastic effects, which could influence studies of Drosophila phenotypes, are pronounced. In contrast, the natural Drosophila-associated microbiota appears to be predominantly shaped by food substrate, with an additional but smaller effect of host species identity. We identify a core member of this natural microbiota that belongs to the genus Gluconobacter and is common to all wild-caught flies in this study, but absent from the laboratory. This makes it a strong candidate for being part of what could be a natural D. melanogaster and D. simulans core microbiome. Furthermore, we were able to identify candidate pathogens in natural fly isolates.

preprint · 2013 · arXiv

Strong Purifying Selection at Synonymous Sites in D. melanogaster

Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in D. melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.

preprint · 2012 · arXiv

LDx: estimation of linkage disequilibrium from high-throughput pooled resequencing data

High-throughput pooled resequencing offers significant potential for whole genome population sequencing. However, its main drawback is the loss of haplotype information. In order to regain some of this information, we present LDx, a computational tool for estimating linkage disequilibrium (LD) from pooled resequencing data. LDx uses an approximate maximum likelihood approach to estimate LD (r²) between pairs of SNPs that can be observed within and among single reads. LDx also reports r² estimates derived solely from observed genotype counts. We demonstrate that the LDx estimates are highly correlated with r² estimated from individually resequenced strains. We discuss the performance of LDx using more stringent quality conditions and infer via simulation the degree to which performance can improve based on read depth. Finally, we demonstrate two possible uses of LDx with real and simulated pooled resequencing data. First, we use LDx to infer genome-wide patterns of decay of LD with physical distance in D. melanogaster population resequencing data. Second, we demonstrate that r² estimates from LDx are capable of distinguishing alternative demographic models representing plausible demographic histories of D. melanogaster.
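
To make the LD measure concrete, here is a minimal sketch of the standard squared-correlation statistic for two biallelic SNPs, computed directly from counts of reads that cover both sites. It is not LDx's approximate maximum likelihood estimator, which the abstract does not describe in detail; it only illustrates the kind of direct count-based estimate the abstract mentions. The function and argument names are hypothetical.

```python
def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Direct-count LD estimate between two biallelic SNPs.

    Each argument is the number of reads (or read pairs) carrying the
    given allele combination at the two sites. Assumes every read samples
    one chromosome from the pool; sequencing error is ignored.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no reads cover both SNPs")

    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")

    d = n_AB / n - p_A * p_B  # coefficient of linkage disequilibrium, D
    return d * d / denom


# Hypothetical read counts for one SNP pair.
print(r_squared(n_AB=4, n_Ab=1, n_aB=1, n_ab=4))
```

Because only reads spanning both SNPs contribute, an estimate of this kind is limited to pairs of sites separated by no more than the read (or read-pair) length.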

preprint · 2012 · arXiv

The McDonald-Kreitman Test and its Extensions under Frequent Adaptation: Problems and Solutions

Population genomic studies have shown that genetic draft and background selection can profoundly affect the genome-wide patterns of molecular variation. We performed forward simulations under realistic gene-structure and selection scenarios to investigate whether such linkage effects impinge on the ability of the McDonald-Kreitman (MK) test to infer the rate of positive selection (α) from polymorphism and divergence data. We find that in the presence of slightly deleterious mutations, MK estimates of α severely underestimate the true rate of adaptation even if all polymorphisms with population frequencies under 50% are excluded. Furthermore, already under intermediate rates of adaptation, genetic draft substantially distorts the site frequency spectra at neutral and functional sites from the expectations under mutation-selection-drift balance. MK-type approaches that first infer demography from synonymous sites and then use the inferred demography to correct the estimation of α obtain almost the correct α in our simulations. However, these approaches typically infer a severe past population expansion although there was no such expansion in the simulations, casting doubt on the accuracy of methods that infer demography from synonymous polymorphism data. We suggest a simple asymptotic extension of the MK test that should yield accurate estimates of α even in the presence of linkage effects.
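
For readers unfamiliar with the quantity α, the sketch below uses the standard MK estimator, α = 1 − (D_s · P_n) / (D_n · P_s), where D and P are divergence and polymorphism counts at nonsynonymous (n) and synonymous (s) sites. The abstract does not give the details of the asymptotic extension; the second function assumes one plausible reading, in which α is computed separately per derived-allele frequency class so that its trend toward high frequencies can be examined. The curve-fitting and extrapolation step is omitted, and all names and counts are hypothetical.

```python
def mk_alpha(d_n, d_s, p_n, p_s):
    """Standard MK estimate of the fraction of adaptive substitutions.

    d_n, d_s: nonsynonymous and synonymous divergence counts.
    p_n, p_s: nonsynonymous and synonymous polymorphism counts.
    """
    if d_n <= 0 or p_s <= 0:
        raise ValueError("d_n and p_s must be positive")
    return 1.0 - (d_s * p_n) / (d_n * p_s)


def alpha_by_frequency(d_n, d_s, bins):
    """Compute alpha per derived-allele frequency class.

    `bins` is an iterable of (frequency, p_n, p_s) tuples. This is an
    assumed illustration of the asymptotic idea, not the authors' code.
    """
    return [(x, mk_alpha(d_n, d_s, p_n, p_s)) for x, p_n, p_s in bins]


# Hypothetical counts: nonsynonymous polymorphism is enriched at low
# frequency, as expected when slightly deleterious mutations segregate.
bins = [(0.1, 60, 100), (0.5, 30, 100), (0.9, 20, 100)]
for x, alpha in alpha_by_frequency(d_n=40, d_s=100, bins=bins):
    print(f"frequency {x:.1f}: alpha = {alpha:.2f}")
```

With these made-up counts, α rises from −0.50 in the lowest frequency class to 0.50 in the highest, illustrating how low-frequency deleterious polymorphism can pull the estimate down.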