Researcher profile

Serik Sagitov

Serik Sagitov contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
12 works
0 followers
4 topics
4 close collaborators

Published work

12 published items

Preprint, 2022, arXiv

Counting unique molecular identifiers in sequencing using a multitype branching process with immigration

Detection of extremely rare variant alleles, such as tumour DNA, within a complex mixture of DNA molecules is experimentally challenging due to sequencing errors. Barcoding of target DNA molecules in library construction for next-generation sequencing provides a way to identify and bioinformatically remove polymerase induced errors. During the barcoding procedure involving $t$ consecutive PCR cycles, the DNA molecules become barcoded by unique molecular identifiers (UMI). Different library construction protocols utilise different values of $t$. The effect of a larger $t$ and imperfect PCR amplifications is poorly described. This paper proposes a branching process with growing immigration as a model describing the random outcome of $t$ cycles of PCR barcoding. Our model discriminates between five different amplification rates $r_1$, $r_2$, $r_3$, $r_4$, $r$ for different types of molecules associated with the PCR barcoding procedure. We study this model by focussing on $C_t$, the number of clusters of molecules sharing the same UMI, as well as $C_t(m)$, the number of UMI clusters of size $m$. Our main finding is a remarkable asymptotic pattern valid for moderately large $t$. It turns out that $E(C_t(m))/E(C_t)\approx 2^{-m}$ for $m=1,2,\ldots$, regardless of the underlying parameters $(r_1,r_2,r_3,r_4,r)$. The knowledge of the quantities $C_t$ and $C_t(m)$ as functions of the experimental parameters $t$ and $(r_1,r_2,r_3,r_4,r)$ will help the users to draw more adequate conclusions from the outcomes of different sequencing protocols.
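As a reading aid for the asymptotic pattern quoted in the abstract, the following only restates the approximation $E(C_t(m))/E(C_t)\approx 2^{-m}$ and checks that it is consistent with the cluster counts summing to $C_t$. It is a sketch based on the abstract alone, not a derivation from the paper.

```latex
% Since C_t = \sum_{m \ge 1} C_t(m), the ratios must sum to one,
% and the stated pattern is consistent with this:
\[
  \sum_{m=1}^{\infty} \frac{E(C_t(m))}{E(C_t)} = 1,
  \qquad
  \sum_{m=1}^{\infty} 2^{-m} = 1 .
\]
% First few values of the approximate profile:
\[
  \frac{E(C_t(1))}{E(C_t)} \approx \frac{1}{2},
  \qquad
  \frac{E(C_t(2))}{E(C_t)} \approx \frac{1}{4},
  \qquad
  \frac{E(C_t(3))}{E(C_t)} \approx \frac{1}{8}.
\]
```

In words, the pattern says that roughly half of the UMI clusters are expected to be singletons, a quarter to have size two, and so on, whatever the amplification rates.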

Preprint, 2022, arXiv

Critical branching as a pure death process coming down from infinity

We consider the critical Galton-Watson process with overlapping generations stemming from a single founder. Assuming that both the variance of the offspring number and the average generation length are finite, we establish the convergence of the finite-dimensional distributions, conditioned on non-extinction at a remote time of observation. The limiting process is identified as a pure death process coming down from infinity. This result brings a new perspective on Vatutin's dichotomy claiming that in the critical regime of age-dependent reproduction, an extant population either contains a large number of short-living individuals or consists of few long-living individuals.
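To make "conditioned on non-extinction at a remote time" concrete, here is a minimal sketch for the classical critical Galton-Watson process with non-overlapping generations, which is simpler than the overlapping-generations model of the paper. The offspring law (0 or 2 children, each with probability 1/2) and the function name `survival_probabilities` are assumptions chosen for illustration. The code iterates the offspring generating function to get the survival probability and compares it with Kolmogorov's classical asymptotic $P(Z_n>0)\sim 2/(\sigma^2 n)$, where $\sigma^2=1$ for this offspring law.

```python
def survival_probabilities(generations):
    """Return [P(Z_n > 0) for n = 0..generations] for a critical
    Galton-Watson process started from one founder, assuming each
    particle has 0 or 2 children with probability 1/2 each."""
    p = 1.0  # P(Z_0 > 0): the founder is alive at time 0
    probs = [p]
    for _ in range(generations):
        # Extinction probabilities satisfy q_{n+1} = f(q_n) with
        # f(s) = (1 + s^2) / 2. In terms of p_n = 1 - q_n this reads
        # p_{n+1} = p_n - p_n^2 / 2.
        p = p - p * p / 2.0
        probs.append(p)
    return probs


n = 1000
probs = survival_probabilities(n)
# Kolmogorov's asymptotic predicts n * P(Z_n > 0) -> 2 / sigma^2 = 2.
scaled = n * probs[n]
print(f"P(Z_{n} > 0) = {probs[n]:.6f}, n * P(Z_n > 0) = {scaled:.4f}")
```

The survival probability decays like $1/n$, so conditioning on survival to a remote time means conditioning on an event of vanishing probability. That is why the conditional limit theorems described in the abstract need a separate argument.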

Preprint, 2013, arXiv

Skeletons of near-critical Bienaymé-Galton-Watson branching processes

Skeletons of branching processes are defined as trees of lineages characterized by an appropriate signature of future reproduction success. In the supercritical case a natural choice is to look for the lineages that survive forever. In the critical case it was earlier suggested to distinguish the particles with the total number of descendants exceeding a certain threshold. These two definitions lead to asymptotic representations of the skeletons as either a pure birth process (in the slightly supercritical case) or a critical birth-death process (in the critical case conditioned on the total number of particles exceeding a high threshold value). The limit skeletons reveal typical survival scenarios for the underlying branching processes. In this paper we consider near-critical Bienaymé-Galton-Watson processes and define their skeletons using marking of particles. If marking is rare, such skeletons are approximated by birth and death processes which can be subcritical, critical or supercritical. We obtain the limit skeleton for a sequential mutation model and compute the density distribution function for the time to escape from extinction.

Preprint, 2012, arXiv

Decomposition of supercritical linear-fractional branching processes

It is well known that a supercritical single-type Bienaymé-Galton-Watson process can be viewed as a decomposable branching process formed by two subtypes of particles: those having an infinite line of descent and those having a finite number of descendants. In this paper we analyze such a decomposition for the linear-fractional Bienaymé-Galton-Watson processes with countably many types.

Preprint, 2012, arXiv

Interspecies correlation for neutrally evolving traits

A simple way to model phenotypic evolution is to assume that after splitting, the trait values of the sister species diverge as independent Brownian motions. Relying only on a prior distribution for the underlying species tree (conditioned on the number, n, of extant species) we study the random vector (X_1,...,X_n) of the observed trait values. In this paper we derive compact formulae for the variance of the sample mean and the mean of the sample variance for the vector (X_1,...,X_n). The key ingredient of these formulae is the correlation coefficient between two trait values randomly chosen from (X_1,...,X_n). This interspecies correlation coefficient takes into account not only variation due to the random sampling of two species out of n and the stochastic nature of Brownian motion but also the uncertainty in the phylogenetic tree. The latter is modeled by a (supercritical or critical) conditioned branching process. In the critical case we modify the Aldous-Popovic model by assuming a proper prior for the time of origin.

Preprint, 2012, arXiv

Linear-fractional branching processes with countably many types

We study multi-type Bienaymé-Galton-Watson processes with linear-fractional reproduction laws using various analytical tools, such as the contour process, the spinal representation, the Perron-Frobenius theorem for countable matrices, and renewal theory. For this special class of branching processes with countably many types we present a transparent criterion for $R$-positive recurrence with respect to the type space. This criterion appeals to the Malthusian parameter and the mean age at childbearing of the associated linear-fractional Crump-Mode-Jagers process.

Preprint, 2012, arXiv

Statistical Inference of Allopolyploid Species Networks in the Presence of Incomplete Lineage Sorting

Polyploidy is an important speciation mechanism, particularly in land plants. Allopolyploid species are formed after hybridization between otherwise intersterile parental species. Recent theoretical progress has led to successful implementation of species tree models that take population genetic parameters into account. However, these models have not included allopolyploid hybridization and the special problems imposed when species trees of allopolyploids are inferred. Here, two new models for the statistical inference of the evolutionary history of allopolyploids are evaluated using simulations and demonstrated on two empirical data sets. It is assumed that there has been a single hybridization event between two diploid species resulting in a genomic allotetraploid. The evolutionary history can be represented as a network or as a multiply labeled tree, in which some pairs of tips are labeled with the same species. In one of the models (AlloppMUL), the multiply labeled tree is inferred directly. This is the simplest model and the most widely applicable, since fewer assumptions are made. The second model (AlloppNET) incorporates the hybridization event explicitly which means that fewer parameters need to be estimated. Both models are implemented in the BEAST framework. Simulations show that both models are useful and that AlloppNET is more accurate if the assumptions it is based on are valid. The models are demonstrated on previously analyzed data from the genus Pachycladon (Brassicaceae) and from the genus Silene (Caryophyllaceae).

Preprint, 2012, arXiv

Time to a single hybridization event in a group of species with unknown ancestral history

We consider a stochastic process for the generation of species which combines a Yule process with a simple model for hybridization between pairs of co-existent species. We assume that the origin of the process, when there was one species, occurred at an unknown time in the past, and we condition the process on producing n species via the Yule process and a single hybridization event. We prove results about the distribution of the time of the hybridization event. In particular we calculate a formula for all moments, and show that under various conditions, the distribution tends to an exponential with rate twice that of the birth rate for the Yule process.
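The Yule process underlying this model can be sketched in a few lines. The example below covers only the pure-birth part: the hybridization event and the conditioning described in the abstract are not modelled. The function names and the chosen values (`n = 10`, birth rate `1.0`, seed `0`) are assumptions for illustration. With $k$ species present, the wait to the next speciation is exponential with rate $k\lambda$, so the mean time to grow from one species to $n$ is $\sum_{k=1}^{n-1} 1/(k\lambda)$.

```python
import random


def expected_time_to_n_species(n, birth_rate):
    """Mean time for a Yule process started from one species to reach n species."""
    return sum(1.0 / (k * birth_rate) for k in range(1, n))


def simulate_time_to_n_species(n, birth_rate, rng):
    """Simulate one Yule path and return the time at which n species are reached.

    With k species present, each splits independently at rate birth_rate,
    so the wait to the next speciation is Exp(k * birth_rate).
    """
    t = 0.0
    for k in range(1, n):
        t += rng.expovariate(k * birth_rate)
    return t


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [simulate_time_to_n_species(10, 1.0, rng) for _ in range(20000)]
mean_sim = sum(samples) / len(samples)
mean_exact = expected_time_to_n_species(10, 1.0)
print(f"simulated mean: {mean_sim:.3f}, exact mean: {mean_exact:.3f}")
```

The simulated and exact means should agree closely. The paper's own results concern the time of the hybridization event under an unknown time of origin, which this sketch does not address.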

Preprint, 2011, arXiv

Survival of branching processes in random environments

This review paper presents the known results on the asymptotics of the survival probability and limit theorems conditioned on survival of critical and subcritical branching processes in IID random environments. The key assumptions of the family of population models in question are: non-overlapping generations, independent reproduction of particles within a generation, independent reproduction laws between generations. This is a biologically important generalization of the time inhomogeneous branching processes. The assumption of IID (independent and identically distributed) random environments reflects uncertainty in the future (as well as historical) reproduction regimes in actual populations. This review focusses on a particular range of questions of prime interest for the authors. The reader should be aware of the fact that there are many very interesting papers covering other issues on branching processes in varying and random environments which are not mentioned here.

Preprint, 2010, arXiv

Coalescent approximation for structured populations in a stationary random environment

We establish convergence to the Kingman coalescent for the genealogy of a geographically (or otherwise) structured version of the Wright-Fisher population model with fast migration. The new feature is that migration probabilities may change in a random fashion. This brings a novel formula for the coalescent effective population size (EPS). We call it a quenched EPS to emphasize the key feature of our model: the random environment. The quenched EPS is compared with an annealed (mean-field) EPS which describes the case of constant migration probabilities obtained by averaging the random migration probabilities over possible environments.