Source author record

Florian Markowetz

Florian Markowetz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Molecular Networks Applications Quantitative Methods Populations and Evolution Genomics Machine Learning Methodology

Catalog footprint

What is connected

9works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2014arXiv

Estimating cellular pathways from an ensemble of heterogeneous data sources

Building better models of cellular pathways is one of the major challenges of systems biology and functional genomics. There is a need for methods to build on established expert knowledge and reconcile it with results of high-throughput studies. Moreover, the available data sources are heterogeneous and need to be combined in a way specific for the part of the pathway in which they are most informative. Here, we present a compartment specific strategy to integrate edge, node and path data for the refinement of a network hypothesis. Specifically, we use a local-move Gibbs sampler for refining pathway hypotheses from a compendium of heterogeneous data sources, including novel methodology for integrating protein attributes. We demonstrate the utility of this approach in a case study of the pheromone response MAPK pathway in the yeast S. cerevisiae.

preprint2014arXiv

Reconstructing evolving signalling networks by hidden Markov nested effects models

Inferring time-varying networks is important to understand the development and evolution of interactions over time. However, the vast majority of currently used models assume direct measurements of node states, which are often difficult to obtain, especially in fields like cell biology, where perturbation experiments often only provide indirect information of network structure. Here we propose hidden Markov nested effects models (HM-NEMs) to model the evolving network by a Markov chain on a state space of signalling networks, which are derived from nested effects models (NEMs) of indirect perturbation data. To infer the hidden network evolution and unknown parameter, a Gibbs sampler is developed, in which sampling network structure is facilitated by a novel structural Metropolis--Hastings algorithm. We demonstrate the potential of HM-NEMs by simulations on synthetic time-series perturbation data. We also show the applicability of HM-NEMs in two real biological case studies, in one capturing dynamic crosstalk during the progression of neutrophil polarisation, and in the other inferring an evolving network underlying early differentiation of mouse embryonic stem cells.

preprint2014arXiv

SANTA: quantifying the functional content of molecular networks

Linking networks of molecular interactions to cellular functions and phenotypes is a key goal in systems biology. Here, we adapt concepts of spatial statistics to assess the functional content of molecular networks. Based on the guilt-by-association principle, our approach (called SANTA) quantifies the strength of association between a gene set and a network, and functionally annotates molecular networks like other enrichment methods annotate lists of genes. As a general association measure, SANTA can (i) functionally annotate experimentally derived networks using a collection of curated gene sets, and (ii) annotate experimentally derived gene sets using a collection of curated networks, as well as (iii) prioritize genes for follow-up analyses. We exemplify the efficacy of SANTA in several case studies using the \emph{S. cerevisiae} genetic interaction network and genome-wide RNAi screens in cancer cell lines. Our theory, simulations and applications show that SANTA provides a principled statistical way to quantify the association between molecular networks and cellular functions and phenotypes. SANTA is available from http://bioconductor.org/packages/release/bioc/html/SANTA.html.

preprint2013arXiv

Phylogenetic quantification of intra-tumour heterogeneity

Background: Intra-tumour heterogeneity (ITH) is the result of ongoing evolutionary change within each cancer. The expansion of genetically distinct sub-clonal populations may explain the emergence of drug resistance and if so would have prognostic and predictive utility. However, methods for objectively quantifying ITH have been missing and are particularly difficult to establish in cancers where predominant copy number variation prevents accurate phylogenetic reconstruction owing to horizontal dependencies caused by long and cascading genomic rearrangements. Results: To address these challenges we present MEDICC, a method for phylogenetic reconstruction and ITH quantification based on a Minimum Event Distance for Intra-tumour Copynumber Comparisons. Using a transducer-based pairwise comparison function we determine optimal phasing of major and minor alleles, as well as evolutionary distances between samples, and are able to reconstruct ancestral genomes. Rigorous simulations and an extensive clinical study show the power of our method, which outperforms state-of-the-art competitors in reconstruction accuracy and additionally allows unbiased numerical quantification of ITH. Conclusions: Accurate quantification and evolutionary inference are essential to understand the functional consequences of ITH. The MEDICC algorithms are independent of the experimental techniques used and are applicable to both next-generation sequencing and array CGH data.

preprint2010arXiv

A sparse regulatory network of copy-number driven expression reveals putative breast cancer oncogenes

The influence of DNA cis-regulatory elements on a gene's expression has been intensively studied. However, little is known about expressions driven by trans-acting DNA hotspots. DNA hotspots harboring copy number aberrations are recognized to be important in cancer as they influence multiple genes on a global scale. The challenge in detecting trans-effects is mainly due to the computational difficulty in detecting weak and sparse trans-acting signals amidst co-occuring passenger events. We propose an integrative approach to learn a sparse interaction network of DNA copy-number regions with their downstream targets in a breast cancer dataset. Information from this network helps distinguish copy-number driven from copy-number independent expression changes on a global scale. Our result further delineates cis- and trans-effects in a breast cancer dataset, for which important oncogenes such as ESR1 and ERBB2 appear to be highly copy-number dependent. Further, our model is shown to be efficient and in terms of goodness of fit no worse than other state-of the art predictors and network reconstruction models using both simulated and real data.

preprint2010arXiv

Evolutionary distances in the twilight zone -- a rational kernel approach

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.

preprint2010arXiv

Mapping Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells

Embryonic stem cells (ESC) have the potential to self-renew indefinitely and to differentiate into any of the three germ layers. The molecular mechanisms for self-renewal, maintenance of pluripotency and lineage specification are poorly understood, but recent results point to a key role for epigenetic mechanisms. In this study, we focus on quantifying the impact of histone 3 acetylation (H3K9,14ac) on gene expression in murine embryonic stem cells. We analyze genome-wide histone acetylation patterns and gene expression profiles measured over the first five days of cell differentiation triggered by silencing Nanog, a key transcription factor in ESC regulation. We explore the temporal and spatial dynamics of histone acetylation data and its correlation with gene expression using supervised and unsupervised statistical models. On a genome-wide scale, changes in acetylation are significantly correlated to changes in mRNA expression and, surprisingly, this coherence increases over time. We quantify the predictive power of histone acetylation for gene expression changes in a balanced cross-validation procedure. In an in-depth study we focus on genes central to the regulatory network of Mouse ESC, including those identified in a recent genome-wide RNAi screen and in the PluriNet, a computationally derived stem cell signature. We find that compared to the rest of the genome, ESC-specific genes show significantly more acetylation signal and a much stronger decrease in acetylation over time, which is often not reflected in an concordant expression change. These results shed light on the complexity of the relationship between histone acetylation and gene expression and are a step forward to dissect the multilayer regulatory mechanisms that determine stem cell fate.

preprint2009arXiv

How to understand the cell by breaking it: network analysis of gene perturbation screens

Modern high-throughput gene perturbation screens are key technologies at the forefront of genetic research. Combined with rich phenotypic descriptors they enable researchers to observe detailed cellular reactions to experimental perturbations on a genome-wide scale. This review surveys the current state-of-the-art in analyzing perturbation screens from a network point of view. We describe approaches to make the step from the parts list to the wiring diagram by using phenotypes for network inference and integrating them with complementary data sources. The first part of the review describes methods to analyze one- or low-dimensional phenotypes like viability or reporter activity; the second part concentrates on high-dimensional phenotypes showing global changes in cell morphology, transcriptome or proteome.

preprint2008arXiv

Structure Learning in Nested Effects Models

Nested Effects Models (NEMs) are a class of graphical models introduced to analyze the results of gene perturbation screens. NEMs explore noisy subset relations between the high-dimensional outputs of phenotyping studies, e.g. the effects showing in gene expression profiles or as morphological features of the perturbed cell. In this paper we expand the statistical basis of NEMs in four directions: First, we derive a new formula for the likelihood function of a NEM, which generalizes previous results for binary data. Second, we prove model identifiability under mild assumptions. Third, we show that the new formulation of the likelihood allows to efficiently traverse model space. Fourth, we incorporate prior knowledge and an automated variable selection criterion to decrease the influence of noise in the data.

Florian Markowetz

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

Estimating cellular pathways from an ensemble of heterogeneous data sources

Reconstructing evolving signalling networks by hidden Markov nested effects models

SANTA: quantifying the functional content of molecular networks

Phylogenetic quantification of intra-tumour heterogeneity

A sparse regulatory network of copy-number driven expression reveals putative breast cancer oncogenes

Evolutionary distances in the twilight zone -- a rational kernel approach

Mapping Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells

How to understand the cell by breaking it: network analysis of gene perturbation screens

Structure Learning in Nested Effects Models