Source author record

Aïda Ouangraoua

Aïda Ouangraoua appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Data Structures and Algorithms; Genomics; Computational Engineering, Finance, and Science

Catalog footprint

What is connected

5 works

3 topics

4 close collaborators


Preprint, 2016, arXiv

Gene Tree Construction and Correction using SuperTree and Reconciliation

The supertree problem, which asks for a tree displaying a set of consistent input trees, has been widely studied for the reconstruction of species trees. Here, we instead explore this framework for reconstructing a gene tree from a set of input gene trees on partial data. In this perspective, the phylogenetic tree for the species containing the genes of interest can be used to choose among the many possible compatible "supergenetrees", the most natural criterion being to minimize a reconciliation cost. We develop a variety of algorithmic solutions for the construction and correction of gene trees using the supertree framework. A dynamic programming supertree algorithm for constructing or correcting gene trees, exponential in the number of input trees, is first developed for the less constrained version of the problem. It is then adapted to gene trees with nodes labeled as duplication or speciation, the additional constraint being to preserve the orthology and paralogy relations between genes. Then, a quadratic time algorithm is developed for efficiently correcting an initial gene tree while preserving a set of "trusted" subtrees, as well as the relative phylogenetic distance between them, for both labeled and unlabeled input trees. By applying these algorithms to the set of Ensembl gene trees, we show that this new correction framework is particularly useful for correcting weakly supported duplication nodes. The C++ source code for the algorithms and simulations described in the paper is available at https://github.com/UdeM-LBIT/SuGeT.
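To make the notion of a reconciliation cost concrete, the following is a minimal Python sketch of the standard LCA-mapping reconciliation, counting duplication nodes only. It is not the supertree algorithm from the paper. The nested-tuple tree encoding, the function names, and the toy gene and species names are all assumptions made for illustration.

```python
# Minimal sketch (illustrative only): LCA-mapping reconciliation that counts
# duplication nodes of a binary gene tree against a species tree.
# Trees are nested tuples; leaves are strings. This encoding is an assumption.

def index_species_tree(tree):
    """Return (parent, depth) dicts keyed by species-tree node."""
    parent, depth = {}, {}
    stack = [(tree, None, 0)]
    while stack:
        node, par, d = stack.pop()
        parent[node] = par
        depth[node] = d
        if isinstance(node, tuple):
            for child in node:
                stack.append((child, node, d + 1))
    return parent, depth

def lca(a, b, parent, depth):
    """Lowest common ancestor of two species-tree nodes."""
    while a != b:
        if depth[a] >= depth[b]:
            a = parent[a]
        else:
            b = parent[b]
    return a

def count_duplications(gene_tree, gene_to_species, parent, depth):
    """Return (species node the gene tree root maps to, duplication count)."""
    if not isinstance(gene_tree, tuple):
        return gene_to_species[gene_tree], 0
    left, right = gene_tree
    m_left, d_left = count_duplications(left, gene_to_species, parent, depth)
    m_right, d_right = count_duplications(right, gene_to_species, parent, depth)
    m = lca(m_left, m_right, parent, depth)
    # A node is a duplication if it maps to the same species node as a child.
    is_dup = (m == m_left) or (m == m_right)
    return m, d_left + d_right + int(is_dup)

# Hypothetical toy data: two gene copies in species A and B, one in C.
species_tree = (("A", "B"), "C")
gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
gene_to_species = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}

parent, depth = index_species_tree(species_tree)
_, duplications = count_duplications(gene_tree, gene_to_species, parent, depth)
print(duplications)
```

In this toy example, the node joining the two (A, B) gene clades maps to the same species node as its children, so it is counted as a duplication; a full reconciliation cost would typically also account for losses.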

Preprint, 2016, arXiv

The SCJ small parsimony problem for weighted gene adjacencies (Extended version)

Reconstructing ancestral gene orders in a given phylogeny is a classical problem in comparative genomics. Most existing methods compare conserved features in extant genomes in the phylogeny to define potential ancestral gene adjacencies, and either try to reconstruct all ancestral genomes under a global evolutionary parsimony criterion, or, focusing on a single ancestral genome, use a scaffolding approach to select a subset of ancestral gene adjacencies, generally aiming at reducing the fragmentation of the reconstructed ancestral genome. In this paper, we describe an exact algorithm for the Small Parsimony Problem that combines both approaches. We consider that gene adjacencies at internal nodes of the species phylogeny are weighted, and we introduce an objective function defined as a convex combination of these weights and the evolutionary cost under the Single-Cut-or-Join (SCJ) model. The weights of ancestral gene adjacencies can be obtained, for example, from recently available ancient DNA sequencing data, which provide a direct hint at the genome structure of the considered ancestor, or through probabilistic analysis of gene adjacency evolution. We show the NP-hardness of our problem variant and propose a Fixed-Parameter Tractable algorithm, based on the Sankoff-Rousseau dynamic programming algorithm, that also allows sampling of co-optimal solutions. We apply our approach to mammalian and bacterial data providing different degrees of complexity. We show that including adjacency weights in the objective has a significant impact in reducing the fragmentation of the reconstructed ancestral gene orders.
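As background for the SCJ model mentioned above, here is a minimal Python sketch of the SCJ distance between two genomes represented as sets of adjacencies, where the distance is the number of adjacencies present in one genome but not the other. It does not implement the weighted objective or the algorithm of the paper. The extremity naming scheme (`a_h` for the head of gene a, `a_t` for its tail) and the function names are assumptions made for illustration.

```python
# Minimal sketch (illustrative only): SCJ distance between two genomes,
# each given as a set of adjacencies between gene extremities.

def adjacency(x, y):
    """An unordered pair of gene extremities, e.g. adjacency('a_h', 'b_t')."""
    return frozenset((x, y))

def scj_distance(genome_a, genome_b):
    """Number of cuts plus joins: adjacencies found in exactly one genome."""
    return len(genome_a - genome_b) + len(genome_b - genome_a)

# Hypothetical genomes over genes a, b, c; gene c is inverted in genome_2.
genome_1 = {adjacency("a_h", "b_t"), adjacency("b_h", "c_t")}
genome_2 = {adjacency("a_h", "b_t"), adjacency("b_h", "c_h")}

print(scj_distance(genome_1, genome_2))  # one cut and one join
```

Because adjacencies are treated independently under this distance, the presence or absence of each adjacency can be handled separately, which is what makes the unweighted small parsimony problem easy under SCJ.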

Preprint, 2015, arXiv

Alignment of protein-coding sequences with frameshift extension penalties

We introduce an algorithm for the alignment of protein-coding sequences accounting for frameshifts. The main specificity of this algorithm as compared to previously published protein-coding sequence alignment methods is the introduction of a penalty cost for frameshift extensions. Previous algorithms have only used constant frameshift penalties. This is similar to the use of scoring schemes with affine gap penalties in classical sequence alignment algorithms. However, the overall penalty of a frameshift portion in an alignment cannot be formulated as an affine function, because it should also incorporate varying codon substitution scores. The second specificity of the algorithm is its search space being the set of all possible alignments between two coding sequences, under the classical definition of an alignment between two DNA sequences. Previous algorithms have introduced constraints on the length of the alignments, and additional symbols for the representation of frameshift openings in an alignment. The algorithm has the same asymptotic space and time complexity as the classical Needleman-Wunsch algorithm.
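For reference, the classical Needleman-Wunsch baseline whose complexity is cited above can be sketched in a few lines of Python. This is the textbook global alignment recurrence with a linear gap penalty, not the frameshift-aware algorithm of the paper; the scoring values and the function name are assumptions chosen for illustration.

```python
# Minimal sketch (illustrative only): classical Needleman-Wunsch global
# alignment score with a linear gap penalty. O(n*m) time and space.

def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[n][m]

print(needleman_wunsch("ACGT", "AGT"))  # three matches and one gap
```

The table has (n + 1) by (m + 1) cells, each filled in constant time, which gives the quadratic time and space bound that the abstract uses as its point of comparison.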

Preprint, 2012, arXiv

Tandem halving problems by DCJ

This paper has been withdrawn by the author.

Preprint, 2011, arXiv

Genome Halving by Block Interchange

We address the problem of finding the minimal number of block interchanges (exchange of two intervals) required to transform a duplicated linear genome into a tandem duplicated linear genome. We provide a formula for the distance as well as a polynomial time algorithm for the sorting problem.