Source author record

Vincent Miele

Vincent Miele appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Applications Computational Engineering, Finance, and Science Data Structures and Algorithms Machine Learning Methodology Quantitative Methods

Catalog footprint

What is connected

4works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2016arXiv

Statistical clustering of temporal networks through a dynamic stochastic block model

Statistical node clustering in discrete time dynamic networks is an emerging field that raises many challenges. Here, we explore statistical properties and frequentist inference in a model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time. We model binary data as well as weighted dynamic random graphs (with discrete or continuous edges values). Our approach, motivated by the importance of controlling for label switching issues across the different time steps, focuses on detecting groups characterized by a stable within group connectivity behavior. We study identifiability of the model parameters, propose an inference procedure based on a variational expectation maximization algorithm as well as a model selection criterion to select for the number of groups. We carefully discuss our initialization strategy which plays an important role in the method and compare our procedure with existing ones on synthetic datasets. We also illustrate our approach on dynamic contact networks, one of encounters among high school students and two others on animal interactions. An implementation of the method is available as a R package called dynsbm.

preprint2014arXiv

Navigating in a sea of repeats in RNA-seq without drowning

The main challenge in de novo assembly of NGS data is certainly to deal with repeats that are longer than the reads. This is particularly true for RNA- seq data, since coverage information cannot be used to flag repeated sequences, of which transposable elements are one of the main examples. Most transcriptome assemblers are based on de Bruijn graphs and have no clear and explicit model for repeats in RNA-seq data, relying instead on heuristics to deal with them. The results of this work are twofold. First, we introduce a formal model for repre- senting high copy number repeats in RNA-seq data and exploit its properties for inferring a combinatorial characteristic of repeat-associated subgraphs. We show that the problem of identifying in a de Bruijn graph a subgraph with this charac- teristic is NP-complete. In a second step, we show that in the specific case of a local assembly of alternative splicing (AS) events, we can implicitly avoid such subgraphs. In particular, we designed and implemented an algorithm to efficiently identify AS events that are not included in repeated regions. Finally, we validate our results using synthetic data. We also give an indication of the usefulness of our method on real data.

preprint2014arXiv

Spatially-constrained clustering of ecological networks

Spatial ecological networks are widely used to model interactions between georeferenced biological entities (e.g., populations or communities). The analysis of such data often leads to a two-step approach where groups containing similar biological entities are firstly identified and the spatial information is used afterwards to improve the ecological interpretation. We develop an integrative approach to retrieve groups of nodes that are geographically close and ecologically similar. Our model-based spatially-constrained method embeds the geographical information within a regularization framework by adding some constraints to the maximum likelihood estimation of parameters. A simulation study and the analysis of real data demonstrate that our approach is able to detect complex spatial patterns that are ecologically meaningful. The model-based framework allows us to consider external information (e.g., geographic proximities, covariates) in the analysis of ecological networks and appears to be an appealing alternative to consider such data.

preprint2010arXiv

Strategies for online inference of model-based clustering in large and growing networks

In this paper we adapt online estimation strategies to perform model-based clustering on large networks. Our work focuses on two algorithms, the first based on the SAEM algorithm, and the second on variational methods. These two strategies are compared with existing approaches on simulated and real data. We use the method to decipher the connexion structure of the political websphere during the US political campaign in 2008. We show that our online EM-based algorithms offer a good trade-off between precision and speed, when estimating parameters for mixture distributions in the context of random graphs.