Source author record

Michelle Girvan

Michelle Girvan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

27 works

21 topics

4 close collaborators

preprint · 2021 · arXiv

Parallel Machine Learning for Forecasting the Dynamics of Complex Networks

Forecasting the dynamics of large complex networks from previous time-series data is important in a wide range of contexts. Here we present a machine learning scheme for this task using a parallel architecture that mimics the topology of the network of interest. We demonstrate the utility and scalability of our method implemented using reservoir computing on a chaotic network of oscillators. Two levels of prior knowledge are considered: (i) the network links are known; and (ii) the network links are unknown and inferred via a data-driven approach to approximately optimize prediction.

preprint · 2020 · arXiv

Backpropagation Algorithms and Reservoir Computing in Recurrent Neural Networks for the Forecasting of Complex Spatiotemporal Dynamics

We examine the efficiency of Recurrent Neural Networks in forecasting the spatiotemporal dynamics of high dimensional and reduced order complex systems using Reservoir Computing (RC) and Backpropagation through time (BPTT) for gated network architectures. We highlight advantages and limitations of each method and discuss their implementation for parallel computing architectures. We quantify the relative prediction accuracy of these algorithms for the long-term forecasting of chaotic systems using as benchmarks the Lorenz-96 and the Kuramoto-Sivashinsky (KS) equations. We find that, when the full state dynamics are available for training, RC outperforms BPTT approaches in terms of predictive performance and in capturing the long-term statistics, while at the same time requiring much less training time. However, in the case of reduced order data, large scale RC models can be unstable and more likely than the BPTT algorithms to diverge. In contrast, RNNs trained via BPTT show superior forecasting abilities and capture well the dynamics of reduced order systems. Furthermore, the present study quantifies for the first time the Lyapunov Spectrum of the KS equation with BPTT, achieving similar accuracy as RC. This study establishes that RNNs are a potent computational framework for the learning and forecasting of complex spatiotemporal systems.

preprint · 2020 · arXiv

Combining Machine Learning with Knowledge-Based Modeling for Scalable Forecasting and Subgrid-Scale Closure of Large, Complex, Spatiotemporal Systems

We consider the commonly encountered situation (e.g., in weather forecasting) where the goal is to predict the time evolution of a large, spatiotemporally chaotic dynamical system when we have access to both time series data of previous system states and an imperfect model of the full system dynamics. Specifically, we attempt to utilize machine learning as the essential tool for integrating the use of past data into predictions. In order to facilitate scalability to the common scenario of interest where the spatiotemporally chaotic system is very large and complex, we propose combining two approaches: (i) a parallel machine learning prediction scheme; and (ii) a hybrid technique, for a composite prediction system composed of a knowledge-based component and a machine-learning-based component. We demonstrate that not only can this method combining (i) and (ii) be scaled to give excellent performance for very large systems, but also that the length of time series data needed to train our multiple, parallel machine learning components is dramatically less than that necessary without parallelization. Furthermore, considering cases where computational realization of the knowledge-based component does not resolve subgrid-scale processes, our scheme is able to use training data to incorporate the effect of the unresolved short-scale dynamics upon the resolved longer-scale dynamics ("subgrid-scale closure").

preprint · 2020 · arXiv

Critical Network Cascades with Re-excitable nodes: Why tree-like approximations usually work, when they breakdown, and how to correct them

Network science is a rapidly expanding field, with a large and growing body of work on network-based dynamical processes. Most theoretical results in this area rely on the so-called locally tree-like approximation. This is, however, usually an 'uncontrolled' approximation, in the sense that the magnitudes of the error are typically unknown, although numerical results show that this error is often surprisingly small. In this paper, we place this approximation on more rigorous footing by calculating the magnitude of deviations away from tree-based theories in the context of discrete-time critical network cascades with re-excitable nodes. We discuss the conditions under which tree-like approximations give good results for calculating network criticality, and also explain the reasons for deviation from this approximation, in terms of the density of certain kinds of network motifs. Using this understanding, we derive results for network criticality that apply to general networks that explicitly do not satisfy the locally tree-like approximation. In particular, we focus on the bi-parallel motif, the smallest motif relevant to the failure of a tree-based theory in this context, and we derive the corrections due to such motifs on the conditions for criticality. We verify our claims on computer-generated networks, and we confirm that our theory accurately predicts the observed deviations from criticality. Using our theory, we explain why numerical simulations often show that deviations from a tree-based theory are surprisingly small. More specifically, we show that these deviations are negligible for networks whose average degree is even modestly large compared to one, justifying why tree-based theories appear to work well for most real-world networks.

preprint · 2019 · arXiv

Identifying and Predicting Parkinson's Disease Subtypes through Trajectory Clustering via Bipartite Networks

Parkinson's disease (PD) is a common neurodegenerative disease with a high degree of heterogeneity in its clinical features, rate of progression, and change of variables over time. In this work, we present a novel data-driven, network-based Trajectory Profile Clustering (TPC) algorithm for 1) identification of PD subtypes and 2) early prediction of disease progression in individual patients. Our subtype identification is based not only on PD variables, but also on their complex patterns of progression, providing a useful tool for the analysis of large heterogeneous, longitudinal data. Specifically, we cluster patients based on the similarity of their trajectories through a time series of bipartite networks connecting patients to demographic, clinical, and genetic variables. We apply this approach to demographic and clinical data from the Parkinson's Progression Markers Initiative (PPMI) dataset and identify 3 patient clusters, consistent with 3 distinct PD subtypes, each with a characteristic variable progression profile. Additionally, TPC predicts an individual patient's subtype and future disease trajectory, based on baseline assessments. Application of our approach resulted in 74% accurate subtype prediction in year 5 in a test/validation cohort. Furthermore, we show that genetic variability can be integrated seamlessly in our TPC approach. In summary, using PD as a model for chronic progressive diseases, we show that TPC leverages high-dimensional longitudinal datasets for subtype identification and early prediction of individual disease subtype. We anticipate this approach will be broadly applicable to multidimensional longitudinal datasets in diverse chronic diseases.

preprint · 2019 · arXiv

Separation of Chaotic Signals by Reservoir Computing

We demonstrate the utility of machine learning in the separation of superimposed chaotic signals using a technique called Reservoir Computing. We assume no knowledge of the dynamical equations that produce the signals, and require only training data consisting of finite time samples of the component signals. We test our method on signals that are formed as linear combinations of signals from two Lorenz systems with different parameters. Comparing our nonlinear method with the optimal linear solution to the separation problem, the Wiener filter, we find that our method significantly outperforms the Wiener filter in all the scenarios we study. Furthermore, this difference is particularly striking when the component signals have similar frequency spectra. Indeed, our method works well when the component frequency spectra are indistinguishable - a case where a Wiener filter performs essentially no separation.

preprint · 2016 · arXiv

Competing opinions and stubbornness: connecting models to data

We introduce a general contagion-like model for competing opinions that includes dynamic resistance to alternative opinions. We show that this model can describe candidate vote distributions, spatial vote correlations, and a slow approach to opinion consensus with sensible parameter values. These empirical properties of large group dynamics, previously understood using distinct models, may be different aspects of human behavior that can be captured by a more unified model, such as the one introduced in this paper.

preprint · 2014 · arXiv

Dynamical Transitions in Large Systems of Mean Field-Coupled Landau-Stuart Oscillators: Extensive Chaos and Clumped States

In this paper, we study dynamical systems in which a large number $N$ of identical Landau-Stuart oscillators are globally coupled via a mean field. Previously, it has been observed that this type of system can exhibit a variety of different dynamical behaviors, including clumped states, in which each oscillator is in one of a small number of groups such that all oscillators in a group share the same state and the state differs from group to group, as well as situations in which all oscillators have different states and the macroscopic dynamics of the mean field is chaotic. We argue that this second type of behavior is 'extensive' in the sense that the chaotic attractor in the full phase space of the system has a fractal dimension that scales linearly with $N$ and that the number of positive Lyapunov exponents of the attractor also scales linearly with $N$. An important focus of this paper is the transition between clumped states and extensive chaos as the system is subjected to slow adiabatic parameter change. We observe explosive (i.e., discontinuous) transitions between the clumped states (which correspond to low dimensional dynamics) and the extensively chaotic states. Furthermore, examining the clumped state, as the system approaches the explosive transition to extensive chaos, we find that the oscillator population distribution between the clumps continually evolves so that the clumped state is always marginally stable. This behavior is used to reveal the mechanism of the explosive transition. We also apply the Kaplan-Yorke formula to study the fractal structure of the extensively chaotic attractors.
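
The Kaplan-Yorke formula mentioned in this abstract estimates an attractor's fractal dimension from its Lyapunov exponents. The following is a minimal sketch of the standard formula, not the paper's code; the function name `kaplan_yorke_dimension` and the numerical exponents are illustrative assumptions.

```python
from itertools import accumulate


def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke dimension D = j + (lambda_1 + ... + lambda_j) / |lambda_{j+1}|,
    where j is the largest count of leading exponents with a non-negative sum."""
    lam = sorted(exponents, reverse=True)
    partial = list(accumulate(lam))
    j = 0
    while j < len(lam) and partial[j] >= 0:
        j += 1
    if j == 0:
        return 0.0              # even the largest exponent is negative
    if j == len(lam):
        return float(len(lam))  # the sum never turns negative
    return j + partial[j - 1] / abs(lam[j])


# Illustrative spectrum (assumed values, not taken from the paper).
single = [0.9, 0.0, -14.5]
print(kaplan_yorke_dimension(single))

# Toy picture of extensivity: N uncoupled copies of the same spectrum
# give N times the single-copy dimension.
print(kaplan_yorke_dimension(single * 10))
```

The replicated-spectrum case is only a caricature of extensive chaos: in a coupled system the exponents are not simple copies, but the dimension growing linearly with $N$ is the property the abstract refers to.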

preprint · 2014 · arXiv

The Impact of Imperfect Information on Network Attack

This paper explores the effectiveness of network attack when the attacker has imperfect information about the network. For Erdős-Rényi networks, we observe that dynamical importance and betweenness centrality-based attacks are surprisingly robust to the presence of a moderate amount of imperfect information and are more effective compared with simpler degree-based attacks even at moderate levels of network information error. In contrast, for scale-free networks the effectiveness of attack is much less degraded by a moderate level of information error. Furthermore, in the Erdős-Rényi case the effectiveness of network attack is much more degraded by missing links as compared with the same number of false links.

preprint · 2013 · arXiv

A hierarchical network heuristic for solving the orientation problem in genome assembly

In the past several years, the problem of genome assembly has received considerable attention from both biologists and computer scientists. An important component of current assembly methods is the scaffolding process. This process involves building ordered and oriented linear collections of contigs (continuous overlapping sequence reads) called scaffolds and relies on the use of mate pair data. A mate pair is a set of two reads that are sequenced from the ends of a single fragment of DNA, and therefore have opposite mutual orientations. When two reads of a mate pair are placed into two different contigs, one can infer the mutual orientation of these contigs. While several orientation algorithms exist as part of assembly programs, all encounter challenges while solving the orientation problem due to errors from mis-assemblies in contigs or errors in read placements. In this paper we present an algorithm based on hierarchical clustering that independently solves the orientation problem and is robust to errors. We show that our algorithm can correctly solve the orientation problem for both faux (generated) assembly data and real assembly data for R. sphaeroides bacteria. We demonstrate that our algorithm is stable to both changes in the initial orientations as well as noise in the data, making it advantageous compared to traditional approaches.

preprint · 2013 · arXiv

Annotation Enrichment Analysis: An Alternative Method for Evaluating the Functional Properties of Gene Sets

Gene annotation databases (compendiums maintained by the scientific community that describe the biological functions performed by individual genes) are commonly used to evaluate the functional properties of experimentally derived gene sets. Overlap statistics, such as Fisher's Exact Test (FET), are often employed to assess these associations, but don't account for non-uniformity in the number of genes annotated to individual functions or the number of functions associated with individual genes. We find FET is strongly biased toward over-estimating overlap significance if a gene set has an unusually high number of annotations. To correct for these biases, we develop Annotation Enrichment Analysis (AEA), which properly accounts for the non-uniformity of annotations. We show that AEA is able to identify biologically meaningful functional enrichments that are obscured by numerous false-positive enrichment scores in FET, and we therefore suggest it be used to more accurately assess the biological properties of gene sets.
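
For orientation, the overlap statistic that this abstract takes as its baseline can be sketched as a one-sided Fisher's Exact Test (a hypergeometric tail). This is the standard test, not the AEA method, whose details the abstract does not give; the function and parameter names are illustrative assumptions.

```python
from math import comb  # requires Python 3.8+


def fisher_overlap_pvalue(universe, annotated, gene_set, overlap):
    """Probability of seeing at least `overlap` annotated genes when
    `gene_set` genes are drawn without replacement from `universe` genes,
    of which `annotated` carry the function of interest."""
    upper = min(annotated, gene_set)
    total = comb(universe, gene_set)
    tail = sum(
        comb(annotated, i) * comb(universe - annotated, gene_set - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes in total, 5 annotated, a gene set of 5,
# all 5 of which are annotated.
print(fisher_overlap_pvalue(10, 5, 5, 5))
```

The test treats every gene and every function as interchangeable, which is the uniformity assumption the abstract identifies as a source of bias.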

preprint · 2013 · arXiv

Finding New Order in Biological Functions from the Network Structure of Gene Annotations

The Gene Ontology (GO) provides biologists with a controlled terminology that describes how genes are associated with functions and how functional terms are related to each other. These term-term relationships encode how scientists conceive the organization of biological functions, and they take the form of a directed acyclic graph (DAG). Here, we propose that the network structure of gene-term annotations made using GO can be employed to establish an alternate natural way to group the functional terms which is different from the hierarchical structure established in the GO DAG. Instead of relying on an externally defined organization for biological functions, our method connects biological functions together if they are performed by the same genes, as indicated in a compendium of gene annotation data from numerous different experiments. We show that grouping terms by this alternate scheme is distinct from term relationships defined in the ontological structure and provides a new framework with which to describe and predict the functions of experimentally identified sets of genes.

preprint · 2013 · arXiv

Modeling the dynamics of bivalent histone modifications

Epigenetic modifications to histones may promote either activation or repression of the transcription of nearby genes. Recent experimental studies show that the promoters of many lineage-control genes in stem cells have "bivalent domains" in which the nucleosomes contain both active (H3K4me3) and repressive (H3K27me3) marks. It is generally agreed that bivalent domains play an important role in stem cell differentiation, but the underlying mechanisms remain unclear. Here we formulate a mathematical model to investigate the dynamic properties of histone modification patterns. We then illustrate that our modeling framework can be used to capture key features of experimentally observed combinatorial chromatin states.

preprint · 2013 · arXiv

Robustness of Network Measures to Link Errors

In various applications involving complex networks, network measures are employed to assess the relative importance of network nodes. However, the robustness of such measures in the presence of link inaccuracies has not been well characterized. Here we present two simple stochastic models of false and missing links and study the effect of link errors on three commonly used node centrality measures: degree centrality, betweenness centrality, and dynamical importance. We perform numerical simulations to assess robustness of these three centrality measures. We also develop an analytical theory, which we compare with our simulations, obtaining very good agreement.
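
A rough sketch of what a stochastic link-error model can look like follows. The abstract does not specify its two models, so the independent per-pair error probabilities `p_missing` and `p_false`, and all function names, are assumptions made purely for illustration. Degree is used because it is the simplest of the three centrality measures named above.

```python
import random
from itertools import combinations


def perturb_links(n, edges, p_missing, p_false, rng):
    """Return an 'observed' edge set: each true link is dropped with
    probability p_missing, each absent pair is added with probability p_false."""
    true_edges = {tuple(sorted(e)) for e in edges}
    observed = set()
    for pair in combinations(range(n), 2):
        if pair in true_edges:
            if rng.random() >= p_missing:
                observed.add(pair)
        elif rng.random() < p_false:
            observed.add(pair)
    return observed


def degrees(n, edges):
    """Degree of every node in an undirected edge set."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


rng = random.Random(0)
true_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
observed = perturb_links(5, true_edges, p_missing=0.2, p_false=0.05, rng=rng)
print(degrees(5, true_edges), degrees(5, observed))
```

Comparing the node ranking by degree before and after perturbation, averaged over many random draws, gives one simple numerical robustness check.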

preprint · 2013 · arXiv

Spatially embedded growing small-world networks

Networks in nature are often formed within a spatial domain in a dynamical manner, gaining links and nodes as they develop over time. We propose a class of spatially-based growing network models and investigate the relationship between the resulting statistical network properties and the dimension and topology of the space in which the networks are embedded. In particular, we consider models in which nodes are placed one by one in random locations in space, with each such placement followed by configuration relaxation toward uniform node density, and connection of the new node with spatially nearby nodes. We find that such growth processes naturally result in networks with small-world features, including a short characteristic path length and nonzero clustering. These properties do not appear to depend strongly on the topology of the embedding space, but do depend strongly on its dimension; higher-dimensional spaces result in shorter path lengths but less clustering.

preprint · 2013 · arXiv

Stability of Boolean networks: The joint effects of topology and update rules

We study the stability of orbits in large Boolean networks with given complex topology. We impose no restrictions on the form of the update rules, which may be correlated with local topological properties of the network. While recent past work has addressed the separate effects of nontrivial network topology and certain special classes of update rules on stability, only crude results exist about how these effects interact. We present a widely applicable solution to this problem. Numerical experiments confirm our theory and show that local correlations between topology and update rules can have profound effects on the qualitative behavior of these systems.

preprint · 2013 · arXiv

Understanding the Predictive Power of Computational Mechanics and Echo State Networks in Social Media

There is a large amount of interest in understanding users of social media in order to predict their behavior in this space. Despite this interest, user predictability in social media is not well-understood. To examine this question, we consider a network of fifteen thousand users on Twitter over a seven week period. We apply two contrasting modeling paradigms: computational mechanics and echo state networks. Both methods attempt to model the behavior of users on the basis of their past behavior. We demonstrate that the behavior of users on Twitter can be well-modeled as processes with self-feedback. We find that the two modeling approaches perform very similarly for most users, but that they differ in performance on a small subset of the users. By exploring the properties of these performance-differentiated users, we highlight the challenges faced in applying predictive models to dynamic social data.

preprint · 2013 · arXiv

Weakly Explosive Percolation in Directed Networks

Percolation, the formation of a macroscopic connected component, is a key feature in the description of complex networks. The dynamical properties of a variety of systems can be understood in terms of percolation, including the robustness of power grids and information networks, the spreading of epidemics and forest fires, and the stability of gene regulatory networks. Recent studies have shown that if network edges are added "competitively" in undirected networks, the onset of percolation is abrupt or "explosive." The unusual qualitative features of this phase transition have been the subject of much recent attention. Here we generalize this previously studied network growth process from undirected networks to directed networks and use finite-size scaling theory to find several scaling exponents. We find that this process is also characterized by a very rapid growth in the giant component, but that this growth is not as sudden as in undirected networks.

preprint · 2012 · arXiv

Consequences of Anomalous Diffusion in Disordered Systems Under Cyclic Forcing

We use numerical simulations to study the behavior of 2D frictionless disk systems under cyclic shear as a function of reversal amplitude γ_r. Our studies focus on mean bulk and disk dynamics. These measurements suggest a crossover from a subdiffusive, γ_r dependent regime to a regime where the grain motions are diffusive, with properties dependent only on total shear strain. We discuss model stochastic processes that are consistent with these observations. Finally, we introduce a modified Mean-Squared Displacement (mMSD) which takes into account the motion of the neighborhood of nearby grains and yields new insights into local displacement fluctuations. We find that scaling properties of the displacement distributions are consistent with well studied stochastic models of anomalous diffusion and suggest scale-invariant cage dynamics.

preprint · 2012 · arXiv

Detecting Functional Communities in Complex Networks

We consider an alternate definition of community structure that is functionally motivated. We define network community structure based on the function the network system is intended to perform. In particular, as a specific example of this approach, we consider communities whose function is enhanced by the ability to synchronize and/or by resilience to node failures. Previous work has shown that, in many cases, the largest eigenvalue of the network's adjacency matrix controls the onset of both synchronization and percolation processes. Thus, for networks whose functional performance is dependent on these processes, we propose a method that divides a given network into communities based on maximizing a function of the largest eigenvalues of the adjacency matrices of the resulting communities. We also explore the differences between the partitions obtained by our method and the modularity approach (which is based solely on consideration of network structure). We do this for several different classes of networks. We find that, in many cases, modularity-based partitions do almost as well as our function-based method in finding functional communities, even though modularity does not specifically incorporate consideration of function.
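
The quantity this abstract builds on, the largest adjacency eigenvalue of each candidate community, can be estimated with a short power iteration. The sketch below covers only that ingredient for an undirected graph; the objective function that the paper maximizes is not stated in the abstract and is not reproduced here. The function names and the example graph are illustrative assumptions.

```python
from math import sqrt


def largest_eigenvalue(adj, iterations=1000):
    """Estimate the largest adjacency eigenvalue of an undirected graph
    given as adjacency lists, using power iteration on A + I."""
    n = len(adj)
    if n == 0:
        return 0.0
    x = [1.0 / sqrt(n)] * n
    shifted = 1.0
    for _ in range(iterations):
        # Multiply by (A + I); the identity shift prevents the oscillation
        # that plain power iteration shows on bipartite graphs.
        y = [x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        shifted = sum(a * b for a, b in zip(x, y))  # Rayleigh quotient, x has unit norm
        norm = sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return shifted - 1.0


def induced_subgraph(adj, nodes):
    """Adjacency lists of the subgraph induced by `nodes`, relabelled 0..len-1."""
    index = {node: k for k, node in enumerate(nodes)}
    return [[index[j] for j in adj[node] if j in index] for node in nodes]


# Two triangles joined by the single edge 2-3 (hypothetical example graph).
graph = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
partition = [[0, 1, 2], [3, 4, 5]]
print([largest_eigenvalue(induced_subgraph(graph, c)) for c in partition])
```

A partition search would call this per-community evaluation many times, so a production version would more likely rely on a sparse eigensolver than on pure-Python loops.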

preprint · 2012 · arXiv

Dynamical Instability in Boolean Networks as a Percolation Problem

Boolean networks, widely used to model gene regulation, exhibit a phase transition between regimes in which small perturbations either die out or grow exponentially. We show and numerically verify that this phase transition in the dynamics can be mapped onto a static percolation problem which predicts the long-time average Hamming distance between perturbed and unperturbed orbits.
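
The Hamming distance between perturbed and unperturbed orbits can be measured directly in a small simulation. The sketch below assumes a classic random Boolean network in which every node has `k` random inputs and a random truth table; that setup, the parameter values and the function names are illustrative assumptions, and the percolation mapping itself is not implemented.

```python
import random


def random_boolean_network(n, k, rng):
    """Each node gets k random input nodes and a random truth table."""
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronous update: every node looks up its inputs' current states."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for j in node_inputs:
            index = (index << 1) | state[j]
        new_state.append(table[index])
    return new_state


def hamming_fraction(a, b):
    """Fraction of nodes whose states differ between two configurations."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def perturbation_growth(n=200, k=3, flips=5, steps=50, seed=0):
    """Track the Hamming distance between an orbit and a perturbed copy."""
    rng = random.Random(seed)
    inputs, tables = random_boolean_network(n, k, rng)
    state = [rng.randrange(2) for _ in range(n)]
    perturbed = list(state)
    for i in rng.sample(range(n), flips):
        perturbed[i] ^= 1
    history = []
    for _ in range(steps):
        history.append(hamming_fraction(state, perturbed))
        state = step(state, inputs, tables)
        perturbed = step(perturbed, inputs, tables)
    return history


history = perturbation_growth()
print(history[0], sum(history[-10:]) / 10)
```

Averaging the late-time values over many networks and initial conditions gives a numerical estimate of the long-time average distance that the abstract's theory predicts.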

preprint · 2012 · arXiv

Onset of Irreversibility in Cyclic Shear of Granular Packings

We investigate the onset of irreversibility in a dense granular medium subjected to cyclic shear in a split-bottom geometry. To probe the micro and mesoscale we image bead trajectories in 3D throughout a series of shear strain oscillations. Though beads lose and regain contact with neighbors during a cycle, the topology of the contact network exhibits reversible properties for small oscillation amplitudes. With increasing reversal amplitude a transition to an irreversible diffusive regime occurs.

preprint · 2012 · arXiv

The Stability of Boolean Networks with Generalized Canalizing Rules

Boolean networks are discrete dynamical systems in which the state (zero or one) of each node is updated at each time t to a state determined by the states at time t-1 of those nodes that have links to it. When these systems are used to model genetic control, the case of 'canalizing' update rules is of particular interest. A canalizing rule is one for which a node state at time t is determined by the state at time t-1 of a single one of its inputs when that inputting node is in its canalizing state. Previous work on the order/disorder transition in Boolean networks considered complex, non-random network topology. In the current paper we extend this previous work to account for canalizing behavior.
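
The definition of a canalizing rule given above can be checked mechanically for small rules. The sketch below covers only the plain definition, not the generalized rules of the paper's title; the helper name `canalizing_inputs` and the example rules are illustrative assumptions.

```python
from itertools import product


def canalizing_inputs(rule, k):
    """Return (input_index, canalizing_value, forced_output) triples for a
    k-input Boolean rule, where `rule` maps a tuple of k bits to 0 or 1."""
    found = []
    for i in range(k):
        for value in (0, 1):
            # Collect the outputs over all input combinations with input i fixed.
            outputs = {
                rule(bits)
                for bits in product((0, 1), repeat=k)
                if bits[i] == value
            }
            if len(outputs) == 1:
                found.append((i, value, outputs.pop()))
    return found


def and_rule(bits):
    return bits[0] & bits[1]


def xor_rule(bits):
    return bits[0] ^ bits[1]


print(canalizing_inputs(and_rule, 2))  # AND: either input at 0 forces output 0
print(canalizing_inputs(xor_rule, 2))  # XOR has no canalizing input
```

The exhaustive check costs on the order of 2^k rule evaluations per input, which is fine for the small in-degrees typical of gene-regulation models.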

preprint · 2011 · arXiv

Multiscale Dynamics in Communities of Phase Oscillators

We investigate the dynamics of systems of many coupled phase oscillators with heterogeneous frequencies. We suppose that the oscillators occur in M groups. Each oscillator is connected to other oscillators in its group with "attractive" coupling, such that the coupling promotes synchronization within the group. The coupling between oscillators in different groups is "repulsive"; i.e., their oscillation phases repel. To address this problem, we reduce the governing equations to a lower-dimensional form via the ansatz of Ott and Antonsen. We first consider the symmetric case where all group parameters are the same, and the attractive and repulsive coupling are also the same for each of the M groups. We find a manifold L of neutrally stable equilibria, and we show that all other equilibria are unstable. For M ≥ 3, L has dimension M - 2, and for M = 2 it has dimension 1. To address the general asymmetric case, we then introduce small deviations from symmetry in the group and coupling parameters. Doing a slow/fast timescale analysis, we obtain slow time evolution equations for the motion of the M groups on the manifold L. We use these equations to study the dynamics of the groups and compare the results with numerical simulations.

preprint · 2011 · arXiv

The path to fracture in granular flows: dynamics of contact networks

Capturing the dynamics of granular flows at intermediate length scales can often be difficult. We propose studying the dynamics of contact networks as a new tool to study fracture at intermediate scales. Using experimental three-dimensional flow fields with particle-scale resolution, we calculate the time evolving broken-links network and find that a giant component of this network is formed as shear is applied to this system. We implement a model of link breakages where the probability of a link breaking is proportional to the average rate of longitudinal strain (elongation) in the direction of the edge and find that the model demonstrates qualitative agreement with the data when studying the onset of the giant component. We note, however, that the broken-links network formed in the model is less clustered than our experimental observations, indicating that the model reflects less localized breakage events and does not fully capture the dynamics of the granular flow.

preprint · 2010 · arXiv

Universality under conditions of self-tuning

We study systems with a continuous phase transition that tune their parameters to maximize a quantity that diverges solely at a unique critical point. Varying the size of these systems with dynamically adjusting parameters, the same finite-size scaling is observed as in systems where all relevant parameters are fixed at their critical values. This scheme is studied using a self-tuning variant of the Ising model. It is contrasted with a scheme where systems approach criticality through a target value for the order parameter that vanishes with increasing system size. In the former scheme, the universal exponents are observed in naive finite-size scaling studies, whereas in the latter they are not.

preprint · 2009 · arXiv

The effect of network topology on the stability of discrete state models of genetic control

Boolean networks have been proposed as potentially useful models for genetic control. An important aspect of these networks is the stability of their dynamics in response to small perturbations. Previous approaches to stability have assumed uncorrelated random network structure. Real gene networks typically have nontrivial topology significantly different from the random network paradigm. In order to address such situations, we present a general method for determining the stability of large Boolean networks of any specified network topology and predicting their steady-state behavior in response to small perturbations. Additionally, we generalize to the case where individual genes have a distribution of 'expression biases,' and we consider non-synchronous update, as well as extension of our method to non-Boolean models in which there are more than two possible gene states. We find that stability is governed by the maximum eigenvalue of a modified adjacency matrix, and we test this result by comparison with numerical simulations. We also discuss the possible application of our work to experimentally inferred gene networks.
