Researcher profile

Colm Connaughton

Colm Connaughton contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published item(s)

Preprint, 2022, arXiv

Forecasting new diseases in low-data settings using transfer learning

Recent infectious disease outbreaks, such as the COVID-19 pandemic and the Zika epidemic in Brazil, have demonstrated both the importance and difficulty of accurately forecasting novel infectious diseases. When new diseases first emerge, we have little knowledge of the transmission process, the level and duration of immunity to reinfection, or other parameters required to build realistic epidemiological models. Time series forecasts and machine learning, while less reliant on assumptions about the disease, require large amounts of data that are also not available in early stages of an outbreak. In this study, we examine how knowledge of related diseases can help make predictions of new diseases in data-scarce environments using transfer learning. We implement both an empirical and a theoretical approach. Using empirical data from Brazil, we compare how well different machine learning models transfer knowledge between two different disease pairs: (i) dengue and Zika, and (ii) influenza and COVID-19. In the theoretical analysis, we generate data using different transmission and recovery rates with an SIR compartmental model, and then compare the effectiveness of different transfer learning methods. We find that transfer learning offers the potential to improve predictions, even beyond a model based on data from the target disease, though the appropriate source disease must be chosen carefully. While imperfect, these models offer an additional input for decision makers during pandemic response.
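As a rough illustration of the theoretical setup described in this abstract, the sketch below generates synthetic epidemic curves from an SIR compartmental model using different transmission and recovery rates. The function name `simulate_sir`, the forward-Euler integration, the initial conditions and the rate values are all illustrative assumptions and are not taken from the paper.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, days=100, dt=0.1):
    """Integrate the SIR equations with forward Euler (population fractions).

    dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I
    Returns the infected fraction sampled once per day (days + 1 values).
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    steps_per_day = round(1.0 / dt)
    daily_infected = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        daily_infected.append(i)
    return daily_infected


# Hypothetical "source" and "target" diseases that differ only in their rates.
source = simulate_sir(beta=0.4, gamma=0.1)
target = simulate_sir(beta=0.3, gamma=0.1)
```

Curves like `source` could stand in for the data-rich disease and a truncated prefix of `target` for the data-scarce one when comparing transfer learning methods.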

Preprint, 2021, arXiv

Dynamic and interpretable hazard-based models of traffic incident durations

Understanding and predicting the duration or "return-to-normal" time of traffic incidents is important for system-level management and optimisation of road transportation networks. Increasing real-time availability of multiple data sources characterising the state of urban traffic networks, together with advances in machine learning, offers the opportunity for new and improved approaches to this problem that go beyond static statistical analyses of incident duration. In this paper we consider two such improvements: dynamic update of incident duration predictions as new information about incidents becomes available and automated interpretation of the factors responsible for these predictions. For our use case, we take one year of incident data and traffic state time-series from the M25 motorway in London. We use it to train models that predict the probability distribution of incident durations, utilising both time-invariant and time-varying features of the data. The latter allow predictions to be updated as an incident progresses and more information becomes available. For dynamic predictions, time-series features are fed into the Match-Net algorithm, a temporal convolutional hitting-time network, recently developed for dynamical survival analysis in clinical applications. The predictions are benchmarked against static regression models for survival analysis and against an established dynamic technique known as landmarking, and are found to perform favourably by several standard comparison measures. To provide interpretability, we utilise the concept of Shapley values from the domain of interpretable artificial intelligence to rank the features most relevant to the model predictions at different time horizons. For example, the time of day is always a significantly influential time-invariant feature, whereas the time-series features strongly influence predictions at 5- and 60-minute horizons.

Preprint, 2020, arXiv

A non-parametric Hawkes process model of primary and secondary accidents on a UK smart motorway

A self-exciting spatio-temporal point process is fitted to incident data from the UK National Traffic Information Service to model the rates of primary and secondary accidents on the M25 motorway in a 12-month period during 2017-18. This process uses a background component to represent primary accidents, and a self-exciting component to represent secondary accidents. The background consists of periodic daily and weekly components, a spatial component and a long-term trend. The self-exciting components are decaying, unidirectional functions of space and time. These components are determined via kernel smoothing and likelihood estimation. Temporally, the background is stable across seasons with a daily double peak structure reflecting commuting patterns. Spatially, there are two peaks in intensity, one of which becomes more pronounced during the study period. Self-excitation accounts for 6-7% of the data with associated time and length scales around 100 minutes and 1 kilometre respectively. In-sample and out-of-sample validation are performed to assess the model fit. When we restrict the data to incidents that resulted in large speed drops on the network, the results remain coherent.
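To make the idea of a background component plus a self-exciting component concrete, here is a minimal sketch of a purely temporal Hawkes process simulated by Ogata-style thinning. It is a simplification, not the paper's method: the abstract describes a non-parametric spatio-temporal model fitted by kernel smoothing, whereas this sketch assumes a constant background rate `mu` and an exponential triggering kernel. The names `intensity` and `simulate_hawkes` and all parameter values are hypothetical; the branching ratio and decay time only loosely echo the 6-7% and 100-minute figures quoted above.

```python
import math
import random


def intensity(t, events, mu, alpha, omega):
    """Conditional intensity: background mu plus decaying kicks from past events."""
    return mu + sum(alpha * omega * math.exp(-omega * (t - ti))
                    for ti in events if ti < t)


def simulate_hawkes(mu, alpha, omega, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning.

    alpha is the branching ratio (expected number of direct offspring per
    event) and 1/omega is the decay time of the self-excitation.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # The intensity only decays between events, so its value just after
        # the current time is a valid upper bound until the next event.
        bound = mu + sum(alpha * omega * math.exp(-omega * (t - ti))
                         for ti in events)
        t += rng.expovariate(bound)
        if t >= horizon:
            break
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= intensity(t, events, mu, alpha, omega):
            events.append(t)
    return events


# Illustrative values only: time in minutes, decay time of 100 minutes.
events = simulate_hawkes(mu=0.01, alpha=0.065, omega=1.0 / 100.0, horizon=10000.0)
```

With a branching ratio `alpha`, a fraction of roughly `alpha` of all events is triggered by earlier events rather than by the background, which is the sense in which self-excitation "accounts for" a share of the data in this simplified setting.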

Preprint, 2020, arXiv

Discovering causal factors of drought in Ethiopia

Drought is a costly natural hazard, many aspects of which remain poorly understood. It has many contributory factors, driving its onset, duration, and severity, including land surface, anthropogenic activities, and, most importantly, meteorological anomalies. Prediction plays a crucial role in drought preparedness and risk mitigation. However, this is a challenging task at socio-economically critical lead times (1-2 years), because meteorological anomalies operate at a wide range of temporal and spatial scales. Among them, past studies have shown a correlation between the Sea Surface Temperature (SST) anomaly and the amount of precipitation in various locations in Africa. In its Eastern part, the cooling phase of El Niño-Southern Oscillation (ENSO) and SST anomaly in the Indian Ocean are correlated with the lack of rainfall. Given the intrinsic shortcomings of correlation coefficients, we investigate, in terms of causality, the association between SST modes of variability and the monthly fraction of grid points in Ethiopia that are in drought conditions. Using the empirical extreme quantiles of precipitation distribution as a proxy for drought, we show that the level of the second SST mode of variability in the prior year influences the occurrence of drought in Ethiopia. The causal link between these two variables has a negative coefficient that verifies the conclusion of past studies that rainfall deficiency in the Horn of Africa is associated with ENSO's cooling phase.

Preprint, 2020, arXiv

Disease and information spreading at different speeds in multiplex networks

Nowadays, one of the challenges we face when carrying out modeling of epidemic spreading is to develop methods to control disease transmission. In this article we study how the spreading of knowledge of a disease affects the propagation of that disease in a population of interacting individuals. For that, we analyze the interaction between two different processes on multiplex networks: the propagation of an epidemic using the susceptible-infected-susceptible dynamics and the dissemination of information about the disease (and its prevention methods) using the unaware-aware-unaware dynamics, so that informed individuals are less likely to be infected. Unlike previous related models where disease and information spread at the same time scale, we introduce here a parameter that controls the relative speed between the propagation of the two processes. We study the behavior of this model using a mean-field approach that gives results in good agreement with Monte Carlo simulations on homogeneous complex networks. We find that increasing the rate of information dissemination reduces the disease prevalence, as one may expect. However, increasing the speed of the information process as compared to that of the epidemic process has the counterintuitive effect of increasing the disease prevalence. This result opens an interesting discussion about the effects of information spreading on disease propagation.

Preprint, 2019, arXiv

The role of zero-clusters in exchange-driven growth with and without input

The exchange-driven growth model describes the mean field kinetics of a population of composite particles (clusters) subject to pairwise exchange interactions. Exchange in this context means that upon interaction of two clusters, one loses a constituent unit (monomer) and the other gains this unit. Two variants of the exchange-driven growth model appear in applications. They differ in whether clusters of zero size are considered active or passive. In the active case, clusters of size zero can acquire a monomer from clusters of positive size. In the passive case they cannot, meaning that clusters reaching size zero are effectively removed from the system. The large time behaviour is very different for the two variants of the model. We first consider an isolated system. In the passive case, the cluster size distribution tends towards a self-similar evolution and the typical cluster size grows as a power of time. In the active case, we identify a broad class of kernels for which the cluster size distribution tends to a non-trivial time-independent equilibrium in which the typical cluster size is finite. We next consider a non-isolated system in which monomers are input at a constant rate. In the passive case, the cluster size distribution again attains a self-similar profile in which the typical cluster size grows as a power of time. In the active case, a surprising new behaviour is found: the cluster size distribution asymptotes to the same equilibrium profile found in the isolated case but with an amplitude that grows linearly in time.
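The difference between the active and passive variants can be illustrated with a small Monte Carlo sketch of an isolated system. This is a toy particle simulation under stated assumptions, not the mean-field analysis of the paper: it assumes a constant exchange kernel (every ordered pair of clusters is equally likely to interact), and the function name `simulate_exchange`, the initial condition and the step count are hypothetical.

```python
import random


def simulate_exchange(sizes, steps, active, seed=0):
    """Toy exchange-driven growth with a constant kernel.

    At each step a random (donor, receiver) pair is drawn. The donor passes
    one monomer to the receiver. If `active` is False, clusters of size zero
    cannot receive monomers, so they are effectively removed from the system.
    """
    rng = random.Random(seed)
    sizes = list(sizes)
    n = len(sizes)
    for _ in range(steps):
        donor, receiver = rng.sample(range(n), 2)
        if sizes[donor] == 0:
            continue  # an empty cluster has nothing to give
        if sizes[receiver] == 0 and not active:
            continue  # passive variant: empty clusters stay empty
        sizes[donor] -= 1
        sizes[receiver] += 1
    return sizes


initial = [1] * 200  # monodisperse start: 200 clusters of one monomer each
passive = simulate_exchange(initial, steps=20000, active=False)
active = simulate_exchange(initial, steps=20000, active=True)

# Mean size of the surviving (non-empty) clusters in each variant.
mean_passive = sum(passive) / sum(1 for s in passive if s > 0)
mean_active = sum(active) / sum(1 for s in active if s > 0)
```

Total mass is conserved in both variants. In the passive variant the number of non-empty clusters can only decrease, so the mean size of survivors keeps growing, which is the toy analogue of the coarsening described in the abstract.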

Preprint, 2010, arXiv

Scaling properties of one-dimensional cluster-cluster aggregation with Levy diffusion

We present a study of the scaling properties of cluster-cluster aggregation with a source of monomers in the stationary state when the spatial transport of particles occurs by Levy flights. We show that the transition from mean-field statistics to fluctuation-dominated statistics which, for the more commonly considered case of diffusive transport, occurs as the spatial dimension of the system is tuned through two from above, can be mimicked even in one dimension by varying the characteristic exponent, beta, of the Levy jump length distribution. We also show that the two-point mass correlation function, responsible for the flux of mass in the stationary state, is strongly universal: its scaling exponent is given by the mean field value independent of the spatial dimension and independent of the value of beta. Finally, we study numerically the two-point spatial correlation function which characterises the structure of the depletion zone around heavy particles in the diffusion-limited regime. We find that this correlation function vanishes with a non-trivial fractional power of the separation between particles as this separation goes to zero. We provide a scaling argument for the value of this exponent which is in reasonable agreement with the numerical measurements.