Researcher profile

Laurent Hébert-Dufresne

Laurent Hébert-Dufresne contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
16 works
0 followers
10 topics
4 close collaborators

Published work

16 published items

preprint, 2023, arXiv

Nonlinear bias toward complex contagion in uncertain transmission settings

Current epidemics in the biological and social domains are challenging the standard assumptions of mathematical contagion models. Chief among them are the complex patterns of transmission caused by heterogeneous group sizes and infection risk varying by orders of magnitude in different settings, like indoor versus outdoor gatherings in the COVID-19 pandemic or different moderation practices in social media communities. However, quantifying these heterogeneous levels of risk is difficult and most models typically ignore them. Here, we include these novel features in an epidemic model on weighted hypergraphs to capture group-specific transmission rates. We study analytically the consequences of ignoring the heterogeneous transmissibility and find an induced superlinear infection rate during the emergence of a new outbreak, even though the underlying mechanism is a simple, linear contagion. The dynamics produced at the individual and group levels are therefore more similar to complex, nonlinear contagions, thus blurring the line between simple and complex contagions in realistic settings. We support this claim by introducing a Bayesian inference framework to quantify the nonlinearity of contagion processes. We show that simple contagions on real weighted hypergraphs are systematically biased toward the superlinear regime if the heterogeneity of the weights is ignored, greatly increasing the risk of erroneous classification as complex contagions. Our results provide an important cautionary tale for the challenging task of inferring transmission mechanisms from incidence data. Yet, it also paves the way for effective models that capture complex features of epidemics through nonlinear infection rates.

preprint, 2023, arXiv

TRACE-Omicron: Policy Counterfactuals to Inform Mitigation of COVID-19 Spread in the United States

The Omicron wave was the largest wave of the COVID-19 pandemic to date, more than doubling any other in terms of cases and hospitalizations in the United States. In this paper, we present a large-scale agent-based model of policy interventions that could have been implemented to mitigate the Omicron wave. Our model takes into account the behaviors of individuals and their interactions with one another within a nationally representative population, as well as the efficacy of various interventions such as social distancing, mask wearing, testing, tracing, and vaccination. We use the model to simulate the impact of different policy scenarios and evaluate their potential effectiveness in controlling the spread of the virus. Our results suggest the Omicron wave could have been substantially curtailed via a combination of interventions comparable in effectiveness to extreme and unpopular singular measures such as widespread closure of schools and workplaces, and highlight the importance of early and decisive action.

preprint, 2022, arXiv

Hierarchical team structure and multidimensional localization (or siloing) on networks

Knowledge silos emerge when structural properties of organizational interaction networks limit the diffusion of information. These structural barriers are known to take many forms at different scales - hubs in otherwise sparse organisations, large dense teams, or global core-periphery structure - but we lack an understanding of how these different structures interact. Here we bridge the gap between the mathematical literature on localization of spreading dynamics and the more applied literature on knowledge silos in organizational interaction networks. To do so, we introduce a new model that considers a layered structure of teams to unveil a new form of hierarchical localization (i.e., the localization of information at the top or center of an organization) and study its interplay with known phenomena of mesoscopic localization (i.e., the localization of information in large groups), $k$-core localization (i.e., around denser $k$-cores) and hub localization (i.e., around high degree stars). We also include a complex contagion mechanism by considering a general infection kernel which can depend on hierarchical level (influence), degree (popularity), infectious neighbors (social reinforcement) or team size (importance). This general model allows us to study the multifaceted phenomenon of information siloing in complex organizational interaction networks and opens the door to new optimization problems to promote or hinder the emergence of different localization regimes.

preprint, 2022, arXiv

Limits of Individual Consent and Models of Distributed Consent in Online Social Networks

Personal data are not discrete in socially-networked digital environments. A user who consents to allow access to their profile can expose the personal data of their network connections to non-consented access. Therefore, the traditional consent model (informed and individual) is not appropriate in social networks where informed consent may not be possible for all users affected by data processing and where information is distributed across users. Here, we outline the adequacy of consent for data transactions. Informed by the shortcomings of individual consent, we introduce both a platform-specific model of "distributed consent" and a cross-platform model of a "consent passport." In both models, individuals and groups can coordinate by giving consent conditional on that of their network connections. We simulate the impact of these distributed consent models on the observability of social networks and find that low adoption would allow macroscopic subsets of networks to preserve their connectivity and privacy.

preprint, 2022, arXiv

Predicting the diversity of early epidemic spread on networks

The interplay of biological, social, structural and random factors makes disease forecasting extraordinarily complex. The course of an epidemic exhibits average growth dynamics determined by features of the pathogen and the population, yet also features significant variability reflecting the stochastic nature of disease spread. In this work, we reframe a stochastic branching process analysis in terms of probability generating functions and compare it to continuous time epidemic simulations on networks. In doing so, we predict the diversity of emerging epidemic courses on both homogeneous and heterogeneous networks. We show how the challenge of inferring the early course of an epidemic falls on the randomness of disease spread more so than on the heterogeneity of contact patterns. We provide an analysis which helps quantify, in real time, the probability that an epidemic goes supercritical or conversely, dies stochastically. These probabilities are often assumed to be one and zero, respectively, if the basic reproduction number, or R0, is greater than 1, ignoring the heterogeneity and randomness inherent to disease spread. This framework can give more insight into early epidemic spread by weighting standard deterministic models with likelihood to inform pandemic preparedness with probabilistic forecasts.
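As a rough illustration of the generating-function idea in this abstract, the following sketch computes the probability that an outbreak started by a single case dies out stochastically. It assumes a simple Galton-Watson branching process, which is a simplification and not necessarily the paper's framework. The function names, the Poisson and negative binomial offspring distributions, and the parameter values are all assumptions made for this example.

```python
import math


def poisson_pgf(x, r0):
    """PGF of a Poisson offspring distribution with mean r0 (assumed model)."""
    return math.exp(r0 * (x - 1.0))


def negative_binomial_pgf(x, r0, k):
    """PGF of a negative binomial offspring distribution with mean r0 and
    dispersion k; smaller k means more heterogeneity (assumed model)."""
    return (1.0 + r0 * (1.0 - x) / k) ** (-k)


def extinction_probability(pgf, tol=1e-12, max_iter=100_000):
    """Smallest fixed point of q = G(q) on [0, 1], found by iterating from q = 0.

    For a Galton-Watson process seeded by one case, this is the probability
    that the outbreak dies out stochastically.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


if __name__ == "__main__":
    # Illustrative values only: the same mean (2.0) with and without heterogeneity.
    q_poisson = extinction_probability(lambda x: poisson_pgf(x, 2.0))
    q_negbin = extinction_probability(lambda x: negative_binomial_pgf(x, 2.0, 0.1))
    print(f"Poisson, R0 = 2: extinction probability {q_poisson:.3f}")
    print(f"Negative binomial, R0 = 2, k = 0.1: extinction probability {q_negbin:.3f}")
```

Even with the same R0 above 1, the extinction probability is well above zero, and it is larger for the more heterogeneous offspring distribution. This is one way to see why treating the supercritical probability as one can mislead.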

preprint, 2022, arXiv

The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories

Communication surrounding the development of an open source project largely occurs outside the software repository itself. Historically, large communities often used a collection of mailing lists to discuss the different aspects of their projects. Multimodal tool use, with software development and communication happening on different channels, complicates the study of open source projects as a sociotechnical system. Here, we combine and standardize mailing lists of the Python community, resulting in 954,287 messages from 1995 to the present. We share all scraping and cleaning code to facilitate reproduction of this work, as well as smaller datasets for the Golang (122,721 messages), Angular (20,041 messages) and Node.js (12,514 messages) communities. To showcase the usefulness of these data, we focus on the CPython repository and merge the technical layer (which GitHub account works on what file and with whom) with the social layer (messages from unique email addresses) by identifying 33% of GitHub contributors in the mailing list data. We then explore correlations between the valence of social messaging and the structure of the collaboration network. We discuss how these data provide a laboratory to test theories from standard organizational science in large open source projects.

preprint, 2022, arXiv

The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative

GitHub has become the central online platform for much of open source, hosting most open source code repositories. With this popularity, the public digital traces of GitHub are now a valuable means to study teamwork and collaboration. In many ways, however, GitHub is a convenience sample, and may not be representative of open source development off the platform. Here we develop a novel, extensive sample of public open source project repositories outside of centralized platforms. We characterize these projects along a number of dimensions and compare them to a time-matched sample of corresponding GitHub projects. Our sample projects tend to have more collaborators, are maintained for longer periods, and tend to be more focused on academic and scientific problems.

preprint, 2022, arXiv

The role of directionality, heterogeneity and correlations in epidemic risk and spread

Most models of epidemic spread, including many designed specifically for COVID-19, implicitly assume mass-action contact patterns and undirected contact networks, meaning that the individuals most likely to spread the disease are also the most at risk to receive it from others. Here, we review results from the theory of random directed graphs which show that many important quantities, including the reproduction number and the epidemic size, depend sensitively on the joint distribution of in- and out-degrees ("risk" and "spread"), including their heterogeneity and the correlation between them. By considering joint distributions of various kinds, we elucidate why some types of heterogeneity cause a deviation from the standard Kermack-McKendrick analysis of SIR models, i.e., so-called mass-action models where contacts are homogeneous and random, and some do not. We also show that some structured SIR models informed by realistic complex contact patterns among types of individuals (age or activity) are simply mixtures of Poisson processes and tend not to deviate significantly from the simplest mass-action model. Finally, we point out some possible policy implications of this directed structure, both for contact tracing strategy and for interventions designed to prevent superspreading events. In particular, directed graphs have a forward and backward version of the classic "friendship paradox" -- forward edges tend to lead to individuals with high risk, while backward edges lead to individuals with high spread -- such that a combination of both forward and backward contact tracing is necessary to find superspreading events and prevent future cascades of infection.
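The forward and backward "friendship paradox" mentioned in this abstract can be checked on a toy directed graph. The sketch below is only an illustration: the function name and the edge list are hypothetical, and an edge `(source, target)` is assumed to mean that `source` can infect `target`, so in-degree stands for "risk" and out-degree for "spread".

```python
def friendship_paradox_means(edges):
    """Compare the mean degree of a random node with the degrees found by
    following a random edge forward (target's in-degree) or backward
    (source's out-degree).

    edges: list of (source, target) pairs of a directed graph.
    Returns (mean_degree, forward_mean_in_degree, backward_mean_out_degree).
    """
    in_degree = {}
    out_degree = {}
    nodes = set()
    for source, target in edges:
        nodes.update((source, target))
        out_degree[source] = out_degree.get(source, 0) + 1
        in_degree[target] = in_degree.get(target, 0) + 1

    # Mean in-degree and mean out-degree over nodes are equal: edges / nodes.
    mean_degree = len(edges) / len(nodes)
    # Average "risk" of the node reached by following a random edge forward.
    forward = sum(in_degree[target] for _, target in edges) / len(edges)
    # Average "spread" of the node reached by following a random edge backward.
    backward = sum(out_degree[source] for source, _ in edges) / len(edges)
    return mean_degree, forward, backward


if __name__ == "__main__":
    # Hypothetical toy graph: node 0 spreads widely, node 3 is highly exposed.
    toy_edges = [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
    print(friendship_paradox_means(toy_edges))
```

On this toy graph both edge-based averages exceed the node average, matching the abstract's point that tracing along edges in either direction preferentially reaches high-risk or high-spread individuals.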

preprint, 2021, arXiv

Social confinement and mesoscopic localization of epidemics on networks

Recommendations around epidemics tend to focus on individual behaviors, with much less effort devoted to guiding event cancellations and other collective behaviors, since most models lack the higher-order structure necessary to describe large gatherings. Through a higher-order description of contagions on networks, we model the impact of a blanket cancellation of events larger than a critical size and find that epidemics can suddenly collapse when interventions operate over groups of individuals rather than at the level of individuals. We relate this phenomenon to the onset of mesoscopic localization, where contagions concentrate around dominant groups.

preprint, 2020, arXiv

Beyond $R_0$: Heterogeneity in secondary infections and probabilistic epidemic forecasting

The basic reproductive number -- $R_0$ -- is one of the most common and most commonly misapplied numbers in public health. Although often used to compare outbreaks and forecast pandemic risk, this single number belies the complexity that two different pathogens can exhibit, even when they have the same $R_0$. Here, we show how to predict outbreak size using estimates of the distribution of secondary infections, leveraging both its average $R_0$ and the underlying heterogeneity. To do so, we reformulate and extend a classic result from random network theory that relies on contact tracing data to simultaneously determine the first moment ($R_0$) and the higher moments (representing the heterogeneity) in the distribution of secondary infections. Further, we show the different ways in which this framework can be implemented in the data-scarce reality of emerging pathogens. Lastly, we demonstrate that without data on the heterogeneity in secondary infections for emerging infectious diseases like COVID-19, the uncertainty in outbreak size ranges dramatically. Taken together, our work highlights the critical need for contact tracing during emerging infectious disease outbreaks and the need to look beyond $R_0$ when predicting epidemic size.

preprint, 2020, arXiv

Countering hate on social media: Large scale classification of hate and counter speech

Hateful rhetoric is plaguing online discourse, fostering extreme societal movements and possibly giving rise to real-world violence. A potential solution to this growing global problem is citizen-generated counter speech where citizens actively engage in hate-filled conversations to attempt to restore civil non-polarized discourse. However, its actual effectiveness in curbing the spread of hatred is unknown and hard to quantify. One major obstacle to researching this question is a lack of large labeled data sets for training automated classifiers to identify counter speech. Here we made use of a unique situation in Germany where self-labeling groups engaged in organized online hate and counter speech. We used an ensemble learning algorithm which pairs a variety of paragraph embeddings with regularized logistic regression functions to classify both hate and counter speech in a corpus of millions of relevant tweets from these two groups. Our pipeline achieved macro F1 scores on out-of-sample balanced test sets ranging from 0.76 to 0.97, an accuracy in line with, and even exceeding, the state of the art. On thousands of tweets, we used crowdsourcing to verify that the judgments made by the classifier are in close alignment with human judgment. We then used the classifier to discover hate and counter speech in more than 135,000 fully-resolved Twitter conversations occurring from 2013 to 2018 and study their frequency and interaction. Altogether, our results highlight the potential of automated methods to evaluate the impact of coordinated counter speech in stabilizing conversations on social media.

preprint, 2020, arXiv

Immunization Strategies in Networks with Missing Data

Network-based intervention strategies can be effective and cost-efficient approaches to curtailing harmful contagions in myriad settings. As studied, these strategies are often impractical to implement, as they typically assume complete knowledge of the network structure, which is unusual in practice. In this paper, we investigate how different immunization strategies perform under realistic conditions where the strategies are informed by partially-observed network data. Our results suggest that global immunization strategies, like degree immunization, are optimal in most cases; the exception is at very high levels of missing data, where stochastic strategies, like acquaintance immunization, begin to outstrip them in minimizing outbreaks. Stochastic strategies are more robust in some cases due to the different ways in which they can be affected by missing data. In fact, one of our proposed variants of acquaintance immunization leverages a logistically-realistic ongoing survey-intervention process as a form of targeted data-recovery to improve with increasing levels of missing data. These results support the effectiveness of targeted immunization as a general practice. They also highlight the risks of considering networks as idealized mathematical objects: overestimating the accuracy of network data and forgoing the rewards of additional inquiry.
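The two families of strategies named in this abstract can be sketched on a small undirected graph. This is a minimal illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, missing data is modeled here as edges removed uniformly at random, and node labels are assumed to be comparable (for example, integers).

```python
import random


def drop_edges(adjacency, missing_fraction, rng):
    """Return an observed copy of the graph in which each undirected edge is
    missing with probability missing_fraction (assumed missing-data model)."""
    observed = {node: set() for node in adjacency}
    for u in adjacency:
        for v in adjacency[u]:
            # Visit each undirected edge once; requires comparable labels.
            if u < v and rng.random() >= missing_fraction:
                observed[u].add(v)
                observed[v].add(u)
    return observed


def degree_immunization(adjacency, budget):
    """Global strategy: immunize the `budget` nodes with highest observed degree."""
    ranked = sorted(adjacency, key=lambda node: len(adjacency[node]), reverse=True)
    return set(ranked[:budget])


def acquaintance_immunization(adjacency, budget, rng, max_draws=10_000):
    """Stochastic strategy: repeatedly pick a random node and immunize one of
    its randomly chosen observed neighbors."""
    nodes = list(adjacency)
    immunized = set()
    for _ in range(max_draws):
        if len(immunized) >= budget:
            break
        respondent = rng.choice(nodes)
        neighbors = list(adjacency[respondent])
        if neighbors:  # respondents with no observed neighbors name no one
            immunized.add(rng.choice(neighbors))
    return immunized


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical star graph: node 0 is connected to nodes 1..5.
    star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
    observed = drop_edges(star, missing_fraction=0.5, rng=rng)
    print("degree:", degree_immunization(observed, budget=1))
    print("acquaintance:", acquaintance_immunization(observed, budget=1, rng=rng))
```

Degree immunization needs the full observed degree ranking, while acquaintance immunization only needs each sampled respondent to name a neighbor, which is why the two can degrade differently as edges go missing.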

preprint, 2020, arXiv

Master equation analysis of mesoscopic localization in contagion dynamics on higher-order networks

Simple models of infectious diseases tend to assume random mixing of individuals, but real interactions are not random pairwise encounters: they occur within various types of gatherings such as workplaces, households, schools, and concerts, best described by a higher-order network structure. We model contagions on higher-order networks using group-based approximate master equations, in which we track all states and interactions within a group of nodes and assume a mean-field coupling between them. Using the Susceptible-Infected-Susceptible dynamics, our approach reveals the existence of a mesoscopic localization regime, where a disease can concentrate and self-sustain only around large groups in the network's overall organization. In this regime, the phase transition is smeared, characterized by an inhomogeneous activation of the groups. At the mesoscopic level, we observe that the distribution of infected nodes within groups of the same size can be very dispersed, even bimodal. When considering heterogeneous networks, both at the level of nodes and groups, we characterize analytically the region associated with mesoscopic localization in the structural parameter space. We put this phenomenon in perspective with eigenvector localization and discuss how a focus on higher-order structures is needed to discern the more subtle localization at the mesoscopic level. Finally, we discuss how mesoscopic localization affects the response to structural interventions and how this framework could provide important insights for a broad range of dynamics.

preprint, 2020, arXiv

Spread of infectious disease and social awareness as parasitic contagions on clustered networks

There is a rich history of models for the interaction of a biological contagion like influenza with the spread of related information such as an influenza vaccination campaign. Recent work on the spread of interacting contagions on networks has highlighted that these interacting contagions can have counter-intuitive interplay with network structure. Here we generalize one of these frameworks to tackle three important features of the spread of awareness and disease: one, we model the dynamics on highly clustered, cliquish networks to mimic the role of workplaces and households; two, the awareness contagion affects the spread of the biological contagion by reducing its transmission rate where an aware or vaccinated individual is less likely to be infected; and three, the biological contagion also affects the spread of the awareness contagion but by increasing its transmission rate where an infected individual is more receptive and more likely to share information related to the disease. Under these conditions, we find that increasing network clustering, which is known to hinder disease spread, can actually allow networks to sustain larger epidemics of the disease in models with awareness. This counter-intuitive result goes against the conventional wisdom suggesting that random networks are justifiable as they provide worst-case scenario forecasts. To further investigate this result, we provide a closed-form criterion based on a two-step branching process (i.e., the numbers of expected tertiary infections) to identify different regions in parameter space where the net effect of clustering and co-infection varies. Altogether, our results highlight once again the need to go beyond random networks in disease modeling and illustrate the type of analysis that is possible even in complex models of interacting contagions.

preprint, 2020, arXiv

Stochasticity and heterogeneity in the transmission dynamics of SARS-CoV-2

SARS-CoV-2, the virus that causes COVID-19, has moved rapidly around the globe, infecting millions and killing hundreds of thousands. The basic reproduction number, which has been widely used and misused to characterize the transmissibility of the virus, hides the fact that transmission is stochastic, is dominated by a small number of individuals, and is driven by super-spreading events (SSEs). The distinct transmission features, such as high stochasticity under low prevalence, and the central role played by SSEs in transmission dynamics, should not be overlooked. Many explosive SSEs have occurred in indoor settings such as long-term care facilities, prisons, meat-packing plants, fish factories, cruise ships, family gatherings, parties and night clubs, stoking the pandemic and shaping its spread. These SSEs demonstrate the urgent need to understand routes of transmission, while also presenting an opportunity: outbreaks can be effectively contained with targeted interventions to eliminate SSEs. Here, we describe the potential types of SSEs, how they influence transmission, and give recommendations for control of SARS-CoV-2.

preprint, 2019, arXiv

Interacting contagions are indistinguishable from social reinforcement

From fake news to innovative technologies, many contagions spread via a process of social reinforcement, where multiple exposures are distinct from prolonged exposure to a single source. In contrast, biological agents such as Ebola or measles are typically thought to spread as simple contagions. Here, we demonstrate that interacting simple contagions are indistinguishable from complex contagions. In the social context, our results highlight the challenge of identifying and quantifying mechanisms, such as social reinforcement, in a world where innumerable ideas, memes and behaviors interact. In the biological context, this parallel allows the use of complex contagions to effectively quantify the non-trivial interactions of infectious diseases.