Source author record

William Schueller

William Schueller appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

physics.soc-ph, Social and Information Networks, cs.CY, econ.GN, q-fin.EC, Software Engineering

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators


preprint · 2022 · arXiv

Loss of sustainability in scientific work

For decades the number of scientific publications has been rapidly increasing, effectively outdating knowledge at a tremendous rate. Only a few scientific milestones remain relevant and continuously attract citations. Here we quantify how long scientific work remains in use, how long it takes before today's work is forgotten, and how milestone papers differ from those forgotten. To answer these questions, we study the complete temporal citation network of all American Physical Society journals. We quantify the probability of attracting citations for individual publications based on age and the number of citations they have received in the past. We capture both aspects, the forgetting and the tendency to cite already popular works, in a microscopic generative model for the dynamics of scientific citation networks. We find that the probability of citing a specific paper declines with age as a power law with an exponent of $\alpha \sim -1.4$. Whenever a paper in its early years can be characterized by a scaling exponent above a critical value, $\alpha_c$, the paper is likely to become "ever-lasting". We validate the model with out-of-sample predictions, with an accuracy of up to 90% (AUC $\sim 0.9$). The model also allows us to estimate an expected citation landscape of the future, predicting that 95% of papers cited in 2050 have yet to be published. The exponential growth of articles, combined with a power-law type of forgetting and papers receiving fewer and fewer citations on average, suggests a worrying tendency toward information overload and raises concerns about scientific publishing's long-term sustainability.
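The two ingredients named in this abstract, power-law forgetting and a preference for already cited works, can be illustrated with a toy citation kernel. This is a minimal sketch and not the authors' model: the multiplicative form `(citations + offset) * age**alpha`, the `offset` parameter and all function names are assumptions made for illustration; only the exponent of about -1.4 comes from the abstract.

```python
import random

ALPHA = -1.4  # ageing exponent reported in the abstract


def citation_weights(ages, past_citations, alpha=ALPHA, offset=1.0):
    """Unnormalised citation weight of each paper.

    Assumed form: (past citations + offset) * age**alpha.
    Ages must be positive (for example, years since publication).
    """
    return [(c + offset) * (a ** alpha) for a, c in zip(ages, past_citations)]


def citation_probabilities(ages, past_citations, alpha=ALPHA, offset=1.0):
    """Normalise the weights so they sum to one."""
    weights = citation_weights(ages, past_citations, alpha, offset)
    total = sum(weights)
    return [w / total for w in weights]


def sample_references(ages, past_citations, n_refs, rng=None):
    """Draw the indices of n_refs cited papers, with replacement."""
    rng = rng or random.Random(0)
    weights = citation_weights(ages, past_citations)
    return rng.choices(range(len(ages)), weights=weights, k=n_refs)
```

Under these assumptions, two papers with equal citation counts but ages of 1 and 10 years differ in citation probability by a factor of `10 ** -1.4`, roughly 0.04.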

preprint · 2022 · arXiv

Modeling Interconnected Social and Technical Risks in Open Source Software Ecosystems

Open source software ecosystems consist of thousands of interdependent libraries, which users can combine to great effect. Recent work has pointed out two kinds of risks in these systems: that technical problems like bugs and vulnerabilities can spread through dependency links, and that relatively few developers are responsible for maintaining even the most widely used libraries. However, a more holistic diagnosis of systemic risk in software ecosystems should consider how these social and technical sources of risk interact and amplify one another. Motivated by the observation that the same individuals maintain several libraries within dependency networks, we present a methodological framework to measure risk in software ecosystems as a function of both dependencies and developers. In our models, a library's chance of failure increases as its developers leave and as its upstream dependencies fail. We apply our method to data from the Rust ecosystem, highlighting several systemically important libraries that are overlooked when only considering technical dependencies. We compare potential interventions, seeking better ways to deploy limited developer resources with a view to improving overall ecosystem health and software supply chain resilience.
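The idea that a library's chance of failure rises both when its developers leave and when its upstream dependencies fail can be sketched as follows. This is a hedged illustration, not the framework from the paper: the combination rule, the `spread` parameter, the treatment of libraries without a known team, and all names are assumptions. Dependencies are assumed to be acyclic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def failure_probabilities(deps, devs, departed, spread=0.5):
    """Toy failure probability per library.

    deps:     library -> set of upstream libraries it depends on
    devs:     library -> set of developers maintaining it
    departed: set of developers who have left
    spread:   assumed chance that an upstream failure propagates downstream
    """
    prob = {}
    # static_order() yields dependencies before the libraries that use them.
    for lib in TopologicalSorter(deps).static_order():
        team = devs.get(lib, set())
        # Assumption: social risk is the share of the team that has left;
        # a library with no known team is treated as failed.
        own = len(team & departed) / len(team) if team else 1.0
        survive = 1.0 - own
        for upstream in deps.get(lib, ()):
            survive *= 1.0 - spread * prob[upstream]
        prob[lib] = 1.0 - survive
    return prob
```

For example, with `deps = {"app": {"core"}, "core": set()}`, `devs = {"app": {"ann"}, "core": {"bob"}}` and `departed = {"bob"}`, the sketch assigns `core` a failure probability of 1.0 and `app` a probability of 0.5, even though no developer of `app` has left. Because one developer can appear in several teams, a single departure can raise risk in many places at once.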

preprint · 2022 · arXiv

Propagation of disruptions in supply networks of essential goods: A population-centered perspective of systemic risk

The Covid-19 pandemic drastically emphasized the fragility of national and international supply networks (SNs), leading to significant supply shortages of essential goods for people, such as food and medical equipment. Severe disruptions that propagate along complex SNs can expose the population of entire regions or even countries to these risks. A lack of both data and quantitative methodology has hitherto prevented empirical quantification of the population's vulnerability to disruptions. Here we develop a data-driven simulation methodology to locally quantify actual supply losses for the population that result from the cascading of supply disruptions. We demonstrate the method on a large food SN of a European country including 22,938 business premises, 44,355 supply links and 116 local administrative districts. We rank the business premises with respect to their criticality for the districts' population with the proposed systemic risk index, SRIcrit, to identify around 30 premises that, in case of their failure, are expected to cause critical supply shortages in sizable fractions of the population. The new methodology is immediately policy relevant as a fact-driven and generalizable crisis management tool. This work represents a starting point for quantitatively studying SN disruptions focused on the well-being of the population.
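A cascade of supply disruptions and the resulting loss for each district can be illustrated with a small toy simulation. This sketch does not reproduce the paper's methodology or its SRIcrit index: the linear rule (a premise loses output in proportion to the share of input volume it loses), the data layout and all names are assumptions made for illustration.

```python
def cascade_losses(links, failed, rounds=50):
    """Fraction of output lost at each premise after a cascade.

    links:  dict mapping (supplier, customer) -> supplied volume
    failed: set of premises that fail initially (loss fixed at 1.0)
    Assumption: a premise's loss equals the volume-weighted mean loss
    of its suppliers; premises without suppliers lose nothing.
    """
    nodes = {n for edge in links for n in edge}
    loss = {n: (1.0 if n in failed else 0.0) for n in nodes}
    inputs = {n: [(s, v) for (s, c), v in links.items() if c == n] for n in nodes}
    for _ in range(rounds):
        updated = {}
        for n in nodes:
            if n in failed or not inputs[n]:
                updated[n] = loss[n]
                continue
            total = sum(v for _, v in inputs[n])
            updated[n] = sum(v * loss[s] for s, v in inputs[n]) / total
        if updated == loss:  # nothing changed, the cascade has settled
            break
        loss = updated
    return loss


def district_loss(deliveries, loss):
    """Volume-weighted supply loss per district.

    deliveries: district -> {premise: volume delivered to the district}
    """
    return {
        district: sum(v * loss.get(p, 0.0) for p, v in served.items())
        / sum(served.values())
        for district, served in deliveries.items()
    }
```

In a chain where a mill draws 10 units from a farm and 30 units from an importer, and a shop is supplied only by the mill, a farm failure leaves the mill, the shop, and any district served only by that shop with a loss of 0.25 under these assumptions. Repeating the calculation for each premise in turn gives one simple way to rank premises by the loss they cause.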

preprint · 2021 · arXiv

The Geography of Open Source Software: Evidence from GitHub

Open Source Software (OSS) plays an important role in the digital economy. Yet although software production is amenable to remote collaboration and its outputs are easily shared across distances, software development seems to cluster geographically in places such as Silicon Valley, London, or Berlin. And while recent work indicates that OSS activity creates positive externalities which accrue locally through knowledge spillovers and information effects, up-to-date data on the geographic distribution of active open source developers is limited. This presents a significant blindspot for policymakers, who tend to promote OSS at the national level as a cost-saving tool for public sector institutions. We address this gap by geolocating more than half a million active contributors to GitHub in early 2021 at various spatial scales. Compared to results from 2010, we find a significant increase in the share of developers based in Asia, Latin America and Eastern Europe, suggesting a more even spread of OSS developers globally. Within countries, however, we find significant concentration in regions, exceeding the concentration of workers in high-tech fields. Social and economic development indicators predict at most half of regional variation in OSS activity in the EU, suggesting that clusters of OSS have idiosyncratic roots. We argue that policymakers seeking to foster OSS should focus locally rather than nationally, using the tools of cluster policy to support networks of OSS developers.

preprint · 2020 · arXiv

Quantifying exaptation in scientific evolution

Rediscovering a new function for something can be just as important as the discovery itself. In 1982, Stephen Jay Gould and Elisabeth Vrba named this phenomenon Exaptation to describe a radical shift in the function of a specific trait during biological evolution. While exaptation is thought to be a fundamental mechanism for generating adaptive innovations, diversity, and sophisticated features, relatively little effort has been made to quantify exaptation outside the topic of biological evolution. We think that this concept provides a useful framework for characterising the emergence of innovations in science. This article explores the notion that exaptation arises from the usage of scientific ideas in domains other than the area that they were originally applied to. In particular, we adopt a normalised entropy and an inverse participation ratio as observables that reveal and quantify the concept of exaptation. We identify distinctive patterns of exaptation and expose specific examples of papers that display those patterns. Our approach represents a first step towards the quantification of exaptation phenomena in the context of scientific evolution.
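The two observables named in this abstract have common textbook definitions, sketched below for a paper's citations counted per research field. The exact normalisation used in the paper may differ; the field-count input and the function names are assumptions for illustration.

```python
import math


def shares(counts):
    """Turn citation counts per field into fractions that sum to one."""
    total = sum(counts)
    return [c / total for c in counts]


def normalised_entropy(counts):
    """Shannon entropy of the field shares divided by its maximum, log(N).

    0 means all citations come from one field, 1 means an even spread.
    """
    p = shares(counts)
    if len(p) < 2:
        return 0.0
    return -sum(x * math.log(x) for x in p if x > 0) / math.log(len(p))


def inverse_participation_ratio(counts):
    """Sum of squared shares: 1 for a single field, 1/N for an even spread."""
    return sum(x * x for x in shares(counts))
```

A paper cited evenly across four fields has a normalised entropy of 1 and an inverse participation ratio of 0.25, while a paper cited from a single field has values of 0 and 1. A shift over time from the second pattern toward the first is the kind of signal one might look for when a result is taken up outside its original domain.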
