Source author record

Frank Schweitzer

Frank Schweitzer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

69 works

34 topics

4 close collaborators

preprint · 2022 · arXiv

Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set

Massive data from software repositories and collaboration tools are widely used to study social aspects in software development. One question that several recent works have addressed is how a software project's size and structure influence team productivity, a question famously considered in Brooks' law. Recent studies using massive repository data suggest that developers in larger teams tend to be less productive than smaller teams. Despite using similar methods and data, other studies argue for a positive linear or even super-linear relationship between team size and productivity, thus contesting the view of software economics that software projects are diseconomies of scale. In our work, we study challenges that can explain the disagreement between recent studies of developer productivity in massive repository data. We further provide, to the best of our knowledge, the largest, curated corpus of GitHub projects tailored to investigate the influence of team size and collaboration patterns on individual and collective productivity. Our work contributes to the ongoing discussion on the choice of productivity metrics in the operationalisation of hypotheses about determinants of successful software projects. It further highlights general pitfalls in big data analysis and shows that the use of bigger data sets does not automatically lead to more reliable insights.

preprint · 2022 · arXiv

Consensus from group interactions: An adaptive voter model on hypergraphs

We study the effect of group interactions on the emergence of consensus in a spin system. Agents with discrete opinions $\{0,1\}$ form groups. They can change their opinion based on their group's influence (voter dynamics), but groups can also split and merge (adaptation). In a hypergraph, these groups are represented by hyperedges of different sizes. The heterogeneity of group sizes is controlled by a parameter $β$. To study the impact of $β$ on reaching consensus, we provide extensive computer simulations and compare them with an analytic approach for the dynamics of the average magnetization. We find that group interactions amplify small initial opinion biases, accelerate the formation of consensus and lead to a drift of the average magnetization. The conservation of the initial magnetization, known for basic voter models, is no longer obtained.

preprint · 2022 · arXiv

Disentangling Active and Passive Cosponsorship in the U.S. Congress

In the U.S. Congress, legislators can use active and passive cosponsorship to support bills. We show that these two types of cosponsorship are driven by two different motivations: the backing of political colleagues and the backing of the bill's content. To this end, we develop an Encoder+RGCN based model that learns legislator representations from bill texts and speech transcripts. These representations predict active and passive cosponsorship with an F1-score of 0.88. Applying our representations to predict voting decisions, we show that they are interpretable and generalize to unseen tasks.

preprint · 2022 · arXiv

Group relations, resilience and the I Ching

We evaluate the robustness and adaptivity of social groups with heterogeneous agents that are characterized by their binary state, their ability to change this state, their status and their preferred relations to other agents. To define group structures, we operationalize the hexagrams of the I Ching. The relations and properties of agents are used to quantify their influence according to the social impact theory. From these influence values we derive a weighted stability measure for triads involving three agents, which is based on the weighted balance theory. It allows us to quantify the robustness of groups and to propose a novel measure for group resilience which combines robustness and adaptivity. A stochastic approach determines the probabilities to find robust and adaptive groups. The discussion focuses on the generalization of our approach.

preprint · 2022 · arXiv

Network embeddedness indicates the innovation potential of firms

Firms' innovation potential depends on their position in the R&D network. But details on this relation remain unclear because measures to quantify network embeddedness have been controversially discussed. We propose and validate a new measure, coreness, obtained from the weighted k-core decomposition of the R&D network. Using data on R&D alliances, we analyse the change of coreness for 14,000 firms over 25 years and patenting activity. A regression analysis demonstrates that coreness explains firms' R&D output by predicting future patenting.

preprint · 2022 · arXiv

Reconstructing signed relations from interaction data

Positive and negative relations play an essential role in human behavior and shape the communities we live in. Despite their importance, data about signed relations is rare and commonly gathered through surveys. Interaction data is more abundant, for instance, in the form of proximity or communication data. So far, though, it could not be utilized to detect signed relations. In this paper, we show how the underlying signed relations can be extracted with such data. Employing a statistical network approach, we construct networks of signed relations in four communities. We then show that these relations correspond to the ones reported in surveys. Additionally, the inferred relations allow us to study the homophily of individuals with respect to gender, religious beliefs, and financial backgrounds. We evaluate the importance of triads in the signed network to study group cohesion.

preprint · 2022 · arXiv

The role of network embeddedness on the selection of collaboration partners: An agent-based model with empirical validation

We use a data-driven agent-based model to study the core-periphery structure of two collaboration networks, R&D alliances between firms and co-authorship relations between scientists. To characterize the network embeddedness of agents, we introduce a coreness value, obtained from a weighted $k$-core decomposition. We study the change of these coreness values when collaborations with newcomers or established agents are formed. Our agent-based model is able to reproduce the empirical coreness differences of collaboration partners and to explain why we observe a change in partner selection for agents with high network embeddedness.
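
As a rough illustration of the coreness idea, the sketch below computes the standard unweighted k-core number of each node by repeatedly peeling off the node with the lowest remaining degree. The paper uses a weighted $k$-core decomposition; the unweighted simplification, the function name `core_numbers` and the toy network are assumptions made here for illustration only.

```python
def core_numbers(adjacency):
    """Return the unweighted k-core number of every node.

    adjacency: dict mapping node -> set of neighbouring nodes (undirected).
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        # The core number never decreases along the peeling order.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical toy network: a triangle (a, b, c) with one pendant node d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
coreness = core_numbers(toy)
```

In this toy network the triangle nodes end up in the 2-core while the pendant node stays in the 1-core, which is the kind of core-periphery distinction a coreness value is meant to capture.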

preprint · 2021 · arXiv

Quantifying the importance of firms by means of reputation and network control

The reputation of firms is largely channeled through their ownership structure. We use this relation to determine reputation spillovers between transnational companies and their participated companies in an ownership network core of 1318 firms. We then apply concepts of network controllability to identify minimum sets of driver nodes (MDS) of 314 firms in this network. The importance of these driver nodes is classified regarding their control contribution, their operating revenue, and their reputation. The latter two are also taken as proxies for the access costs when utilizing firms as driver nodes. Using an enrichment analysis, we find that firms with high reputation maintain the controllability of the network, but rarely become top drivers, whereas firms with medium reputation most likely become top driver nodes. We further show that MDSs with lower access costs can be used to control the reputation dynamics in the whole network.

preprint · 2021 · arXiv

The downside of heterogeneity: How established relations counteract systemic adaptivity in tasks assignments

We study the lock-in effect in a network of task assignments. Agents have a heterogeneous fitness for solving tasks and can redistribute unfinished tasks to other agents. They learn over time to whom to reassign tasks and preferably choose agents with higher fitness. A lock-in occurs if reassignments can no longer adapt. Agents overwhelmed with tasks then fail, leading to failure cascades. We find that the probability of lock-ins and systemic failures increases with the heterogeneity in fitness values. To study this dependence, we use the Shannon entropy of the network of task assignments. A detailed discussion links our findings to the problem of resilience and observations in social systems.
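
For readers unfamiliar with the measure, here is a minimal sketch of the Shannon entropy of a discrete distribution. How the paper applies it to the assignment network is not stated in the abstract; treating one agent's reassignment counts as the distribution, as well as the function name and the numbers, are assumptions made here.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of non-negative weights, normalised to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Zero weights contribute nothing to the entropy and are skipped.
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical counts of tasks that one agent reassigns to four others.
spread_out = shannon_entropy([5, 5, 5, 5])   # reassignments spread evenly
locked_in = shannon_entropy([20, 0, 0, 0])   # all reassignments go to one agent
```

Evenly spread reassignments give the maximum entropy for four targets (2 bits), whereas reassignments concentrated on a single agent give zero entropy.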

preprint · 2020 · arXiv

A multi-layer network approach to modelling authorship influence on citation dynamics in physics journals

We provide a general framework to model the growth of networks consisting of different coupled layers. Our aim is to estimate the impact of one such layer on the dynamics of the others. As an application, we study a scientometric network, where one layer consists of publications as nodes and citations as links, whereas the second layer represents the authors. This allows us to address the question of how characteristics of authors, such as their number of publications or number of previous co-authors, impact the citation dynamics of a new publication. To test different hypotheses about this impact, our model combines citation constituents and social constituents in different ways. We then evaluate their performance in reproducing the citation dynamics in nine different physics journals. For this, we develop a general method for statistical parameter estimation and model selection that is applicable to growing multi-layer networks. It takes both the parameter errors and the model complexity into account and is computationally efficient and scalable to large networks.

preprint · 2020 · arXiv

Enhanced or distorted wisdom of crowds? An agent-based model of opinion formation under social influence

We propose an agent-based model of collective opinion formation to study the wisdom of crowds under social influence. The opinion of an agent is a continuous positive value, denoting its subjective answer to a factual question. The wisdom of crowds states that the average of all opinions is close to the truth, i.e. the correct answer. But if agents have the chance to adjust their opinion in response to the opinions of others, this effect can be destroyed. Our model investigates this scenario by evaluating two competing effects: (i) agents tend to keep their own opinion (individual conviction $β$), (ii) they tend to adjust their opinion if they have information about the opinions of others (social influence $α$). For the latter, two different regimes (full information vs. aggregated information) are compared. Our simulations show that social influence only in rare cases enhances the wisdom of crowds. Most often, we find that agents converge to a collective opinion that is even farther away from the true answer. So, under social influence the wisdom of crowds can be systematically wrong.
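
A minimal numerical illustration of the wisdom-of-crowds effect, assuming hypothetical opinion values and a hypothetical true answer: individual opinions can be widely spread while their average still lands on the truth. The function names and the use of the population variance as a measure of spread are choices made here, not taken from the paper.

```python
from statistics import fmean, pvariance


def collective_error(opinions, truth):
    """Absolute difference between the average opinion and the true value."""
    return abs(fmean(opinions) - truth)


def group_diversity(opinions):
    """Population variance of the opinions (an assumed measure of spread)."""
    return pvariance(opinions)


# Hypothetical opinions on a question whose true answer is 10.
opinions = [6.0, 9.0, 12.0, 13.0]
error = collective_error(opinions, truth=10.0)
diversity = group_diversity(opinions)
```

Here no single agent gives the correct answer, yet the average does; the spread of opinions remains large.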

preprint · 2020 · arXiv

Fragile, yet resilient: Adaptive decline in a collaboration network of firms

The dynamics of collaboration networks of firms follow a life-cycle of growth and decline. That does not imply they also become less resilient. Instead, declining collaboration networks may still have the ability to mitigate shocks from firms leaving, and to recover from these losses by adapting to new partners. To demonstrate this, we analyze 21,500 R&D collaborations of 14,500 firms in six different industrial sectors over 25 years. We calculate time-dependent probabilities of firms leaving the network and simulate drop-out cascades, to determine the expected dynamics of decline. We then show that deviations from these expectations result from the adaptivity of the network, which mitigates the decline. These deviations can be used as a measure of network resilience.

preprint · 2020 · arXiv

HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks

The unsupervised detection of anomalies in time series data has important applications in user behavioral modeling, fraud detection, and cybersecurity. Anomaly detection has, in fact, been extensively studied in categorical sequences. However, we often have access to time series data that represent paths through networks. Examples include transaction sequences in financial networks, click streams of users in networks of cross-referenced documents, or travel itineraries in transportation networks. To reliably detect anomalies, we must account for the fact that such data contain a large number of independent observations of paths constrained by a graph topology. Moreover, the heterogeneity of real systems rules out frequency-based anomaly detection techniques, which do not account for highly skewed edge and degree statistics. To address this problem, we introduce HYPA, a novel framework for the unsupervised detection of anomalies in large corpora of variable-length temporal paths in a graph. HYPA provides an efficient analytical method to detect paths with anomalous frequencies that result from nodes being traversed in unexpected chronological order.

preprint · 2020 · arXiv

Intervention scenarios to enhance knowledge transfer in a network of firms

We investigate a multi-agent model of firms in an R&D network. Each firm is characterized by its knowledge stock $x_{i}(t)$, which follows non-linear dynamics. It can grow with the input from other firms, i.e., by knowledge transfer, and decays otherwise. Maintaining interactions is costly. Firms can leave the network if their expected knowledge growth is not realized, which may cause other firms to also leave the network. The paper discusses two bottom-up intervention scenarios to prevent, reduce, or delay cascades of firms leaving. The first one is based on the formalism of network controllability, in which driver nodes are identified and subsequently incentivized by reducing their costs. The second one combines node interventions and network interventions. It proposes the controlled removal of a single firm and the random replacement of firms leaving. This makes it possible to generate small cascades, which prevents the occurrence of large cascades. We find that both approaches successfully mitigate cascades and thus improve the resilience of the R&D network.

preprint · 2020 · arXiv

Reproducing scientists' mobility: A data-driven model

High-skill labour is an important factor underpinning the competitive advantage of modern economies. Therefore, attracting and retaining scientists has become a major concern for migration policy. In this work, we study the migration of scientists on a global scale, by combining two large data sets covering the publications of 3.5 million scientists over 60 years. We analyse the geographical distances they moved for a new affiliation and their age when moving, thereby reconstructing their geographical "career paths". These paths are used to derive the world network of scientists' mobility between cities and to analyse its topological properties. We further develop and calibrate an agent-based model, such that it reproduces the empirical findings both at the level of scientists and of the global network. Our model takes into account that the academic hiring process is largely demand-driven and demonstrates that the probability of scientists to relocate decreases both with age and with distance. Our results allow interpreting the model assumptions as micro-based decision rules that can explain the observed mobility patterns of scientists.

preprint · 2020 · arXiv

The ambiguous role of social influence on the wisdom of crowds: An analytic approach

"Wisdom of crowds" refers to the phenomenon that the average opinion of a group of individuals on a given question can be very close to the true answer. It requires a large group diversity of opinions, but the collective error, the difference between the average opinion and the true value, has to be small. We consider a stochastic opinion dynamics where individuals can change their opinion based on the opinions of others (social influence $α$), but to some degree also stick to their initial opinion (individual conviction $β$). We then derive analytic expressions for the dynamics of the collective error and the group diversity. We analyze their long-term behavior to determine the impact of the two parameters $(α,β)$ and the initial opinion distribution on the wisdom of crowds. This allows us to quantify the ambiguous role of social influence: only if the initial collective error is large, it helps to improve the wisdom of crowds, but in most cases it deteriorates the outcome. In these cases, individual conviction still improves the wisdom of crowds because it mitigates the impact of social influence.

preprint · 2019 · arXiv

Improving the robustness of online social networks: A simulation approach of network interventions

Online social networks (OSN) are prime examples of socio-technical systems in which individuals interact via a technical platform. OSN are very volatile because users enter and exit and frequently change their interactions. This makes the robustness of such systems difficult to measure and to control. To quantify robustness, we propose a coreness value obtained from the directed interaction network. We study the emergence of large drop-out cascades of users leaving the OSN by means of an agent-based model. For agents, we define a utility function that depends on their relative reputation and their costs for interactions. The decision of agents to leave the OSN depends on this utility. Our aim is to prevent drop-out cascades by influencing specific agents with low utility. We identify strategies to control agents in the core and the periphery of the OSN such that drop-out cascades are significantly reduced, and the robustness of the OSN is increased.

preprint · 2019 · arXiv

International crop trade networks: The impact of shocks and cascades

Analyzing available FAO data from 176 countries over 21 years, we observe an increase of complexity in the international trade of maize, rice, soy, and wheat. A larger number of countries play a role as producers or intermediaries, either for trade or food processing. In consequence, we find that the trade networks become more prone to failure cascades caused by exogenous shocks. In our model, countries compensate for demand deficits by imposing export restrictions. To capture these, we construct higher-order trade dependency networks for the different crops and years. These networks reveal hidden dependencies between countries and allow us to discuss policy implications.

preprint · 2019 · arXiv

Quantifying Triadic Closure in Multi-Edge Social Networks

Multi-edge networks capture repeated interactions between individuals. In social networks, such edges often form closed triangles, or triads. Standard approaches to measure this triadic closure, however, fail for multi-edge networks, because they do not consider that triads can be formed by edges of different multiplicity. We propose a novel measure of triadic closure for multi-edge networks of social interactions based on a shared partner statistic. We demonstrate that our operationalization is able to detect meaningful closure in synthetic and empirical multi-edge networks, where common approaches fail. This is a cornerstone in driving inferential network analyses from the analysis of binary networks towards the analyses of multi-edge and weighted networks, which offer a more realistic representation of social interactions and relations.

preprint · 2017 · arXiv

From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles

The inference of network topologies from relational data is an important problem in data analysis. Exemplary applications include the reconstruction of social ties from data on human interactions, the inference of gene co-expression networks from DNA microarray data, or the learning of semantic relationships based on co-occurrences of words in documents. Solving these problems requires techniques to infer significant links in noisy relational data. In this short paper, we propose a new statistical modeling framework to address this challenge. It builds on generalized hypergeometric ensembles, a class of generative stochastic models that give rise to analytically tractable probability spaces of directed, multi-edge graphs. We show how this framework can be used to assess the significance of links in noisy relational data. We illustrate our method in two data sets capturing spatio-temporal proximity relations between actors in a social system. The results show that our analytical framework provides a new approach to infer significant links from relational data, with interesting perspectives for the mining of data on social systems.

preprint · 2016 · arXiv

A conceptual approach to model co-evolution of urban structures

Urban structures encompass settlements, characterized by the spatial distribution of built-up areas, but also transportation structures, to connect these built-up areas. These two structures are very different in their origin and function, fulfilling complementary needs: (i) to access space, and (ii) to occupy space. Their evolution cannot be understood by looking at the dynamics of urban aggregations and transportation systems separately. Instead, existing built-up areas feed back on the further development of transportation structures, and the availability of the latter feeds back on the future growth of urban aggregations. To model this co-evolution, we propose an agent-based approach that builds on existing agent-based models for the evolution of trail systems and of urban settlements. The key element in these separate approaches is a generalized communication of agents by means of an adaptive landscape. This landscape is only generated by the agents, but once it exists, it feeds back on their further actions. The emerging trail system or urban aggregation results as a self-organized structure from these collective interactions. In our co-evolutionary approach, we couple these two separate models by means of meta-agents that represent humans with their different demands for housing and mobility. We characterize our approach as a statistical ensemble approach, which allows us to capture the potential of urban evolution in a bottom-up manner, but can be validated against empirical observations.

preprint · 2016 · arXiv

A model of dynamic rewiring and knowledge exchange in R&D networks

This paper investigates the process of knowledge exchange in inter-firm Research and Development (R&D) alliances by means of an agent-based model. Extant research has pointed out that firms select alliance partners considering both network-related and network-unrelated features (e.g., social capital versus complementary knowledge stocks). In our agent-based model, firms are located in a metric knowledge space. The interaction rules incorporate an exploration phase and a knowledge transfer phase, during which firms search for a new partner and then evaluate whether they can establish an alliance to exchange their knowledge stocks. The model parameters determining the overall system properties are the rate at which alliances form and dissolve and the agents' interaction radius. Next, we define a novel indicator of performance, based on the distance traveled by the firms in the knowledge space. Remarkably, we find that - depending on the alliance formation rate and the interaction radius - firms tend to cluster around one or more attractors in the knowledge space, whose position is an emergent property of the system. And, more importantly, we find that there exists an inverted U-shaped dependence of the network performance on both model parameters.

preprint · 2016 · arXiv

Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks

Statistical ensembles of networks, i.e., probability spaces of all networks that are consistent with given aggregate statistics, have become instrumental in the analysis of complex networks. Their numerical and analytical study provides the foundation for the inference of topological patterns, the definition of network-analytic measures, as well as for model selection and statistical hypothesis testing. Contributing to the foundation of these data analysis techniques, in this Letter we introduce generalized hypergeometric ensembles, a broad class of analytically tractable statistical ensembles of finite, directed and weighted networks. This framework can be interpreted as a generalization of the classical configuration model, which is commonly used to randomly generate networks with a given degree sequence or distribution. Our generalization rests on the introduction of dyadic link propensities, which capture the degree-corrected tendencies of pairs of nodes to form edges between each other. Studying empirical and synthetic data, we show that our approach provides broad perspectives for model selection and statistical hypothesis testing in data on complex networks.

preprint · 2016 · arXiv

The Dynamics of Emotions in Online Interaction

We study the changes in emotional states induced by reading and participating in online discussions, empirically testing a computational model of online emotional interaction. Using principles of dynamical systems, we quantify changes in valence and arousal through subjective reports, as recorded in three independent studies including 207 participants (110 female). In the context of online discussions, the dynamics of valence and arousal are composed of two forces: an internal relaxation towards baseline values independent of the emotional charge of the discussion, and a driving force of emotional states that depends on the content of the discussion. The dynamics of valence show the existence of positive and negative tendencies, while arousal increases when reading emotional content regardless of its polarity. The tendency of participants to take part in the discussion increases with positive arousal. When participating in an online discussion, the content of participants' expression depends on their valence, and their arousal significantly decreases afterwards as a regulation mechanism. We illustrate how these results allow the design of agent-based models to reproduce and analyze emotions in online communities. Our work empirically validates the microdynamics of a model of online collective emotions, bridging online data analysis with research in the laboratory.

preprint · 2016 · arXiv

Value of peripheral nodes in controlling multilayer networks

We analyze the controllability of a two-layer network, where driver nodes can be chosen randomly only from one layer. Each layer contains a scale-free network with directed links and the node dynamics depends on the incoming links from other nodes. We combine the in-degree and out-degree values to assign an importance value $w$ to each node, and distinguish between peripheral nodes with low $w$ and central nodes with high $w$. Based on numerical simulations, we find that the controllable part of the network is larger when choosing low $w$ nodes to connect the two layers. The control is as efficient when peripheral nodes are driver nodes as it is for the case of more central nodes. However, if we assume a cost to utilize nodes that is proportional to their overall degree, utilizing peripheral nodes to connect the two layers or to act as driver nodes is not only the most cost-efficient solution, it is also the one that performs best in controlling the two-layer network among the different interconnecting strategies we have tested.

preprint · 2016 · arXiv

When the Filter Bubble Bursts: Collective Evaluation Dynamics in Online Communities

We analyze online collective evaluation processes through positive and negative votes in various social media. We find two modes of collective evaluations that stem from the existence of filter bubbles. Above a threshold of collective attention, negativity grows faster with positivity, as a sign of the burst of a filter bubble when information reaches beyond the local social context of a user. We analyze how collectively evaluated content can reach large social contexts and create polarization, showing that emotions expressed through text play a key role in collective evaluation processes.

preprint · 2015 · arXiv

Causality-Driven Slow-Down and Speed-Up of Diffusion in Non-Markovian Temporal Networks

Recent research has highlighted limitations of studying complex systems with time-varying topologies from the perspective of static, time-aggregated networks. Non-Markovian characteristics resulting from the ordering of interactions in temporal networks were identified as one important mechanism that alters causality and affects dynamical processes. So far, an analytical explanation for this phenomenon and for the significant variations observed across different systems is missing. Here we introduce a methodology that allows us to analytically predict causality-driven changes of diffusion speed in non-Markovian temporal networks. Validating our predictions in six data sets, we show that, compared to the time-aggregated network, non-Markovian characteristics can lead to either a slow-down or a speed-up of diffusion, which can even outweigh the decelerating effect of community structures in the static topology. Thus, non-Markovian properties of temporal networks constitute an important additional dimension of complexity in time-varying complex systems.

preprint · 2015 · arXiv

How Damage Diversification Can Reduce Systemic Risk

We consider the problem of risk diversification in complex networks. Nodes represent e.g. financial actors, whereas weighted links represent e.g. financial obligations (credits/debts). Each node has a risk to fail because of losses resulting from defaulting neighbors, which may lead to large failure cascades. Classical risk diversification strategies usually neglect network effects and therefore suggest that risk can be reduced if possible losses (i.e., exposures) are split among many neighbors (exposure diversification, ED). But from a complex networks perspective diversification implies higher connectivity of the system as a whole which can also lead to increasing failure risk of a node. To cope with this, we propose a different strategy (damage diversification, DD), i.e. the diversification of losses that are imposed on neighboring nodes as opposed to losses incurred by the node itself. Here, we quantify the potential of DD to reduce systemic risk in comparison to ED. For this, we develop a branching process approximation that we generalize to weighted networks with (almost) arbitrary degree and weight distributions. This allows us to identify systemically relevant nodes in a network even if their directed weights differ strongly. On the macro level, we provide an analytical expression for the average cascade size, to quantify systemic risk. Furthermore, on the meso level we calculate failure probabilities of nodes conditional on their system relevance.

preprint · 2015 · arXiv

How do OSS projects change in number and size? A large-scale analysis to test a model of project growth

Established Open Source Software (OSS) projects can grow in size if new developers join, but the number of OSS projects can also grow if developers choose to found new projects. We discuss to what extent an established model for firm growth can be applied to the dynamics of OSS projects. Our analysis is based on a large-scale data set from SourceForge (SF) consisting of monthly data for 10 years, for up to 360,000 OSS projects and up to 340,000 developers. Over this time period, we find an exponential growth both in the number of projects and developers, with a remarkable increase of single-developer projects after 2009. We analyze the monthly entry and exit rates for both projects and developers, the growth rate of established projects and the monthly project size distribution. To derive a prediction for the latter, we use modeling assumptions of how newly entering developers choose to either found a new project or to join existing ones. Our model applies only to collaborative projects that are deemed to grow in size by attracting new developers. We verify, by a thorough statistical analysis, that the Yule-Simon distribution is a valid candidate for the size distribution of collaborative projects except for certain time periods where the modeling assumptions no longer hold. We detect and empirically test the reason for this limitation, i.e., the fact that an increasing number of established developers found additional new projects after 2009.
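
For reference, the Yule-Simon distribution assigns size $k \geq 1$ the probability $\rho\,B(k, \rho + 1)$, where $B$ is the Beta function and $\rho > 0$ is a shape parameter. The sketch below evaluates this probability mass function; the function name and the value of $\rho$ are hypothetical and not taken from the paper's fit.

```python
import math


def yule_simon_pmf(k, rho):
    """Probability of size k >= 1 under a Yule-Simon distribution with shape rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("requires k >= 1 and rho > 0")
    # rho * B(k, rho + 1), with the Beta function evaluated via log-gamma.
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


# Hypothetical shape parameter, chosen only for illustration.
rho = 2.0
probabilities = [yule_simon_pmf(k, rho) for k in range(1, 6)]
```

For large $k$ the probabilities decay roughly as $k^{-(\rho + 1)}$, so the distribution has the heavy tail typical of size distributions that arise from preferential growth.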

preprint2015arXiv

Ideological and Temporal Components of Network Polarization in Online Political Participatory Media

Political polarization is traditionally analyzed through the ideological stances of groups and parties, but it also has a behavioral component that manifests in the interactions between individuals. We present an empirical analysis of the digital traces of politicians in politnetz.ch, a Swiss online platform focused on political activity, in which politicians interact by creating support links, comments, and likes. We analyze network polarization as the level of intra-party cohesion with respect to inter-party connectivity, finding that supports show a very strongly polarized structure with respect to party alignment. The analysis of this multiplex network shows that each layer of interaction contains relevant information, where comment groups follow topics related to Swiss politics. Our analysis reveals that polarization in the layer of likes evolves in time, increasing close to the federal elections of 2011. Furthermore, we analyze the internal social network of each party through metrics related to hierarchical structures, information efficiency, and social resilience. Our results suggest that the online social structure of a party is related to its ideology, and reveal that the degree of connectivity across two parties increases when they are close in the ideological space of a multi-party system.

preprint2015arXiv

Neighborhood approximations for non-linear voter models

Non-linear voter models assume that the opinion of an agent depends on the opinions of its neighbors in a non-linear manner. This allows for voting rules different from majority voting. While the linear voter model is known to reach consensus, non-linear voter models can result in the coexistence of opposite opinions. Our aim is to derive approximations to correctly predict the time dependent dynamics, or at least the asymptotic outcome, of such local interactions. Emphasis is on a probabilistic approach to decompose the opinion distribution in a second-order neighborhood into lower-order probability distributions. This is compared with an analytic pair approximation for the expected value of the global fraction of opinions and a mean-field approximation. Our reference case is a set of averaged stochastic simulations of a one-dimensional cellular automaton. We find that the probabilistic second-order approach captures the dynamics of the reference case very well for different non-linearities, i.e., for both majority and minority voting rules, which only partly holds for the first-order pair approximation and not at all for the mean-field approximation. We further discuss the interesting phenomenon of a correlated coexistence, characterized by the formation of large domains of opinions that dominate for some time, but slowly change.

preprint2015arXiv

Sentiment cascades in the 15M movement

Recent grassroots movements have suggested that online social networks might play a key role in their organization, as adherents have a fast, many-to-many, communication channel to help coordinate their mobilization. The structure and dynamics of the networks constructed from the digital traces of protesters have been analyzed to some extent recently. However, less effort has been devoted to the analysis of the semantic content of messages exchanged during the protest. Using the data obtained from a microblogging service during the brewing and active phases of the 15M movement in Spain, we perform the first large scale test of theories on collective emotions and social interaction in collective actions. Our findings show that activity and information cascades in the movement are larger in the presence of negative collective emotions and when users express themselves in terms related to social content. At the level of individual participants, our results show that their social integration in the movement, as measured through social network metrics, increases with their level of engagement and of expression of negativity. Our findings show that non-rational factors play a role in the formation and activity of social movements through online media, having important consequences for viral spreading.

preprint2015arXiv

Social signals and algorithmic trading of Bitcoin

The availability of data on digital traces is growing to unprecedented sizes, but inferring actionable knowledge from large-scale data is far from being trivial. This is especially important for computational finance, where digital traces of human behavior offer a great potential to drive trading strategies. We contribute to this by providing a consistent approach that integrates various data sources in the design of algorithmic traders. This allows us to derive insights into the principles behind the profitability of our trading strategies. We illustrate our approach through the analysis of Bitcoin, a cryptocurrency known for its large price fluctuations. In our analysis, we include economic signals of volume and price of exchange for USD, adoption of the Bitcoin technology, and transaction volume of Bitcoin. We add social signals related to information search, word of mouth volume, emotional valence, and opinion polarization as expressed in tweets related to Bitcoin for more than 3 years. Our analysis reveals that increases in opinion polarization and exchange volume precede rising Bitcoin prices, and that emotional valence precedes opinion polarization and rising exchange volumes. We apply these insights to design algorithmic trading strategies for Bitcoin, reaching very high profits in less than a year. We verify this high profitability with robust statistical methods that take into account risk and trading costs, confirming the long-standing hypothesis that trading based on social media sentiment has the potential to yield positive returns on investment.

preprint2015arXiv

The Network of Counterparty Risk: Analysing Correlations in OTC Derivatives

Counterparty risk denotes the risk that a party defaults in a bilateral contract. This risk not only depends on the two parties involved, but also on the risk from various other contracts each of these parties holds. In rather informal markets, such as the OTC (over-the-counter) derivative market, institutions only report their aggregated quarterly risk exposure, but no details about their counterparties. Hence, little is known about the diversification of counterparty risk. In this paper, we reconstruct the weighted and time-dependent network of counterparty risk in the OTC derivatives market of the United States between 1998 and 2012. To proxy unknown bilateral exposures, we first study the co-occurrence patterns of institutions based on their quarterly activity and ranking in the official report. The network obtained this way is further analysed by a weighted k-core decomposition, to reveal a core-periphery structure. This allows us to compare the activity-based ranking with a topology-based ranking, to identify the most important institutions and their mutual dependencies. We also analyse correlations in these activities, to show strong similarities in the behavior of the core institutions. Our analysis clearly demonstrates the clustering of counterparty risk in a small set of about a dozen US banks. This not only increases the default risk of the central institutions, but also the default risk of peripheral institutions which have contracts with the central ones. Hence, all institutions indirectly have to bear (part of) the counterparty risk of all others, which needs to be better reflected in the price of OTC derivatives.

preprint2015arXiv

The network structure of city-firm relations

How are economic activities linked to geographic locations? To answer this question, we use a data-driven approach that builds on the information about location, ownership and economic activities of the world's 3,000 largest firms and their almost one million subsidiaries. From this information we generate a bipartite network of cities linked to economic activities. Analysing the structure of this network, we find striking similarities with nested networks observed in ecology, where links represent mutualistic interactions between species. This motivates us to apply ecological indicators to identify the unbalanced deployment of economic activities. Such deployment can lead to an over-representation of specific economic sectors in a given city, and poses a significant threat for the city's future especially in times when the over-represented activities face economic uncertainties. If we compare our analysis with external rankings about the quality of life in a city, we find that the nested structure of the city-firm network also reflects such information about the quality of life, which can usually be assessed only via dedicated survey-based indicators.

preprint2015arXiv

The spatial component of R&D networks

We study the role of geography in R&D networks by means of a quantitative, micro-geographic approach. Using a large database that covers international R&D collaborations from 1984 to 2009, we localize each actor precisely in space through its latitude and longitude. This allows us to analyze the R&D network at all geographic scales simultaneously. Our empirical results show that despite the high importance of the city level, transnational R&D collaborations at large distances are much more frequent than expected from similar networks. This provides evidence for the ambiguity of distance in economic cooperation which is also suggested by the existing literature. In addition we test whether the hypothesis of local buzz and global pipelines applies to the observed R&D network by calculating well-defined metrics from network theory.

preprint2014arXiv

Online Privacy as a Collective Phenomenon

The problem of online privacy is often reduced to individual decisions to hide or reveal personal information in online social networks (OSNs). However, with the increasing use of OSNs, it becomes more important to understand the role of the social network in disclosing personal information that a user has not revealed voluntarily: How much of our private information do our friends disclose about us, and how much of our privacy is lost simply because of online social interaction? Without strong technical effort, an OSN may be able to exploit the assortativity of human private features, this way constructing shadow profiles with information that users chose not to share. Furthermore, because many users share their phone and email contact lists, this allows an OSN to create full shadow profiles for people who do not even have an account for this OSN. We empirically test the feasibility of constructing shadow profiles of sexual orientation for users and non-users, using data from more than 3 million accounts of a single OSN. We quantify a lower bound for the predictive power derived from the social network of a user, to demonstrate how the predictability of sexual orientation increases with the size of this network and the tendency to share personal information. This allows us to define a privacy leak factor that links individual privacy loss with the decision of other individuals to disclose information. Our statistical analysis reveals that some individuals are at a higher risk of privacy loss, as prediction accuracy increases for users with a larger and more homogeneous first- and second-order neighborhood of their social network. While we do not provide evidence that shadow profiles exist at all, our results show that disclosing of private information is not restricted to an individual choice, but becomes a collective decision that has implications for policy and privacy regulation.

preprint2014arXiv

Predicting Scientific Success Based on Coauthorship Networks

We address the question to what extent the success of scientific articles is due to social influence. Analyzing a data set of over 100000 publications from the field of Computer Science, we study how centrality in the coauthorship network differs between authors who have highly cited papers and those who do not. We further show that a machine learning classifier, based only on coauthorship network centrality measures at time of publication, is able to predict with high precision whether an article will be highly cited five years after publication. By this we provide quantitative insight into the social dimension of scientific publishing - challenging the perception of citations as an objective, socially unbiased measure of scientific success.

preprint2014arXiv

The role of endogenous and exogenous mechanisms in the formation of R&D networks

We develop an agent-based model of strategic link formation in Research and Development (R&D) networks. Empirical evidence has shown that the growth of these networks is driven by mechanisms which are both endogenous to the system (that is, depending on existing alliance patterns) and exogenous (that is, driven by an exploratory search for newcomer firms). Research to date has not investigated both mechanisms simultaneously in a comparative manner. To overcome this limitation, we develop a general modeling framework to shed light on the relative importance of these two mechanisms. We test our model against a comprehensive dataset, listing cross-country and cross-sectoral R&D alliances from 1984 to 2009. Our results show that by fitting only three macroscopic properties of the network topology, this framework is able to reproduce a number of micro-level measures, including the distributions of degree, local clustering, path length and component size, and the emergence of network clusters. Furthermore, by estimating the link probabilities towards newcomers and established firms from the data, we find that endogenous mechanisms are predominant over the exogenous ones in the network formation, thus quantifying the importance of existing structures in selecting partner firms.

preprint2013arXiv

A Network Perspective on Software Modularity

Modularity is a desirable characteristic for software systems. In this article we propose to use a quantitative method from complex network sciences to estimate the coherence between the modularity of the dependency network of large open source Java projects and their decomposition in terms of Java packages. The results presented in this article indicate that our methodology offers a promising and reasonable quantitative approach with potential impact on software engineering processes.

preprint2013arXiv

A Quantitative Study of Social Organisation in Open Source Software Communities

The success of open source projects crucially depends on the voluntary contributions of a sufficiently large community of users. Apart from the mere size of the community, interesting questions arise when looking at the evolution of structural features of collaborations between community members. In this article, we discuss several network analytic proxies that can be used to quantify different aspects of the social organisation in social collaboration networks. We particularly focus on measures that can be related to the cohesiveness of the communities, the distribution of responsibilities and the resilience against turnover of community members. We present a comparative analysis on a large-scale dataset that covers the full history of collaborations between users of 14 major open source software communities. Our analysis covers both aggregate and time-evolving measures and highlights differences in the social organisation across communities. We argue that our results are a promising step towards the definition of suitable, potentially multi-dimensional, resilience and risk indicators for open source software communities.

preprint2013arXiv

Betweenness Preference: Quantifying Correlations in the Topological Dynamics of Temporal Networks

We study correlations in temporal networks and introduce the notion of betweenness preference. It allows us to quantify to what extent paths, existing in time-aggregated representations of temporal networks, are actually realizable based on the sequence of interactions. We show that betweenness preference is present in empirical temporal network data and that it influences the length of shortest time-respecting paths. Using four different data sets, we further argue that neglecting betweenness preference leads to wrong conclusions about dynamical processes on temporal networks.

preprint2013arXiv

Categorizing Bugs with Social Networks: A Case Study on Four Open Source Software Communities

Efficient bug triaging procedures are an important precondition for successful collaborative software engineering projects. Triaging bugs can become a laborious task particularly in open source software (OSS) projects with a large base of comparably inexperienced part-time contributors. In this paper, we propose an efficient and practical method to identify valid bug reports which a) refer to an actual software bug, b) are not duplicates and c) contain enough information to be processed right away. Our classification is based on nine measures to quantify the social embeddedness of bug reporters in the collaboration network. We demonstrate its applicability in a case study, using a comprehensive data set of more than 700,000 bug reports obtained from the Bugzilla installation of four major OSS communities, for a period of more than ten years. For those projects that exhibit the lowest fraction of valid bug reports, we find that the bug reporters' position in the collaboration network is a strong indicator for the quality of bug reports. Based on this finding, we develop an automated classification scheme that can easily be integrated into bug tracking platforms and analyze its performance in the considered OSS communities. A support vector machine (SVM) to identify valid bug reports based on the nine measures yields a precision of up to 90.3% with an associated recall of 38.9%. With this, we significantly improve the results obtained in previous case studies for an automated early identification of bugs that are eventually fixed. Furthermore, our study highlights the potential of using quantitative measures of social organization in collaborative software engineering. It also opens a broad perspective for the integration of social awareness in the design of support infrastructures.

preprint2013arXiv

Dynamical coupling during collective animal motion

The measurement of information flows within moving animal groups has recently been a topic of considerable interest, and it has become clear that the individual(s) that drive collective movement may change over time, and that such individuals may not necessarily always lead from the front. However, methods to quantify the influence of specific individuals on the behaviour of other group members and the direction of information flow in moving groups are lacking on the level of empirical studies and theoretical models. Using high spatio-temporal resolution GPS trajectories of foraging meerkats, Suricata suricatta, we provide an information-theoretic framework to identify dynamical coupling between animals independent of their relative spatial positions. Based on this identification, we then compare designations of individuals as either drivers or responders against designations provided by the relative spatial position. We find that not only does coupling occur both from the frontal to the trailing individuals and vice versa, but also that the coupling direction is a non-linear function of the relative position. This provides evidence for (i) intermittent fluctuation of the coupling strength and (ii) alternation in the coupling direction within foraging meerkat pairs. The framework we introduce allows for a detailed description of the dynamical patterns of mutual influence between all pairs of individuals within moving animal groups. We argue that applying an information-theoretic perspective to the study of coordinated phenomena in animal groups will eventually help to understand cause and effect in collective behaviour.

preprint2013arXiv

How big is too big? Critical Shocks for Systemic Failure Cascades

External or internal shocks may lead to the collapse of a system consisting of many agents. If the shock hits only one agent initially and causes it to fail, this can induce a cascade of failures among neighboring agents. Several critical constellations determine whether this cascade remains finite or reaches the size of the system, i.e. leads to systemic risk. We investigate the critical parameters for such cascades in a simple model, where agents are characterized by an individual threshold θ_i determining their capacity to handle a load αθ_i with 1-α being their safety margin. If agents fail, they redistribute their load equally to K neighboring agents in a regular network. For three different threshold distributions P(θ), we derive analytical results for the size of the cascade, X(t), which is regarded as a measure of systemic risk, and the time when it stops. We focus on two different regimes, (i) EEE, an external extreme event where the size of the shock is of the order of the total capacity of the network, and (ii) RIE, a random internal event where the size of the shock is of the order of the capacity of an agent. We find that even for large extreme events that exceed the capacity of the network finite cascades are still possible, if a power-law threshold distribution is assumed. On the other hand, even small random fluctuations may lead to full cascades if critical conditions are met. Most importantly, we demonstrate that the size of the "big" shock is not the problem, as the systemic risk only varies slightly for changes of 10 to 50 percent of the external shock. Systemic risk depends much more on ingredients such as the network topology, the safety margin and the threshold distribution, which gives hints on how to reduce systemic risk.
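
The load-redistribution rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ring topology, the synchronous update, the failure condition (load strictly above θ_i), the choice to drop load sent to already failed neighbors, and the name `simulate_cascade` are all assumptions made here for illustration.

```python
def simulate_cascade(thresholds, alpha, K, shocked):
    """Return the cascade size X(t), the fraction of failed agents, per time step.

    Assumptions (not taken from the paper): agents sit on a ring, each with
    K neighbors (K even, K < number of agents); an agent fails when its load
    exceeds its threshold; load sent to already failed agents is lost.
    """
    n = len(thresholds)
    load = [alpha * theta for theta in thresholds]  # initial load alpha * theta_i
    failed = [False] * n
    failed[shocked] = True                          # the initial shock hits one agent
    newly_failed = [shocked]
    offsets = [d for d in range(-(K // 2), K // 2 + 1) if d != 0]
    history = [1 / n]
    while newly_failed:
        # Each newly failed agent hands load / K to each of its K ring neighbors.
        for i in newly_failed:
            share = load[i] / K
            load[i] = 0.0
            for d in offsets:
                j = (i + d) % n
                if not failed[j]:
                    load[j] += share
        # Agents whose load now exceeds their threshold fail in the next step.
        newly_failed = [j for j in range(n)
                        if not failed[j] and load[j] > thresholds[j]]
        for j in newly_failed:
            failed[j] = True
        if newly_failed:
            history.append(sum(failed) / n)
    return history
```

With identical thresholds of 1.0 on a ring of ten agents and K = 2, a wide safety margin (α = 0.5) stops the cascade at the first agent, while a narrow one (α = 0.9) lets it spread through the whole ring.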

preprint2013arXiv

Quantifying the effects of social influence

How do humans respond to indirect social influence when making decisions? We analysed an experiment where subjects had to repeatedly guess the correct answer to factual questions, while having only aggregated information about the answers of others. While the response of humans to aggregated information is a widely observed phenomenon, it has not been investigated quantitatively, in a controlled setting. We found that the adjustment of individual guesses depends linearly on the distance to the mean of all guesses. This is a remarkable, and yet surprisingly simple, statistical regularity. It holds across all questions analysed, even though the correct answers differ in several orders of magnitude. Our finding supports the assumption that individual diversity does not affect the response to indirect social influence. It also complements previous results on the nonlinear response in information-rich scenarios. We argue that the nature of the response to social influence crucially changes with the level of information aggregation. This insight contributes to the empirical foundation of models for collective decisions under social influence.

preprint2013arXiv

Quantifying the Impact of Leveraging and Diversification on Systemic Risk

Excessive leverage, i.e. the abuse of debt financing, is considered one of the primary factors in the default of financial institutions. Systemic risk results from correlations between individual default probabilities that cannot be considered independent. Based on the structural framework by Merton (1974), we discuss a model in which these correlations arise from overlaps in banks' portfolios. Portfolio diversification is used as a strategy to mitigate losses from investments in risky projects. We calculate an optimal level of diversification that has to be reached for a given level of excessive leverage to still mitigate an increase in systemic risk. In our model, this optimal diversification further depends on the market size and the market conditions (e.g. volatility). This allows us to distinguish between a safe regime, in which excessive leverage does not result in an increase of systemic risk, and a risky regime, in which excessive leverage cannot be mitigated, leading to an increased systemic risk. Our results are of relevance for financial regulators.

preprint2013arXiv

Social Resilience in Online Communities: The Autopsy of Friendster

We empirically analyze five online communities: Friendster, Livejournal, Facebook, Orkut, Myspace, to identify causes for the decline of social networks. We define social resilience as the ability of a community to withstand changes. We do not argue about the cause of such changes, but concentrate on their impact. Changes may cause users to leave, which may trigger further leaves of others who lost connection to their friends. This may lead to cascades of users leaving. A social network is said to be resilient if the size of such cascades can be limited. To quantify resilience, we use the k-core analysis, to identify subsets of the network in which all users have at least k friends. These connections generate benefits (b) for each user, which have to outweigh the costs (c) of being a member of the network. If this difference is not positive, users leave. After all cascades, the remaining network is the k-core of the original network determined by the cost-to-benefit c/b ratio. By analysing the cumulative distribution of k-cores we are able to calculate the number of users remaining in each community. This allows us to infer the impact of the c/b ratio on the resilience of these online communities. We find that the different online communities have different k-core distributions. Consequently, similar changes in the c/b ratio have a different impact on the amount of active users. As a case study, we focus on the evolution of Friendster. We identify time periods when new users entering the network observed an insufficient c/b ratio. This measure can be seen as a precursor of the later collapse of the community. Our analysis can be applied to estimate the impact of changes in the user interface, which may temporarily increase the c/b ratio, thus posing a threat for the community to shrink, or even to collapse.
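
The leaving cascade described in this abstract, in which users with fewer than k remaining friends leave until no such user is left, corresponds to standard k-core pruning. A minimal sketch follows, assuming an undirected friendship graph given as an edge list and a value of k that is already known; the function name `k_core` is hypothetical, and the mapping from the c/b ratio to k is not reproduced here.

```python
from collections import defaultdict

def k_core(edges, k):
    """Return the set of nodes in the k-core of an undirected graph."""
    # Build adjacency sets from the edge list, ignoring self-loops.
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    remaining = set(neighbors)
    # Users with fewer than k friends leave; each departure may push
    # their friends below k, which triggers further departures.
    queue = [node for node in remaining if len(neighbors[node]) < k]
    while queue:
        node = queue.pop()
        if node not in remaining:
            continue  # already removed via an earlier queue entry
        remaining.discard(node)
        for friend in neighbors[node]:
            if friend in remaining:
                neighbors[friend].discard(node)
                if len(neighbors[friend]) < k:
                    queue.append(friend)
    return remaining
```

For a triangle a, b, c with a chain c, d, e attached, requiring k = 2 friends removes e first and then d, leaving only the triangle.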

preprint2013arXiv

The Rise and Fall of a Central Contributor: Dynamics of Social Organization and Performance in the Gentoo Community

Social organization and division of labor crucially influence the performance of collaborative software engineering efforts. In this paper, we provide a quantitative analysis of the relation between social organization and performance in Gentoo, an Open Source community developing a Linux distribution. We study the structure and dynamics of collaborations as recorded in the project's bug tracking system over a period of ten years. We identify a period of increasing centralization after which most interactions in the community were mediated by a single central contributor. In this period of maximum centralization, the central contributor unexpectedly left the project, thus posing a significant challenge for the community. We quantify how the rise, the activity as well as the subsequent sudden dropout of this central contributor affected both the social organization and the bug handling performance of the Gentoo community. We analyze social organization from the perspective of network theory and augment our quantitative findings by interviews with prominent members of the Gentoo community who shared their personal insights.

preprint2013arXiv

The Role of Emotions in Contributors Activity: A Case Study on the GENTOO Community

We analyse the relation between the emotions and the activity of contributors in the Open Source Software project Gentoo. Our case study builds on extensive data sets from the project's bug tracking platform Bugzilla, to quantify the activity of contributors, and its mail archives, to quantify the emotions of contributors by means of sentiment analysis. The Gentoo project is known for a period of centralization within its bug triaging community. This was followed by considerable changes in community organization and performance after the sudden retirement of the central contributor. We analyse how this event correlates with the negative emotions, both in bilateral email discussions with the central contributor, and at the level of the whole community of contributors. We then extend our study to consider the activity patterns of Gentoo contributors in general. We find that contributors are more likely to become inactive when they express strong positive or negative emotions in the bug tracker, or when they deviate from the expected value of emotions in the mailing list. We use these insights to develop a Bayesian classifier that detects the risk of contributors leaving the project. Our analysis opens new perspectives for measuring online contributor motivation by means of sentiment analysis and for real-time predictions of contributor turnover in Open Source Software projects.

preprint2012arXiv

A k-shell decomposition method for weighted networks

We present a generalized method for calculating the k-shell structure of weighted networks. The method takes into account both the weight and the degree of a network, in such a way that in the absence of weights we recover the shell structure obtained by the classic k-shell decomposition. In the presence of weights, we show that the method is able to partition the network in a more refined way, without the need of any arbitrary threshold on the weight values. Furthermore, by simulating spreading processes using the susceptible-infectious-recovered model in four different weighted real-world networks, we show that the weighted k-shell decomposition method ranks the nodes more accurately, by placing nodes with higher spreading potential into shells closer to the core. In addition, we demonstrate our new method on a real economic network and show that the core calculated using the weighted k-shell method is more meaningful from an economic perspective when compared with the unweighted one.

preprint2012arXiv

A stochastic model of social interaction in wild house mice

We investigate to what extent the interaction dynamics of a population of wild house mice (Mus musculus domesticus) in their environment can be explained by a simple stochastic model. We use a Markov chain model to describe the transitions of mice in a discrete space of nestboxes, and implement a multi-agent simulation of the model. We find that some important features of our behavioural dataset can be reproduced using this simplified stochastic representation, and discuss the improvements that could be made to our model in order to increase the accuracy of its predictions. Our findings have implications for the understanding of the complexity underlying social behaviour in the animal kingdom and the cognitive requirements of such behaviour.

preprint2012arXiv

A Tunable Mechanism for Identifying Trusted Nodes in Large Scale Distributed Networks

In this paper, we propose a simple randomized protocol for identifying trusted nodes based on personalized trust in large scale distributed networks. The problem of identifying trusted nodes, based on personalized trust, in a large network setting stems from the huge computation and message overhead involved in exhaustively calculating and propagating the trust estimates by the remote nodes. However, in any practical scenario, nodes generally communicate with a small subset of nodes and thus exhaustively estimating the trust of all the nodes can lead to huge resource consumption. In contrast, our mechanism can be tuned to locate a desired subset of trusted nodes, based on the allowable overhead, with respect to a particular user. The mechanism is based on a simple exchange of random walk messages and nodes counting the number of times they are being hit by random walkers of nodes in their neighborhood. Simulation results to analyze the effectiveness of the algorithm show that using the proposed algorithm, nodes identify the top trusted nodes in the network with a very high probability by exploring only around 45% of the total nodes, and in turn generates nearly 90% less overhead as compared to an exhaustive trust estimation mechanism, named TrustWebRank. Finally, we provide a measure of the global trustworthiness of a node; simulation results indicate that the measures generated using our mechanism differ by only around 0.6% as compared to TrustWebRank.

preprint2012arXiv

Agent-based simulations of emotion spreading in online social networks

Quantitative analysis of empirical data from online social networks reveals group dynamics in which emotions are involved (Šuvakov et al). Full understanding of the underlying mechanisms, however, remains a challenging task. Using agent-based computer simulations, in this paper we study the dynamics of emotional communications in online social networks. The rules that guide how the agents interact are motivated, and the realistic network structure and some important parameters are inferred from the empirical dataset of the MySpace social network. An agent's emotional state is characterized by two variables representing psychological arousal (reactivity to stimuli) and valence (attractiveness or aversiveness), by which common emotions can be defined. An agent's action is triggered by increased arousal. High-resolution dynamics is implemented where each message carrying an agent's emotion along the network link is identified and its effect on the recipient agent is considered as continuously aging in time. Our results demonstrate that (i) aggregated group behaviors may arise from individual emotional actions of agents; (ii) collective states characterized by temporal correlations and dominant positive emotions emerge, similar to the empirical system; (iii) the nature of the driving signal, i.e. the rate at which users step into the online world, has profound effects on building the coherent behaviors that are observed for users in online social networks. Further, our simulations suggest that spreading patterns differ for emotions such as "enthusiastic" and "ashamed", which have entirely different emotional content. All data used in this study are fully anonymized.

preprint2012arXiv

Diversity-induced resonance in the response to social norms

In this paper we focus on diversity-induced resonance, which was recently found in bistable, excitable and other physical systems. We study the appearance of this phenomenon in a purely economic model of cooperating and defecting agents. An agent's contribution to a public good is seen as a social norm. Defecting agents thus face social pressure, which decreases if free-riding becomes widespread. In this model, diversity among agents naturally appears because of their different sensitivity towards the social norm. We study the evolution of cooperation as a response to the social norm (i) for the replicator dynamics, and (ii) for the logit dynamics by means of numerical simulations. Diversity-induced resonance is observed as a maximum in the response of agents to changes in the social norm as a function of the degree of heterogeneity in the population. We provide an analytical, mean-field approach for the logit dynamics and find very good agreement with the simulations. From a socio-economic perspective, our results show that, counter-intuitively, diversity in the individual sensitivity to social norms may result in a society that better follows such norms as a whole, even if part of the population is less prone to follow them.

preprint2012arXiv

Effects of Social Influence on the Wisdom of Crowds

Wisdom of crowds refers to the phenomenon that the aggregate prediction or forecast of a group of individuals can be surprisingly more accurate than most individuals in the group, and sometimes more accurate than any of the individuals comprising it. This article models the impact of social influence on the wisdom of crowds. We build a minimalistic representation of individuals as Brownian particles coupled by means of social influence. We demonstrate that the model can reproduce results of a previous empirical study. This allows us to draw more fundamental conclusions about the role of social influence: In particular, we show that the question of whether social influence has a positive or negative net effect on the wisdom of crowds is ill-defined. Instead, it is the starting configuration of the population, in terms of its diversity and accuracy, that directly determines how beneficial social influence actually is. The article further examines the scenarios under which social influence promotes or impairs the wisdom of crowds.

preprint2012arXiv

Emotional persistence in online chatting communities

How do users behave in online chatrooms, where they instantaneously read and write posts? We analyzed about 2.5 million posts covering various topics in Internet relay channels, and found that user activity patterns follow known power-law and stretched exponential distributions, indicating that online chat activity is not different from other forms of communication. Analysing the emotional expressions (positive, negative, neutral) of users, we revealed a remarkable persistence both for individual users and channels. That is, despite their anonymity, users tend to follow social norms in repeated interactions in online chats, which results in a specific emotional "tone" of the channels. We provide an agent-based model of emotional interaction, which recovers qualitatively both the activity patterns in chatrooms and the emotional persistence of users and channels. While our assumptions about agents' emotional expressions are rooted in psychology, the model allows testing different hypotheses regarding their emotional impact in online communication.

preprint2012arXiv

Hierarchical Consensus Formation Reduces the Influence of Opinion Bias

We study the role of hierarchical structures in a simple model of collective consensus formation based on the bounded confidence model with continuous individual opinions. For the particular variation of this model considered in this paper, we assume that a bias towards an extreme opinion is introduced whenever two individuals interact and form a common decision. As a simple proxy for hierarchical social structures, we introduce a two-step decision making process in which in the second step groups of like-minded individuals are replaced by representatives once they have reached local consensus, and the representatives in turn form a collective decision in a downstream process. We find that the introduction of such a hierarchical decision making structure can improve consensus formation, in the sense that the eventual collective opinion is closer to the true average of individual opinions than without it. In particular, we numerically study how the size of groups of like-minded individuals being represented by delegate individuals affects the impact of the bias on the final population-wide consensus. These results are of interest for the design of organisational policies and the optimisation of hierarchical structures in the context of group decision making.

preprint2012arXiv

How can social herding enhance cooperation?

We study a system in which N agents have to decide between two strategies θ_i (i = 1, ..., N), for defection or cooperation, when interacting with n other agents (either spatial neighbors or randomly chosen ones). After each round, they update their strategy responding nonlinearly to two different information sources: (i) the payoff a_i(θ_i, f_i) received from the strategic interaction with their n counterparts, (ii) the fraction f_i of cooperators in this interaction. For the latter response, we assume social herding, i.e. agents adopt their strategy based on the frequencies of the different strategies in their neighborhood, without taking into account the consequences of this decision. We note that f_i already determines the payoff, so there is no additional information assumed. A parameter ζ defines to what level agents take the two different information sources into account. For the strategic interaction, we assume a Prisoner's Dilemma game, i.e. one in which defection is the evolutionarily stable strategy. However, if the additional dimension of social herding is taken into account, we find instead a stable outcome where cooperators are the majority. By means of agent-based computer simulations and analytical investigations, we evaluate the critical conditions for this transition towards cooperation. We find that, in addition to a high degree of social herding, there has to be a nonlinear response to the fraction of cooperators. We argue that the transition to cooperation in our model is based on less information, i.e. on agents which are not informed about the payoff matrix and therefore rely on just observing the strategy of others in order to adopt it. By designing the right mechanisms to respond to this information, the transition to cooperation can be remarkably enhanced.

preprint2012arXiv

Optimal migration promotes the outbreak of cooperation in heterogeneous populations

We consider a population of agents that are heterogeneous with respect to (i) their strategy when interacting $n_{g}$ times with other agents in an iterated prisoner's dilemma game, (ii) their spatial location on $K$ different islands. After each generation, agents adopt strategies proportional to their average payoff received. Assuming a mix of two cooperating and two defecting strategies, we first investigate for isolated islands the conditions for an exclusive domination of each of these strategies and their possible coexistence. This allows us to define a threshold frequency for cooperation that, dependent on $n_{g}$ and the initial mix of strategies, describes the outbreak of cooperation in the absence of migration. We then allow migration of a fixed fraction of the population after each generation. Assuming a worst case scenario where all islands are occupied by defecting strategies, whereas only one island is occupied by cooperators at the threshold frequency, we determine the optimal migration rate that allows the outbreak of cooperation on all islands. We further find that the threshold frequency divided by the number of islands, i.e. the relative effort for invading defecting islands with cooperators, decreases with the number of islands. We also show that there is only a small bandwidth of migration rates that allows the outbreak of cooperation. Larger migration rates destroy cooperation.

preprint2012arXiv

Positive words carry less information than negative words

We show that the frequency of word use is not only determined by the word length (Zipf, 1935) and the average information content (Piantadosi et al., 2011), but also by its emotional content. We have analyzed three established lexica of affective word usage in English, German, and Spanish, to verify that these lexica have a neutral, unbiased, emotional content. Taking into account the frequency of word usage, we find that words with a positive emotional content are more frequently used. This lends support to the Pollyanna hypothesis (Boucher and Osgood, 1969) that there should be a positive bias in human expression. We also find that negative words contain more information than positive words, as the informativeness of a word increases uniformly with its valence decrease. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links.

preprint2012arXiv

Redistribution spurs growth by using a portfolio effect on human capital

We demonstrate by mathematical analysis and systematic computer simulations that redistribution can lead to sustainable growth in a society. The human capital dynamics of each agent is described by a stochastic multiplicative process which, in the long run, leads to the destruction of individual human capital and the extinction of the individualistic society. When agents are linked by fully-redistributive taxation, the situation might turn to individual growth in the long run. We consider that a government collects a proportion of income and reduces it by a fraction as costs for administration (efficiency losses). The remaining public good is equally redistributed to all agents. We derive conditions under which the destruction of human capital can be turned into sustainable growth, despite the losses from the random growth process and despite the administrative costs. Sustainable growth is induced by redistribution. This effect could be explained by a simple portfolio-effect which re-balances individual stochastic processes. The findings are verified for three different tax schemes: proportional tax, taking proportionally more from the rich, and taking proportionally more from the poor. We discuss which of these tax schemes is optimal with respect to maximizing growth under a fixed rate of administrative costs, or with respect to maximizing the governmental income. This leads us to some general conclusions about governmental decisions, the relation to public good games, and the use of taxation in a risk-taking society.
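
The portfolio effect described in this abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' model: the two-point growth factor (1.5 or 0.6 with equal probability), the parameter values, the function name `simulate`, and the choice to tax the capital stock directly are all hypothetical and chosen for illustration.

```python
import random
import statistics

def simulate(n_agents=1000, steps=200, tax_rate=0.5, admin_cost=0.05, seed=0):
    """Return the final human capital of every agent.

    Hypothetical sketch: the growth factor and all parameter values are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    capital = [1.0] * n_agents
    for _ in range(steps):
        # Stochastic multiplicative growth: x_i <- eta_i * x_i.
        # eta is 1.5 or 0.6 with equal probability: arithmetic mean 1.05 (> 1),
        # geometric mean about 0.95 (< 1), so an isolated agent decays over time.
        capital = [x * (1.5 if rng.random() < 0.5 else 0.6) for x in capital]
        # Proportional tax (assumed here to apply to the capital stock),
        # reduced by administrative costs and redistributed equally.
        collected = tax_rate * sum(capital)
        share = (1.0 - admin_cost) * collected / n_agents
        capital = [(1.0 - tax_rate) * x + share for x in capital]
    return capital

isolated = simulate(tax_rate=0.0)  # no redistribution
pooled = simulate(tax_rate=0.5)    # proportional tax with equal redistribution
print("median without redistribution:", statistics.median(isolated))
print("median with redistribution:   ", statistics.median(pooled))
```

With these assumed numbers, the median agent without redistribution ends far below the initial capital of 1, while the median agent under redistribution ends above it, because pooling averages over the independent growth factors in each step.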

preprint2011arXiv

Agent-Based Modeling of Intracellular Transport

We develop an agent-based model of the motion and pattern formation of vesicles. These intracellular particles can be found in four different modes of (undirected and directed) motion and can fuse with other vesicles. While the size of vesicles follows a log-normal distribution that changes over time due to fusion processes, their spatial distribution gives rise to distinct patterns. Their occurrence depends on the concentration of proteins which are synthesized based on the transcriptional activities of some genes. Hence, differences in these spatio-temporal vesicle patterns allow indirect conclusions about the (unknown) impact of these genes. By means of agent-based computer simulations we are able to reproduce such patterns on real temporal and spatial scales. Our modeling approach is based on Brownian agents with an internal degree of freedom, $θ$, that represents the different modes of motion. Conditions inside the cell are modeled by an effective potential that differs for agents depending on their value of $θ$. An agent's motion in this effective potential is modeled by an overdamped Langevin equation, changes of $θ$ are modeled as stochastic transitions with values obtained from experiments, and fusion events are modeled as space-dependent stochastic transitions. Our results for the spatio-temporal vesicle patterns can be used for a statistical comparison with experiments. We also derive hypotheses of how the silencing of some genes may affect the intracellular transport, and point to generalizations of the model.

preprint2011arXiv

Testing an agent-based model of bacterial cell motility: How nutrient concentration affects speed distribution

We revisit a recently proposed agent-based model of active biological motion and compare its predictions with our own experimental findings for the speed distribution of bacterial cells, Salmonella typhimurium. Agents move according to a stochastic dynamics and use energy stored in an internal depot for metabolism and active motion. We discuss different assumptions of how the conversion from internal to kinetic energy $d(v)$ may depend on the actual speed, to conclude that $d_{2}v^ξ$ with either $ξ=2$ or $1<ξ<2$ are promising hypotheses. To test these, we compare the model's prediction with the speed distribution of bacteria, obtained in media of different nutrient concentration and at different times. We find that both hypotheses are in line with the experimental observations, with $ξ$ between 1.67 and 2.0. Regarding the influence of a higher nutrient concentration, we conclude that the take-up of energy by bacterial cells is indeed increased. But this energy is not used to increase the speed, with 40 μm/s as the most probable value of the speed distribution, but is rather spent on metabolism and growth.

preprint2010arXiv

An Agent-Based Model of Collective Emotions in Online Communities

We develop an agent-based framework to model the emergence of collective emotions, which is applied to online communities. Agents' individual emotions are described by their valence and arousal. Using the concept of Brownian agents, these variables change according to a stochastic dynamics, which also considers the feedback from online communication. Agents generate emotional information, which is stored and distributed in a field modeling the online medium. This field affects the emotional states of agents in a non-linear manner. We derive conditions for the emergence of collective emotions, observable in a bimodal valence distribution. Dependent on a saturated or a superlinear feedback between the information field and the agent's arousal, we further identify scenarios where collective emotions only appear once or in a repeated manner. The analytical results are illustrated by agent-based computer simulations. Our framework provides testable hypotheses about the emergence of collective emotions, which can be verified by data from online communities.

preprint2010arXiv

New Power Law Signature of Media Exposure in Human Response Waiting Time Distributions

We study the humanitarian response to the destruction brought by the tsunami generated by the Sumatra earthquake of December 26, 2004, as measured by donations, and find that it decays in time as a power law $\sim 1/t^{\alpha}$ with $\alpha = 2.5 \pm 0.1$. This behavior is suggested to be the rare outcome of a priority queuing process in which individuals execute tasks at a rate slightly faster than the rate at which new tasks arise. We believe this to be the first empirical evidence documenting this recently predicted regime, and provide additional independent evidence that suggests it arises as a result of the intense focus placed on this donation "task" by the media.

preprint2010arXiv

Systemic Risk in a Unifying Framework for Cascading Processes on Networks

We introduce a general framework for models of cascade and contagion processes on networks, to identify their commonalities and differences. In particular, models of social and financial cascades, as well as the fiber bundle model, the voter model, and models of epidemic spreading are recovered as special cases. To unify their description, we define the net fragility of a node, which is the difference between its fragility and the threshold that determines its failure. Nodes fail if their net fragility grows above zero and their failure increases the fragility of neighbouring nodes, thus possibly triggering a cascade. In this framework, we identify three classes depending on the way the fragility of a node is increased by the failure of a neighbour. At the microscopic level, we illustrate with specific examples how the failure spreading pattern varies with the node triggering the cascade, depending on its position in the network and its degree. At the macroscopic level, systemic risk is measured as the final fraction of failed nodes, $X^\ast$, and for each of the three classes we derive a recursive equation to compute its value. The phase diagram of $X^\ast$ as a function of the initial conditions thus allows for a prediction of the systemic risk as well as a comparison of the three different model classes. We identify which model classes lead to a first-order phase transition in systemic risk, i.e. situations where small changes in the initial conditions may lead to a global failure. Finally, we generalize our framework to encompass stochastic contagion models. This indicates the potential for further generalizations.
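
The failure rule stated in this abstract (a node fails once its fragility minus its threshold rises above zero, and each failure raises the fragility of its neighbours) can be sketched as follows. This is a hypothetical illustration only: the function name `run_cascade`, the constant load added per failed neighbour, the four-node chain and all numbers are assumptions, and the three model classes of the paper are not reproduced.

```python
def run_cascade(neighbors, fragility, threshold, load):
    """Return the set of failed nodes after the cascade has stopped.

    Assumption (not from the paper): each failed node adds the same
    constant `load` to the fragility of every surviving neighbour.
    """
    phi = dict(fragility)  # work on a copy of the initial fragilities
    failed = set()
    while True:
        # A node fails when its net fragility (fragility - threshold) is > 0.
        newly_failed = [v for v in neighbors
                        if v not in failed and phi[v] - threshold[v] > 0]
        if not newly_failed:
            return failed
        failed.update(newly_failed)
        for v in newly_failed:
            for w in neighbors[v]:
                if w not in failed:
                    phi[w] += load

# Hypothetical example: a chain 0 - 1 - 2 - 3 with identical thresholds.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
threshold = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
fragility = {0: 1.2, 1: 0.8, 2: 0.5, 3: 0.9}

failed_small = run_cascade(neighbors, fragility, threshold, load=0.3)
failed_large = run_cascade(neighbors, fragility, threshold, load=0.6)
print(len(failed_small) / len(neighbors))  # final failed fraction: 0.5
print(len(failed_large) / len(neighbors))  # final failed fraction: 1.0
```

In this toy chain, only node 0 starts above its threshold. A small load stops the cascade after two nodes, while a larger load brings down the whole chain, which illustrates how the final fraction of failed nodes depends on the coupling between neighbours.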

preprint2009arXiv

A complementary view on the growth of directory trees

Trees are a special sub-class of networks with unique properties, such as the level distribution, which has often been overlooked. We analyse a general tree growth model proposed by Klemm et al. (2005) to explain the growth of user-generated directory structures in computers. The model has a single parameter $q$ which interpolates between preferential attachment and random growth. Our analysis results in three contributions: First, we propose a more efficient estimation method for $q$ based on the degree distribution, which is one specific representation of the model. Next, we introduce the concept of a level distribution and analytically solve the model for this representation. This allows for an alternative and independent measure of $q$. We argue that, to capture real growth processes, the $q$ estimations from the degree and the level distributions should coincide. Thus, we finally apply both representations to validate the model with synthetically generated tree structures, as well as with collected data of user directories. In the case of real directory structures, we show that the values of $q$ measured from the level distribution are incompatible with those measured from the degree distribution. In contrast to this, we find perfect agreement in the case of simulated data. Thus, we conclude that the model is an incomplete description of the growth of real directory structures as it fails to reproduce the level distribution. This insight can be generalised to point out the importance of the level distribution for modeling tree growth.

preprint1999arXiv

Brownian Particles far from Equilibrium

We study a model of Brownian particles which are pumped with energy by means of a non-linear friction function, for which different types are discussed. A suitable expression for a non-linear, velocity-dependent friction function is derived by considering an internal energy depot of the Brownian particles. In this case, the friction function describes the pumping of energy in the range of small velocities, while in the range of large velocities the known limit of dissipative friction is reached. In order to investigate the influence of additional energy supply, we discuss the velocity distribution function for different cases. Analytical solutions of the corresponding Fokker-Planck equation in 2d are presented and compared with computer simulations. In contrast to the case of passive Brownian motion, we find several new features of the dynamics, such as the formation of limit cycles in the four-dimensional phase-space, a large mean squared displacement which increases quadratically with the energy supply, or non-equilibrium velocity distributions with crater-like form. Further, we point to some generalizations and possible applications of the model.

69 published item(s)

Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set

Consensus from group interactions: An adaptive voter model on hypergraphs

Disentangling Active and Passive Cosponsorship in the U.S. Congress

Group relations, resilience and the I Ching

Network embeddedness indicates the innovation potential of firms

Reconstructing signed relations from interaction data

The role of network embeddedness on the selection of collaboration partners: An agent-based model with empirical validation

Quantifying the importance of firms by means of reputation and network control

The downside of heterogeneity: How established relations counteract systemic adaptivity in tasks assignments

A multi-layer network approach to modelling authorship influence on citation dynamics in physics journals

Enhanced or distorted wisdom of crowds? An agent-based model of opinion formation under social influence

Fragile, yet resilient: Adaptive decline in a collaboration network of firms

HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks

Intervention scenarios to enhance knowledge transfer in a network of firms

Reproducing scientists' mobility: A data-driven model

The ambiguous role of social influence on the wisdom of crowds: An analytic approach

Improving the robustness of online social networks: A simulation approach of network interventions

International crop trade networks: The impact of shocks and cascades

Quantifying Triadic Closure in Multi-Edge Social Networks

From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles

A conceptual approach to model co-evolution of urban structures

A model of dynamic rewiring and knowledge exchange in R&D networks

Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks

The Dynamics of Emotions in Online Interaction

Value of peripheral nodes in controlling multilayer networks

When the Filter Bubble Bursts: Collective Evaluation Dynamics in Online Communities

Causality-Driven Slow-Down and Speed-Up of Diffusion in Non-Markovian Temporal Networks

How Damage Diversification Can Reduce Systemic Risk

How do OSS projects change in number and size? A large-scale analysis to test a model of project growth

Ideological and Temporal Components of Network Polarization in Online Political Participatory Media

Neighborhood approximations for non-linear voter models

Sentiment cascades in the 15M movement

Social signals and algorithmic trading of Bitcoin

The Network of Counterparty Risk: Analysing Correlations in OTC Derivatives

The network structure of city-firm relations

The spatial component of R&D networks

Online Privacy as a Collective Phenomenon

Predicting Scientific Success Based on Coauthorship Networks

The role of endogenous and exogenous mechanisms in the formation of R&D networks

A Network Perspective on Software Modularity

A Quantitative Study of Social Organisation in Open Source Software Communities

Betweenness Preference: Quantifying Correlations in the Topological Dynamics of Temporal Networks

Categorizing Bugs with Social Networks: A Case Study on Four Open Source Software Communities

Dynamical coupling during collective animal motion

How big is too big? Critical Shocks for Systemic Failure Cascades

Quantifying the effects of social influence

Quantifying the Impact of Leveraging and Diversification on Systemic Risk

Social Resilience in Online Communities: The Autopsy of Friendster

The Rise and Fall of a Central Contributor: Dynamics of Social Organization and Performance in the Gentoo Community

The Role of Emotions in Contributors Activity: A Case Study on the GENTOO Community

A k-shell decomposition method for weighted networks

A stochastic model of social interaction in wild house mice

A Tunable Mechanism for Identifying Trusted Nodes in Large Scale Distributed Networks

Agent-based simulations of emotion spreading in online social networks

Diversity-induced resonance in the response to social norms

Effects of Social Influence on the Wisdom of Crowds

Emotional persistence in online chatting communities

Hierarchical Consensus Formation Reduces the Influence of Opinion Bias

How can social herding enhance cooperation?

Optimal migration promotes the outbreak of cooperation in heterogeneous populations

Positive words carry less information than negative words

Redistribution spurs growth by using a portfolio effect on human capital

Agent-Based Modeling of Intracellular Transport

Testing an agent-based model of bacterial cell motility: How nutrient concentration affects speed distribution

An Agent-Based Model of Collective Emotions in Online Communities

New Power Law Signature of Media Exposure in Human Response Waiting Time Distributions

Systemic Risk in a Unifying Framework for Cascading Processes on Networks

A complementary view on the growth of directory trees

Brownian Particles far from Equilibrium