Researcher profile

Diego Rybski

Diego Rybski contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
15 works
0 followers
11 topics
4 close collaborators

Published work

15 published items

Preprint, 2016, arXiv

Relating urban scaling, fundamental allometry, and density scaling

We study the connection between urban scaling, fundamental allometry (between city population and city area), and per capita vs. population density scaling. From simple analytical derivations we obtain the relation between the three exponents involved. We discuss particular cases and ranges of the exponents, which we illustrate in a "phase diagram". As we show, the results are consistent with previous work.
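The link between the three exponents can be sketched in a few lines. The symbols below are assumptions chosen for illustration and are not taken from the abstract: $Y$ is an urban indicator, $N$ the city population, $A$ the city area, and $\beta$, $\delta$, $\alpha$ the urban scaling, allometry, and density scaling exponents. The sketch assumes pure power laws without scatter, so the paper's own notation and conditions may differ.

```latex
\begin{align}
  Y &\sim N^{\beta}
    && \text{(urban scaling)} \\
  A &\sim N^{\delta}
    && \text{(fundamental allometry)} \\
  \frac{Y}{N} &\sim \left(\frac{N}{A}\right)^{\alpha}
    && \text{(per capita vs. density scaling)}
\end{align}
% Express both sides of the density relation in terms of N:
\begin{align}
  \frac{Y}{N} \sim N^{\beta - 1},
  \qquad
  \frac{N}{A} \sim N^{1 - \delta}
  \quad\Longrightarrow\quad
  \beta - 1 = \alpha \, (1 - \delta).
\end{align}
```

Under these assumptions, linear urban scaling ($\beta = 1$) corresponds to either $\alpha = 0$ or $\delta = 1$.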

Preprint, 2016, arXiv

The Area and Population of Cities: New Insights from a Different Perspective on Cities

The distribution of the population of cities has attracted a great deal of attention, in part because it sharply constrains models of local growth. However, to this day, there is no consensus on the distribution below the very upper tail, because available data need to rely on the "legal" rather than "economic" definition of cities for medium and small cities. To remedy this difficulty, in this work we construct cities "from the bottom up" by clustering populated areas obtained from high-resolution data. This method allows us to investigate the population and area of cities for urban agglomerations of all sizes using clustering methods from percolation theory. We find that Zipf's law (a power law with exponent close to 1) for population holds for cities as small as 12,000 inhabitants in the USA and 5,000 inhabitants in Great Britain. In addition, the distribution of city areas is also close to Zipf's law. We provide a parsimonious model with endogenous city area that is consistent with those findings.
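The "bottom up" construction can be illustrated with a minimal sketch that groups adjacent populated grid cells into clusters and reports each cluster's population and area. This is an assumption-laden toy, not the authors' algorithm: the function name `cluster_cities`, the 4-neighbour adjacency rule, and the "population greater than zero" criterion are all hypothetical choices made for illustration.

```python
from collections import deque


def cluster_cities(grid):
    """Group adjacent populated cells (4-neighbourhood) into clusters.

    `grid` is a list of rows holding the population of each cell.
    Returns a list of (population, area_in_cells) tuples, sorted so that
    the most populous cluster comes first.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    clusters = []

    for r in range(rows):
        for c in range(cols):
            # Skip empty cells and cells already assigned to a cluster.
            if grid[r][c] <= 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            population, area = 0, 0
            # Breadth-first search over populated neighbours.
            while queue:
                i, j = queue.popleft()
                population += grid[i][j]
                area += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj] and grid[ni][nj] > 0):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            clusters.append((population, area))

    clusters.sort(reverse=True)
    return clusters


if __name__ == "__main__":
    toy_grid = [
        [5, 3, 0, 0],
        [0, 2, 0, 7],
        [0, 0, 0, 1],
        [4, 0, 0, 0],
    ]
    for population, area in cluster_cities(toy_grid):
        print(population, area)
```

Ranking the resulting cluster populations and plotting rank against size on log-log axes is one common way to check for a Zipf-like distribution.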

Preprint, 2015, arXiv

Indirect identification of damage functions from damage records

In order to assess future damage caused by natural disasters, it is desirable to estimate the damage caused by single events. So-called damage functions provide, for a natural disaster of a certain magnitude, a specific damage value. However, in general, the functional form of such damage functions is unknown. We study the distributions of recorded flood damages on extended scales and deduce which damage functions lead to such distributions when the floods obey Generalized Extreme Value statistics and follow Generalized Pareto distributions. Based on the finding of broad damage distributions we investigate two possible functional forms to characterize the data. In the case of Gumbel distributed extreme events, (i) a power-law distribution density with an exponent close to 2 (Zipf's law) implies an exponential damage function; (ii) stretched exponential distribution densities imply power-law damage functions. In the case of Weibull (Fréchet) distributed extreme events we find correspondingly steeper (less steep) damage functions.
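Case (i) can be made plausible with a short change-of-variables sketch. The notation is an assumption introduced here, not the paper's: $x$ is the event magnitude, Gumbel distributed with location $\mu$ and scale $\sigma$, and $D(x) = D_0 e^{b x}$ is a hypothetical exponential damage function. The sketch only shows that an exponential damage function yields a power-law tail; the abstract states the implication in the other direction.

```latex
\begin{align}
  P(X > x) &= 1 - \exp\!\left(-e^{-(x-\mu)/\sigma}\right)
            \;\approx\; e^{-(x-\mu)/\sigma}
            && \text{(Gumbel tail, large } x\text{)} \\
  P(D > d) &= P\!\left(X > \tfrac{1}{b}\ln\tfrac{d}{D_0}\right)
            \;\approx\; e^{\mu/\sigma}
            \left(\frac{d}{D_0}\right)^{-1/(b\sigma)} \\
  p(d) &= -\frac{\mathrm{d}}{\mathrm{d}d} P(D > d)
        \;\propto\; d^{-\left(1 + \frac{1}{b\sigma}\right)}
\end{align}
```

Under these assumptions the density exponent equals 2 when $b\sigma = 1$.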

Preprint, 2013, arXiv

Characterizing the development of sectoral Gross Domestic Product composition

We consider the sectoral composition of a country's GDP, i.e. the partitioning into agrarian, industrial, and service sectors. Exploring a simple system of differential equations we characterize the transfer of GDP shares between the sectors in the course of economic development. The model fits for the majority of countries providing 4 country-specific parameters. Relating the agrarian with the industrial sector, a data collapse over all countries and all years supports the applicability of our approach. Depending on the parameter ranges, country development exhibits different transfer properties. Most countries follow 3 of 8 characteristic paths. The types are not random but show distinct geographic and development patterns.

Preprint, 2013, arXiv

Distance weighted city growth

Urban agglomerations exhibit complex emergent features of which Zipf's law, i.e. a power-law size distribution, and fractality may be regarded as the most prominent ones. We propose a simplistic model for the generation of city-like structures which is solely based on the assumption that growth is more likely to take place close to inhabited space. The model involves one parameter which is an exponent determining how strongly the attraction decays with the distance. In addition, the model is run iteratively so that existing clusters can grow (together) and new ones can emerge. The model is capable of reproducing the size distribution and the fractality of the boundary of the largest cluster. While the power-law distribution depends on both the imposed exponent and the iteration, the fractality seems to be independent of the former and only depends on the latter. Analyzing land-cover data we estimate the parameter value $\gamma \approx 2.5$ for Paris and its surroundings.

Preprint, 2013, arXiv

Probing the statistical properties of unknown texts: application to the Voynich Manuscript

While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has examined the properties of statistical measurements across different languages and texts. In this study we propose a framework that aims at determining if a text is compatible with a natural language and which languages are closest to it, without any knowledge of the meaning of the words. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing text, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for key-words of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.

Preprint, 2012, arXiv

Communication activity in a social network: relation between long-term correlations and inter-event clustering

The timing patterns of human communication in social networks are not random. On the contrary, communication is dominated by emergent statistical laws such as non-trivial correlations and clustering. Recently, we found long-term correlations in user activity in social communities. Here, we extend this work to study collective behavior of the whole community. The goal is to understand the origin of clustering and long-term persistence. At the individual level, we find that the correlations in activity are a byproduct of the clustering expressed in the power-law distribution of inter-event times of single users. On the contrary, the activity of the whole community presents long-term correlations that are a true emergent property of the system, i.e. they are not related to the distribution of inter-event times. This result suggests the existence of collective behavior, possibly arising from nontrivial communication patterns through the embedding social network.

Preprint, 2012, arXiv

Projecting human development and CO2 emissions

We estimate cumulative CO2 emissions during the period 2000 to 2050 from developed and developing countries based on the empirical relationship between CO2 per capita emissions (due to fossil fuel combustion and cement production) and corresponding HDI. In order to project per capita emissions of individual countries we make three assumptions which are detailed below. First, we use logistic regressions to fit and extrapolate the HDI on a country level as a function of time. This is mainly motivated by the fact that the HDI is bounded between 0 and 1 and that it decelerates as it approaches 1. Second, we employ for individual countries the correlations between CO2 per capita emissions and HDI in order to extrapolate their emissions. This is an ergodic assumption. Third, we let countries with incomplete data records evolve similarly to their close neighbors (in the emissions-HDI plane, see Fig. 1 in the main text) with complete time series of CO2 per capita emissions and HDI. Country-based emissions estimates are obtained by multiplying extrapolated CO2 per capita values by population numbers of three scenarios extracted from the Millennium Ecosystem Assessment report. Finally, we propose a reduction scheme, where countries with an HDI above the development threshold reduce their per capita CO2 emissions at a rate that is proportional to their HDI. We estimate the minimum proportionality constant so that the global emissions by 2050 meet the 1000 Gt limit.
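The first assumption, fitting and extrapolating the HDI with a logistic curve, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes the form HDI(t) = 1 / (1 + exp(-(a t + b))), fits `a` and `b` by ordinary least squares on the logit-transformed values, and uses hypothetical function names and synthetic data.

```python
import math


def fit_logistic(years, hdi):
    """Fit HDI(t) = 1 / (1 + exp(-(a*t + b))) via least squares on the logit.

    Assumes every HDI value lies strictly between 0 and 1.
    Returns the tuple (a, b).
    """
    # The logit transform turns the logistic curve into a straight line.
    z = [math.log(h / (1.0 - h)) for h in hdi]
    n = len(years)
    mean_t = sum(years) / n
    mean_z = sum(z) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxz = sum((t - mean_t) * (v - mean_z) for t, v in zip(years, z))
    a = sxz / sxx
    b = mean_z - a * mean_t
    return a, b


def project_hdi(a, b, year):
    """Evaluate the fitted logistic curve at a given year."""
    return 1.0 / (1.0 + math.exp(-(a * year + b)))


if __name__ == "__main__":
    # Synthetic series generated from a known curve (illustrative values only).
    years = [1980, 1985, 1990, 1995, 2000, 2005]
    hdi = [project_hdi(0.03, -59.0, t) for t in years]
    a, b = fit_logistic(years, hdi)
    print(a, b, project_hdi(a, b, 2050))
```

The projection stays between 0 and 1 by construction and flattens as it approaches 1, which matches the motivation given in the abstract for choosing a logistic form.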

Preprint, 2011, arXiv

Communication activity in social networks: growth and correlations

We investigate the timing of messages sent in two online communities with respect to growth fluctuations and long-term correlations. We find that the timing of sending and receiving messages comprises pronounced long-term persistence. Considering the activity of the community members as growing entities, i.e. the cumulative number of messages sent (or received) by the individuals, we identify non-trivial scaling in the growth fluctuations which we relate to the long-term correlations. We find a connection between the scaling exponents of the growth and the long-term correlations which is supported by numerical simulations based on peaks over threshold. In addition, we find that the activity on directed links between pairs of members exhibits long-term correlations, indicating that communication activity with the most liked partners may be responsible for the long-term persistence in the timing of messages. Finally, we show that the number of messages, $M$, and the number of communication partners, $K$, of the individual members are correlated following a power-law, $K \sim M^{\lambda}$, with exponent $\lambda \approx 3/4$.
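A power-law relation between partners and messages of this kind is often estimated by a straight-line fit in log-log space. The sketch below shows that generic approach on synthetic data; it is an assumption for illustration, since the abstract does not say how the exponent was estimated, and the function name is hypothetical.

```python
import math


def fit_power_law_exponent(m_values, k_values):
    """Estimate the exponent and prefactor c in K = c * M**exponent.

    Uses ordinary least squares on (log M, log K); all inputs must be positive.
    Returns the tuple (exponent, c).
    """
    xs = [math.log(m) for m in m_values]
    ys = [math.log(k) for k in k_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


if __name__ == "__main__":
    # Synthetic, noise-free data with exponent 3/4 (illustrative values only).
    messages = [10, 100, 1000, 10000]
    partners = [2.0 * m ** 0.75 for m in messages]
    print(fit_power_law_exponent(messages, partners))
```

With noisy real data, a plain log-log fit can be biased, so binning or maximum-likelihood estimators are frequently preferred.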

Preprint, 2011, arXiv

How people interact in evolving online affiliation networks

The study of human interactions is of central importance for understanding the behavior of individuals, groups and societies. Here, we observe the formation and evolution of networks by monitoring the addition of all new links and we analyze quantitatively the tendencies used to create ties in these evolving online affiliation networks. We first show that an accurate estimation of these probabilistic tendencies can only be achieved by following the time evolution of the network. For example, actions that are attributed to the usual friend of a friend mechanism through a static snapshot of the network are overestimated by a factor of two. A detailed analysis of the dynamic network evolution shows that half of those triangles were generated through other mechanisms, in spite of the characteristic static pattern. We start by characterizing every single link when the tie was established in the network. This allows us to describe the probabilistic tendencies of tie formation and extract sociological conclusions as follows. The tendencies to add new links differ significantly from what we would expect if they were not affected by the individuals' structural position in the network, i.e., from random link formation. We also find significant differences in behavioral traits among individuals according to their degree of activity, gender, age, popularity and other attributes. For instance, in the particular datasets analyzed here, we find that women reciprocate connections three times as much as men and this difference increases with age. Men tend to connect with the most popular people more often than women across all ages. On the other hand, triangular ties tendencies are similar and independent of gender. Our findings can be useful to build models of realistic social network structures and discover the underlying laws that govern establishment of ties in evolving social networks.

Preprint, 2011, arXiv

The Myth of Global Science Collaboration - Collaboration patterns in epistemic communities

Scientific collaboration is often perceived as a joint global process that involves researchers worldwide, regardless of their place of work and residence. Globalization of science, in this respect, implies that collaboration among scientists takes place along the lines of common topics and irrespective of the spatial distances between the collaborators. The networks of collaborators, termed 'epistemic communities', should thus have a space-independent structure. This paper shows that such a notion of globalized scientific collaboration is not supported by empirical data. It introduces a novel approach of analyzing distance-dependent probabilities of collaboration. The results of the analysis of six distinct scientific fields reveal that intra-country collaboration is about 10-50 times more likely to occur than international collaboration. Moreover, strong dependencies exist between collaboration activity (measured in co-authorships) and spatial distance when confined to national borders. However, the fact that distance becomes irrelevant once collaboration is taken to the international scale suggests a globalized science system that is strongly influenced by the gravity of local science clusters. The similarity of the probability functions of the six science fields analyzed suggests a universal mode of spatial governance that is independent from the mode of knowledge creation in science.

Preprint, 2010, arXiv

Quantifying long-range correlations in complex networks beyond nearest neighbors

We propose a fluctuation analysis to quantify spatial correlations in complex networks. The approach considers the sequences of degrees along shortest paths in the networks and quantifies the fluctuations in analogy to time series. In this work, the Barabási-Albert (BA) model, the Cayley tree at the percolation transition, a fractal network model, and examples of real-world networks are studied. While the fluctuation functions for the BA model show exponential decay, in the case of the Cayley tree and the fractal network model the fluctuation functions display a power-law behavior. The fractal network model comprises long-range anti-correlations. The results suggest that the fluctuation exponent provides complementary information to the fractal dimension.

Preprint, 2010, arXiv

Towards a unified characterization of phenological phases: fluctuations and correlations with temperature

Phenological timing, i.e. the course of annually recurring development stages in nature, is of particular interest since it can be understood as a proxy for the climate at a specific region; moreover, changes in the so-called phenological phases can be a direct consequence of climate change. We analyze records of botanical phenology and study their fluctuations, which we find to depend on the seasons. In contrast to previous studies, where typically trends in the phenology of individual species are estimated, we consider the ensemble of all available phases and propose a phenological index that characterizes the influence of climate on the multitude of botanical species.

Preprint, 2009, arXiv

Scaling laws of human interaction activity

Even though people in our contemporary, technological society depend on communication, the underlying laws of human communication behavior remain poorly understood. Here we investigate the communication patterns in two social Internet communities in search of statistical laws in human interaction activity. This research reveals that human communication networks dynamically follow scaling laws that may also explain the observed trends in economic growth. Specifically, we identify a generalized version of Gibrat's law of social activity expressed as a scaling law between the fluctuations in the number of messages sent by members and their level of activity. Gibrat's law has been essential in understanding economic growth patterns, yet without an underlying general principle for its origin. We attribute this scaling law to long-term correlation patterns in human activity, which surprisingly span from days to the entire period of the available data of more than one year. Further, we provide a mathematical framework that relates the generalized version of Gibrat's law to the long-term correlated dynamics, which suggests that the same underlying mechanism could be the source of Gibrat's law in economics, ranging from large firms, research and development expenditures, gross domestic product of countries, to city population growth. These findings are also of importance for designing communication networks and for the understanding of the dynamics of social systems in which communication plays a role, such as economic markets and political systems.

Preprint, 2003, arXiv

Multifractality of river runoff and precipitation: Comparison of fluctuation analysis and wavelet methods

We study the multifractal temporal scaling properties of river discharge and precipitation records. We compare the results for the multifractal detrended fluctuation analysis method with the results for the wavelet transform modulus maxima technique and obtain agreement within the error margins. In contrast to previous studies, we find non-universal behaviour: On long time scales, above a crossover time scale of several months, the runoff records are described by fluctuation exponents varying from river to river in a wide range. Similar variations are observed for the precipitation records which exhibit weaker, but still significant multifractality. For all runoff records the type of multifractality is consistent with a modified version of the binomial multifractal model, while several precipitation records seem to require different models.