Researcher profile

János Kertész

János Kertész contributes to research discovery and scholarly infrastructure.



Published work

25 published item(s)

preprint, 2026, arXiv

Convergence criteria for self-consistent measures in bipartite networks

Many quantities that characterize network elements are defined in an explicit form and calculated directly from the network structure; examples include several centrality measures such as degree, closeness, or betweenness. However, there are also implicitly defined quantitative measures, which are usually calculated iteratively, in a self-consistent manner, like PageRank or the countries' fitness / products' complexity relations. The iteration algorithms involve calculations over the entire network; therefore, their convergence properties depend on the structure of the network. Here, we focus on investigating self-consistently defined quantities in bipartite networks of two sets of nodes, where the quantities in one set are determined by the quantities in the other set and vice versa. We derive an explicit convergence criterion for iterations of these quantities and describe two different approaches to improve the convergence properties. In the first one, we identify "problematic nodes" that can be removed or merged, while in the second one, we introduce a regularization scheme and show how to estimate the regularization parameter.
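As an illustration of the kind of self-consistent bipartite iteration the abstract refers to, here is a minimal Python sketch of the widely used fitness / complexity map. It is an assumption that this is the relevant form; the preprint's convergence criterion, node removal and regularization scheme are not reproduced. The function name and the toy matrix are hypothetical.

```python
def fitness_complexity(M, n_iter=20):
    """Iterate the fitness/complexity map on a binary bipartite matrix.

    M[c][p] = 1 if node c of the first set (e.g. a country) is linked to
    node p of the second set (e.g. a product). Assumes every row and every
    column contains at least one 1, otherwise a division by zero occurs.
    """
    n_c, n_p = len(M), len(M[0])
    F = [1.0] * n_c  # fitness of first-set nodes
    Q = [1.0] * n_p  # complexity of second-set nodes
    for _ in range(n_iter):
        # Fitness: sum of the complexities of the linked products.
        F_new = [sum(M[c][p] * Q[p] for p in range(n_p)) for c in range(n_c)]
        # Complexity: inverse of the summed inverse fitness of linked countries.
        Q_new = [1.0 / sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        # Normalize both vectors to unit mean after each step.
        mean_F = sum(F_new) / n_c
        mean_Q = sum(Q_new) / n_p
        F = [f / mean_F for f in F_new]
        Q = [q / mean_Q for q in Q_new]
    return F, Q


# Toy nested matrix (hypothetical data): 3 countries x 3 products.
M = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]
F, Q = fitness_complexity(M)
```

In nested matrices like this one, some values can drift slowly toward zero as the iteration proceeds, which is the sort of convergence difficulty that motivates an explicit criterion.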

preprint, 2022, arXiv

Computational Approaches to the Study of Corruption

Studying corruption presents unique challenges. Recent work in the spirit of computational social science exploits newly available data and methods to give a fresh perspective on this important topic. In this chapter we highlight some of these works, describing how they provide insights into classic social scientific questions about the structure and dynamics of corruption in society from micro to macro scales. We argue that corruption is fruitfully understood as a collective action problem that happens between embedded people and organizations. Computational methods like network science and agent-based modeling can give insights into such situations. We also present various (big) data sources that have been exploited to study corruption. We conclude by highlighting work in adjacent fields, for instance on the problems of collusion, tax evasion, organized crime, and the darkweb, and promising avenues for future work.

preprint, 2021, arXiv

Attention dynamics on the Chinese social media Sina Weibo during the COVID-19 pandemic

Understanding attention dynamics on social media during pandemics could help governments minimize their effects. We focus on how COVID-19 has influenced the attention dynamics on the biggest Chinese microblogging website, Sina Weibo, during the first four months of the pandemic. We study the real-time Hot Search List (HSL), which ranks the 50 most popular hashtags based on the number of Sina Weibo searches. We show how specific events, measures and developments during the epidemic affected the emergence of different kinds of hashtags and the ranking on the HSL. A significant increase in COVID-19 related hashtags started to occur on the HSL around January 20, 2020, when the transmission of the disease between humans was announced. Very rapidly, COVID-related hashtags came to occupy 30-70% of the HSL, although their content kept changing. We give an analysis of how the hashtag topics changed during the investigated time span and conclude that there are three periods separated by February 12 and March 12. In period 1, we see strong topical correlations and clustering of hashtags; in period 2, the correlations are weakened, without a clustering pattern; in period 3, we see potential clustering, though not as strong as in period 1. We further explore the dynamics of the HSL by measuring the ranking dynamics and the lifetimes of hashtags on the list. In this way we obtain information about the decay of attention, which is important for decisions about the temporal placement of governmental measures to achieve permanent awareness. Furthermore, our observations indicate abnormally high rank diversity in the top 15 ranks on the HSL due to the COVID-19 related hashtags, revealing the possibility of algorithmic intervention by the platform provider.
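The rank diversity mentioned in the abstract can be illustrated with a short Python sketch. It assumes the common definition: the number of distinct items observed at a given rank, divided by the number of snapshots. The preprint's exact normalization is not stated here, and the function name and toy snapshots are hypothetical.

```python
def rank_diversity(snapshots, top_k):
    """Rank diversity for ranks 1..top_k.

    snapshots: list of ranked lists (index 0 is rank 1), one per time step.
    Returns, for each rank, the number of distinct items seen at that rank
    divided by the number of snapshots.
    """
    n_steps = len(snapshots)
    diversity = []
    for k in range(top_k):
        # Distinct items that occupied this rank; shorter lists are skipped.
        seen = {snap[k] for snap in snapshots if len(snap) > k}
        diversity.append(len(seen) / n_steps)
    return diversity


# Toy example: rank 1 never changes, ranks 2 and 3 swap once.
snapshots = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"]]
d = rank_diversity(snapshots, top_k=3)
```

A stable top of the list gives low diversity at the first ranks, so unusually high values there are what the abstract flags as notable.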

preprint, 2020, arXiv

The role of geography in the complex diffusion of innovations

The urban-rural divide is increasing in modern societies, calling for geographical extensions of social influence modelling. Improved understanding of innovation diffusion across locations and through social connections can provide us with new insights into the spread of information, technological progress and economic development. In this work, we analyze the spatial adoption dynamics of iWiW, an Online Social Network (OSN) in Hungary, and uncover empirical features of spatial adoption in social networks. During its entire life cycle from 2002 to 2012, iWiW reached up to 300 million friendship ties among 3 million users. We find that the number of adopters as a function of town population follows a scaling law that reveals a strongly concentrated early adoption in large towns and a less concentrated late adoption. We also discover a strengthening distance decay of spread over the life cycle, indicating a high fraction of distant diffusion in early stages but the dominance of local diffusion in late stages. The spreading process is modelled within the Bass diffusion framework, which enables us to compare the differential equation version with an agent-based version of the model run on the empirical network. Although both models can capture the macro trend of adoption, they have limited capacity to describe the observed trends of urban scaling and distance decay. We find, however, that incorporating adoption thresholds, defined by the fraction of social connections that adopt a technology before the individual adopts, improves the network model fit to the urban scaling of early adopters. Controlling for the threshold distribution enables us to eliminate the bias induced by local network structure on predicting local adoption peaks. Finally, we show that geographical features such as distance from the innovation origin and town size influence the prediction of the adoption peak at local scales.
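The differential-equation version of the Bass framework named in the abstract can be sketched in Python as follows. This assumes the standard form dF/dt = (p + qF)(1 - F), where F is the adopted fraction, p the innovation coefficient and q the imitation coefficient. The parameter values are hypothetical and not taken from the preprint, and the agent-based and threshold variants are not covered.

```python
import math


def bass_euler(p, q, t_max, dt=0.01):
    """Integrate dF/dt = (p + q*F) * (1 - F) with forward Euler, F(0) = 0."""
    steps = int(round(t_max / dt))
    F = 0.0
    trajectory = [F]
    for _ in range(steps):
        F += dt * (p + q * F) * (1.0 - F)
        trajectory.append(F)
    return trajectory


def bass_closed_form(p, q, t):
    """Closed-form solution of the same equation, used as a cross-check."""
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)


# Hypothetical parameters, chosen only for illustration.
p, q = 0.03, 0.38
traj = bass_euler(p, q, t_max=10.0)
numeric_end = traj[-1]
exact_end = bass_closed_form(p, q, 10.0)
```

Comparing the numerical and closed-form values is a quick sanity check on the step size before moving to network-based variants, for which no closed form is available.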

preprint, 2019, arXiv

Inequality is rising where social network segregation interacts with urban topology

Social networks amplify inequalities due to fundamental mechanisms of social tie formation such as homophily and triadic closure. These forces sharpen social segregation reflected in network fragmentation. Yet, little is known about what structural factors facilitate fragmentation. In this paper we use big data from a widely-used online social network to demonstrate that there is a significant relationship between social network fragmentation and income inequality in cities and towns. We find that the organization of the physical urban space has a stronger relationship with fragmentation than unequal access to education, political segregation, or the presence of ethnic and religious minorities. Fragmentation of social networks is significantly higher in towns in which residential neighborhoods are divided by physical barriers such as rivers and railroads and are relatively distant from the center of town. Towns in which amenities are spatially concentrated are also typically more socially segregated. These relationships suggest how urban planning may be a useful point of intervention to mitigate inequalities in the long run.

preprint, 2014, arXiv

Modeling Social Dynamics in a Collaborative Environment

Wikipedia is a prime example of today's value production in a collaborative environment. Using this example, we model the emergence, persistence and resolution of severe conflicts during collaboration by coupling opinion formation with article editing in a bounded confidence dynamics. The complex social behavior involved in editing articles is implemented as a minimal model with two basic elements: (i) individuals interact directly to share information and convince each other, and (ii) they edit a common medium to establish their own opinions. The opinions of the editors and the opinion represented by the article are each characterised by a scalar variable. When the pool of editors is fixed, three regimes can be distinguished: (a) a stable mainstream article opinion is continuously contested by editors with extremist views and there is slow convergence towards consensus, (b) the article oscillates between editors with extremist views, reaching consensus relatively fast at one of the extremes, and (c) the extremist editors are converted very fast to the mainstream opinion and the article has an erratic evolution. When editors are renewed at a certain rate, a dynamical transition occurs between different kinds of edit wars, which qualitatively reflects the dynamics of conflicts as observed in real Wikipedia data.
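The two elements (i) and (ii) can be illustrated with a simplified Python sketch of bounded confidence dynamics coupled to a shared article. The specific update rules, tolerances and parameter names below are assumptions made for illustration; they are not claimed to be the authors' exact model, and editor renewal is left out.

```python
import random


def simulate(n_agents=50, steps=2000, eps_agent=0.2, eps_article=0.1,
             mu_agent=0.5, mu_article=0.3, seed=1):
    """Toy bounded-confidence model with a shared article (hypothetical rules).

    Opinions and the article are scalars in [0, 1].
    """
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]
    article = rng.random()
    for _ in range(steps):
        # (i) Direct interaction: two agents move closer only if their
        # opinions already differ by less than the tolerance eps_agent.
        i, j = rng.sample(range(n_agents), 2)
        xi, xj = opinions[i], opinions[j]
        if abs(xi - xj) < eps_agent:
            opinions[i] = xi + mu_agent * (xj - xi)
            opinions[j] = xj + mu_agent * (xi - xj)
        # (ii) Editing: a dissatisfied agent pulls the article toward its own
        # opinion; a satisfied agent drifts toward the article instead.
        k = rng.randrange(n_agents)
        if abs(opinions[k] - article) > eps_article:
            article += mu_article * (opinions[k] - article)
        else:
            opinions[k] += mu_article * (article - opinions[k])
    return opinions, article


opinions, article = simulate()
```

Varying the two tolerances in a sketch like this is one way to explore how contested or stable the article becomes, though the regimes (a) to (c) above refer to the preprint's own model.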

preprint, 2013, arXiv

Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data

The use of socially generated "big data" to access information about collective states of mind in human societies has become a new paradigm in the emerging field of computational social science. A natural application of this would be the prediction of society's reaction to a new product in terms of popularity and adoption rate. However, bridging the gap between "real time monitoring" and "early predicting" remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted well before its release by measuring and analyzing the activity level of editors and viewers of the movie's entry in Wikipedia, the well-known online encyclopedia.

preprint, 2013, arXiv

The most controversial topics in Wikipedia: A multilingual and geographical analysis

We present, visualize and analyse the similarities and differences between the controversial topics related to "edit wars" identified in 10 different language versions of Wikipedia. After a brief review of the related work we describe the methods developed to locate, measure, and categorize the controversial topics in the different languages. Visualizations of the degree of overlap between the top 100 lists of most controversial articles in different languages and the content related to geographical locations will be presented. We discuss what the presented analysis and visualizations can tell us about the multicultural aspects of Wikipedia and practices of peer-production. Our results indicate that Wikipedia is more than just an encyclopaedia; it is also a window into convergent and divergent social-spatial priorities, interests and preferences.

preprint, 2013, arXiv

Value production in a collaborative environment

We review some recent endeavors and add some new results to characterize and understand underlying mechanisms in Wikipedia (WP), the paradigmatic example of collaborative value production. We analyzed the statistics of editorial activity in different languages and observed typical circadian and weekly patterns, which enabled us to estimate the geographical origins of contributions to WPs in languages spoken in several time zones. Using a recently introduced measure, we showed that the editorial activities have intrinsic dependencies in the burstiness of events. A comparison of the English and Simple English WPs revealed important aspects of language complexity and showed how peer cooperation solved the task of enhancing readability. One of our focus issues was characterizing the conflicts or edit wars in WPs, which helped us to automatically filter out controversial pages. When studying the temporal evolution of the controversiality of such pages, we identified typical patterns and classified conflicts accordingly. Our quantitative analysis provides the basis for modeling conflicts and their resolution in collaborative environments and contributes to the understanding of this issue, which becomes increasingly important with the development of information communication technology.

preprint, 2012, arXiv

A practical approach to language complexity: a Wikipedia case study

In this paper we present a statistical analysis of English texts from Wikipedia. We try to address the issue of language complexity empirically by comparing the Simple English Wikipedia (Simple) to comparable samples of the main English Wikipedia (Main). Simple is supposed to use a more simplified language with a limited vocabulary, and editors are explicitly requested to follow this guideline, yet in practice the vocabulary richness of both samples is at the same level. Detailed analysis of longer units (n-grams of words and part of speech tags) shows that the language of Simple is less complex than that of Main primarily due to the use of shorter sentences, as opposed to drastically simplified syntax or vocabulary. Comparing the two language varieties by the Gunning readability index supports this conclusion. We also report on the topical dependence of language complexity, e.g. that the language is more advanced in conceptual articles compared to person-based (biographical) and object-based articles. Finally, we investigate the relation between conflict and language complexity by analyzing the content of the talk pages associated with controversial and peacefully developing articles, concluding that controversy has the effect of reducing language complexity.
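The Gunning readability index mentioned in the abstract can be sketched in Python using its standard formula, 0.4 x (words per sentence + 100 x fraction of complex words), where a complex word has three or more syllables. The syllable counter below is a crude vowel-group heuristic and an assumption of this sketch, as are the function names; the preprint's own tokenization is not specified here.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups (heuristic only)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    """Gunning fog index: 0.4 * (words/sentence + 100 * complex/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and word")
    # "Complex" words are approximated as those with three or more syllables.
    n_complex = sum(1 for w in words if count_syllables(w) >= 3)
    return 0.4 * (len(words) / len(sentences) + 100.0 * n_complex / len(words))


simple_score = gunning_fog("The cat sat. The dog ran.")
complex_score = gunning_fog("Administrative considerations complicate everything.")
```

Because sentence length enters the index directly, shorter sentences lower the score even when vocabulary is unchanged, which matches the abstract's main finding about Simple.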

preprint, 2012, arXiv

Dynamics of conflicts in Wikipedia

In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on our previously established algorithm, we build up samples of controversial and peaceful articles and analyze the temporal characteristics of the activity in these samples. On short time scales, we show that there is a clear correspondence between conflict and burstiness of activity patterns, and that memory effects play an important role in controversies. On long time scales, we identify three distinct developmental patterns for the overall behavior of the articles. We are able to distinguish cases eventually leading to consensus from those cases where a compromise is far from achievable. Finally, we analyze discussion networks and conclude that edit wars are mainly fought by few editors only.

preprint, 2012, arXiv

Edit wars in Wikipedia

We present a new, efficient method for automatically detecting severe conflicts ("edit wars") in Wikipedia and evaluate this method on six different language WPs. We discuss how the number of edits, reverts, the length of discussions, the burstiness of edits and reverts deviate in such pages from those following the general workflow, and argue that earlier work has significantly over-estimated the contentiousness of the Wikipedia editing process.

preprint, 2012, arXiv

Opinions, Conflicts and Consensus: Modeling Social Dynamics in a Collaborative Environment

Information-communication technology promotes collaborative environments like Wikipedia where, however, controversiality and conflicts can appear. To describe the rise, persistence, and resolution of such conflicts we devise an extended opinion dynamics model where agents with different opinions perform a single task to make a consensual product. As a function of the convergence parameter describing the influence of the product on the agents, the model shows spontaneous symmetry breaking of the final consensus opinion represented by the medium. In the case when agents are replaced with new ones at a certain rate, a transition from mainly consensus to a perpetual conflict occurs, which is in qualitative agreement with the scenarios observed in Wikipedia.

preprint, 2012, arXiv

Sex differences in intimate relationships

Social networks have turned out to be of fundamental importance both for our understanding of human sociality and for the design of digital communication technology. However, social networks are themselves based on dyadic relationships, and we have little understanding of the dynamics of close relationships and how these change over time. Evolutionary theory suggests that, even in monogamous mating systems, the pattern of investment in close relationships should vary across the lifespan when post-weaning investment plays an important role in maximising fitness. Mobile phone data sets provide us with a unique window into the structure of relationships and the way these change across the lifespan. Here we use data from a large national mobile phone dataset to demonstrate striking sex differences in the pattern of gender bias in preferred relationships that reflect the way the reproductive investment strategies of the two sexes change across the lifespan: these differences mainly reflect women's shifting patterns of investment in reproduction and parental care. These results suggest that human social strategies may have more complex dynamics than we have tended to assume and that a life-history perspective may be crucial for understanding them.

preprint, 2011, arXiv

Circadian pattern and burstiness in mobile phone communication

The temporal communication patterns of human individuals are known to be inhomogeneous or bursty, which is reflected in the heavy-tailed behavior of the inter-event time distribution. Two main mechanisms have been suggested as the cause of such bursty behavior: (a) inhomogeneities due to the circadian and weekly activity patterns, and (b) inhomogeneities rooted in human task execution behavior. Here we investigate the roles of these mechanisms by developing and then applying systematic de-seasoning methods to remove the circadian and weekly patterns from the time series of mobile phone communication events of individuals. We find that the heavy tails in the inter-event time distributions remain robust with respect to this procedure, which clearly indicates that the mechanism based on human task execution is a possible cause of the remaining burstiness in temporal mobile phone communication patterns.
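One way to picture de-seasoning is time rescaling by a periodic activity rate: time is stretched in busy parts of the cycle and compressed in quiet ones, so that the rescaled event rate is flat over the period. The Python sketch below assumes a piecewise-constant rate estimated from binned event counts. It illustrates the general idea only and is not claimed to be the preprint's exact procedure; the function name and data are hypothetical.

```python
def deseason(times, period, n_bins):
    """Rescale event times so the periodic activity rate becomes uniform.

    times: event timestamps; period: cycle length (e.g. 24 hours);
    n_bins: number of equal-width bins used to estimate the cyclic rate.
    """
    if not times:
        return []
    width = period / n_bins
    counts = [0] * n_bins
    for t in times:
        counts[min(int((t % period) / width), n_bins - 1)] += 1
    mean = sum(counts) / n_bins
    rate = [c / mean for c in counts]  # normalized rate, mean 1 over a cycle
    # Cumulative rescaled time at the start of each bin within one cycle.
    cum = [0.0]
    for r in rate:
        cum.append(cum[-1] + r * width)

    def rescale(t):
        cycles, rem = divmod(t, period)
        b = min(int(rem / width), n_bins - 1)
        # A full cycle maps onto itself because the mean rate is 1.
        return cycles * period + cum[b] + rate[b] * (rem - b * width)

    return [rescale(t) for t in times]


# Toy data: four events in the first half of a 24-hour cycle, one in the second.
rescaled = deseason([1, 2, 3, 4, 13], period=24, n_bins=2)
```

Inter-event times computed from the rescaled timestamps can then be compared with the original ones to see whether heavy tails survive the removal of the cycle.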

preprint, 2011, arXiv

Circadian patterns of Wikipedia editorial activity: A demographic analysis

Wikipedia (WP), as a collaborative, dynamical system of humans, is an appropriate subject of social studies. Each single action of the members of this society, i.e. editors, is well recorded and accessible. Using the cumulative data of 34 Wikipedias in different languages, we try to characterize and find the universalities and differences in the temporal activity patterns of editors. Based on this data, we estimate the geographical distribution of editors across the globe for each WP. Furthermore, we clarify the differences among different groups of WPs, which originate in the variance of cultural and social features of the communities of editors.

preprint, 2011, arXiv

Multiscale Analysis of Spreading in a Large Communication Network

In temporal networks, both the topology of the underlying network and the timings of interaction events can be crucial in determining how some dynamic process mediated by the network unfolds. We have explored the limiting case of the speed of spreading in the SI model, set up such that an event between an infectious and susceptible individual always transmits the infection. The speed of this process sets an upper bound for the speed of any dynamic process that is mediated through the interaction events of the network. With the help of temporal networks derived from large scale time-stamped data on mobile phone calls, we extend earlier results that point out the slowing-down effects of burstiness and temporal inhomogeneities. In such networks, links are not permanently active, but dynamic processes are mediated by recurrent events taking place on the links at specific points in time. We perform a multi-scale analysis and pinpoint the importance of the timings of event sequences on individual links, their correlations with neighboring sequences, and the temporal pathways taken by the network-scale spreading process. This is achieved by studying empirically and analytically different characteristic relay times of links, relevant to the respective scales, and a set of temporal reference models that allow for removing selected time-domain correlations one by one.
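The limiting SI process described in the abstract, where every contact between an infectious and a susceptible node transmits, can be sketched in a few lines of Python. Treating events as undirected contacts and the event tuple layout are assumptions of this sketch; the function name and toy events are hypothetical.

```python
def si_infection_times(events, seed_node, t0=0.0):
    """Deterministic SI spreading over time-stamped contact events.

    events: iterable of (time, u, v) tuples, treated as undirected contacts.
    Returns a dict mapping each reached node to its infection time.
    """
    infected = {seed_node: t0}
    for t, u, v in sorted(events, key=lambda e: e[0]):
        if t < t0:
            continue  # contacts before the seed infection cannot transmit
        if u in infected and v not in infected:
            infected[v] = t
        elif v in infected and u not in infected:
            infected[u] = t
    return infected


# The early b-c contact at t=0.5 happens before b is infected, so it is wasted.
events = [(1, "a", "b"), (2, "b", "c"), (3, "d", "e"), (0.5, "b", "c")]
reached = si_infection_times(events, seed_node="a")
```

The example shows why event timing matters: the same set of links can carry the infection or not depending on the order in which the contacts occur.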

preprint, 2011, arXiv

Temporal motifs in time-dependent networks

Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as networks of telecommunication, neural signal processing, biochemical reactions and human social interactions. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to colored directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

preprint, 2011, arXiv

Universal features of correlated bursty behaviour

Inhomogeneous temporal processes, like those appearing in human communications, neuron spike trains, and seismic signals, consist of high-activity bursty intervals alternating with long low-activity periods. In recent studies such bursty behavior has been characterized by a fat-tailed inter-event time distribution, while temporal correlations were measured by the autocorrelation function. However, these characteristic functions are not capable of fully characterizing temporally correlated heterogeneous behavior. Here we show that the distribution of the number of events in a bursty period serves as a good indicator of the dependencies, leading to the universal observation of power-law distribution in a broad class of phenomena. We find that the correlations in these quite different systems can be commonly interpreted by memory effects and described by a simple phenomenological model, which displays temporal behavior qualitatively similar to that in real systems.
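Counting the number of events in a bursty period can be sketched in Python as follows. The sketch assumes that a bursty period is a maximal run of events in which consecutive events are separated by at most a chosen window dt; the window value, function name and toy timestamps are hypothetical.

```python
def burst_train_sizes(times, dt):
    """Split events into bursty periods and return the size of each one.

    Consecutive events belong to the same period when the gap between them
    is at most dt; a larger gap starts a new period.
    """
    ordered = sorted(times)
    if not ordered:
        return []
    sizes = []
    current = 1
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev <= dt:
            current += 1
        else:
            sizes.append(current)
            current = 1
    sizes.append(current)
    return sizes


# Gaps are 1, 1, 8, 1, 19, so with dt=2 the periods hold 3, 2 and 1 events.
sizes = burst_train_sizes([0, 1, 2, 10, 11, 30], dt=2)
```

A histogram of these sizes over a long event sequence gives the distribution the abstract uses as an indicator of temporal dependencies.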

preprint, 2011, arXiv

Using explosive percolation in analysis of real-world networks

We apply a variant of the explosive percolation procedure to large real-world networks, and show with finite-size scaling that the universality class, ordinary or explosive, of the resulting percolation transition depends on the structural properties of the network as well as the number of unoccupied links considered for comparison in our procedure. We observe that in our social networks, the percolation clusters close to the critical point are related to the community structure. This relationship is further highlighted by applying the procedure to model networks with pre-defined communities.
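A competitive link-occupation process of this kind can be sketched in Python with a union-find structure. At each step, m unoccupied links of a fixed network are drawn and the one with the smallest product of endpoint cluster sizes is occupied. The product rule, the handling of links inside a single cluster, and all names are assumptions of this sketch; the variant used in the preprint may differ.

```python
import random


class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return self.size[ra]
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return self.size[ra]


def explosive_percolation(n_nodes, edges, m=2, seed=0):
    """Occupy all links one by one; return the largest cluster size per step.

    With m=1 this reduces to ordinary random link percolation.
    """
    rng = random.Random(seed)
    unoccupied = list(edges)
    clusters = DisjointSet(n_nodes)
    largest, history = 1, []
    while unoccupied:
        candidates = rng.sample(range(len(unoccupied)), min(m, len(unoccupied)))

        def score(i):
            u, v = unoccupied[i]
            # Product of the sizes of the clusters containing the endpoints.
            return clusters.size[clusters.find(u)] * clusters.size[clusters.find(v)]

        best = min(candidates, key=score)
        u, v = unoccupied[best]
        unoccupied[best] = unoccupied[-1]  # O(1) removal of the chosen link
        unoccupied.pop()
        largest = max(largest, clusters.union(u, v))
        history.append(largest)
    return history


edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
history = explosive_percolation(6, edges, m=2)
```

Raising m makes the process more selective about which links it occupies, which is the parameter the abstract identifies as affecting the nature of the transition.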

preprint, 2010, arXiv

Modelling opinion formation driven communities in social networks

In a previous paper we proposed a model to study the dynamics of opinion formation in human societies by a co-evolution process involving two distinct time scales of fast transaction and slower network evolution dynamics. In the transaction dynamics we take into account short range interactions as discussions between individuals and long range interactions to describe the attitude to the overall mood of society. The latter is handled by a uniformly distributed parameter $α$, assigned randomly to each individual, as quenched personal bias. The network evolution dynamics is realized by rewiring the societal network due to state variable changes as a result of transaction dynamics. The main consequence of this complex dynamics is that communities emerge in the social network for a range of values in the ratio between time scales. In this paper we focus our attention on the attitude parameter $α$ and its influence on the conformation of opinion and the size of the resulting communities. We present numerical studies and extract interesting features of the model that can be interpreted in terms of social behaviour.

preprint, 2010, arXiv

Phase change in an opinion-dynamics model with separation of time scales

We define an opinion formation model of agents in a 1d ring, where the opinion of an agent evolves due to its interactions with close neighbors and due to its either positive or negative attitude toward the overall mood of all the other agents. While the dynamics of the agent's opinion is described with an appropriate differential equation, from time to time pairs of agents are allowed to change their locations to improve the homogeneity of opinion (or comfort feeling) with respect to their short range environment. In this way the time scale of transaction dynamics and that of environment update are well separated and controlled by a single parameter. By varying this parameter we discovered a phase change in the number of undecided individuals. This phenomenon arises from the fact that too frequent location exchanges among agents result in frustration in their opinion formation. Our mean field analysis supports this picture.

preprint, 2009, arXiv

Long-term correlations and multifractal analysis of trading volumes for Chinese stocks

We investigate the temporal correlations and multifractal nature of the trading volume of 22 liquid stocks traded on the Shenzhen Stock Exchange in 2003. We find that the trading volume exhibits size-dependent non-universal long memory and multifractal nature. No crossover in the power-law dependence of the detrended fluctuation functions is observed. Our results show that the intraday pattern in the trading volume has negligible impact on the long memory and multifractality.
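The detrended fluctuation function referred to in the abstract can be sketched in Python for a single window size. This is plain first-order detrended fluctuation analysis with non-overlapping windows, which is an assumption of this sketch; the multifractal generalization used for the study of multifractality is not included, and the function name is hypothetical.

```python
import math


def dfa_fluctuation(x, s):
    """First-order DFA fluctuation F(s) for window size s (s >= 2).

    The series is integrated after subtracting its mean, cut into
    non-overlapping windows of length s, and a least-squares line is removed
    from each window. F(s) is the root mean square of the residuals.
    """
    n = len(x)
    mean = sum(x) / n
    profile, acc = [], 0.0
    for value in x:
        acc += value - mean
        profile.append(acc)
    n_win = n // s
    t_mean = (s - 1) / 2.0
    var_t = sum((t - t_mean) ** 2 for t in range(s))
    squared = 0.0
    for w in range(n_win):
        seg = profile[w * s:(w + 1) * s]
        y_mean = sum(seg) / s
        cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
        slope = cov / var_t
        # Residuals with respect to the fitted local linear trend.
        squared += sum((y - (y_mean + slope * (t - t_mean))) ** 2
                       for t, y in enumerate(seg))
    return math.sqrt(squared / (n_win * s))


flat = dfa_fluctuation([5.0] * 16, s=4)
alternating = dfa_fluctuation([1, -1] * 8, s=4)
```

Repeating the calculation over a range of window sizes and fitting the slope of log F(s) against log s gives the scaling exponent used to assess long memory.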

preprint, 2009, arXiv

Opinion and community formation in coevolving networks

In human societies opinion formation is mediated by social interactions, consequently taking place on a network of relationships and at the same time influencing the structure of the network and its evolution. To investigate this coevolution of opinions and social interaction structure, we develop a dynamic agent-based network model that takes into account short range interactions like discussions between individuals, long range interactions like a sense for overall mood modulated by the attitudes of individuals, and an external field corresponding to outside influence. Moreover, individual biases can be naturally taken into account. In addition, the model includes an opinion-dependent link-rewiring scheme to describe network topology coevolution with a slower time scale than that of the opinion formation. With this model, comprehensive numerical simulations and mean field calculations have been carried out; they show the importance of the separation between fast and slow time scales, which results in the network organizing into well-connected small communities of agents with the same opinion.