Source author record

Shi Zhou

Shi Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

20 works

20 topics

4 close collaborators

preprint · 2026 · arXiv

Modelling the Spread of New Information on X

There has been considerable interest in modelling the spread of information on X (formerly Twitter) using machine learning models. Here, we consider the problem of predicting the reposting of new information, i.e., when a user propagates information about a topic previously unseen by the user. In existing work, information and users are randomly assigned to a test or training set, ensuring that both sets are drawn from the same distribution. In the spread of new information, the problem becomes an out-of-distribution classification task. Our experimental results reveal that while existing algorithms, which predominantly use features derived from the content of posts, perform well when the training and test distributions are the same, they perform much worse when the test set is out-of-distribution, i.e., when the topic of the testing data is absent from the training data. We then show that if the post features are supplemented or replaced with features derived from user profiles and past behaviours, the out-of-distribution prediction is greatly improved, with the F1 score increasing from 0.117 to 0.705. Our experimental results suggest that a significant component of reposting behaviour for previously unseen topics can be predicted from user profiles and past behaviours, and is largely content-agnostic.
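The two evaluation settings contrasted in this abstract can be illustrated with a minimal sketch. The record layout (`user`, `topic`, `label`), the toy data and the function names are hypothetical and are not taken from the paper; the sketch only shows how a topic-held-out split differs from a random split, and how an F1 score is computed from confusion counts.

```python
import random


def random_split(records, test_fraction=0.2, seed=0):
    """In-distribution split: records are shuffled, so topics can appear on both sides."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def topic_held_out_split(records, held_out_topics):
    """Out-of-distribution split: every record on a held-out topic goes to the test set."""
    held_out = set(held_out_topics)
    train = [r for r in records if r["topic"] not in held_out]
    test = [r for r in records if r["topic"] in held_out]
    return train, test


def f1_score(true_positives, false_positives, false_negatives):
    """F1 is the harmonic mean of precision and recall."""
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 0.0


# Hypothetical toy data: one record per (user, topic) pair with a made-up label.
records = [
    {"user": u, "topic": t, "label": (u + len(t)) % 2}
    for u in range(10)
    for t in ("sport", "politics", "science")
]

train, test = topic_held_out_split(records, held_out_topics=["science"])
train_topics = {r["topic"] for r in train}
test_topics = {r["topic"] for r in test}
```

With the topic-held-out split, no topic in the test set has been seen during training, which is the situation the abstract calls out-of-distribution.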

preprint · 2022 · arXiv

BGP-Multipath Routing in the Internet

BGP-Multipath (BGP-M) is a multipath routing technique for load balancing. Distinct from other techniques deployed at a router inside an Autonomous System (AS), BGP-M is deployed at a border router that has installed multiple inter-domain border links to a neighbour AS. It uses the equal-cost multi-path (ECMP) function of a border router to share traffic to a destination prefix on different border links. Despite recent research interest in multipath routing, there has been little study of BGP-M. Here we provide the first measurement and a comprehensive analysis of BGP-M routing in the Internet. We extracted information on BGP-M from query data collected from Looking Glass (LG) servers. We revealed that BGP-M has already been extensively deployed and used in the Internet. A particular example is Hurricane Electric (AS6939), a Tier-1 network operator, which has implemented >1,000 cases of BGP-M at 69 of its border routers to prefixes in 611 of its neighbour ASes, including many hyper-giant ASes and large content providers, on both the IPv4 and IPv6 Internet. We examined the distribution and operation of BGP-M. We also ran traceroute using RIPE Atlas to infer the routing paths, the schemes of traffic allocation, and the delay on border links. This study provides state-of-the-art knowledge on BGP-M, with novel insights into the unique features and distinct advantages of BGP-M as an effective and readily available technique for load balancing.
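How a border router might share traffic over several equal-cost border links can be sketched as follows. Per-flow hashing is one common ECMP scheme; it is used here as an assumption and is not necessarily the allocation scheme observed in the paper. The link names, addresses and the function `pick_border_link` are hypothetical.

```python
import hashlib


def pick_border_link(flow, links):
    """Map a flow 5-tuple to one of several equal-cost border links.

    Hashing the flow identifier keeps all packets of one flow on the same
    link while spreading different flows across the available links.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(links)
    return links[index]


# Hypothetical border links towards one neighbour AS (names are made up).
links = ["link-A", "link-B", "link-C"]

# Hypothetical flows: (source IP, destination IP, protocol, source port, destination port).
flows = [("192.0.2.1", "198.51.100.7", 6, 40000 + i, 443) for i in range(300)]

# Count how many flows land on each link.
share = {link: 0 for link in links}
for flow in flows:
    share[pick_border_link(flow, links)] += 1
```

Because the choice depends only on the flow identifier, repeated lookups for the same flow always return the same link.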

preprint · 2022 · arXiv

Reliable and Broad-range Layer Identification of Au-assisted Exfoliated Large Area MoS$_2$ and WS$_2$ Using Reflection Spectroscopic Fingerprints

The emerging Au-assisted exfoliation technique provides a wealth of large-area and high-quality ultrathin two-dimensional (2D) materials compared with traditional tape-based exfoliation. Fast, damage-free, and reliable determination of the layer number of such 2D films is essential to study layer-dependent physics and promote device applications. Here, an optical method has been developed for simple, high throughput, and accurate determination of the layer number for Au-assisted exfoliated MoS$_2$ and WS$_2$ films in a broad thickness range. The method is based on quantitative analysis of layer-dependent white light reflection spectra, revealing that the reflection peak intensity can be used as a clear indicator for determining the layer number. The simple yet robust method will facilitate the fundamental study on layer-dependent optical, electrical, and thermal properties and device applications of 2D materials. The technique can also be readily combined with photoluminescence and Raman spectroscopies to study other layer-dependent physical properties of 2D materials.

preprint · 2022 · arXiv

Towards control of opinion diversity by introducing zealots into a polarised social group

We explore a method to influence or even control the diversity of opinions within a polarised social group. We leverage the voter model in which users hold binary opinions and repeatedly update their beliefs based on others they connect with. Stubborn agents who never change their minds ("zealots") are also disseminated through the network, which is modelled by a connected graph. Building on earlier results, we provide a closed-form expression for the average opinion of the group at equilibrium. This leads us to a strategy to inject zealots into a polarised network in order to shift the average opinion towards any target value. We account for the possible presence of a backfire effect, which may lead the group to react negatively and reinforce its level of polarisation in response. Our results are supported by numerical experiments on synthetic data.
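The effect of zealots on the average opinion can be illustrated with a small sketch. It relies on a standard property of the voter model, namely that a free node's expected stationary opinion equals the mean of its neighbours' expected opinions, and solves the resulting linear system numerically. This is not the paper's closed-form expression, it ignores the backfire effect, and the function name and toy graph are hypothetical.

```python
def expected_equilibrium_opinions(adjacency, zealots, tol=1e-12, max_sweeps=100_000):
    """Expected stationary opinion of each node in a voter model with zealots.

    adjacency: dict node -> list of neighbours (undirected, connected graph).
    zealots:   dict node -> fixed opinion (0 or 1); must not be empty.
    A free node copies a uniformly random neighbour, so its expected opinion
    is the mean of its neighbours' expected opinions. The system is solved
    by Gauss-Seidel sweeps with the zealots held fixed.
    """
    opinion = {v: float(zealots.get(v, 0.5)) for v in adjacency}
    for _ in range(max_sweeps):
        largest_change = 0.0
        for v, neighbours in adjacency.items():
            if v in zealots:
                continue
            updated = sum(opinion[u] for u in neighbours) / len(neighbours)
            largest_change = max(largest_change, abs(updated - opinion[v]))
            opinion[v] = updated
        if largest_change < tol:
            break
    return opinion


# Toy example: a path 0 - 1 - 2 - 3 with a zealot at each end.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
opinion = expected_equilibrium_opinions(path, zealots={0: 1, 3: 0})

free_nodes = [1, 2]
average_opinion = sum(opinion[v] for v in free_nodes) / len(free_nodes)
```

On this path the free nodes settle at expected opinions of 2/3 and 1/3, so their average is 1/2; moving or adding zealots shifts that average, which is the lever the abstract describes.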

preprint · 2020 · arXiv

Mining the Automotive Industry: A Network Analysis of Corporate Positioning and Technological Trends

The digital transformation is driving revolutionary innovations, and new market entrants threaten established sectors of the economy such as the automotive industry. Following the need for monitoring shifting industries, we present a network-centred analysis of car manufacturer web pages. Solely exploiting publicly available information, we construct large networks from web pages and hyperlinks. The network properties disclose the internal corporate positioning of the three largest automotive manufacturers, Toyota, Volkswagen and Hyundai, with respect to innovative trends and their international outlook. We tag web pages concerned with topics like e-mobility and environment or autonomous driving, and investigate their relevance in the network. Sentiment analysis on individual web pages uncovers a relationship between page linking and use of positive language, particularly with respect to innovative trends. Web pages of the same country domain form clusters of different sizes in the network that reveal strong correlations with sales market orientation. Our approach maintains the web content's hierarchical structure imposed by the web page networks. It thus presents a method to reveal hierarchical structures of unstructured text content obtained from web scraping. It is highly transparent, reproducible and data-driven, and could be used to gain complementary insights into innovative strategies of firms and competitive landscapes, which would not be detectable by the analysis of web content alone.

preprint · 2015 · arXiv

Hybrid Epidemics - A Case Study on Computer Worm Conficker

Conficker is a computer worm that erupted on the Internet in 2008. It is unique in combining three different spreading strategies: local probing, neighbourhood probing, and global probing. We propose a mathematical model that combines these three modes of spreading (local, neighbourhood and global) to capture the worm's spreading behaviour. The parameters of the model are inferred directly from network data obtained during the first day of the Conficker epidemic. The model is then used to explore the trade-off between spreading modes in determining the worm's effectiveness. Our results show that the Conficker epidemic is an example of a critically hybrid epidemic, in which the different modes of spreading in isolation do not lead to successful epidemics. Such hybrid spreading strategies may be used beneficially to provide the most effective strategies for promulgating information across a large population. When used maliciously, however, they can present a dangerous challenge to current internet security protocols.

preprint · 2015 · arXiv

Hybrid spreading mechanisms and T cell activation shape the dynamics of HIV-1 infection

HIV-1 can disseminate between susceptible cells by two mechanisms: cell-free infection following fluid-phase diffusion of virions, and highly efficient direct cell-to-cell transmission at immune cell contacts. The contribution of this hybrid spreading mechanism, which is also a characteristic of some important computer worm outbreaks, to HIV-1 progression in vivo remains unknown. Here we present a new mathematical model that explicitly incorporates the ability of HIV-1 to use hybrid spreading mechanisms and evaluate the consequences for HIV-1 pathogenesis. The model captures the major phases of the HIV-1 infection course of a cohort of treatment-naive patients and also accurately predicts the results of the Short Pulse Anti-Retroviral Therapy at Seroconversion (SPARTAC) trial. Using this model we find that hybrid spreading is critical to seed and establish infection, and that cell-to-cell spread and increased CD4+ T cell activation are important for HIV-1 progression. Notably, the model predicts that cell-to-cell spread becomes increasingly effective as infection progresses and thus may present a considerable treatment barrier. Deriving predictions of various treatments' influence on HIV-1 progression highlights the importance of earlier intervention and suggests that treatments effectively targeting cell-to-cell HIV-1 spread can delay progression to AIDS. This study suggests that hybrid spreading is a fundamental feature of HIV infection, and provides the mathematical framework incorporating this feature with which to evaluate future therapeutic strategies.

preprint · 2015 · arXiv

LeoTask: a fast, flexible and reliable framework for computational research

LeoTask is a Java library for computation-intensive and time-consuming research tasks. It automatically executes tasks in parallel on multiple CPU cores on a computing facility. It uses a configuration file to enable automatic exploration of parameter space and flexible aggregation of results, and therefore allows researchers to focus on programming the key logic of a computing task. It also supports reliable recovery from interruptions, dynamic and cloneable networks, and integration with the plotting software Gnuplot.

preprint · 2015 · arXiv

Optimizing Hybrid Spreading in Metapopulations

Epidemic spreading phenomena are ubiquitous in nature and society. Examples include the spreading of diseases, information, and computer viruses. Epidemics can spread by local spreading, where infected nodes can only infect a limited set of direct target nodes, and by global spreading, where an infected node can infect every other node. In reality, many epidemics spread using a hybrid mixture of both types of spreading. In this study we develop a theoretical framework for studying hybrid epidemics, and examine the optimum balance between spreading mechanisms in terms of achieving the maximum outbreak size. We show the existence of critically hybrid epidemics where neither spreading mechanism alone can cause a noticeable spread but a combination of the two spreading mechanisms would produce an enormous outbreak. Our results provide new strategies for maximising beneficial epidemics and estimating the worst outcome of damaging hybrid epidemics.

preprint · 2014 · arXiv

Emergence of Cooperation in Non-scale-free Networks

Evolutionary game theory is one of the key paradigms in many disciplines, from science to engineering. Previous studies proposed a strategy updating mechanism, which successfully demonstrated that the scale-free network can provide a framework for the emergence of cooperation. In contrast, individuals in random graphs and small-world networks do not favor cooperation under this updating rule. However, a recent empirical result shows that heterogeneous networks do not promote cooperation when humans play a Prisoner's Dilemma. In this paper, we propose a strategy updating rule with payoff memory. We observe that random graphs and small-world networks can provide even better frameworks for cooperation than scale-free networks in this scenario. Our observations suggest that degree heterogeneity may be neither a sufficient condition nor a necessary condition for widespread cooperation in complex networks. Moreover, topological structure alone does not suffice to determine the level of cooperation in complex networks.

preprint · 2014 · arXiv

Fence-sitters Protect Cooperation in Complex Networks

Evolutionary game theory is one of the key paradigms in many disciplines, from science to engineering. In complex networks, because of the difficulty of formulating the replicator dynamics, most previous studies are confined to numerical analysis. In this paper, we introduce a vectorial formulation to derive the payoffs of three classes of individuals analytically. The three classes are pure cooperators, pure defectors, and fence-sitters. Here, fence-sitters are the individuals who change their strategies at least once during the evolution of strategies. As a general approach, our vectorial formalization can be applied to all two-strategy games. To clarify the function of the fence-sitters, we define a parameter, payoff memory, as the number of rounds over which the individuals' payoffs are aggregated. We observe that the payoff memory can control the fence-sitters' effects and the level of cooperation efficiently. Our results indicate that fence-sitters play a nontrivial role in complex topologies, protecting cooperation in an indirect way. Our results may provide a better understanding of the composition of cooperators in circumstances where the temptation to defect is larger.

preprint · 2013 · arXiv

Rumor Evolution in Social Networks

Social networks are a main channel of rumor spreading. Previous studies have concentrated on static rumor spreading, in which the content of the rumor is invariable during the whole spreading process. In reality, a rumor evolves constantly as it spreads, growing shorter, more concise, and more easily grasped and told. In an early psychological experiment, researchers found that about 70% of the details in a rumor were lost in the first 6 mouth-to-mouth transmissions \cite{TPR}. Based on these facts, we investigate rumor spreading on social networks where the content of the rumor is modified by individuals with a certain probability. In this scenario, individuals have two choices: to forward or to modify. As a forwarder, an individual disseminates the rumor directly to its neighbors. As a modifier, conversely, an individual revises the rumor before spreading it. When the rumor spreads on social networks such as scale-free networks and small-world networks, the majority of individuals are actually infected by a multi-revised version of the rumor if modifiers dominate the network. Our observation indicates that the original rumor may lose its influence in the spreading process. Similarly, true information may turn into a rumor as well. Our results suggest that rumor evolution should not be neglected, and may provide a better understanding of the generation and destruction of a rumor.

preprint · 2011 · arXiv

Diffusion-annihilation processes in weighted scale-free networks with identical degree sequence

The $A+A \rightarrow \emptyset$ and $A+B \rightarrow \emptyset$ diffusion-annihilation processes have so far been studied on weighted uncorrelated scale-free networks and fractal scale-free networks. In previous reports, it is widely accepted that the segregation of particles in these processes is introduced by the fractal structure. In this paper, we study these processes on a family of weighted scale-free networks with identical degree sequence. We find that the depletion zone and segregation are essentially caused by disassortative mixing, namely, the tendency of high-degree nodes to connect with low-degree nodes. Their influence on the processes is governed by the correlation between the weight and degree. Our finding suggests that the weight and degree distributions together do not suffice to characterize the diffusion-annihilation processes on weighted scale-free networks.

preprint · 2011 · arXiv

Inferring Internet AS Relationships Based on BGP Routing Policies

The type of business relationship between Internet autonomous systems (ASes) determines BGP inter-domain routing. Previous work on inferring AS relationships relied on connectivity information between ASes. In this paper we infer AS relationships by analysing the routing policies of ASes encoded in the BGP attributes Communities and Locpref. We accumulate BGP data from RouteViews, RIPE RIS and the public Route Servers in August 2010 and February 2011. Based on the routing policies extracted from data of the two BGP attributes, we obtain AS relationships for 39% of the links in our data, which include all links among the Tier-1 ASes and most links between Tier-1 and Tier-2 ASes. We also reveal a number of special AS relationships, namely the hybrid relationship, the partial-transit relationship, the indirect peering relationship and the backup links. These special relationships are relevant to a better understanding of Internet routing. Our work provides profound methodological progress in inferring AS relationships.

preprint · 2011 · arXiv

Traffic Fluctuations on Weighted Networks

Traffic fluctuation has so far been studied on unweighted networks. However many real traffic systems are better represented as weighted networks, where nodes and links are assigned a weight value representing their physical properties such as capacity and delay. Here we introduce a general random diffusion (GRD) model to investigate the traffic fluctuation in weighted networks, where a random walk's choice of route is affected not only by the number of links a node has, but also by the weight of individual links. We obtain analytical solutions that characterise the relation between the average traffic and the fluctuation through nodes and links. Our analysis is supported by the results of numerical simulations. We observe that the value ranges of the average traffic and the fluctuation, through nodes or links, increase dramatically with the level of heterogeneity in link weight. This highlights the key role that link weight plays in traffic fluctuation and the necessity to study traffic fluctuation on weighted networks.
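The role of link weight in a random walker's choice of route can be illustrated with a plain weight-proportional random walk. This is a simplified stand-in and not the paper's GRD model; the toy network and function names are hypothetical. For an undirected weighted graph, such a walker's long-run share of visits to a node is proportional to the node's strength (the sum of its link weights), a standard result that the sketch checks numerically.

```python
def strengths(weights):
    """Node strength: the sum of the weights of the links attached to the node."""
    s = {}
    for (u, v), w in weights.items():
        s[u] = s.get(u, 0.0) + w
        s[v] = s.get(v, 0.0) + w
    return s


def transition_probability(weights, source, target):
    """Probability that a walker at `source` steps to `target` (weight-proportional)."""
    w = weights.get((source, target), weights.get((target, source), 0.0))
    return w / strengths(weights)[source]


def stationary_distribution(weights):
    """Long-run visit share of each node: its strength divided by the total strength."""
    s = strengths(weights)
    total = sum(s.values())
    return {node: value / total for node, value in s.items()}


# Hypothetical toy network: a triangle with unequal link weights.
weights = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 3.0}
pi = stationary_distribution(weights)

# Stationarity check: one step of the walk leaves the distribution unchanged.
after_one_step = {
    j: sum(pi[i] * transition_probability(weights, i, j) for i in pi if i != j)
    for j in pi
}
```

In this toy network the heaviest link (a to c) pulls visits towards its endpoints, so nodes with equal degree still receive unequal average traffic.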

preprint · 2010 · arXiv

Phase Changes in the Evolution of the IPv4 and IPv6 AS-Level Internet Topologies

In this paper we investigate the evolution of the IPv4 and IPv6 Internet topologies at the autonomous system (AS) level over a long period of time. We provide abundant empirical evidence that there is a phase transition in the growth trend of the two networks. For the IPv4 network, the phase change occurred in 2001. Before then the network's size grew exponentially, and thereafter it followed a linear growth. Changes are also observed around the same time for the maximum node degree, the average node degree and the average shortest path length. For the IPv6 network, the phase change occurred in late 2006. It is notable that the observed phase transitions in the two networks are different; for example, the size of the IPv6 network initially grew linearly and then shifted to an exponential growth. Our results show that following decades of rapid expansion up to the beginning of this century, the IPv4 network has now evolved into a mature, steady stage characterised by relatively slow growth with a stable network structure, whereas the IPv6 network, after a slow startup process, has just taken off into full-speed growth. We also provide insight into the possible impact of the IPv6-over-IPv4 tunneling deployment scheme on the evolution of the IPv6 network. Internet topology generators have so far been based on an implicit assumption that the evolution of the Internet follows non-changing dynamic mechanisms. This assumption, however, is invalidated by our results. Our work reveals insights into the Internet evolution and provides inputs to future AS-level Internet models.

preprint · 2010 · arXiv

Why the Internet is so 'small'?

During the last three decades the Internet has experienced a fascinating evolution, with both exponential growth in traffic and rapid expansion in topology. The size of the Internet has become enormous, yet the network is very 'small' in the sense that it is extremely efficient to route data packets across the global Internet. This paper provides a brief review of three fundamental properties of the Internet topology at the autonomous system (AS) level. Firstly, the Internet has a power-law degree distribution, which means the majority of nodes on the Internet AS graph have small numbers of links, whereas a few nodes have very large numbers of links. Secondly, the Internet exhibits a property called disassortative mixing, which means poorly connected nodes tend to link with well-connected nodes, and vice versa. Thirdly, the best-connected nodes, or the rich nodes, are tightly interconnected with each other, forming a rich-club. We explain that it is these structural properties that make the global Internet so 'small'.
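Two of the properties reviewed in this abstract can be measured on a small graph with a minimal sketch. The rich-club coefficient below uses one common degree-threshold definition, and disassortative mixing is measured as the Pearson correlation of the degrees at the two ends of a link; neither is claimed to be the exact measure used in the paper. The toy graph and function names are hypothetical.

```python
from collections import defaultdict


def degrees(edges):
    """Number of links attached to each node."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def rich_club_coefficient(edges, k):
    """Link density among the nodes whose degree is greater than k."""
    deg = degrees(edges)
    rich = {node for node, d in deg.items() if d > k}
    if len(rich) < 2:
        return 0.0
    links = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * links / (len(rich) * (len(rich) - 1))


def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of a link."""
    deg = degrees(edges)
    # Count each undirected link in both directions so the measure is symmetric.
    pairs = [(deg[u], deg[v]) for u, v in edges] + [(deg[v], deg[u]) for u, v in edges]
    n = len(pairs)
    mean = sum(x for x, _ in pairs) / n
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / n
    var = sum((x - mean) ** 2 for x, _ in pairs) / n
    # The correlation is undefined when every node has the same degree.
    return cov / var if var else float("nan")


# Hypothetical toy graph: three hubs linked to each other, each with two leaf nodes.
edges = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("A", "a1"), ("A", "a2"),
    ("B", "b1"), ("B", "b2"),
    ("C", "c1"), ("C", "c2"),
]

hub_density = rich_club_coefficient(edges, k=1)
overall_density = rich_club_coefficient(edges, k=0)
mixing = degree_assortativity(edges)
```

In this toy graph the hubs are fully interconnected (density 1.0 against 0.25 for the whole graph) and the degree correlation is negative (-0.5), which mirrors the rich-club and disassortative mixing properties on a very small scale.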

preprint · 2009 · arXiv

A critical look at power law modelling of the Internet

This paper takes a critical look at the usefulness of power law models of the Internet. The twin focuses of the paper are Internet traffic and topology generation. The aim of the paper is twofold. Firstly it summarises the state of the art in power law modelling particularly giving attention to existing open research questions. Secondly it provides insight into the failings of such models and where progress needs to be made for power law research to feed through to actual improvements in network performance.

preprint · 2008 · arXiv

Fingerprint for Network Topologies

A network's topology information can be given as an adjacency matrix. The bitmap of sorted adjacency matrix (BOSAM) is a network visualisation tool that can reveal different network structures just by looking at reordered adjacency matrices. A BOSAM picture resembles the shape of a flower and is characterised by a series of 'leaves'. Here we show and mathematically prove that for most networks, there is a self-similar relation between the envelopes of the BOSAM leaves. This self-similar property allows us to use a single envelope to predict all other envelopes and therefore reconstruct the outline of a network's BOSAM picture. We liken the BOSAM envelope to a human fingerprint, as they share a number of common features: both are simple, easy to obtain, and strongly characteristic, encoding essential information for identification.

preprint · 2007 · arXiv

Characterising Web Site Link Structure

The topological structures of the Internet and the Web have received considerable attention. However, there has been little research on the topological properties of individual web sites. In this paper, we consider whether web sites (as opposed to the entire Web) exhibit structural similarities. To do so, we exhaustively crawled 18 web sites as diverse as governmental departments, commercial companies and university departments in different countries. These web sites ranged in size from a few thousand pages to millions of pages. Statistical analysis of these 18 sites revealed that the internal link structures of the web sites are significantly different when measured with first- and second-order topological properties, i.e. properties based on the connectivity of an individual node or a pair of nodes. However, examination of a third-order topological property that considers the connectivity between three nodes that form a triangle revealed a strong correspondence across web sites, suggestive of an invariant. Comparison with the Web, the AS Internet, and a citation network showed that this third-order property is not shared across other types of networks. Nor is the property exhibited in generative network models such as that of Barabasi and Albert.
