Researcher profile

Steve Gregory

Steve Gregory contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
11 works
0 followers
5 topics
4 close collaborators

Published work

11 published items

preprint, 2016, arXiv

Link Prediction with Node Clustering Coefficient

Predicting missing links in incomplete complex networks efficiently and accurately is still a challenging problem. The recently proposed CAR (Cannistraci-Alanis-Ravasi) index shows the power of local link/triangle information in improving link-prediction accuracy. With the information of level-2 links, which are links between common-neighbors, most classical similarity indices can be improved. Nevertheless, calculating the number of level-2 links makes the CAR index insufficiently efficient. Inspired by the idea of employing local link/triangle information, we propose a new similarity index with more local structure information. In our method, local link/triangle structure information is conveyed directly by the clustering coefficient of common neighbors. The clustering coefficient is effective in estimating the contribution of a common-neighbor because it employs links existing between neighbors of the common-neighbor, and these links have the same structural position relative to this common-neighbor as the candidate link. Ten real-world networks drawn from five different fields are used to test the performance of our method against classical similarity indices and the recently proposed CAR index. Two estimators, precision and AUP, are used to evaluate the accuracy of link prediction algorithms. Generally speaking, our new index only performs competitively with CAR, but it is a good complement to CAR for networks with not very high LCP-corr, a measure of the correlation between the number of common-neighbors and the number of links between common-neighbors. Besides, the proposed index is also more efficient than the CAR index.
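
As a rough illustration of the idea in this abstract, the sketch below scores a candidate link by summing the clustering coefficients of the two endpoints' common neighbors. This is one plausible reading of the abstract, not necessarily the exact index defined in the paper; the function names and the toy graph are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, z):
    """Fraction of pairs of z's neighbours that are themselves linked."""
    nbrs = adj[z]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def cc_common_neighbor_score(adj, x, y):
    """Assumed index: sum of clustering coefficients over common neighbours."""
    return sum(clustering_coefficient(adj, z) for z in adj[x] & adj[y])


# Hypothetical toy network; "a" and "b" are not linked.
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d"), ("b", "e"), ("a", "e")]
adj = build_adjacency(edges)
score_ab = cc_common_neighbor_score(adj, "a", "b")
```

Only triangles around each common neighbor are counted here, so the links between the common neighbors themselves (the level-2 links that the abstract describes as costly for CAR) never have to be enumerated.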

preprint, 2013, arXiv

Efficient local behavioral change strategies to reduce the spread of epidemics in networks

It has recently become established that the spread of infectious diseases between humans is affected not only by the pathogen itself but also by changes in behavior as the population becomes aware of the epidemic; for example, social distancing. It is also well known that community structure (the existence of relatively densely connected groups of vertices) in contact networks influences the spread of disease. We propose a set of local strategies for social distancing, based on community structure, that can be employed in the event of an epidemic to reduce the epidemic size. Unlike most social distancing methods, ours do not require individuals to know the disease state (infected or susceptible, etc.) of others, and we do not make the unrealistic assumption that the structure of the entire contact network is known. Instead, the recommended behavior change is based only on an individual's local view of the network. Each individual avoids contact with a fraction of his/her contacts, using knowledge of his/her local network to decide which contacts should be avoided. If the behavior change occurs only when an individual becomes ill or aware of the disease, these strategies can substantially reduce epidemic size with a relatively small cost, measured by the number of contacts avoided.

preprint, 2013, arXiv

Identifying Communities and Key Vertices by Reconstructing Networks from Samples

Sampling techniques such as Respondent-Driven Sampling (RDS) are widely used in epidemiology to sample "hidden" populations, such that properties of the network can be deduced from the sample. We consider how similar techniques can be designed that allow the discovery of the structure, especially the community structure, of networks. Our method involves collecting samples of a network by random walks and reconstructing the network by probabilistically coalescing vertices, using vertex attributes to determine the probabilities. Even though our method can only approximately reconstruct a part of the original network, it can recover its community structure relatively well. Moreover, it can find the key vertices which, when immunized, can effectively reduce the spread of an infection through the original network.

preprint, 2013, arXiv

Inferring High Quality Co-Travel Networks

Social networks provide a new perspective for enterprises to better understand their customers and have attracted substantial attention in industry. However, inferring high quality customer social networks is a great challenge because many traditional OLTP environments contain no explicit customer relations. In this paper, we study this issue in the field of passenger transport and introduce a new member of the family of social networks, named Co-Travel Networks, consisting of passengers connected by their co-travel behaviors. We propose a novel method to infer high quality co-travel networks of civil aviation passengers from their co-booking behaviors derived from PNRs (Passenger Name Records). In our method, to accurately evaluate the strength of ties, we present a measure, Co-Journey Times, that counts the number of complete journeys on which two passengers travel together. We infer a high quality co-travel network from a large encrypted PNR dataset and conduct a series of network analyses on it. The experimental results show the effectiveness of our inference method, as well as some special characteristics of co-travel networks, such as sparsity and high aggregation, compared with other kinds of social networks. Such co-travel networks can be expected to greatly help the industry better understand its passengers and so improve its services. More importantly, we contribute a special kind of social network with high tie strength, generated from very close and high cost travel behaviors, for further scientific research on human travel behaviors, group travel patterns, high-end travel market evolution, etc., from the perspective of social networks.

preprint, 2012, arXiv

Detecting Communities in Networks by Merging Cliques

Many algorithms have been proposed for detecting disjoint communities (relatively densely connected subgraphs) in networks. One popular technique is to optimize modularity, a measure of the quality of a partition in terms of the number of intracommunity and intercommunity edges. Greedy approximate algorithms for maximizing modularity can be very fast and effective. We propose a new algorithm that starts by detecting disjoint cliques and then merges these to optimize modularity. We show that this performs better than other similar algorithms in terms of both modularity and execution speed.
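
For readers unfamiliar with the quantity being optimized, here is a minimal sketch of the standard modularity of a partition of an undirected, unweighted network. It illustrates only the quality measure, not the clique-merging algorithm proposed in the paper; the function name and the toy graph are hypothetical.

```python
def modularity(edges, community_of):
    """Standard modularity Q = sum_c [ m_c / m - (d_c / (2m))^2 ].

    m   : total number of edges
    m_c : number of edges with both endpoints in community c
    d_c : sum of the degrees of the vertices in community c
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a network with no edges")
    intra = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {v: "A" for v in range(6)}

q_split = modularity(edges, two_groups)   # one community per triangle
q_single = modularity(edges, one_group)   # everything in one community
```

A greedy method of the kind the abstract mentions would repeatedly merge the pair of groups that increases this value the most, stopping when no merge improves it.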

preprint, 2012, arXiv

Detecting community structure in networks using edge prediction methods

Community detection and edge prediction are both forms of link mining: they are concerned with discovering the relations between vertices in networks. Some of the vertex similarity measures used in edge prediction are closely related to the concept of community structure. We use this insight to propose a novel method for improving existing community detection algorithms by using a simple vertex similarity measure. We show that this new strategy can be more effective in detecting communities than the basic community detection algorithms.

preprint, 2012, arXiv

Finding missing edges in networks based on their community structure

Many edge prediction methods have been proposed, based on various local or global properties of the structure of an incomplete network. Community structure is another significant feature of networks: Vertices in a community are more densely connected than average. It is often true that vertices in the same community have "similar" properties, which suggests that missing edges are more likely to be found within communities than elsewhere. We use this insight to propose a strategy for edge prediction that combines existing edge prediction methods with community detection. We show that this method gives better prediction accuracy than existing edge prediction methods alone.

preprint, 2012, arXiv

Ordered community structure in networks

Community structure in networks is often a consequence of homophily, or assortative mixing, based on some attribute of the vertices. For example, researchers may be grouped into communities corresponding to their research topic. This is possible if vertex attributes have discrete values, but many networks exhibit assortative mixing by some continuous-valued attribute, such as age or geographical location. In such cases, no discrete communities can be identified. We consider how the notion of community structure can be generalized to networks that are based on continuous-valued attributes: in general, a network may contain discrete communities which are ordered according to their attribute values. We propose a method of generating synthetic ordered networks and investigate the effect of ordered community structure on the spread of infectious diseases. We also show that community detection algorithms fail to recover community structure in ordered networks, and evaluate an alternative method using a layout algorithm to recover the ordering.

preprint, 2011, arXiv

Finding missing edges and communities in incomplete networks

Many algorithms have been proposed for predicting missing edges in networks, but they do not usually take account of which edges are missing. We focus on networks which have missing edges of the form that is likely to occur in real networks, and compare algorithms that find these missing edges. We also investigate the effect of this kind of missing data on community detection algorithms.

preprint, 2011, arXiv

Fuzzy overlapping communities in networks

Networks commonly exhibit a community structure, whereby groups of vertices are more densely connected to each other than to other vertices. Often these communities overlap, such that each vertex may occur in more than one community. However, two distinct types of overlapping are possible: crisp (where each vertex belongs fully to each community of which it is a member) and fuzzy (where each vertex belongs to each community to a different extent). We investigate the effects of the fuzziness of community overlap. We find that it has a strong effect on the performance of community detection methods: some algorithms perform better with fuzzy overlapping while others favour crisp overlapping. We also evaluate the performance of some algorithms that recover the belonging coefficients when the overlap is fuzzy. Finally, we investigate whether real networks contain fuzzy or crisp overlapping.
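
The distinction between crisp and fuzzy overlap can be made concrete with a small data-structure sketch. It assumes that fuzzy membership is stored as belonging coefficients that sum to 1 for each vertex, and that a crisp view is obtained by thresholding them; the threshold value, the function name, and the example data are illustrative assumptions, not taken from the paper.

```python
def to_crisp(fuzzy, threshold=0.25):
    """Keep, for each vertex, the communities whose coefficient reaches the threshold.

    fuzzy: {vertex: {community: belonging coefficient}}
    Returns {vertex: set of communities}, i.e. full membership in each kept community.
    """
    return {
        vertex: {c for c, b in coeffs.items() if b >= threshold}
        for vertex, coeffs in fuzzy.items()
    }


# Hypothetical fuzzy cover: "b" belongs mostly to X and partly to Y.
fuzzy = {
    "a": {"X": 1.0},
    "b": {"X": 0.7, "Y": 0.3},
    "c": {"X": 0.1, "Y": 0.9},
}
crisp = to_crisp(fuzzy)
overlapping = sorted(v for v, comms in crisp.items() if len(comms) > 1)
```

In the crisp view "b" is simply a member of both communities, and the information that it leans towards X is lost, which is the kind of difference the abstract says affects detection algorithms.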

preprint, 2010, arXiv

Finding overlapping communities in networks by label propagation

We propose an algorithm for finding overlapping community structure in very large networks. The algorithm is based on the label propagation technique of Raghavan, Albert, and Kumara, but is able to detect communities that overlap. As in the original algorithm, vertices have labels that propagate between neighbouring vertices so that members of a community reach a consensus on their community membership. Our main contribution is to extend the label and the propagation step to include information about more than one community: each vertex can now belong to up to v communities, where v is the parameter of the algorithm. Our algorithm can also handle weighted and bipartite networks. Tests on an independently designed set of benchmarks, and on real networks, show the algorithm to be highly effective in recovering overlapping communities. It is also very fast and can process very large and dense networks in a short time.
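
The propagation step described in this abstract might look roughly like the simplified sketch below. It rests on several assumptions that are not in the source: updates are synchronous, each vertex's label is a map from community identifier to belonging coefficient, coefficients below 1/v are dropped, ties are broken deterministically by smallest identifier, and the loop stops when label sets stop changing or after a fixed number of rounds. The published algorithm may differ on each of these points, and all names here are hypothetical.

```python
def propagate_once(adj, labels, v):
    """One synchronous round: average neighbours' labels, prune weak ones, renormalise."""
    new_labels = {}
    for x, nbrs in adj.items():
        if not nbrs:
            new_labels[x] = dict(labels[x])  # isolated vertex keeps its label
            continue
        summed = {}
        for y in nbrs:
            for c, b in labels[y].items():
                summed[c] = summed.get(c, 0.0) + b / len(nbrs)
        # Keep only communities whose coefficient is at least 1/v,
        # which limits each vertex to at most v communities.
        kept = {c: b for c, b in summed.items() if b >= 1.0 / v}
        if not kept:
            # Assumption: deterministic tie-break (smallest identifier wins).
            best = max(sorted(summed), key=lambda c: summed[c])
            kept = {best: summed[best]}
        total = sum(kept.values())
        new_labels[x] = {c: b / total for c, b in kept.items()}
    return new_labels


def overlapping_label_propagation(adj, v=2, max_iters=20):
    """Start with one unique label per vertex and iterate the propagation step."""
    labels = {x: {x: 1.0} for x in adj}
    for _ in range(max_iters):
        updated = propagate_once(adj, labels, v)
        unchanged = all(set(updated[x]) == set(labels[x]) for x in adj)
        labels = updated
        if unchanged:
            break
    return labels


# Hypothetical toy network: two triangles sharing vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
result = overlapping_label_propagation(adj, v=2)
```

Vertices that end up sharing a community identifier form one community, and a vertex holding more than one identifier lies in an overlap. Synchronous updates can oscillate on some graphs, which is why this sketch caps the number of rounds.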