Researcher profile

Alexander Y. Shestopaloff

Alexander Y. Shestopaloff contributes to research discovery and scholarly infrastructure.


Published work

3 published items

Preprint, 2022, arXiv

Statistical network isomorphism

Graph isomorphism is a problem for which there is no known polynomial-time solution. Nevertheless, assessing (dis)similarity between two or more networks is a key task in many areas, such as image recognition, biology, chemistry, computer and social networks. Moreover, questions of similarity are typically more general and their answers more widely applicable than the more restrictive isomorphism question. In this article, we offer a statistical answer to the following questions: a) "Are networks $G_1$ and $G_2$ similar?", b) "How different are the networks $G_1$ and $G_2$?" and c) "Is $G_3$ more similar to $G_1$ or $G_2$?". Our comparisons begin with the transformation of each graph into an all-pairs distance matrix. Our node-node distance, Jaccard distance, has been shown to offer a good reflection of the graph's connectivity structure. We then model these distances as probability distributions. Finally, we use well-established statistical tools to gauge the (dis)similarities in terms of probability distribution (dis)similarity. This comparison procedure aims to detect (dis)similarities in connectivity structure, not in easily observable graph characteristics, such as degrees, edge counts or density. We validate our hypothesis that graphs can be meaningfully summarized and compared via their node-node distance distributions, using several synthetic and real-world graphs. Empirical results demonstrate its validity and the accuracy of our comparison technique.
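The pipeline described in this abstract (all-pairs Jaccard distances, then a comparison of the resulting distributions) can be illustrated with a minimal sketch. Two choices here are assumptions, not details taken from the abstract: neighbourhoods are closed (each node is counted in its own neighbourhood), and the "well-established statistical tool" is stood in for by a hand-written two-sample Kolmogorov-Smirnov statistic. The function names are hypothetical.

```python
from itertools import combinations


def jaccard_distances(adj):
    """Return the sorted pairwise Jaccard distances of an undirected graph.

    adj maps each node to the set of its neighbours.
    Assumption: closed neighbourhoods (the node itself is included).
    """
    closed = {u: set(nbrs) | {u} for u, nbrs in adj.items()}
    dists = []
    for u, v in combinations(sorted(closed), 2):
        inter = len(closed[u] & closed[v])
        union = len(closed[u] | closed[v])
        dists.append(1.0 - inter / union)
    return sorted(dists)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    def ecdf(sample, t):
        return sum(1 for s in sample if s <= t) / len(sample)

    points = sorted(set(xs) | set(ys))
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in points)


# Complete graph on 4 nodes versus a path on 4 nodes.
k4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(ks_statistic(jaccard_distances(k4), jaccard_distances(p4)))  # prints 1.0
```

In the complete graph every pair has distance 0, while every pair in the path has a positive distance, so the two empirical distributions do not overlap and the statistic reaches its maximum of 1. A value near 0 would instead indicate similar connectivity structure under this particular measure.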

Preprint, 2020, arXiv

A Statistical Density-Based Analysis of Graph Clustering Algorithm Performance

Measuring graph clustering quality remains an open problem. To address it, we introduce quality measures based on comparisons of intra- and inter-cluster densities, an accompanying statistical test of the significance of their differences and a step-by-step routine for clustering quality assessment. Our null hypothesis does not rely on any generative model for the graph, unlike modularity, which uses the configuration model as a null model. Our measures are shown to meet the axioms of a good clustering quality function, unlike the very commonly used modularity measure. They also have an intuitive graph-theoretic interpretation and a formal statistical interpretation, and can be easily tested for significance. Our work is centered on the idea that well-clustered graphs will display a significantly larger intra-cluster density than inter-cluster density. We develop tests to validate the existence of such a cluster structure. We empirically explore the behavior of our measures under a number of stress test scenarios and compare their behavior to the commonly used modularity and conductance measures. Empirical stress test results confirm that our measures compare very favorably to the established ones. In particular, they are shown to be more responsive to graph structure, less sensitive to sample size and to breakdowns during numerical implementation, and less sensitive to uncertainty in connectivity. These features are especially important in the context of larger data sets or when the data may contain errors in the connectivity patterns.
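The central idea of this abstract, that a good clustering shows a markedly higher intra-cluster density than inter-cluster density, can be sketched as follows. The abstract does not specify the authors' test, so the pooled two-proportion z statistic below is only an assumed stand-in, and the function names are hypothetical. The sketch also assumes at least one intra-cluster pair, at least one inter-cluster pair, and a pooled density strictly between 0 and 1.

```python
import math
from itertools import combinations


def cluster_densities(edges, labels):
    """Return (intra_density, inter_density, intra_pairs, inter_pairs).

    edges is an iterable of undirected node pairs; labels maps node -> cluster id.
    Density = edges present / node pairs possible, within or between clusters.
    """
    edge_set = {frozenset(e) for e in edges}
    intra_pairs = inter_pairs = intra_edges = inter_edges = 0
    for u, v in combinations(sorted(labels), 2):
        present = frozenset((u, v)) in edge_set
        if labels[u] == labels[v]:
            intra_pairs += 1
            intra_edges += present
        else:
            inter_pairs += 1
            inter_edges += present
    return (intra_edges / intra_pairs, inter_edges / inter_pairs,
            intra_pairs, inter_pairs)


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic (an assumed stand-in test)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
intra, inter, n_intra, n_inter = cluster_densities(edges, labels)
z = two_proportion_z(intra, n_intra, inter, n_inter)
```

For this toy graph all 6 intra-cluster pairs are connected and only 1 of the 9 inter-cluster pairs is, so the intra-cluster density is 1 and the inter-cluster density is 1/9; a large positive `z` points toward a real cluster structure under the assumed test.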

Preprint, 2020, arXiv

Graph Distances and Clustering

With a view to graph clustering, we present a definition of vertex-to-vertex distance which is based on shared connectivity. We argue that vertices sharing more connections are closer to each other than vertices sharing fewer connections. Our thesis is centered on the widely accepted notion that strong clusters are formed by high levels of induced subgraph density, where subgraphs represent clusters. We argue these clusters are formed by grouping vertices deemed to be similar in their connectivity. At the cluster level (induced subgraph level), our thesis translates into low mean intra-cluster distances. Our definition differs from the usual shortest-path geodesic distance. In this article, we compare three distance measures from the literature. Our benchmark is the accuracy of each measure's reflection of intra-cluster density, when aggregated (averaged) at the cluster level. We conduct our tests on synthetic graphs generated using the planted partition model, where clusters and intra-cluster density are known in advance. We examine correlations between mean intra-cluster distances and intra-cluster densities. Our numerical experiments show that Jaccard and Otsuka-Ochiai offer very accurate measures of density, when averaged over vertex pairs within clusters.
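The experiment outlined in this abstract can be approximated with a small sketch: generate a planted partition graph, then average a neighbourhood-based distance over vertex pairs within clusters. The Jaccard and Otsuka-Ochiai formulas are the standard set-similarity definitions; the use of closed neighbourhoods, the generator parameters and all function names are assumptions of this sketch rather than details from the paper.

```python
import math
import random
from itertools import combinations


def planted_partition(n_clusters, size, p_in, p_out, seed=0):
    """Random graph with equal-sized clusters; edge probability p_in within
    a cluster and p_out between clusters. Returns (adjacency, labels)."""
    rng = random.Random(seed)
    labels = {v: v // size for v in range(n_clusters * size)}
    adj = {v: set() for v in labels}
    for u, v in combinations(labels, 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj, labels


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| for two non-empty sets."""
    return 1.0 - len(a & b) / len(a | b)


def otsuka_ochiai_distance(a, b):
    """1 - |A & B| / sqrt(|A| * |B|) for two non-empty sets."""
    return 1.0 - len(a & b) / math.sqrt(len(a) * len(b))


def mean_intra_cluster_distance(adj, labels, dist):
    """Average dist over all vertex pairs that share a cluster.

    Assumption: closed neighbourhoods, so every set is non-empty.
    """
    closed = {v: adj[v] | {v} for v in adj}
    total, count = 0.0, 0
    for u, v in combinations(adj, 2):
        if labels[u] == labels[v]:
            total += dist(closed[u], closed[v])
            count += 1
    return total / count


adj, labels = planted_partition(n_clusters=3, size=10, p_in=0.8, p_out=0.05)
mean_jaccard = mean_intra_cluster_distance(adj, labels, jaccard_distance)
mean_ochiai = mean_intra_cluster_distance(adj, labels, otsuka_ochiai_distance)
```

Sweeping `p_in` over a range of values and recording both means against the realised intra-cluster density would give the kind of correlation the abstract refers to. In the limiting case `p_in=1.0, p_out=0.0` each cluster is a clique, all closed neighbourhoods within a cluster coincide, and both mean distances are exactly 0.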