Researcher profile

Zhaoyang Shao

Zhaoyang Shao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 11 - UnverifiedVerification L1Unclaimed author
1works
0followers
1topics
2close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

1 published item(s)

preprint2020arXiv

Efficient Error-tolerant Search on Knowledge Graphs

Edge-labeled graphs are widely used to describe relationships between entities in a database. Given a query subgraph that represents an example of what the user is searching for, we study the problem of efficiently searching for similar subgraphs in a large data graph, where the similarity is defined in terms of the well-known graph edit distance. We call these queries "error-tolerant exemplar queries" since matches are allowed despite small variations in the graph structure and the labels. The problem in its general case is computationally intractable, but efficient solutions are reachable for labeled graphs under well-behaved distribution of the labels, commonly found in knowledge graphs. We propose two efficient exact algorithms, based on a filtering-and-verification framework, for finding subgraphs in a large data graph that are isomorphic to a query graph under some edit operations. Our filtering scheme, which uses the neighbourhood structure around a node and the presence or absence of paths, significantly reduces the number of candidates that are passed to the verification stage. Moreover, we analyze the costs of our algorithms and the conditions under which one algorithm is expected to outperform the other. Our analysis identifies some of the variables that affect the cost, including the number and the selectivity of query edge labels and the degree of nodes in the data graph, and characterizes their relationships. We empirically evaluate the effectiveness of our filtering schemes and queries, the efficiency of our algorithms and the reliability of our cost models on real datasets.