Researcher profile

Emily Smith

Emily Smith contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
3 topics
2 close collaborators

Published work

2 published items

Preprint | 2015 | arXiv

Estimating Subgraph Frequencies with or without Attributes from Egocentrically Sampled Data

In this paper we show how to efficiently produce unbiased estimates of subgraph frequencies from a probability sample of egocentric networks (i.e., focal nodes, their neighbors, and the induced subgraphs of ties among their neighbors). A key feature of our proposed method that differentiates it from prior methods is the use of egocentric data. Because of this, our method is suitable for estimation in large unknown graphs, is easily parallelizable, handles privacy-sensitive network data (e.g., egonets with no neighbor labels), and supports counting of large subgraphs (e.g., maximal clique of size 205 in Section 6) by building on top of existing exact subgraph counting algorithms that may not support sampling. It gracefully handles a variety of sampling designs such as uniform or weighted independence or random walk sampling. Our method can be used for subgraphs that are: (i) undirected or directed; (ii) induced or non-induced; (iii) maximal or non-maximal; and (iv) potentially annotated with attributes. We compare our estimators on a variety of real-world graphs and sampling methods and provide suggestions for their use. Simulation shows that our method outperforms the state-of-the-art approach for relative subgraph frequencies by up to an order of magnitude for the same sample size. Finally, we apply our methodology to a rare sample of Facebook users across the social graph to estimate and interpret the clique size distribution and gender composition of cliques.
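
To illustrate the general idea of estimating a subgraph count from egocentric samples, here is a minimal sketch for the simplest case: triangles in an undirected graph, with egos drawn uniformly at random. It is not the paper's estimator. The function names, the dict-of-sets graph representation, and the restriction to triangles and uniform sampling are all assumptions made for illustration. Each triangle is visible from exactly three egonets, so the sampled per-ego counts are scaled by N/n and divided by 3.

```python
from itertools import combinations

def triangles_at(adj, ego):
    """Count triangles containing `ego`, using only its egonet:
    the ego's neighbors and the ties among those neighbors."""
    nbrs = adj[ego]
    return sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])

def estimate_triangles(adj, sample):
    """Estimate the total triangle count from uniformly sampled egos.

    Assumes (hypothetically) that `adj` maps each node to a set of
    neighbors and that the population size N = len(adj) is known.
    """
    N = len(adj)
    n = len(sample)
    per_ego = sum(triangles_at(adj, ego) for ego in sample)
    # Scale the sample total up to the population, then divide by 3
    # because every triangle is counted once from each of its 3 nodes.
    return (N / n) * per_ego / 3

# Toy graph: one triangle (0, 1, 2) plus a pendant node 3 attached to 0.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
census_estimate = estimate_triangles(toy, [0, 1, 2, 3])
```

With every node sampled, the estimate equals the true count. With a smaller uniform sample, individual estimates vary but average to the true count. Weighted or random walk designs would need per-node inclusion weights, which this sketch leaves out.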

Preprint | 2013 | arXiv

Estimating Clique Composition and Size Distributions from Sampled Network Data

Cliques are defined as complete graphs or subgraphs; they are the strongest form of cohesive subgroup, and are of interest in both social science and engineering contexts. In this paper we show how to efficiently estimate the distribution of clique sizes from a probability sample of nodes obtained from a graph (e.g., by independence or link-trace sampling). We introduce two types of unbiased estimators, one of which exploits labeling of sampled nodes' neighbors and one of which does not require this information. We compare the estimators on a variety of real-world graphs and provide suggestions for their use. We generalize our estimators to cases in which cliques are distinguished not only by size but also by node attributes, allowing us to estimate clique composition by size. Finally, we apply our methodology to a sample of Facebook users to estimate the clique size distribution by gender over the social graph.