Source author record

Mark S. Handcock

Mark S. Handcock appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, Applications, Computation, physics.soc-ph, Social and Information Networks, stat.OT

Catalog footprint

What is connected

13 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

An Approach to Causal Inference over Stochastic Networks

Claiming causal inferences in network settings necessitates careful consideration of the often complex dependency between outcomes for actors. Of particular importance are treatment spillover or outcome interference effects. We consider causal inference when the actors are connected via an underlying network structure. Our key contribution is a model for causality when the underlying network is unobserved and the actor covariates evolve stochastically over time. We develop a joint model for the relational and covariate generating process that avoids restrictive separability assumptions and deterministic network assumptions that do not hold in the majority of social network settings of interest. Our framework utilizes the highly general class of Exponential-family Random Network models (ERNM) of which Markov Random Fields (MRF) and Exponential-family Random Graph models (ERGM) are special cases. We present potential outcome based inference within a Bayesian framework, and propose a simple modification to the exchange algorithm to allow for sampling from ERNM posteriors. We present results of a simulation study demonstrating the validity of the approach. Finally, we demonstrate the value of the framework in a case-study of smoking over time in the context of adolescent friendship networks.

Preprint, 2022, arXiv

Population level information combined parameter estimation from complex survey datasets

We consider an empirical likelihood framework for inference for a statistical model based on an informative sampling design and population-level information. The population-level information is summarized in the form of estimating equations and incorporated into the inference through additional constraints. Covariate information is incorporated both through the weights and the estimating equations. The estimator is based on conditional weights. We show that under the usual conditions, with the population size increasing without bound, the estimates are strongly consistent, asymptotically unbiased, and normally distributed. Moreover, they are more efficient than other probability-weighted analogs. Our framework provides additional justification for inverse probability weighted score estimators in terms of conditional empirical likelihood. We give an application to demographic hazard modeling by combining birth registration data with panel survey data to estimate annual first birth probabilities.

Preprint, 2016, arXiv

Spatial Temporal Exponential-Family Point Process Models for the Evolution of Social Systems

We develop a class of exponential-family point processes based on a latent social space to model the coevolution of social structure and behavior over time. Temporal dynamics are modeled as a discrete Markov process specified through individual transition distributions for each actor in the system at a given time. We prove that these distributions have an analytic closed form under certain conditions and use the result to develop likelihood-based inference. We provide a computational framework to enable both simulation and inference in practice. Finally, we demonstrate the value of these models by analyzing alcohol and drug use over time in the context of adolescent friendship networks.

Preprint, 2013, arXiv

Analysis of Partially Observed Networks via Exponential-family Random Network Models

Exponential-family random network (ERN) models specify a joint representation of both the dyads of a network and nodal characteristics. This class of models allows the nodal characteristics to be modelled as stochastic processes, expanding the range and realism of exponential-family approaches to network modelling. In this paper we develop a theory of inference for ERN models when only part of the network is observed, as well as specific methodology for missing data, including non-ignorable mechanisms for network-based sampling designs and for latent class models. In particular, we consider data collected via contact tracing, of considerable importance to infectious disease epidemiology and public health.

Preprint, 2012, arXiv

A Separable Model for Dynamic Networks

Models of dynamic networks (networks that evolve over time) have manifold applications. We develop a discrete-time generative model for social network evolution that inherits the richness and flexibility of the class of exponential-family random graph models. The model, a Separable Temporal ERGM (STERGM), facilitates separable modeling of the tie duration distributions and the structural dynamics of tie formation. We develop likelihood-based inference for the model, and provide computational algorithms for maximum likelihood estimation. We illustrate the interpretability of the model in analyzing a longitudinal network of friendship ties within a school.
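The separability idea in the abstract above can be sketched in its simplest, dyad-independent form: at each discrete time step, non-ties form independently with one probability and existing ties persist independently with another. This is a toy illustration only, not the STERGM inference procedure from the paper. The names `simulate_step`, `expected_duration` and `stationary_density` and all parameter values are hypothetical, and a real STERGM lets formation and dissolution depend on network structure.

```python
import random

def simulate_step(ties, n, p_form, p_persist, rng):
    """One discrete time step of a toy separable dynamic network.

    ties: set of (i, j) pairs with i < j, the current undirected ties.
    Formation and dissolution follow separate, independent rules:
    a non-tie forms with probability p_form, and an existing tie
    persists with probability p_persist.
    """
    new_ties = set()
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in ties:
                if rng.random() < p_persist:   # dissolution process
                    new_ties.add((i, j))
            elif rng.random() < p_form:        # formation process
                new_ties.add((i, j))
    return new_ties

def expected_duration(p_persist):
    """Mean tie duration in steps (geometric, dissolution prob 1 - p_persist)."""
    return 1.0 / (1.0 - p_persist)

def stationary_density(p_form, p_persist):
    """Long-run tie probability of the two-state Markov chain on each dyad."""
    return p_form / (p_form + (1.0 - p_persist))

if __name__ == "__main__":
    rng = random.Random(0)
    n, p_form, p_persist = 30, 0.02, 0.9   # hypothetical values
    ties = set()
    for _ in range(200):
        ties = simulate_step(ties, n, p_form, p_persist, rng)
    density = len(ties) / (n * (n - 1) / 2)
    print("simulated density:", round(density, 3))
    print("stationary density:", round(stationary_density(p_form, p_persist), 3))
    print("expected tie duration:", expected_duration(p_persist))
```

In this toy version the persistence probability alone determines how long ties last, while the formation probability controls how often new ties appear. That split is the kind of interpretability the abstract attributes to separable models.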

Preprint, 2012, arXiv

Estimating Hidden Population Size using Respondent-Driven Sampling Data

Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. Typically, a sampling frame is not available, and population members are difficult to identify or recruit from broader sampling frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS. The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is proposed that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. A particular choice of the prior produces interval estimates with good frequentist properties. Finally, the method demonstrates sensible results when used to estimate the numbers of sub-populations most at risk for HIV in two cities in El Salvador.

Preprint, 2012, arXiv

Estimating within-school contact networks to understand influenza transmission

Many epidemic models approximate social contact behavior by assuming random mixing within mixing groups (e.g., homes, schools and workplaces). The effect of more realistic social network structure on estimates of epidemic parameters is an open area of exploration. We develop a detailed statistical model to estimate the social contact network within a high school using friendship network data and a survey of contact behavior. Our contact network model includes classroom structure, longer durations of contacts to friends than nonfriends and more frequent contacts with friends, based on reports in the contact survey. We performed simulation studies to explore which network structures are relevant to influenza transmission. These studies yield two key findings. First, we found that the friendship network structure important to the transmission process can be adequately represented by a dyad-independent exponential random graph model (ERGM). This means that individual-level sampled data is sufficient to characterize the entire friendship network. Second, we found that contact behavior was adequately represented by a static rather than dynamic contact network.

Preprint, 2012, arXiv

Exponential-family Random Network Models

Random graphs, where the connections between nodes are considered random variables, have wide applicability in the social sciences. Exponential-family Random Graph Models (ERGM) have shown themselves to be a useful class of models for representing complex social phenomena. We generalize ERGM by also modeling nodal attributes as random variates, thus creating a random model of the full network, which we call Exponential-family Random Network Models (ERNM). We demonstrate how this framework allows a new formulation for logistic regression in network data. We develop likelihood-based inference for the model and an MCMC algorithm to implement it. This new model formulation is used to analyze a peer social network from the National Longitudinal Study of Adolescent Health. We model the relationship between substance use and friendship relations, and show how the results differ from the standard use of logistic regression on network data.

Preprint, 2011, arXiv

Estimating within-household contact networks from egocentric data

Acute respiratory diseases are transmitted over networks of social contacts. Large-scale simulation models are used to predict epidemic dynamics and evaluate the impact of various interventions, but the contact behavior in these models is based on simplistic and strong assumptions which are not informed by survey data. These assumptions are also used for estimating transmission measures such as the basic reproductive number and secondary attack rates. Development of methodology to infer contact networks from survey data could improve these models and estimation methods. We contribute to this area by developing a model of within-household social contacts and using it to analyze the Belgian POLYMOD data set, which contains detailed diaries of social contacts in a 24-hour period. We model dependency in contact behavior through a latent variable indicating which household members are at home. We estimate age-specific probabilities of being at home and age-specific probabilities of contact conditional on two members being at home. Our results differ from the standard random mixing assumption. In addition, we find that the probability that all members contact each other on a given day is fairly low: 0.49 for households with two 0-5 year olds and two 19-35 year olds, and 0.36 for households with two 12-18 year olds and two 36+ year olds. We find higher contact rates in households with 2-3 members, helping explain the higher influenza secondary attack rates found in households of this size.

Preprint, 2011, arXiv

Network Model-Assisted Inference from Respondent-Driven Sampling Data

Respondent-Driven Sampling is a method to sample hard-to-reach human populations by link-tracing over their social networks. Beginning with a convenience sample, each person sampled is given a small number of uniquely identified coupons to distribute to other members of the target population, making them eligible for enrollment in the study. This can be an effective means to collect large diverse samples from many populations. Inference from such data requires specialized techniques for two reasons. Unlike in standard sampling designs, the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights necessary for traditional design-based inference. Any likelihood-based inference requires the modeling of the complex sampling process often beginning with a convenience sample. We introduce a model-assisted approach, resulting in a design-based estimator leveraging a working model for the structure of the population over which sampling is conducted. We demonstrate that the new estimator has improved performance compared to existing estimators and is able to adjust for the bias induced by the selection of the initial sample. We present sensitivity analyses for unknown population sizes and the misspecification of the working network model. We develop a bootstrap procedure to compute measures of uncertainty. We apply the method to the estimation of HIV prevalence in a population of injecting drug users (IDU) in the Ukraine, and show how it can be extended to include application-specific information.

Preprint, 2011, arXiv

On the Concept of Snowball Sampling

This brief comment reflects on the historical and current uses of the term "snowball sampling."

Preprint, 2010, arXiv

Adjusting for Network Size and Composition Effects in Exponential-Family Random Graph Models

Exponential-family random graph models (ERGMs) provide a principled way to model and simulate features common in human social networks, such as propensities for homophily and friend-of-a-friend triad closure. We show that, without adjustment, ERGMs preserve density as network size increases. Density invariance is often not appropriate for social networks. We suggest a simple modification based on an offset which instead preserves the mean degree and accommodates changes in network composition asymptotically. We demonstrate that this approach allows ERGMs to be applied to the important situation of egocentrically sampled data. We analyze data from the National Health and Social Life Survey (NHSLS).
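The contrast between preserving density and preserving mean degree in the abstract above can be seen in the simplest possible case, an edges-only (Bernoulli) ERGM, where every tie is present independently with probability logistic(theta). The sketch below is an illustration under stated assumptions, not the authors' implementation: the offset is taken to be -log(n) added to the edges coefficient, and the function names and the value of theta are hypothetical.

```python
import math

def tie_probability(theta, n, use_offset=False):
    """Tie probability in an edges-only (Bernoulli) ERGM on n nodes.

    With use_offset=True the edges coefficient is shifted by -log(n),
    an assumed form of the offset adjustment, used for illustration.
    """
    eta = theta - math.log(n) if use_offset else theta
    return 1.0 / (1.0 + math.exp(-eta))

def expected_mean_degree(theta, n, use_offset=False):
    """Expected degree of a node: (n - 1) possible ties times the tie probability."""
    return (n - 1) * tie_probability(theta, n, use_offset)

if __name__ == "__main__":
    theta = 1.0  # hypothetical edges coefficient
    print("n, density, mean degree (no offset), mean degree (offset)")
    for n in (50, 500, 5000):
        print(
            n,
            round(tie_probability(theta, n), 4),              # same for every n
            round(expected_mean_degree(theta, n), 1),         # grows with n
            round(expected_mean_degree(theta, n, True), 3),   # levels off
        )
```

Without the offset the tie probability does not depend on n, so the expected degree grows in proportion to network size. With the assumed offset the expected degree is (n - 1) / (1 + n e^(-theta)), which approaches exp(theta) as n grows.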

Preprint, 2010, arXiv

Modeling social networks from sampled data

Network models are widely used to represent relational information among interacting units and the structural implications of these relations. Recently, social network studies have focused a great deal of attention on random graph models of networks whose nodes represent individual social actors and whose edges represent a specified relationship between the actors. Most inference for social network models assumes that the presence or absence of all possible links is observed, that the information is completely reliable, and that there are no measurement (e.g., recording) errors. This is clearly not true in practice, as much network data is collected through sample surveys. In addition, even if a census of a population is attempted, individuals and links between individuals are missed (i.e., do not appear in the recorded data). In this paper we develop the conceptual and computational theory for inference based on sampled network information. We first review forms of network sampling designs used in practice. We consider inference from the likelihood framework, and develop a typology of network data that reflects their treatment within this frame. We then develop inference for social network models based on information from adaptive network designs. We motivate and illustrate these ideas by analyzing the effect of link-tracing sampling designs on a collaboration network.
