Researcher profile

Mark S. Handcock

Mark S. Handcock contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
11 works
0 followers
5 topics
4 close collaborators

Published work

11 published item(s)

preprint, 2022, arXiv

An Approach to Causal Inference over Stochastic Networks

Claiming causal inferences in network settings necessitates careful consideration of the often complex dependency between outcomes for actors. Of particular importance are treatment spillover or outcome interference effects. We consider causal inference when the actors are connected via an underlying network structure. Our key contribution is a model for causality when the underlying network is unobserved and the actor covariates evolve stochastically over time. We develop a joint model for the relational and covariate generating process that avoids restrictive separability assumptions and deterministic network assumptions that do not hold in the majority of social network settings of interest. Our framework utilizes the highly general class of Exponential-family Random Network models (ERNM) of which Markov Random Fields (MRF) and Exponential-family Random Graph models (ERGM) are special cases. We present potential outcome based inference within a Bayesian framework, and propose a simple modification to the exchange algorithm to allow for sampling from ERNM posteriors. We present results of a simulation study demonstrating the validity of the approach. Finally, we demonstrate the value of the framework in a case-study of smoking over time in the context of adolescent friendship networks.

preprint, 2022, arXiv

Population level information combined parameter estimation from complex survey datasets

We consider an empirical likelihood framework for inference for a statistical model based on an informative sampling design and population-level information. The population-level information is summarized in the form of estimating equations and incorporated into the inference through additional constraints. Covariate information is incorporated both through the weights and the estimating equations. The estimator is based on conditional weights. We show that under usual conditions, with population size increasing unbounded, the estimates are strongly consistent, asymptotically unbiased, and normally distributed. Moreover, they are more efficient than other probability-weighted analogs. Our framework provides additional justification for inverse probability weighted score estimators in terms of conditional empirical likelihood. We give an application to demographic hazard modeling by combining birth registration data with panel survey data to estimate annual first birth probabilities.

preprint, 2013, arXiv

Analysis of Partially Observed Networks via Exponential-family Random Network Models

Exponential-family random network (ERN) models specify a joint representation of both the dyads of a network and nodal characteristics. This class of models allows the nodal characteristics to be modelled as stochastic processes, expanding the range and realism of exponential-family approaches to network modelling. In this paper we develop a theory of inference for ERN models when only part of the network is observed, as well as specific methodology for missing data, including non-ignorable mechanisms for network-based sampling designs and for latent class models. In particular, we consider data collected via contact tracing, of considerable importance to infectious disease epidemiology and public health.

preprint, 2012, arXiv

Estimating Hidden Population Size using Respondent-Driven Sampling Data

Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. Typically, a sampling frame is not available, and population members are difficult to identify or recruit from broader sampling frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS. The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is proposed that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. A particular choice of the prior produces interval estimates with good frequentist properties. Finally, the method demonstrates sensible results when used to estimate the numbers of sub-populations most at risk for HIV in two cities in El Salvador.

preprint, 2012, arXiv

Estimating within-school contact networks to understand influenza transmission

Many epidemic models approximate social contact behavior by assuming random mixing within mixing groups (e.g., homes, schools and workplaces). The effect of more realistic social network structure on estimates of epidemic parameters is an open area of exploration. We develop a detailed statistical model to estimate the social contact network within a high school using friendship network data and a survey of contact behavior. Our contact network model includes classroom structure, longer durations of contacts to friends than nonfriends and more frequent contacts with friends, based on reports in the contact survey. We performed simulation studies to explore which network structures are relevant to influenza transmission. These studies yield two key findings. First, we found that the friendship network structure important to the transmission process can be adequately represented by a dyad-independent exponential random graph model (ERGM). This means that individual-level sampled data is sufficient to characterize the entire friendship network. Second, we found that contact behavior was adequately represented by a static rather than dynamic contact network.

preprint, 2012, arXiv

Exponential-family Random Network Models

Random graphs, where the connections between nodes are considered random variables, have wide applicability in the social sciences. Exponential-family Random Graph Models (ERGM) have shown themselves to be a useful class of models for representing complex social phenomena. We generalize ERGM by also modeling nodal attributes as random variates, thus creating a random model of the full network, which we call Exponential-family Random Network Models (ERNM). We demonstrate how this framework allows a new formulation for logistic regression in network data. We develop likelihood-based inference for the model and an MCMC algorithm to implement it. This new model formulation is used to analyze a peer social network from the National Longitudinal Study of Adolescent Health. We model the relationship between substance use and friendship relations, and show how the results differ from the standard use of logistic regression on network data.

preprint, 2011, arXiv

Estimating within-household contact networks from egocentric data

Acute respiratory diseases are transmitted over networks of social contacts. Large-scale simulation models are used to predict epidemic dynamics and evaluate the impact of various interventions, but the contact behavior in these models is based on simplistic and strong assumptions which are not informed by survey data. These assumptions are also used for estimating transmission measures such as the basic reproductive number and secondary attack rates. Development of methodology to infer contact networks from survey data could improve these models and estimation methods. We contribute to this area by developing a model of within-household social contacts and using it to analyze the Belgian POLYMOD data set, which contains detailed diaries of social contacts in a 24-hour period. We model dependency in contact behavior through a latent variable indicating which household members are at home. We estimate age-specific probabilities of being at home and age-specific probabilities of contact conditional on two members being at home. Our results differ from the standard random mixing assumption. In addition, we find that the probability that all members contact each other on a given day is fairly low: 0.49 for households with two 0–5 year olds and two 19–35 year olds, and 0.36 for households with two 12–18 year olds and two 36+ year olds. We find higher contact rates in households with 2–3 members, helping explain the higher influenza secondary attack rates found in households of this size.

preprint, 2011, arXiv

Network Model-Assisted Inference from Respondent-Driven Sampling Data

Respondent-Driven Sampling is a method to sample hard-to-reach human populations by link-tracing over their social networks. Beginning with a convenience sample, each person sampled is given a small number of uniquely identified coupons to distribute to other members of the target population, making them eligible for enrollment in the study. This can be an effective means to collect large diverse samples from many populations. Inference from such data requires specialized techniques for two reasons. Unlike in standard sampling designs, the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights necessary for traditional design-based inference. Any likelihood-based inference requires the modeling of the complex sampling process often beginning with a convenience sample. We introduce a model-assisted approach, resulting in a design-based estimator leveraging a working model for the structure of the population over which sampling is conducted. We demonstrate that the new estimator has improved performance compared to existing estimators and is able to adjust for the bias induced by the selection of the initial sample. We present sensitivity analyses for unknown population sizes and the misspecification of the working network model. We develop a bootstrap procedure to compute measures of uncertainty. We apply the method to the estimation of HIV prevalence in a population of injecting drug users (IDU) in the Ukraine, and show how it can be extended to include application-specific information.

preprint, 2010, arXiv

Adjusting for Network Size and Composition Effects in Exponential-Family Random Graph Models

Exponential-family random graph models (ERGMs) provide a principled way to model and simulate features common in human social networks, such as propensities for homophily and friend-of-a-friend triad closure. We show that, without adjustment, ERGMs preserve density as network size increases. Density invariance is often not appropriate for social networks. We suggest a simple modification based on an offset which instead preserves the mean degree and accommodates changes in network composition asymptotically. We demonstrate that this approach allows ERGMs to be applied to the important situation of egocentrically sampled data. We analyze data from the National Health and Social Life Survey (NHSLS).
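
The contrast between density invariance and mean-degree invariance can be illustrated with a minimal sketch. It assumes the simplest possible case, an edges-only (Bernoulli) ERGM, and assumes the offset takes the form -log(n) added to the edge coefficient; the abstract does not state the form of the offset, and the function names below are hypothetical, chosen only for illustration.

```python
import math


def edge_probability(theta: float, n: int, use_offset: bool) -> float:
    """Tie probability in an edges-only (Bernoulli) ERGM on n nodes.

    Assumption for illustration: the size adjustment is an offset of
    -log(n) added to the edge coefficient theta.
    """
    eta = theta - math.log(n) if use_offset else theta
    return 1.0 / (1.0 + math.exp(-eta))


def expected_density(theta: float, n: int, use_offset: bool) -> float:
    """Expected density equals the tie probability in this simple model."""
    return edge_probability(theta, n, use_offset)


def expected_mean_degree(theta: float, n: int, use_offset: bool) -> float:
    """Each node has n - 1 potential ties, each present with equal probability."""
    return (n - 1) * edge_probability(theta, n, use_offset)


if __name__ == "__main__":
    theta = math.log(3.0)  # with the offset, mean degree tends to exp(theta) = 3
    for n in (50, 500, 5000):
        print(
            n,
            round(expected_density(theta, n, use_offset=False), 4),
            round(expected_mean_degree(theta, n, use_offset=False), 2),
            round(expected_mean_degree(theta, n, use_offset=True), 4),
        )
```

Without the offset the density stays fixed as n grows, so the mean degree grows in proportion to n. With the assumed offset the mean degree settles near exp(theta) while the density shrinks, which is usually the more plausible behaviour for social networks.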

preprint, 2010, arXiv

Modeling social networks from sampled data

Network models are widely used to represent relational information among interacting units and the structural implications of these relations. Recently, social network studies have focused a great deal of attention on random graph models of networks whose nodes represent individual social actors and whose edges represent a specified relationship between the actors. Most inference for social network models assumes that the presence or absence of all possible links is observed, that the information is completely reliable, and that there are no measurement (e.g., recording) errors. This is clearly not true in practice, as much network data is collected through sample surveys. In addition, even if a census of a population is attempted, individuals and links between individuals are missed (i.e., do not appear in the recorded data). In this paper we develop the conceptual and computational theory for inference based on sampled network information. We first review forms of network sampling designs used in practice. We consider inference from the likelihood framework, and develop a typology of network data that reflects their treatment within this frame. We then develop inference for social network models based on information from adaptive network designs. We motivate and illustrate these ideas by analyzing the effect of link-tracing sampling designs on a collaboration network.