Researcher profile

Michael Baiocchi

Michael Baiocchi contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

Preprint, 2021, arXiv

stratamatch: Prognostic Score Stratification using a Pilot Design

Optimal propensity score matching has emerged as one of the most ubiquitous approaches for causal inference studies on observational data; however, outstanding critiques of the statistical properties of propensity score matching have cast doubt on the statistical efficiency of this technique, and the poor scalability of optimal matching to large data sets makes this approach inconvenient if not infeasible for sample sizes that are increasingly commonplace in modern observational data. The stratamatch package provides implementation support and diagnostics for 'stratified matching designs,' an approach which addresses both of these issues with optimal propensity score matching for large-sample observational studies. First, stratifying the data enables more computationally efficient matching of large data sets. Second, stratamatch implements a 'pilot design' approach in order to stratify by a prognostic score, which may increase the precision of the effect estimate and increase power in sensitivity analyses of unmeasured confounding.
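The idea of stratifying by a prognostic score fitted on a pilot set can be illustrated with a minimal Python sketch. This is not the stratamatch API (stratamatch is an R package); the function names `fit_line` and `pilot_stratify`, the single covariate `x`, the one-variable linear prognostic model, the choice to draw the pilot set from control units only, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pilot_stratify(units, pilot_fraction=0.1, n_strata=4, seed=0):
    """Hypothetical helper: split off a pilot set, score and stratify the rest.

    units: list of dicts with keys 'x' (covariate), 'treated' (bool), 'y' (outcome).
    Returns (pilot, analysis); each analysis unit gains 'prognostic' and 'stratum'.
    """
    rng = random.Random(seed)
    # Assumption: the pilot set is sampled from control units only.
    controls = [i for i, u in enumerate(units) if not u["treated"]]
    n_pilot = max(2, int(pilot_fraction * len(controls)))
    pilot_idx = set(rng.sample(controls, n_pilot))
    pilot = [units[i] for i in sorted(pilot_idx)]

    # Fit the prognostic model on pilot outcomes; these units are then set aside,
    # so the model is never fitted on the units used for the analysis.
    a, b = fit_line([u["x"] for u in pilot], [u["y"] for u in pilot])

    # Score the remaining units and cut them into equal-size strata by score rank.
    analysis = [
        dict(u, prognostic=a + b * u["x"])
        for i, u in enumerate(units)
        if i not in pilot_idx
    ]
    analysis.sort(key=lambda u: u["prognostic"])
    for rank, u in enumerate(analysis):
        u["stratum"] = rank * n_strata // len(analysis)
    return pilot, analysis


# Synthetic data, purely illustrative.
data_rng = random.Random(1)
units = []
for _ in range(200):
    x = data_rng.gauss(0, 1)
    treated = data_rng.random() < 0.3
    y = 2.0 * x + (1.0 if treated else 0.0) + data_rng.gauss(0, 1)
    units.append({"x": x, "treated": treated, "y": y})

pilot, analysis = pilot_stratify(units)
```

Matching would then be run separately within each stratum, which is where the computational saving on large data sets would come from; that step is omitted here.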

Preprint, 2020, arXiv

A Pilot Design for Observational Studies: Using Abundant Data Thoughtfully

Observational studies often benefit from an abundance of observational units. This can lead to studies that, while challenged by issues of internal validity, have inferences derived from sample sizes substantially larger than randomized controlled trials. But is the information provided by an observational unit best used in the analysis phase? We propose the use of 'pilot design,' in which observations are expended in the design phase of the study, and the post-treatment information from these observations is used to improve study design. In modern observational studies, which are data rich but control poor, pilot designs can be used to gain information about the structure of post-treatment variation. This information can then be used to improve instrumental variable designs, propensity score matching, doubly-robust estimation, and other observational study designs. We illustrate one version of a pilot design, which aims to reduce within-set heterogeneity and improve performance in sensitivity analyses. This version of a pilot design expends observational units during the design phase to fit a prognostic model, avoiding concerns of overfitting. Additionally, it enables the construction of 'Assignment-Control (AC) plots,' which visualize the relationship between propensity and prognostic scores. We first show some examples of these plots, then we demonstrate in a simulation setting how this alternative use of the observations can lead to gains in terms of both treatment effect estimation and sensitivity analyses of unobserved confounding.

Preprint, 2020, arXiv

Combining Observational and Experimental Datasets Using Shrinkage Estimators

We consider the problem of combining data from observational and experimental sources to make causal conclusions. This problem is increasingly relevant, as the modern era has yielded passive collection of massive observational datasets in areas such as e-commerce and electronic health. These data may be used to supplement experimental data, which is frequently expensive to obtain. In Rosenman et al. (2018), we considered this problem under the assumption that all confounders were measured. Here, we relax the assumption of unconfoundedness. To derive combined estimators with desirable properties, we make use of results from the Stein shrinkage literature. Our contributions are threefold. First, we propose a generic procedure for deriving shrinkage estimators in this setting, making use of a generalized unbiased risk estimate. Second, we develop two new estimators, prove finite sample conditions under which they have lower risk than an estimator using only experimental data, and show that each achieves a notion of asymptotic optimality. Third, we draw connections between our approach and results in sensitivity analysis, including proposing a method for evaluating the feasibility of our estimators.
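To give a feel for what shrinking experimental estimates toward observational ones can look like, here is a minimal Python sketch of a textbook positive-part James-Stein-type rule. It is a stand-in, not either of the estimators developed in the paper: the function name `shrink_toward`, the treatment of the observational estimates as a fixed target, and the single known variance shared by all experimental estimates are assumptions made for illustration.

```python
def shrink_toward(experimental, observational, variance):
    """Shrink per-stratum experimental estimates toward observational ones.

    Textbook positive-part James-Stein form (an illustrative stand-in):
        weight = max(0, 1 - (k - 2) * variance / ||experimental - observational||^2)
        result = observational + weight * (experimental - observational)

    Assumptions: every experimental estimate has the same known variance,
    and the observational estimates are treated as a fixed shrinkage target.
    """
    k = len(experimental)
    if k != len(observational):
        raise ValueError("both inputs need one estimate per stratum")
    diff = [e - o for e, o in zip(experimental, observational)]
    dist2 = sum(d * d for d in diff)
    if dist2 == 0.0:
        # The two sources already agree exactly; nothing to shrink.
        return list(observational)
    weight = max(0.0, 1.0 - (k - 2) * variance / dist2)
    return [o + weight * d for o, d in zip(observational, diff)]


# Made-up numbers for four strata.
experimental = [1.0, 2.0, 3.0, 4.0]
observational = [1.5, 2.5, 2.5, 3.5]
combined = shrink_toward(experimental, observational, variance=0.25)
```

With these made-up inputs the weight is 0.5, so each combined estimate lands halfway between the two sources. A noisier experiment (larger `variance`) pulls the result further toward the observational estimates, and a noiseless one leaves the experimental estimates unchanged.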

Preprint, 2020, arXiv

Understanding the spatial burden of gender-based violence: Modelling patterns of violence in Nairobi, Kenya through geospatial information

We present statistical techniques for analyzing global positioning system (GPS) data in order to understand, communicate about, and prevent patterns of violence. In this pilot study, participants in Nairobi, Kenya were asked to rate their safety at several locations, with the goal of predicting safety and learning important patterns. These approaches are meant to help articulate differences in experiences, fostering a discussion that will help communities identify issues and policymakers develop safer communities. A generalized linear mixed model incorporating spatial information taken from existing maps of Kibera showed that significant predictors of perceived lack of safety included being alone and time of day; in debrief interviews, participants described feeling unsafe in spaces with hiding places, disease-carrying animals, and dangerous individuals. This pilot study demonstrates promise for detecting spatial patterns of violence, which appear to be confirmed by actual rates of measured violence at schools. Several factors relevant to community building consistently predict perceived safety and emerge in participants' qualitative descriptions, telling a cohesive story about perceived safety and empowering communication to community stakeholders.