Researcher profile

Roger D. Peng

Roger D. Peng contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
3 topics
4 close collaborators

Published work

3 published items

preprint, 2026, arXiv

Inside Out: Externalizing Assumptions in Data Analysis as Validation Checks

In data analysis, unexpected results often prompt researchers to revisit their procedures to identify potential issues. While some researchers may struggle to identify the root causes, experienced researchers can often quickly diagnose problems by checking a few key assumptions. These checked assumptions, or expectations, are typically informal, difficult to trace, and rarely discussed in publications. In this paper, we introduce the term *analysis validation checks* to formalize and externalize these informal assumptions. We then introduce a procedure to identify a subset of checks that best predict the occurrence of unexpected outcomes, based on simulations of the original data. The checks are evaluated in terms of accuracy, determined by binary classification metrics, and independence, which measures the shared information among checks. We demonstrate this approach with a toy example using step count data and a generalized linear model example examining the effect of particulate matter air pollution on daily mortality.

preprint, 2022, arXiv

Implications of Mortality Displacement for Effect Modification and Selection Bias

Mortality displacement is the concept that deaths are moved forward in time (e.g., a few days, several months, and years) by exposure from when they would occur without the exposure, which is common in environmental time-series studies. Using concepts of a frail population and loss of life expectancy, it is understood that mortality displacement may decrease rate ratio (RR). Such decreases are thought to be minimal or substantial depending on study populations. Environmental epidemiologists have interpreted RR considering mortality displacement. This theoretical paper reveals that mortality displacement can be formulated as a built-in selection bias of RR in Cox models due to unmeasured risk factors independent from exposure of interest, and mortality displacement can also be viewed as an effect modifier by integrating the concepts of rate and loss of life expectancy. Thus, depending on the framework through which we view bias, mortality displacement can be categorized as selection bias in the bias taxonomy of epidemiology, and simultaneously mortality displacement can be seen as an effect modifier. This dichotomy provides useful implications regarding policy, effect modification, exposure time-windows selection, and generalizability, specifically why research in epidemiology may produce unexpected and heterogeneous RR over different studies and sub-populations.

preprint, 2020, arXiv

Reproducible Research: A Retrospective

Rapid advances in computing technology over the past few decades have spurred two extraordinary phenomena in science: large-scale and high-throughput data collection coupled with the creation and implementation of complex statistical algorithms for data analysis. Together, these two phenomena have brought about tremendous advances in scientific discovery but have also raised two serious concerns, one relatively new and one quite familiar. The complexity of modern data analyses raises questions about the reproducibility of the analyses, meaning the ability of independent analysts to re-create the results claimed by the original authors using the original data and analysis techniques. While seemingly a straightforward concept, reproducibility of analyses is typically thwarted by the lack of availability of the data and computer code that were used in the analyses. A much more general concern is the replicability of scientific findings, which concerns the frequency with which scientific claims are confirmed by completely independent investigations. While the concepts of reproducibility and replicability are related, it is worth noting that they are focused on quite different goals and address different aspects of scientific progress. In this review, we will discuss the origins of reproducible research, characterize the current status of reproducibility in public health research, and connect reproducibility to current concerns about replicability of scientific findings. Finally, we describe a path forward for improving both the reproducibility and replicability of public health research in the future.