Source author record

Michael Schomaker

Michael Schomaker appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Methodology math.ST Statistics Theory

Catalog footprint

What is connected

4 works

3 topics

4 close collaborators

Preprint, 2022, arXiv

The Delta-Method and Influence Function in Medical Statistics: a Reproducible Tutorial

Approximate statistical inference via determination of the asymptotic distribution of a statistic is routinely used for inference in applied medical statistics (e.g. to estimate the standard error of the marginal or conditional risk ratio). One method for variance estimation is the classical Delta-method but there is a knowledge gap as this method is not routinely included in training for applied medical statistics and its uses are not widely understood. Given that a smooth function of an asymptotically normal estimator is also asymptotically normally distributed, the Delta-method allows approximating the large-sample variance of a function of an estimator with known large-sample properties. In a more general setting, it is a technique for approximating the variance of a functional (i.e., an estimand) that takes a function as an input and applies another function to it (e.g. the expectation function). Specifically, we may approximate the variance of the function using the functional Delta-method based on the influence function (IF). The IF explores how a functional $ϕ(θ)$ changes in response to small perturbations in the sample distribution of the estimator and allows computing the empirical standard error of the distribution of the functional. The ongoing development of new methods and techniques may pose a challenge for applied statisticians who are interested in mastering the application of these methods. In this tutorial, we review the use of the classical and functional Delta-method and their links to the IF from a practical perspective. We illustrate the methods using a cancer epidemiology example and we provide reproducible and commented code in R and Python using symbolic programming. The code can be accessed at https://github.com/migariane/DeltaMethodInfluenceFunction
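As a rough illustration of the classical Delta-method described in this abstract (not the tutorial's own code, which lives in the linked repository), the following Python sketch approximates the standard error of a log risk ratio from two independent binomial samples. The function name `risk_ratio_delta_ci` and the event counts used in the example are hypothetical, chosen only for illustration.

```python
import math


def risk_ratio_delta_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk ratio with a Delta-method standard error on the log scale.

    Assumes two independent binomial samples with non-zero event counts.
    """
    p1 = events_exposed / n_exposed      # risk among the exposed
    p0 = events_unexposed / n_unexposed  # risk among the unexposed
    rr = p1 / p0

    # Large-sample variances of the two sample proportions
    var_p1 = p1 * (1 - p1) / n_exposed
    var_p0 = p0 * (1 - p0) / n_unexposed

    # Gradient of g(p1, p0) = log(p1) - log(p0)
    d_p1 = 1 / p1
    d_p0 = -1 / p0

    # Delta-method: Var[g] is approximately grad' * Cov * grad.
    # The covariance is diagonal here because the samples are independent.
    var_log_rr = d_p1 ** 2 * var_p1 + d_p0 ** 2 * var_p0
    se_log_rr = math.sqrt(var_log_rr)

    # Wald interval built on the log scale, then back-transformed
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, se_log_rr, (lower, upper)


if __name__ == "__main__":
    # Hypothetical counts: 30/100 events among exposed, 15/100 among unexposed
    rr, se_log_rr, ci = risk_ratio_delta_ci(30, 100, 15, 100)
    print(f"RR = {rr:.3f}, SE(log RR) = {se_log_rr:.3f}, "
          f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Working on the log scale and back-transforming the interval endpoints is a common choice, because the log risk ratio is usually closer to normal in finite samples than the risk ratio itself.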

Preprint, 2021, arXiv

Regression and Causality

The causal effect of an intervention (treatment/exposure) on an outcome can be estimated by: i) specifying knowledge about the data-generating process; ii) assessing under what assumptions a target quantity, such as a causal odds ratio, can be identified given the specified knowledge (and given the measured data); and iii) using appropriate statistical estimation techniques to estimate the desired parameter of interest. As regression is the cornerstone of statistical analysis, it seems obvious to ask: is it appropriate to use estimated regression parameters for causal effect estimation? It turns out that using regression for effect estimation is possible, but typically requires more assumptions than competing methods. This manuscript provides a comprehensive summary of the assumptions needed to identify and estimate a causal parameter using regression and, equally important, discusses the resulting implications for statistical practice.

Preprint, 2021, arXiv

Using Longitudinal Targeted Maximum Likelihood Estimation in Complex Settings with Dynamic Interventions

Longitudinal targeted maximum likelihood estimation (LTMLE) has very rarely been used to estimate dynamic treatment effects in the context of time-dependent confounding affected by prior treatment when faced with long follow-up times, multiple time-varying confounders, and complex associational relationships simultaneously. Reasons for this include the potential computational burden, technical challenges, restricted modeling options for long follow-up times, and limited practical guidance in the literature. However, LTMLE has desirable asymptotic properties, i.e. it is doubly robust, and can yield valid inference when used in conjunction with machine learning. We use a topical and sophisticated question from HIV treatment research to show that LTMLE can be used successfully in complex realistic settings and compare results to competing estimators. Our example illustrates the following practical challenges common to many epidemiological studies: 1) long follow-up time (30 months); 2) gradually declining sample size; 3) limited support for some intervention rules of interest; 4) a high-dimensional set of potential adjustment variables, increasing both the need and the challenge of integrating appropriate machine learning methods; and 5) consideration of collider bias. Our analyses, as well as simulations, shed new light on the application of LTMLE in complex and realistic settings: we show that (i) LTMLE can yield stable and good estimates, even when confronted with small samples and limited modeling options; (ii) machine learning utilized with a small set of simple learners (if more complex ones can't be fitted) can outperform a single, complex model, which is tailored to incorporate prior clinical knowledge; (iii) performance can vary considerably depending on interventions and their support in the data, and therefore critical quality checks should accompany every LTMLE analysis.

Preprint, 2021, arXiv

When and when not to use optimal model averaging

Traditionally, model averaging has been viewed as an alternative to model selection, with the ultimate goal of incorporating the uncertainty associated with the model selection process in standard errors and confidence intervals by using a weighted combination of candidate models. In recent years, a new class of model averaging estimators has emerged in the literature, which combines models such that the squared risk, or another risk function, is minimized. We argue that, contrary to popular belief, these estimators do not necessarily address the challenges induced by model selection uncertainty, but should be regarded as attractive complements for the machine learning and forecasting literature, as well as tools to identify causal parameters. We illustrate our point by means of several targeted simulation studies.
