Source author record

Linbo Wang

Linbo Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, Applications, Machine Learning, math.ST, Statistics Theory

Catalog footprint

What is connected

11 works

5 topics

4 close collaborators

preprint, 2022, arXiv

Coherent modeling of longitudinal causal effects on binary outcomes

Analyses of biomedical studies often necessitate modeling longitudinal causal effects. The current focus on personalized medicine and effect heterogeneity makes this task even more challenging. Towards this end, structural nested mean models (SNMMs) are fundamental tools for studying heterogeneous treatment effects in longitudinal studies. However, when outcomes are binary, current methods for estimating multiplicative and additive SNMM parameters suffer from variation dependence between the causal parameters and the non-causal nuisance parameters. This leads to a series of difficulties in interpretation, estimation and computation. These difficulties have hindered the uptake of SNMMs in biomedical practice, where binary outcomes are very common. We solve the variation dependence problem for the binary multiplicative SNMM via a reparametrization of the non-causal nuisance parameters. Our novel nuisance parameters are variation independent of the causal parameters, and hence allow for coherent modeling of heterogeneous effects from longitudinal studies with binary outcomes. Our parametrization also provides a key building block for flexible doubly robust estimation of the causal parameters. Along the way, we prove that an additive SNMM with binary outcomes does not admit a variation independent parametrization, thereby justifying the restriction to multiplicative SNMMs.

preprint, 2022, arXiv

Homogeneity in the instrument-treatment association is not sufficient for the Wald estimand to equal the average causal effect for a binary instrument and a continuous exposure

Background: Interpreting instrumental variable results often requires further assumptions in addition to the core assumptions of relevance, independence, and the exclusion restriction. Methods: We assess whether instrument-exposure additive homogeneity renders the Wald estimand equal to the average derivative effect (ADE) in the case of a binary instrument and a continuous exposure. Results: Instrument-exposure additive homogeneity is insufficient for ADE identification when the instrument is binary, the exposure is continuous and the effect of the exposure on the outcome is non-linear on the additive scale. For a binary exposure, the exposure-outcome effect is necessarily additive linear, so the homogeneity condition is sufficient. Conclusions: For binary instruments, instrument-exposure additive homogeneity identifies the ADE if the exposure is also binary. Otherwise, additional assumptions (such as additive linearity of the exposure-outcome effect) are required.
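For readers unfamiliar with the term, the Wald estimand for a binary instrument is the ratio of the instrument-outcome mean difference to the instrument-exposure mean difference. The sketch below computes its sample analogue on toy data; the function name `wald_estimand` and the data are illustrative assumptions, not taken from the paper. In this toy example the outcome is exactly linear in the exposure, which is the case where the abstract says the ratio is interpretable as the effect of exposure.

```python
from statistics import mean


def wald_estimand(z, x, y):
    """Sample Wald ratio for a binary instrument z, exposure x and outcome y.

    Hypothetical helper for illustration: it returns
    (mean(y | z=1) - mean(y | z=0)) / (mean(x | z=1) - mean(x | z=0)).
    """
    x1 = [xi for zi, xi in zip(z, x) if zi == 1]
    x0 = [xi for zi, xi in zip(z, x) if zi == 0]
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    denom = mean(x1) - mean(x0)
    if denom == 0:
        raise ValueError("instrument has no association with the exposure")
    return (mean(y1) - mean(y0)) / denom


# Toy data (assumed): the instrument shifts every unit's exposure by 2,
# and the outcome is exactly linear in the exposure with slope 3.
z = [0, 0, 1, 1]
x = [1.0, 2.0, 3.0, 4.0]
y_linear = [3.0 * xi for xi in x]

wald_linear = wald_estimand(z, x, y_linear)
print(wald_linear)  # recovers the slope of the linear outcome model
```

With a non-linear outcome model the same ratio is still computable, but, as the abstract argues, it need not equal the average derivative effect.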

preprint, 2022, arXiv

IV estimation of causal hazard ratio

Cox's proportional hazards model is one of the most popular statistical models to evaluate associations of exposure with a censored failure time outcome. When confounding factors are not fully observed, the exposure hazard ratio estimated using a Cox model is subject to unmeasured confounding bias. To address this, we propose a novel approach for the identification and estimation of the causal hazard ratio in the presence of unmeasured confounding factors. Our approach is based on a binary instrumental variable, and an additional no-interaction assumption in a first stage regression of the treatment on the IV and unmeasured confounders. We propose, to the best of our knowledge, the first consistent estimator of the (population) causal hazard ratio within an instrumental variable framework. A version of our estimator admits a closed-form representation. We derive the asymptotic distribution of our estimator, and provide a consistent estimator for its asymptotic variance. Our approach is illustrated via simulation studies and a data application.

preprint, 2022, arXiv

Mapping the Genetic-Imaging-Clinical Pathway with Applications to Alzheimer's Disease

Alzheimer's disease is a progressive form of dementia that results in problems with memory, thinking, and behavior. It often starts with abnormal aggregation and deposition of beta amyloid and tau, followed by neuronal damage such as atrophy of the hippocampi, leading to Alzheimer's Disease (AD). The aim of this paper is to map the genetic-imaging-clinical pathway for AD in order to delineate the genetically regulated brain changes that drive disease progression based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We develop a novel two-step approach to delineate the association between high-dimensional 2D hippocampal surface exposures and the Alzheimer's Disease Assessment Scale (ADAS) cognitive score, while taking into account the ultra-high dimensional clinical and genetic covariates at baseline. Analysis results suggest that the radial distance of each pixel of both hippocampi is negatively associated with the severity of behavioral deficits conditional on observed clinical and genetic covariates. These associations are stronger in Cornu Ammonis region 1 (CA1) and subiculum subregions compared to Cornu Ammonis region 2 (CA2) and Cornu Ammonis region 3 (CA3) subregions.

preprint, 2022, arXiv

Multiplicative Effect Modeling: The General Case

Generalized linear models, such as logistic regression, are widely used to model the association between a treatment and a binary outcome as a function of baseline covariates. However, the coefficients of a logistic regression model correspond to log odds ratios, while subject-matter scientists are often interested in relative risks. Although odds ratios are sometimes used to approximate relative risks, this approximation is appropriate only when the outcome of interest is rare for all levels of the covariates. Poisson regressions do measure multiplicative treatment effects including relative risks, but with a binary outcome not all combinations of parameters lead to fitted means that are between zero and one. Enforcing this constraint makes the parameters variation dependent, which is undesirable for modeling, estimation and computation. Focusing on the special case where the treatment is also binary, Richardson et al. (2017) propose a novel binomial regression model that allows direct modeling of the relative risk. The model uses a log odds-product nuisance model leading to variation independent parameter spaces. Building on this, we present general approaches to modeling the multiplicative effect of a continuous or categorical treatment on a binary outcome. Monte Carlo simulations demonstrate the desirable performance of our proposed methods. A data analysis further exemplifies our methods.
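The rare-outcome caveat can be checked with a few lines of arithmetic. The sketch below compares the odds ratio and the relative risk for two pairs of risks chosen purely for illustration (they are assumptions, not values from the paper): both pairs have a relative risk of 2, but only the rare-outcome pair has an odds ratio close to 2.

```python
def relative_risk(p1, p0):
    """Risk under treatment divided by risk under control."""
    return p1 / p0


def odds_ratio(p1, p0):
    """Odds under treatment divided by odds under control."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Illustrative risks (assumed): (treated risk, control risk).
rare = (0.02, 0.01)    # outcome is rare in both arms
common = (0.60, 0.30)  # outcome is common in both arms

for label, (p1, p0) in [("rare", rare), ("common", common)]:
    rr = relative_risk(p1, p0)
    or_ = odds_ratio(p1, p0)
    print(f"{label}: RR = {rr:.3f}, OR = {or_:.3f}")
```

In the rare case the odds ratio stays within a few percent of the relative risk; in the common case it overstates it substantially.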

preprint, 2022, arXiv

Ultra-high Dimensional Variable Selection for Doubly Robust Causal Inference

Causal inference has been increasingly reliant on observational studies with rich covariate information. To build tractable causal procedures, such as the doubly robust estimators, it is imperative to first extract important features from high or even ultra-high dimensional data. In this paper, we propose causal ball screening for confounder selection from modern ultra-high dimensional data sets. Unlike the familiar task of variable selection for prediction modeling, our confounder selection procedure aims to control for confounding while improving efficiency in the resulting causal effect estimate. Previous empirical and theoretical studies suggest excluding causes of the treatment that are not confounders. Motivated by these results, our goal is to keep all the predictors of the outcome in both the propensity score and outcome regression models. A distinctive feature of our proposal is that we use an outcome model-free procedure for propensity score model selection, thereby maintaining double robustness in the resulting causal effect estimator. Our theoretical analyses show that the proposed procedure enjoys a number of properties, including model selection consistency and point-wise normality. Synthetic and real data analyses show that our proposal compares favorably with existing methods in a range of realistic settings. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

preprint, 2021, arXiv

In Search of Robust Measures of Generalization

One of the principal scientific challenges in deep learning is explaining generalization, i.e., why the particular way the community now trains networks to achieve small training error also leads to small error on held-out data from the same population. It is widely appreciated that some worst-case theories -- such as those based on the VC dimension of the class of predictors induced by modern neural network architectures -- are unable to explain empirical performance. A large volume of work aims to close this gap, primarily by developing bounds on generalization error, optimization error, and excess risk. When evaluated empirically, however, most of these bounds are numerically vacuous. Focusing on generalization bounds, this work addresses the question of how to evaluate such bounds empirically. Jiang et al. (2020) recently described a large-scale empirical study aimed at uncovering potential causal relationships between bounds/measures and generalization. Building on their study, we highlight where their proposed methods can obscure failures and successes of generalization measures in explaining generalization. We argue that generalization measures should instead be evaluated within the framework of distributional robustness.

preprint, 2020, arXiv

Conditional Independence Beyond Domain Separability: Discussion of Engelke and Hitz (2020)

We congratulate Engelke and Hitz on a thought-provoking paper on graphical models for extremes. A key contribution of the paper is the introduction of a novel definition of conditional independence for a multivariate Pareto distribution. Here, we outline a proposal for independence and conditional independence of general random variables whose support is a general set Omega in multidimensional real number space. Our proposal includes the authors' definition of conditional independence, and the analogous definition of independence as special cases. By making our proposal independent of the context of extreme value theory, we highlight the importance of the authors' contribution beyond this particular context.

preprint, 2016, arXiv

Causal analysis of ordinal treatments and binary outcomes under truncation by death

It is common that in multiarm randomized trials, the outcome of interest is "truncated by death," meaning that it is only observed or well defined conditional on an intermediate outcome. In this case, in addition to pairwise contrasts, the joint inference for all treatment arms is also of interest. Under a monotonicity assumption we present methods for both pairwise and joint causal analyses of ordinal treatments and binary outcomes in the presence of truncation by death. We illustrate via examples the appropriateness of our assumptions in different scientific contexts.

preprint, 2016, arXiv

On falsification of the binary instrumental variable model

Instrumental variables are widely used for estimating causal effects in the presence of unmeasured confounding. The discrete instrumental variable model has testable implications on the law of the observed data. However, current assessments of instrumental validity are typically based solely on subject-matter arguments rather than these testable implications, partly due to a lack of formal statistical tests with known properties. In this paper, we develop simple procedures for testing the binary instrumental variable model. Our methods are based on existing approaches for comparing two treatments, such as the t-test and the Gail-Simon test. We illustrate the importance of testing the instrumental variable model by evaluating the exogeneity of college proximity using the National Longitudinal Survey of Young Men.

preprint, 2016, arXiv

On Modeling and Estimation for the Relative Risk and Risk Difference

A common problem in formulating models for the relative risk and risk difference is the variation dependence between these parameters and the baseline risk, which is a nuisance model. We address this problem by proposing the conditional log odds-product as a preferred nuisance model. This novel nuisance model facilitates maximum-likelihood estimation, but also permits doubly-robust estimation for the parameters of interest. Our approach is illustrated via simulations and a data analysis.
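The idea of a variation independent nuisance parameter can be made concrete with a small sketch. It assumes the odds product is the product of the odds in the two arms, OP = p0 p1 / ((1 - p0)(1 - p1)), with relative risk RR = p1 / p0; the abstract does not spell out this definition, so treat it as an assumption. The function name and the closed form below are derived here from those two equations by solving a quadratic in p0, and are not quoted from the paper. The point is that any real pair (log RR, log OP) maps back to a valid pair of risks in (0, 1).

```python
import math


def risks_from_log_rr_and_log_op(log_rr, log_op):
    """Recover (p0, p1) from log relative risk and log odds product.

    Illustrative helper (assumed definitions): RR = p1 / p0 and
    OP = p0 * p1 / ((1 - p0) * (1 - p1)). Substituting p1 = RR * p0 gives
    RR * (OP - 1) * p0**2 - OP * (1 + RR) * p0 + OP = 0.
    """
    r = math.exp(log_rr)
    o = math.exp(log_op)
    if abs(o - 1.0) < 1e-12:
        # The quadratic degenerates to a linear equation when OP = 1.
        p0 = 1.0 / (1.0 + r)
    else:
        # The discriminant equals o**2 * (1 - r)**2 + 4 * r * o, so it is positive.
        disc = (o * (1.0 + r)) ** 2 - 4.0 * r * o * (o - 1.0)
        # The smaller-numerator root is the one lying in (0, min(1, 1 / r)).
        p0 = (o * (1.0 + r) - math.sqrt(disc)) / (2.0 * r * (o - 1.0))
    return p0, r * p0


# Round trip from known risks (values chosen for illustration).
p0_true, p1_true = 0.2, 0.4
log_rr = math.log(p1_true / p0_true)
log_op = math.log(p0_true * p1_true / ((1 - p0_true) * (1 - p1_true)))
print(risks_from_log_rr_and_log_op(log_rr, log_op))
```

By contrast, pairing log RR with the baseline risk p0 directly would rule out some combinations, since p1 = RR * p0 must stay below one.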
