Source author record

Yifan Cui

Yifan Cui appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, math.ST, Statistics Theory, Machine Learning

Catalog footprint

What is connected

6 works

4 topics

4 close collaborators


preprint, 2026, arXiv

Proximal Path-Specific Inference

Causal mediation analysis has been extended to estimate path-specific effects with multiple intermediate variables, isolating treatment effects through a mediator of interest while excluding pathways through its ancestors. Such analyses address bias from recanting witnesses, i.e., treatment-induced mediator-outcome confounders. However, existing methods typically rely on stringent assumptions precluding general unmeasured confounding, which are often violated in practice. In this paper, we relax these restrictions by leveraging observed covariates as proxy variables to accommodate unmeasured confounding among the treatment, recanting witness, mediator, and outcome. Using proximal confounding bridge functions, we develop four nonparametric identification strategies for the path-specific effect. We further derive the efficient influence function and propose a quadruply robust, locally efficient estimator. To handle high-dimensional nuisance parameters, we propose a proximal debiased machine learning approach. We theoretically guarantee that our estimator achieves $\sqrt{n}$-consistency and asymptotic normality even when machine learning estimators for nuisance functions converge at slower rates. Our approaches are validated via semiparametric and nonparametric simulations and an application to the CDC WONDER Natality study, estimating the path-specific effect of prenatal care on preterm birth through preeclampsia, independent of maternal smoking during pregnancy.

preprint, 2022, arXiv

Demystifying Inferential Models: A Fiducial Perspective

Inferential models have recently gained popularity for valid uncertainty quantification. In this paper, we investigate inferential models by exploring relationships between inferential models, fiducial inference, and confidence curves. In short, we argue that from a certain point of view, inferential models can be viewed as fiducial distribution based confidence curves. We show that all probabilistic uncertainty quantification of inferential models is based on a collection of sets we name principal sets and principal assertions.
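As a rough illustration of the confidence-curve idea mentioned in this abstract, the sketch below uses the textbook case of a normal mean with known standard deviation, where the fiducial distribution for the mean is N(xbar, sigma^2 / n). It is not taken from the paper; the function names and the numbers (`xbar`, `sigma`, `n`) are hypothetical choices for illustration only.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confidence_curve(theta: float, xbar: float, sigma: float, n: int) -> float:
    """Confidence curve cc(theta) = |1 - 2 * H(theta)|.

    H is the fiducial (here also confidence) distribution function of the
    mean, N(xbar, sigma^2 / n). Assumes sigma is known.
    """
    h = normal_cdf(math.sqrt(n) * (theta - xbar) / sigma)
    return abs(1.0 - 2.0 * h)


# Hypothetical data summary, chosen only for illustration.
xbar, sigma, n = 10.0, 2.0, 25
standard_error = sigma / math.sqrt(n)

# The curve is 0 at the point estimate and reaches 0.95 at the
# endpoints of the usual 95% interval.
cc_at_estimate = confidence_curve(xbar, xbar, sigma, n)
cc_at_upper = confidence_curve(xbar + 1.959964 * standard_error, xbar, sigma, n)
cc_at_lower = confidence_curve(xbar - 1.959964 * standard_error, xbar, sigma, n)
```

Reading the curve at height 0.95 recovers the familiar interval xbar ± 1.96 · sigma / sqrt(n); other heights give intervals at other confidence levels.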

preprint, 2022, arXiv

Proximal Causal Inference for Marginal Counterfactual Survival Curves

Contrasting marginal counterfactual survival curves across treatment arms is an effective and popular approach for inferring the causal effect of an intervention on a right-censored time-to-event outcome. A key challenge to drawing such inferences in observational settings is the possible existence of unmeasured confounding, which may invalidate most commonly used methods that assume no hidden confounding bias. In this paper, rather than making the standard no-unmeasured-confounding assumption, we extend the recently proposed proximal causal inference framework of Miao et al. (2018), Tchetgen et al. (2020), Cui et al. (2020) to obtain nonparametric identification of a causal survival contrast by leveraging observed covariates as imperfect proxies of unmeasured confounders. Specifically, we develop a proximal inverse probability-weighted (PIPW) estimator, the proximal analog of standard IPW, which allows the observed data distribution for the time-to-event outcome to remain completely unrestricted. PIPW estimation relies on a parametric model for a so-called treatment confounding bridge function relating the treatment process to confounding proxies. As a result, PIPW might be sensitive to model misspecification. To improve robustness and efficiency, we also propose a proximal doubly robust estimator and establish uniform consistency and asymptotic normality of both estimators. We conduct extensive simulations to examine the finite sample performance of our estimators, and the proposed methods are applied to a study evaluating the effectiveness of right heart catheterization for critically ill patients in the intensive care unit.
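For orientation, the sketch below shows the standard IPW estimator that PIPW is described as generalizing, not the proximal estimator itself. It assumes a simplified setting that the abstract explicitly moves beyond: the confounder `x` is fully measured, the propensity score is known, and there is no censoring. The data-generating model and all names are hypothetical.

```python
import math
import random


def simulate(n: int, seed: int = 0):
    """Simulate (x, a, t, p): confounder, treatment, event time, P(A=1 | x)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p = 0.8 if x == 1 else 0.2           # known propensity score (assumption)
        a = 1 if rng.random() < p else 0
        rate = 1.0 + 1.0 * x - 0.5 * a       # hazard rises with x, falls with treatment
        t = rng.expovariate(rate)            # uncensored event time (assumption)
        rows.append((x, a, t, p))
    return rows


def ipw_survival(rows, arm: int, t0: float) -> float:
    """Normalized (Hajek) IPW estimate of P(T(arm) > t0)."""
    num = den = 0.0
    for _, a, t, p in rows:
        if a != arm:
            continue
        weight = 1.0 / (p if arm == 1 else 1.0 - p)
        num += weight * (t > t0)
        den += weight
    return num / den


def naive_survival(rows, arm: int, t0: float) -> float:
    """Unweighted survival proportion within one arm (confounded)."""
    times = [t for _, a, t, _ in rows if a == arm]
    return sum(t > t0 for t in times) / len(times)


rows = simulate(20000)
# True P(T(1) > 1): average exp(-rate) over x, with rates 0.5 and 1.5.
truth = 0.5 * math.exp(-0.5) + 0.5 * math.exp(-1.5)
ipw_estimate = ipw_survival(rows, arm=1, t0=1.0)
naive_estimate = naive_survival(rows, arm=1, t0=1.0)
```

In this toy setup the unweighted estimate is pulled down because treated units are disproportionately high-risk, while weighting by the inverse propensity restores the marginal survival probability. When the confounder is unobserved, these weights are unavailable, which is the gap the bridge-function approach in the abstract is meant to address.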

preprint, 2022, arXiv

Selective Machine Learning of the Average Treatment Effect with an Invalid Instrumental Variable

Instrumental variable methods have been widely used to identify causal effects in the presence of unmeasured confounding. A key identification condition known as the exclusion restriction states that the instrument cannot have a direct effect on the outcome which is not mediated by the exposure in view. In the health and social sciences, such an assumption is often not credible. To address this concern, we consider identification conditions of the population average treatment effect with an invalid instrumental variable which does not satisfy the exclusion restriction, and derive the efficient influence function targeting the identifying functional under a nonparametric observed data model. We propose a novel multiply robust locally efficient estimator of the average treatment effect that is consistent in the union of multiple parametric nuisance models, as well as a multiply debiased machine learning estimator for which the nuisance parameters are estimated using generic machine learning methods, which effectively exploit various forms of linear or nonlinear structured sparsity in the nuisance parameter space. When one cannot be confident that any of these machine learners is consistent at sufficiently fast rates to ensure $\sqrt{n}$-consistency for the average treatment effect, we introduce a new criterion for selective machine learning which leverages the multiple robustness property in order to ensure small bias. The proposed methods are illustrated through extensive simulations and a data analysis evaluating the causal effect of 401(k) participation on savings.

preprint, 2020, arXiv

A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity

There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable approach to learning optimal treatment regimes under endogeneity. Specifically, we establish identification of both the value function $E[Y_{\mathcal{D}(L)}]$ for a given regime $\mathcal{D}$ and optimal regimes $\text{argmax}_{\mathcal{D}} E[Y_{\mathcal{D}(L)}]$ with the aid of a binary instrumental variable, when no unmeasured confounding fails to hold. We also construct novel multiply robust classification-based estimators. Furthermore, we propose to identify and estimate optimal treatment regimes among those who would comply with the assigned treatment under a standard monotonicity assumption. In this latter case, we establish the somewhat surprising result that complier optimal regimes can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. Our approach is illustrated via extensive simulation studies and a data application on the effect of child rearing on labor participation.
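To make the role of a binary instrument under endogeneity concrete, the sketch below contrasts a naive treated-versus-untreated comparison with the classical Wald ratio on simulated data. This is not the multiply robust, classification-based estimator from the paper; it assumes a constant treatment effect and a randomized instrument with no direct effect on the outcome, and every name and coefficient is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n: int, seed: int = 1, effect: float = 1.0):
    """Simulate (z, a, y) with an unmeasured confounder u driving both a and y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                      # unmeasured confounder
        z = 1 if rng.random() < 0.5 else 0           # randomized binary instrument
        a = 1 if u + 1.5 * z - 0.75 > 0 else 0       # treatment depends on u and z
        y = effect * a + 2.0 * u + rng.gauss(0.0, 1.0)
        rows.append((z, a, y))
    return rows


def naive_contrast(rows) -> float:
    """Difference in mean outcome between treated and untreated (confounded by u)."""
    treated = [y for _, a, y in rows if a == 1]
    untreated = [y for _, a, y in rows if a == 0]
    return mean(treated) - mean(untreated)


def wald_ratio(rows) -> float:
    """Wald estimator: instrument effect on y divided by instrument effect on a."""
    y1 = mean([y for z, _, y in rows if z == 1])
    y0 = mean([y for z, _, y in rows if z == 0])
    a1 = mean([a for z, a, _ in rows if z == 1])
    a0 = mean([a for z, a, _ in rows if z == 0])
    return (y1 - y0) / (a1 - a0)


rows = simulate(40000)
naive_estimate = naive_contrast(rows)
wald_estimate = wald_ratio(rows)
# With a positive estimated effect, the regime "treat everyone" would be
# preferred in this covariate-free toy example.
treat_all = wald_estimate > 0
```

Here the naive contrast overstates the effect because units with large `u` are both more likely to be treated and have higher outcomes, whereas the Wald ratio stays close to the simulated effect of 1.0. The paper's setting is more general, allowing regimes that depend on covariates $L$.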

preprint, 2020, arXiv

Instrumental Variable Estimation of Marginal Structural Mean Models for Time-Varying Treatment

Robins (1997) introduced marginal structural models (MSMs), a general class of counterfactual models for the joint effects of time-varying treatment regimes in complex longitudinal studies subject to time-varying confounding. In his work, identification of MSM parameters is established under a sequential randomization assumption (SRA), which rules out unmeasured confounding of treatment assignment over time. We consider sufficient conditions for identification of the parameters of a subclass, Marginal Structural Mean Models (MSMMs), when sequential randomization fails to hold due to unmeasured confounding, using instead a time-varying instrumental variable. Our identification conditions require that no unobserved confounder predicts compliance type for the time-varying treatment. We describe a simple weighted estimator and examine its finite-sample properties in a simulation study. We apply the proposed estimator to examine the effect of delivery hospital on neonatal survival probability.
