Researcher profile

Zhi Geng

Zhi Geng contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Assessing Interactive Causes of an Occurred Outcome Due to Two Binary Exposures

In contrast to evaluating treatment effects, causal attribution analysis focuses on identifying the key factors responsible for an observed outcome. For two binary exposure variables and a binary outcome variable, researchers need to assess not only the likelihood that an observed outcome was caused by a particular exposure, but also the likelihood that it resulted from the interaction between the two exposures. For example, in the case of a male worker who smoked, was exposed to asbestos, and developed lung cancer, researchers aim to explore whether the cancer resulted from smoking, asbestos exposure, or their interaction. Even in randomized controlled trials, widely regarded as the gold standard for causal inference, identifying and evaluating retrospective causal interactions between two exposures remains challenging. In this paper, we define posterior probabilities to characterize the interactive causes of an observed outcome. We establish the identifiability of posterior probabilities by using a secondary outcome variable that may appear after the primary outcome. We apply the proposed method to the classic case of smoking and asbestos exposure. Our results indicate that for lung cancer patients who smoked and were exposed to asbestos, the disease is primarily attributable to the synergistic effect between smoking and asbestos exposure.

preprint2022arXiv

A Local Method for Identifying Causal Relations under Markov Equivalence

Causality is important for designing interpretable and robust methods in artificial intelligence research. We propose a local approach to identify whether a variable is a cause of a given target under the framework of causal graphical models of directed acyclic graphs (DAGs). In general, the causal relation between two variables may not be identifiable from observational data as many causal DAGs encoding different causal relations are Markov equivalent. In this paper, we first introduce a sufficient and necessary graphical condition to check the existence of a causal path from a variable to a target in every Markov equivalent DAG. Next, we provide local criteria for identifying whether a variable is a cause/non-cause of a target based only on the local structure instead of the entire graph. Finally, we propose a local learning algorithm for this causal query via learning the local structure of the variable and some additional statistical independence tests related to the target. Simulation studies show that our local algorithm is efficient and effective, compared with other state-of-art methods.

preprint2022arXiv

Modeling High-Dimensional Data with Unknown Cut Points: A Fusion Penalized Logistic Threshold Regression

In traditional logistic regression models, the link function is often assumed to be linear and continuous in predictors. Here, we consider a threshold model that all continuous features are discretized into ordinal levels, which further determine the binary responses. Both the threshold points and regression coefficients are unknown and to be estimated. For high dimensional data, we propose a fusion penalized logistic threshold regression (FILTER) model, where a fused lasso penalty is employed to control the total variation and shrink the coefficients to zero as a method of variable selection. Under mild conditions on the estimate of unknown threshold points, we establish the non-asymptotic error bound for coefficient estimation and the model selection consistency. With a careful characterization of the error propagation, we have also shown that the tree-based method, such as CART, fulfill the threshold estimation conditions. We find the FILTER model is well suited in the problem of early detection and prediction for chronic disease like diabetes, using physical examination data. The finite sample behavior of our proposed method are also explored and compared with extensive Monte Carlo studies, which supports our theoretical discoveries.

preprint2020arXiv

Causal Mediation Analysis with Multiple Treatments and Latent Confounders

Causal mediation analysis is used to evaluate direct and indirect causal effects of a treatment on an outcome of interest through an intermediate variable or a mediator.It is difficult to identify the direct and indirect causal effects because the mediator cannot be randomly assigned in many real applications. In this article, we consider a causal model including latent confounders between the mediator and the outcome. We present sufficient conditions for identifying the direct and indirect effects and propose an approach for estimating them. The performance of the proposed approach is evaluated by simulation studies. Finally, we apply the approach to a data set of the customer loyalty survey by a telecom company.