Source author record

Zijian Guo

Zijian Guo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Methodology, math.ST, Statistics Theory, Applications, Machine Learning, econ.EM, math.CA, math.FA, math.OC, math.PR

Catalog footprint

What is connected

15 works

10 topics

4 close collaborators

preprint, 2026, arXiv

Causal Invariance Learning via Efficient Nonconvex Optimization

Identifying the causal relationship among variables from observational data is an important yet challenging task. This work focuses on identifying the direct causes of an outcome and estimating their magnitude, i.e., learning the causal outcome model. Data from multiple environments provide valuable opportunities to uncover causality by exploiting the invariance principle that the causal outcome model holds across heterogeneous environments. Based on the invariance principle, we propose the Negative Weighted Distributionally Robust Optimization (NegDRO) framework to learn an invariant prediction model. NegDRO minimizes the worst-case combination of risks across multiple environments and enforces invariance by allowing potential negative weights. Under the additive interventions regime, we establish three major contributions: (i) On the statistical side, we provide sufficient and nearly necessary identification conditions under which the invariant prediction model coincides with the causal outcome model; (ii) On the optimization side, despite the nonconvexity of NegDRO, we establish its benign optimization landscape, where all stationary points lie close to the true causal outcome model; (iii) On the computational side, we develop a gradient-based algorithm that provably converges to the causal outcome model, with non-asymptotic convergence rates in both sample size and gradient-descent iterations. In particular, our method avoids exhaustive combinatorial searches over exponentially many subsets of covariates found in the literature, ensuring scalability even when the dimension of the covariates is large. To our knowledge, this is the first causal invariance learning method that finds the approximate global optimality for a nonconvex optimization problem efficiently.

preprint, 2026, arXiv

Distributionally Robust Synthetic Control: Ensuring Robustness Against Highly Correlated Controls and Weight Shifts

The synthetic control method estimates the causal effect by comparing the treated unit's outcomes to a weighted average of control units that closely match its pre-treatment outcomes, assuming the relationship between treated and control potential outcomes remains stable before and after treatment. However, the estimator may become unreliable when these relationships shift or when control units are highly correlated. To address these challenges, we introduce the Distributionally Robust Synthetic Control (DRoSC) method, which accommodates potential shifts in relationships and addresses high correlations among control units. The DRoSC method targets a novel causal estimand defined as the optimizer of a worst-case optimization problem considering all possible weights compatible with the pre-treatment period. When the identification conditions for the classical synthetic control method hold, the DRoSC method targets the same causal effect as the synthetic control; when these conditions are violated, we demonstrate that this new causal estimand is a conservative proxy for the non-identifiable causal effect. We further show that the DRoSC estimator's limiting distribution is non-normal and propose a novel inferential approach. We demonstrate its performance through numerical studies and an analysis of the economic impact of terrorism in the Basque Country.
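As a rough illustration of the classical synthetic control step that the abstract starts from (not the DRoSC method itself), the weights can be sketched as a least-squares fit constrained to the probability simplex. The function names, the projected-gradient solver, and the noiseless toy data are all assumptions made for this sketch.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # sort in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index kept in the support
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def synthetic_control_weights(Y0_pre, y1_pre, n_iter=5000):
    """Minimize 0.5 * ||y1_pre - Y0_pre @ w||^2 over the simplex
    by projected gradient descent (hypothetical helper)."""
    n_controls = Y0_pre.shape[1]
    w = np.full(n_controls, 1.0 / n_controls)
    # Step size 1/L, where L is the squared spectral norm of Y0_pre.
    step = 1.0 / (np.linalg.norm(Y0_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = Y0_pre.T @ (Y0_pre @ w - y1_pre)
        w = project_to_simplex(w - step * grad)
    return w

def estimated_effect(Y0_post, y1_post, w):
    """Treated outcome minus the weighted average of control outcomes."""
    return y1_post - Y0_post @ w

# Toy data (assumed): 30 pre-treatment periods, 4 control units.
rng = np.random.default_rng(0)
Y0_pre = rng.normal(size=(30, 4))
w_true = np.array([0.5, 0.3, 0.2, 0.0])
y1_pre = Y0_pre @ w_true
w_hat = synthetic_control_weights(Y0_pre, y1_pre)
```

When the control columns of `Y0_pre` are nearly collinear, many weight vectors fit the pre-treatment period almost equally well, which is the instability the abstract refers to.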

preprint, 2024, arXiv

Robustness Against Weak or Invalid Instruments: Exploring Nonlinear Treatment Models with Machine Learning

We discuss causal inference for observational studies with possibly invalid instrumental variables. We propose a novel methodology called two-stage curvature identification (TSCI) by exploring the nonlinear treatment model with machine learning. The first-stage machine learning enables improving the instrumental variable's strength and adjusting for different forms of violating the instrumental variable assumptions. The success of TSCI requires the instrumental variable's effect on treatment to differ from its violation form. A novel bias correction step is implemented to remove bias resulting from the potentially high complexity of machine learning. Our proposed TSCI estimator is shown to be asymptotically unbiased and Gaussian even if the machine learning algorithm does not consistently estimate the treatment model. Furthermore, we design a data-dependent method to choose the best among several candidate violation forms. We apply TSCI to study the effect of education on earnings.

preprint, 2023, arXiv

Robust Inference for Federated Meta-Learning

Synthesizing information from multiple data sources is critical to ensure knowledge generalizability. Integrative analysis of multi-source data is challenging due to the heterogeneity across sources and data-sharing constraints due to privacy concerns. In this paper, we consider a general robust inference framework for federated meta-learning of data from multiple sites, enabling statistical inference for the prevailing model, defined as the one matching the majority of the sites. Statistical inference for the prevailing model is challenging since it requires a data-adaptive mechanism to select eligible sites and subsequently account for the selection uncertainty. We propose a novel sampling method to address the additional variation arising from the selection. Our devised CI construction does not require sites to share individual-level data and is shown to be valid without requiring the selection of eligible sites to be error-free. The proposed robust inference for federated meta-learning (RIFL) methodology is broadly applicable and illustrated with three inference problems: aggregation of parametric models, high-dimensional prediction models, and inference for average treatment effects. We use RIFL to perform federated learning of mortality risk for patients hospitalized with COVID-19 using real-world EHR data from 16 healthcare centers representing 275 hospitals across four countries.

preprint, 2022, arXiv

Causal Inference for Nonlinear Outcome Models with Possibly Invalid Instrumental Variables

Instrumental variable methods are widely used for inferring the causal effect in the presence of unmeasured confounders. Existing instrumental variable methods for nonlinear outcome models require stringent identifiability conditions. This paper considers a flexible semi-parametric potential outcome model that allows for possibly invalid instruments. We propose new identifiability conditions to identify the causal parameters when the majority of the instrumental variables are valid. We devise a novel inference procedure for a new average structural function and the conditional average treatment effect. We establish the asymptotic normality of the proposed estimators and construct confidence intervals for the causal estimands by bootstrap. The proposed method is demonstrated in large-scale simulation studies and is applied to infer the effect of income on house ownership.

preprint, 2022, arXiv

Decorrelated Local Linear Estimator: Inference for Non-linear Effects in High-dimensional Additive Models

Additive models play an essential role in studying non-linear relationships. Despite many recent advances in estimation, there is a lack of methods and theories for inference in high-dimensional additive models, including confidence interval construction and hypothesis testing. Motivated by inference for non-linear treatment effects, we consider the high-dimensional additive model and make inference for the derivative of the function of interest. We propose a novel decorrelated local linear estimator and establish its asymptotic normality. The main novelty is the construction of the decorrelation weights, which is instrumental in reducing the error inherited from estimating the nuisance functions in the high-dimensional additive model. We construct the confidence interval for the function derivative and conduct the related hypothesis testing. We demonstrate our proposed method over large-scale simulation studies and apply it to identify non-linear effects in the motif regression problem. Our proposed method is implemented in the R package DLL available from CRAN.

preprint, 2020, arXiv

Extreme Eigenvalues of Nonlinear Correlation Matrices with Applications to Additive Models

The maximum correlation of functions of a pair of random variables is an important measure of stochastic dependence. It is known that this maximum nonlinear correlation is identical to the absolute value of the Pearson correlation for a pair of Gaussian random variables or a pair of finite sums of iid random variables. This paper extends these results to pairwise Gaussian vectors and processes, nested sums of iid random variables, and permutation symmetric functions of sub-groups of iid random variables. It also discusses applications to additive regression models.

preprint, 2020, arXiv

Optimal Statistical Inference for Individualized Treatment Effects in High-dimensional Models

The ability to predict individualized treatment effects (ITEs) based on a given patient's profile is essential for personalized medicine. We propose a hypothesis testing approach to choosing between two potential treatments for a given individual in the framework of high-dimensional linear models. The methodological novelty lies in the construction of a debiased estimator of the ITE and establishment of its asymptotic normality uniformly for an arbitrary future high-dimensional observation, while the existing methods can only handle certain specific forms of observations. We introduce a testing procedure with the type-I error controlled and establish its asymptotic power. The proposed method can be extended to making inference for general linear contrasts, including both the average treatment effect and outcome prediction. We introduce the optimality framework for hypothesis testing from both the minimaxity and adaptivity perspectives and establish the optimality of the proposed procedure. An extension to high-dimensional approximate linear models is also considered. The finite sample performance of the procedure is demonstrated in simulation studies and further illustrated through an analysis of electronic health records data from patients with rheumatoid arthritis.

preprint, 2016, arXiv

Accuracy Assessment for High-dimensional Linear Regression

This paper considers point and interval estimation of the $\ell_q$ loss of an estimator in high-dimensional linear regression with random design. We establish the minimax rate for estimating the $\ell_{q}$ loss and the minimax expected length of confidence intervals for the $\ell_{q}$ loss of rate-optimal estimators of the regression vector, including commonly used estimators such as Lasso, scaled Lasso, square-root Lasso and Dantzig Selector. Adaptivity of the confidence intervals for the $\ell_{q}$ loss is also studied. Both the setting of known identity design covariance matrix and known noise level and the setting of unknown design covariance matrix and unknown noise level are studied. The results reveal interesting and significant differences between estimating the $\ell_2$ loss and $\ell_q$ loss with $1\le q <2$ as well as between the two settings. New technical tools are developed to establish rate-sharp lower bounds for the minimax estimation error and the expected length of minimax and adaptive confidence intervals for the $\ell_q$ loss. A significant difference between loss estimation and the traditional parameter estimation is that for loss estimation the constraint is on the performance of the estimator of the regression vector, but the lower bounds are on the difficulty of estimating its $\ell_q$ loss. The technical tools developed in this paper can also be of independent interest.

preprint, 2016, arXiv

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models

The instrumental variable method consistently estimates the effect of a treatment when there is unmeasured confounding and a valid instrumental variable. A valid instrumental variable is a variable that is independent of unmeasured confounders and affects the treatment but does not have a direct effect on the outcome beyond its effect on the treatment. Two commonly used estimators for using an instrumental variable to estimate a treatment effect are the two stage least squares estimator and the control function estimator. For linear causal effect models, these two estimators are equivalent, but for nonlinear causal effect models, the estimators are different. We provide a systematic comparison of these two estimators for nonlinear causal effect models and develop an approach to combining the two estimators that generally performs better than either one alone. We show that the control function estimator is a two stage least squares estimator with an augmented set of instrumental variables. If these augmented instrumental variables are valid, then the control function estimator can be much more efficient than the usual two stage least squares without the augmented instrumental variables, while if the augmented instrumental variables are not valid, then the control function estimator may be inconsistent while the usual two stage least squares remains consistent. We apply the Hausman test to test whether the augmented instrumental variables are valid and construct a pretest estimator based on this test. The pretest estimator is shown to work well in a simulation study. An application to the effect of exposure to violence on time preference is considered.
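The equivalence of the two estimators in the linear case can be checked numerically. The following is a minimal sketch under assumed settings: a single instrument, a linear treatment model and a linear outcome model, with simulated data and hypothetical function names. It does not reproduce the paper's nonlinear comparison or its pretest estimator.

```python
import numpy as np

def two_stage_least_squares(y, d, z):
    """2SLS slope: regress d on z, then y on the fitted d (with intercepts)."""
    Z = np.column_stack([np.ones_like(z), z])
    d_hat = Z @ np.linalg.lstsq(Z, d, rcond=None)[0]
    X = np.column_stack([np.ones_like(d_hat), d_hat])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def control_function(y, d, z):
    """Control function slope: regress y on d and the first-stage residual."""
    Z = np.column_stack([np.ones_like(z), z])
    resid = d - Z @ np.linalg.lstsq(Z, d, rcond=None)[0]
    X = np.column_stack([np.ones_like(d), d, resid])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def simulate(n=5000, seed=0):
    """Assumed linear data-generating process with true effect 2.0."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                  # instrument
    u = rng.normal(size=n)                  # unmeasured confounder
    d = 0.8 * z + u + rng.normal(size=n)    # treatment
    y = 2.0 * d + u + rng.normal(size=n)    # outcome
    return y, d, z

y, d, z = simulate()
beta_2sls = two_stage_least_squares(y, d, z)
beta_cf = control_function(y, d, z)
```

In this linear setup `beta_2sls` and `beta_cf` agree up to floating-point error, because the first-stage residual is orthogonal to the fitted treatment. With a nonlinear outcome model (for example, adding a squared treatment term) the two constructions generally differ.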

preprint, 2016, arXiv

Mediation Analysis for Count and Zero-Inflated Count Data without Sequential Ignorability and Its Application in Dental Studies

Mediation analysis seeks to understand the mechanism by which a treatment affects an outcome. Count or zero-inflated count outcomes are common in many studies in which mediation analysis is of interest. For example, in dental studies, outcomes such as decayed, missing and filled teeth are typically zero inflated. Existing mediation analysis approaches for count data assume sequential ignorability of the mediator. This is often not plausible because the mediator is not randomized so that there are unmeasured confounders associated with the mediator and the outcome. In this paper, we develop causal methods based on instrumental variable (IV) approaches for mediation analysis for count data possibly with a lot of zeros that do not require the assumption of sequential ignorability. We first define the direct and indirect effect ratios for those data, and then propose estimating equations and use empirical likelihood to estimate the direct and indirect effects consistently. A sensitivity analysis is proposed for violations of the IV exclusion restriction assumption. Simulation studies demonstrate that our method works well for different types of outcomes under different settings. Our method is applied to a randomized dental caries prevention trial and a study of the effect of a massive flood in Bangladesh on children's diarrhea.

preprint, 2016, arXiv

Optimal Estimation of Co-heritability in High-dimensional Linear Models

Co-heritability is an important concept that characterizes the genetic associations within pairs of quantitative traits. There has been significant recent interest in estimating the co-heritability based on data from the genome-wide association studies (GWAS). This paper introduces two measures of co-heritability in the high-dimensional linear model framework, including the inner product of the two regression vectors and a normalized inner product by their lengths. Functional de-biased estimators (FDEs) are developed to estimate these two co-heritability measures. In addition, estimators of quadratic functionals of the regression vectors are proposed. Both theoretical and numerical properties of the estimators are investigated. In particular, minimax rates of convergence are established and the proposed estimators of the inner product, the quadratic functionals and the normalized inner product are shown to be rate-optimal. Simulation results show that the FDEs significantly outperform the naive plug-in estimates. The FDEs are also applied to analyze a yeast segregant data set with multiple traits to estimate heritability and co-heritability among the traits.
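The two co-heritability measures described above can be written out as follows. The notation is an assumption made for illustration (the abstract names the quantities but not the symbols): two traits $y$ and $w$ follow linear models with regression vectors $\beta$ and $\gamma$.

```latex
% Assumed notation, not taken from the abstract:
% y, w are the two traits; X, Z are the design matrices;
% \beta, \gamma are the high-dimensional regression vectors.
y = X\beta + \epsilon, \qquad w = Z\gamma + \delta,
\qquad
\mathrm{I}(\beta,\gamma) = \beta^{\top}\gamma,
\qquad
\mathrm{R}(\beta,\gamma) = \frac{\beta^{\top}\gamma}{\|\beta\|_2\,\|\gamma\|_2}.
```

Here $\mathrm{I}$ is the inner product and $\mathrm{R}$ is the inner product normalized by the lengths of the two vectors; the quadratic functionals mentioned in the abstract would correspond to terms such as $\|\beta\|_2^2$ and $\|\gamma\|_2^2$.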

preprint, 2016, arXiv

Using an Instrumental Variable to Test for Unmeasured Confounding

An important concern in an observational study is whether or not there is unmeasured confounding, i.e., unmeasured ways in which the treatment and control groups differ before treatment that affect the outcome. We develop a test of whether there is unmeasured confounding when an instrumental variable (IV) is available. An IV is a variable that is independent of the unmeasured confounding and encourages a subject to take one treatment level vs. another, while having no effect on the outcome beyond its encouragement of a certain treatment level. We show what types of unmeasured confounding can be tested for with an IV and develop a test for this type of unmeasured confounding that has correct type I error rate. We show that the widely used Durbin-Wu-Hausman (DWH) test can have inflated type I error rates when there is treatment effect heterogeneity. Additionally, we show that our test provides more insight into the nature of the unmeasured confounding than the DWH test. We apply our test to an observational study of the effect of a premature infant being delivered in a high-level neonatal intensive care unit (one with mechanical assisted ventilation and high volume) vs. a lower level unit, using the excess travel time a mother lives from the nearest high-level unit to the nearest lower-level unit as an IV.

preprint, 2015, arXiv

Confidence Intervals for High-Dimensional Linear Regression: Minimax Rates and Adaptivity

Confidence sets play a fundamental role in statistical inference. In this paper, we consider confidence intervals for high dimensional linear regression with random design. We first establish the convergence rates of the minimax expected length for confidence intervals in the oracle setting where the sparsity parameter is given. The focus is then on the problem of adaptation to sparsity for the construction of confidence intervals. Ideally, an adaptive confidence interval should have its length automatically adjusted to the sparsity of the unknown regression vector, while maintaining a prespecified coverage probability. It is shown that such a goal is in general not attainable, except when the sparsity parameter is restricted to a small region over which the confidence intervals have the optimal length of the usual parametric rate. It is further demonstrated that the lack of adaptivity is not due to the conservativeness of the minimax framework, but is fundamentally caused by the difficulty of learning the bias accurately.

preprint, 2013, arXiv

Boundary Value Problems for a Family of Domains in the Sierpinski Gasket

For a family of domains in the Sierpinski gasket, we study harmonic functions of finite energy, characterizing them in terms of their boundary values, and study their normal derivatives on the boundary. We characterize those domains for which there is an extension operator for functions of finite energy. We give an explicit construction of the Green's function for these domains.
