Source author record

Jiji Zhang

Jiji Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Methodology; Artificial Intelligence; Machine Learning; Logic in Computer Science; math.CT; math.ST; Neurons and Cognition; Statistics Theory

Catalog footprint

What is connected

13 works

8 topics

4 close collaborators

Preprint, 2022, arXiv

Local search for efficient causal effect estimation

Causal effect estimation from observational data is a challenging problem, especially with high dimensional data and in the presence of unobserved variables. The available data-driven methods for tackling the problem either provide an estimation of the bounds of a causal effect (i.e. nonunique estimation) or have low efficiency. The major hurdle for achieving high efficiency while trying to obtain unique and unbiased causal effect estimation is how to find a proper adjustment set for confounding control in a fast way, given the huge covariate space and considering unobserved variables. In this paper, we approach the problem as a local search task for finding valid adjustment sets in data. We establish the theorems to support the local search for adjustment sets, and we show that unique and unbiased estimation can be achieved from observational data even when there exist unobserved variables. We then propose a data-driven algorithm that is fast and consistent under mild assumptions. We also make use of a frequent pattern mining method to further speed up the search of minimal adjustment sets for causal effect estimation. Experiments conducted on extensive synthetic and real-world datasets demonstrate that the proposed algorithm outperforms the state-of-the-art criteria/estimators in both accuracy and time-efficiency.
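To illustrate what a valid adjustment set buys, here is a minimal sketch in Python. It is not the local search algorithm from the abstract: it assumes a linear model with a single observed confounder `z`, and the coefficients, sample size, and the helper `ols_coefficient` are all hypothetical choices made for this example.

```python
import numpy as np

def ols_coefficient(y, treatment, covariates=None):
    """OLS coefficient on `treatment` when regressing y on an intercept,
    the treatment, and any adjustment covariates."""
    columns = [np.ones_like(treatment), treatment]
    if covariates is not None:
        columns.extend(covariates)
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

# Simulated data (assumed for illustration): z confounds t and y.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                        # observed confounder
t = 0.8 * z + rng.normal(size=n)              # treatment depends on z
y = 1.5 * t + 2.0 * z + rng.normal(size=n)    # true effect of t on y is 1.5

naive = ols_coefficient(y, t)                      # no adjustment, biased
adjusted = ols_coefficient(y, t, covariates=[z])   # adjusts for {z}
print(f"naive={naive:.2f} adjusted={adjusted:.2f}")
```

In this toy setting the unadjusted estimate absorbs the confounding path through `z`, while adjusting for `{z}` recovers a value close to the simulated effect. The harder problem the abstract addresses, finding such a set quickly when some variables are unobserved, is not attempted here.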

Preprint, 2022, arXiv

Markov categories, causal theories, and the do-calculus

We give a category-theoretic treatment of causal models that formalizes the syntax for causal reasoning over a directed acyclic graph (DAG) by associating a free Markov category with the DAG in a canonical way. This framework enables us to define and study important concepts in causal reasoning from an abstract and "purely causal" point of view, such as causal independence/separation, causal conditionals, and decomposition of intervention effects. Our results regarding these concepts abstract away from the details of the commonly adopted causal models such as (recursive) structural equation models or causal Bayesian networks. They are therefore more widely applicable and in a way conceptually clearer. Our results are also intimately related to Judea Pearl's celebrated do-calculus, and yield a syntactic version of a core part of the calculus that is inherited in all causal models. In particular, it induces a simpler and specialized version of Pearl's do-calculus in the context of causal Bayesian networks, which we show is as strong as the full version.

Preprint, 2022, arXiv

Reframed GES with a Neural Conditional Dependence Measure

In a nonparametric setting, the causal structure is often identifiable only up to Markov equivalence, and for the purpose of causal inference, it is useful to learn a graphical representation of the Markov equivalence class (MEC). In this paper, we revisit the Greedy Equivalence Search (GES) algorithm, which is widely cited as a score-based algorithm for learning the MEC of the underlying causal structure. We observe that in order to make the GES algorithm consistent in a nonparametric setting, it is not necessary to design a scoring metric that evaluates graphs. Instead, it suffices to plug in a consistent estimator of a measure of conditional dependence to guide the search. We therefore present a reframing of the GES algorithm, which is more flexible than the standard score-based version and readily lends itself to the nonparametric setting with a general measure of conditional dependence. In addition, we propose a neural conditional dependence (NCD) measure, which utilizes the expressive power of deep neural networks to characterize conditional independence in a nonparametric manner. We establish the optimality of the reframed GES algorithm under standard assumptions and the consistency of using our NCD estimator to decide conditional independence. Together these results justify the proposed approach. Experimental results demonstrate the effectiveness of our method in causal discovery, as well as the advantages of using our NCD measure over kernel-based measures.

Preprint, 2022, arXiv

Reliable Causal Discovery with Improved Exact Search and Weaker Assumptions

Many of the causal discovery methods rely on the faithfulness assumption to guarantee asymptotic correctness. However, the assumption can be approximately violated in many ways, leading to sub-optimal solutions. Although there is a line of research in Bayesian network structure learning that focuses on weakening the assumption, such as exact search methods with well-defined score functions, they do not scale well to large graphs. In this work, we introduce several strategies to improve the scalability of exact score-based methods in the linear Gaussian setting. In particular, we develop a super-structure estimation method based on the support of inverse covariance matrix which requires assumptions that are strictly weaker than faithfulness, and apply it to restrict the search space of exact search. We also propose a local search strategy that performs exact search on the local clusters formed by each variable and its neighbors within two hops in the super-structure. Numerical experiments validate the efficacy of the proposed procedure, and demonstrate that it scales up to hundreds of nodes with a high accuracy.
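The idea of reading a super-structure off the support of the inverse covariance matrix can be sketched as follows. This is an illustration under assumptions, not the estimator from the paper: the three-variable chain, the edge weights, and the hard threshold in the hypothetical helper `precision_support` are chosen only to make the example concrete.

```python
import numpy as np

def precision_support(data, threshold=0.1):
    """Pairs (i, j), i < j, whose sample precision entry exceeds `threshold`
    in absolute value. The threshold is an assumption of this sketch."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = precision.shape[0]
    return {
        (i, j)
        for i in range(d)
        for j in range(i + 1, d)
        if abs(precision[i, j]) > threshold
    }

# Linear Gaussian chain X0 -> X1 -> X2 with unit-variance noise.
rng = np.random.default_rng(1)
n = 50_000
noise = rng.normal(size=(n, 3))
x0 = noise[:, 0]
x1 = 0.9 * x0 + noise[:, 1]
x2 = 0.9 * x1 + noise[:, 2]
data = np.column_stack([x0, x1, x2])

support = precision_support(data)
print(sorted(support))
```

For this chain the population precision matrix has a zero in the (0, 2) entry, so the estimated support keeps only the pairs (0, 1) and (1, 2). An exact search restricted to such a support would then consider far fewer candidate edges. In graphs with colliders the support also contains pairs of parents that share a child, which is why it is a super-structure rather than the skeleton itself.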

Preprint, 2020, arXiv

Causal Discovery from Heterogeneous/Nonstationary Data with Independent Changes

It is commonplace to encounter heterogeneous or nonstationary data, of which the underlying generating process changes across domains or over time. Such a distribution shift feature presents both challenges and opportunities for causal discovery. In this paper, we develop a framework for causal discovery from such data, called Constraint-based causal Discovery from heterogeneous/NOnstationary Data (CD-NOD), to find causal skeleton and directions and estimate the properties of mechanism changes. First, we propose an enhanced constraint-based procedure to detect variables whose local mechanisms change and recover the skeleton of the causal structure over observed variables. Second, we present a method to determine causal orientations by making use of independent changes in the data distribution implied by the underlying causal model, benefiting from information carried by changing distributions. After learning the causal structure, next, we investigate how to efficiently estimate the "driving force" of the nonstationarity of a causal mechanism. That is, we aim to extract from data a low-dimensional representation of changes. The proposed methods are nonparametric, with no hard restrictions on data distributions and causal mechanisms, and do not rely on window segmentation. Furthermore, we find that data heterogeneity benefits causal structure identification even with particular types of confounders. Finally, we show the connection between heterogeneity/nonstationarity and soft intervention in causal discovery. Experimental results on various synthetic and real-world data sets (task-fMRI and stock market data) are presented to demonstrate the efficacy of the proposed methods.

Preprint, 2016, arXiv

Discovery and Visualization of Nonstationary Causal Models

It is commonplace to encounter nonstationary data, of which the underlying generating process may change over time or across domains. The nonstationarity presents both challenges and opportunities for causal discovery. In this paper we propose a principled framework to handle nonstationarity, and develop some methods to address three important questions. First, we propose an enhanced constraint-based method to detect variables whose local mechanisms are nonstationary and recover the skeleton of the causal structure over observed variables. Second, we present a way to determine some causal directions by taking advantage of information carried by changing distributions. Third, we develop a method for visualizing the nonstationarity of causal modules. Experimental results on various synthetic and real-world data sets are presented to demonstrate the efficacy of our methods.

Preprint, 2015, arXiv

A Uniformly Consistent Estimator of Causal Effects under the $k$-Triangle-Faithfulness Assumption

Spirtes, Glymour and Scheines [Causation, Prediction, and Search (1993) Springer] described a pointwise consistent estimator of the Markov equivalence class of any causal structure that can be represented by a directed acyclic graph for any parametric family with a uniformly consistent test of conditional independence, under the Causal Markov and Causal Faithfulness assumptions. Robins et al. [Biometrika 90 (2003) 491-515], however, proved that there are no uniformly consistent estimators of Markov equivalence classes of causal structures under those assumptions. Subsequently, Kalisch and Bühlmann [J. Mach. Learn. Res. 8 (2007) 613-636] described a uniformly consistent estimator of the Markov equivalence class of a linear Gaussian causal structure under the Causal Markov and Strong Causal Faithfulness assumptions. However, the Strong Faithfulness assumption may be false with high probability in many domains. We describe a uniformly consistent estimator of both the Markov equivalence class of a linear Gaussian causal structure and the identifiable structural coefficients in the Markov equivalence class under the Causal Markov assumption and the considerably weaker k-Triangle-Faithfulness assumption.

Preprint, 2015, arXiv

Distinguishing Cause from Effect Based on Exogeneity

Recent developments in structural equation modeling have produced several methods that can usually distinguish cause from effect in the two-variable case. For that purpose, however, one has to impose substantial structural constraints or smoothness assumptions on the functional causal models. In this paper, we consider the problem of determining the causal direction from a related but different point of view, and propose a new framework for causal direction determination. We show that it is possible to perform causal inference based on the condition that the cause is "exogenous" for the parameters involved in the generating process from the cause to the effect. In this way, we avoid the structural constraints required by the SEM-based approaches. In particular, we exploit nonparametric methods to estimate marginal and conditional distributions, and propose a bootstrap-based approach to test for the exogeneity condition; the testing results indicate the causal direction between two variables. The proposed method is validated on both synthetic and real data.

Preprint, 2012, arXiv

A Characterization of Markov Equivalence Classes for Directed Acyclic Graphs with Latent Variables

Different directed acyclic graphs (DAGs) may be Markov equivalent in the sense that they entail the same conditional independence relations among the observed variables. Meek (1995) characterizes Markov equivalence classes for DAGs (with no latent variables) by presenting a set of orientation rules that can correctly identify all arrow orientations shared by all DAGs in a Markov equivalence class, given a member of that class. For DAG models with latent variables, maximal ancestral graphs (MAGs) provide a neat representation that facilitates model search. Earlier work (Ali et al. 2005) has identified a set of orientation rules sufficient to construct all arrowheads common to a Markov equivalence class of MAGs. In this paper, we provide extra rules sufficient to construct all common tails as well. We end up with a set of orientation rules sound and complete for identifying commonalities across a Markov equivalence class of MAGs, which is particularly useful for causal inference.
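For DAGs without latent variables, Markov equivalence has a well-known graphical test: two DAGs over the same nodes are equivalent exactly when they share the same skeleton and the same unshielded colliders. The sketch below checks that criterion. It does not implement the orientation rules for MAGs that the abstract describes, and the edge-list encoding and function names are assumptions of this example. Nodes without any edge are ignored.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected adjacencies of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}

def unshielded_colliders(edges):
    """Triples (p, child, q) with p -> child <- q and p, q non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for p, q in combinations(sorted(pars), 2):
            if frozenset((p, q)) not in skel:
                found.add((p, child, q))
    return found

def markov_equivalent(edges_a, edges_b):
    """Same skeleton and same unshielded colliders (DAGs, no latents)."""
    return (skeleton(edges_a) == skeleton(edges_b)
            and unshielded_colliders(edges_a) == unshielded_colliders(edges_b))

chain = [("X", "Y"), ("Y", "Z")]
reversed_chain = [("Z", "Y"), ("Y", "X")]
collider = [("X", "Y"), ("Z", "Y")]

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, collider))        # False
```

The two chains entail the same single independence (X and Z given Y), whereas the collider entails a different one (X and Z marginally), which is what the check reports.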

Preprint, 2012, arXiv

A Transformational Characterization of Markov Equivalence for Directed Acyclic Graphs with Latent Variables

Different directed acyclic graphs (DAGs) may be Markov equivalent in the sense that they entail the same conditional independence relations among the observed variables. Chickering (1995) provided a transformational characterization of Markov equivalence for DAGs (with no latent variables), which is useful in deriving properties shared by Markov equivalent DAGs, and, with certain generalization, is needed to prove the asymptotic correctness of a search procedure over Markov equivalence classes, known as the GES algorithm. For DAG models with latent variables, maximal ancestral graphs (MAGs) provide a neat representation that facilitates model search. However, no transformational characterization -- analogous to Chickering's -- of Markov equivalent MAGs is yet available. This paper establishes such a characterization for directed MAGs, which we expect will have similar uses as it does for DAGs.
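The transformational characterization for DAGs is usually stated in terms of covered edges: an edge X -> Y is covered when the parents of Y are exactly the parents of X together with X, and reversing a covered edge yields a Markov equivalent DAG. A minimal sketch of that test follows. It covers only the DAG case that the abstract cites as background, not the new result for directed MAGs, and the edge-list encoding and helper names are assumptions of this example.

```python
def parents(edges, node):
    """Parents of `node` in a DAG given as (parent, child) pairs."""
    return {a for a, b in edges if b == node}

def is_covered(edges, edge):
    """True if edge x -> y is covered: Pa(y) == Pa(x) | {x}."""
    if edge not in edges:
        raise ValueError(f"{edge} is not an edge of the graph")
    x, y = edge
    return parents(edges, y) == parents(edges, x) | {x}

def reverse_edge(edges, edge):
    """Return a new edge list with `edge` reversed."""
    x, y = edge
    return [(y, x) if e == edge else e for e in edges]

dag = [("A", "B"), ("A", "C"), ("B", "C")]

print(is_covered(dag, ("B", "C")))  # True: Pa(C) = {A, B} = Pa(B) | {B}
print(is_covered(dag, ("A", "C")))  # False: reversing it would create a cycle
print(reverse_edge(dag, ("B", "C")))
```

In this complete three-node DAG, reversing the covered edge B -> C leaves both the skeleton and the (empty) set of unshielded colliders unchanged, so the result stays in the same equivalence class.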

Preprint, 2012, arXiv

Adjacency-Faithfulness and Conservative Causal Inference

Most causal inference algorithms in the literature (e.g., Pearl (2000), Spirtes et al. (2000), Heckerman et al. (1999)) exploit an assumption usually referred to as the causal Faithfulness or Stability condition. In this paper, we highlight two components of the condition used in constraint-based algorithms, which we call "Adjacency-Faithfulness" and "Orientation-Faithfulness". We point out that assuming Adjacency-Faithfulness is true, it is in principle possible to test the validity of Orientation-Faithfulness. Based on this observation, we explore the consequence of making only the Adjacency-Faithfulness assumption. We show that the familiar PC algorithm has to be modified to be (asymptotically) correct under the weaker Adjacency-Faithfulness assumption. Roughly, the modified algorithm, called Conservative PC (CPC), checks whether Orientation-Faithfulness holds in the orientation phase, and if not, avoids drawing certain causal conclusions the PC algorithm would draw. However, if the stronger, standard causal Faithfulness condition actually obtains, the CPC algorithm is shown to output the same pattern as the PC algorithm does in the large sample limit. We also present a simulation study showing that the CPC algorithm runs almost as fast as the PC algorithm, and outputs significantly fewer false causal arrowheads than the PC algorithm does on realistic sample sizes. We end our paper by discussing how score-based algorithms such as GES perform when the Adjacency-Faithfulness but not the standard causal Faithfulness condition holds, and how to extend our work to the FCI algorithm, which allows for the possibility of latent variables.
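The conservative orientation step can be sketched as a three-way classification of an unshielded triple X - Y - Z. The abstract only says that CPC checks Orientation-Faithfulness, so the concrete rule below is an assumption based on how the check is commonly described: collect every tested conditioning set that separates X and Z, then look at whether the middle node Y appears in all, none, or only some of them. The function name and return labels are hypothetical.

```python
def classify_unshielded_triple(middle, separating_sets):
    """Classify an unshielded triple X - middle - Z.

    `separating_sets` holds every conditioning set (drawn from the
    neighbours of X or of Z) found to make X and Z independent.
    """
    if not separating_sets:
        raise ValueError("an unshielded triple needs at least one separating set")
    contains_middle = [middle in s for s in separating_sets]
    if all(contains_middle):
        return "noncollider"   # middle is in every separating set
    if not any(contains_middle):
        return "collider"      # middle is in no separating set: X -> middle <- Z
    return "unfaithful"        # mixed evidence: leave the triple unoriented

print(classify_unshielded_triple("Y", [{"Y"}, {"Y", "W"}]))  # noncollider
print(classify_unshielded_triple("Y", [set(), {"W"}]))       # collider
print(classify_unshielded_triple("Y", [{"Y"}, {"W"}]))       # unfaithful
```

The third outcome is what makes the procedure conservative: where a single separating set would have committed to an orientation, conflicting sets cause the triple to be flagged instead of oriented.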

Preprint, 2012, arXiv

Strong Faithfulness and Uniform Consistency in Causal Inference

A fundamental question in causal inference is whether it is possible to reliably infer manipulation effects from observational data. There are a variety of senses of asymptotic reliability in the statistical literature, among which the most commonly discussed frequentist notions are pointwise consistency and uniform consistency. Uniform consistency is in general preferred to pointwise consistency because the former allows us to control the worst case error bounds with a finite sample size. In the sense of pointwise consistency, several reliable causal inference algorithms have been established under the Markov and Faithfulness assumptions [Pearl 2000, Spirtes et al. 2001]. In the sense of uniform consistency, however, reliable causal inference is impossible under the two assumptions when time order is unknown and/or latent confounders are present [Robins et al. 2000]. In this paper we present two natural generalizations of the Faithfulness assumption in the context of structural equation models, under which we show that the typical algorithms in the literature (in some cases with modifications) are uniformly consistent even when the time order is unknown. We also discuss the situation where latent confounders may be present and the sense in which the Faithfulness assumption is a limiting case of the stronger assumptions.

Preprint, 2012, arXiv

Towards Characterizing Markov Equivalence Classes for Directed Acyclic Graphs with Latent Variables

It is well known that there may be many causal explanations that are consistent with a given set of data. Recent work has been done to represent the common aspects of these explanations into one representation. In this paper, we address what is less well known: how do the relationships common to every causal explanation among the observed variables of some DAG process change in the presence of latent variables? Ancestral graphs provide a class of graphs that can encode conditional independence relations that arise in DAG models with latent and selection variables. In this paper we present a set of orientation rules that construct the Markov equivalence class representative for ancestral graphs, given a member of the equivalence class. These rules are sound and complete. We also show that when the equivalence class includes a DAG, the equivalence class representative is the essential graph for the said DAG.
