Source author record

Dominik Janzing

Dominik Janzing appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Artificial Intelligence, quant-ph, cond-mat.stat-mech, Methodology, math.ST, stat.OT, Statistics Theory, astro-ph.EP, astro-ph.IM, Computer Vision, cond-mat.dis-nn, cond-mat.mes-hall, gr-qc, Information Theory, math.IT, physics.data-an

Catalog footprint

What is connected

43 works

17 topics

4 close collaborators


preprint 2022 arXiv

Causal Forecasting: Generalization Bounds for Autoregressive Models

Despite the increasing relevance of forecasting methods, causal implications of these algorithms remain largely unexplored. This is concerning considering that, even under simplifying assumptions such as causal sufficiency, the statistical risk of a model can differ significantly from its causal risk. Here, we study the problem of causal generalization (generalizing from the observational to interventional distributions) in forecasting. Our goal is to find answers to the question: How does the efficacy of a vector autoregressive (VAR) model in predicting statistical associations compare with its ability to predict under interventions? To this end, we introduce the framework of causal learning theory for forecasting. Using this framework, we obtain a characterization of the difference between statistical and causal risks, which helps identify sources of divergence between them. Under causal sufficiency, the problem of causal generalization amounts to learning under covariate shifts, albeit with additional structure (restriction to interventional distributions under the VAR model). This structure allows us to obtain uniform convergence bounds on causal generalizability for the class of VAR models. To the best of our knowledge, this is the first work that provides theoretical guarantees for causal generalization in the time-series setting.

preprint 2022 arXiv

Causal Inference Through the Structural Causal Marginal Problem

We introduce an approach to counterfactual inference based on merging information from multiple datasets. We consider a causal reformulation of the statistical marginal problem: given a collection of marginal structural causal models (SCMs) over distinct but overlapping sets of variables, determine the set of joint SCMs that are counterfactually consistent with the marginal ones. We formalise this approach for categorical SCMs using the response function formulation and show that it reduces the space of allowed marginal and joint SCMs. Our work thus highlights a new mode of falsifiability through additional variables, in contrast to the statistical one via additional data.

preprint 2022 arXiv

Correcting Confounding via Random Selection of Background Variables

We propose a method to distinguish causal influence from hidden confounding in the following scenario: given a target variable Y, potential causal drivers X, and a large number of background features, we propose a novel criterion for identifying causal relationships based on the stability of regression coefficients of X on Y with respect to selecting different background features. To this end, we introduce a statistic V measuring the coefficient's variability. We prove, subject to a symmetry assumption for the background influence, that V converges to zero if and only if X contains no causal drivers. In experiments with simulated data, the method outperforms state-of-the-art algorithms. Further, we report encouraging results for real-world data. Our approach aligns with the general belief that causal insights admit better generalization of statistical associations across environments, and justifies similar existing heuristic approaches from the literature.

preprint 2022 arXiv

Explaining the root causes of unit-level changes

Existing methods of explainable AI and interpretable ML cannot explain change in the values of an output variable for a statistical unit in terms of the change in the input values and the change in the "mechanism" (the function transforming input to output). We propose two methods based on counterfactuals for explaining unit-level changes at various input granularities using the concept of Shapley values from game theory. These methods satisfy two key axioms desirable for any unit-level change attribution method. Through simulations, we study the reliability and the scalability of the proposed methods. We get sensible results from a case study on identifying the drivers of the change in the earnings for individuals in the US.

preprint 2022 arXiv

Obtaining Causal Information by Merging Datasets with MAXENT

The investigation of the question "which treatment has a causal effect on a target variable?" is of particular relevance in a large number of scientific disciplines. This challenging task becomes even more difficult if not all treatment variables were, or even can be, observed jointly with the target variable. Another similarly important and challenging task is to quantify the causal influence of a treatment on a target in the presence of confounders. In this paper, we discuss how causal knowledge can be obtained without having observed all variables jointly, but by merging the statistical information from different datasets. We show how the maximum entropy principle can be used to identify edges among random variables when assuming causal sufficiency and an extended version of faithfulness, and when only subsets of the variables have been observed jointly.

preprint 2022 arXiv

Score matching enables causal discovery of nonlinear additive noise models

This paper demonstrates how to recover causal graphs from the score of the data distribution in non-linear additive (Gaussian) noise models. Using score matching algorithms as a building block, we show how to design a new generation of scalable causal discovery methods. To showcase our approach, we also propose a new efficient method for approximating the score's Jacobian, which enables recovery of the causal graph. Empirically, we find that the new algorithm, called SCORE, is competitive with state-of-the-art causal discovery methods while being significantly faster.

preprint 2022 arXiv

Testing Granger Non-Causality in Panels with Cross-Sectional Dependencies

This paper proposes a new approach for testing Granger non-causality on panel data. Instead of aggregating panel member statistics, we aggregate their corresponding p-values and show that the resulting p-value approximately bounds the type I error by the chosen significance level even if the panel members are dependent. We compare our approach against the most widely used Granger causality algorithm on panel data and show that our approach yields lower FDR at the same power for large sample sizes and panels with cross-sectional dependencies. Finally, we examine COVID-19 data about confirmed cases and deaths measured in countries/regions worldwide and show that our approach is able to discover the true causal relation between confirmed cases and deaths while state-of-the-art approaches fail.

preprint 2021 arXiv

A theory of independent mechanisms for extrapolation in generative models

Generative models can be trained to emulate complex empirical data, but are they useful to make predictions in the context of previously unobserved environments? An intuitive idea to promote such extrapolation capabilities is to have the architecture of such a model reflect a causal graph of the true data generating process, such that one can intervene on each node independently of the others. However, the nodes of this graph are usually unobserved, leading to overparameterization and lack of identifiability of the causal structure. We develop a theoretical framework to address this challenging situation by defining a weaker form of identifiability, based on the principle of independence of mechanisms. We demonstrate on toy examples that classical stochastic gradient descent can hinder the model's extrapolation capabilities, suggesting independence of mechanisms should be enforced explicitly during training. Experiments on deep generative models trained on real-world data support these insights and illustrate how the extrapolation capabilities of such models can be leveraged.

preprint 2017 arXiv

Causal Consistency of Structural Equation Models

Complex systems can be modelled at various levels of detail. Ideally, causal models of the same system should be consistent with one another in the sense that they agree in their predictions of the effects of interventions. We formalise this notion of consistency in the case of Structural Equation Models (SEMs) by introducing exact transformations between SEMs. This provides a general language to consider, for instance, the different levels of description in the following three scenarios: (a) models with large numbers of variables versus models in which the 'irrelevant' or unobservable variables have been marginalised out; (b) micro-level models versus macro-level models in which the macro-variables are aggregate features of the micro-variables; (c) dynamical time series models versus models of their stationary behaviour. Our analysis stresses the importance of well-specified interventions in the causal modelling process and sheds light on the interpretation of cyclic SEMs.

preprint 2015 arXiv

Algorithmic independence of initial condition and dynamical law in thermodynamics and causal inference

We postulate a principle stating that the initial condition of a physical system is typically algorithmically independent of the dynamical law. We argue that this links thermodynamics and causal inference. On the one hand, it entails behaviour that is similar to the usual arrow of time. On the other hand, it motivates a statistical asymmetry between cause and effect that has recently been postulated in the field of causal inference, namely, that the probability distribution P(cause) contains no information about the conditional distribution P(effect|cause) and vice versa, while P(effect) may contain information about P(cause|effect).

preprint 2015 arXiv

Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components

A widely applied approach to causal inference from a non-experimental time series $X$, often referred to as "(linear) Granger causal analysis", is to regress present on past and interpret the regression matrix $\hat{B}$ causally. However, if there is an unmeasured time series $Z$ that influences $X$, then this approach can lead to wrong causal conclusions, i.e., distinct from those one would draw if one had additional information such as $Z$. In this paper we take a different approach: We assume that $X$ together with some hidden $Z$ forms a first order vector autoregressive (VAR) process with transition matrix $A$, and argue why it is more valid to interpret $A$ causally instead of $\hat{B}$. Then we examine under which conditions the most important parts of $A$ are identifiable or almost identifiable from only $X$. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from $X$ to $Z$. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using $X$.
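As a minimal sketch of the "regress present on past" step described above, the following simulates a hypothetical first-order VAR process in which two observed components are driven by a hidden third one, then compares the least-squares regression matrix with and without the hidden component. The transition matrix `a`, the sample size, and the function names `fit_var1` and `simulate_var1` are illustrative assumptions, and this is plain least squares, not either of the estimation algorithms from the paper.

```python
import numpy as np

def fit_var1(x):
    """Least-squares estimate of B in x[t] ~ B @ x[t-1]; x has shape (T, d)."""
    past, present = x[:-1], x[1:]
    # Each row satisfies present_row ~ past_row @ B.T, so lstsq returns B.T.
    b_transposed, *_ = np.linalg.lstsq(past, present, rcond=None)
    return b_transposed.T

def simulate_var1(a, steps, rng):
    """Simulate x[t] = a @ x[t-1] + standard Gaussian noise, starting at zero."""
    d = a.shape[0]
    x = np.zeros((steps, d))
    for t in range(1, steps):
        x[t] = a @ x[t - 1] + rng.standard_normal(d)
    return x

rng = np.random.default_rng(0)
# Hypothetical joint process: columns 0 and 1 play the role of X, column 2 is the hidden Z.
a = np.array([[0.5, 0.0, 0.4],
              [0.0, 0.5, 0.4],
              [0.0, 0.0, 0.8]])
full = simulate_var1(a, 20000, rng)

b_full = fit_var1(full)         # regression with Z available: close to a
b_obs = fit_var1(full[:, :2])   # regression on X only: picks up a spurious cross term
```

In this toy setup the two observed components do not influence each other, yet the off-diagonal entries of `b_obs` come out clearly nonzero because both components share the hidden driver. This is the kind of wrong causal conclusion the abstract warns about.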

preprint 2015 arXiv

Distinguishing cause from effect using observational data: methods and benchmarks

The discovery of causal relationships from purely observational data is a fundamental problem in science. The most elementary form of such a causal discovery problem is to decide whether X causes Y or, alternatively, Y causes X, given joint observations of two variables X, Y. An example is to decide whether altitude causes temperature, or vice versa, given only joint measurements of both variables. Even under the simplifying assumptions of no confounding, no feedback loops, and no selection bias, such bivariate causal discovery problems are challenging. Nevertheless, several approaches for addressing those problems have been proposed in recent years. We review two families of such methods: Additive Noise Methods (ANM) and Information Geometric Causal Inference (IGCI). We present the benchmark CauseEffectPairs that consists of data for 100 different cause-effect pairs selected from 37 datasets from various domains (e.g., meteorology, biology, medicine, engineering, economy, etc.) and motivate our decisions regarding the "ground truth" causal directions of all pairs. We evaluate the performance of several bivariate causal discovery methods on these real-world benchmark data and in addition on artificially simulated data. Our empirical results on real-world data indicate that certain methods are indeed able to distinguish cause from effect using only purely observational data, although more benchmark data would be needed to obtain statistically significant conclusions. One of the best performing methods overall is the additive-noise method originally proposed by Hoyer et al. (2009), which obtains an accuracy of 63 ± 10% and an AUC of 0.74 ± 0.05 on the real-world benchmark. As the main theoretical contribution of this work we prove the consistency of that method.

preprint 2015 arXiv

Removing systematic errors for exoplanet search via latent causes

We describe a method for removing the effect of confounders in order to reconstruct a latent quantity of interest. The method, referred to as half-sibling regression, is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification and illustrate the potential of the method in a challenging astronomy application.

preprint 2015 arXiv

Telling cause from effect in deterministic linear dynamical systems

Inferring a cause from its effect using observed time series data is a major challenge in natural and social sciences. Assuming the effect is generated by the cause through a linear system, we propose a new approach based on the hypothesis that nature chooses the "cause" and the "mechanism that generates the effect from the cause" independently of each other. We therefore postulate that the power spectrum of the time series being the cause is uncorrelated with the square of the transfer function of the linear filter generating the effect. While most causal discovery methods for time series mainly rely on the noise, our method relies on asymmetries of the power spectral density properties that can be exploited even in the context of deterministic systems. We describe mathematical assumptions in a deterministic model under which the causal direction is identifiable with this approach. We also discuss the method's performance under the additive noise model and its relationship to Granger causality. Experiments show encouraging results on synthetic as well as real-world data. Overall, this suggests that the postulate of Independence of Cause and Mechanism is a promising principle for causal inference on empirical time series.

preprint 2014 arXiv

Causal Discovery with Continuous Additive Noise Models

We consider the problem of learning causal directed acyclic graphs from an observational joint distribution. One can use these graphs to predict the outcome of interventional experiments, from which data are often not available. We show that if the observational distribution follows a structural equation model with an additive noise structure, the directed acyclic graph becomes identifiable from the distribution under mild conditions. This constitutes an interesting alternative to traditional methods that assume faithfulness and identify only the Markov equivalence class of the graph, thus leaving some edges undirected. We provide practical algorithms for finitely many samples, RESIT (Regression with Subsequent Independence Test) and two methods based on an independence score. We prove that RESIT is correct in the population setting and provide an empirical evaluation.
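The regress-then-test idea behind additive noise models can be sketched for the two-variable case as follows. This is a simplified illustration, not the paper's RESIT implementation: the polynomial regression, the fixed-bandwidth Gaussian-kernel HSIC score used as a dependence measure, and the simulated data are all assumptions chosen to keep the example short.

```python
import numpy as np

def hsic(u, v):
    """Biased HSIC estimate with unit-bandwidth Gaussian kernels on standardized inputs."""
    n = len(u)
    u = (u - u.mean()) / u.std()
    v = (v - v.mean()) / v.std()
    k = np.exp(-0.5 * (u[:, None] - u[None, :]) ** 2)
    l = np.exp(-0.5 * (v[:, None] - v[None, :]) ** 2)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(k @ h @ l @ h)) / n**2

def residual_dependence(cause, effect, degree=3):
    """Fit effect = f(cause) + noise by polynomial regression; score dependence of residuals on cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)

def anm_direction(x, y):
    """Prefer the direction whose regression residuals look more independent of the input."""
    return "X->Y" if residual_dependence(x, y) < residual_dependence(y, x) else "Y->X"

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=500)
y = x**3 + rng.uniform(-1.0, 1.0, size=500)  # hypothetical additive noise model X -> Y
direction = anm_direction(x, y)
```

A practical method would replace the raw score comparison with a proper independence test and a more flexible regressor. The sketch only shows why the two directions behave asymmetrically: the residuals are close to independent of the input in the causal direction and visibly heteroscedastic in the reverse one.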

preprint 2014 arXiv

Consistency of Causal Inference under the Additive Noise Model

We analyze a family of methods for statistical causal inference from a sample under the so-called Additive Noise Model. While most work on the subject has concentrated on establishing the soundness of the Additive Noise Model, the statistical consistency of the resulting inference methods has received little attention. We derive general conditions under which the given family of inference methods consistently infers the causal direction in a nonparametric setting.

preprint 2014 arXiv

From Ordinary Differential Equations to Structural Causal Models: the deterministic case

We show how, and under which conditions, the equilibrium states of a first-order Ordinary Differential Equation (ODE) system can be described with a deterministic Structural Causal Model (SCM). Our exposition sheds more light on the concept of causality as expressed within the framework of Structural Causal Models, especially for cyclic models.

preprint 2014 arXiv

Inferring causal structure: a quantum advantage

The problem of using observed correlations to infer causal relations is relevant to a wide variety of scientific disciplines. Yet given correlations between just two classical variables, it is impossible to determine whether they arose from a causal influence of one on the other or a common cause influencing both, unless one can implement a randomized intervention. We here consider the problem of causal inference for quantum variables. We introduce causal tomography, which unifies and generalizes conventional quantum tomography schemes to provide a complete solution to the causal inference problem using a quantum analogue of a randomized trial. We furthermore show that, in contrast to the classical case, observed quantum correlations alone can sometimes provide a solution. We implement a quantum-optical experiment that allows us to control the causal relation between two optical modes, and two measurement schemes -- one with and one without randomization -- that extract this relation from the observed correlations. Our results show that entanglement and coherence, known to be central to quantum information processing, also provide a quantum advantage for causal inference.

preprint 2014 arXiv

Justifying Information-Geometric Causal Inference

Information Geometric Causal Inference (IGCI) is a new approach to distinguish between cause and effect for two variables. It is based on an independence assumption between input distribution and causal mechanism that can be phrased in terms of orthogonality in information space. We describe two intuitive reinterpretations of this approach that make IGCI more accessible to a broader audience. Moreover, we show that the described independence is related to the hypothesis that unsupervised learning and semi-supervised learning only work for predicting the cause from the effect and not vice versa.

preprint 2014 arXiv

Quantifying causal influences

Many methods for causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between $n$ variables. Given the joint distribution on all these variables, the DAG contains all information about how intervening on one variable changes the distribution of the other $n-1$ variables. However, quantifying the causal influence of one variable on another one remains a nontrivial question. Here we propose a set of natural, intuitive postulates that a measure of causal strength should satisfy. We then introduce a communication scenario, where edges in a DAG play the role of channels that can be locally corrupted by interventions. Causal strength is then the relative entropy distance between the old and the new distribution. Many other measures of causal strength have been proposed, including average causal effect, transfer entropy, directed information, and information flow. We explain how they fail to satisfy the postulates on simple DAGs of $\leq3$ nodes. Finally, we investigate the behavior of our measure on time-series, supporting our claims with experiments on simulated data.
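For the simplest DAG, X → Y with discrete variables, the relative-entropy construction can be illustrated in a few lines. The sketch assumes that "corrupting" the edge means feeding Y with an independent copy of X drawn from P(X); under that assumption the post-intervention distribution is P(x)P(y), so the resulting quantity coincides with the mutual information I(X;Y). The function name and the example channel are hypothetical.

```python
import numpy as np

def causal_strength_two_nodes(p_x, p_y_given_x):
    """KL divergence (in nats) between P(x, y) and the cut-edge distribution for X -> Y.

    p_x has shape (n_x,); p_y_given_x has shape (n_x, n_y) with rows summing to one.
    """
    joint = p_x[:, None] * p_y_given_x        # P(x, y)
    # Cutting the edge: Y is fed an independent copy x' ~ P(X) instead of the actual x.
    p_y_cut = p_x @ p_y_given_x               # sum over x' of P(y | x') P(x')
    cut = p_x[:, None] * p_y_cut[None, :]     # post-intervention distribution
    mask = joint > 0                          # terms with zero probability contribute nothing
    return float(np.sum(joint[mask] * np.log(joint[mask] / cut[mask])))

# Hypothetical example: a fair bit sent through a channel that flips it with probability 0.1.
p_x = np.array([0.5, 0.5])
noisy_channel = np.array([[0.9, 0.1],
                          [0.1, 0.9]])
strength = causal_strength_two_nodes(p_x, noisy_channel)

# A "mechanism" that ignores its input has zero strength.
constant_channel = np.array([[0.3, 0.7],
                             [0.3, 0.7]])
no_strength = causal_strength_two_nodes(p_x, constant_channel)
```

Larger DAGs need the remaining parents of the target node to be handled explicitly, which this two-node sketch does not attempt.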

preprint 2013 arXiv

From Ordinary Differential Equations to Structural Causal Models: the deterministic case

preprint 2013 arXiv

Identifying Finite Mixtures of Nonparametric Product Distributions and Causal Inference of Confounders

We propose a kernel method to identify finite mixtures of nonparametric product distributions. It is based on a Hilbert space embedding of the joint distribution. The rank of the constructed tensor is equal to the number of mixture components. We present an algorithm to recover the components by partitioning the data points into clusters such that the variables are jointly conditionally independent given the cluster. This method can be used to identify finite confounders.

preprint 2012 arXiv

Causal Inference on Time Series using Structural Equation Models

Causal inference uses observations to infer the causal structure of the data generating system. We study a class of functional models that we call Time Series Models with Independent Noise (TiMINo). These models require independent residual time series, whereas traditional methods like Granger causality exploit the variance of residuals. There are two main contributions: (1) Theoretical: By restricting the model class (e.g. to additive noise) we can provide a more general identifiability result than existing ones. This result incorporates lagged and instantaneous effects that can be nonlinear and do not need to be faithful, and non-instantaneous feedbacks between the time series. (2) Practical: If there are no feedback loops between time series, we propose an algorithm based on non-linear independence tests of time series. When the data are causally insufficient, or the data generating process does not satisfy the model assumptions, this algorithm may still give partial results, but mostly avoids incorrect answers. An extension to (non-instantaneous) feedbacks is possible, but not discussed. The algorithm outperforms existing methods on artificial and real data. Code can be provided upon request.

preprint 2012 arXiv

Detecting low-complexity unobserved causes

We describe a method that infers whether statistical dependences between two observed variables X and Y are due to a "direct" causal link or only due to a connecting causal path that contains an unobserved variable of low complexity, e.g., a binary variable. This problem is motivated by statistical genetics. Given a genetic marker that is correlated with a phenotype of interest, we want to detect whether this marker is causal or only correlates with a causal one. Our method is based on the analysis of the location of the conditional distributions P(Y|x) in the simplex of all distributions of Y. We report encouraging results on semi-empirical data.

preprint 2012 arXiv

Identifiability of Causal Graphs using Functional Models

This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative by defining Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption and the Markov and faithfulness assumptions relate to each other and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.

preprint 2012 arXiv

Identifying confounders using additive noise models

We propose a method for inferring the existence of a latent common cause ('confounder') of two observed random variables. The method assumes that the two effects of the confounder are (possibly nonlinear) functions of the confounder plus independent, additive noise. We discuss under which conditions the model is identifiable (up to an arbitrary reparameterization of the confounder) from the joint distribution of the effects. We state and prove a theoretical result that provides evidence for the conjecture that the model is generically identifiable under suitable technical conditions. In addition, we propose a practical method to estimate the confounder from a finite i.i.d. sample of the effects and illustrate that the method works well on both simulated and real-world data.

preprint 2012 arXiv

Inferring deterministic causal relations

We consider two variables that are related to each other by an invertible function. While it has previously been shown that the dependence structure of the noise can provide hints to determine which of the two variables is the cause, we presently show that even in the deterministic (noise-free) case, there are asymmetries that can be exploited for causal inference. Our method is based on the idea that if the function and the probability density of the cause are chosen independently, then the distribution of the effect will, in a certain sense, depend on the function. We provide a theoretical analysis of this method, showing that it also works in the low noise regime, and link it to information geometry. We report strong empirical results on various real-world data sets from different domains.
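A common way to turn this asymmetry into a decision rule is a slope-based score: rescale both variables to [0, 1], sort by the candidate cause, and average the log-slopes between neighbouring points. The sketch below follows that recipe under the assumption of a uniform reference measure; the function names and the cubic toy example are illustrative choices, and ties are simply skipped.

```python
import numpy as np

def igci_slope_score(cause, effect):
    """Mean log-slope of effect against cause after rescaling both to [0, 1]."""
    x = (cause - cause.min()) / (cause.max() - cause.min())
    y = (effect - effect.min()) / (effect.max() - effect.min())
    order = np.argsort(x)
    dx, dy = np.diff(x[order]), np.diff(y[order])
    keep = (dx != 0) & (dy != 0)  # the log-slope is undefined for tied values
    return float(np.mean(np.log(np.abs(dy[keep] / dx[keep]))))

def infer_direction(x, y):
    """Prefer the direction with the smaller score."""
    forward = igci_slope_score(x, y)
    backward = igci_slope_score(y, x)
    return "X->Y" if forward < backward else "Y->X"

# Hypothetical noise-free example: evenly spread cause, invertible nonlinear mechanism.
x = np.linspace(0.0, 1.0, 200)
y = x**3
direction = infer_direction(x, y)
```

For an evenly spread cause and a nonlinear monotone map onto [0, 1], the forward score approximates the integral of the log-derivative, which is negative by Jensen's inequality, while the backward score has the opposite sign. That sign difference is what the rule exploits.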

preprint 2012 arXiv

Invariant Gaussian Process Latent Variable Models and Application in Causal Discovery

In nonlinear latent variable models or dynamic models, if we consider the latent variables as confounders (common causes), the noise dependencies imply further relations between the observed variables. Such models are then closely related to causal discovery in the presence of nonlinear confounders, which is a challenging problem. However, generally in such models the observation noise is assumed to be independent across data dimensions, and consequently the noise dependencies are ignored. In this paper we focus on the Gaussian process latent variable model (GPLVM), from which we develop an extended model called invariant GPLVM (IGPLVM), which can adapt to arbitrary noise covariances. With the Gaussian process prior put on a particular transformation of the latent nonlinear functions, instead of the original ones, the algorithm for IGPLVM involves almost the same computational load as that for the original GPLVM. Besides its potential application in causal discovery, IGPLVM has the advantage that its estimated latent nonlinear manifold is invariant to any nonsingular linear transformation of the data. Experimental results on both synthetic and real-world data show its encouraging performance in nonlinear manifold learning and causal discovery.

preprint 2012 arXiv

Kernel-based Conditional Independence Test and Application in Causal Discovery

Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.

preprint 2012 arXiv

On Causal and Anticausal Learning

We consider the problem of function estimation in the case where an underlying causal model can be inferred. This has implications for popular scenarios such as covariate shift, concept drift, transfer learning and semi-supervised learning. We argue that causal knowledge may facilitate some approaches for a given problem, and rule out others. In particular, we formulate a hypothesis for when semi-supervised learning can help, and corroborate it with empirical results.

preprint 2012 arXiv

Testing whether linear equations are causal: A free probability theory approach

We propose a method that infers whether linear relations between two high-dimensional variables X and Y are due to a causal influence from X to Y or from Y to X. The earlier proposed so-called Trace Method is extended to the regime where the dimension of the observed variables exceeds the sample size. Based on previous work, we postulate conditions that characterize a causal relation between X and Y. Moreover, we describe a statistical test and argue that both causal directions are typically rejected if there is a common cause. A full theoretical analysis is presented for the deterministic case but our approach seems to be valid for the noisy case, too, for which we additionally present an approach based on a sparsity constraint. The discussed method yields promising results for both simulated and real-world data.

preprint 2011 arXiv

Robust Learning via Cause-Effect Models

We consider the problem of function estimation in the case where the data distribution may shift between training and test time, and additional information about it may be available at test time. This relates to popular scenarios such as covariate shift, concept drift, transfer learning and semi-supervised learning. This working paper discusses how these tasks could be tackled depending on the kind of changes of the distributions. It argues that knowledge of an underlying causal direction can facilitate several of these tasks.

preprint 2011 arXiv

Thermodynamic limits of dynamic cooling

We study dynamic cooling, where an externally driven two-level system is cooled via a reservoir, a quantum system with an initial canonical equilibrium state. We obtain explicitly the minimal possible temperature $T_{\rm min}>0$ reachable for the two-level system. The minimization goes over all unitary dynamic processes operating on the system and reservoir, and over the reservoir energy spectrum. The minimal work needed to reach $T_{\rm min}$ grows as $1/T_{\rm min}$. This work cost can be significantly reduced, though, if one is satisfied by temperatures slightly above $T_{\rm min}$. Our results on $T_{\rm min}>0$ prove unattainability of the absolute zero temperature without ambiguities that surround its derivation from the entropic version of the third law. The unattainability can be recovered, albeit via a different mechanism, for cooling by a reservoir with an initially microcanonical state. We also study cooling via a reservoir consisting of $N\gg 1$ identical spins. Here we show that $T_{\rm min}\propto\frac{1}{N}$ and find the maximal cooling compatible with the minimal work determined by the free energy.

preprint 2010 arXiv

Causal Markov condition for submodular information measures

The causal Markov condition (CMC) is a postulate that links observations to causality. It describes the conditional independences among the observations that are entailed by a causal hypothesis in terms of a directed acyclic graph. In the conventional setting, the observations are random variables and the independence is a statistical one, i.e., the information content of observations is measured in terms of Shannon entropy. We formulate a generalized CMC for any kind of observations on which independence is defined via an arbitrary submodular information measure. Recently, this has been discussed for observations in terms of binary strings where information is understood in the sense of Kolmogorov complexity. Our approach enables us to find computable alternatives to Kolmogorov complexity, e.g., the length of a text after applying existing data compression schemes. We show that our CMC is justified if one restricts the attention to a class of causal mechanisms that is adapted to the respective information measure. Our justification is similar to deriving the statistical CMC from functional models of causality, where every variable is a deterministic function of its observed causes and an unobserved noise term. Our experiments on real data demonstrate the performance of compression based causal inference.
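The idea of replacing Kolmogorov complexity with compressed length can be illustrated with a stock compressor. The sketch below uses `zlib` as a stand-in and defines a compression-based "shared information" as R(a) + R(b) - R(ab), where R is the compressed length in bytes. The choice of compressor, the function names, and the test strings are assumptions for illustration; a real compressor satisfies properties such as submodularity only approximately.

```python
import random
import zlib

def compressed_length(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input, used as a computable information measure."""
    return len(zlib.compress(data, 9))

def compression_mutual_information(a: bytes, b: bytes) -> int:
    """R(a) + R(b) - R(ab): how many bytes the compressor saves by seeing a and b together."""
    return compressed_length(a) + compressed_length(b) - compressed_length(a + b)

# Hypothetical observations: two unrelated pseudo-random byte strings.
rng = random.Random(0)
x = bytes(rng.getrandbits(8) for _ in range(2000))
y = bytes(rng.getrandbits(8) for _ in range(2000))

shared_with_copy = compression_mutual_information(x, x)   # large: the copy is fully redundant
shared_with_other = compression_mutual_information(x, y)  # small: nothing to reuse
```

Conditional independence statements of the kind the generalized CMC talks about could then be approximated by checking whether such differences of compressed lengths are close to zero, with the caveat that the outcome depends on the compressor and its window size.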

preprint 2010 arXiv

Is there a physically universal cellular automaton or Hamiltonian?

It is known that both quantum and classical cellular automata (CA) exist that are computationally universal in the sense that they can simulate, after appropriate initialization, any quantum or classical computation, respectively. Here we introduce a different notion of universality: a CA is called physically universal if every transformation on any finite region can be (approximately) implemented by the autonomous time evolution of the system after the complement of the region has been initialized in an appropriate way. We pose the question of whether physically universal CAs exist. Such CAs would provide a model of the world where the boundary between a physical system and its controller can be consistently shifted, in analogy to the Heisenberg cut for the quantum measurement problem. We propose to study the thermodynamic cost of computation and control within such a model because implementing a cyclic process on a microsystem may require a non-cyclic process for its controller, whereas implementing a cyclic process on system and controller may require the implementation of a non-cyclic process on a "meta"-controller, and so on. Physically universal CAs avoid this infinite hierarchy of controllers, and the cost of implementing cycles on a subsystem can be described by mixing properties of the CA dynamics. We define a physical prior on the CA configurations by applying the dynamics to an initial state where half of the CA is in the maximum entropy state and half of it is in the all-zero state (thus reflecting the fact that life requires non-equilibrium states like the boundary between a hot and a cold reservoir). As opposed to Solomonoff's prior, our prior accounts not only for the Kolmogorov complexity but also for the cost of isolating the system during the state preparation if the preparation process is not robust.

preprint 2009 arXiv

Causal Inference on Discrete Data using Additive Noise Models

Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that whenever the joint distribution $P^{(X,Y)}$ admits such a model in one direction, e.g. $Y=f(X)+N$, $N \perp\!\!\!\perp X$, it does not admit the reversed model $X=g(Y)+\tilde N$, $\tilde N \perp\!\!\!\perp Y$, as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. In an extensive experimental study we show that this algorithm works both on synthetic and real data sets.
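
The asymmetry described above can be illustrated on a toy data set. The sketch below is not the algorithm proposed in the paper: as simplifying assumptions it fits the function by taking the most frequent effect value for each cause value, and it scores residual dependence with a plug-in mutual information estimate. The names `mutual_information` and `anm_residual_dependence` are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def mutual_information(pairs):
    """Plug-in mutual information (in nats) of a list of (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum(c / n * log(c * n / (pa[a] * pb[b])) for (a, b), c in joint.items())


def anm_residual_dependence(pairs):
    """Fit effect = f(cause) + noise with f(c) = most frequent effect given c,
    then return the mutual information between cause and residual."""
    by_cause = defaultdict(Counter)
    for cause, effect in pairs:
        by_cause[cause][effect] += 1
    f = {c: counts.most_common(1)[0][0] for c, counts in by_cause.items()}
    return mutual_information([(c, e - f[c]) for c, e in pairs])


# Toy data generated as Y = X**2 + N with N independent of X.
samples = [(x, x * x + n) for x in range(-2, 3) for n in (-1, 0, 0, 1)]

forward = anm_residual_dependence(samples)                         # model X -> Y
backward = anm_residual_dependence([(y, x) for x, y in samples])   # model Y -> X
```

In this example the forward residual is independent of the cause, while the backward residual is clearly dependent, so the direction with the smaller score would be preferred. A practical method would replace the raw score with a statistical independence test.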

preprint 2009 arXiv

On the entropy production of time series with unidirectional linearity

There are non-Gaussian time series that admit a causal linear autoregressive moving average (ARMA) model when regressing the future on the past, but not when regressing the past on the future. The reason is that, in the latter case, the regression residuals are only uncorrelated but not statistically independent of the future. In previous work, we have experimentally verified that many empirical time series indeed show such a time-inversion asymmetry. For various physical systems, it is known that time-inversion asymmetries are linked to the thermodynamic entropy production in non-equilibrium states. Here we show that such a link also exists for the above unidirectional linearity. We study the dynamical evolution of a physical toy system with linear coupling to an infinite environment and show that the linearity of the dynamics is inherited by the forward-time conditional probabilities, but not by the backward-time conditionals. The reason for this asymmetry between past and future is that the environment permanently provides particles that are in a product state before they interact with the system, but show statistical dependencies afterwards. From a coarse-grained perspective, the interaction thus generates entropy. We quantitatively relate the strength of the non-linearity of the backward conditionals to the minimal amount of entropy generation.
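
As a worked illustration, the asymmetry can be written out for the simplest case. This is a sketch under an assumption made here, a stationary AR(1) process with coefficient $a$ and autocovariance $\gamma(k)=a^{k}\gamma(0)$ for $k\ge 0$, rather than the general ARMA setting of the paper:

```latex
\begin{align*}
\text{forward:}\quad & X_t = a X_{t-1} + \varepsilon_t,
  & \varepsilon_t &\perp\!\!\!\perp (X_{t-1}, X_{t-2}, \dots),\\
\text{backward:}\quad & X_{t-1} = a X_t + \tilde\varepsilon_t,
  & \operatorname{Cov}(\tilde\varepsilon_t, X_{t+k}) &= \gamma(k+1) - a\,\gamma(k) = 0 \quad (k \ge 0).
\end{align*}
```

The vanishing covariance makes the backward residual $\tilde\varepsilon_t$ uncorrelated with the future values, but for non-Gaussian noise it is in general not statistically independent of them.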

preprint 2009 arXiv

Thermodynamic efficiency of information and heat flow

A basic task of information processing is information transfer (flow). Here we study a pair of Brownian particles, each coupled to a thermal bath at temperature $T_1$ and $T_2$, respectively. The information flow in such a system is defined via the time-shifted mutual information. The information flow vanishes at equilibrium, and its efficiency is defined as the ratio of the flow to the total entropy production in the system. For a stationary state the information flows from higher to lower temperatures, and its efficiency is bounded from above by $\frac{\max[T_1,T_2]}{|T_1-T_2|}$. This upper bound is imposed by the second law and it quantifies the thermodynamic cost for information flow in the present class of systems. It can be reached in the adiabatic situation, where the particles have widely different characteristic times. The efficiency of heat flow, defined as the heat flow over the total amount of dissipated heat, is limited from above by the same factor. There is a complementarity between heat and information flow: the setup which is most efficient for the former is the least efficient for the latter and vice versa. The above bound for the efficiency can be (transiently) overcome in certain non-stationary situations, but the efficiency is still limited from above. We study yet another measure of information processing (transfer entropy) proposed in the literature. Though this measure does not require any thermodynamic cost, the information flow and transfer entropy are shown to be intimately related for stationary states.
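
To get a feeling for the size of the stationary bound stated above, it can be evaluated numerically. This is a small sketch and the function name `efficiency_bound` is hypothetical. It only evaluates the formula from the abstract and does not simulate the Brownian particles.

```python
def efficiency_bound(t1: float, t2: float) -> float:
    """Upper bound max(T1, T2) / |T1 - T2| on the stationary efficiency."""
    if t1 <= 0 or t2 <= 0:
        raise ValueError("temperatures must be positive")
    if t1 == t2:
        raise ValueError("bound is undefined at equilibrium (T1 == T2)")
    return max(t1, t2) / abs(t1 - t2)


bound_far = efficiency_bound(300.0, 200.0)   # baths far apart in temperature
bound_near = efficiency_bound(300.0, 299.0)  # baths almost at the same temperature
```

The bound grows without limit as the two temperatures approach each other, which matches the statement that the information flow itself vanishes at equilibrium.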

preprint 2006 arXiv

A Quantum Broadcasting Problem in Classical Low Power Signal Processing

We pose a problem called "broadcasting Holevo-information": given an unknown state taken from an ensemble, the task is to generate a bipartite state transferring as much Holevo-information to each copy as possible. We argue that upper bounds on the average information over both copies imply lower bounds on the quantum capacity required to send the ensemble without information loss. This is because a channel with zero quantum capacity has a unitary extension transferring at least as much information to its environment as it transfers to the output. For an ensemble being the time orbit of a pure state under a Hamiltonian evolution, we derive such a bound on the required quantum capacity in terms of properties of the input and output energy distribution. Moreover, we discuss relations between the broadcasting problem and entropy power inequalities. The broadcasting problem arises when a signal should be transmitted by a time-invariant device such that the outgoing signal has the same timing information as the incoming signal had. Based on previous results we argue that this establishes a link between quantum information theory and the theory of low power computing because the loss of timing information implies loss of free energy.

preprint 2006 arXiv

Minimally-disturbing Heisenberg-Weyl symmetric measurements using hard-core collisions of Schrödinger particles

In a previous paper we have presented a general scheme for the implementation of symmetric generalized measurements (POVMs) on a quantum computer. This scheme is based on representation theory of groups and methods to decompose matrices that intertwine two representations. We extend this scheme in such a way that the measurement is minimally disturbing, i.e., it changes the state vector $|\Psi\rangle$ of the system to $\sqrt{\Pi}\,|\Psi\rangle$, where $\Pi$ is the positive operator corresponding to the measured result. Using this method, we construct quantum circuits for measurements with Heisenberg-Weyl symmetry. A continuous generalization leads to a scheme for optimal simultaneous measurements of position and momentum of a Schrödinger particle moving in one dimension such that the outcomes satisfy $\Delta x\,\Delta p \geq \hbar$. The particle to be measured collides with two probe particles, one for the position and the other for the momentum measurement. The position and momentum resolution can be tuned by the entangled joint state of the probe particles, which is also generated by a collision with hard-core potential. The parameters of the POVM can then be controlled by the initial widths of the wave functions of the probe particles. We point out some formal similarities and differences to simultaneous measurements of quadrature amplitudes in quantum optics.
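
For reference, the minimally disturbing state update can be written with explicit normalization. The outcome index $k$ and the symbol $p(k)$ are notation assumed here and do not appear in the abstract:

```latex
|\Psi\rangle \;\longmapsto\;
\frac{\sqrt{\Pi_k}\,|\Psi\rangle}{\sqrt{\langle\Psi|\Pi_k|\Psi\rangle}},
\qquad
p(k) = \langle\Psi|\Pi_k|\Psi\rangle,
\qquad
\sum_k \Pi_k = \mathbb{1}.
```

When every $\Pi_k$ is a projector, $\sqrt{\Pi_k}=\Pi_k$ and the rule reduces to the usual projective measurement update.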

preprint 2006 arXiv

Spin-1/2 particles moving on a 2D lattice with nearest-neighbor interactions can realize an autonomous quantum computer

What is the simplest Hamiltonian which can implement quantum computation without requiring any control operations during the computation process? In a previous paper we have constructed a 10-local finite-range interaction among qubits on a 2D lattice having this property. Here we show that pair-interactions among qutrits on a 2D lattice are sufficient, too, and can also implement an ergodic computer where the result can be read out from the time average state after some post-selection with high success probability. Two of the 3 qutrit states are given by the two levels of a spin-1/2 particle located at a specific lattice site, the third state is its absence. Usual hopping terms together with an attractive force among adjacent particles induce a coupled quantum walk where the particle spins are subjected to spatially inhomogeneous interactions implementing holonomic quantum computing. The holonomic method ensures that the implemented circuit does not depend on the time needed for the walk. Even though the implementation of the required type of spin-spin interactions is currently unclear, the model shows that quite simple Hamiltonians are powerful enough to allow for universal quantum computing in a closed physical system.

preprint 2005 arXiv

On Quantum A/D and D/A Conversion

An algorithm is proposed which transfers the quantum information of a wave function (analogue signal) into a register of qubits (digital signal) such that $n$ qubits describe the amplitudes and phases of $2^n$ points of a sufficiently smooth wave function. We assume that the continuous degree of freedom couples to one or more qubits of a quantum register via a Jaynes-Cummings Hamiltonian and that we have universal quantum computation capabilities on the register as well as the possibility to perform bang-bang control on the qubits. The transfer of information is mainly based on the application of the quantum phase-estimation algorithm in both directions. Here, the running time increases exponentially with the number of qubits. We pose it as an open question which interactions would allow polynomial running time. One example would be interactions which enable exact squeezing operations.
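
The target encoding, $n$ qubits holding the amplitudes and phases of $2^n$ sample points, can be made concrete with a classical computation. The sketch below only builds the amplitude vector that such a register would hold and does not model the quantum transfer procedure. The Gaussian wave packet, the sampling interval and the names `digitize` and `gaussian_packet` are assumptions chosen for illustration.

```python
import cmath
import math


def digitize(psi, n_qubits, x_min, x_max):
    """Sample psi at 2**n_qubits midpoints of [x_min, x_max] and normalize.

    Entry j of the result is the amplitude of the register basis state |j>.
    """
    m = 2 ** n_qubits
    dx = (x_max - x_min) / m
    samples = [psi(x_min + (j + 0.5) * dx) for j in range(m)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in samples))
    return [a / norm for a in samples]


def gaussian_packet(x, k=2.0):
    """Unnormalized Gaussian wave packet with mean momentum k (illustrative choice)."""
    return cmath.exp(-x * x / 2 + 1j * k * x)


amplitudes = digitize(gaussian_packet, n_qubits=3, x_min=-4.0, x_max=4.0)
total_probability = sum(abs(a) ** 2 for a in amplitudes)
```

With three qubits the register holds eight complex amplitudes. Each additional qubit doubles the number of sample points.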

preprint 2003 arXiv

Identity check is QMA-complete

We define the problem identity check: Given a classical description of a quantum circuit, determine whether it is almost equivalent to the identity. Explicitly, the task is to decide whether the corresponding unitary is close to a complex multiple of the identity matrix with respect to the operator norm. We show that this problem is QMA-complete. A generalization of this problem is equivalence check: Given two descriptions of quantum circuits and a description of a common invariant subspace, decide whether the restrictions of the circuits to this subspace almost coincide. We show that equivalence check is also in QMA and hence QMA-complete.
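
The quantity behind identity check can be computed by brute force for tiny instances. The sketch below assumes that the eigenphases of the circuit's unitary are already known, which is only realistic for very small circuits, and that the multiple of the identity has unit modulus. Because $U - c\,\mathbb{1}$ is normal, its operator norm is the largest distance between an eigenvalue of $U$ and $c$. The function name `distance_to_identity` and the grid search over phases are illustrative choices, not part of the paper.

```python
import cmath
import math


def distance_to_identity(eigenphases, grid=3600):
    """Approximate min over phi of || U - exp(i*phi) * I || in operator norm,
    where U has eigenvalues exp(i*theta) for theta in eigenphases."""
    best = float("inf")
    for j in range(grid):
        c = cmath.exp(2j * math.pi * j / grid)
        worst = max(abs(cmath.exp(1j * theta) - c) for theta in eigenphases)
        best = min(best, worst)
    return best


near = distance_to_identity([0.3, 0.3, 0.3])   # a global phase times the identity
far = distance_to_identity([0.0, math.pi])     # eigenvalues +1 and -1, e.g. Pauli Z
```

The first example is a multiple of the identity up to grid resolution, while the second stays at distance $\sqrt{2}$ from every unit-modulus multiple. The hardness result concerns circuits on many qubits, where this kind of direct computation is no longer feasible.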

43 published item(s)

Causal Forecasting: Generalization Bounds for Autoregressive Models

Causal Inference Through the Structural Causal Marginal Problem

Correcting Confounding via Random Selection of Background Variables

Explaining the root causes of unit-level changes

Obtaining Causal Information by Merging Datasets with MAXENT

Score matching enables causal discovery of nonlinear additive noise models

Testing Granger Non-Causality in Panels with Cross-Sectional Dependencies

A theory of independent mechanisms for extrapolation in generative models

Causal Consistency of Structural Equation Models

Algorithmic independence of initial condition and dynamical law in thermodynamics and causal inference

Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components

Distinguishing cause from effect using observational data: methods and benchmarks

Removing systematic errors for exoplanet search via latent causes

Telling cause from effect in deterministic linear dynamical systems

Causal Discovery with Continuous Additive Noise Models

Consistency of Causal Inference under the Additive Noise Model

From Ordinary Differential Equations to Structural Causal Models: the deterministic case

Inferring causal structure: a quantum advantage

Justifying Information-Geometric Causal Inference

Quantifying causal influences

From Ordinary Differential Equations to Structural Causal Models: the deterministic case

Identifying Finite Mixtures of Nonparametric Product Distributions and Causal Inference of Confounders

Causal Inference on Time Series using Structural Equation Models

Detecting low-complexity unobserved causes

Identifiability of Causal Graphs using Functional Models

Identifying confounders using additive noise models

Inferring deterministic causal relations

Invariant Gaussian Process Latent Variable Models and Application in Causal Discovery

Kernel-based Conditional Independence Test and Application in Causal Discovery

On Causal and Anticausal Learning

Testing whether linear equations are causal: A free probability theory approach

Robust Learning via Cause-Effect Models

Thermodynamic limits of dynamic cooling

Causal Markov condition for submodular information measures

Is there a physically universal cellular automaton or Hamiltonian?

Causal Inference on Discrete Data using Additive Noise Models

On the entropy production of time series with unidirectional linearity

Thermodynamic efficiency of information and heat flow

A Quantum Broadcasting Problem in Classical Low Power Signal Processing

Minimally-disturbing Heisenberg-Weyl symmetric measurements using hard-core collisions of Schrödinger particles

Spin-1/2 particles moving on a 2D lattice with nearest-neighbor interactions can realize an autonomous quantum computer

On Quantum A/D and D/A Conversion

Identity check is QMA-complete