Source author record

Victor Veitch

Victor Veitch appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, quant-ph, math.CO, math.ST, Social and Information Networks, Statistics Theory, Computation and Language, Artificial Intelligence, econ.EM, math.PR, Methodology

Catalog footprint

What is connected

14 works, 11 topics, 4 close collaborators


preprint, 2022, arXiv

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the challenges and opportunities in the application of causal inference to the textual domain, with its unique properties. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects with text, encompassing settings where text is used as an outcome, treatment, or to address confounding. In addition, we explore potential uses of causal inference to improve the robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the NLP community.

preprint, 2022, arXiv

Invariant and Transportable Representations for Anti-Causal Domain Shifts

Real-world classification problems must contend with domain shift, the (potential) mismatch between the domain where a model is deployed and the domain(s) where the training data was gathered. Methods to handle such problems must specify what structure is common between the domains and what varies. A natural assumption is that causal (structural) relationships are invariant in all domains. Then, it is tempting to learn a predictor for label $Y$ that depends only on its causal parents. However, many real-world problems are "anti-causal" in the sense that $Y$ is a cause of the covariates $X$ -- in this case, $Y$ has no causal parents and the naive causal invariance is useless. In this paper, we study representation learning under a particular notion of domain shift that both respects causal invariance and that naturally handles the "anti-causal" structure. We show how to leverage the shared causal structure of the domains to learn a representation that both admits an invariant predictor and that also allows fast adaptation in new domains. The key is to translate causal assumptions into learning principles that disentangle "invariant" and "non-stable" features. Experiments on both synthetic and real-world data demonstrate the effectiveness of the proposed learning algorithm. Code is available at https://github.com/ybjiaang/ACTIR.

preprint, 2022, arXiv

Using Embeddings for Causal Estimation of Peer Influence in Social Networks

We address the problem of using observational data to estimate peer contagion effects, the influence of treatments applied to individuals in a network on the outcomes of their neighbors. A main challenge to such estimation is that homophily - the tendency of connected units to share similar latent traits - acts as an unobserved confounder for contagion effects. Informally, it's hard to tell whether your friends have similar outcomes because they were influenced by your treatment, or whether it's due to some common trait that caused you to be friends in the first place. Because these common causes are not usually directly observed, they cannot be simply adjusted for. We describe an approach to perform the required adjustment using node embeddings learned from the network itself. The main aim is to perform this adjustment nonparametrically, without functional form assumptions on either the process that generated the network or the treatment assignment and outcome processes. The key contributions are to nonparametrically formalize the causal effect in a way that accounts for homophily, and to show how embedding methods can be used to identify and estimate this effect. Code is available at https://github.com/IrinaCristali/Peer-Contagion-on-Networks.

preprint, 2021, arXiv

Bootstrap estimators for the tail-index and for the count statistics of graphex processes

Graphex processes resolve some pathologies in traditional random graph models, notably by providing models that are both projective and allow sparsity. Most of the literature on graphex processes studies them from a probabilistic point of view. Techniques for inferring the parameter of these processes -- the so-called graphon -- are still marginal; exceptions are a few papers considering parametric families of graphons. Nonparametric estimation remains unconsidered. In this paper, we propose estimators for a selected choice of functionals of the graphon. Our estimators originate from the subsampling theory for graphex processes, hence can be seen as a form of bootstrap procedure.

preprint, 2020, arXiv

Adapting Text Embeddings for Causal Inference

Does adding a theorem to a paper affect its chance of acceptance? Does labeling a post with the author's gender affect the post popularity? This paper develops a method to estimate such causal effects from observational text data, adjusting for confounding features of the text such as the subject or writing quality. We assume that the text suffices for causal adjustment but that, in practice, it is prohibitively high-dimensional. To address this challenge, we develop causally sufficient embeddings, low-dimensional document representations that preserve sufficient information for causal identification and allow for efficient estimation of causal effects. Causally sufficient embeddings combine two ideas. The first is supervised dimensionality reduction: causal adjustment requires only the aspects of text that are predictive of both the treatment and outcome. The second is efficient language modeling: representations of text are designed to dispose of linguistically irrelevant information, and this information is also causally irrelevant. Our method adapts language models (specifically, word embeddings and topic models) to learn document embeddings that are able to predict both treatment and outcome. We study causally sufficient embeddings with semi-synthetic datasets and find that they improve causal estimation over related embedding methods. We illustrate the methods by answering the two motivating questions: the effect of a theorem on paper acceptance and the effect of a gender label on post popularity. Code and data are available at https://github.com/vveitch/causal-text-embeddings-tf2.

preprint, 2020, arXiv

Sampling perspectives on sparse exchangeable graphs

Recent work has introduced sparse exchangeable graphs and the associated graphex framework, as a generalization of dense exchangeable graphs and the associated graphon framework. The development of this subject involves the interplay between the statistical modeling of network data, the theory of large graph limits, exchangeability, and network sampling. The purpose of the present paper is to clarify the relationships between these subjects by explaining each in terms of a certain natural sampling scheme associated with the graphex model. The first main technical contribution is the introduction of sampling convergence, a new notion of graph limit that generalizes left convergence so that it becomes meaningful for the sparse graph regime. The second main technical contribution is the demonstration that the (somewhat cryptic) notion of exchangeability underpinning the graphex framework is equivalent to a more natural probabilistic invariance expressed in terms of the sampling scheme.

preprint, 2020, arXiv

Valid Causal Inference with (Some) Invalid Instruments

Instrumental variable methods provide a powerful approach to estimating causal effects in the presence of unobserved confounding. But a key challenge when applying them is the reliance on untestable "exclusion" assumptions that rule out any relationship between the instrument variable and the response that is not mediated by the treatment. In this paper, we show how to perform consistent IV estimation despite violations of the exclusion assumption. In particular, we show that when one has multiple candidate instruments, only a majority of these candidates---or, more generally, the modal candidate-response relationship---needs to be valid to estimate the causal effect. Our approach uses an estimate of the modal prediction from an ensemble of instrumental variable estimators. The technique is simple to apply and is "black-box" in the sense that it may be used with any instrumental variable estimator as long as the treatment effect is identified for each valid instrument independently. As such, it is compatible with recent machine-learning based estimators that allow for the estimation of conditional average treatment effects (CATE) on complex, high dimensional data. Experimentally, we achieve accurate estimates of conditional average treatment effects using an ensemble of deep network-based estimators, including on a challenging simulated Mendelian Randomization problem.
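The modal-ensemble idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the deep-network estimators for a simple single-instrument Wald ratio and takes the mode of a Gaussian kernel density over the per-instrument estimates. The function names, the bandwidth rule, and the synthetic data (seven candidate instruments, three of which violate exclusion) are all assumptions made for illustration.

```python
import numpy as np

def wald_estimate(z, t, y):
    """Single-instrument IV (Wald) estimate: cov(z, y) / cov(z, t)."""
    return np.cov(z, y)[0, 1] / np.cov(z, t)[0, 1]

def modal_estimate(estimates, bandwidth=None):
    """Mode of a Gaussian kernel density fitted to the per-instrument estimates."""
    est = np.asarray(estimates, dtype=float)
    if bandwidth is None:
        # Assumed rule of thumb: Silverman-style bandwidth on a robust (MAD) scale.
        mad = np.median(np.abs(est - np.median(est)))
        bandwidth = max(1.06 * 1.4826 * mad * est.size ** (-0.2), 1e-3)
    grid = np.linspace(est.min(), est.max(), 2001)
    kernel = np.exp(-0.5 * ((grid[:, None] - est[None, :]) / bandwidth) ** 2)
    return float(grid[np.argmax(kernel.sum(axis=1))])

# Hypothetical synthetic data: the last three instruments affect y directly.
rng = np.random.default_rng(0)
n, beta = 20_000, 1.5                       # beta is the true causal effect
gammas = np.array([0.0, 0.0, 0.0, 0.0, 2.0, -1.5, 3.0])  # exclusion violations
Z = rng.normal(size=(n, gammas.size))       # independent candidate instruments
u = rng.normal(size=n)                      # unobserved confounder
t = Z.sum(axis=1) + u + rng.normal(size=n)  # treatment
y = beta * t + Z @ gammas + 2.0 * u + rng.normal(size=n)  # response

estimates = [wald_estimate(Z[:, j], t, y) for j in range(gammas.size)]
beta_modal = modal_estimate(estimates)      # driven by the agreeing majority
beta_mean = float(np.mean(estimates))       # pulled away by invalid instruments
print(f"modal: {beta_modal:.3f}  mean: {beta_mean:.3f}  truth: {beta}")
```

In this toy setup each invalid instrument yields an estimate shifted by its own direct effect, so the invalid estimates disagree with one another while the valid ones cluster around the true value; that clustering is what the mode picks up and the mean does not.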

preprint, 2016, arXiv

Sampling and Estimation for (Sparse) Exchangeable Graphs

Sparse exchangeable graphs on $\mathbb{R}_+$, and the associated graphex framework for sparse graphs, generalize exchangeable graphs on $\mathbb{N}$, and the associated graphon framework for dense graphs. We develop the graphex framework as a tool for statistical network analysis by identifying the sampling scheme that is naturally associated with the models of the framework, and by introducing a general consistent estimator for the parameter (the graphex) underlying these models. The sampling scheme is a modification of independent vertex sampling that throws away vertices that are isolated in the sampled subgraph. The estimator is a dilation of the empirical graphon estimator, which is known to be a consistent estimator for dense exchangeable graphs; both can be understood as graph analogues to the empirical distribution in the i.i.d. sequence setting. Our results may be viewed as a generalization of consistent estimation via the empirical graphon from the dense graph regime to also include sparse graphs.
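The sampling scheme described here (independent vertex sampling followed by discarding vertices that are isolated in the sampled subgraph) can be sketched as follows. This is an illustrative sketch only: the edge-list representation, the function name `p_sample`, and the use of Python's `random` module are assumptions, not details taken from the paper.

```python
import random

def p_sample(edges, p, rng=None):
    """Keep each vertex independently with probability p and return the
    induced subgraph with its isolated vertices removed.

    The graph is given as a list of (u, v) edges, so vertices that lose all
    their neighbours simply disappear from the result.
    """
    rng = rng or random.Random()
    vertices = {v for edge in edges for v in edge}
    # Sorting makes the draw order, and hence a seeded result, reproducible.
    kept = {v for v in sorted(vertices) if rng.random() < p}
    sub_edges = [(u, v) for (u, v) in edges if u in kept and v in kept]
    sub_vertices = {v for edge in sub_edges for v in edge}
    return sub_vertices, sub_edges

# Hypothetical toy graph: a path 0-1-2-3 plus a separate edge 4-5.
toy_edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
sampled_vertices, sampled_edges = p_sample(toy_edges, p=0.5, rng=random.Random(1))
print(sampled_vertices, sampled_edges)
```

Representing the graph by its edge set mirrors the fact that the scheme never reports isolated vertices: a vertex appears in the output only if at least one of its edges survives.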

preprint, 2015, arXiv

The Class of Random Graphs Arising from Exchangeable Random Measures

We introduce a class of random graphs that we argue meets many of the desiderata one would demand of a model to serve as the foundation for a statistical analysis of real-world networks. The class of random graphs is defined by a probabilistic symmetry: invariance of the distribution of each graph to arbitrary relabelings of its vertices. In particular, following Caron and Fox, we interpret a symmetric simple point process on $\mathbb{R}_+^2$ as the edge set of a random graph, and formalize the probabilistic symmetry as joint exchangeability of the point process. We give a representation theorem for the class of random graphs satisfying this symmetry via a straightforward specialization of Kallenberg's representation theorem for jointly exchangeable random measures on $\mathbb{R}_+^2$. The distribution of every such random graph is characterized by three (potentially random) components: a nonnegative real $I \in \mathbb{R}_+$, an integrable function $S: \mathbb{R}_+ \to \mathbb{R}_+$, and a symmetric measurable function $W: \mathbb{R}_+^2 \to [0,1]$ that satisfies several weak integrability conditions. We call the triple $(I,S,W)$ a graphex, in analogy to graphons, which characterize the (dense) exchangeable graphs on $\mathbb{N}$. Indeed, the model we introduce here contains the exchangeable graphs as a special case, as well as the "sparse exchangeable" model of Caron and Fox. We study the structure of these random graphs, and show that they can give rise to interesting structure, including sparse graph sequences. We give explicit equations for expectations of certain graph statistics, as well as the limiting degree distribution. We also show that certain families of graphexes give rise to random graphs that, asymptotically, contain an arbitrarily large fraction of the vertices in a single connected component.

preprint, 2014, arXiv

Contextuality supplies the magic for quantum computation

Quantum computers promise dramatic advantages over their classical counterparts, but the answer to the most basic question "What is the source of the power in quantum computing?" has remained elusive. Here we prove a remarkable equivalence between the onset of contextuality and the possibility of universal quantum computation via magic state distillation. This is a conceptually satisfying link because contextuality provides one of the fundamental characterizations of uniquely quantum phenomena and, moreover, magic state distillation is the leading model for experimentally realizing fault-tolerant quantum computation. Furthermore, this connection suggests a unifying paradigm for the resources of quantum information: the nonlocality of quantum theory is a particular kind of contextuality and nonlocality is already known to be a critical resource for achieving advantages with quantum communication. In addition to clarifying these fundamental issues, this work advances the resource framework for quantum computation, which has a number of practical applications, such as characterizing the efficiency and trade-offs between distinct theoretical and experimental schemes for achieving robust quantum computation and bounding the overhead cost for the classical simulation of quantum algorithms.

preprint, 2013, arXiv

The Resource Theory of Stabilizer Computation

Recent results on the non-universality of fault-tolerant gate sets underline the critical role of resource states, such as magic states, to power scalable, universal quantum computation. Here we develop a resource theory, analogous to the theory of entanglement, for resources for stabilizer codes. We introduce two quantitative measures - monotones - for the amount of non-stabilizer resource. As an application we give absolute bounds on the efficiency of magic state distillation. One of these monotones is the sum of the negative entries of the discrete Wigner representation of a quantum state, thereby resolving a long-standing open question of whether the degree of negativity in a quasi-probability representation is an operationally meaningful indicator of quantum behaviour.
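As a rough illustration of the Wigner-negativity monotone mentioned above, the sketch below computes a discrete Wigner function for a single system of odd dimension and reports the magnitude of the sum of its negative entries. The abstract does not fix a convention, so the phase-point operators used here (displaced parity operators, $A_{q,p}|x\rangle = \omega^{2p(q-x)}|2q-x\rangle$ with $\omega = e^{2\pi i/d}$) are an assumption, as are all function names and the two example qutrit states.

```python
import numpy as np

def phase_point_operators(d):
    """Phase-point operators A[q, p] for odd d (assumed convention:
    the parity operator conjugated by discrete displacements)."""
    omega = np.exp(2j * np.pi / d)
    ops = np.zeros((d, d, d, d), dtype=complex)
    for q in range(d):
        for p in range(d):
            for x in range(d):
                # A_{q,p} |x> = omega^{2p(q - x)} |2q - x>
                ops[q, p, (2 * q - x) % d, x] = omega ** ((2 * p * (q - x)) % d)
    return ops

def wigner(rho):
    """Discrete Wigner function W[q, p] = Tr(A_{q,p} rho) / d."""
    d = rho.shape[0]
    ops = phase_point_operators(d)
    return np.einsum("qpij,ji->qp", ops, rho).real / d

def sum_negativity(rho):
    """Magnitude of the sum of the negative Wigner entries."""
    w = wigner(rho)
    return float(-w[w < 0].sum())

def projector(vec):
    """Density matrix |v><v| of a normalised copy of vec."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Qutrit examples: a computational basis (stabilizer) state and a
# non-stabilizer state chosen for illustration.
rho_zero = projector([1, 0, 0])
rho_odd = projector([0, 1, -1])
print(sum_negativity(rho_zero), sum_negativity(rho_odd))
```

Under this convention the basis state has an entirely non-negative Wigner function, while the antisymmetric state picks up a negative entry at the origin, which is the kind of negativity the monotone is meant to quantify.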

preprint, 2013, arXiv

The whole is greater than the sum of the parts: on the possibility of purely statistical interpretations of quantum theory

The Pusey-Barrett-Rudolph theorem (PBR) claims to rule out the possibility of a purely statistical interpretation of the quantum state under an assumption of how to represent independent operations in any hidden variable model. We show that PBR's assumption of independence encodes an assumption of local causality, which is already known to conflict with the predictions of quantum theory via Bell-type inequalities. We devise a weaker formulation of independence within a general hidden variable model that is empirically indistinguishable from the PBR assumption in situations where certain hidden variables are inaccessible. Under this weaker principle we are able to construct an explicit hidden variable model that is purely statistical and also reproduces the quantum predictions. Our results suggest that the assumption of a purely statistical interpretation is actually an innocent bystander in the PBR argument, rather than the driving force behind their contradiction.

preprint, 2012, arXiv

Efficient simulation scheme for a class of quantum optics experiments with non-negative Wigner representation

We provide a scheme for efficient simulation of a broad class of quantum optics experiments. Our efficient simulation extends the continuous variable Gottesman-Knill theorem to a large class of non-Gaussian mixed states, thereby identifying that these non-Gaussian states are not an enabling resource for exponential quantum speed-up. Our results also provide an operationally motivated interpretation of negativity as non-classicality. We apply our scheme to the case of noisy single-photon-added-thermal-states to show that this class admits states with positive Wigner function but negative P-function that are not useful resource states for quantum computation.

preprint, 2012, arXiv

Negative Quasi-Probability as a Resource for Quantum Computation

A central problem in quantum information is to determine the minimal physical resources that are required for quantum computational speedup and, in particular, for fault-tolerant quantum computation. We establish a remarkable connection between the potential for quantum speed-up and the onset of negative values in a distinguished quasi-probability representation, a discrete analog of the Wigner function for quantum systems of odd dimension. This connection allows us to resolve an open question on the existence of bound states for magic-state distillation: we prove that there exist mixed states outside the convex hull of stabilizer states that cannot be distilled to non-stabilizer target states using stabilizer operations. We also provide an efficient simulation protocol for Clifford circuits that extends to a large class of mixed states, including bound universal states.
