Researcher profile

Victor Veitch

Victor Veitch contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2022arXiv

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the challenges and opportunities in the application of causal inference to the textual domain, with its unique properties. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects with text, encompassing settings where text is used as an outcome, treatment, or to address confounding. In addition, we explore potential uses of causal inference to improve the robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the NLP community.

preprint2022arXiv

Invariant and Transportable Representations for Anti-Causal Domain Shifts

Real-world classification problems must contend with domain shift, the (potential) mismatch between the domain where a model is deployed and the domain(s) where the training data was gathered. Methods to handle such problems must specify what structure is common between the domains and what varies. A natural assumption is that causal (structural) relationships are invariant in all domains. Then, it is tempting to learn a predictor for label $Y$ that depends only on its causal parents. However, many real-world problems are "anti-causal" in the sense that $Y$ is a cause of the covariates $X$ -- in this case, $Y$ has no causal parents and the naive causal invariance is useless. In this paper, we study representation learning under a particular notion of domain shift that both respects causal invariance and that naturally handles the "anti-causal" structure. We show how to leverage the shared causal structure of the domains to learn a representation that both admits an invariant predictor and that also allows fast adaptation in new domains. The key is to translate causal assumptions into learning principles that disentangle "invariant" and "non-stable" features. Experiments on both synthetic and real-world data demonstrate the effectiveness of the proposed learning algorithm. Code is available at https://github.com/ybjiaang/ACTIR.

preprint2022arXiv

Using Embeddings for Causal Estimation of Peer Influence in Social Networks

We address the problem of using observational data to estimate peer contagion effects, the influence of treatments applied to individuals in a network on the outcomes of their neighbors. A main challenge to such estimation is that homophily - the tendency of connected units to share similar latent traits - acts as an unobserved confounder for contagion effects. Informally, it's hard to tell whether your friends have similar outcomes because they were influenced by your treatment, or whether it's due to some common trait that caused you to be friends in the first place. Because these common causes are not usually directly observed, they cannot be simply adjusted for. We describe an approach to perform the required adjustment using node embeddings learned from the network itself. The main aim is to perform this adjustment nonparametrically, without functional form assumptions on either the process that generated the network or the treatment assignment and outcome processes. The key contributions are to nonparametrically formalize the causal effect in a way that accounts for homophily, and to show how embedding methods can be used to identify and estimate this effect. Code is available at https://github.com/IrinaCristali/Peer-Contagion-on-Networks.

preprint2021arXiv

Bootstrap estimators for the tail-index and for the count statistics of graphex processes

Graphex processes resolve some pathologies in traditional random graph models, notably, providing models that are both projective and allow sparsity. Most of the literature on graphex processes study them from a probabilistic point of view. Techniques for inferring the parameter of these processes -- the so-called \textit{graphon} -- are still marginal; exceptions are a few papers considering parametric families of graphons. Nonparametric estimation remains unconsidered. In this paper, we propose estimators for a selected choice of functionals of the graphon. Our estimators originate from the subsampling theory for graphex processes, hence can be seen as a form of bootstrap procedure.

preprint2020arXiv

Adapting Text Embeddings for Causal Inference

Does adding a theorem to a paper affect its chance of acceptance? Does labeling a post with the author's gender affect the post popularity? This paper develops a method to estimate such causal effects from observational text data, adjusting for confounding features of the text such as the subject or writing quality. We assume that the text suffices for causal adjustment but that, in practice, it is prohibitively high-dimensional. To address this challenge, we develop causally sufficient embeddings, low-dimensional document representations that preserve sufficient information for causal identification and allow for efficient estimation of causal effects. Causally sufficient embeddings combine two ideas. The first is supervised dimensionality reduction: causal adjustment requires only the aspects of text that are predictive of both the treatment and outcome. The second is efficient language modeling: representations of text are designed to dispose of linguistically irrelevant information, and this information is also causally irrelevant. Our method adapts language models (specifically, word embeddings and topic models) to learn document embeddings that are able to predict both treatment and outcome. We study causally sufficient embeddings with semi-synthetic datasets and find that they improve causal estimation over related embedding methods. We illustrate the methods by answering the two motivating questions---the effect of a theorem on paper acceptance and the effect of a gender label on post popularity. Code and data available at https://github.com/vveitch/causal-text-embeddings-tf2}{github.com/vveitch/causal-text-embeddings-tf2

preprint2020arXiv

Sampling perspectives on sparse exchangeable graphs

Recent work has introduced sparse exchangeable graphs and the associated graphex framework, as a generalization of dense exchangeable graphs and the associated graphon framework. The development of this subject involves the interplay between the statistical modeling of network data, the theory of large graph limits, exchangeability, and network sampling. The purpose of the present paper is to clarify the relationships between these subjects by explaining each in terms of a certain natural sampling scheme associated with the graphex model. The first main technical contribution is the introduction of sampling convergence, a new notion of graph limit that generalizes left convergence so that it becomes meaningful for the sparse graph regime. The second main technical contribution is the demonstration that the (somewhat cryptic) notion of exchangeability underpinning the graphex framework is equivalent to a more natural probabilistic invariance expressed in terms of the sampling scheme.

preprint2020arXiv

Valid Causal Inference with (Some) Invalid Instruments

Instrumental variable methods provide a powerful approach to estimating causal effects in the presence of unobserved confounding. But a key challenge when applying them is the reliance on untestable "exclusion" assumptions that rule out any relationship between the instrument variable and the response that is not mediated by the treatment. In this paper, we show how to perform consistent IV estimation despite violations of the exclusion assumption. In particular, we show that when one has multiple candidate instruments, only a majority of these candidates---or, more generally, the modal candidate-response relationship---needs to be valid to estimate the causal effect. Our approach uses an estimate of the modal prediction from an ensemble of instrumental variable estimators. The technique is simple to apply and is "black-box" in the sense that it may be used with any instrumental variable estimator as long as the treatment effect is identified for each valid instrument independently. As such, it is compatible with recent machine-learning based estimators that allow for the estimation of conditional average treatment effects (CATE) on complex, high dimensional data. Experimentally, we achieve accurate estimates of conditional average treatment effects using an ensemble of deep network-based estimators, including on a challenging simulated Mendelian Randomization problem.