Researcher profile

Federico Castelletti

Federico Castelletti contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
2topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

Bayesian graphical modeling for heterogeneous causal effects

Our motivation stems from current medical research aiming at personalized treatment using a molecular-based approach. The broad goal is to develop a more precise and targeted decision making process, relative to traditional treatments based primarily on clinical diagnoses. Specifically, we consider patients affected by Acute Myeloid Leukemia (AML), an hematological cancer characterized by uncontrolled proliferation of hematopoietic stem cells in the bone marrow. Because AML responds poorly to chemoterapeutic treatments, the development of targeted therapies is essential to improve patients' prospects. In particular, the dataset we analyze contains the levels of proteins involved in cell cycle regulation and linked to the progression of the disease. We analyse treatment effects within a causal framework represented by a Directed Acyclic Graph (DAG) model, whose vertices are the protein levels in the network. A major obstacle in implementing the above program is however represented by individual heterogeneity. We address this issue through a Dirichlet Process (DP) mixture of Gaussian DAG-models where both the graphical structure as well as the allied model parameters are regarded as uncertain. Our procedure determines a clustering structure of the units reflecting the underlying heterogeneity, and produces subject-specific estimates of causal effects based on Bayesian model averaging. With reference to the AML dataset, we identify different effects of protein regulation among individuals; moreover, our method clusters patients into groups that exhibit only mild similarities with traditional categories based on morphological features.

preprint2022arXiv

Bayesian sample size determination for causal discovery

Graphical models based on Directed Acyclic Graphs (DAGs) are widely used to answer causal questions across a variety of scientific and social disciplines. However, observational data alone cannot distinguish in general between DAGs representing the same conditional independence assertions (Markov equivalent DAGs); as a consequence the orientation of some edges in the graph remains indeterminate. Interventional data, produced by exogenous manipulations of variables in the network, enhance the process of structure learning because they allow to distinguish among equivalent DAGs, thus sharpening causal inference. Starting from an equivalence class of DAGs, a few procedures have been devised to produce a collection of variables to be manipulated in order to identify a causal DAG. Yet, these algorithmic approaches do not determine the sample size of the interventional data required to obtain a desired level of statistical accuracy. We tackle this problem from a Bayesian experimental design perspective, taking as input a sequence of target variables to be manipulated to identify edge orientation. We then propose a method to determine, at each intervention, the optimal sample size capable of producing a successful experiment based on a pre-experimental evaluation of the overall probability of substantial correct evidence.

preprint2022arXiv

BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs

Directed Acyclic Graphs (DAGs) provide a powerful framework to model causal relationships among variables in multivariate settings; in addition, through the do-calculus theory, they allow for the identification and estimation of causal effects between variables also from pure observational data. In this setting, the process of inferring the DAG structure from the data is referred to as causal structure learning or causal discovery. We introduce BCDAG, an R package for Bayesian causal discovery and causal effect estimation from Gaussian observational data, implementing the Markov chain Monte Carlo (MCMC) scheme proposed by Castelletti & Mascaro (2021). Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, with the number of variables in the dataset. The package also provides functions for convergence diagnostics and for visualizing and summarizing posterior inference. In this paper, we present the key features of the underlying methodology along with its implementation in BCDAG. We then illustrate the main functions and algorithms on both real and simulated datasets.

preprint2021arXiv

Equivalence class selection of categorical graphical models

Learning the structure of dependence relations between variables is a pervasive issue in the statistical literature. A directed acyclic graph (DAG) can represent a set of conditional independences, but different DAGs may encode the same set of relations and are indistinguishable using observational data. Equivalent DAGs can be collected into classes, each represented by a partially directed graph known as essential graph (EG). Structure learning directly conducted on the EG space, rather than on the allied space of DAGs, leads to theoretical and computational benefits. Still, the majority of efforts in the literature has been dedicated to Gaussian data, with less attention to methods designed for multivariate categorical data. We then propose a Bayesian methodology for structure learning of categorical EGs. Combining a constructive parameter prior elicitation with a graph-driven likelihood decomposition, we derive a closed-form expression for the marginal likelihood of a categorical EG model. Asymptotic properties are studied, and an MCMC sampler scheme developed for approximate posterior inference. We evaluate our methodology on both simulated scenarios and real data, with appreciable performance in comparison with state-of-the-art methods.

preprint2020arXiv

Bayesian causal inference in probit graphical models

We consider a binary response which is potentially affected by a set of continuous variables. Of special interest is the causal effect on the response due to an intervention on a specific variable. The latter can be meaningfully determined on the basis of observational data through suitable assumptions on the data generating mechanism. In particular we assume that the joint distribution obeys the conditional independencies (Markov properties) inherent in a Directed Acyclic Graph (DAG), and the DAG is given a causal interpretation through the notion of interventional distribution. We propose a DAG-probit model where the response is generated by discretization through a random threshold of a continuous latent variable and the latter, jointly with the remaining continuous variables, has a distribution belonging to a zero-mean Gaussian model whose covariance matrix is constrained to satisfy the Markov properties of the DAG. Our model leads to a natural definition of causal effect conditionally on a given DAG. Since the DAG which generates the observations is unknown, we present an efficient MCMC algorithm whose target is the posterior distribution on the space of DAGs, the Cholesky parameters of the concentration matrix, and the threshold linking the response to the latent. Our end result is a Bayesian Model Averaging estimate of the causal effect which incorporates parameter, as well as model, uncertainty. The methodology is assessed using simulation experiments and applied to a gene expression data set originating from breast cancer stem cells.