Researcher profile

Gianluca Baio

Gianluca Baio contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
10 works
0 followers
5 topics
4 close collaborators

Published work

10 published items

preprint, 2025, arXiv

A Bayesian hierarchical mixture cure modelling framework to utilize multiple survival datasets for long-term survivorship estimates: A case study from previously untreated metastatic melanoma

Time to an event of interest over a lifetime is a central measure of the clinical benefit of an intervention used in a health technology assessment (HTA). Within the same trial, multiple end-points may also be considered, for example overall and progression-free survival time for different drugs in oncology studies. A common challenge is when an intervention is only effective for some proportion of the population who are not clinically identifiable. Therefore, latent group membership as well as separate survival models for identified groups need to be estimated. However, follow-up in trials may be relatively short leading to substantial censoring. We present a general Bayesian hierarchical framework that can handle this complexity by exploiting the similarity of cure fractions between end-points; accounting for the correlation between them and improving the extrapolation beyond the observed data. Assuming exchangeability between cure fractions facilitates the borrowing of information between end-points. We undertake a comprehensive simulation study to evaluate the model performance under different scenarios. We also show the benefits of using our approach with a motivating example, the CheckMate 067 phase 3 trial consisting of patients with metastatic melanoma treated with first line therapy.
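A mixture cure model of the kind this abstract refers to is commonly written as S(t) = pi + (1 - pi) * S_u(t), where pi is the cure fraction and S_u is the survival function of the uncured group. The sketch below illustrates only that single-end-point formula, not the hierarchical framework of the paper. The exponential form for the uncured group, the function name and the parameter values are all illustrative assumptions.

```python
import math


def mixture_cure_survival(t, cure_fraction, uncured_rate):
    """Population survival under a mixture cure model.

    S(t) = pi + (1 - pi) * S_u(t), where S_u is assumed exponential
    here purely for illustration (hypothetical choice).
    """
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    if t < 0 or uncured_rate <= 0:
        raise ValueError("t must be >= 0 and uncured_rate must be > 0")
    uncured_survival = math.exp(-uncured_rate * t)
    return cure_fraction + (1.0 - cure_fraction) * uncured_survival


# Hypothetical values: 30% cured, hazard 0.5 per year among the uncured.
for years in (0, 1, 5, 20):
    print(years, round(mixture_cure_survival(years, 0.3, 0.5), 4))
```

Under these assumptions the curve starts at 1 and flattens towards the cure fraction, which is why the extrapolated tail is so sensitive to how well that fraction is estimated.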

preprint, 2022, arXiv

A Bayesian hierarchical model for improving exercise rehabilitation in mechanically ventilated ICU patients

Patients who are mechanically ventilated in the intensive care unit (ICU) participate in exercise as a component of their rehabilitation to ameliorate the long-term impact of critical illness on their physical function. The effective implementation of these programmes is hindered, however, by the lack of a scientific method for quantifying an individual patient's exercise intensity level in real time, which results in a broad one-size-fits-all approach to rehabilitation and sub-optimal patient outcomes. In this work we have developed a Bayesian hierarchical model with temporally correlated latent Gaussian processes to predict $\dot VO_2$, a physiological measure of exercise intensity, using readily available physiological data. Inference was performed using Integrated Nested Laplace Approximation. For practical use by clinicians $\dot VO_2$ was classified into exercise intensity categories. Internal validation using leave-one-patient-out cross-validation was conducted based on these classifications, and the role of probabilistic statements describing the classification uncertainty was investigated.

preprint, 2022, arXiv

BCEA: An R Package for Cost-Effectiveness Analysis

We describe in detail how to perform health economic cost-effectiveness analyses (CEA) using the R package $\textbf{BCEA}$ (Bayesian Cost-Effectiveness Analysis). CEA consists of analytic approaches for combining costs and health consequences of intervention(s). These help to understand how much an intervention may cost (per unit of health gained) compared to an alternative intervention, such as a control or status quo. For resource allocation, a decision maker may wish to know if an intervention is cost saving, and if not then how much more would it cost to implement it compared to a less effective intervention. Current guidance for cost-effectiveness analyses advocates the quantification of uncertainties which can be represented by random samples obtained from a probabilistic sensitivity analysis or, more efficiently, a Bayesian model. $\textbf{BCEA}$ can be used to post-process the sampled costs and health impacts to perform advanced analyses producing standardised and highly customisable outputs. We present the features of the package, including its many functions and their practical application. $\textbf{BCEA}$ is valuable for statisticians and practitioners working in the field of health economic modelling wanting to simplify and standardise their workflow, for example in the preparation of dossiers in support of marketing authorisation, or academic and scientific publications.
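The "cost per unit of health gained" idea in this abstract can be illustrated with two standard summaries computed from sampled incremental costs and effects: the incremental cost-effectiveness ratio (ICER) and the probability that the incremental net monetary benefit is positive at a given willingness-to-pay. The sketch below is plain Python and does not use or reproduce the BCEA API. The function names and the sample values are hypothetical.

```python
from statistics import mean


def icer(delta_effects, delta_costs):
    """Ratio of mean incremental cost to mean incremental effect."""
    return mean(delta_costs) / mean(delta_effects)


def prob_cost_effective(delta_effects, delta_costs, willingness_to_pay):
    """Share of samples with positive incremental net monetary benefit.

    For each sample, INB = willingness_to_pay * delta_effect - delta_cost.
    """
    net_benefits = [
        willingness_to_pay * e - c
        for e, c in zip(delta_effects, delta_costs)
    ]
    return sum(nb > 0 for nb in net_benefits) / len(net_benefits)


# Hypothetical simulated increments (intervention minus comparator).
delta_e = [0.1, 0.2, 0.3]    # e.g. QALYs gained
delta_c = [1000, 2000, 9000]  # e.g. additional cost

print(icer(delta_e, delta_c))
print(prob_cost_effective(delta_e, delta_c, willingness_to_pay=25000))
```

In practice the increments would be thousands of draws from a probabilistic sensitivity analysis or a Bayesian model rather than three hand-picked numbers.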

preprint, 2022, arXiv

Blended Survival Curves: A New Approach to Extrapolation for Time-to-Event Outcomes from Clinical Trial in Health Technology Assessment

Background: Survival extrapolation is essential in the cost-effectiveness analysis to quantify the lifetime survival benefit associated with a new intervention, due to the restricted duration of randomized controlled trials (RCTs). Current approaches of extrapolation often assume that the treatment effect observed in the trial can continue indefinitely, which is unrealistic and may have a huge impact on decisions for resource allocation. Objective: We introduce a novel methodology as a possible solution to alleviate the problem of performing survival extrapolation with heavily censored data from clinical trials. Method: The main idea is to mix a flexible model (e.g., Cox semi-parametric) to fit as well as possible the observed data and a parametric model encoding assumptions on the expected behaviour of underlying long-term survival. The two are "blended" into a single survival curve that is identical with the Cox model over the range of observed times and gradually approaching the parametric model over the extrapolation period based on a weight function. The weight function regulates the way two survival curves are blended, determining how the internal and external sources contribute to the estimated survival over time. Results: A 4-year follow-up RCT of rituximab in combination with fludarabine and cyclophosphamide v. fludarabine and cyclophosphamide alone for the first-line treatment of chronic lymphocytic leukemia is used to illustrate the method. Conclusion: Long-term extrapolation from immature trial data may lead to significantly different estimates with various modelling assumptions. The blending approach provides sufficient flexibility, allowing a wide range of plausible scenarios to be considered as well as the inclusion of genuine external information, based e.g. on hard data or expert opinion. Both internal and external validity can be carefully examined.
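The blending idea in this abstract can be sketched as a weighted combination of two survival curves, with a weight that is 0 over the observed period and rises to 1 over a blending interval. The abstract does not state the weight function or the exact combination rule, so both are assumptions here: a smoothstep weight and a geometric combination S_obs(t)^(1 - w) * S_ext(t)^w. The two exponential curves and the interval end-points are hypothetical.

```python
import math


def blending_weight(t, start, end):
    """Weight rising smoothly from 0 (t <= start) to 1 (t >= end).

    A smoothstep polynomial is used purely for illustration.
    """
    if t <= start:
        return 0.0
    if t >= end:
        return 1.0
    x = (t - start) / (end - start)
    return x * x * (3.0 - 2.0 * x)


def blended_survival(t, s_obs, s_ext, start, end):
    """Geometric blend of an internal and an external survival curve."""
    w = blending_weight(t, start, end)
    return s_obs(t) ** (1.0 - w) * s_ext(t) ** w


# Hypothetical curves: a model fitted to trial data and an external one.
def s_trial(t):
    return math.exp(-0.1 * t)


def s_external(t):
    return math.exp(-0.3 * t)


# Trial follow-up assumed to end at 4 years; blending complete by year 10.
for years in (2, 4, 7, 10, 15):
    print(years, round(blended_survival(years, s_trial, s_external, 4, 10), 4))
```

With this choice the blended curve coincides with the trial-based curve up to year 4 and with the external curve from year 10 onwards. Moving the interval or changing the weight shape is how different long-term scenarios would be explored.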

preprint, 2022, arXiv

Interpretable Deep Causal Learning for Moderation Effects

In this extended abstract paper, we address the problem of interpretability and targeted regularization in causal machine learning models. In particular, we focus on the problem of estimating individual causal/treatment effects under observed confounders, which can be controlled for and moderate the effect of the treatment on the outcome of interest. Black-box ML models adjusted for the causal setting perform generally well in this task, but they lack interpretable output identifying the main drivers of treatment heterogeneity and their functional relationship. We propose a novel deep counterfactual learning architecture for estimating individual treatment effects that can simultaneously: i) convey targeted regularization on, and quantify uncertainty around the quantity of interest (i.e., the Conditional Average Treatment Effect); ii) disentangle baseline prognostic and moderating effects of the covariates and output interpretable score functions describing their relationship with the outcome. Finally, we demonstrate the use of the method via a simple simulated experiment.

preprint, 2022, arXiv

Marginalization of Regression-Adjusted Treatment Effects in Indirect Comparisons with Limited Patient-Level Data

Population adjustment methods such as matching-adjusted indirect comparison (MAIC) are increasingly used to compare marginal treatment effects when there are cross-trial differences in effect modifiers and limited patient-level data. MAIC is sensitive to poor covariate overlap and cannot extrapolate beyond the observed covariate space. Current outcome regression-based alternatives can extrapolate but target a conditional treatment effect that is incompatible in the indirect comparison. When adjusting for covariates, one must integrate or average the conditional estimate over the population of interest to recover a compatible marginal treatment effect. We propose a marginalization method based on parametric G-computation that can be easily applied where the outcome regression is a generalized linear model or a Cox model. In addition, we introduce a novel general-purpose method based on multiple imputation, which we term multiple imputation marginalization (MIM) and which is applicable to a wide range of models. Both methods can accommodate a Bayesian statistical framework, which naturally integrates the analysis into a probabilistic framework. A simulation study provides proof-of-principle for the methods and benchmarks their performance against MAIC and the conventional outcome regression. The marginalized outcome regression approaches achieve more precise and more accurate estimates than MAIC, particularly when covariate overlap is poor, and yield unbiased marginal treatment effect estimates under no failures of assumptions. Furthermore, the marginalized regression-adjusted estimates provide greater precision and accuracy than the conditional estimates produced by the conventional outcome regression, which are systematically biased because the measure of effect is non-collapsible.
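The step of averaging a conditional estimate over the population of interest can be illustrated with a small G-computation sketch for a logistic outcome model: predict each subject's outcome probability under treatment and under control, average each set over the target covariates, and contrast the averages on the log-odds scale. This is not the paper's implementation. The coefficients are hypothetical stand-ins for a fitted model, and the target covariates are simulated.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_log_odds_ratio(beta0, beta_x, beta_trt, covariates):
    """Marginal log odds ratio by G-computation for a logistic model.

    Assumed outcome model: logit P(Y=1) = beta0 + beta_x * x + beta_trt * trt.
    """
    n = len(covariates)
    p_treated = sum(expit(beta0 + beta_x * x + beta_trt) for x in covariates) / n
    p_control = sum(expit(beta0 + beta_x * x) for x in covariates) / n
    return logit(p_treated) - logit(p_control)


# Hypothetical target population and hypothetical fitted coefficients.
rng = random.Random(0)
target_covariates = [rng.gauss(0.0, 1.0) for _ in range(5000)]
conditional_log_or = 1.0

marginal = marginal_log_odds_ratio(-0.5, 2.0, conditional_log_or, target_covariates)
print("conditional:", conditional_log_or)
print("marginal:   ", round(marginal, 3))
```

Because the odds ratio is non-collapsible, the marginal value differs from the conditional coefficient even though there is no confounding in this setup. The two coincide when the covariate has no effect on the outcome (`beta_x = 0`).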

preprint, 2021, arXiv

Effect modification in anchored indirect treatment comparisons: Comments on "Matching-adjusted indirect comparisons: Application to time-to-event data"

This commentary regards a recent simulation study conducted by Aouni, Gaudel-Dedieu and Sebastien, evaluating the performance of different versions of matching-adjusted indirect comparison (MAIC) in an anchored scenario with a common comparator. The simulation study uses survival outcomes and the Cox proportional hazards regression as the outcome model. It concludes that using the LASSO for variable selection is preferable to balancing a maximal set of covariates. However, there are no treatment effect modifiers in imbalance in the study. The LASSO is more efficient because it selects a subset of the maximal set of covariates but there are no cross-study imbalances in effect modifiers inducing bias. We highlight the following points: (1) in the anchored setting, MAIC is necessary where there are cross-trial imbalances in effect modifiers; (2) the standard indirect comparison provides greater precision and accuracy than MAIC if there are no effect modifiers in imbalance; (3) while the target estimand of the simulation study is a conditional treatment effect, MAIC targets a marginal or population-average treatment effect; (4) in MAIC, variable selection is a problem of low dimensionality and sparsity-inducing methods like the LASSO may be problematic. Finally, data-driven approaches do not obviate the necessity for subject matter knowledge when selecting effect modifiers. R code is provided in the Appendix to replicate the analyses and illustrate our points.

preprint, 2020, arXiv

Dirichlet Process Mixture Models for Regression Discontinuity Designs

The Regression Discontinuity Design (RDD) is a quasi-experimental design that estimates the causal effect of a treatment when its assignment is defined by a threshold value for a continuous assignment variable. The RDD assumes that subjects with measurements within a bandwidth around the threshold belong to a common population, so that the threshold can be seen as a randomising device assigning treatment to those falling just above the threshold and withholding it from those who fall just below. Bandwidth selection represents a compelling decision for the RDD analysis as the results may be highly sensitive to its choice. A number of methods to select the optimal bandwidth, mainly originating from the econometric literature, have been proposed. However, their use in practice is limited. We propose a methodology that, tackling the problem from an applied point of view, considers units' exchangeability, i.e., their similarity with respect to measured covariates, as the main criterion to select subjects for the analysis, irrespective of their distance from the threshold. We carry out clustering on the sample using a Dirichlet process mixture model to identify balanced and homogeneous clusters. Our proposal exploits the posterior similarity matrix, which contains the pairwise probabilities that two observations are allocated to the same cluster in the MCMC sample. Thus we include in the RDD analysis only those clusters for which we have stronger evidence of exchangeability. We illustrate the validity of our methodology with both a simulated experiment and a motivating example on the effect of statins to lower cholesterol level, using UK primary care data.

preprint, 2020, arXiv

Joint longitudinal models for dealing with missing at random data in trial-based economic evaluations

Health economic evaluations based on patient-level data collected alongside clinical trials (e.g. health-related quality of life and resource use measures) are an important component of the process which informs resource allocation decisions. Almost inevitably, the analysis is complicated by the fact that some individuals drop out from the study, which causes their data to be unobserved at some time point. Current practice performs the evaluation by handling the missing data at the level of aggregated variables (e.g. QALYs), which are obtained by combining the economic data over the duration of the study, and are often conducted under a missing at random (MAR) assumption. However, this approach may lead to incorrect inferences since it ignores the longitudinal nature of the data and may end up discarding a considerable amount of observations from the analysis. We propose the use of joint longitudinal models to extend standard cost-effectiveness analysis methods by taking into account the longitudinal structure and incorporating all available data to improve the estimation of the targeted quantities under MAR. Our approach is compared to popular missingness approaches in trial-based analyses, motivated by an exploratory simulation study, and applied to data from two real case studies.

preprint, 2019, arXiv

Calculating the Expected Value of Sample Information in Practice: Considerations from Three Case Studies

Investing efficiently in future research to improve policy decisions is an important goal. Expected Value of Sample Information (EVSI) can be used to select the specific design and sample size of a proposed study by assessing the benefit of a range of different studies. Estimating EVSI with the standard nested Monte Carlo algorithm has a notoriously high computational burden, especially when using a complex decision model or when optimizing over study sample sizes and designs. Therefore, a number of more efficient EVSI approximation methods have been developed. However, these approximation methods have not been compared and therefore their relative advantages and disadvantages are not clear. A consortium of EVSI researchers, including the developers of several approximation methods, compared four EVSI methods using three previously published health economic models. The examples were chosen to represent a range of real-world contexts, including situations with multiple study outcomes, missing data, and data from an observational rather than a randomized study. The computational speed and accuracy of each method were compared, and the relative advantages and implementation challenges of the methods were highlighted. In each example, the approximation methods took minutes or hours to achieve reasonably accurate EVSI estimates, whereas the traditional Monte Carlo method took weeks. Specific methods are particularly suited to problems where we wish to compare multiple proposed sample sizes, when the proposed sample size is large, or when the health economic model is computationally expensive. All the evaluated methods gave estimates similar to those given by traditional Monte Carlo, suggesting that EVSI can now be efficiently computed with confidence in realistic examples.