A Bayesian Approach for Selecting Relevant External Data (BASE): Application to a study of Long-Term Outcomes in a Hemophilia Gene Therapy Trial
Gene therapies aim to address the root causes of diseases, particularly those stemming from rare genetic defects that can be life-threatening or severely debilitating. Although an increasing number of gene therapies have received regulatory approvals in recent years, understanding their long-term efficacy in trials with limited follow-up time remains challenging. To address this critical question, we propose a novel Bayesian framework designed to selectively integrate relevant external data with internal trial data to improve the inference of the durability of long-term efficacy. We proved that the proposed method has desired theoretical properties, such as identifying and favoring external subsets deemed relevant, where the relevance is defined as the similarity, induced by the marginal likelihood, between the generating mechanisms of the internal data and the selected external data. We also conducted comprehensive simulations to evaluate its performance under various scenarios. Furthermore, we apply this method to predict and infer the endogenous factor IX (FIX) levels of patients who receive Etranacogene dezaparvovec over the long-term. Our estimated long-term FIX levels, validated by recent trial data, indicate that Etranacogene dezaparvovec induces sustained FIX production. Together, the theoretical findings, simulation results, and successful application of this framework underscore its potential to address similar long-term effectiveness estimation and inference questions in real world applications.