Source author record

Chung-Chou H. Chang

Chung-Chou H. Chang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, Methodology, Applications

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

preprint, 2022, arXiv

A Tree-based Model Averaging Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources

Accurately estimating personalized treatment effects within a study site (e.g., a hospital) has been challenging due to limited sample size. Furthermore, privacy considerations and lack of resources prevent a site from leveraging subject-level data from other sites. We propose a tree-based model averaging approach to improve the estimation accuracy of conditional average treatment effects (CATE) at a target site by leveraging models derived from other potentially heterogeneous sites, without those sites sharing subject-level data. To the best of our knowledge, there is no established model averaging approach for distributed data with a focus on improving the estimation of treatment effects. Specifically, under distributed data networks, our framework provides an interpretable tree-based ensemble of CATE estimators that joins models across study sites, while actively modeling the heterogeneity in data sources through site partitioning. The performance of this approach is demonstrated by a real-world study of the causal effects of oxygen therapy on hospital survival rate and backed up by comprehensive simulation results.
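The idea of combining site-level models without sharing subject-level data can be illustrated with a much simpler scheme than the tree-based ensemble the abstract describes. The sketch below is an assumption-laden simplification, not the authors' method: each site contributes only a fitted CATE function, and the target site assigns one global weight per site, inversely proportional to the mean squared gap between that site's predictions and the target's own (noisy) local estimate. The names `averaging_weights`, `averaged_cate` and the `eps` parameter are hypothetical.

```python
from typing import Callable, List, Sequence

CateModel = Callable[[float], float]  # maps a covariate value to a CATE estimate


def averaging_weights(
    site_models: Sequence[CateModel],
    local_model: CateModel,
    xs: Sequence[float],
    eps: float = 1e-8,
) -> List[float]:
    """Weight each site model by the inverse of its mean squared gap
    to the target site's local estimate, evaluated on target covariates xs.

    Assumption: a single global weight per site. The abstract's approach
    instead lets a tree partition sites and covariates.
    """
    scores = []
    for model in site_models:
        mse = sum((model(x) - local_model(x)) ** 2 for x in xs) / len(xs)
        scores.append(1.0 / (mse + eps))  # eps avoids division by zero
    total = sum(scores)
    return [s / total for s in scores]


def averaged_cate(
    site_models: Sequence[CateModel], weights: Sequence[float], x: float
) -> float:
    """Weighted average of the site models' CATE predictions at x."""
    return sum(w * m(x) for w, m in zip(weights, site_models))


# Toy usage: two external sites, one similar to the target and one not.
models = [lambda x: 2.0 * x, lambda x: -1.0 * x]
local = lambda x: 2.1 * x  # target site's own estimate
weights = averaging_weights(models, local, xs=[0.5, 1.0, 1.5])
print(weights, averaged_cate(models, weights, 1.0))
```

Only model functions cross site boundaries here, which mirrors the privacy constraint in the abstract; the weighting rule itself is a stand-in chosen for brevity.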

preprint, 2022, arXiv

Bayesian response adaptive randomization design with a composite endpoint of mortality and morbidity

Allocating patients to treatment arms during a trial based on the observed responses accumulated prior to the decision point, and sequentially adapting this allocation, could minimize the expected number of failures or maximize total benefit to patients. In this study, we developed a Bayesian response adaptive randomization (RAR) design targeting the endpoint of organ support-free days (OSFD) for patients admitted to the intensive care unit (ICU). The OSFD is a mixture of mortality and morbidity assessed by the number of days free of organ support within a predetermined time window post-randomization. In the past, researchers treated OSFD as an ordinal outcome variable where the lowest category is death. We propose a novel RAR design for a composite endpoint of mortality and morbidity, e.g., OSFD, by using a Bayesian mixture model with Markov chain Monte Carlo sampling to estimate the posterior probability of OSFD and determine treatment allocation ratios at each interim. Simulations were conducted to compare the performance of our proposed design under various randomization rules and different alpha spending functions. The results show that our RAR design using Bayesian inference allocated more patients to the better performing arm(s) compared to other existing adaptive rules while assuring adequate power and type I error rate control across a range of plausible clinical scenarios.
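To make the interim allocation step concrete, here is a minimal sketch of response adaptive randomization under a deliberately simplified model. It is not the design from the abstract: instead of a mixture model for OSFD fitted by MCMC, it assumes a binary success/failure outcome with a conjugate Beta(1, 1) prior per arm, and it sets each arm's allocation ratio to the Monte Carlo estimate of the posterior probability that the arm is best. The function name `allocation_ratios` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence


def allocation_ratios(
    successes: Sequence[int],
    failures: Sequence[int],
    draws: int = 5000,
    seed: int = 0,
) -> List[float]:
    """Estimate P(arm is best | data) per arm and use it as the allocation ratio.

    Assumptions: binary outcomes and independent Beta(1, 1) priors, so each
    arm's posterior is Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)  # local generator keeps the result reproducible
    wins = [0] * len(successes)
    for _ in range(draws):
        # One joint posterior draw of every arm's success probability.
        samples = [
            rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)
        ]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


# Interim look: arm 0 has 30/40 successes, arm 1 has 15/40.
print(allocation_ratios([30, 15], [10, 25]))
```

In practice such raw probabilities are often tempered or bounded so that no arm is starved of patients; that refinement, and the alpha spending functions mentioned in the abstract, are left out of this sketch.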

preprint, 2019, arXiv

Hybrid Density- and Partition-based Clustering Algorithm for Data with Mixed-type Variables

Clustering is an essential technique for discovering patterns in data. The steady increase in the amount and complexity of data over the years has led to improvements in existing clustering algorithms and the development of new ones. However, algorithms that can cluster data with mixed variable types (continuous and categorical) remain limited, despite the abundance of mixed-type data, particularly in the medical field. Among existing methods for mixed data, some rely on unverifiable distributional assumptions, and in others the contributions of different variable types are not well balanced. We propose a two-step hybrid density- and partition-based algorithm (HyDaP) that can detect clusters after variable selection. The first step uses both density-based and partition-based algorithms to identify the data structure formed by continuous variables and recognize the important variables for clustering; the second step uses a partition-based algorithm together with a novel dissimilarity measure we designed for mixed data to obtain clustering results. Simulations across various scenarios and data structures were conducted to examine the performance of the HyDaP algorithm compared to commonly used methods. We also applied the HyDaP algorithm to electronic health records to identify sepsis phenotypes.
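The abstract does not define the HyDaP dissimilarity measure, so the sketch below swaps in a generic Gower-style dissimilarity to show what a distance over mixed continuous and categorical variables can look like. Continuous variables contribute a range-scaled absolute difference and categorical variables contribute a 0/1 mismatch, with all variables weighted equally. The function name `mixed_dissimilarity` and its arguments are hypothetical.

```python
from typing import Dict, Sequence, Set


def mixed_dissimilarity(
    a: Sequence,
    b: Sequence,
    ranges: Dict[int, float],
    categorical: Set[int],
) -> float:
    """Gower-style dissimilarity between two records with mixed variable types.

    ranges maps each continuous column index to that column's observed range
    (max minus min, assumed positive); categorical holds the categorical
    column indices. This is a stand-in, not the HyDaP measure.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of variables")
    total = 0.0
    for j, (u, v) in enumerate(zip(a, b)):
        if j in categorical:
            total += 0.0 if u == v else 1.0  # simple matching for categories
        else:
            total += abs(u - v) / ranges[j]  # scaled to [0, 1] by the range
    return total / len(a)


# Two toy patient records: (lactate level, admission source).
rec1 = (1.0, "emergency")
rec2 = (3.0, "transfer")
print(mixed_dissimilarity(rec1, rec2, ranges={0: 4.0}, categorical={1}))
```

A pairwise matrix built from such a function can be fed to a partition-based method that accepts precomputed dissimilarities, such as k-medoids. Equal weighting across variable types is exactly the kind of balance issue the abstract raises, so treat this as a baseline for comparison.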