Researcher profile

Kwun Chuen Gary Chan

Published work

4 published items

Preprint, 2026, arXiv

Dimension-reduced outcome-weighted learning for estimating individualized treatment regimes in observational studies

Individualized treatment regimes (ITRs) aim to improve clinical outcomes by assigning treatment based on patient-specific characteristics. However, existing methods often struggle with high-dimensional covariates, limiting accuracy, interpretability, and real-world applicability. We propose a novel sufficient dimension reduction approach that directly targets the contrast between potential outcomes and identifies a low-dimensional subspace of the covariates capturing treatment effect heterogeneity. This reduced representation enables more accurate estimation of optimal ITRs through outcome-weighted learning. To accommodate observational data, our method incorporates kernel-based covariate balancing, allowing treatment assignment to depend on the full covariate set and avoiding the restrictive assumption that the subspace sufficient for modeling heterogeneous treatment effects is also sufficient for confounding adjustment. We show that the proposed method achieves universal consistency, i.e., its risk converges to the Bayes risk, under mild regularity conditions. We demonstrate its finite sample performance through simulations and an analysis of intensive care unit sepsis patient data to determine who should receive transthoracic echocardiography.

Preprint, 2021, arXiv

Defining and Estimating Subgroup Mediation Effects with Semi-Competing Risks Data

In many medical studies, an ultimate failure event such as death is likely to be affected by the occurrence and timing of other intermediate clinical events. Both event times are subject to censoring by loss to follow-up, and the nonterminal event may further be censored by the occurrence of the primary outcome, but not vice versa. To study the effect of an intervention on both events, the intermediate event may be viewed as a mediator, but the conventional definitions of direct and indirect effects are not applicable due to the semi-competing risks data structure. We define three principal strata based on whether the potential intermediate event occurs before the potential failure event, which allows proper definition of direct and indirect effects in one stratum, whereas total effects are defined for all strata. We discuss the identification conditions for stratum-specific effects and propose a semiparametric estimator based on a multivariate logistic stratum membership model and within-stratum proportional hazards models for the event times. By treating the unobserved stratum membership as a latent variable, we propose an EM algorithm for computation. We study the asymptotic properties of the estimators using modern empirical process theory and examine the performance of the estimators in numerical studies.

Preprint, 2021, arXiv

Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing

We study nonparametric estimation of the partially conditional average treatment effect, defined as the treatment effect function over a subset of confounders of interest. We propose a hybrid kernel weighting estimator where the weights aim to control the balancing error of any function of the confounders from a reproducing kernel Hilbert space after kernel smoothing over the subset of variables of interest. In addition, we present an augmented version of our estimator that can incorporate estimates of outcome mean functions. Based on the representer theorem, gradient-based algorithms can be applied to solve the corresponding infinite-dimensional optimization problem. Asymptotic properties are studied without any smoothness assumptions on the propensity score function or the need for data splitting, relaxing certain existing stringent assumptions. The numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of a mother's smoking on a baby's birth weight conditioned on the mother's age.

Preprint, 2020, arXiv

Controlling the False Discovery Rate for Binary Feature Selection via Knockoff

Variable selection has been widely used in data analysis for decades, and it has become increasingly important in the Big Data era, as there are usually hundreds of variables available in a dataset. To enhance the interpretability of a model, identifying potentially relevant features is often a step before fitting all the features into a regression model. A good variable selection method should effectively control the fraction of false discoveries and ensure sufficient power of its selection set. In many contemporary data applications, a large portion of features are coded as binary variables. Binary features are widespread in many fields, from online controlled experiments to genome science to physical statistics. Although there has recently been a handful of papers on provable false discovery rate (FDR) control in variable selection, most of the theoretical analyses were based on some strong dependency assumption or Gaussian assumption among features. In this paper we propose a variable selection method in a regression framework for selecting binary features. Under mild conditions, we show that FDR is controlled exactly under a target level in a finite sample if the underlying distribution of the binary features is known. We show in simulations that FDR control is still attained when the feature distribution is estimated from data. We also provide theoretical results on the power of our variable selection method in a linear regression model or a logistic regression model. In the restricted settings where competitors exist, we show in simulations and a real data application on an HIV antiretroviral therapy dataset that our method has higher power than the competitor.