Source author record

Liwen Ouyang

Liwen Ouyang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

2 works
4 topics
3 close collaborators

Published work

2 published items

Preprint, 2022, arXiv

Maximum Mean Discrepancy for Generalization in the Presence of Distribution and Missingness Shift

Covariate shifts are a common problem in predictive modeling on real-world problems. This paper proposes addressing the covariate shift problem by minimizing Maximum Mean Discrepancy (MMD) statistics between the training and test sets in either feature input space, feature representation space, or both. We designed three techniques that we call MMD Representation, MMD Mask, and MMD Hybrid to deal with the scenarios where only a distribution shift exists, only a missingness shift exists, or both types of shift exist, respectively. We find that integrating an MMD loss component helps models use the best features for generalization and avoid dangerous extrapolation as much as possible for each test sample. Models treated with this MMD approach show better performance, calibration, and extrapolation on the test set.
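To make the quantity named in the abstract concrete, the following is a minimal sketch of a (biased) squared-MMD estimate between two samples. It is not the paper's MMD Representation, MMD Mask, or MMD Hybrid code; the Gaussian (RBF) kernel, the `bandwidth` parameter, and the function names `rbf_kernel` and `mmd2_biased` are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b (assumed kernel)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD: mean k(x,x) + mean k(y,y) - 2 * mean k(x,y)."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical data: a "training" sample and two candidate "test" samples.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 3))
test_same = rng.normal(size=(200, 3))            # same distribution as train
test_shifted = rng.normal(loc=2.0, size=(200, 3))  # mean-shifted covariates

mmd_same = mmd2_biased(train, test_same)
mmd_shifted = mmd2_biased(train, test_shifted)
```

In this sketch the estimate is close to zero when both samples come from the same distribution and grows when the covariates are shifted, which is the property that makes it usable as a loss term to be minimized during training.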

Preprint, 2015, arXiv

Designed Sampling from Large Databases for Controlled Trials

The increasing prevalence of rich sources of data and the availability of electronic medical record databases and electronic registries opens tremendous opportunities for enhancing medical research. For example, controlled trials are ubiquitously used to investigate the effect of a medical treatment, perhaps dependent on a set of patient covariates, and traditional approaches have relied primarily on randomized patient sampling and allocation to treatment and control group. However, when covariate data for a large cohort group of patients have already been collected and are available in a database, one can potentially design a treatment/control sample and allocation that provides far better estimates of the covariate-dependent effects of the treatment. In this paper, we develop a new approach that uses optimal design of experiments (DOE) concepts to accomplish this objective. The approach selects the patients for the treatment and control samples upfront, based on their covariate values, in a manner that optimizes the information content in the data. For the optimal sample selection, we develop simple guidelines and an optimization algorithm that provides solutions that are substantially better than random sampling. Moreover, our approach causes no sampling bias in the estimated effects, for the same reason that DOE principles do not bias estimated effects. We test our method with a simulation study based on a testbed data set containing information on the effect of statins on low-density lipoprotein (LDL) cholesterol.
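The idea of choosing patients by their covariate values so that the data are as informative as possible can be illustrated with a generic greedy D-optimal subset selection. This is a sketch under stated assumptions, not the optimization algorithm or guidelines developed in the paper: it uses a plain linear model with an intercept, ignores the treatment/control allocation step, and the names `greedy_d_optimal`, `log_det_information`, and the `ridge` parameter are hypothetical.

```python
import numpy as np


def greedy_d_optimal(X, k, ridge=1e-6):
    """Greedily pick k rows of X that increase det(X_S^T X_S) the most.

    Uses the matrix determinant lemma: adding row x multiplies the
    determinant of M by (1 + x^T M^{-1} x), so each step picks the
    unused row with the largest x^T M^{-1} x.
    """
    n, p = X.shape
    M = ridge * np.eye(p)  # small ridge keeps M invertible at the start
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(k):
        M_inv = np.linalg.inv(M)
        gain = np.einsum("ij,jk,ik->i", X, M_inv, X)
        gain[~available] = -np.inf  # never pick the same patient twice
        best = int(np.argmax(gain))
        chosen.append(best)
        available[best] = False
        M += np.outer(X[best], X[best])
    return chosen


def log_det_information(X, idx):
    """Log-determinant of the information matrix for the selected rows."""
    _, logdet = np.linalg.slogdet(X[idx].T @ X[idx])
    return logdet


# Hypothetical database: 200 patients, intercept plus two covariates.
rng = np.random.default_rng(1)
covariates = rng.normal(size=(200, 2))
X = np.column_stack([np.ones(200), covariates])

designed = greedy_d_optimal(X, k=10)
random_pick = rng.choice(200, size=10, replace=False)

designed_info = log_det_information(X, designed)
random_info = log_det_information(X, random_pick)
```

Comparing `designed_info` with `random_info` shows the kind of gain the abstract refers to: a sample chosen upfront from known covariates can carry more information than a random sample of the same size.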