Researcher profile

Weixin Yao

Weixin Yao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Nonparametric contaminated Gaussian mixture of regressions

Semi- and non-parametric mixture of regressions are a very useful flexible class of mixture of regressions in which some or all of the parameters are non-parametric functions of the covariates. These models are, however, based on the Gaussian assumption of the component error distributions. Thus, their estimation is sensitive to outliers and heavy-tailed error distributions. In this paper, we propose semi- and non-parametric contaminated Gaussian mixture of regressions to robustly estimate the parametric and/or non-parametric terms of the models in the presence of mild outliers. The virtue of using a contaminated Gaussian error distribution is that we can simultaneously perform model-based clustering of observations and model-based outlier detection. We propose two algorithms, an expectation-maximization (EM)-type algorithm and an expectation-conditional-maximization (ECM)-type algorithm, to perform maximum likelihood and local-likelihood kernel estimation of the parametric and non-parametric of the proposed models, respectively. The robustness of the proposed models is examined using an extensive simulation study. The practical utility of the proposed models is demonstrated using real data.

preprint2022arXiv

Supervised Dictionary Learning with Auxiliary Covariates

Supervised dictionary learning (SDL) is a classical machine learning method that simultaneously seeks feature extraction and classification tasks, which are not necessarily a priori aligned objectives. The goal of SDL is to learn a class-discriminative dictionary, which is a set of latent feature vectors that can well-explain both the features as well as labels of observed data. In this paper, we provide a systematic study of SDL, including the theory, algorithm, and applications of SDL. First, we provide a novel framework that `lifts' SDL as a convex problem in a combined factor space and propose a low-rank projected gradient descent algorithm that converges exponentially to the global minimizer of the objective. We also formulate generative models of SDL and provide global estimation guarantees of the true parameters depending on the hyperparameter regime. Second, viewed as a nonconvex constrained optimization problem, we provided an efficient block coordinate descent algorithm for SDL that is guaranteed to find an $\varepsilon$-stationary point of the objective in $O(\varepsilon^{-1}(\log \varepsilon^{-1})^{2})$ iterations. For the corresponding generative model, we establish a novel non-asymptotic local consistency result for constrained and regularized maximum likelihood estimation problems, which may be of independent interest. Third, we apply SDL for imbalanced document classification by supervised topic modeling and also for pneumonia detection from chest X-ray images. We also provide simulation studies to demonstrate that SDL becomes more effective when there is a discrepancy between the best reconstructive and the best discriminative dictionaries.

preprint2021arXiv

Pursuing Sources of Heterogeneity in Modeling Clustered Population

Researchers often have to deal with heterogeneous population with mixed regression relationships, increasingly so in the era of data explosion. In such problems, when there are many candidate predictors, it is not only of interest to identify the predictors that are associated with the outcome, but also to distinguish the true sources of heterogeneity, i.e., to identify the predictors that have different effects among the clusters and thus are the true contributors to the formation of the clusters. We clarify the concepts of the source of heterogeneity that account for potential scale differences of the clusters and propose a regularized finite mixture effects regression to achieve heterogeneity pursuit and feature selection simultaneously. As the name suggests, the problem is formulated under an effects-model parameterization, in which the cluster labels are missing and the effect of each predictor on the outcome is decomposed to a common effect term and a set of cluster-specific terms. A constrained sparse estimation of these effects leads to the identification of both the variables with common effects and those with heterogeneous effects. We propose an efficient algorithm and show that our approach can achieve both estimation and selection consistency. Simulation studies further demonstrate the effectiveness of our method under various practical scenarios. Three applications are presented, namely, an imaging genetics study for linking genetic factors and brain neuroimaging traits in Alzheimer's disease, a public health study for exploring the association between suicide risk among adolescents and their school district characteristics, and a sport analytics study for understanding how the salary levels of baseball players are associated with their performance and contractual status.

preprint2017arXiv

Fully Bayesian Classification with Heavy-tailed Priors for Selection in High-dimensional Features with Grouping Structure

Feature selection is demanded in many modern scientific research problems that use high-dimensional data. A typical example is to find the most useful genes that are related to a certain disease (eg, cancer) from high-dimensional gene expressions. The expressions of genes have grouping structures, for example, a group of co-regulated genes that have similar biological functions tend to have similar expressions. Many statistical methods have been proposed to take the grouping structure into consideration in feature selection, including group LASSO, supervised group LASSO, and regression on group representatives. In this paper, we propose a fully Bayesian Robit regression method with heavy-tailed (sparsity) priors (shortened by FBRHT) for selecting features with grouping structure. The main features of FBRHT include that it discards more aggressively unrelated features than LASSO, and it can make feature selection within groups automatically without a pre-specified grouping structure. In this paper, we use simulated and real datasets to demonstrate that the predictive power of the sparse feature subsets selected by FBRHT are comparable with other much larger feature subsets selected by LASSO, group LASSO, supervised group LASSO, penalized logistic regression and random forest, and that the succinct feature subsets selected by FBRHT have significantly better predictive power than the feature subsets of the same size taken from the top features selected by the aforementioned methods.