Source author record

Bowei Yan

Bowei Yan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Methodology, Machine Learning

Catalog footprint: 3 works, 2 topics, 4 close collaborators


Preprint, 2022, arXiv

Probabilistic Best Subset Selection via Gradient-Based Optimization

In high-dimensional statistics, variable selection recovers the latent sparse patterns from all possible covariate combinations. This paper proposes a novel optimization method to solve the exact L0-regularized regression problem, also known as best subset selection. We reformulate the optimization problem from a discrete space to a continuous one via probabilistic reparameterization. The new objective function is differentiable, but its gradient often cannot be computed in closed form. We therefore propose a family of unbiased gradient estimators to optimize the best subset selection objective with stochastic gradient descent. Within this family, we identify the estimator with uniformly minimum variance. Theoretically, we study the general conditions under which the method is guaranteed to converge to the ground truth in expectation. The proposed method can find the true regression model from thousands of covariates in seconds. On a wide variety of synthetic and semi-synthetic data, the proposed method outperforms existing variable selection tools based on relaxed penalties, coordinate descent, and mixed integer optimization in both sparse pattern recovery and out-of-sample prediction.
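To make the reparameterization idea concrete, here is a minimal Python sketch under stated assumptions. It replaces the binary inclusion vector z with independent Bernoulli draws governed by inclusion probabilities pi, and uses the generic score-function (REINFORCE) estimator, which is unbiased for the gradient of the expected loss. This is not the paper's minimum-variance estimator, and the function names (subset_loss, score_function_grad, fit_inclusion_probs), the per-subset least-squares refit, and the clipping bounds are all illustrative assumptions rather than details taken from the abstract.

```python
import itertools
import numpy as np

def subset_loss(z, X, y, lam):
    """Mean squared error of least squares on the selected columns, plus lam * |subset|."""
    idx = np.flatnonzero(z)
    if idx.size == 0:
        resid = y
    else:
        beta, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        resid = y - X[:, idx] @ beta
    return float(resid @ resid) / len(y) + lam * idx.size

def bernoulli_prob(z, pi):
    """Probability of the binary vector z under independent Bernoulli(pi)."""
    return float(np.prod(np.where(z == 1, pi, 1.0 - pi)))

def expected_loss(pi, X, y, lam):
    """Exact continuous objective E_z[loss(z)], by enumeration (small p only)."""
    total = 0.0
    for bits in itertools.product([0, 1], repeat=len(pi)):
        z = np.array(bits)
        total += bernoulli_prob(z, pi) * subset_loss(z, X, y, lam)
    return total

def score_function_grad(z, pi, X, y, lam):
    """Single-sample unbiased estimate of d E[loss] / d pi (score-function form)."""
    score = (z - pi) / (pi * (1.0 - pi))  # gradient of log p(z; pi) w.r.t. pi
    return subset_loss(z, X, y, lam) * score

def fit_inclusion_probs(X, y, lam, steps=200, lr=0.05, seed=0):
    """Plain stochastic gradient descent on pi; hypothetical settings, no variance reduction."""
    rng = np.random.default_rng(seed)
    pi = np.full(X.shape[1], 0.5)
    for _ in range(steps):
        z = (rng.random(pi.shape) < pi).astype(int)
        pi = pi - lr * score_function_grad(z, pi, X, y, lam)
        pi = np.clip(pi, 0.05, 0.95)  # keep probabilities away from 0 and 1
    return pi
```

The enumeration in expected_loss is only feasible for a handful of covariates; it is included so the unbiasedness of the single-sample estimator can be checked exactly on a toy problem. The plain score-function estimator tends to have high variance, which is the kind of issue a minimum-variance estimator is meant to address.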

Preprint, 2020, arXiv

Graph-Fused Multivariate Regression via Total Variation Regularization

In this paper, we propose Graph-Fused Multivariate Regression (GFMR) via total variation regularization, a novel method for estimating the association between a one-dimensional or multidimensional array outcome and scalar predictors. Although we were motivated by data from neuroimaging and physical activity tracking, the methodology is designed and presented in a generalizable format and is applicable to many other areas of scientific research. The estimator is the solution of a penalized regression problem in which the objective is the sum of squared errors plus a total variation (TV) regularization on the predicted mean across all subjects. We propose an algorithm for parameter estimation that is efficient and scalable on a distributed computing platform. A proof of the algorithm's convergence is provided, and the statistical consistency of the estimator is established via an oracle inequality. We present 1D and 2D simulation results and demonstrate that GFMR outperforms existing methods in most cases. We also demonstrate the general applicability of the method with two real data examples: the analysis of the 1D accelerometry subsample of a large community-based study of mood disorders, and the analysis of the 3D MRI data from the attention-deficit/hyperactivity disorder (ADHD) 200 consortium.
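The penalized objective described above can be sketched as follows. This is only an illustration of "sum of squared errors plus a TV penalty on the predicted mean", assuming an outcome matrix Y (subjects by array positions), scalar predictors X, a coefficient matrix B, and a graph given as an edge list over array positions. The names chain_edges, tv_penalty and penalized_objective are hypothetical, and the sketch evaluates the objective only; it does not implement the paper's estimation algorithm.

```python
import numpy as np

def chain_edges(d):
    """Edges of a 1D chain graph over d array positions (assumed neighborhood structure)."""
    return [(i, i + 1) for i in range(d - 1)]

def tv_penalty(M, edges):
    """Total variation: sum over rows and graph edges (i, j) of |M[:, i] - M[:, j]|."""
    M = np.atleast_2d(M)
    return float(sum(np.abs(M[:, i] - M[:, j]).sum() for i, j in edges))

def penalized_objective(Y, X, B, edges, lam):
    """Sum of squared errors plus lam times the TV of the predicted mean X @ B."""
    fitted = X @ B
    sse = float(((Y - fitted) ** 2).sum())
    return sse + lam * tv_penalty(fitted, edges)
```

For a 2D or 3D outcome, the same functions would apply once the array is flattened and the edge list encodes the grid neighbors.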

Preprint, 2016, arXiv

On Robustness of Kernel Clustering

Clustering is one of the most important unsupervised problems in machine learning and statistics. Among many existing algorithms, kernel k-means has drawn much research attention due to its ability to find non-linear cluster boundaries and its inherent simplicity. There are two main approaches to kernel k-means: SVD of the kernel matrix (K-SVD) and convex relaxations. Despite the attention kernel clustering has received from both theoretical and applied quarters, not much is known about the robustness of these methods. In this paper we first introduce a semidefinite programming (SDP) relaxation for the kernel clustering problem, then prove that, under a suitable model specification, both the K-SVD and SDP approaches are consistent in the limit. SDP is strongly consistent, i.e. it achieves exact recovery, whereas K-SVD is weakly consistent, i.e. the fraction of misclassified nodes vanishes.
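As a rough illustration of the spectral route mentioned above, the following Python sketch builds a Gaussian kernel matrix, takes its top-k eigenvectors (for a symmetric positive semidefinite kernel matrix this matches the leading singular vectors), and runs Lloyd's k-means on the rows of the resulting embedding. The Gaussian kernel, the bandwidth parameter, the farthest-point initialization and all function names are assumptions made for the example; the SDP relaxation studied in the paper is not shown.

```python
import numpy as np

def gaussian_kernel(X, bandwidth):
    """Kernel matrix with entries exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def top_eigenvectors(K, k):
    """Eigenvectors for the k largest eigenvalues of a symmetric matrix."""
    _, vecs = np.linalg.eigh(K)  # eigh returns eigenvalues in ascending order
    return vecs[:, -k:]

def lloyd_kmeans(U, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels

def kernel_spectral_clustering(X, k, bandwidth=1.0):
    """Cluster the rows of X via the top-k eigenvectors of the kernel matrix."""
    U = top_eigenvectors(gaussian_kernel(X, bandwidth), k)
    return lloyd_kmeans(U, k)
```

Cluster labels are only identified up to permutation, so comparisons against a ground truth should check the induced partition rather than the label values.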