Researcher profile

Wenyang Zhang

Wenyang Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
3 topics
4 close collaborators

Published work

3 published items

preprint, 2026, arXiv

Minimax Optimal Robust Sparse Regression with Heavy-Tailed Designs: A Gradient-Based Approach

We investigate high-dimensional sparse regression when both the noise and the design matrix exhibit heavy-tailed behavior. Standard algorithms typically fail in this regime, as heavy-tailed covariates distort the empirical risk geometry. We propose a unified framework, Robust Iterative Gradient descent with Hard Thresholding (RIGHT), which employs a robust gradient estimator to bypass the need for higher-order moment conditions. Our analysis reveals a fundamental decoupling phenomenon: in linear regression, the estimation error rate is governed by the noise tail index, while the sample complexity required for stability is governed by the design tail index. This implies that while heavy-tailed noise limits precision, heavy-tailed designs primarily raise the sample size barrier for convergence. In contrast, for logistic regression, we show that the bounded gradient naturally robustifies the estimator against heavy-tailed designs, restoring standard parametric rates. We derive matching minimax lower bounds to prove that RIGHT achieves optimal estimation accuracy and sample complexity across these regimes, without requiring sample splitting or the existence of the population risk.
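
The general pattern the abstract describes (a gradient step using a robust gradient estimate, followed by hard thresholding to a fixed sparsity level) can be sketched as follows. This is a minimal illustration, not the paper's RIGHT procedure: the abstract does not specify the robust gradient estimator, so a coordinate-wise trimmed mean of per-sample squared-loss gradients is assumed here as a stand-in, and the function names, step size, and trimming fraction are all hypothetical choices.

```python
def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    keep = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)[:s]
    out = [0.0] * len(v)
    for j in keep:
        out[j] = v[j]
    return out


def trimmed_mean(values, trim_frac):
    """Mean after dropping the trim_frac smallest and largest values."""
    vals = sorted(values)
    # Never trim so much that nothing is left.
    k = min(int(trim_frac * len(vals)), (len(vals) - 1) // 2)
    core = vals[k:len(vals) - k]
    return sum(core) / len(core)


def robust_gradient(X, y, beta, trim_frac):
    """Coordinate-wise trimmed mean of per-sample squared-loss gradients.

    Assumption: this trimmed mean is a stand-in for whatever robust
    gradient estimator the paper actually uses.
    """
    n, d = len(X), len(beta)
    residuals = [sum(X[i][j] * beta[j] for j in range(d)) - y[i] for i in range(n)]
    return [
        trimmed_mean([residuals[i] * X[i][j] for i in range(n)], trim_frac)
        for j in range(d)
    ]


def robust_iht(X, y, s, step=0.5, iters=100, trim_frac=0.05):
    """Iterate: robust gradient step, then hard-threshold to s nonzeros."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        g = robust_gradient(X, y, beta, trim_frac)
        beta = hard_threshold([b - step * gj for b, gj in zip(beta, g)], s)
    return beta
```

Calling `robust_iht(X, y, s=2)` on a list-of-rows design `X` and response list `y` returns a coefficient list with at most two nonzero entries. The sketch makes no claim about the rates or sample-complexity guarantees stated in the abstract.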

preprint, 2022, arXiv

Model Averaging based Semiparametric Modelling for Conditional Quantile Prediction

In real data analysis, the underlying model is usually unknown, so the modelling strategy plays a key role in the success of the analysis. Stimulated by the idea of model averaging, we propose a novel semiparametric modelling strategy for conditional quantile prediction that does not assume the underlying model is any specific parametric or semiparametric model. Thanks to the optimality of the weights selected by cross-validation, the proposed modelling strategy yields more accurate predictions than those based on some commonly used semiparametric models, such as varying coefficient models and additive models. Asymptotic properties of the proposed modelling strategy, together with its estimation procedure, are established. Intensive simulation studies are conducted to demonstrate how well the proposed method works compared with its alternatives under various circumstances. The results show that the proposed method indeed leads to more accurate predictions than its alternatives. Finally, the proposed modelling strategy and its prediction procedure are applied to the Boston housing data, yielding more accurate predictions of the quantiles of house prices than some commonly used alternative methods and therefore a more accurate picture of the housing market in Boston.
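
The idea of weighting candidate quantile predictors by held-out performance can be sketched with the standard check (pinball) loss. This is a simplified illustration under stated assumptions, not the paper's procedure: it combines only two candidate predictors, searches the weight on a grid, and scores against a single held-out set rather than full cross-validation. The names `pinball_loss` and `best_weight` are hypothetical.

```python
def pinball_loss(y, pred, tau):
    """Average check loss rho_tau(u) = u * (tau - 1[u < 0]), with u = y - pred."""
    total = 0.0
    for yi, pi in zip(y, pred):
        u = yi - pi
        total += u * (tau - (1.0 if u < 0 else 0.0))
    return total / len(y)


def best_weight(y, pred_a, pred_b, tau, grid_size=101):
    """Grid-search the weight w in [0, 1] placed on predictor A.

    The combined prediction is w * pred_a + (1 - w) * pred_b, and the
    weight minimizing the held-out check loss at level tau is returned
    together with that loss.
    """
    best_w, best_loss = 0.0, float("inf")
    for k in range(grid_size):
        w = k / (grid_size - 1)
        combo = [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]
        loss = pinball_loss(y, combo, tau)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss
```

Here `pred_a` and `pred_b` would be held-out quantile predictions from two candidate models (for example, a varying coefficient fit and an additive fit), and the returned weight favors whichever predicts the `tau`-th quantile better on data not used for fitting.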

preprint, 2020, arXiv

Estimation and Inference for Multi-Kink Quantile Regression

The Multi-Kink Quantile Regression (MKQR) model is an important tool for analyzing data with heterogeneous conditional distributions, especially when quantiles of the response variable are of interest, owing to its robustness to outliers and heavy-tailed errors in the response. It assumes different linear quantile regression forms in different regions of the domain of the threshold covariate, with the regression function remaining continuous at the kink points. In this paper, we investigate parameter estimation, kink point detection and statistical inference in MKQR models. We propose an iterative segmented quantile regression algorithm for estimating both the regression coefficients and the locations of kink points. The proposed algorithm is much more computationally efficient than the grid search algorithm and is not sensitive to the choice of initial values. Asymptotic properties, such as selection consistency of the number of kink points and asymptotic normality of the estimators of both regression coefficients and kink effects, are established to justify the proposed method theoretically. A score test, based on partial subgradients, is developed to test whether kink effects exist. Test-inversion confidence intervals for kink location parameters are also constructed. Intensive simulation studies show that the proposed methods work very well in finite samples. Finally, we apply the MKQR models together with the proposed methods to a dataset on the secondary industrial structure of China and a dataset on triceps skinfold thickness of Gambian females, which leads to some very interesting findings. A new R package, MultiKink, is developed to implement the proposed methods.
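
A conditional quantile that is linear within each region and continuous at the kink points can be written with hinge terms. The sketch below assumes the common parameterization q(x) = intercept + slope * x + sum over k of effect_k * max(x - kink_k, 0) for a scalar threshold covariate x. The abstract does not state the exact form, and any additional covariates are omitted, so treat the function name and signature as hypothetical.

```python
def kink_quantile(x, intercept, slope, kinks, kink_effects):
    """Continuous piecewise-linear conditional quantile in a scalar covariate x.

    Each kink location adds a hinge term, so the slope changes by the
    corresponding kink effect once x passes that location, while the
    function itself stays continuous there.
    """
    value = intercept + slope * x
    for location, effect in zip(kinks, kink_effects):
        value += effect * max(x - location, 0.0)
    return value
```

With kinks at 1 and 3 and kink effects -1 and 4 on a base slope of 2, the slope is 2 below the first kink, 1 between the kinks, and 5 above the second. A kink effect of zero means no change in slope at that location, which is the situation the score test described in the abstract is designed to detect.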