Researcher profile

Huiming Zhang

Huiming Zhang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
7 works
0 followers
7 topics
4 close collaborators

Published work

7 published items

preprint · 2022 · arXiv

Sharper Sub-Weibull Concentrations

Constant-specified and exponential concentration inequalities play an essential role in the finite-sample theory of machine learning and high-dimensional statistics. We obtain sharper, constant-specified concentration inequalities for the sum of independent sub-Weibull random variables, which lead to a mixture of two tails: sub-Gaussian for small deviations and sub-Weibull for large deviations from the mean. These bounds are new and improve existing bounds with sharper constants. In addition, a new sub-Weibull parameter is proposed, which enables recovering the tight concentration inequality for a random variable (vector). For statistical applications, we give an $\ell_2$-error of estimated coefficients in negative binomial regressions when the heavy-tailed covariates are sub-Weibull distributed with sparse structures, which is a new result for negative binomial regressions. In an application to random matrices, we derive non-asymptotic versions of Bai-Yin's theorem for sub-Weibull entries with exponential tail bounds. Finally, by demonstrating a sub-Weibull confidence region for a log-truncated Z-estimator without the second-moment condition, we discuss and define the sub-Weibull type robust estimator for independent observations $\{X_i\}_{i=1}^{n}$ without exponential-moment conditions.

preprint · 2021 · arXiv

A Unified Light Framework for Real-time Fault Detection of Freight Train Images

Real-time fault detection for freight trains plays a vital role in guaranteeing the security and optimal operation of railway transportation under stringent resource requirements. Despite the promising results of deep learning based approaches, the performance of these fault detectors on freight train images is far from satisfactory in both accuracy and efficiency. This paper proposes a unified light framework to improve detection accuracy while supporting real-time operation with a low resource requirement. We first design a novel lightweight backbone (RFDNet) to improve accuracy and reduce computational cost. Then, we propose a multi region proposal network using multi-scale feature maps generated from RFDNet to improve the detection performance. Finally, we present multi level position-sensitive score maps and region of interest pooling to further improve accuracy with few redundant computations. Extensive experimental results on public benchmark datasets suggest that RFDNet can significantly improve the performance of the baseline network, with higher accuracy and efficiency. Experiments on six fault datasets show that our method is capable of real-time detection at over 38 frames per second and achieves competitive accuracy with lower computation than state-of-the-art detectors.

preprint · 2021 · arXiv

Elastic-net Regularized High-dimensional Negative Binomial Regression: Consistency and Weak Signals Detection

We study a sparse negative binomial regression (NBR) for count data by showing the non-asymptotic advantages of using the elastic-net estimator. Two types of oracle inequalities are derived for the NBR's elastic-net estimates by using the Compatibility Factor Condition and the Stabil Condition. The second type of oracle inequality is for the random design and can be extended to many $\ell_1 + \ell_2$ regularized M-estimations, with the corresponding empirical process having stochastic Lipschitz properties. We derive a concentration inequality for the suprema of empirical processes for the weighted sum of negative binomial variables to show that certain events hold with high probability. As a first application, we show sign consistency, provided that the nonzero components in the true sparse vector are larger than a proper choice of the weakest signal detection threshold. In the second application, we show the grouping effect inequality with high probability. Third, under some assumptions on the design matrix, we can recover the true variable set with high probability if the weakest signal detection threshold is larger than the tuning parameter up to a known constant. Lastly, we briefly discuss the de-biased elastic-net estimator, and numerical studies are given to support the proposal.

preprint · 2020 · arXiv

Asymptotic Theory for Differentially Private Generalized $β$-models with Parameters Increasing

Modelling edge weights plays a crucial role in the analysis of network data, as the weights reveal the extent of relationships among individuals. Due to the diversity of weight information, sharing these data in a privacy-preserving way has become a complicated challenge. In this paper, we consider the case of the non-denoising process to achieve the trade-off between privacy and weight information in the generalized $β$-model. Under edge differential privacy with a discrete Laplace mechanism, the Z-estimators from estimating equations for the model parameters are shown to be consistent and asymptotically normally distributed. Simulations and a real data example are given to further support the theoretical results.

preprint · 2020 · arXiv

Law of the Iterated Logarithm and Model Selection Consistency for GLMs with Independent and Dependent Responses

We study the law of the iterated logarithm (LIL) for the maximum likelihood estimation of the parameters (as a convex optimization problem) in generalized linear models with independent or weakly dependent ($ρ$-mixing, $m$-dependent) responses under mild conditions. The LIL is useful for deriving asymptotic bounds for the discrepancy between the empirical process of the log-likelihood function and the true log-likelihood. As an application of the LIL, the strong consistency of some penalized likelihood based model selection criteria can be shown. Under some regularity conditions, the model selection criterion selects the simplest correct model almost surely when the penalty term increases with model dimension and has an order higher than $O(\log\log n)$ but lower than $O(n)$. Simulation studies are implemented to verify the selection consistency of BIC.

preprint · 2020 · arXiv

Sparse Density Estimation with Measurement Errors

This paper aims to build an estimate of an unknown density of data with measurement error as a linear combination of functions from a dictionary. Inspired by the penalization approach, we propose the weighted Elastic-net penalized minimal $\ell_2$-distance method for sparse coefficient estimation, where the adaptive weights come from sharp concentration inequalities. The optimal weighted tuning parameters are obtained from the first-order conditions holding with high probability. Under local coherence or minimal eigenvalue assumptions, non-asymptotic oracle inequalities are derived. These theoretical results are then used to obtain support recovery with high probability. Numerical experiments for discrete and continuous distributions confirm the significant improvement obtained by our procedure when compared with other conventional approaches. Finally, the method is applied to a meteorology data set, where it detects the shape of a multi-mode density more effectively than other conventional approaches.

preprint · 2020 · arXiv

Weighted Lasso Estimates for Sparse Logistic Regression: Non-asymptotic Properties with Measurement Error

For high-dimensional systems where classification performance is the focus, $\ell_{1}$-penalized logistic regression is becoming important and popular. However, the Lasso estimates can be problematic when the penalties on different coefficients are all the same and not related to the data. We propose two types of weighted Lasso estimates that depend on the covariates, with weights derived from the McDiarmid inequality. Given sample size $n$ and dimension of covariates $p$, the finite-sample behavior of our proposed methods with a diverging number of predictors is illustrated by non-asymptotic oracle inequalities such as the $\ell_{1}$-estimation error and squared prediction error of the unknown parameters. We compare the performance of our methods with former weighted estimates on simulated data, then apply these methods to real data analysis.