Researcher profile

Xinsheng Zhang

Xinsheng Zhang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
6 works
0 followers
2 topics
4 close collaborators

Published work

6 published items

preprint · 2023 · arXiv

Onset mechanism of an inverted U-shaped solar filament eruption revealed by NVST, SDO, and STEREO-A observations

Utilizing observations from the New Vacuum Solar Telescope (NVST), Solar Dynamics Observatory (SDO), and Solar Terrestrial Relations Observatory-Ahead (STEREO-A), we investigate an inverted U-shaped solar filament eruption from two distinct observational perspectives: on the solar disk using NVST and SDO, and on the solar limb using STEREO-A. We employ both a non-linear force-free field model and a potential field model to reconstruct the coronal magnetic field and characterize its magnetic properties. Two precursor jet-like activities were observed before the eruption, displaying an untwisting rotation. The second activity released an estimated twist of over two turns. During these two jet-like activities, Y-shaped brightenings, newly emerging magnetic flux accompanied by magnetic cancellation, and the formation of newly moving fibrils were identified. Combining these observational features, we infer that the two precursor jet-like activities were triggered by newly emerging magnetic flux and released the magnetic field constraining the filament. Before the filament eruption, moving flows were observed being ejected from the site where the two jet-like activities began, indicating the same physical process as in those activities. Extrapolations revealed that, before the eruption, the filament lay below the height at which the decay index reaches 1.0 and had a strong magnetic field (540 Gauss) and a high twist number (2.4 turns). An apparent rotational motion was observed during the filament eruption. We deduce that the solar filament, exhibiting an inverted U-shape, is a significantly twisted flux rope. The eruption of the filament was initiated by the release of constraining magnetic fields through continuous magnetic reconnection. This reconnection process was triggered by the emergence of new magnetic flux.
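As a reading aid for the decay index mentioned in this abstract: the decay index is conventionally defined as n(h) = -d ln B / d ln h, where B is the external (strapping) field strength at height h. The sketch below evaluates that definition numerically on a made-up field profile. The function names, the profile, and the units are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def decay_index(heights, b_external):
    """Return n(h) = -d ln(B) / d ln(h) on the given height grid."""
    log_h = np.log(np.asarray(heights, dtype=float))
    log_b = np.log(np.asarray(b_external, dtype=float))
    # np.gradient accepts the coordinate array, so uneven log-spacing is fine.
    return -np.gradient(log_b, log_h)

def height_where_index_reaches(heights, n, level):
    """First grid height at which the decay index is >= level, else None."""
    above = np.nonzero(n >= level)[0]
    return heights[above[0]] if above.size else None

# Hypothetical field profile (assumption, for illustration only).
heights = np.linspace(5.0, 100.0, 200)           # heights in Mm (made up)
b_ext = 100.0 / (1.0 + heights / 30.0) ** 3      # field strength in Gauss (made up)

n = decay_index(heights, b_ext)
h_crit = height_where_index_reaches(heights, n, 1.0)
```

For this made-up profile the index is 3h / (30 + h) analytically, so it crosses 1.0 near h = 15; a filament sitting below that height would be in the regime the abstract describes.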

preprint · 2022 · arXiv

Manifold Principle Component Analysis for Large-Dimensional Matrix Elliptical Factor Model

The matrix factor model, a two-way dimension reduction tool for matrix sequences, has become increasingly popular in scientific fields such as econometrics. In this article, we propose for the first time the matrix elliptical factor model, which better captures the possible heavy-tailed behavior of matrix-valued data, especially in finance. Manifold Principle Component Analysis (MPCA) is introduced for the first time to estimate the row/column loading spaces. MPCA first performs a Singular Value Decomposition (SVD) for each "local" matrix observation and then averages the locally estimated spaces across all observations, whereas existing methods such as 2-dimensional PCA first aggregate the data across observations and then perform an eigenvalue decomposition of the sample covariance matrices. We propose two versions of the MPCA algorithm to estimate the factor loading matrices robustly, without any moment constraints on the factors or the idiosyncratic errors. Theoretical convergence rates of the corresponding estimators of the factor loading matrices, factor score matrices and common component matrices are derived under mild conditions. We also propose robust estimators of the row/column factor numbers based on the eigenvalue-ratio idea, which are proven to be consistent. Numerical studies and a real example on financial returns data demonstrate the flexibility of our model and the validity of our MPCA methods.
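The "SVD per observation, then average the local spaces" idea in this abstract can be sketched roughly as follows. This is only one plausible reading: the local spaces are averaged here as projection matrices, which is an assumption of the sketch, and the function name `mpca_row_space` is hypothetical. The paper's two MPCA versions are not specified in the abstract and may differ.

```python
import numpy as np

def mpca_row_space(X, k):
    """Estimate a k-dimensional row loading space from matrix observations.

    X has shape (T, p, q): T observations of p x q matrices.
    Returns a p x k matrix with orthonormal columns.
    """
    T, p, _ = X.shape
    P_bar = np.zeros((p, p))
    for X_t in X:
        # Local step: leading k left singular vectors of one observation.
        U, _, _ = np.linalg.svd(X_t, full_matrices=False)
        U_k = U[:, :k]
        # Represent the local space by its projection matrix (assumption).
        P_bar += U_k @ U_k.T
    P_bar /= T
    # Global step: leading k eigenvectors of the averaged projection.
    _, eigvecs = np.linalg.eigh(P_bar)   # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]
```

The column loading space can be estimated the same way by passing the transposed observations, for example `mpca_row_space(X.transpose(0, 2, 1), k)`. Because each observation contributes only a projection matrix, a single observation with an extreme scale cannot dominate the average.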

preprint · 2020 · arXiv

Large-dimensional Factor Analysis without Moment Constraints

The large-dimensional factor model has drawn much attention in the big-data era as a way to reduce dimensionality and extract underlying features using a few latent common factors. Conventional methods for estimating the factor model typically require a finite fourth moment of the data, which ignores the effect of heavy-tailedness and thus may result in non-robust or even inconsistent estimation of the factor space and common components. In this paper, we propose to recover the factor space by applying principal component analysis to the spatial Kendall's tau matrix instead of the sample covariance matrix. In a second step, we estimate the factor scores by ordinary least squares (OLS) regression. Theoretically, we show that under the elliptical distribution framework the factor loadings and scores as well as the common components can be estimated consistently without any moment constraint. The convergence rates of the estimated factor loadings, scores and common components are provided. The finite sample performance of the proposed procedure is assessed through thorough simulations. An analysis of a financial data set of asset returns shows the superiority of the proposed method over the classical PCA method.
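The two-step procedure described in this abstract (PCA on the spatial Kendall's tau matrix, then OLS for the scores) can be sketched as below. The spatial Kendall's tau matrix is the average of normalized outer products of pairwise differences. The scaling of the loadings by sqrt(p) and the function names are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def spatial_kendall_tau(X):
    """Sample spatial Kendall's tau matrix for data X of shape (n, p)."""
    n, p = X.shape
    K = np.zeros((p, p))
    for i in range(n - 1):
        D = X[i] - X[i + 1:]                  # differences x_i - x_j for j > i
        norms2 = np.sum(D ** 2, axis=1)
        keep = norms2 > 0                     # skip exact ties
        D = D[keep] / np.sqrt(norms2[keep])[:, None]
        K += D.T @ D                          # sum of unit-norm outer products
    return 2.0 * K / (n * (n - 1))

def kendall_factor_fit(X, m):
    """Return (loadings, scores) for an m-factor model.

    Step 1: leading m eigenvectors of the spatial Kendall's tau matrix.
    Step 2: factor scores by least squares of each observation on the loadings.
    """
    p = X.shape[1]
    K = spatial_kendall_tau(X)
    _, vecs = np.linalg.eigh(K)               # ascending eigenvalues
    L = np.sqrt(p) * vecs[:, ::-1][:, :m]     # sqrt(p) scaling is an assumption
    F, *_ = np.linalg.lstsq(L, X.T, rcond=None)
    return L, F.T
```

Every pairwise difference is scaled to unit length before it enters the matrix, so no moment of the data is needed for the matrix to be well defined, which matches the abstract's claim of estimation without moment constraints.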

preprint · 2020 · arXiv

Network-Assisted Estimation for Large-dimensional Factor Model with Guaranteed Convergence Rate Improvement

Network structures are increasingly popular for capturing the intrinsic relationships among large-scale variables. In this paper we propose to improve the estimation accuracy for the large-dimensional factor model when a network structure between individuals is observed. To fully exploit the prior network information, we construct two different penalties to regularize the factor loadings and shrink the idiosyncratic errors. Closed-form solutions are provided for the penalized optimization problems. Theoretical results demonstrate that the modified estimators achieve faster convergence rates and lower asymptotic mean squared errors when the underlying network structure among individuals is correct. An interesting finding is that even if the prior network is totally misleading, the proposed estimators perform no worse than conventional state-of-the-art methods. Furthermore, to facilitate practical application, we propose a computationally efficient data-driven approach to select the tuning parameters. We also provide an empirical criterion to determine the number of common factors. Simulation studies and an application to the S&P100 weekly return dataset illustrate the superiority and adaptivity of the new approach.

preprint · 2020 · arXiv

Robust Covariance Estimation for High-dimensional Compositional Data with Application to Microbial Communities Analysis

Microbial community analysis is drawing growing attention due to the rapid development of high-throughput sequencing techniques. The observed data typically have the following characteristics: they are high-dimensional, compositional (lying in a simplex), and may even be leptokurtic and highly skewed due to the existence of overly abundant taxa, which makes conventional correlation analysis unsuitable for studying the co-occurrence and co-exclusion relationships between microbial taxa. In this article, we address the challenges of covariance estimation for this kind of data. Assuming the basis covariance matrix lies in a well-recognized class of sparse covariance matrices, we adopt a proxy matrix known in the literature as the centered log-ratio covariance matrix, which is approximately indistinguishable from the true basis covariance matrix as the dimensionality tends to infinity. We construct a Median-of-Means (MOM) estimator for the centered log-ratio covariance matrix and propose a thresholding procedure that is adaptive to the variability of individual entries. By imposing a finite fourth moment condition, which is much weaker than the sub-Gaussianity condition in the literature, we derive the optimal rate of convergence under the spectral norm. In addition, we provide a theoretical guarantee on support recovery. The adaptive thresholding procedure of the MOM estimator is easy to implement and gains robustness when outliers or heavy-tailedness exist. Thorough simulation studies are conducted to show the advantages of the proposed procedure over some state-of-the-art methods. Finally, we apply the proposed method to analyze a human gut microbiome dataset. The R script for implementing the method is available at https://github.com/heyongstat/RCEC.
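The pipeline in this abstract (centered log-ratio transform, Median-of-Means covariance, thresholding) can be illustrated with the simplified sketch below. It is not the authors' R implementation: the elementwise median over block covariances and the single universal hard threshold `tau` are stand-in assumptions, whereas the paper's threshold adapts to the variability of each entry. All function names are hypothetical.

```python
import numpy as np

def clr(W):
    """Centered log-ratio transform of compositions W (rows sum to 1, all > 0)."""
    log_w = np.log(W)
    return log_w - log_w.mean(axis=1, keepdims=True)

def mom_covariance(Z, n_blocks, rng):
    """Elementwise median of sample covariances computed on disjoint blocks."""
    idx = rng.permutation(Z.shape[0])
    blocks = np.array_split(idx, n_blocks)    # each block needs >= 2 samples
    covs = np.stack([np.cov(Z[b], rowvar=False) for b in blocks])
    return np.median(covs, axis=0)

def hard_threshold(S, tau):
    """Zero out off-diagonal entries with magnitude below tau (universal level)."""
    T = np.where(np.abs(S) >= tau, S, 0.0)
    np.fill_diagonal(T, np.diag(S))           # always keep the variances
    return T
```

A typical call chain would be `hard_threshold(mom_covariance(clr(W), n_blocks, rng), tau)`, with `n_blocks` and `tau` treated as tuning parameters. Taking the median across blocks limits the influence of a few outlying samples, since they can corrupt only the blocks they fall in.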

preprint · 2018 · arXiv

Doubly Robust Sure Screening for Elliptical Copula Regression Model

Regression analysis has long been an active research topic in statistics. We propose a very flexible semi-parametric regression model called the Elliptical Copula Regression (ECR) model, which covers a large class of linear and nonlinear regression models such as the additive regression model and the single-index model. In addition, the ECR model can capture heavy tails and tail dependence between variables, so it can be widely applied in areas such as econometrics and finance. In this paper we mainly focus on the feature screening problem for the ECR model in the ultra-high dimensional setting. We propose a doubly robust sure screening procedure for the ECR model, in which two types of correlation coefficient are involved: the Kendall tau correlation and the canonical correlation. Theoretical analysis shows that the procedure enjoys the sure screening property, i.e., with probability tending to 1, the screening procedure selects all important variables and substantially reduces the dimensionality to a moderate size relative to the sample size. Thorough numerical studies are conducted to illustrate its advantage over existing sure independence screening methods, and thus it can be used as a safe replacement for the existing procedures in practice. Finally, the proposed procedure is applied to a real gene-expression data set to show its empirical usefulness.
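To illustrate the rank-based part of such a screening procedure, the sketch below ranks predictors by the absolute value of their marginal Kendall tau correlation with the response and keeps the top `d`. This covers only the Kendall tau ingredient. The canonical-correlation step and the paper's actual screening statistic are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a: average concordance sign over all pairs of samples."""
    n = len(x)
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    # The double sum counts each unordered pair twice, hence n * (n - 1).
    return np.sum(sx * sy) / (n * (n - 1))

def screen_by_kendall(X, y, d):
    """Indices of the d predictors with the largest |Kendall tau| with y."""
    taus = np.array([abs(kendall_tau(X[:, j], y)) for j in range(X.shape[1])])
    return np.argsort(-taus)[:d]
```

Kendall's tau depends only on the ordering of the values, so a monotone nonlinear link or heavy-tailed noise does not change the ranking the way it would for Pearson correlation.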