Researcher profile

Yong He

Yong He contributes to research discovery and scholarly infrastructure.

Researcher - Affiliation not imported - Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging - Verification L1 - Unclaimed author
12 works
0 followers
12 topics
4 close collaborators

Published work

12 published item(s)

preprint - 2022 - arXiv

Alpha-robust investment-reinsurance strategy for a mean-variance insurer with delay

In this paper, a robust optimal reinsurance-investment problem with delay is studied under the $α$-maxmin mean-variance criterion. The surplus process of an insurance company approximates Brownian motion with drift. The financial market consists of a risk-free asset and a risky asset that follows geometric Brownian motion. Using the principle of dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equation, the explicit expression of the optimal strategy and the explicit solution of the corresponding HJB equation are obtained. In addition, a verification theorem is provided to ensure that the value function is indeed the solution of the HJB equation. Finally, some numerical examples and graphs are given to illustrate the results, and the influence of some important model parameters on the optimal strategy is discussed.

preprint - 2022 - arXiv

Distributed Learning for Principle Eigenspaces without Moment Constraints

Distributed Principal Component Analysis (PCA) has been studied to deal with the case when data are stored across multiple machines and communication cost or privacy concerns prohibit the computation of PCA in a central location. However, the sub-Gaussian assumption in the related literature is restrictive in real applications, where outliers or heavy-tailed data are common in areas such as finance and macroeconomics. In this article, we propose a distributed algorithm for estimating the principal eigenspaces without any moment constraint on the underlying distribution. We study the problem under the elliptical family framework and adopt the sample multivariate Kendall's tau matrix to extract eigenspace estimators from all sub-machines, which can be viewed as points in the Grassmann manifold. We then find the "center" of these points as the final distributed estimator of the principal eigenspace. We investigate the bias and variance of the distributed estimator and derive its convergence rate, which depends on the effective rank and eigengap of the scatter matrix and on the number of sub-machines. We show that the distributed estimator performs as if we had full access to the whole data. Simulation studies show that the distributed algorithm performs comparably with the existing one for light-tailed data, while showing a great advantage for heavy-tailed data. We also extend our algorithm to the distributed learning of elliptical factor models and verify its empirical usefulness through a real application to a macroeconomic dataset.

preprint - 2022 - arXiv

Manifold Principle Component Analysis for Large-Dimensional Matrix Elliptical Factor Model

The matrix factor model has grown popular in scientific fields such as econometrics, where it serves as a two-way dimension reduction tool for matrix sequences. In this article, we propose for the first time the matrix elliptical factor model, which can better depict the possible heavy-tailed property of matrix-valued data, especially in finance. Manifold Principle Component Analysis (MPCA) is introduced for the first time to estimate the row/column loading spaces. MPCA first performs Singular Value Decomposition (SVD) for each "local" matrix observation and then averages the local estimated spaces across all observations, while existing methods such as 2-dimensional PCA first integrate data across observations and then perform eigenvalue decomposition of the sample covariance matrices. We propose two versions of MPCA algorithms to estimate the factor loading matrices robustly, without any moment constraints on the factors and the idiosyncratic errors. Theoretical convergence rates of the corresponding estimators of the factor loading matrices, factor score matrices and common component matrices are derived under mild conditions. We also propose robust estimators of the row/column factor numbers based on the eigenvalue-ratio idea, which are proven to be consistent. Numerical studies and a real example on financial returns data confirm the flexibility of our model and the validity of our MPCA methods.
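
The "local SVD, then average the spaces" idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that "averaging the local estimated spaces" means averaging the projection matrices of the leading left singular vectors, and the function name `mpca_row_space` is hypothetical.

```python
import numpy as np

def mpca_row_space(Xs, k):
    """Estimate a k-dimensional row loading space from matrix observations.

    Xs has shape (T, p, q): T matrix observations of size p x q.
    Assumption (ours, for illustration): local spaces are averaged through
    their projection matrices, then the top-k eigenvectors are kept.
    """
    T, p, _ = Xs.shape
    P_bar = np.zeros((p, p))
    for X in Xs:
        # Local step: leading left singular vectors of one observation.
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        Uk = U[:, :k]
        P_bar += Uk @ Uk.T          # projection onto the local space
    P_bar /= T
    # Global step: top-k eigenvectors of the averaged projection.
    _, eigvecs = np.linalg.eigh(P_bar)   # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]
```

Because each local step uses only singular vectors and discards singular values, a single observation with a very large magnitude cannot dominate the average, which is the intuition behind robustness to heavy tails.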

preprint - 2022 - arXiv

On Minkowskian Product of Finsler Manifolds

Let $(M_1,F_1)$ and $(M_2,F_2)$ be a pair of Finsler manifolds. The Minkowskian product Finsler manifold $(M,F)$ of $(M_1,F_1)$ and $(M_2,F_2)$ with respect to a product function $f$ is the product manifold $M=M_1\times M_2$ endowed with the Finsler metric $F^2=f(K,H)$, where $K=(F_1)^2$ and $H=(F_2)^2$. In this paper, the Cartan connection and Berwald connection of $(M,F)$ are derived in terms of the corresponding objects of $(M_1,F_1)$ and $(M_2,F_2)$. Necessary and sufficient conditions for $(M,F)$ to be a Berwald (resp. weakly Berwald, Landsberg, weakly Landsberg) manifold are obtained. This gives an effective method for constructing the special Finsler manifolds mentioned above.

preprint - 2020 - arXiv

Enable an Open Software Defined Mobility Ecosystem through VEC-OF

OEMs and new entrants can take the Mobility as a Service (MaaS) market as an entry point: upgrade their E/E (Electric and Electronic) architecture to a C/C (Computing and Communication) architecture; build an open, software-defined, data-driven software platform for their production and service model; use efficient collaboration among vehicles, roads, cloud and network to continuously improve core technologies such as autonomous driving; and provide MaaS operators with an affordable and agile platform. In this paper we present a new framework, VEC-OF (Vehicle-Edge-Cloud Open Framework), a data- and AI-centric vehicle software framework that enables a much safer, more efficient, connected and trusted MaaS through cooperative vehicle, infrastructure and cloud capabilities and intelligence.

preprint - 2020 - arXiv

Large-dimensional Factor Analysis without Moment Constraints

The large-dimensional factor model has drawn much attention in the big-data era as a way to reduce dimensionality and extract underlying features using a few latent common factors. Conventional methods for estimating the factor model typically require a finite fourth moment of the data, which ignores the effect of heavy-tailedness and thus may result in non-robust or even inconsistent estimation of the factor space and common components. In this paper, we propose to recover the factor space by performing principal component analysis on the spatial Kendall's tau matrix instead of the sample covariance matrix. In a second step, we estimate the factor scores by ordinary least squares (OLS) regression. Theoretically, we show that under the elliptical distribution framework the factor loadings and scores as well as the common components can be estimated consistently without any moment constraint. The convergence rates of the estimated factor loadings, scores and common components are provided. The finite sample performance of the proposed procedure is assessed through thorough simulations. An analysis of a financial data set of asset returns shows the superiority of the proposed method over the classical PCA method.
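
The two-step procedure described in the abstract above (PCA on the spatial Kendall's tau matrix, then OLS for the factor scores) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the loading normalization `L.T @ L / p = I` is a common convention assumed here rather than one stated in the abstract.

```python
import numpy as np

def spatial_kendall_tau(X):
    """Sample spatial Kendall's tau matrix of the rows of X (shape n x p).

    Averages the outer products of the normalized pairwise differences
    (x_i - x_j) / ||x_i - x_j|| over all pairs i < j.
    """
    n, p = X.shape
    K = np.zeros((p, p))
    for i in range(n - 1):
        D = X[i] - X[i + 1:]                    # differences to later rows
        norms2 = np.sum(D * D, axis=1)
        keep = norms2 > 0                       # skip identical observations
        D = D[keep] / np.sqrt(norms2[keep])[:, None]
        K += D.T @ D
    return 2.0 * K / (n * (n - 1))

def robust_factor_fit(X, m):
    """Step 1: top-m eigenvectors of the Kendall's tau matrix give loadings.
    Step 2: OLS of each observation on the loadings gives factor scores."""
    p = X.shape[1]
    _, eigvecs = np.linalg.eigh(spatial_kendall_tau(X))  # ascending order
    L = np.sqrt(p) * eigvecs[:, ::-1][:, :m]    # assumed scaling: L.T @ L / p = I
    F, *_ = np.linalg.lstsq(L, X.T, rcond=None) # m x n matrix of OLS coefficients
    return L, F.T                               # loadings (p x m), scores (n x m)
```

Each pairwise difference is scaled to unit length before it enters the average, so no moments of the data are needed for the matrix to be well defined.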

preprint - 2020 - arXiv

Multiscale Modeling and Analysis for High-fidelity Interferometric Scattering Microscopy

Interferometric scattering microscopy (iSCAT), as an ultrasensitive fluorescence-free imaging modality, has recently gained enormous attention and has been rapidly developing from demonstration of principle to quantitative sensing. Here we report on a theoretical and experimental study for iSCAT with samples having structural dimensions that differ by 4-5 orders of magnitude. In particular, we demonstrate and intuitively explain the profound effects of sub-nanometer surface roughness of a glass coverslip and of a mica surface on the absolute signal and the shape of the point spread function of a gold nanoparticle. These quantities significantly affect the accuracy of determining the target size and position in all three dimensions. Moreover, we investigate a sample system mimicking a gold nanoparticle in a simplified cell environment and show a position-dependent and even asymmetric point spread function of the nanoparticle. The multiscale study will facilitate the development of high-fidelity iSCAT in real applications.

preprint - 2020 - arXiv

Network-Assisted Estimation for Large-dimensional Factor Model with Guaranteed Convergence Rate Improvement

Network structures are increasingly popular for capturing the intrinsic relationships between large-scale variables. In this paper we propose to improve the estimation accuracy for the large-dimensional factor model when a network structure between individuals is observed. To fully exploit the prior network information, we construct two different penalties to regularize the factor loadings and shrink the idiosyncratic errors. Closed-form solutions are provided for the penalized optimization problems. Theoretical results demonstrate that the modified estimators achieve faster convergence rates and lower asymptotic mean squared errors when the underlying network structure among individuals is correct. An interesting finding is that even if the prior network is totally misleading, the proposed estimators perform no worse than conventional state-of-the-art methods. Furthermore, to facilitate practical application, we propose a computationally efficient data-driven approach to select the tuning parameters. We also provide an empirical criterion to determine the number of common factors. Simulation studies and an application to the S&P100 weekly return dataset illustrate the superiority and adaptivity of the new approach.

preprint - 2020 - arXiv

Robust Covariance Estimation for High-dimensional Compositional Data with Application to Microbial Communities Analysis

Microbial community analysis is drawing growing attention due to the rapid development of high-throughput sequencing techniques. The observed data has the following typical characteristics: it is high-dimensional, compositional (lying in a simplex) and may even be leptokurtic and highly skewed due to the existence of overly abundant taxa, which makes conventional correlation analysis infeasible for studying the co-occurrence and co-exclusion relationships between microbial taxa. In this article, we address the challenges of covariance estimation for this kind of data. Assuming the basis covariance matrix lies in a well-recognized class of sparse covariance matrices, we adopt a proxy matrix known in the literature as the centered log-ratio covariance matrix, which is approximately indistinguishable from the real basis covariance matrix as the dimensionality tends to infinity. We construct a Median-of-Means (MOM) estimator for the centered log-ratio covariance matrix and propose a thresholding procedure that is adaptive to the variability of individual entries. By imposing a much weaker finite fourth moment condition compared with the sub-Gaussianity condition in the literature, we derive the optimal rate of convergence under the spectral norm. In addition, we provide a theoretical guarantee on support recovery. The adaptive thresholding procedure of the MOM estimator is easy to implement and gains robustness when outliers or heavy-tailedness exist. Thorough simulation studies are conducted to show the advantages of the proposed procedure over some state-of-the-art methods. Finally, we apply the proposed method to analyze a human gut microbiome dataset. The R script for implementing the method is available at https://github.com/heyongstat/RCEC.
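
As a rough illustration of the Median-of-Means idea applied to the centered log-ratio covariance matrix, the sketch below splits the samples into blocks, computes a covariance matrix per block, and takes the entrywise median. It is not the RCEC implementation linked above: the function names are hypothetical, the blockwise construction is an assumption about how the MOM estimator could be formed, and the adaptive thresholding step is omitted.

```python
import numpy as np

def clr(W):
    """Centered log-ratio transform of compositions in the rows of W.

    Assumes all entries are strictly positive."""
    logW = np.log(W)
    return logW - logW.mean(axis=1, keepdims=True)

def mom_covariance(Z, n_blocks, seed=0):
    """Entrywise median of blockwise sample covariance matrices.

    Z has shape (n, p); each block must contain at least two rows.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(Z.shape[0])           # random split into blocks
    blocks = np.array_split(idx, n_blocks)
    covs = np.stack([np.cov(Z[b], rowvar=False) for b in blocks])
    return np.median(covs, axis=0)
```

A typical call would be `mom_covariance(clr(W), n_blocks=5)` for a matrix `W` of compositions. An outlying sample contaminates only the block it falls in, so the entrywise median stays close to the clean blocks as long as fewer than half of the blocks are affected.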

preprint - 2019 - arXiv

Single-Molecule Doped Crystalline Nanosheets for Delicate Photophysics Studies and Directional Single-Photon Emitting Devices

Single molecules in solids have been considered an attractive class of solid-state single quantum systems because they can be chemically synthesized at low cost to have stable narrow transitions at desired wavelengths. Here we report and demonstrate single dibenzoterrylene molecules in crystalline anthracene nanosheets as a robust and versatile solid-state platform for delicate photophysics studies and as building blocks of single-photon devices. The high-quality nanosheet sample enables robust studies of delicate single-molecule photophysics at room temperature, including the first real-time observation of a single-molecule insertion site jump, quantitative measurements of the associated changes in dipole moment orientation and magnitude, and unambiguous determination of the excitation-power-dependent intersystem crossing rate and triplet lifetime. Moreover, we demonstrate the flexible assembly of the nanosheets into a planar antenna device to achieve bright single-molecule emission with a Gaussian emission pattern. The small thickness, good photostability and mechanical stability make the dibenzoterrylene-in-anthracene nanosheet system an excellent candidate for static quantum nodes in integrated photonic circuits.

preprint - 2018 - arXiv

Doubly Robust Sure Screening for Elliptical Copula Regression Model

Regression analysis has always been a hot research topic in statistics. We propose a very flexible semi-parametric regression model called the Elliptical Copula Regression (ECR) model, which covers a large class of linear and nonlinear regression models such as the additive regression model and the single index model. In addition, the ECR model can capture the heavy-tail characteristic and tail dependence between variables, so it could be widely applied in areas such as econometrics and finance. In this paper we mainly focus on the feature screening problem for the ECR model in the ultra-high dimensional setting. We propose a doubly robust sure screening procedure for the ECR model, in which two types of correlation coefficient are involved: Kendall's tau correlation and canonical correlation. Theoretical analysis shows that the procedure enjoys the sure screening property, i.e., with probability tending to 1, the screening procedure selects all important variables and substantially reduces the dimensionality to a moderate size relative to the sample size. Thorough numerical studies are conducted to illustrate its advantage over existing sure independence screening methods, and thus it can be used as a safe replacement for the existing procedures in practice. Finally, the proposed procedure is applied to a real gene-expression data set to show its empirical usefulness.

preprint - 2018 - arXiv

Two-dimensional InSe/WS$_2$ heterostructure with enhanced optoelectronic performance in the visible region

Two-dimensional (2D) InSe and WS$_2$ exhibit promising characteristics for optoelectronic and photoelectrochemical applications, e.g. photodetection and photocatalytic water splitting. However, both of them have poor absorption of visible light due to wide band gaps. 2D InSe has high electron mobility but low hole mobility, while 2D WS$_2$ shows the opposite. Here, we design a 2D heterostructure composed of their monolayers and study its optoelectronic properties by first-principles calculations. Our results show that the heterostructure has a direct band gap of 2.19 eV, which is much smaller than those of the monolayers, mainly due to a type-II band alignment: the valence band maximum and the conduction band minimum of monolayer InSe are lower than those of monolayer WS$_2$, respectively. The visible-light absorption is enhanced considerably, e.g. an approximately fivefold (threefold) increase at the wavelength of 490 nm in comparison to monolayer InSe (WS$_2$). The type-II band alignment also facilitates the spatial separation of photogenerated electron-hole pairs, i.e., electrons (holes) reside preferably in the InSe (WS$_2$) layer. The two layers complement each other in the carrier mobilities of the heterostructure: the photogenerated electrons and holes inherit the large mobilities from the InSe and WS$_2$ monolayers, respectively.