Source author record

Zhaoxing Gao

Zhaoxing Gao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology econ.EM Machine Learning

Catalog footprint

What is connected

3 works

3 topics

3 close collaborators

Preprint · 2026 · arXiv

Split-and-Conquer: Distributed Factor Modeling for High-Dimensional Matrix-Variate Time Series

In this paper, we propose a distributed framework for reducing the dimensionality of high-dimensional, large-scale, heterogeneous matrix-variate time series data using a factor model. The data are first partitioned column-wise (or row-wise) and allocated to node servers, where each node estimates the row (or column) loading matrix via two-dimensional tensor PCA. These local estimates are then transmitted to a central server and aggregated, followed by a final PCA step to obtain the global row (or column) loading matrix estimator. Given the estimated loading matrices, the corresponding factor matrices are subsequently computed. Unlike existing distributed approaches, our framework preserves the latent matrix structure, thereby improving computational efficiency and enhancing information utilization. We also discuss row- and column-wise clustering procedures for settings in which the group memberships are unknown. Furthermore, we extend the analysis to unit-root nonstationary matrix-variate time series. Asymptotic properties of the proposed method are derived as both the dimension of the data in each computing unit and the sample size $T$ diverge. Simulation results assess the computational efficiency and estimation accuracy of the proposed framework, and real data applications further validate its predictive performance.
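The column-wise split, local estimation, and central aggregation described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the local matrix $\sum_t X_t X_t^\top$ used for the node-level PCA, and the averaging of local projection matrices before the final PCA are all assumptions made for this example.

```python
import numpy as np

def top_eigvecs(sym, k):
    """Return the k leading eigenvectors of a symmetric matrix."""
    _, vecs = np.linalg.eigh(sym)        # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]

def local_row_loading(X_block, k):
    """Node-level step (assumed form): PCA on sum_t X_t X_t' for one column block.

    X_block has shape (T, p, q_g): T time points, p rows, q_g columns on this node.
    """
    T, _, q_g = X_block.shape
    M = np.einsum("tij,tkj->ik", X_block, X_block) / (T * q_g)
    return top_eigvecs(M, k)

def split_and_conquer_row_loading(X, n_nodes, k):
    """Split columns across nodes, estimate locally, aggregate, run a final PCA."""
    blocks = np.array_split(X, n_nodes, axis=2)             # column-wise partition
    local = [local_row_loading(b, k) for b in blocks]       # one p x k estimate per node
    aggregated = sum(R @ R.T for R in local) / n_nodes      # assumed aggregation rule
    return top_eigvecs(aggregated, k)

def subspace_distance(A, B):
    """Spectral norm of the difference between the projections onto col(A) and col(B)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    return np.linalg.norm(Qa @ Qa.T - Qb @ Qb.T, 2)
```

Loading matrices are identified only up to rotation, so an estimate is better compared with a reference through `subspace_distance` than entry by entry.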

Preprint · 2021 · arXiv

Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data

This paper proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that can be neither stored nor analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to it and transfers those factors to one and only one node of the second level. Each 2nd-level computer collects the common factors from its subordinates and performs another PCA to select the 2nd-level common factors. This process is repeated until the central server is reached, which collects common factors from its direct subordinates and performs a final PCA to select the global common factors. The noise terms of the 2nd-level approximate factor model are the unique common factors of the 1st-level clusters. We focus on the case of 2 levels in our theoretical derivations, but the idea can easily be generalized to any finite number of hierarchies. We discuss some clustering methods when the group memberships are unknown and introduce a new diffusion index approach to forecasting. We further extend the analysis to unit-root nonstationary time series. Asymptotic properties of the proposed method are derived as both the dimension of the data in each computing unit and the sample size $T$ diverge. We use both simulated data and real examples to assess the performance of the proposed method in finite samples, and compare our method with the commonly used ones in the literature concerning the forecastability of extracted factors.
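The two-level case described in this abstract can be sketched in a few lines of Python: each group of series gets its own PCA, and the resulting group factors are stacked and passed through a second PCA. The function names, the column centering, and the normalization of factors to $\sqrt{T}$ times the leading left singular vectors are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np

def pca_factors(Y, r):
    """Estimate r factors from a (T, n) panel via PCA.

    Assumed normalization: sqrt(T) times the r leading left singular
    vectors of the column-centered data.
    """
    Yc = Y - Y.mean(axis=0)
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    return np.sqrt(Y.shape[0]) * U[:, :r]

def hierarchical_factors(Y, groups, r_local, r_global):
    """Two-level sketch: local PCA per group, then one PCA on the stacked factors.

    groups is a list of column-index arrays, one per first-level node.
    """
    local = [pca_factors(Y[:, idx], r_local) for idx in groups]   # first level
    stacked = np.hstack(local)                                    # sent to the central node
    return pca_factors(stacked, r_global)                         # second level
```

Only the `(T, r_local)` factor blocks are passed upward, so the central step works with far fewer columns than the full panel.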

Preprint · 2020 · arXiv

Segmenting High-dimensional Matrix-valued Time Series via Sequential Transformations

Modeling matrix-valued time series is an interesting and important research topic. In this paper, we extend the method of Chang et al. (2017) to matrix-valued time series. For any given $p\times q$ matrix-valued time series, we look for linear transformations that segment the matrix into many small sub-matrices, each of which is uncorrelated with the others both contemporaneously and serially. The sub-matrices can then be analyzed separately, which greatly reduces the number of parameters to be estimated. To overcome the identification issue, we propose a two-step and more structured procedure to segment the rows and columns separately. When $\max(p,q)$ is large in relation to the sample size $n$, we assume the transformation matrices are sparse and use threshold estimators for the (auto)covariance matrices. We also propose a block-wise thresholding method to separate the columns (or rows) of the transformed matrix-valued data. The asymptotic properties are established for both fixed and diverging $\max(p,q)$. Unlike principal component analysis (PCA) for independent data, we cannot guarantee that the required linear transformation exists. When it does not, the proposed method provides an approximate segmentation, which may be useful for forecasting. The proposed method is illustrated with both simulated and real data examples. We also propose a sequential transformation algorithm to segment higher-order tensor-valued time series.
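The idea of separating components that are uncorrelated both contemporaneously and serially can be illustrated with a simplified Python sketch for an already-transformed vector series. It is not the paper's procedure: it does not estimate the transformation or use the block-wise thresholding rule. The function names, the fixed threshold of 0.3, and the use of lags 0 to 2 are assumptions. Two components are linked when their largest absolute cross-correlation over the chosen lags exceeds the threshold, and groups are the connected components of the resulting graph.

```python
import numpy as np

def max_cross_corr(Z, max_lag):
    """Largest absolute cross-correlation over lags 0..max_lag for each pair of columns."""
    T, p = Z.shape
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    M = np.zeros((p, p))
    for k in range(max_lag + 1):
        C = Zs[k:].T @ Zs[:T - k] / T                 # lag-k cross-correlation matrix
        M = np.maximum(M, np.maximum(np.abs(C), np.abs(C.T)))
    return M

def segment_components(Z, max_lag=2, threshold=0.3):
    """Group columns of Z into blocks with no cross-correlation above the threshold."""
    linked = max_cross_corr(Z, max_lag) > threshold   # hard-thresholded link matrix
    p = linked.shape[0]
    seen = [False] * p
    groups = []
    for start in range(p):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:                                  # depth-first search over links
            i = stack.pop()
            comp.append(i)
            for j in range(p):
                if linked[i, j] and not seen[j]:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(comp))
    return groups
```

In this toy setting, each returned group could then be modeled on its own, which is the source of the parameter reduction mentioned in the abstract.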

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint