Researcher profile

Ruey S. Tsay

Ruey S. Tsay contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
5 topics
3 close collaborators


Published work

3 published items

Preprint, 2022, arXiv

High-dimensional Linear Regression for Dependent Data with Applications to Nowcasting

Recent research has focused on $\ell_1$ penalized least squares (Lasso) estimators for high-dimensional linear regressions in which the number of covariates $p$ is considerably larger than the sample size $n$. However, few studies have examined the properties of the estimators when the errors and/or the covariates are serially dependent. In this study, we investigate the theoretical properties of the Lasso estimator for a linear regression with a random design and weak sparsity under serially dependent and/or non-sub-Gaussian errors and covariates. In contrast to the traditional case, in which the errors are independent and identically distributed and have finite exponential moments, we show that $p$ can be at most a power of $n$ if the errors have only finite polynomial moments. In addition, the rate of convergence becomes slower owing to the serial dependence in the errors and the covariates. We also consider the sign consistency of the model selection using the Lasso estimator when there are serial correlations in the errors or the covariates, or both. Adopting the framework of a functional dependence measure, we describe how the rates of convergence and the selection consistency of the estimators depend on the dependence measures and moment conditions of the errors and the covariates. Simulation results show that a Lasso regression can be significantly more powerful than a mixed-frequency data sampling regression (MIDAS) and a Dantzig selector in the presence of irrelevant variables. We apply the results obtained for the Lasso method to nowcasting with mixed-frequency data, in which serially correlated errors and a large number of covariates are common. The empirical results show that the Lasso procedure outperforms the MIDAS regression and the autoregressive model with exogenous variables in terms of both forecasting and nowcasting.
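The setting in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes AR(1) covariates and AR(1) errors as one simple form of serial dependence, and it fits the Lasso with a plain cyclic coordinate-descent routine. The function names, the penalty level `lam=0.2`, and all simulation settings are illustrative choices, not values from the paper.

```python
import numpy as np

def simulate_ar1(n, k, rho, rng):
    """Simulate k independent AR(1) series of length n with coefficient rho."""
    x = np.zeros((n, k))
    shocks = rng.standard_normal((n, k))
    x[0] = shocks[0]
    for t in range(1, n):
        x[t] = rho * x[t - 1] + shocks[t]
    return x

def soft_threshold(z, lam):
    """Soft-thresholding operator, the proximal map of the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # partial residual without coordinate j
            beta[j] = soft_threshold(X[:, j] @ resid / n, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]  # restore the full residual
    return beta

# Assumed toy design: p > n, three relevant covariates, serially dependent data.
rng = np.random.default_rng(0)
n, p = 100, 200
X = simulate_ar1(n, p, rho=0.5, rng=rng)
errors = 0.5 * simulate_ar1(n, 1, rho=0.5, rng=rng)[:, 0]
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + errors

beta_hat = lasso_cd(X, y, lam=0.2)
print("indices of the three largest coefficients:", np.argsort(-np.abs(beta_hat))[:3])
```

Raising `rho` or replacing the Gaussian shocks with heavier-tailed draws is one way to explore, informally, the slower convergence the abstract describes.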

Preprint, 2022, arXiv

Rate-Optimal Robust Estimation of High-Dimensional Vector Autoregressive Models

High-dimensional time series data appear in many scientific areas in the current data-rich environment. Analysis of such data poses new challenges to data analysts because of not only the complicated dynamic dependence between the series, but also the existence of aberrant observations, such as missing values, contaminated observations, and heavy-tailed distributions. For high-dimensional vector autoregressive (VAR) models, we introduce a unified estimation procedure that is robust to model misspecification, heavy-tailed noise contamination, and conditional heteroscedasticity. The proposed methodology enjoys both statistical optimality and computational efficiency, and can handle many popular high-dimensional models, such as sparse, reduced-rank, banded, and network-structured VAR models. With proper regularization and data truncation, the estimation convergence rates are shown to be almost optimal in the minimax sense under a bounded $(2+2\epsilon)$-th moment condition. When $\epsilon \geq 1$, the rates of convergence match those obtained under the sub-Gaussian assumption. Consistency of the proposed estimators is also established for some $\epsilon \in (0,1)$, with minimax optimal convergence rates associated with $\epsilon$. The efficacy of the proposed estimation methods is demonstrated by simulation and a U.S. macroeconomic example.
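The combination of data truncation and regularization mentioned in this abstract can be sketched in a few lines. The code below is a simplified stand-in, not the estimator proposed in the paper: it assumes a VAR(1) model, clips each observation elementwise at a threshold `tau`, solves the Yule-Walker equation $A = \Gamma_1 \Gamma_0^{-1}$ on the clipped data, and uses hard thresholding as a crude substitute for sparsity regularization. The function names, `tau`, `thresh`, and the simulation settings are all illustrative assumptions.

```python
import numpy as np

def truncate(Y, tau):
    """Clip each entry of Y to the interval [-tau, tau]."""
    return np.clip(Y, -tau, tau)

def truncated_yule_walker(Y, tau, thresh):
    """Estimate a sparse VAR(1) matrix from truncated data.

    Computes A = Gamma_1 Gamma_0^{-1} from the autocovariances of the
    clipped series, then sets entries with magnitude below thresh to zero.
    """
    Z = truncate(Y, tau)
    Z = Z - Z.mean(axis=0)
    T = Z.shape[0]
    gamma0 = Z[:-1].T @ Z[:-1] / (T - 1)      # lag-0 autocovariance
    gamma1 = Z[1:].T @ Z[:-1] / (T - 1)       # lag-1 autocovariance E[y_t y_{t-1}']
    A = np.linalg.solve(gamma0, gamma1.T).T   # gamma1 @ inv(gamma0), gamma0 symmetric
    A[np.abs(A) < thresh] = 0.0
    return A

# Assumed toy data: diagonal VAR(1) with Student-t noise (3 degrees of freedom),
# which has finite moments only of order below 3.
rng = np.random.default_rng(0)
T, d = 2000, 5
A_true = 0.5 * np.eye(d)
noise = rng.standard_t(3, size=(T, d))
Y = np.zeros((T, d))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + noise[t]

tau = np.quantile(np.abs(Y), 0.99)  # illustrative truncation level
A_hat = truncated_yule_walker(Y, tau, thresh=0.1)
print(np.round(A_hat, 2))
```

In the paper the truncation level and the regularization are tied to the moment condition and the model structure; here both are fixed by hand purely for illustration.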

Preprint, 2021, arXiv

Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data

This paper proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that can be neither stored nor analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to it and transfers those factors to one and only one node of the second level. Each 2nd-level computer collects the common factors from its subordinates and performs another PCA to select the 2nd-level common factors. This process is repeated until the central server is reached, which collects common factors from its direct subordinates and performs a final PCA to select the global common factors. The noise terms of the 2nd-level approximate factor model are the unique common factors of the 1st-level clusters. We focus on the case of 2 levels in our theoretical derivations, but the idea can easily be generalized to any finite number of hierarchies. We discuss some clustering methods when the group memberships are unknown and introduce a new diffusion index approach to forecasting. We further extend the analysis to unit-root nonstationary time series. Asymptotic properties of the proposed method are derived for the diverging dimension of the data in each computing unit and the sample size $T$. We use both simulated data and real examples to assess the performance of the proposed method in finite samples, and compare our method with the commonly used ones in the literature concerning the forecastability of extracted factors.
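The two-level case described in this abstract can be sketched on a single machine, with each group of series standing in for one first-level computer. This is a minimal illustration under assumed settings, not the authors' implementation: the function names, the numbers of factors (`r_local`, `r_global`), and the simulated panel with one global factor plus one group-specific factor per group are all hypothetical choices.

```python
import numpy as np

def pca_factors(X, r):
    """Return the first r principal-component factors (T x r) of a T x N panel."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :r] * np.sqrt(X.shape[0])  # normalized so that F'F / T = I_r

def two_level_factors(groups, r_local, r_global):
    """Run PCA within each group, then PCA on the stacked local factors."""
    local = [pca_factors(X, r_local) for X in groups]  # one PCA per first-level node
    stacked = np.hstack(local)                         # central server collects factors
    return pca_factors(stacked, r_global), local

# Assumed toy panel: every series loads on a global factor g and on a
# factor specific to its own group, plus idiosyncratic noise.
rng = np.random.default_rng(1)
T, n_groups, n_series = 200, 4, 30
g = rng.standard_normal(T)
groups = []
for _ in range(n_groups):
    f = rng.standard_normal(T)
    X = (np.outer(g, rng.standard_normal(n_series))
         + np.outer(f, rng.standard_normal(n_series))
         + rng.standard_normal((T, n_series)))
    groups.append(X)

G_hat, local = two_level_factors(groups, r_local=2, r_global=1)
corr = abs(np.corrcoef(G_hat[:, 0], g)[0, 1])
print("absolute correlation with the true global factor:", round(corr, 3))
```

Only the `T x r_local` factor matrices travel from the first level to the second, which is what keeps the communication cost low compared with pooling all raw series on one machine.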