Source author record

Matteo Barigozzi

Matteo Barigozzi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

econ.EM, Methodology, physics.soc-ph, cond-mat.stat-mech, q-fin.GN, q-fin.ST, Social and Information Networks

Catalog footprint

What is connected

10 works

7 topics

4 close collaborators


preprint 2026 arXiv

Estimation of large approximate dynamic matrix factor models based on the EM algorithm and Kalman filtering

This paper considers an approximate dynamic matrix factor model that accounts for the time series nature of the data by explicitly modelling the time evolution of the factors. We study estimation of the model parameters based on the Expectation Maximization (EM) algorithm, implemented jointly with the Kalman smoother which gives estimates of the factors. We establish the consistency of the estimated loadings and factor matrices as the sample size $T$ and the matrix dimensions $p_1$ and $p_2$ diverge to infinity. We then extend this approach to: (a) the case of arbitrary patterns of missing data and (b) the presence of common stochastic trends. The finite sample properties of the estimators are assessed through a large simulation study and two applications on: (i) a financial dataset of volatility proxies and (ii) a macroeconomic dataset covering the main euro area countries.

preprint 2026 arXiv

Mean Square Errors of factors extracted using principal components, linear projections, and Kalman filter

Factor extraction from systems of variables with a large cross-sectional dimension, $N$, is often based on either Principal Components (PC)-based procedures, or Kalman filter (KF)-based procedures. Measuring the uncertainty of the extracted factors is important when, for example, they have a direct interpretation and/or they are used to summarize the information in a large number of potential predictors. In this paper, we compare the finite $N$ mean square errors (MSEs) of PC and KF factors extracted under different structures of the idiosyncratic cross-correlations. We show that the MSEs of PC-based factors, implicitly based on treating the true underlying factors as deterministic, are larger than the corresponding MSEs of KF factors, obtained by treating the true factors as either serially independent or autocorrelated random variables. We also study and compare the MSEs of PC and KF factors estimated when the idiosyncratic components are wrongly considered as if they were cross-sectionally homoscedastic and/or uncorrelated. The relevance of the results for the construction of confidence intervals for the factors is illustrated with simulated data.

preprint 2024 arXiv

Dynamic Factor Models: a Genealogy

Dynamic factor models have been developed out of the need to analyze and forecast time series in increasingly high dimensions. While mathematical statisticians faced with inference problems in high-dimensional observation spaces were focusing on the so-called spiked-model asymptotics, econometricians adopted an entirely different and considerably more effective asymptotic approach, rooted in the factor models originally considered in psychometrics. The so-called dynamic factor model methods have, in two decades, grown into a wide and successful body of techniques that are widely used in central banks, financial institutions, and economic and statistical institutes. The objective of this chapter is not an extensive survey of the topic but a sketch of its historical growth, with emphasis on the various assumptions and interpretations, and a family tree of its main variants.

preprint 2020 arXiv

Consistent estimation of high-dimensional factor models when the factor number is over-estimated

A high-dimensional $r$-factor model for an $n$-dimensional vector time series is characterised by the presence of a large eigengap (increasing with $n$) between the $r$-th and the $(r+1)$-th largest eigenvalues of the covariance matrix. Consequently, Principal Component (PC) analysis is the most popular estimation method for factor models and its consistency, when $r$ is correctly estimated, is well-established in the literature. However, popular factor number estimators often suffer from the lack of an obvious eigengap in empirical eigenvalues and tend to over-estimate $r$ due, for example, to the existence of non-pervasive factors affecting only a subset of the series. We show that the errors in the PC estimators resulting from the over-estimation of $r$ are non-negligible, which in turn leads to the violation of the conditions required for factor-based large covariance estimation. To remedy this, we propose new estimators of the factor model based on scaling the entries of the sample eigenvectors. We show both theoretically and numerically that the proposed estimators successfully control for the over-estimation error, and investigate their performance when applied to risk minimisation of a portfolio of financial time series.

preprint 2020 arXiv

Large-Dimensional Dynamic Factor Models: Estimation of Impulse-Response Functions with $I(1)$ Cointegrated Factors

We study a large-dimensional Dynamic Factor Model where: (i) the vector of factors $\mathbf F_t$ is $I(1)$ and driven by a number of shocks that is smaller than the dimension of $\mathbf F_t$; and, (ii) the idiosyncratic components are either $I(1)$ or $I(0)$. Under (i), the factors $\mathbf F_t$ are cointegrated and can be modeled as a Vector Error Correction Model (VECM). Under (i) and (ii), we provide consistent estimators, as both the cross-sectional size $n$ and the time dimension $T$ go to infinity, for the factors, the loadings, the shocks, the coefficients of the VECM and therefore the Impulse-Response Functions (IRF) of the observed variables to the shocks. Furthermore: possible deterministic linear trends are fully accounted for, and the case of an unrestricted VAR in the levels $\mathbf F_t$, instead of a VECM, is also studied. The finite-sample properties of the proposed estimators are explored by means of a Monte Carlo exercise. Finally, we revisit two distinct and widely studied empirical applications. By correctly modeling the long-run dynamics of the factors, our results partly overturn those obtained by recent literature. Specifically, we find that: (i) oil price shocks have just a temporary effect on US real activity; and, (ii) in response to a positive news shock, the economy first experiences a significant boom, and then a milder recession.

preprint 2020 arXiv

Sequential testing for structural stability in approximate factor models

We develop a monitoring procedure to detect changes in a large approximate factor model. Letting $r$ be the number of common factors, we base our statistics on the fact that the $(r+1)$-th eigenvalue of the sample covariance matrix is bounded under the null of no change, whereas it becomes spiked under changes. Given that sample eigenvalues cannot be estimated consistently under the null, we randomise the test statistic, obtaining a sequence of i.i.d. statistics, which are used for the monitoring scheme. Numerical evidence shows a very small probability of false detections, and tight detection times of change-points.

preprint 2019 arXiv

Generalized Dynamic Factor Models and Volatilities: Consistency, rates, and prediction intervals

Volatilities, in high-dimensional panels of economic time series with a dynamic factor structure on the levels or returns, typically also admit a dynamic factor decomposition. We consider a two-stage dynamic factor model method recovering the common and idiosyncratic components of both levels and log-volatilities. Specifically, in a first estimation step, we extract the common and idiosyncratic shocks for the levels, from which a log-volatility proxy is computed. In a second step, we estimate a dynamic factor model, which is equivalent to a multiplicative factor structure for volatilities, for the log-volatility panel. By exploiting this two-stage factor approach, we build one-step-ahead conditional prediction intervals for large $n \times T$ panels of returns. Those intervals are based on empirical quantiles, not on conditional variances; they can be either equal- or unequal-tailed. We provide uniform consistency and consistency-rate results for the proposed estimators as both $n$ and $T$ tend to infinity. We study the finite-sample properties of our estimators by means of Monte Carlo simulations. Finally, we apply our methodology to a panel of asset returns belonging to the S&P100 index in order to compute one-step-ahead conditional prediction intervals for the period 2006-2013. A comparison with the componentwise GARCH benchmark (which does not take advantage of cross-sectional information) demonstrates the superiority of our approach, which is genuinely multivariate (and high-dimensional), nonparametric, and model-free.

preprint 2017 arXiv

Spatio-Temporal Patterns of the International Merger and Acquisition Network

This paper analyses the world web of mergers and acquisitions (M&As) using a complex network approach. We use data on M&As to build a temporal sequence of binary and weighted-directed networks for the period 1995-2010 and 224 countries (nodes) connected according to their M&A flows (links). We study different geographical and temporal aspects of the international M&A network (IMAN), building sequences of filtered sub-networks whose links belong to specific intervals of distance or time. Given that M&As and trade are complementary ways of reaching foreign markets, we perform our analysis using statistics employed for the study of the international trade network (ITN), highlighting the similarities and differences between the ITN and the IMAN. In contrast to the ITN, the IMAN is a low density network characterized by a persistent giant component with many external nodes and low reciprocity. Clustering patterns are very heterogeneous and dynamic. High-income economies are the main acquirers and are characterized by high connectivity, implying that most countries are targets of a few acquirers. As in the ITN, geographical distance strongly impacts the structure of the IMAN: link-weights and node degrees have a non-linear relation with distance, and an assortative pattern is present at short distances.
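The density and reciprocity statistics mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions for a binary directed network (density as the share of possible ordered pairs that are linked, reciprocity as the share of links whose reverse link also exists); the paper may use different variants. The function names and the toy links are hypothetical and are not data from the paper.

```python
def density(n_nodes, links):
    """Share of the n*(n-1) possible directed links that are present."""
    # Deduplicate links and drop self-loops (assumed irrelevant here).
    edges = {(u, v) for u, v in links if u != v}
    return len(edges) / (n_nodes * (n_nodes - 1))


def reciprocity(links):
    """Share of directed links (u, v) for which (v, u) is also present."""
    edges = {(u, v) for u, v in links if u != v}
    if not edges:
        return 0.0
    return sum((v, u) in edges for (u, v) in edges) / len(edges)


# Hypothetical toy network: acquirer -> target, four countries.
toy_links = [("US", "GB"), ("GB", "US"), ("US", "IT"), ("DE", "IT")]
toy_density = density(4, toy_links)
toy_reciprocity = reciprocity(toy_links)
```

In this toy network 4 of the 12 possible directed links are present, and 2 of those 4 links are reciprocated.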

preprint 2010 arXiv

Identifying the Community Structure of the International-Trade Multi Network

We study the community structure of the multi-network of commodity-specific trade relations among world countries over the 1992-2003 period. We compare structures across commodities and time by means of the normalized mutual information index (NMI). We also compare them with exogenous community structures induced by geographical distances and regional trade agreements. We find that commodity-specific community structures are very heterogeneous and much more fragmented than that characterizing the aggregate ITN. This shows that the aggregate properties of the ITN may result (and be very different) from the aggregation of very diverse commodity-specific layers of the multi network. We also show that commodity-specific community structures, especially those related to the chemical sector, are becoming more and more similar to the aggregate one. Finally, our findings suggest that geographical distance is much more correlated with the observed community structure than RTAs. This result strengthens previous findings from the empirical literature on trade.
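As an illustration of the normalized mutual information index used to compare community structures, here is a minimal sketch. It assumes the normalization 2 I(A;B) / (H(A) + H(B)), which is common in the community-detection literature; the abstract does not state which normalization the paper uses. The function name and the toy partitions are hypothetical.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two partitions of the same nodes, 2*I/(H_a + H_b)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("partitions must be non-empty and of equal length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Entropy of each partition.
    h_a = -sum((c / n) * log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * log(c / n) for c in count_b.values())
    # Mutual information from the joint label counts.
    mi = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        # Both partitions put every node in a single community.
        return 1.0
    return 2 * mi / (h_a + h_b)


# Same grouping under different label names: NMI is 1.
same = normalized_mutual_information([0, 0, 1, 1], ["x", "x", "y", "y"])
# Statistically independent groupings: NMI is 0.
unrelated = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The index depends only on how nodes are grouped, not on the label names, which is what makes it usable for comparing community structures across commodities and years.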

preprint 2010 arXiv

Multinetwork of international trade: A commodity-specific analysis

We study the topological properties of the multinetwork of commodity-specific trade relations among world countries over the 1992-2003 period, comparing them with those of the aggregate-trade network, known in the literature as the international-trade network (ITN). We show that link-weight distributions of commodity-specific networks are extremely heterogeneous and (quasi) log normality of the aggregate link-weight distribution is generated as a sheer outcome of aggregation. Commodity-specific networks also display average connectivity, clustering, and centrality levels very different from their aggregate counterpart. We also find that ITN complete connectivity is mainly achieved through the presence of many weak links that keep commodity-specific networks together and that the correlation structure existing between topological statistics within each single network is fairly robust and mimics that of the aggregate network. Finally, we employ cross-commodity correlations between link weights to build hierarchies of commodities. Our results suggest that on top of a relatively time-invariant "intrinsic" taxonomy (based on inherent between-commodity similarities), the roles played by different commodities in the ITN have become more and more dissimilar, possibly as the result of an increased trade specialization. Our approach is general and can be used to characterize any multinetwork emerging as a nontrivial aggregation of several interdependent layers.
