Source author record

Qiwei Yao

Qiwei Yao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Methodology, math.ST, Statistics Theory, Machine Learning, stat.OT

Catalog footprint

What is connected

18 works

5 topics

4 close collaborators

preprint | 2026 | arXiv

Hedging Memory Horizons for Non-Stationary Prediction via Online Aggregation

We study online prediction under distribution shift, where inputs arrive chronologically and outcomes are revealed only after prediction. In this setting, predictors must remain stable in quiet regimes yet adapt when regimes shift, and the right adaptation memory is unknown in advance. We propose MELO (Memory-hedged Exponentially Weighted Least-Squares Online aggregation), a model-agnostic method that hedges across adaptation scales: it wraps any non-anticipating base-predictor pool with exponentially weighted least-squares (EWLS) adaptation experts at multiple forgetting factors, and aggregates raw and EWLS-adapted forecasts with MLpol, a parameter-free online aggregation rule. Under boundedness conditions, we establish deterministic oracle inequalities showing that it competes with both the best raw predictor and the best bounded, time-varying affine combinations of the base predictions, up to a path-length-dependent tracking cost and a sublinear aggregation overhead. We evaluate MELO on French national electricity-load forecasting through the COVID-19 lockdown using no regime indicators, lockdown dates, or policy covariates. MELO reduces overall RMSE by 34.7% relative to base-only MLpol and achieves lower overall RMSE than a TabICL reference supplied with an external COVID policy-response covariate. Moreover, MELO requires only lightweight per-step recursive updates without model retraining.
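
The general idea of hedging across forgetting factors can be sketched as follows. This is an illustration only, not the MELO implementation: the EWLS experts are written as textbook recursive least squares with exponential forgetting, and the aggregation step uses a plain exponentially weighted average with a fixed learning rate in place of MLpol. The names `EWLSExpert`, `hedge_forecasts`, `forgettings`, `eta` and `init_scale` are hypothetical.

```python
import numpy as np


class EWLSExpert:
    """Recursive least squares with exponential forgetting (illustrative only)."""

    def __init__(self, dim, forgetting=0.99, init_scale=1e3):
        self.lam = forgetting                 # forgetting factor in (0, 1]
        self.theta = np.zeros(dim)            # current affine coefficients
        self.P = init_scale * np.eye(dim)     # scaled inverse Gram matrix

    def predict(self, x):
        return float(self.theta @ x)

    def update(self, x, y):
        Px = self.P @ x
        gain = Px / (self.lam + x @ Px)
        self.theta = self.theta + gain * (y - self.theta @ x)
        self.P = (self.P - np.outer(gain, Px)) / self.lam


def hedge_forecasts(base_preds, y, forgettings=(0.9, 0.99, 0.999), eta=0.5):
    """Aggregate raw and EWLS-adapted forecasts online.

    base_preds: (T, K) array of base forecasts; y: (T,) outcomes.
    Assumption: exponentially weighted averaging stands in for MLpol.
    """
    T, K = base_preds.shape
    experts = [EWLSExpert(K + 1, lam) for lam in forgettings]
    cum_loss = np.zeros(K + len(experts))
    out = np.empty(T)
    for t in range(T):
        x = np.append(1.0, base_preds[t])     # intercept plus base forecasts
        cands = np.concatenate([base_preds[t], [e.predict(x) for e in experts]])
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        w /= w.sum()
        out[t] = w @ cands                    # forecast issued before y[t] is seen
        cum_loss += (cands - y[t]) ** 2       # outcome revealed, losses updated
        for e in experts:
            e.update(x, y[t])
    return out
```

Calling `hedge_forecasts(base_preds, y)` returns one aggregated forecast per time step; each forecast uses only outcomes from earlier steps, which matches the non-anticipating setting described in the abstract.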

preprint | 2022 | arXiv

Autoregressive Networks

We propose a first-order autoregressive (i.e. AR(1)) model for dynamic network processes in which edges change over time while nodes remain unchanged. The model depicts the dynamic changes explicitly. It also facilitates simple and efficient statistical inference methods including a permutation test for diagnostic checking for the fitted network models. The proposed model can be applied to the network processes with various underlying structures but with independent edges. As an illustration, an AR(1) stochastic block model has been investigated in depth, which characterizes the latent communities by the transition probabilities over time. This leads to a new and more effective spectral clustering algorithm for identifying the latent communities. We have derived a finite sample condition under which the perfect recovery of the community structure can be achieved by the newly defined spectral clustering algorithm. Furthermore the inference for a change point is incorporated into the AR(1) stochastic block model to cater for possible structure changes. We have derived the explicit error rates for the maximum likelihood estimator of the change-point. Application with three real data sets illustrates both relevance and usefulness of the proposed AR(1) models and the associated inference methods.
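
A two-state edge process of this kind can be sketched as below. The parameterization is an assumption made for illustration and is not quoted from the abstract: at each step an edge is switched on with probability `alpha`, switched off with probability `beta`, and otherwise keeps its previous state, with the same rates for all edges and independence across edges. The function names are hypothetical.

```python
import numpy as np


def simulate_ar1_network(n_nodes, n_steps, alpha, beta, seed=0):
    """Simulate independent edge processes on a fixed node set.

    Returns an (n_steps + 1, n_edges) 0/1 array, one column per node pair.
    Assumes alpha + beta <= 1 and homogeneous rates (illustrative choice).
    """
    rng = np.random.default_rng(seed)
    n_edges = n_nodes * (n_nodes - 1) // 2
    stationary = alpha / (alpha + beta)       # long-run edge probability
    x = (rng.random(n_edges) < stationary).astype(int)
    path = np.empty((n_steps + 1, n_edges), dtype=int)
    path[0] = x
    for t in range(1, n_steps + 1):
        u = rng.random(n_edges)
        # u < alpha: edge forced on; alpha <= u < alpha + beta: forced off;
        # otherwise the edge keeps its previous state.
        x = np.where(u < alpha, 1, np.where(u < alpha + beta, 0, x))
        path[t] = x
    return path


def estimate_rates(path):
    """Pooled transition-frequency estimates of (alpha, beta)."""
    prev, curr = path[:-1], path[1:]
    alpha_hat = np.sum(curr * (1 - prev)) / np.sum(1 - prev)
    beta_hat = np.sum((1 - curr) * prev) / np.sum(prev)
    return alpha_hat, beta_hat
```

Under these assumptions, `estimate_rates(simulate_ar1_network(30, 200, 0.2, 0.3))` should return values close to the true rates, because the estimates are simply the observed frequencies of off-to-on and on-to-off transitions.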

preprint | 2022 | arXiv

Blind Source Separation over Space

We propose a new estimation method for the blind source separation model of Bachoc et al. (2020). The new estimation is based on an eigenanalysis of a positive definite matrix defined in terms of multiple normalized spatial local covariance matrices, and, therefore, can handle moderately high-dimensional random fields. The consistency of the estimated mixing matrix is established with explicit error rates even when the eigen-gap decays to zero slowly. The proposed method is illustrated via both simulation and a real data example.

preprint | 2022 | arXiv

Factor Modelling for Clustering High-dimensional Time Series

We propose a new unsupervised learning method for clustering a large number of time series based on a latent factor structure. Each cluster is characterized by its own cluster-specific factors in addition to some common factors which impact on all the time series concerned. Our setting also offers the flexibility that some time series may not belong to any clusters. The consistency with explicit convergence rates is established for the estimation of the common factors, the cluster-specific factors, and the latent clusters. Numerical illustration with both simulated data as well as a real data example is also reported. As a spin-off, the proposed new approach also advances significantly the statistical inference for the factor model of Lam and Yao (2012).

preprint | 2020 | arXiv

Estimation of subgraph density in noisy networks

While it is common practice in applied network analysis to report various standard network summary statistics, these numbers are rarely accompanied by uncertainty quantification. Yet any error inherent in the measurements underlying the construction of the network, or in the network construction procedure itself, necessarily must propagate to any summary statistics reported. Here we study the problem of estimating the density of an arbitrary subgraph, given a noisy version of some underlying network as data. Under a simple model of network error, we show that consistent estimation of such densities is impossible when the rates of error are unknown and only a single network is observed. Accordingly, we develop method-of-moment estimators of network subgraph densities and error rates for the case where a minimal number of network replicates are available. These estimators are shown to be asymptotically normal as the number of vertices increases to infinity. We also provide confidence intervals for quantifying the uncertainty in these estimates based on the asymptotic normality. To construct the confidence intervals, a new and non-standard bootstrap method is proposed to compute asymptotic variances, which is infeasible otherwise. We illustrate the proposed methods in the context of gene coexpression networks.

preprint | 2020 | arXiv

Modeling Multivariate Spatial-Temporal Data with Latent Low-Dimensional Dynamics

High-dimensional multivariate spatial-temporal data arise frequently in a wide range of applications; however, there are relatively few statistical methods that can simultaneously deal with spatial, temporal and variable-wise dependencies in large data sets. In this paper, we propose a new approach to utilize the correlations in variable, space and time to achieve dimension reduction and to facilitate spatial/temporal predictions in the high-dimensional settings. The multivariate spatial-temporal process is represented as a linear transformation of a lower-dimensional latent factor process. The spatial dependence structure of the factor process is further represented non-parametrically in terms of latent empirical orthogonal functions. The low-dimensional structure is completely unknown in our setting and is learned entirely from data collected irregularly over space but regularly over time. We propose innovative estimation and prediction methods based on the latent low-rank structures. Asymptotic properties of the estimators and predictors are established. Extensive experiments on synthetic and real data sets show that, while the dimensions are reduced significantly, the spatial, temporal and variable-wise covariance structures are largely preserved. The efficacy of our method is further confirmed by the prediction performances on both synthetic and real data sets.

preprint | 2016 | arXiv

Generalized Yule-Walker Estimation for Spatio-Temporal Models with Unknown Diagonal Coefficients

We consider a class of spatio-temporal models which extend popular econometric spatial autoregressive panel data models by allowing the scalar coefficients for each location (or panel) to differ from each other. To overcome the innate endogeneity, we propose a generalized Yule-Walker estimation method which applies the least squares estimation to a Yule-Walker equation. The asymptotic theory is developed under the setting that both the sample size and the number of locations (or panels) tend to infinity under a general setting for stationary and alpha-mixing processes, which includes spatial autoregressive panel data models driven by i.i.d. innovations as special cases. The proposed methods are illustrated using both simulated and real data.

preprint | 2016 | arXiv

High Dimensional and Banded Vector Autoregressions

We consider a class of vector autoregressive models with banded coefficient matrices. The setting represents a type of sparse structure for high-dimensional time series, though the implied autocovariance matrices are not banded. The structure is also practically meaningful when the order of component time series is arranged appropriately. The convergence rates for the estimated banded autoregressive coefficient matrices are established. We also propose a Bayesian information criterion for determining the width of the bands in the coefficient matrices, which is proved to be consistent. By exploring some approximate banded structure for the auto-covariance functions of banded vector autoregressive processes, consistent estimators for the auto-covariance matrices are constructed.
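
To make the banded structure concrete, here is a minimal sketch for a first-order model. The abstract does not say how the coefficients are estimated, so row-wise least squares restricted to the band is an assumption made here for illustration, as are the names `simulate_banded_var1`, `fit_banded_var1` and `bandwidth` (the number of off-diagonals kept on each side).

```python
import numpy as np


def simulate_banded_var1(A, n_steps, seed=0):
    """Simulate y_t = A y_{t-1} + e_t with standard normal innovations."""
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    y = np.zeros((n_steps, p))
    for t in range(1, n_steps):
        y[t] = A @ y[t - 1] + rng.standard_normal(p)
    return y


def fit_banded_var1(y, bandwidth):
    """Row-wise least squares using only the lagged series inside the band.

    y: (T, p) array. Returns a (p, p) matrix that is zero outside the band.
    """
    T, p = y.shape
    past, future = y[:-1], y[1:]
    A_hat = np.zeros((p, p))
    for i in range(p):
        lo, hi = max(0, i - bandwidth), min(p, i + bandwidth + 1)
        # Regress component i on lagged components lo..hi-1 only.
        coef, *_ = np.linalg.lstsq(past[:, lo:hi], future[:, i], rcond=None)
        A_hat[i, lo:hi] = coef
    return A_hat
```

Each row needs at most `2 * bandwidth + 1` regressors regardless of the dimension, which is what makes the banded assumption a useful form of sparsity for high-dimensional series.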

preprint | 2016 | arXiv

Modelling and forecasting daily electricity load curves: a hybrid approach

We propose a hybrid approach for the modelling and the short-term forecasting of electricity loads. Two building blocks of our approach are (i) modelling the overall trend and seasonality by fitting a generalised additive model to the weekly averages of the load, and (ii) modelling the dependence structure across consecutive daily loads via curve linear regression. For the latter, a new methodology is proposed for linear regression with both curve response and curve regressors. The key idea behind the proposed methodology is the dimension reduction based on a singular value decomposition in a Hilbert space, which reduces the curve regression problem to several ordinary (i.e. scalar) linear regression problems. We illustrate the hybrid method using the French electricity loads between 1996 and 2009, on which we also compare our method with other available models including the EDF operational model.

preprint | 2015 | arXiv

High dimensional stochastic regression with latent factors, endogeneity and nonlinearity

We consider a multivariate time series model which represents a high dimensional vector process as a sum of three terms: a linear regression of some observed regressors, a linear combination of some latent and serially correlated factors, and a vector white noise. We investigate the inference without imposing stationary conditions on the target multivariate time series, the regressors and the underlying factors. Furthermore we deal with the endogeneity that there exist correlations between the observed regressors and the unobserved factors. We also consider the model with nonlinear regression term which can be approximated by a linear regression function with a large number of regressors. The convergence rates for the estimators of regression coefficients, the number of factors, factor loading space and factors are established under the settings when the dimension of time series and the number of regressors may both tend to infinity together with the sample size. The proposed method is illustrated with both simulated and real data examples.

preprint | 2014 | arXiv

A Conversation with Howell Tong

The following conversation is partly based on an interview that took place in the Hong Kong University of Science and Technology in July 2013.

preprint | 2014 | arXiv

Estimation for Dynamic and Static Panel Probit Models with Large Individual Effects

For discrete panel data, the dynamic relationship between successive observations is often of interest. We consider a dynamic probit model for short panel data. A problem with estimating the dynamic parameter of interest is that the model contains a large number of nuisance parameters, one for each individual. Heckman proposed to use maximum likelihood estimation of the dynamic parameter, which, however, does not perform well if the individual effects are large. We suggest new estimators for the dynamic parameter, based on the assumption that the individual parameters are random and possibly large. Theoretical properties of our estimators are derived and a simulation study shows they have some advantages compared to Heckman's estimator.

preprint | 2013 | arXiv

Estimation of Extreme Quantiles for Functions of Dependent Random Variables

We propose a new method for estimating the extreme quantiles for a function of several dependent random variables. In contrast to the conventional approach based on extreme value theory, we do not impose the condition that the tail of the underlying distribution admits an approximate parametric form, and, furthermore, our estimation makes use of the full observed data. The proposed method is semiparametric as no parametric forms are assumed on all the marginal distributions. But we select appropriate bivariate copulas to model the joint dependence structure by taking the advantage of the recent development in constructing large dimensional vine copulas. Consequently a sample quantile resulted from a large bootstrap sample drawn from the fitted joint distribution is taken as the estimator for the extreme quantile. This estimator is proved to be consistent. The reliable and robust performance of the proposed method is further illustrated by simulation.

preprint | 2012 | arXiv

Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong

Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong [arXiv:1104.3073]

preprint | 2012 | arXiv

Factor modeling for high-dimensional time series: Inference for the number of factors

This paper deals with the factor modeling for high-dimensional time series based on a dimension-reduction viewpoint. Under stationary settings, the inference is simple in the sense that both the number of factors and the factor loadings are estimated in terms of an eigenanalysis for a nonnegative definite matrix, and is therefore applicable when the dimension of time series is on the order of a few thousands. Asymptotic properties of the proposed method are investigated under two settings: (i) the sample size goes to infinity while the dimension of time series is fixed; and (ii) both the sample size and the dimension of time series go to infinity together. In particular, our estimators for zero-eigenvalues enjoy faster convergence (or slower divergence) rates, hence making the estimation for the number of factors easier. In particular, when the sample size and the dimension of time series go to infinity together, the estimators for the eigenvalues are no longer consistent. However, our estimator for the number of the factors, which is based on the ratios of the estimated eigenvalues, still works fine. Furthermore, this estimation shows the so-called "blessing of dimensionality" property in the sense that the performance of the estimation may improve when the dimension of time series increases. A two-step procedure is investigated when the factors are of different degrees of strength. Numerical illustration with both simulated and real data is also reported.
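
The eigenvalue-ratio idea can be sketched as follows. The abstract only states that a nonnegative definite matrix is eigen-decomposed and that ratios of estimated eigenvalues determine the number of factors; the specific matrix used here (a sum of squared lagged sample autocovariance matrices), the lag cut-off `max_lag`, and the search range of half the dimension are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def simulate_factor_series(n=1000, p=20, phis=(0.8, -0.8), seed=0):
    """Generate y_t = A f_t + e_t with AR(1) factors and white noise e_t."""
    rng = np.random.default_rng(seed)
    r = len(phis)
    A = rng.uniform(-1, 1, size=(p, r))       # dense loadings: strong factors
    f = np.zeros((n, r))
    for t in range(1, n):
        f[t] = np.array(phis) * f[t - 1] + rng.standard_normal(r)
    return f @ A.T + rng.standard_normal((n, p))


def estimate_num_factors(y, max_lag=5):
    """Ratio-based estimate of the number of factors.

    y: (n, p) array. Returns (r_hat, loadings, eigenvalues), eigenvalues
    sorted in decreasing order.
    """
    n, p = y.shape
    yc = y - y.mean(axis=0)
    M = np.zeros((p, p))
    for k in range(1, max_lag + 1):
        sigma_k = yc[k:].T @ yc[:-k] / n      # lag-k sample autocovariance
        M += sigma_k @ sigma_k.T              # keeps M nonnegative definite
    eigvals, eigvecs = np.linalg.eigh(M)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    R = p // 2                                # assumed search range
    ratios = eigvals[1:R + 1] / eigvals[:R]
    r_hat = int(np.argmin(ratios)) + 1        # sharpest drop in the spectrum
    return r_hat, eigvecs[:, :r_hat], eigvals
```

A typical call is `r_hat, loadings, eigvals = estimate_num_factors(simulate_factor_series())`. Because only lags of one and above enter the matrix, white noise contributes nothing to it in population, so the leading eigenvectors estimate the factor loading space and the remaining eigenvalues stay small.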

preprint | 2012 | arXiv

Identifying the finite dimensionality of curve time series

The curve time series framework provides a convenient vehicle to accommodate some nonstationary features into a stationary setup. We propose a new method to identify the dimensionality of curve time series based on the dynamical dependence across different curves. The practical implementation of our method boils down to an eigenanalysis of a finite-dimensional matrix. Furthermore, the determination of the dimensionality is equivalent to the identification of the nonzero eigenvalues of the matrix, which we carry out in terms of some bootstrap tests. Asymptotic properties of the proposed method are investigated. In particular, our estimators for zero-eigenvalues enjoy the fast convergence rate $n$ while the estimators for nonzero eigenvalues converge at the standard $\sqrt{n}$-rate. The proposed methodology is illustrated with both simulated and real data sets.

preprint | 2010 | arXiv

Estimation for Latent Factor Models for High-Dimensional Time Series

This paper deals with the dimension reduction for high-dimensional time series based on common factors. In particular we allow the dimension of time series $p$ to be as large as, or even larger than, the sample size $n$. The estimation for the factor loading matrix and the factor process itself is carried out via an eigenanalysis for a $p\times p$ non-negative definite matrix. We show that when all the factors are strong in the sense that the norm of each column in the factor loading matrix is of the order $p^{1/2}$, the estimator for the factor loading matrix, as well as the resulting estimator for the precision matrix of the original $p$-variant time series, are weakly consistent in $L_2$-norm with the convergence rates independent of $p$. This result exhibits clearly that the 'curse' is canceled out by the 'blessings' in dimensionality. We also establish the asymptotic properties of the estimation when not all factors are strong. For the latter case, a two-step estimation procedure is preferred according to the asymptotic theory. The proposed methods together with their asymptotic properties are further illustrated in a simulation study. An application to a real data set is also reported.

preprint | 2007 | arXiv

Exploring spatial nonlinearity using additive approximation

We propose to approximate the conditional expectation of a spatial random variable given its nearest-neighbour observations by an additive function. The setting is meaningful in practice and requires no unilateral ordering. It is capable of catching nonlinear features in spatial data and exploring local dependence structures. Our approach is different from both Markov field methods and disjunctive kriging. The asymptotic properties of the additive estimators have been established for $α$-mixing spatial processes by extending the theory of the backfitting procedure to the spatial case. This facilitates the confidence intervals for the component functions, although the asymptotic biases have to be estimated via (wild) bootstrap. Simulation results are reported. Applications to real data illustrate that the improvement in describing the data over the auto-normal scheme is significant when nonlinearity or non-Gaussianity is pronounced.
