Source author record

Romain Couillet

Romain Couillet appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory, math.IT, Machine Learning, math.PR, math.ST, Statistics Theory, Applications, Computer Science and Game Theory, Neural and Evolutionary Computing, physics.soc-ph, q-fin.PM

Catalog footprint

45 works · 11 topics · 4 close collaborators


preprint · 2022 · arXiv

A Concentration of Measure and Random Matrix Approach to Large Dimensional Robust Statistics

This article studies the \emph{robust covariance matrix estimation} of a data collection $X = (x_1,\ldots,x_n)$ with $x_i = \sqrt{τ_i} z_i + m$, where $z_i \in \mathbb R^p$ is a \textit{concentrated vector} (e.g., an elliptical random vector), $m\in \mathbb R^p$ a deterministic signal and $τ_i\in \mathbb R$ a scalar perturbation of possibly large amplitude, under the assumption that both $n$ and $p$ are large. This estimator is defined as the fixed point of a function which we show is contracting for a so-called \textit{stable semi-metric}. We exploit this semi-metric along with concentration of measure arguments to prove the existence and uniqueness of the robust estimator and to evaluate its limiting spectral distribution.

preprint · 2022 · arXiv

A Random Matrix Perspective on Random Tensors

Tensor models play an increasingly prominent role in many fields, notably in machine learning. In several applications, such as community detection, topic modeling and Gaussian mixture learning, one must estimate a low-rank signal from a noisy tensor. Hence, understanding the fundamental limits of estimators of that signal inevitably calls for the study of random tensors. Substantial progress has been recently achieved on this subject in the large-dimensional limit. Yet, some of the most significant among these results--in particular, a precise characterization of the abrupt phase transition (with respect to signal-to-noise ratio) that governs the performance of the maximum likelihood (ML) estimator of a symmetric rank-one model with Gaussian noise--were derived based on mean-field spin glass theory, which is not easily accessible to non-experts. In this work, we develop a sharply distinct and more elementary approach, relying on standard but powerful tools brought by years of advances in random matrix theory. The key idea is to study the spectra of random matrices arising from contractions of a given random tensor. We show how this gives access to spectral properties of the random tensor itself. For the aforementioned rank-one model, our technique yields a hitherto unknown fixed-point equation whose solution precisely matches the asymptotic performance of the ML estimator above the phase transition threshold in the third-order case. A numerical verification provides evidence that the same holds for orders 4 and 5, leading us to conjecture that, for any order, our fixed-point equation is equivalent to the known characterization of the ML estimation performance that had been obtained by relying on spin glasses. Moreover, our approach sheds light on certain properties of the ML problem landscape in large dimensions and can be extended to other models, such as asymmetric and non-Gaussian ones.

preprint · 2022 · arXiv

Random matrices in service of ML footprint: ternary random features with no performance loss

In this article, we investigate the spectral behavior of random features kernel matrices of the type ${\bf K} = \mathbb{E}_{\bf w} \left[σ\left({\bf w}^{\sf T}{\bf x}_i\right)σ\left({\bf w}^{\sf T}{\bf x}_j\right)\right]_{i,j=1}^n$, with nonlinear function $σ(\cdot)$, data ${\bf x}_1, \ldots, {\bf x}_n \in \mathbb{R}^p$, and random projection vector ${\bf w} \in \mathbb{R}^p$ having i.i.d. entries. In a high-dimensional setting where the number of data $n$ and their dimension $p$ are both large and comparable, we show, under a Gaussian mixture model for the data, that the eigenspectrum of ${\bf K}$ is independent of the distribution of the i.i.d. (zero-mean and unit-variance) entries of ${\bf w}$, and only depends on $σ(\cdot)$ via its (generalized) Gaussian moments $\mathbb{E}_{z\sim \mathcal N(0,1)}[σ'(z)]$ and $\mathbb{E}_{z\sim \mathcal N(0,1)}[σ''(z)]$. As a result, for any kernel matrix ${\bf K}$ of the form above, we propose a novel random features technique, called Ternary Random Feature (TRF), that (i) asymptotically yields the same limiting kernel as the original ${\bf K}$ in a spectral sense and (ii) can be computed and stored much more efficiently, by wisely tuning (in a data-dependent manner) the function $σ$ and the random vector ${\bf w}$, both taking values in $\{-1,0,1\}$. The computation of the proposed random features requires no multiplication, and $b$ times fewer bits for storage compared to classical random features such as random Fourier features, with $b$ the number of bits needed to store full-precision values. Besides, it appears in our experiments on real data that the substantial gains in computation and storage are accompanied by somewhat improved performance compared to state-of-the-art random features compression/quantization methods.

preprint · 2021 · arXiv

Concentration of Measure and Large Random Matrices with an application to Sample Covariance Matrices

The present work provides an original framework for random matrix analysis based on revisiting the concentration of measure theory from a probabilistic point of view. By providing various notions of vector concentration ($q$-exponential, linear, Lipschitz, convex), a set of elementary tools is laid out that allows for the immediate extension of classical results from random matrix theory involving random concentrated vectors in place of vectors with independent entries. These findings are exemplified here in the context of sample covariance matrices but find a large range of applications in statistical learning and beyond, thanks to the broad adaptability of our hypotheses.

preprint · 2020 · arXiv

A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent

This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on real-world data sets.
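The classical limit mentioned in this abstract (the RFF Gram matrix approaching the Gaussian kernel matrix when only $N$ grows) can be illustrated numerically. The following is a minimal sketch, not the paper's analysis: it assumes Gaussian projections $w_k \sim \mathcal N(0, I_p)$, cosine and sine features, and the kernel $\exp(-\|x_i - x_j\|^2/2)$; the function names `rff_gram` and `gaussian_kernel` are hypothetical.

```python
import numpy as np

def rff_gram(X, N, rng):
    """Random Fourier feature Gram matrix for data X of shape (p, n).

    Assumption: N Gaussian projections w_k ~ N(0, I_p), cos/sin features.
    """
    p, n = X.shape
    W = rng.standard_normal((N, p))                  # one projection per row
    WX = W @ X                                       # (N, n) projections
    features = np.vstack([np.cos(WX), np.sin(WX)])   # (2N, n) feature matrix
    return features.T @ features / N                 # (n, n) Gram matrix

def gaussian_kernel(X):
    """Gaussian kernel matrix exp(-||x_i - x_j||^2 / 2) for X of shape (p, n)."""
    sq_norms = np.sum(X**2, axis=0)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0))

rng = np.random.default_rng(0)
p, n = 10, 20
X = rng.standard_normal((p, n)) / np.sqrt(p)
K = gaussian_kernel(X)

# Largest entrywise gap between the RFF Gram matrix and the Gaussian kernel.
gaps = {N: np.max(np.abs(rff_gram(X, N, rng) - K)) for N in (50, 20000)}
for N, gap in gaps.items():
    print(f"N = {N:6d}: max entrywise gap = {gap:.3f}")
```

Here $n$ and $p$ are kept small while $N$ grows, which is the regime where the Gaussian kernel limit applies; the abstract's point is that this approximation breaks down when $n$, $p$ and $N$ are all large and comparable.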

preprint · 2020 · arXiv

Consistent Semi-Supervised Graph Regularization for High Dimensional Data

Semi-supervised Laplacian regularization, a standard graph-based approach for learning from both labelled and unlabelled data, was recently shown to benefit only insignificantly from unlabelled data in high dimensions (Mai and Couillet 2018), causing it to be outperformed by its unsupervised counterpart, spectral clustering, given sufficient unlabelled data. Following a detailed discussion on the origin of this inconsistency problem, a novel regularization approach involving a centering operation is proposed as a solution, supported by both theoretical analysis and empirical results.

preprint · 2020 · arXiv

Large Dimensional Analysis and Improvement of Multi Task Learning

Multi Task Learning (MTL) efficiently leverages useful information contained in multiple related tasks to help improve the generalization performance of all tasks. This article conducts a large dimensional analysis of a Least Square Support Vector Machine (LSSVM) version of MTL, which is simple but, as we shall see, extremely powerful when carefully tuned, in the regime where the dimension $p$ of the data and their number $n$ grow large at the same rate. Under mild assumptions on the input data, the theoretical analysis of the MTL-LSSVM algorithm first reveals the "sufficient statistics" exploited by the algorithm and their interaction at work. These results demonstrate, as a striking consequence, that the standard approach to MTL-LSSVM is largely suboptimal and can lead to severe negative transfer, but that these impairments are easily corrected. These corrections are turned into an improved MTL-LSSVM algorithm which can only benefit from additional data, and the theoretical performance of which is also analyzed. As evidenced and theoretically sustained in numerous recent works, these large dimensional results are robust to broad ranges of data distributions, which our present experiments corroborate. Specifically, the article reports a systematically close behavior between theoretical and empirical performances on popular datasets, which is strongly suggestive of the applicability of the proposed carefully tuned MTL-LSSVM method to real data. This fine-tuning is fully based on the theoretical analysis and, in particular, does not require any cross validation procedure. Besides, the reported performances on real datasets almost systematically outperform much more elaborate and less intuitive state-of-the-art multi-task and transfer learning methods.

preprint · 2020 · arXiv

Optimal Laplacian regularization for sparse spectral community detection

Regularization of the classical Laplacian matrices was empirically shown to improve spectral clustering in sparse networks. It was observed that small regularizations are preferable, but this point was left as a heuristic argument. In this paper we formally determine a proper regularization which is intimately related to alternative state-of-the-art spectral techniques for sparse graphs.

preprint · 2020 · arXiv

Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures

This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called \textit{concentrated} random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture model, behave asymptotically (as $n,p\to \infty$) as if the $x_i$ were drawn from a Gaussian mixture, suggests that DL representations of GAN-data can be fully described by their first two statistical moments for a wide range of standard classifiers. Our theoretical findings are validated by generating images with the BigGAN model and across different popular deep representation networks.

preprint · 2019 · arXiv

Random Matrix Improved Covariance Estimation for a Large Class of Metrics

Relying on recent advances in statistical estimation of covariance distances based on random matrix theory, this article proposes an improved covariance and precision matrix estimation for a wide family of metrics. The method is shown to largely outperform the sample covariance matrix estimate and to compete with state-of-the-art methods, while at the same time being computationally simpler. Applications to linear and quadratic discriminant analyses also demonstrate significant gains, therefore suggesting practical interest to statistical machine learning.

preprint · 2016 · arXiv

Kernel spectral clustering of large dimensional data

This article proposes a first analysis of kernel spectral clustering methods in the regime where the dimension $p$ of the data vectors to be clustered and their number $n$ grow large at the same rate. We demonstrate, under a $k$-class Gaussian mixture model, that the normalized Laplacian matrix associated with the kernel matrix asymptotically behaves similarly to a so-called spiked random matrix. Some of the isolated eigenvalue-eigenvector pairs in this model are shown to carry the clustering information upon a separability condition classical in spiked matrix models. We evaluate precisely the position of these eigenvalues and the content of the eigenvectors, which unveil important (sometimes quite disruptive) aspects of kernel spectral clustering from both theoretical and practical standpoints. Our results are then compared to the actual clustering performance of images from the MNIST database, thereby revealing an important match between theory and practice.

preprint · 2016 · arXiv

Large System Analysis of Base Station Cooperation for Power Minimization

This work focuses on a large-scale multi-cell multi-user MIMO system in which $L$ base stations (BSs) of $N$ antennas each communicate with $K$ single-antenna user equipments. We consider the design of the linear precoder that minimizes the total power consumption while ensuring target user rates. Three configurations with different degrees of cooperation among BSs are considered: the coordinated beamforming scheme (only channel state information is shared among BSs), the coordinated multipoint MIMO processing technology or network MIMO (channel state and data cooperation), and a single cell beamforming scheme (only local channel state information is used for beamforming while channel state cooperation is needed for power allocation). The analysis is conducted assuming that $N$ and $K$ grow large with a nontrivial ratio $K/N$ and that imperfect channel state information (modeled by the generic Gauss-Markov formulation) is available at the BSs. Tools of random matrix theory are used to compute, in explicit form, deterministic approximations for: (i) the parameters of the optimal precoder; (ii) the powers needed to ensure target rates; and (iii) the total transmit power. These results are instrumental to get further insight into the structure of the optimal precoders and also to reduce the implementation complexity in large-scale networks. Numerical results are used to validate the asymptotic analysis in the finite system regime and to make comparisons among the different configurations.

preprint · 2016 · arXiv

Random matrices meet machine learning: a large dimensional analysis of LS-SVM

This article proposes a performance analysis of kernel least squares support vector machines (LS-SVMs) based on a random matrix approach, in the regime where both the dimension of data $p$ and their number $n$ grow large at the same rate. Under a two-class Gaussian mixture model for the input data, we prove that the LS-SVM decision function is asymptotically normal with means and covariances shown to depend explicitly on the derivatives of the kernel function. This provides improved understanding along with new insights into the internal workings of SVM-type methods for large datasets.

preprint · 2016 · arXiv

Spectral analysis of the Gram matrix of mixture models

This text is devoted to the asymptotic study of some spectral properties of the Gram matrix $W^{\sf T} W$ built upon a collection $w_1, \ldots, w_n\in \mathbb{R}^p$ of random vectors (the columns of $W$), as both the number $n$ of observations and the dimension $p$ of the observations tend to infinity and are of similar order of magnitude. The random vectors $w_1, \ldots, w_n$ are independent observations, each of them belonging to one of $k$ classes $\mathcal{C}_1,\ldots, \mathcal{C}_k$. The observations of each class $\mathcal{C}_a$ ($1\le a\le k$) are characterized by their distribution $\mathcal{N}(0, p^{-1}C_a)$, where $C_1, \ldots, C_k$ are some non negative definite $p\times p$ matrices. The cardinality $n_a$ of class $\mathcal{C}_a$ and the dimension $p$ of the observations are such that $\frac{n_a}{n}$ ($1\le a\le k$) and $\frac{p}{n}$ stay bounded away from $0$ and $+\infty$. We provide deterministic equivalents to the empirical spectral distribution of $W^{\sf T}W$ and to the matrix entries of its resolvent (as well as of the resolvent of $WW^{\sf T}$). These deterministic equivalents are defined thanks to the solutions of a fixed-point system. Besides, we prove that $W^{\sf T} W$ has asymptotically no eigenvalues outside the bulk of its spectrum, defined thanks to these deterministic equivalents. These results are directly used in our companion paper "Kernel spectral clustering of large dimensional data", which is devoted to the analysis of the spectral clustering algorithm in large dimensions. They also find applications in various other fields such as wireless communications where functionals of the aforementioned resolvents allow one to assess the communication performance across multi-user multi-antenna channels.
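The statement that $W^{\sf T} W$ has asymptotically no eigenvalues outside the bulk can be checked numerically in the simplest special case. The sketch below is an illustration under stated assumptions, not the paper's general result: it takes a single class ($k = 1$) with $C_1 = I_p$, for which the bulk is the classical Marchenko-Pastur support $[(1-\sqrt{c})^2, (1+\sqrt{c})^2]$ with $c = n/p$. The sizes `p` and `n` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 400, 200

# Columns w_i ~ N(0, I_p / p): the single-class case with C_1 = I_p (assumption).
W = rng.standard_normal((p, n)) / np.sqrt(p)
eigs = np.linalg.eigvalsh(W.T @ W)       # n real eigenvalues, ascending order

# Marchenko-Pastur support edges for the ratio c = n / p.
c = n / p
lower, upper = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2

print(f"predicted bulk: [{lower:.3f}, {upper:.3f}]")
print(f"observed range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"mean eigenvalue: {eigs.mean():.3f}")
```

At finite size the extreme eigenvalues sit close to the predicted edges rather than exactly on them. With several classes and general $C_a$, the bulk is instead described by the fixed-point system mentioned in the abstract.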

preprint · 2016 · arXiv

Spectral community detection in heterogeneous large networks

In this article, we study spectral methods for community detection based on an $α$-parametrized normalized modularity matrix, hereafter called ${\bf L}_α$, in heterogeneous graph models. We show, in a regime where community detection is not asymptotically trivial, that ${\bf L}_α$ can be well approximated by a more tractable random matrix which falls in the family of spiked random matrices. The analysis of this equivalent spiked random matrix allows us to improve spectral methods for community detection and assess their performances in the regime under study. In particular, we prove the existence of an optimal value $α_{\rm opt}$ of the parameter $α$ for which the detection of communities is best ensured and we provide an on-line estimation of $α_{\rm opt}$ based only on the knowledge of the graph adjacency matrix. Unlike classical spectral methods for community detection where clustering is performed on the eigenvectors associated with extreme eigenvalues, we show through our theoretical analysis that a regularization should instead be performed on those eigenvectors prior to clustering in heterogeneous graphs. Finally, through a deeper study of the regularized eigenvectors used for clustering, we assess the performances of our new algorithm for community detection. Numerical simulations in the course of the article show that our methods outperform state-of-the-art spectral methods on dense heterogeneous graphs.

preprint · 2016 · arXiv

The Asymptotic Performance of Linear Echo State Neural Networks

In this article, a study of the mean-square error (MSE) performance of linear echo-state neural networks is performed, both for training and testing tasks. Considering the realistic setting of noise present at the network nodes, we derive deterministic equivalents for the aforementioned MSE in the limit where the number of input data $T$ and network size $n$ both grow large. Specializing then the network connectivity matrix to specific random settings, we further obtain simple formulas that provide new insights on the performance of such networks.

preprint · 2015 · arXiv

A Robust Statistics Approach to Minimum Variance Portfolio Optimization

We study the design of portfolios under a minimum risk criterion. The performance of the optimized portfolio relies on the accuracy of the estimated covariance matrix of the portfolio asset returns. For large portfolios, the number of available market returns is often of similar order to the number of assets, so that the sample covariance matrix performs poorly as a covariance estimator. Additionally, financial market data often contain outliers which, if not correctly handled, may further corrupt the covariance estimation. We address these shortcomings by studying the performance of a hybrid covariance matrix estimator based on Tyler's robust M-estimator and on Ledoit-Wolf's shrinkage estimator while assuming samples with heavy-tailed distribution. Employing recent results from random matrix theory, we develop a consistent estimator of (a scaled version of) the realized portfolio risk, which is minimized by optimizing online the shrinkage intensity. Our portfolio optimization method is shown via simulations to outperform existing methods both for synthetic and real market data.
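To make the dependence of the portfolio on the covariance estimate concrete, here is a minimal sketch of minimum variance weights computed from a shrunk covariance matrix. It is a simplification and not the paper's method: plain linear shrinkage toward a scaled identity, with a fixed and hypothetical `intensity`, stands in for the hybrid Tyler/Ledoit-Wolf estimator and its online-optimized shrinkage. Only the budget constraint $w^{\sf T} 1 = 1$ is assumed, and the data are synthetic.

```python
import numpy as np

def linear_shrinkage(returns, intensity):
    """Shrink the sample covariance of `returns` (n observations x N assets)
    toward a scaled identity. `intensity` in [0, 1] is a hypothetical fixed knob."""
    sample_cov = np.cov(returns, rowvar=False)
    N = sample_cov.shape[0]
    target = np.trace(sample_cov) / N * np.eye(N)
    return (1.0 - intensity) * sample_cov + intensity * target

def min_variance_weights(cov):
    """Minimize w' cov w subject to sum(w) = 1: w = cov^{-1} 1 / (1' cov^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)

# Synthetic market: n observations of N asset returns, n of the same order as N.
rng = np.random.default_rng(0)
N, n = 30, 40
A = rng.standard_normal((N, N))
true_cov = A @ A.T / N + 0.1 * np.eye(N)
returns = rng.standard_normal((n, N)) @ np.linalg.cholesky(true_cov).T

cov_hat = linear_shrinkage(returns, intensity=0.3)
w = min_variance_weights(cov_hat)
equal = np.full(N, 1.0 / N)

print("in-sample risk, optimized weights:", w @ cov_hat @ w)
print("in-sample risk, equal weights:    ", equal @ cov_hat @ equal)
print("realized risk under true_cov:     ", w @ true_cov @ w)
```

The realized risk `w @ true_cov @ w` is the quantity the abstract proposes to estimate consistently and minimize over the shrinkage intensity; in this sketch it can be printed only because the synthetic `true_cov` is known.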

preprint · 2015 · arXiv

Analysis of the limiting spectral measure of large random matrices of the separable covariance type

Consider the random matrix $Σ = D^{1/2} X \widetilde D^{1/2}$ where $D$ and $\widetilde D$ are deterministic Hermitian nonnegative matrices with respective dimensions $N \times N$ and $n \times n$, and where $X$ is a random matrix with independent and identically distributed centered elements with variance $1/n$. Assume that the dimensions $N$ and $n$ grow to infinity at the same pace, and that the spectral measures of $D$ and $\widetilde D$ converge as $N,n \to\infty$ towards two probability measures. Then it is known that the spectral measure of $ΣΣ^*$ converges towards a probability measure $μ$ characterized by its Stieltjes Transform. In this paper, it is shown that $μ$ has a density away from zero, that this density is analytic wherever it is positive, and that it behaves in most cases as $\sqrt{|x - a|}$ near an edge $a$ of its support. A complete characterization of the support of $μ$ is also provided. Besides its mathematical interest, this analysis finds applications in a certain class of statistical estimation problems.

preprint · 2015 · arXiv

Convergence and Fluctuations of Regularized Tyler Estimators

This article studies the behavior of regularized Tyler estimators (RTEs) of scatter matrices. The key advantages of these estimators are twofold. First, they guarantee by construction a good conditioning of the estimate and second, being a derivative of robust Tyler estimators, they inherit their robustness properties, notably their resilience to the presence of outliers. Nevertheless, one major problem posed by the use of RTEs in practice is the setting of the regularization parameter $ρ$. While a high value of $ρ$ is likely to push all the eigenvalues away from zero, it comes at the cost of a larger bias with respect to the population covariance matrix. A deep understanding of the statistics of RTEs is essential to come up with appropriate choices for the regularization parameter. This is not an easy task and might be out of reach, unless one considers asymptotic regimes wherein the number of observations $n$ and/or their size $N$ increase together. First asymptotic results have recently been obtained under the assumption that $N$ and $n$ are large and commensurable. Interestingly, no results exist concerning the regime of $n$ going to infinity with $N$ fixed, even though the investigation of this regime has usually predated the analysis of the more difficult case where both $N$ and $n$ are large. This motivates our work. In particular, we prove in the present paper that the RTEs converge to a deterministic matrix when $n\to\infty$ with $N$ fixed, which is expressed as a function of the theoretical covariance matrix. We also derive the fluctuations of the RTEs around this deterministic matrix and establish that these fluctuations converge in distribution to a multivariate Gaussian distribution with zero mean and a covariance depending on the population covariance and the parameter $ρ$.
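The abstract does not spell out the estimator, so the following sketch rests on an assumption: it uses the fixed-point form common in the regularized Tyler literature, $\hat C(ρ) = (1-ρ)\frac1n\sum_{i=1}^n \frac{x_i x_i^{\sf T}}{\frac1N x_i^{\sf T}\hat C(ρ)^{-1}x_i} + ρ I_N$ with $ρ \in (\max(0, 1 - n/N), 1]$, computed by plain fixed-point iteration on real-valued data. The function name `regularized_tyler`, the stopping rule and the synthetic data are illustrative choices.

```python
import numpy as np

def regularized_tyler(X, rho, n_iter=500, tol=1e-10):
    """Fixed-point iteration for an assumed regularized Tyler estimator.

    X: (N, n) real array whose columns are the n observations of size N.
    rho: regularization parameter, assumed to lie in (max(0, 1 - n/N), 1].
    """
    N, n = X.shape
    C = np.eye(N)
    for _ in range(n_iter):
        # Quadratic forms (1/N) x_i' C^{-1} x_i, one per observation.
        q = np.einsum("ij,ij->j", X, np.linalg.solve(C, X)) / N
        # (X / q) divides column i by q_i, so the product sums x_i x_i' / q_i.
        C_new = (1.0 - rho) * (X / q) @ X.T / n + rho * np.eye(N)
        if np.linalg.norm(C_new - C) < tol * np.linalg.norm(C):
            return C_new
        C = C_new
    return C

# Heavy-tailed-like synthetic data: Gaussian vectors with random per-sample scales.
rng = np.random.default_rng(0)
N, n, rho = 20, 50, 0.5
tau = rng.exponential(size=n) + 0.1
X = rng.standard_normal((N, n)) * np.sqrt(tau)

C = regularized_tyler(X, rho)
# Rescaling each observation by an arbitrary positive factor leaves C unchanged.
C_rescaled = regularized_tyler(X * rng.uniform(0.5, 5.0, size=n), rho)

print("smallest eigenvalue:", np.linalg.eigvalsh(C).min(), "(at least rho =", rho, ")")
print("normalized trace of inverse:", np.trace(np.linalg.inv(C)) / N)
print("max change after rescaling:", np.max(np.abs(C - C_rescaled)))
```

Two properties of this assumed form are visible in the output: every eigenvalue is at least $ρ$, which is the conditioning guarantee the abstract refers to, and the estimate does not depend on the per-sample scales, which is the source of its robustness to outliers of large amplitude.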

preprint · 2015 · arXiv

Large Dimensional Analysis and Optimization of Robust Shrinkage Covariance Matrix Estimators

This article studies two regularized robust estimators of scatter matrices proposed (and proved to be well defined) in parallel in (Chen et al., 2011) and (Pascal et al., 2013), based on Tyler's robust M-estimator (Tyler, 1987) and on Ledoit and Wolf's shrinkage covariance matrix estimator (Ledoit and Wolf, 2004). These hybrid estimators have the advantage of adding to the classical sample covariance matrix estimator (i) robustness to outliers or impulsive samples and (ii) adequacy to small sample sizes. We consider here the case of i.i.d. elliptical zero mean samples in the regime where both sample and population sizes are large. We demonstrate that, under this setting, the estimators under study asymptotically behave similarly to well-understood random matrix models. This characterization allows us to derive optimal shrinkage strategies to estimate the population scatter matrix, improving significantly upon the empirical shrinkage method proposed in (Chen et al., 2011).

preprint · 2015 · arXiv

Large Dimensional Analysis of Robust M-Estimators of Covariance with Outliers

A large dimensional characterization of robust M-estimators of covariance (or scatter) is provided under the assumption that the dataset comprises independent (essentially Gaussian) legitimate samples as well as arbitrary deterministic samples, referred to as outliers. Building upon recent random matrix advances in the area of robust statistics, we specifically show that the so-called Maronna M-estimator of scatter asymptotically behaves similarly to well-known random matrices when the population and sample sizes grow together to infinity. The introduction of outliers leads the robust estimator to behave asymptotically as the weighted sum of the sample outer products, with a constant weight for all legitimate samples and different weights for the outliers. A fine analysis of this structure reveals, importantly, that the propensity of the M-estimator to attenuate (or enhance) the impact of outliers is mostly dictated by the alignment of the outliers with the inverse population covariance matrix of the legitimate samples. Thus, robust M-estimators can bring substantial benefits over more simplistic estimators such as the per-sample normalized version of the sample covariance matrix, which is not capable of differentiating the outlying samples. The analysis shows that, within the class of Maronna's estimators of scatter, the Huber estimator is most favorable for rejecting outliers. On the contrary, estimators more similar to Tyler's scale invariant estimator (often preferred in the literature) run the risk of inadvertently enhancing some outliers.

preprint · 2015 · arXiv

Optimal Design of the Adaptive Normalized Matched Filter Detector

This article addresses improvements on the design of the adaptive normalized matched filter (ANMF) for radar detection. It is well-acknowledged that the estimation of the noise-clutter covariance matrix is a fundamental step in adaptive radar detection. In this paper, we consider regularized estimation methods which force by construction the eigenvalues of the scatter estimates to be greater than a positive regularization parameter rho. This makes them more suitable for high dimensional problems with a limited number of secondary data samples than traditional sample covariance estimates. While an increase of rho seems to improve the conditioning of the estimate, it might however cause it to significantly deviate from the true covariance matrix. The setting of the optimal regularization parameter is a difficult question for which no convincing answers have thus far been provided. This constitutes the major motivation behind our work. More specifically, we consider the design of the ANMF detector for two kinds of regularized estimators, namely the regularized sample covariance matrix (RSCM), appropriate when the clutter follows a Gaussian distribution and the regularized Tyler estimator (RTE) for non-Gaussian spherically invariant distributed clutters. Based on recent random matrix theory results studying the asymptotic fluctuations of the statistics of the ANMF detector when the number of samples and their dimension grow together to infinity, we propose a design for the regularization parameter that maximizes the detection probability under constant false alarm rates. Simulation results which support the efficiency of the proposed method are provided in order to illustrate the gain of the proposed optimal design over conventional settings of the regularization parameter.

preprint · 2015 · arXiv

The Second-Order Coding Rate of the MIMO Rayleigh Block-Fading Channel

The second-order coding rate of the multiple-input multiple-output (MIMO) quasi-static Rayleigh fading channel is studied. We tackle this problem via an information-spectrum approach and statistical bounds based on recent random matrix theory techniques. We derive a central limit theorem (CLT) to analyze the information density in the regime where the block-length $n$ and the number of transmit and receive antennas $K$ and $N$, respectively, grow simultaneously large. This result leads to the characterization of closed-form upper and lower bounds on the optimal average error probability when the coding rate is within $O((nK)^{-1/2})$ of the asymptotic capacity.

preprint · 2014 · arXiv

Estimation of Toeplitz Covariance Matrices in Large Dimensional Regime with Application to Source Detection

In this article, we derive concentration inequalities for the spectral norm of two classical sample estimators of large dimensional Toeplitz covariance matrices, demonstrating in particular their asymptotic almost sure consistency. The consistency is then extended to the case where the aggregated matrix of time samples is corrupted by a rank one (or more generally, low rank) matrix. As an application of the latter, the problem of source detection in the context of large dimensional sensor networks within a temporally correlated noise environment is studied. As opposed to standard procedures, this application is performed online, i.e. without the need to possess a learning set of pure noise samples.

preprint · 2014 · arXiv

On the convergence of Maronna's $M$-estimators of scatter

In this paper, we propose an alternative proof for the uniqueness of Maronna's $M$-estimator of scatter (Maronna, 1976) for $N$ vector observations $\mathbf y_1,\ldots,\mathbf y_N\in\mathbb R^m$ under a mild constraint of linear independence of any subset of $m$ of these vectors. This entails in particular almost sure uniqueness for random vectors $\mathbf y_i$ with a density as long as $N>m$. This approach allows one to establish further relations that demonstrate that a properly normalized Tyler's $M$-estimator of scatter (Tyler, 1987) can be considered as a limit of Maronna's $M$-estimator. More precisely, the contribution is to show that each $M$-estimator converges towards a particular Tyler's $M$-estimator. These results find important implications in recent works on the large dimensional (random matrix) regime of robust $M$-estimation.

preprint · 2014 · arXiv

Robust Estimates of Covariance Matrices in the Large Dimensional Regime

This article studies the limiting behavior of a class of robust population covariance matrix estimators, originally due to Maronna in 1976, in the regime where both the number of available samples and the population size grow large. Using tools from random matrix theory, we prove that, for sample vectors made of independent entries having some moment conditions, the difference between the sample covariance matrix and (a scaled version of) such robust estimator tends to zero in spectral norm, almost surely. This result can be applied to various statistical methods arising from random matrix theory that can be made robust without altering their first order behavior.

preprint · 2014 · arXiv

Robust spiked random matrices and a robust G-MUSIC estimator

A class of robust estimators of scatter applied to information-plus-impulsive noise samples is studied, where the sample information matrix is assumed of low rank; this generalizes the study of (Couillet et al., 2013b) to spiked random matrix models. It is precisely shown that, as opposed to sample covariance matrices which may have asymptotically unbounded (eigen-)spectrum due to the sample impulsiveness, the robust estimator of scatter has bounded spectrum and may contain isolated eigenvalues which we fully characterize. We show that, if found beyond a certain detectability threshold, these eigenvalues allow one to perform statistical inference on the eigenvalues and eigenvectors of the information matrix. We use this result to derive new eigenvalue and eigenvector estimation procedures, which we apply in practice to the popular array processing problem of angle of arrival estimation. This gives birth to an improved algorithm based on the MUSIC method, which we refer to as robust G-MUSIC.

preprint2014arXiv

Second order statistics of robust estimators of scatter. Application to GLRT detection for elliptical signals

A central limit theorem for bilinear forms of the type $a^*\hat{C}_N(ρ)^{-1}b$, where $a,b\in{\mathbb C}^N$ are unit norm deterministic vectors and $\hat{C}_N(ρ)$ a robust-shrinkage estimator of scatter parametrized by $ρ$ and built upon $n$ independent elliptical vector observations, is presented. The fluctuations of $a^*\hat{C}_N(ρ)^{-1}b$ are found to be of order $N^{-\frac12}$ and to be the same as those of $a^*\hat{S}_N(ρ)^{-1}b$ for $\hat{S}_N(ρ)$ a matrix of a theoretically tractable form. This result is exploited in a classical signal detection problem to provide an improved detector which is both robust to elliptical data observations (e.g., impulsive noise) and optimized across the shrinkage parameter $ρ$.

preprint2014arXiv

Statistical Inference in Large Antenna Arrays under Unknown Noise Pattern

In this article, a general information-plus-noise transmission model is assumed, the receiver end of which is composed of a large number of sensors and is unaware of the noise pattern. For this model, and under reasonable assumptions, a set of results is provided for the receiver to perform statistical eigen-inference on the information part. In particular, we introduce new methods for the detection, counting, and the power and subspace estimation of multiple sources composing the information part of the transmission. The theoretical performance of some of these techniques is also discussed. An exemplary application of these methods to array processing is then studied in greater detail, leading in particular to a novel MUSIC-like algorithm assuming unknown noise covariance.

preprint2013arXiv

Large System Analysis of Linear Precoding in MISO Broadcast Channels with Confidential Messages

In this paper, we study the performance of regularized channel inversion (RCI) precoding in large MISO broadcast channels with confidential messages (BCC). We obtain a deterministic approximation for the achievable secrecy sum-rate which is almost surely exact as the number of transmit antennas $M$ and the number of users $K$ grow to infinity in a fixed ratio $β=K/M$. We derive the optimal regularization parameter $ξ$ and the optimal network load $β$ that maximize the per-antenna secrecy sum-rate. We then propose a linear precoder based on RCI and power reduction (RCI-PR) that significantly increases the high-SNR secrecy sum-rate for $1<β<2$. Our proposed precoder achieves a per-user secrecy rate which has the same high-SNR scaling factor as both the following upper bounds: (i) the rate of the optimum RCI precoder without secrecy requirements, and (ii) the secrecy capacity of a single-user system without interference. Furthermore, we obtain a deterministic approximation for the secrecy sum-rate achievable by RCI precoding in the presence of channel state information (CSI) error. We also analyze the performance of our proposed RCI-PR precoder with CSI error, and we determine how the error must scale with the SNR in order to maintain a given rate gap to the case with perfect CSI.

preprint2013arXiv

Secrecy Sum-Rates with Regularized Channel Inversion Precoding under Imperfect CSI at the Transmitter

In this paper, we study the performance of regularized channel inversion precoding in MISO broadcast channels with confidential messages under imperfect channel state information at the transmitter (CSIT). We obtain an approximation for the achievable secrecy sum-rate which is almost surely exact as the number of transmit antennas and the number of users grow to infinity in a fixed ratio. Simulations show this analysis to be accurate even for finite-size systems. For FDD systems, we determine how the CSIT error must scale with the SNR, and we derive the number of feedback bits required to ensure a constant high-SNR rate gap to the case with perfect CSIT. For TDD systems, we study the optimum amount of channel training that maximizes the high-SNR secrecy sum-rate.

preprint2013arXiv

The outliers among the singular values of large rectangular random matrices with additive fixed rank deformation

Consider the matrix $Σ_n = n^{-1/2} X_n D_n^{1/2} + P_n$ where the matrix $X_n \in \mathbb C^{N\times n}$ has independent standard Gaussian elements, $D_n$ is a deterministic diagonal nonnegative matrix, and $P_n$ is a deterministic matrix with fixed rank. Under some known conditions, the spectral measures of $Σ_n Σ_n^*$ and $n^{-1} X_n D_n X_n^*$ both converge towards a compactly supported probability measure $μ$ as $N,n\to\infty$ with $N/n\to c>0$. In this paper, it is proved that finitely many eigenvalues of $Σ_nΣ_n^*$ may stay away from the support of $μ$ in the large dimensional regime. The existence and locations of these outliers in any connected component of $\mathbb R - \operatorname{supp}(μ)$ are studied. The fluctuations of the largest outliers of $Σ_nΣ_n^*$ are also analyzed. The results find applications in the fields of signal processing and radio communications.

preprint2013arXiv

The Random Matrix Regime of Maronna's M-estimator with elliptically distributed samples

This article demonstrates that the robust scatter matrix estimator $\hat{C}_N\in {\mathbb C}^{N\times N}$ of a multivariate elliptical population $x_1,\ldots,x_n\in {\mathbb C}^N$ originally proposed by Maronna in 1976, and defined as the solution (when existent) of an implicit equation, behaves similarly to a well-known random matrix model in the limiting regime where the population $N$ and sample $n$ sizes grow at the same speed. We show precisely that $\hat{C}_N\in{\mathbb C}^{N\times N}$ is defined for all $n$ large with probability one and that, under some light hypotheses, $\Vert \hat{C}_N-\hat{S}_N\Vert\to 0$ almost surely in spectral norm, where $\hat{S}_N$ follows a classical random matrix model. As a corollary, the limiting eigenvalue distribution of $\hat{C}_N$ is derived. This analysis finds applications in the fields of statistical inference and signal processing.

preprint2012arXiv

Fluctuations of spiked random matrix models and failure diagnosis in sensor networks

In this article, the joint fluctuations of the extreme eigenvalues and eigenvectors of a large dimensional sample covariance matrix are analyzed when the associated population covariance matrix is a finite-rank perturbation of the identity matrix, corresponding to the so-called spiked model in random matrix theory. The asymptotic fluctuations, as the matrix size grows large, are shown to be intimately linked with matrices from the Gaussian unitary ensemble (GUE). When the spiked population eigenvalues have unit multiplicity, the fluctuations follow a central limit theorem. This result is used to develop an original framework for the detection and diagnosis of local failures in large sensor networks, for known or unknown failure magnitude.

preprint2012arXiv

Large System Analysis of Linear Precoding in Correlated MISO Broadcast Channels under Limited Feedback

In this paper, we study the sum rate performance of zero-forcing (ZF) and regularized ZF (RZF) precoding in large MISO broadcast systems under the assumptions of imperfect channel state information at the transmitter and per-user channel transmit correlation. Our analysis assumes that the number of transmit antennas $M$ and the number of single-antenna users $K$ are large while their ratio remains bounded. We derive deterministic approximations of the empirical signal-to-interference plus noise ratio (SINR) at the receivers, which are tight as $M,K\to\infty$. In the course of this derivation, the per-user channel correlation model requires the development of a novel deterministic equivalent of the empirical Stieltjes transform of large dimensional random matrices with generalized variance profile. The deterministic SINR approximations enable us to solve various practical optimization problems. Under sum rate maximization, we derive (i) for RZF the optimal regularization parameter, (ii) for ZF the optimal number of users, (iii) for ZF and RZF the optimal power allocation scheme and (iv) the optimal amount of feedback in large FDD/TDD multi-user systems. Numerical simulations suggest that the deterministic approximations are accurate even for small $M,K$.

preprint2012arXiv

Performance of mutual information inference methods under unknown interference

The problem of fast point-to-point MIMO channel mutual information estimation is addressed, in the situation where the receiver undergoes unknown colored interference, whereas the channel with the transmitter is perfectly known. The considered scenario assumes that the estimation is based on a few channel use observations during a short sensing period. Using large dimensional random matrix theory, an estimator referred to as G-estimator is derived. This estimator is proved to be consistent as the number of antennas and observations grow large and its asymptotic performance is analyzed. In particular, the G-estimator satisfies a central limit theorem with asymptotic Gaussian fluctuations. Simulations are provided which strongly support the theoretical results, even for small system dimensions.

preprint2012arXiv

Random Beamforming over Quasi-Static and Fading Channels: A Deterministic Equivalent Approach

In this work, we study the performance of random isometric precoders over quasi-static and correlated fading channels. We derive deterministic approximations of the mutual information and the signal-to-interference-plus-noise ratio (SINR) at the output of the minimum-mean-square-error (MMSE) receiver and provide simple provably converging fixed-point algorithms for their computation. Although these approximations are only proven exact in the asymptotic regime with infinitely many antennas at the transmitters and receivers, simulations suggest that they closely match the performance of small-dimensional systems. We exemplarily apply our results to the performance analysis of multi-cellular communication systems, multiple-input multiple-output multiple-access channels (MIMO-MAC), and MIMO interference channels. The mathematical analysis is based on the Stieltjes transform method. This enables the derivation of deterministic equivalents of functionals of large-dimensional random matrices. In contrast to previous works, our analysis does not rely on arguments from free probability theory which enables the consideration of random matrix models for which asymptotic freeness does not hold. Thus, the results of this work are also a novel contribution to the field of random matrix theory and applicable to a wide spectrum of practical systems.

preprint2012arXiv

Signal Processing in Large Systems: a New Paradigm

For a long time, detection and parameter estimation methods for signal processing have relied on asymptotic statistics as the number $n$ of observations of a population grows large compared to the population size $N$, i.e. $n/N\to \infty$. Modern technological and societal advances now demand the study of sometimes extremely large populations and simultaneously require fast signal processing due to accelerated system dynamics. This results in not-so-large practical ratios $n/N$, sometimes even smaller than one. A disruptive change in classical signal processing methods has therefore been initiated in the past ten years, mostly spurred by the field of large dimensional random matrix theory. The early works in random matrix theory for signal processing applications are however scarce and highly technical. This tutorial provides an accessible methodological introduction to the modern tools of random matrix theory and to the signal processing methods derived from them, with an emphasis on simple illustrative examples.

preprint2011arXiv

Asymptotic Analysis of Double-Scattering Channels

We consider a multiple-input multiple-output (MIMO) multiple access channel (MAC), where the channel between each transmitter and the receiver is modeled by the doubly-scattering channel model. Based on novel techniques from random matrix theory, we derive deterministic approximations of the mutual information, the signal-to-noise-plus-interference-ratio (SINR) at the output of the minimum-mean-square-error (MMSE) detector and the sum-rate with MMSE detection which are almost surely tight in the large system limit. Moreover, we derive the asymptotically optimal transmit covariance matrices. Our simulation results show that the asymptotic analysis provides very close approximations for realistic system dimensions.

preprint2011arXiv

Electrical Vehicles in the Smart Grid: A Mean Field Game Analysis

In this article, we investigate the competitive interaction between electrical vehicles or hybrid oil-electricity vehicles in a Cournot market consisting of electricity transactions to or from an underlying electricity distribution network. We provide a mean field game formulation for this competition, and introduce the set of fundamental differential equations ruling the behavior of the vehicles at the feedback Nash equilibrium, referred to here as the mean field equilibrium. This framework allows for a consistent analysis of the evolution of the price of electricity as well as of the instantaneous electricity demand in the power grid. Simulations precisely quantify those parameters and suggest that significant reduction of the daily electricity peak demand can be achieved by appropriate electricity pricing.

preprint2011arXiv

Fluctuations of an improved population eigenvalue estimator in sample covariance matrix models

This article provides a central limit theorem for a consistent estimator of population eigenvalues with large multiplicities based on sample covariance matrices. The focus is on limited sample size situations, whereby the number of available observations is known and comparable in magnitude to the observation dimension. An exact expression as well as an empirical, asymptotically accurate, approximation of the limiting variance is derived. Simulations are performed that corroborate the theoretical claims. A specific application to wireless sensor networks is developed.

preprint2011arXiv

Iterative Deterministic Equivalents for the Performance Analysis of Communication Systems

In this article, we introduce iterative deterministic equivalents as a novel technique for the performance analysis of communication systems whose channels are modeled by complex combinations of independent random matrices. This technique extends the deterministic equivalent approach for the study of functionals of large random matrices to a broader class of random matrix models which naturally arise as channel models in wireless communications. We present two specific applications: First, we consider a multi-hop amplify-and-forward (AF) MIMO relay channel with noise at each stage and derive deterministic approximations of the mutual information after the Kth hop. Second, we study a MIMO multiple access channel (MAC) where the channel between each transmitter and the receiver is represented by the double-scattering channel model. We provide deterministic approximations of the mutual information, the signal-to-interference-plus-noise ratio (SINR) and sum-rate with minimum-mean-square-error (MMSE) detection and derive the asymptotically optimal precoding matrices. In both scenarios, the approximations can be computed by simple and provably converging fixed-point algorithms and are shown to be almost surely tight in the limit when the number of antennas at each node grows infinitely large. Simulations suggest that the approximations are accurate for realistic system dimensions. The technique of iterative deterministic equivalents can be easily extended to other channel models of interest and is, therefore, also a new contribution to the field of random matrix theory.

preprint2011arXiv

Random Beamforming over Correlated Fading Channels

We study a multiple-input multiple-output (MIMO) multiple access channel (MAC) from several multi-antenna transmitters to a multi-antenna receiver. The fading channels between the transmitters and the receiver are modeled by random matrices, composed of independent column vectors with zero mean and different covariance matrices. Each transmitter is assumed to send multiple data streams with a random precoding matrix extracted from a Haar-distributed matrix. For this general channel model, we derive deterministic approximations of the normalized mutual information, the normalized sum-rate with minimum-mean-square-error (MMSE) detection and the signal-to-interference-plus-noise-ratio (SINR) of the MMSE decoder, which become arbitrarily tight as all system parameters grow infinitely large at the same speed. In addition, we derive the asymptotically optimal power allocation under individual or sum-power constraints. Our results allow us to tackle the problem of optimal stream control in interference channels which would be intractable in any finite setting. Numerical results corroborate our analysis and verify its accuracy for realistic system dimensions. Moreover, the techniques applied in this paper constitute a novel contribution to the field of large random matrix theory and could be used to study even more involved channel models.

preprint2010arXiv

A Deterministic Equivalent for the Analysis of Correlated MIMO Multiple Access Channels

In this article, novel deterministic equivalents for the Stieltjes transform and the Shannon transform of a class of large dimensional random matrices are provided. These results are used to characterise the ergodic rate region of multiple antenna multiple access channels, when each point-to-point propagation channel is modelled according to the Kronecker model. Specifically, an approximation of all rates achieved within the ergodic rate region is derived and an approximation of the linear precoders that achieve the boundary of the rate region as well as an iterative water-filling algorithm to obtain these precoders are provided. An original feature of this work is that the proposed deterministic equivalents are proved valid even for strong correlation patterns at both communication sides. The above results are validated by Monte Carlo simulations.

preprint2010arXiv

Eigen-Inference for Energy Estimation of Multiple Sources

In this paper, a new method is introduced to blindly estimate the transmit power of multiple signal sources in multi-antenna fading channels, when the number of sensing devices and the number of available samples are sufficiently large compared to the number of sources. Recent advances in the field of large dimensional random matrix theory are used that result in a simple and computationally efficient consistent estimator of the power of each source. A criterion to determine the minimum number of sensors and the minimum number of samples required to achieve source separation is then introduced. Simulations are performed that corroborate the theoretical claims and show that the proposed power estimator largely outperforms alternative power inference techniques.
