Geometrical and statistical properties of M-estimates of scatter on Grassmann manifolds
We consider data from the Grassmann manifold $G(m,r)$ of all vector subspaces of dimension $r$ of $\mathbb{R}^m$, and focus on the Grassmannian statistical model, which is commonly used in signal processing and statistics. Canonical Grassmannian distributions $\mathbb{G}_Σ$ on $G(m,r)$ are indexed by parameters $Σ$ from the manifold $\mathcal{M}= Pos_{sym}^{1}(m)$ of positive definite symmetric matrices of determinant $1$. We study robust M-estimates of scatter (GE) for general probability measures $\mathcal{P}$ on $G(m,r)$. Such estimators are defined as the maximizers of the Grassmannian log-likelihood $-\ell_{\mathcal{P}}(Σ)$ as a function of $Σ$. A novel feature of this work is its heavy use of the fact that $\mathcal{M}$ is a CAT(0) space with known visual boundary at infinity $\partial \mathcal{M}$. We also recall that the sample space $G(m,r)$ is a part of $\partial \mathcal{M}$, and show that the distributions $\mathbb{G}_Σ$ are $SL(m,\mathbb{R})$-quasi-invariant and that $\ell_{\mathcal{P}}(Σ)$ is a weighted Busemann function. Let $\mathcal{P}_n =(δ_{U_1}+\cdots+δ_{U_n})/n$ be the empirical probability measure of a sample of $n$ i.i.d. random subspaces $U_i\in G(m,r)$ with common distribution $\mathcal{P}$, whose support spans $\mathbb{R}^m$. For $Σ_n$ and $Σ_{\mathcal{P}}$ the GEs of $\mathcal{P}_n$ and $\mathcal{P}$, we show the almost sure convergence of $Σ_n$ towards $Σ_{\mathcal{P}}$ as $n\to\infty$ using methods from geometry, and provide a central limit theorem for the rescaled process $C_n = \frac{m}{\mathrm{tr}(Σ_{\mathcal{P}}^{-1} Σ_n)}g^{-1} Σ_n g^{-1}$, where $Σ_{\mathcal{P}}=gg$ with $g\in SL(m,\mathbb{R})$ the unique symmetric positive-definite square root of $Σ_{\mathcal{P}}$.
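As a quick sanity check on the normalization of $C_n$, the following short computation may help. It is a remark derived only from the displayed formula, assuming that $g$ denotes the symmetric positive-definite square root of $Σ_{\mathcal{P}}$ (so $g^{2}=Σ_{\mathcal{P}}$) and using the cyclic property of the trace; it is not a result quoted from the paper.

$$\mathrm{tr}(C_n) = \frac{m}{\mathrm{tr}(Σ_{\mathcal{P}}^{-1} Σ_n)}\,\mathrm{tr}\bigl(g^{-1} Σ_n g^{-1}\bigr) = \frac{m}{\mathrm{tr}(Σ_{\mathcal{P}}^{-1} Σ_n)}\,\mathrm{tr}\bigl(g^{-2} Σ_n\bigr) = m .$$

Under the same assumption, $Σ_n = Σ_{\mathcal{P}}$ gives $g^{-1} Σ_n g^{-1} = I_m$ and $\mathrm{tr}(Σ_{\mathcal{P}}^{-1} Σ_n) = m$, hence $C_n = I_m$. The rescaling therefore appears to fix the trace of $C_n$ at $m$ and to place the limit point at the identity matrix.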