Minimax optimal submatrix detection: Sharp non-asymptotic rates
Given an observation $\mathbf Y \in \mathbb{R}^{d_1\times d_2}$ from the model $\mathbf Y = \mathbf X + \mathbf E$ where $\mathbf X$ is constant and $\mathbf E$ has i.i.d. $N(0,1)$ entries, we consider the problem of detecting a planted submatrix in the mean matrix $\mathbf X$. Specifically, we aim to distinguish the null hypothesis $\mathbf X = 0$ from the alternative hypothesis in which $\mathbf X$ is non-zero only on a submatrix of size $s_1 \times s_2$ with elevated entries bounded below by $μ>0$. We establish a minimax lower bound characterizing how large $μ$ must be to ensure that the two hypotheses are distinguishable with high probability. Furthermore, we derive novel minimax-optimal tests achieving the lower bound, and describe extensions of these tests that are adaptive to unknown sparsity levels $s_1$ and $s_2$. In contrast with previous work, which required restrictive assumptions on $s_1,s_2, d_1$ and $d_2$, our non-asymptotic upper and lower bounds match for any configuration of these parameters.