Simultaneous Estimation of Graphical Models by Neighborhood Selection
In many applications of statistical graphical models, the data originate from several subpopulations that share similarities but also have significant differences. This raises the question of how to estimate several graphical models simultaneously. Pooling all the data to estimate a single graph would ignore the differences among subpopulations. On the other hand, estimating a graph from each subpopulation separately does not make efficient use of the common structure in the data. We develop a new method for simultaneous estimation of multiple graphical models by estimating the topological neighborhoods of the involved variables under a sparsity-inducing penalty that takes into account the common structure in the subpopulations. Unlike existing methods for joint graphical models, our method does not rely on spectral decomposition of large matrices, and is therefore more computationally attractive for estimating large networks. In addition, we derive the asymptotic properties of our method, demonstrate its numerical complexity, and compare it with several existing methods by simulation. Finally, we apply our method to the estimation of genomic networks for a lung cancer dataset that consists of several subpopulations.
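The idea of estimating neighborhoods jointly across subpopulations can be illustrated with a minimal sketch. It is not the penalty or algorithm of this paper, which the abstract does not specify. It assumes a plain group-lasso penalty that ties together the K coefficients of each predictor, solved by proximal gradient descent. That stand-in captures only the shared structure: a predictor enters or leaves the neighborhood in all subpopulations at once. The function name `joint_neighborhood` and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def joint_neighborhood(Xs, j, lam, n_iter=500):
    """Regress variable j on all other variables in each subpopulation.

    Xs  : list of K data matrices (n_k x p), one per subpopulation
    lam : weight of the group penalty (an assumed stand-in penalty)
    Returns the indices of the other variables and a (p-1) x K
    coefficient matrix; a nonzero row marks an estimated neighbor of j.
    """
    K = len(Xs)
    p = Xs[0].shape[1]
    others = [i for i in range(p) if i != j]
    A = [X[:, others] for X in Xs]   # predictors per subpopulation
    y = [X[:, j] for X in Xs]        # response per subpopulation

    # Step size from the Lipschitz constant of the least-squares part.
    L = max(np.linalg.norm(a, 2) ** 2 / a.shape[0] for a in A)
    step = 1.0 / L

    B = np.zeros((p - 1, K))
    for _ in range(n_iter):
        # Gradient of sum_k ||y_k - A_k b_k||^2 / (2 n_k), column by column.
        G = np.column_stack([
            A[k].T @ (A[k] @ B[:, k] - y[k]) / A[k].shape[0]
            for k in range(K)
        ])
        Z = B - step * G
        # Row-wise soft-thresholding: proximal step of lam * sum_i ||B[i, :]||_2.
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        B = shrink * Z
    return others, B
```

Running this for every variable `j` and combining the nonzero rows, for example by an "and" or "or" rule over the two regressions that involve each pair, would give one graph per subpopulation. Each step is a set of regressions, so no eigendecomposition of a p x p matrix is needed.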