Approximation of stationary processes by Hidden Markov Models
We aim to construct a Hidden Markov Model (HMM) of assigned complexity (number of states of the underlying Markov chain) that best approximates, in Kullback-Leibler divergence rate, a given stationary process. We establish, under mild conditions, the existence of the divergence rate between a stationary process and an HMM. Since no analytic expression for this divergence rate is available in general, we approximate it with a properly defined, and easily computable, divergence between Hankel matrices, which we use as our approximation criterion. We propose a three-step algorithm, based on the Nonnegative Matrix Factorization technique, which realizes an HMM that is optimal with respect to this approximation criterion. A full theoretical analysis of the algorithm is given in the special case of Markov approximation.
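
To make the idea of a divergence-based factorization of a Hankel matrix concrete, the following minimal Python sketch may help. It is not the three-step algorithm of the paper. It assumes that the Hankel matrix has entries Prob(uv) indexed by words u and v of a fixed length, that the divergence between matrices is the informational divergence sum(V log(V/M) - V + M), and it uses the standard Lee-Seung multiplicative updates as a stand-in factorization method. The function names and the example two-state chain are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def word_probability(word, pi, P):
    """Probability of a finite word under a stationary Markov chain (pi, P)."""
    prob = pi[word[0]]
    for a, b in zip(word, word[1:]):
        prob *= P[a][b]
    return prob


def hankel_matrix(pi, P, n):
    """Matrix with entry [u][v] = Prob(uv) over all words u, v of length n (assumed layout)."""
    words = list(itertools.product(range(len(pi)), repeat=n))
    return [[word_probability(u + v, pi, P) for v in words] for u in words]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def i_divergence(V, M):
    """Informational divergence sum(V log(V/M) - V + M); assumes positive entries."""
    return sum(v * math.log(v / m) - v + m
               for rv, rm in zip(V, M) for v, m in zip(rv, rm))


def ratio(V, M):
    """Elementwise quotient V / M."""
    return [[v / m for v, m in zip(rv, rm)] for rv, rm in zip(V, M)]


def kl_nmf(V, rank, iters=200, seed=0):
    """Approximate V by W H with Lee-Seung multiplicative updates for the I-divergence."""
    rng = random.Random(seed)
    rows, cols = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(rows)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(cols)] for _ in range(rank)]
    history = [i_divergence(V, matmul(W, H))]
    for _ in range(iters):
        # Update W with H held fixed.
        R = ratio(V, matmul(W, H))
        W = [[W[i][k] * sum(H[k][j] * R[i][j] for j in range(cols)) / sum(H[k])
              for k in range(rank)] for i in range(rows)]
        # Update H with the new W held fixed.
        R = ratio(V, matmul(W, H))
        col_sums = [sum(W[i][k] for i in range(rows)) for k in range(rank)]
        H = [[H[k][j] * sum(W[i][k] * R[i][j] for i in range(rows)) / col_sums[k]
              for j in range(cols)] for k in range(rank)]
        history.append(i_divergence(V, matmul(W, H)))
    return W, H, history


# Hypothetical example: a two-state chain with stationary distribution (0.8, 0.2).
pi = [0.8, 0.2]
P = [[0.9, 0.1], [0.4, 0.6]]
V = hankel_matrix(pi, P, 2)          # 4 x 4 matrix of length-4 word probabilities
W, H, history = kl_nmf(V, rank=2)
print(history[0], history[-1])       # divergence before and after the iterations
```

For a two-state Markov chain this matrix admits an exact nonnegative factorization of inner size 2, so the divergence should decrease over the iterations, although the multiplicative updates are not guaranteed to reach a global optimum. Recovering HMM parameters from the factors is a separate step that this sketch does not attempt.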