Spectral learning of general weighted automata via constrained matrix completion
Document type: Conference lecture
Rights access: Open Access
Many tasks in text and speech processing and computational biology require estimating functions mapping strings to real numbers. A broad class of such functions can be defined by weighted automata. Spectral methods based on the singular value decomposition of a Hankel matrix have been recently proposed for learning a probability distribution represented by a weighted automaton from a training sample drawn according to this same target distribution. In this paper, we show how spectral methods can be extended to the problem of learning a general weighted automaton from a sample generated by an arbitrary distribution. The main obstruction to this approach is that, in general, some entries of the Hankel matrix may be missing. We present a solution to this problem based on solving a constrained matrix completion problem. Combining these two ingredients, matrix completion and spectral method, a whole new family of algorithms for learning general weighted automata is obtained. We present generalization bounds for a particular algorithm in this family. The proofs rely on a joint stability analysis of matrix completion and spectral learning.
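As an illustration of the spectral step mentioned in the abstract, here is a minimal Python sketch for the simplest setting, where every Hankel entry is known exactly. It is not the algorithm from the paper: the matrix completion stage is left out, and the two-state target automaton, the prefix/suffix basis, and all function names (`evaluate`, `hankel`, `spectral_learn`) are assumptions chosen for the example. The sketch builds a Hankel block H with entries f(uv), takes its top right singular vectors V, and recovers the operators as A_a = (HV)^+ H_a V, where H_a has entries f(uav).

```python
import itertools
import numpy as np

# Hypothetical 2-state target automaton over the alphabet {a, b}.
ALPHA0 = np.array([1.0, 0.0])       # initial weights
ALPHA_INF = np.array([1.0, 0.5])    # final weights
TRANS = {
    "a": np.array([[0.5, 0.2], [0.1, 0.3]]),
    "b": np.array([[0.1, 0.4], [0.3, 0.2]]),
}


def evaluate(alpha0, trans, alpha_inf, word):
    """Compute f(word) = alpha0^T A_{x1} ... A_{xk} alpha_inf."""
    vec = alpha0
    for symbol in word:
        vec = vec @ trans[symbol]
    return float(vec @ alpha_inf)


def target(word):
    """The function to be learned, here given by the known automaton."""
    return evaluate(ALPHA0, TRANS, ALPHA_INF, word)


def hankel(prefixes, suffixes, f, infix=""):
    """Hankel block with entries f(u + infix + v)."""
    return np.array([[f(u + infix + v) for v in suffixes] for u in prefixes])


def spectral_learn(prefixes, suffixes, alphabet, f, n_states):
    """Recover an automaton from fully observed Hankel blocks.

    Assumes prefixes[0] and suffixes[0] are both the empty string and
    that the Hankel block has rank n_states.
    """
    H = hankel(prefixes, suffixes, f)
    _, _, Vt = np.linalg.svd(H, full_matrices=False)
    V = Vt[:n_states].T                      # top right singular vectors
    HV_pinv = np.linalg.pinv(H @ V)
    trans = {a: HV_pinv @ hankel(prefixes, suffixes, f, a) @ V
             for a in alphabet}
    alpha0 = H[0] @ V                        # row of the empty prefix
    alpha_inf = HV_pinv @ H[:, 0]            # column of the empty suffix
    return alpha0, trans, alpha_inf


basis = ["", "a", "b"]
learned = spectral_learn(basis, basis, "ab", target, n_states=2)

# Compare target and learned automaton on all strings of length < 5.
max_error = max(
    abs(target("".join(w)) - evaluate(*learned, "".join(w)))
    for k in range(5)
    for w in itertools.product("ab", repeat=k)
)
print(max_error)
```

In the setting the abstract describes, some entries of H and H_a would be unobserved because the sample comes from an arbitrary distribution, so a completion step would have to fill them in before the decomposition above could be applied.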
Student Paper Awards NIPS 2012
Citation: Balle, B.; Mohri, M. Spectral learning of general weighted automata via constrained matrix completion. In: Annual Conference on Neural Information Processing Systems. "Advances in Neural Information Processing Systems 26: proceedings of the 2012 conference". Lake Tahoe, Nevada: 2012, p. 2168-2176.