Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Borja Balle · Mehryar Mohri

Tue Dec 04 11:00 AM -- 11:20 AM (PST) @ Harveys Convention Center Floor, CC

Many tasks in text and speech processing and computational biology involve functions from variable-length strings to real numbers. A wide class of such functions can be computed by weighted automata. Spectral methods based on singular value decompositions of Hankel matrices have recently been proposed for learning probability distributions over strings that can be computed by weighted automata. In this paper we show how this method can be applied to the problem of learning a general weighted automaton from a sample of string-label pairs generated by an arbitrary distribution. The main obstruction to this approach is that, in general, some entries of the Hankel matrix that needs to be decomposed may be missing. We propose a solution based on solving a constrained matrix completion problem. Combining these two ingredients yields a new family of algorithms for learning general weighted automata. Generalization bounds for a particular algorithm in this class are given. The proofs rely on a stability analysis of matrix completion and spectral learning.
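
To make the spectral step concrete, the following is a minimal sketch of the standard Hankel-based spectral recovery of a weighted automaton, assuming the Hankel matrix is fully observed. It does not implement the constrained matrix completion step the abstract proposes, and it is not the authors' code. The function names (`wfa_value`, `spectral_wfa`), the two-state target automaton, and the prefix/suffix basis are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def wfa_value(alpha, ops, beta, string):
    """Value of a weighted automaton on a string: alpha^T A_{x1} ... A_{xk} beta."""
    vec = alpha
    for sym in string:
        vec = vec @ ops[sym]
    return float(vec @ beta)

def spectral_wfa(hankel, hankel_sym, h_prefix, h_suffix, rank):
    """Recover a weighted automaton from a fully observed Hankel matrix.

    hankel[u, v]        = f(u v)
    hankel_sym[a][u, v] = f(u a v)
    h_prefix[u]         = f(u)   (Hankel column of the empty suffix)
    h_suffix[v]         = f(v)   (Hankel row of the empty prefix)
    """
    # Top right singular vectors of the Hankel matrix.
    _, _, vt = np.linalg.svd(hankel, full_matrices=False)
    v = vt[:rank].T
    hv_pinv = np.linalg.pinv(hankel @ v)
    alpha = h_suffix @ v
    beta = hv_pinv @ h_prefix
    ops = {a: hv_pinv @ h_a @ v for a, h_a in hankel_sym.items()}
    return alpha, ops, beta

# Hypothetical 2-state target automaton over the alphabet {a, b}.
alpha0 = np.array([1.0, 0.0])
beta0 = np.array([0.5, 1.0])
ops0 = {
    "a": np.array([[0.5, 0.2], [0.0, 0.3]]),
    "b": np.array([[0.1, 0.4], [0.6, 0.2]]),
}

def target(s):
    return wfa_value(alpha0, ops0, beta0, s)

# Assumed basis: prefixes and suffixes are all strings of length <= 1.
basis = ["", "a", "b"]
hankel = np.array([[target(u + v) for v in basis] for u in basis])
hankel_sym = {
    a: np.array([[target(u + a + v) for v in basis] for u in basis])
    for a in "ab"
}
h_prefix = np.array([target(u) for u in basis])
h_suffix = np.array([target(v) for v in basis])

alpha, ops, beta = spectral_wfa(hankel, hankel_sym, h_prefix, h_suffix, rank=2)

# Compare the learned and target automata on all strings up to length 4.
test_strings = ["".join(p) for n in range(5)
                for p in itertools.product("ab", repeat=n)]
max_error = max(abs(wfa_value(alpha, ops, beta, s) - target(s))
                for s in test_strings)
```

When every Hankel entry is known and the chosen basis yields a Hankel matrix whose rank equals the number of states, the recovered automaton computes the same function as the target, up to a change of basis. In the setting the abstract describes, some entries are missing, so a completion step would have to fill them in before a decomposition of this kind could be applied.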

Author Information

Borja Balle (McGill University)
Mehryar Mohri (Google Research & Courant Institute of Mathematical Sciences)
