Timezone: »
Finite-State Transducers (FST) are a standard tool for modeling paired input-output sequences and are used in numerous applications, ranging from computational biology to natural language processing. Recently Balle et al. presented a spectral algorithm for learning FST from samples of aligned input-output sequences. In this paper we address the more realistic, yet challenging setting where the alignments are unknown to the learning algorithm. We frame FST learning as finding a low rank Hankel matrix satisfying constraints derived from observable statistics. Under this formulation, we provide identifiability results for FST distributions. Then, following previous work on rank minimization, we propose a regularized convex relaxation of this objective which is based on minimizing a nuclear norm penalty subject to linear constraints and can be solved efficiently.
Author Information
Raphael Bailly (Universitat Politècnica de Catalunya)
Xavier Carreras (Universitat Politècnica de Catalunya)
Ariadna Quattoni (Xerox Research Centre Europe)
Related Events (a corresponding poster, oral, or spotlight)
-
2013 Spotlight: Unsupervised Spectral Learning of Finite State Transducers »
Fri. Dec 6th 06:26 -- 06:30 PM Room Harvey's Convention Center Floor, CC
More from the Same Authors
-
2012 Workshop: xLiTe: Cross-Lingual Technologies »
Achim Rettinger · Marko Grobelnik · Blaz Fortuna · Xavier Carreras · Juanzi Li