Comparing Unsupervised Word Translation Methods Step by Step
Mareike Hartmann · Yova Kementchedjhieva · Anders Søgaard

Wed Dec 11 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #103

Cross-lingual word vector space alignment is the task of mapping the vocabularies of two languages into a shared semantic space, which can be used for dictionary induction, unsupervised machine translation, and transfer learning. In the unsupervised regime, an initial seed dictionary is learned in the absence of any known correspondences between words, through distribution matching, and the seed dictionary is then used to supervise the induction of the final alignment in what is typically referred to as a (possibly iterative) refinement step. We focus on the first step and compare distribution matching techniques in the context of language pairs for which mixed training stability and evaluation scores have been reported. We show that, surprisingly, when looking at this initial step in isolation, vanilla GANs are superior to more recent methods, both in terms of precision and robustness. The improvements reported by more recent methods thus stem from the refinement techniques, and we show that we can obtain state-of-the-art performance combining vanilla GANs with such refinement techniques.
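
To make the refinement step more concrete, here is a minimal sketch of how a seed dictionary can supervise an alignment. It assumes orthogonal Procrustes as the refinement technique and cosine nearest-neighbour retrieval for dictionary induction; the abstract does not say which refinement techniques the authors use, so this is an illustration only. The function names and the (source index, target index) format of `seed_pairs` are hypothetical.

```python
import numpy as np

def procrustes_refine(X, Y, seed_pairs):
    """Fit an orthogonal map W that minimizes ||X[src] @ W - Y[tgt]||_F.

    X, Y: (n_words, dim) embedding matrices for source and target language.
    seed_pairs: list of (source_index, target_index) tuples (assumed format).
    """
    src = np.array([i for i, _ in seed_pairs])
    tgt = np.array([j for _, j in seed_pairs])
    # The SVD of the cross-covariance gives the closed-form orthogonal solution.
    U, _, Vt = np.linalg.svd(X[src].T @ Y[tgt])
    return U @ Vt

def induce_dictionary(X, Y, W):
    """Map each source word to its nearest target word by cosine similarity."""
    XW = X @ W
    XW = XW / np.linalg.norm(XW, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return np.argmax(XW @ Yn.T, axis=1)
```

In an iterative variant, the dictionary returned by `induce_dictionary` would be fed back into `procrustes_refine` as the next seed dictionary, repeating until the induced pairs stop changing.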

Author Information

Mareike Hartmann (University of Copenhagen)
Yova Kementchedjhieva (University of Copenhagen)
Anders Søgaard (University of Copenhagen)

More from the Same Authors

  • 2019 : Poster lighting round
    Yinhe Zheng · Anders Søgaard · Abdelrhman Saleh · Youngsoo Jang · Hongyu Gong · Omar U. Florez · Margaret Li · Andrea Madotto · The Tung Nguyen · Ilia Kulikov · Arash Einolghozati · Yiru Wang · Mihail Eric · Victor Petrén Bach Hansen · Nurul Lubis · Yen-Chen Wu