BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling
Zhaojiang Lin · Andrea Madotto · Genta Winata · Peng Xu · Feijun Jiang · Yuxiang Hu · Chen Shi · Pascale N Fung

Task-oriented dialogue (ToD) benchmarks provide an important avenue to measure progress and develop better conversational agents. However, existing datasets for end-to-end ToD modeling are limited to a single language, hindering the development of robust end-to-end ToD systems for multilingual countries and regions. Here we introduce BiToD, the first bilingual multi-domain dataset for end-to-end task-oriented dialogue modeling. BiToD contains over 7k multi-domain dialogues (144k utterances) with a large and realistic bilingual knowledge base. It serves as an effective benchmark for evaluating bilingual ToD systems and cross-lingual transfer learning approaches. We provide state-of-the-art baselines under three evaluation settings (monolingual, bilingual, and cross-lingual). The analysis of our baselines in different settings highlights 1) the effectiveness of training a bilingual ToD system compared to two independent monolingual ToD systems, and 2) the potential of leveraging a bilingual knowledge base and cross-lingual transfer learning to improve system performance under low-resource conditions.

Author Information

Zhaojiang Lin (The Hong Kong University of Science and Technology)

Zhaojiang Lin is a Ph.D. candidate in Electronic and Computer Engineering at The Hong Kong University of Science and Technology and the Centre for Artificial Intelligence Research (CAiRE). He completed his bachelor's degree in Electronic Engineering at the University of Electronic Science and Technology of China. His research interests include dialogue systems, meta-learning, affective computing, natural language understanding, and multilinguality. He received Best Paper Awards from RepL4NLP@ACL 2019 and ConvAI@NeurIPS 2019. He serves on the program committees of several major machine learning and natural language processing conferences: NeurIPS, ICLR, AAAI, and NAACL.

Andrea Madotto (The Hong Kong University of Science and Technology)

Andrea Madotto is a Ph.D. candidate in Electronic & Computer Engineering at The Hong Kong University of Science and Technology and part of the Centre for Artificial Intelligence Research (CAiRE). His research focuses on conversational modelling, controllable language generation, and meta/continual learning. He received the Outstanding Paper Award from ACL 2019 and the Best Paper Award from the ConvAI workshop at NeurIPS 2019, and his work has been featured in MIT Technology Review and VentureBeat. He serves as a program committee member and reviewer for various machine learning and natural language processing conferences such as ACL, EMNLP, NeurIPS, ICLR, and AAAI, and for journals such as Journal of Natural Language Engineering and Computer Speech and Language.

Genta Winata (The Hong Kong University of Science and Technology)
Peng Xu (The Hong Kong University of Science and Technology)
Feijun Jiang
Yuxiang Hu
Chen Shi (Alibaba Group)
Pascale N Fung (Hong Kong University of Science and Technology)

Pascale Fung (馮雁) (born 1966 in Shanghai, China) is a professor in the Department of Electronic & Computer Engineering and the Department of Computer Science & Engineering at the Hong Kong University of Science & Technology (HKUST). She is the director of the newly established, multidisciplinary Centre for AI Research (CAiRE) at HKUST. She is an elected Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for her “contributions to human-machine interactions” and an elected Fellow of the International Speech Communication Association for “fundamental contributions to the interdisciplinary area of spoken language human-machine interactions”.

More from the Same Authors