Adversarial Cross-Lingual Transfer Learning for Slot Tagging of Low-Resource Languages

Abstract

Slot tagging is a key component of a task-oriented dialogue system. Conversational agents learn to understand human input by training on large amounts of annotated data. However, most human languages are low-resource and lack annotated training data for the slot tagging task. Therefore, we aim to leverage cross-lingual transfer learning from high-resource languages to low-resource ones. In this paper, we propose an adversarial cross-lingual transfer model with multi-level language-shared and language-specific knowledge to improve slot tagging for low-resource languages. Our method explicitly separates the model into a language-shared part and a language-specific part in order to transfer language-independent knowledge. To refine the shared knowledge in the latent space, we add a language discriminator and employ adversarial training to reinforce feature separation. In addition, we adopt a novel multi-level feature transfer, applied in an incremental and progressive way, to acquire multi-granularity shared knowledge. To mitigate the discrepancies between the feature distributions of language-specific and shared knowledge, we propose neural adapters that fuse features from different sources. Experiments show that our proposed model consistently outperforms the monolingual baseline by a statistically significant margin of up to 2.09%, with an even larger improvement of 12.21% in the zero-shot setting. Further analysis demonstrates that our method can effectively alleviate the data scarcity of low-resource languages.
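
As a rough illustration of the adversarial objective mentioned in the abstract, the sketch below shows one common way such a setup is formulated: a language discriminator is trained with cross-entropy to predict the language from shared features, while the shared encoder receives the negated loss so that its features become harder to tell apart by language. The abstract does not give the exact loss, so the negated-loss (gradient-reversal-style) form and all function names here are assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(logits, lang_id):
    """Cross-entropy of a (hypothetical) language discriminator.

    `logits` are the discriminator scores for one example's shared
    features; `lang_id` is the index of the true language.
    """
    probs = softmax(logits)
    return -math.log(probs[lang_id])


def shared_encoder_adversarial_loss(logits, lang_id):
    """Adversarial term for the shared encoder.

    Assumption: the encoder minimizes the negated discriminator loss,
    i.e. it is rewarded when the discriminator cannot identify the
    language from the shared features.
    """
    return -discriminator_loss(logits, lang_id)
```

Under this formulation, the discriminator's loss on a two-language problem is highest when its scores are uninformative (equal logits), which is the state the shared encoder is pushed toward.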

Conference
IJCNN 2020
何可清
Master's student

Dialogue systems, summarization, pre-training

严渊蒙
Master's student

Natural language understanding, pre-training

徐蔚然
Associate Professor, Master's supervisor, PhD supervisor

Information retrieval, pattern recognition, machine learning