Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems

Abstract

Recent advances in neural approaches have greatly improved task-oriented dialog (TOD) systems, which assist users in accomplishing their goals. However, such systems rely on costly, manually labeled dialogs, which are often unavailable in practical scenarios. In this paper, we present our models for Track 2 of the SereTOD 2022 challenge, the first challenge for building semi-supervised and reinforced TOD systems on MobileCS, a large-scale real-world Chinese TOD dataset. We build a knowledge-grounded dialog model that takes the dialog history and the local KB as input and predicts the system response, and we perform semi-supervised pre-training on both the labeled and the unlabeled data. Our system achieves first place in both the automatic evaluation and the human interaction evaluation, with notably higher BLEU (+7.64) and Success rate (+13.6%) than the second-place team.
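The knowledge-grounded formulation above can be illustrated with a minimal sketch: the dialog history and the local KB are flattened into a single sequence that a generative model conditions on to predict the next system response. The special tokens (`[user]`, `[system]`, `[kb]`) and the serialization scheme here are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of serializing dialog history + local KB into one model input.
# Token names and the flattening order are assumptions for illustration.

def build_model_input(dialog_history, local_kb):
    """Flatten (speaker, utterance) turns and KB slot-value pairs
    into a single string for a seq2seq dialog model."""
    history = " ".join(
        f"[{speaker}] {utterance}" for speaker, utterance in dialog_history
    )
    kb = " ".join(f"[kb] {slot} = {value}" for slot, value in local_kb.items())
    return f"{history} {kb}"

# The generative model would condition on this context string
# to produce the next system response.
context = build_model_input(
    [("user", "I want to check my data plan."),
     ("system", "Sure, may I have your phone number?")],
    {"plan_name": "20GB monthly", "balance": "12.5GB"},
)
print(context)
```

In a semi-supervised setup along these lines, labeled dialogs would supply gold KB annotations for this serialization, while unlabeled dialogs contribute to pre-training without the KB portion.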

Publication
EMNLP 2022 Workshop (SereTOD)
Weihao Zeng, Postgraduate Student
Keqing He, Postgraduate Student
Zechen Wang, Postgraduate Student
Dayuan Fu, Postgraduate Student
Guanting Dong, Postgraduate Student
Ruotong Geng, Postgraduate Student
Pei Wang, Postgraduate Student
Weiran Xu, Associate Professor, Master Supervisor, Ph.D. Supervisor