Watch the Neighbors: A Unified K-Nearest Neighbor Contrastive Learning Framework for OOD Intent Discovery

Abstract

Discovering out-of-domain (OOD) intents is important for developing new skills in task-oriented dialogue systems. The key challenges lie in how to transfer prior in-domain (IND) knowledge to OOD clustering, and how to jointly learn OOD representations and cluster assignments. Previous methods suffer from the in-domain overfitting problem, and there is a natural gap between the representation learning and clustering objectives. In this paper, we propose a unified K-nearest neighbor contrastive learning framework to discover OOD intents. Specifically, for the IND pre-training stage, we propose a KCL objective to learn inter-class discriminative features while maintaining intra-class diversity, which alleviates the in-domain overfitting problem. For the OOD clustering stage, we propose a KCC method to form compact clusters by mining true hard negative samples, which bridges the gap between clustering and representation learning. Extensive experiments on three benchmark datasets show that our method achieves substantial improvements over state-of-the-art methods.
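A KCL-style objective can be viewed as a supervised contrastive loss whose positive set for each anchor is restricted to its K nearest same-class neighbors, so that classes are separated without pulling every same-class sample onto a single point. The sketch below is only an illustrative PyTorch approximation of that idea under this assumption, not the authors' released implementation; the function name knn_contrastive_loss and the defaults k and temperature are hypothetical.

```python
# A minimal sketch (not the authors' released code) of a KNN-restricted
# supervised contrastive loss: each anchor is pulled toward only its K
# nearest same-class neighbors instead of every same-class sample, which
# is one way to keep intra-class diversity while separating classes.
import torch
import torch.nn.functional as F

def knn_contrastive_loss(features, labels, k=3, temperature=0.07):
    """features: (N, D) encoder embeddings; labels: (N,) IND intent labels."""
    z = F.normalize(features, dim=1)                  # L2-normalize embeddings
    sim = z @ z.t() / temperature                     # scaled cosine similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)

    # Candidate positives: same-class pairs, excluding the anchor itself.
    same_class = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos_candidates = same_class & ~eye

    # Keep only the K most similar same-class neighbors as positives.
    masked_sim = sim.masked_fill(~pos_candidates, float("-inf"))
    topk = masked_sim.topk(k=min(k, n - 1), dim=1).indices
    pos_mask = torch.zeros(n, n, device=z.device).scatter_(1, topk, 1.0).bool()
    pos_mask &= pos_candidates                        # drop picks from too-small classes

    # InfoNCE-style loss: log-softmax over all non-self pairs, averaged
    # over the selected K positives of each anchor.
    logits = sim.masked_fill(eye, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -pos_log_prob.sum(dim=1) / pos_counts
    return loss[pos_mask.any(dim=1)].mean()           # ignore anchors with no positives
```

In an IND pre-training loop, features would be the encoder outputs for a labeled batch and labels the corresponding IND intent labels.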

Conference
EMNLP 2022
牟宇滔
Master's student

Task-oriented dialogue systems, natural language understanding

何可清
Master's student

Dialogue systems, text summarization, pre-training

王霈
Master's student
吴亚楠
Master's student

Natural language understanding

徐蔚然
Associate professor, supervisor of master's and doctoral students

Information retrieval, pattern recognition, machine learning