DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

Abstract

Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Several instruction tuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this paper, we introduce DolphCoder, a diverse instruction model with self-evaluation for code generation. It learns diverse instruction targets and combines them with a code evaluation objective to enhance its code generation ability. Our model achieves superior performance on the HumanEval and MBPP benchmarks, offering new insights for future code instruction tuning work. Our key findings are: (1) Augmenting with more diverse responses that follow distinct reasoning paths increases the code capability of LLMs. (2) Improving a model's ability to evaluate the correctness of code solutions also enhances its ability to generate them.

Conference
ACL 2024
王业捷
Master's student
何可清
Master's student

Dialogue systems, summarization, pre-training

董冠霆
Master's student

Natural language understanding

王霈
Master's student
曾伟豪
Master's student
刁沐熙
Master's student
牟宇滔
Master's student

Task-oriented dialogue systems, natural language understanding

徐蔚然
Associate Professor; Master's and PhD supervisor

Information retrieval, pattern recognition, machine learning