DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

Abstract

Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Several instruction tuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this paper, we introduce DolphCoder, a diverse instruction model with self-evaluation for code generation. It learns diverse instruction targets and incorporates a code evaluation objective to enhance its code generation ability. Our model achieves superior performance on the HumanEval and MBPP benchmarks, offering new insights for future code instruction tuning work. Our key findings are: (1) Augmenting the training data with more diverse responses that follow distinct reasoning paths increases the code capability of LLMs. (2) Improving a model's ability to evaluate the correctness of code solutions also enhances its ability to generate them.

Publication
ACL 2024
Yejie Wang
Postgraduate Student
Keqing He
Postgraduate Student

Dialogue System, Summarization, Pre-training Language Model

Guanting Dong
Postgraduate Student

Spoken Language Understanding and related applications

Pei Wang
Postgraduate Student
Weihao Zeng
Postgraduate Student
Muxi Diao
Postgraduate Student
Yutao Mu
Postgraduate Student

Task-oriented Dialogue System, Spoken Language Understanding

Weiran Xu
Associate Professor, Master Supervisor, Ph.D. Supervisor

Information Retrieval, Pattern Recognition, Machine Learning