NeurIPS Grounding Code Generation with Input-Output Specifications

Poster in Workshop: Instruction Tuning and Instruction Following

Grounding Code Generation with Input-Output Specifications

Yeming Wen · Pengcheng Yin · Kensen Shi · Henryk Michalewski · Swarat Chaudhuri · Oleksandr Polozov

Keywords: [ code generation ] [ I/O specifications ] [ Instruction Tuning ]


Abstract:

Large language models (LLMs) have demonstrated significant potential in code generation. However, the code generated by these models occasionally deviates from the user's intended outcome, resulting in executable but incorrect code. To mitigate this issue, we propose Gift4Code, a novel approach for the instruction fine-tuning of LLMs specifically tailored for code generation. Our method leverages synthetic data produced by the LLM itself and utilizes execution-derived feedback as a key learning signal. This feedback, in the form of program input-output specifications, is provided to the LLM to facilitate fine-tuning. We evaluated our approach on two challenging data science benchmarks, Arcade and DS-1000. Our results suggest that the method enhances the LLM's alignment with user intentions, reducing the incidence of executable but incorrect outputs.
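The abstract does not specify how the input-output specifications are represented or derived, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a type-level specification, candidate programs that write their answer to a variable named `result`, and two hypothetical helpers, `derive_io_spec` and `build_finetuning_examples`. Candidate programs (which would be sampled from the LLM) are executed, those that fail are discarded, and the observed input and output types are attached to the intent to form a fine-tuning example.

```python
def derive_io_spec(code, inputs, output_var="result"):
    """Execute `code` with `inputs` in scope and return a type-level I/O spec.

    Returns None when the program raises or never defines `output_var`.
    The spec format here is an assumption made for illustration.
    """
    scope = dict(inputs)  # copy so the caller's inputs are not mutated
    try:
        exec(code, scope)  # only safe for trusted code; sandbox in practice
    except Exception:
        return None
    if output_var not in scope:
        return None
    return {
        "inputs": {name: type(value).__name__ for name, value in inputs.items()},
        "output": type(scope[output_var]).__name__,
    }


def build_finetuning_examples(intent, candidates, inputs):
    """Pair the intent plus an execution-derived spec with each runnable program."""
    examples = []
    for code in candidates:
        spec = derive_io_spec(code, inputs)
        if spec is None:
            continue  # drop programs that do not execute
        prompt = f"{intent}\nInputs: {spec['inputs']}\nOutput: {spec['output']}"
        examples.append({"prompt": prompt, "target": code})
    return examples


# Hypothetical candidates standing in for programs sampled from the LLM.
candidates = [
    "result = sum(prices)",
    "result = prices.mean()",  # fails: a plain list has no .mean()
    "result = [p * 2 for p in prices]",
]
examples = build_finetuning_examples(
    "Compute the total price.", candidates, {"prices": [1.0, 2.0]}
)
```

In this sketch the two runnable candidates yield different output types (`float` and `list`), so the specification attached to each prompt distinguishes programs that both execute but produce different kinds of results.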
