Automatic Code Generation using Pre-Trained Language Models

Feb 21, 2021
Luis Perez, Lizi Ottens, Sudharshan Viswanathan

Recent advancements in natural language processing, exemplified by models such as GPT-2 and BERT, have led to near-human performance on multiple natural language tasks. In this paper, we seek to understand whether similar techniques can be applied in a highly structured environment with strict syntax rules. Specifically, we propose an end-to-end machine learning model for code generation in the Python language, built on top of pre-trained language models. We demonstrate that a fine-tuned model can perform well on code generation tasks, achieving a BLEU score of 0.22, a 46% improvement over a reasonable sequence-to-sequence baseline. All results and related code used for training and data processing are available on GitHub.
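To make the approach concrete, here is a minimal sketch of the general recipe the abstract describes: fine-tuning a pre-trained causal language model (GPT-2 here) on Python source text and scoring a generated sample with BLEU. This is not the authors' released code; the Hugging Face transformers and NLTK APIs, the toy corpus, the prompt, and all hyperparameters are illustrative assumptions.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
from nltk.translate.bleu_score import sentence_bleu

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Toy training corpus: comment/code pairs (stand-in for a real Python dataset).
corpus = [
    "# Return the sum of two numbers\ndef add(a, b):\n    return a + b\n",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for text in corpus:
        batch = tokenizer(text, truncation=True, max_length=256,
                          padding="max_length", return_tensors="pt")
        # Standard causal-LM objective: predict each token from its prefix.
        labels = batch["input_ids"].clone()
        labels[batch["attention_mask"] == 0] = -100  # ignore loss on padding
        out = model(input_ids=batch["input_ids"],
                    attention_mask=batch["attention_mask"],
                    labels=labels)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Generate code from a prompt and compare it to a reference with BLEU.
model.eval()
prompt = "# Return the sum of two numbers\ndef add(a, b):\n"
ids = tokenizer(prompt, return_tensors="pt").input_ids
gen = model.generate(ids, max_new_tokens=32, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)
hypothesis = tokenizer.decode(gen[0], skip_special_tokens=True)
reference = "# Return the sum of two numbers\ndef add(a, b):\n    return a + b"
print(f"BLEU: {sentence_bleu([reference.split()], hypothesis.split()):.2f}")

A real pipeline would train on a large corpus of Python files and compute corpus-level BLEU over many held-out examples, but the training objective and evaluation loop keep this same shape.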

* 9 pages, 11 figures 