Bimodal Modelling of Source Code and Natural Language

Miltos Allamanis; Daniel Tarlow; Andy Gordon; Yi Wei

International Conference on Machine Learning | August 2015

We consider the problem of building probabilistic models that jointly capture short natural language utterances and source code snippets. The aim is to bring together recent work on statistical modelling of source code and work on bimodal models of images and natural language. The resulting models are useful for a variety of tasks that involve natural language and source code. We demonstrate their performance on two retrieval tasks: retrieving source code snippets given a natural language query, and retrieving natural language descriptions given a source code query (i.e., source code captioning). The experiments show promise in this direction and indicate that modelling the structure of source code helps on the retrieval tasks.
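
To make the two retrieval tasks concrete, the following is a minimal sketch of ranking candidates in both directions with a single scoring function. It is not the paper's model: the learned probabilistic model is replaced here by a toy cosine similarity over token counts, and every name in it (`tokenize`, `score`, `retrieve`, `PAIRS`) along with the example data is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and split on non-alphanumerics."""
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^A-Za-z0-9]+", text.lower()) if t]


def score(a, b):
    """Toy stand-in for a learned joint model (assumption, not the paper's
    method): cosine similarity between the token-count vectors of a and b."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, candidates):
    """Return candidates ordered from best to worst match for the query."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)


# Hypothetical (description, snippet) pairs used only for illustration.
PAIRS = [
    ("sort a list of numbers in descending order", "numbers.sort(reverse=True)"),
    ("read all lines from a file", "lines = open(path).readlines()"),
    ("get the maximum value in a list", "maxValue = max(values)"),
]
DESCRIPTIONS = [d for d, _ in PAIRS]
SNIPPETS = [s for _, s in PAIRS]

# Task 1: natural language query -> ranked source code snippets.
code_ranking = retrieve("read lines from file", SNIPPETS)

# Task 2: source code query -> ranked natural language descriptions.
caption_ranking = retrieve("maxValue = max(values)", DESCRIPTIONS)
```

Because the same `score` function serves both directions, the sketch mirrors the idea of one joint model supporting both retrieval tasks. A model that accounts for the structure of the code, which the abstract reports as helpful, would replace the flat token counts used here.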