Textbooks Are All You Need

Suriya Gunasekar; Yi Zhang; Jyoti Aneja; Caio César Teodoro Mendes; Allie Del Giorno; Sivakanth Gopi; Mojan Javaheripi; Piero Kauffmann; Gustavo de Rosa; Olli Saarikivi; Adil Salim; Shital Shah; Harkirat Singh Behl; Xin Wang; Sébastien Bubeck; Ronen Eldan; Adam Tauman Kalai; Yin Tat Lee; Yuanzhi Li

June 2023


We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of “textbook quality” data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval.
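The abstract reports pass@1 figures without saying how they are computed. As a rough illustration, the sketch below uses the unbiased pass@k estimator commonly associated with the HumanEval benchmark; whether phi-1 was scored exactly this way is an assumption. The function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical and chosen here only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (illustrative helper, not from the paper).

    n: number of samples generated for the problem
    c: number of those samples that pass all unit tests
    k: number of attempts allowed
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples fail, every k-subset contains a passing sample.
    if n - c < k:
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, the fraction of generated samples that pass, so a benchmark-level pass@1 is simply that fraction averaged over all problems.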

Publication Downloads

Phi-1

December 11, 2023

The language model phi-1 is a Transformer with 1.3 billion parameters, specialized for basic Python coding. Its training data came from several sources: subsets of Python code from The Stack v1.2, Q&A content from StackOverflow, competition code from code_contests, and synthetic Python textbooks and exercises generated by gpt-3.5-turbo-0301. Although the model and its datasets are small compared with contemporary Large Language Models (LLMs), phi-1 achieves an accuracy above 50% on HumanEval, a benchmark of simple Python coding problems.
