Check your facts and try again: Improving large language models with external knowledge and automated feedback

Baolin Peng; Michel Galley; Pengcheng He; Hao Cheng; Yujia Xie; Yu Hu; Qiuyuan Huang; Lars Liden; Zhou Yu; Weizhu Chen; Jianfeng Gao

MSR-TR-2023-46 | March 2023

Published by Microsoft

Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering. However, applying LLMs to real-world, mission-critical applications remains challenging, mainly due to their tendency to generate hallucinations and their inability to use external knowledge. This paper proposes LLM-Augmenter, a system that augments a black-box LLM with a set of plug-and-play modules. Our system makes the LLM generate responses grounded in consolidated external knowledge, e.g., knowledge stored in task-specific databases. It also iteratively revises LLM prompts to improve model responses using feedback generated by utility functions, e.g., the factuality score of an LLM-generated response. The effectiveness of LLM-Augmenter is empirically validated on two types of mission-critical scenarios: task-oriented dialog and open-domain question answering. LLM-Augmenter significantly reduces ChatGPT’s hallucinations without sacrificing the fluency and informativeness of its responses. We make the source code and models publicly available.
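The loop described in the abstract (ground the prompt in external knowledge, score the response with a utility function, and revise the prompt using that feedback) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`retrieve_evidence`, `utility_score`, `build_prompt`, `augmented_generate`), the word-overlap retriever, the word-overlap factuality proxy, and the `threshold` and `max_iterations` defaults are assumptions chosen for illustration. The black-box LLM is represented by any callable that maps a prompt string to a response string.

```python
from typing import Callable, List


def retrieve_evidence(query: str, knowledge_base: List[str]) -> List[str]:
    """Toy retriever (assumption): keep entries sharing at least one word with the query."""
    query_words = set(query.lower().split())
    return [doc for doc in knowledge_base if query_words & set(doc.lower().split())]


def utility_score(response: str, evidence: List[str]) -> float:
    """Toy factuality proxy (assumption): fraction of response words found in the evidence."""
    words = response.lower().split()
    if not words:
        return 0.0
    evidence_words = set(" ".join(evidence).lower().split())
    return sum(word in evidence_words for word in words) / len(words)


def build_prompt(query: str, evidence: List[str], feedback: str) -> str:
    """Assemble a prompt from the evidence, the query, and any feedback so far."""
    parts = ["Evidence:"] + [f"- {doc}" for doc in evidence] + [f"Question: {query}"]
    if feedback:
        parts.append(f"Feedback: {feedback}")
    return "\n".join(parts)


def augmented_generate(
    query: str,
    knowledge_base: List[str],
    llm: Callable[[str], str],  # black-box LLM: prompt in, response out
    threshold: float = 0.8,     # hypothetical acceptance threshold
    max_iterations: int = 3,    # hypothetical cap on revision rounds
) -> str:
    """Query the LLM, score its response, and revise the prompt until the score passes."""
    evidence = retrieve_evidence(query, knowledge_base)
    feedback = ""
    response = ""
    for _ in range(max_iterations):
        response = llm(build_prompt(query, evidence, feedback))
        score = utility_score(response, evidence)
        if score >= threshold:
            break
        # Turn the utility score into textual feedback for the next prompt.
        feedback = (
            f"The previous response scored {score:.2f} for grounding. "
            "Revise it to rely only on the evidence."
        )
    return response
```

Because the LLM is passed in as a plain callable, the same loop can wrap a hosted model or a stub used for testing. In practice the retriever and the utility function would be replaced by task-specific components.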