Augmenting Language Models with Long-Term Memory

NeurIPS 2023

Existing large language models (LLMs) can only accept fixed-size inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Decoupled-Memory-Augmented LLMs (DeMA), which enables LLMs to memorize long histories. We design a novel decoupled network architecture in which the original backbone LLM is frozen as a memory encoder and an adaptive residual side-network serves as a memory retriever and reader. This decoupled memory design makes it easy to cache and update long-term past contexts for memory retrieval without suffering from memory staleness. Enhanced with memory-augmented adaptation training, DeMA can memorize long past context and use long-term memory for language modeling. The proposed memory retrieval module can handle flexible context in its memory bank to benefit various downstream tasks, including memorizing long inputs for language modeling and caching many-shot demonstration examples to enhance in-context learning. Experiments show that our method outperforms strong long-context models on ChapterBreak, a challenging long-context modeling benchmark, and achieves remarkable improvements in memory-augmented in-context learning over LLMs. These results demonstrate that the proposed method is effective in helping language models memorize and utilize long-form content.
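To make the decoupled design concrete, below is a minimal illustrative sketch (not the paper's implementation) of the idea: a frozen backbone produces cached key/value states, a memory bank stores and retrieves them, and a small trainable side-network reads the retrieved memory back into the current hidden states. All class and parameter names here (`MemoryBank`, `SideNetworkReader`, the top-k size, the bank capacity) are assumptions for illustration only.

```python
# Illustrative sketch of decoupled long-term memory, assuming hidden states from a
# frozen backbone LLM are cached as key/value pairs and fused by a trainable reader.
import torch
import torch.nn as nn


class MemoryBank:
    """Caches key/value states emitted by the frozen backbone (hypothetical layout)."""

    def __init__(self, dim: int, max_size: int = 65536):
        self.keys = torch.empty(0, dim)
        self.values = torch.empty(0, dim)
        self.max_size = max_size

    def write(self, keys: torch.Tensor, values: torch.Tensor) -> None:
        # Append new states and evict the oldest entries beyond capacity,
        # so the cache can be updated without retraining the backbone.
        self.keys = torch.cat([self.keys, keys.detach()], dim=0)[-self.max_size:]
        self.values = torch.cat([self.values, values.detach()], dim=0)[-self.max_size:]

    def retrieve(self, queries: torch.Tensor, k: int = 4):
        # Dot-product retrieval of the top-k cached entries for each query token.
        scores = queries @ self.keys.T                       # (T, M)
        idx = scores.topk(k, dim=-1).indices                 # (T, k)
        return self.keys[idx], self.values[idx]              # (T, k, D) each


class SideNetworkReader(nn.Module):
    """Trainable residual side-network that fuses retrieved memory into hidden states."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, hidden: torch.Tensor, mem_k: torch.Tensor, mem_v: torch.Tensor):
        # Attend from each current token to its retrieved memory, add as a residual.
        q = hidden.unsqueeze(1)                              # (T, 1, D)
        fused, _ = self.attn(q, mem_k, mem_v)                # (T, 1, D)
        return hidden + self.proj(fused.squeeze(1))


# Usage: the frozen backbone's hidden states feed the bank; only the reader is trained.
dim = 64
bank = MemoryBank(dim)
reader = SideNetworkReader(dim)
past_states = torch.randn(512, dim)      # stand-in for frozen-LLM hidden states
bank.write(past_states, past_states)
current = torch.randn(128, dim)          # hidden states of the current segment
mem_k, mem_v = bank.retrieve(current, k=4)
out = reader(current, mem_k, mem_v)      # (128, 64), memory-augmented representation
```

Because the backbone stays frozen and only the side-network is optimized during memory-augmented adaptation, the cached states never go stale with respect to the encoder that produced them, which is the motivation for the decoupled design described above.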

GitHub