Figure: A flowchart illustrating how KBLaM handles a prompt. Documents are first used to construct and summarize a knowledge base (KB) offline. The summarized KB is encoded and fed into the main process: a prompt passes through a tokenizer, then rectangular attention, and into the large language model (LLM), which retrieves information from the encoded KB to generate an answer.

Introducing KBLaM: Bringing plug-and-play external knowledge to LLMs 

March 18, 2025 | Taketomo Isazawa, Xi Wang, Liana Mikaelyan, Mathew Salvaris, and James Hensman

KBLaM is an approach that encodes and stores structured knowledge within an LLM itself. By integrating knowledge without retraining, it offers a scalable alternative to traditional approaches such as retrieval-augmented generation (RAG).
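To make the idea concrete, the sketch below illustrates one way knowledge-base triples could be encoded offline into key-value "knowledge token" vectors and then exposed to prompt tokens through a rectangular attention pattern: prompt tokens attend to the knowledge tokens, but the knowledge tokens never attend to the prompt or to each other, so cost grows linearly with the number of facts. This is a simplified, hypothetical PyTorch sketch, not the KBLaM implementation; the function names (`encode_triple`, `rectangular_attention`), the stand-in encoder, and the dimensions are all assumptions made for illustration.

```python
# Illustrative sketch only (not the official KBLaM code): knowledge tokens are
# appended on the key/value side of attention, while queries come only from
# the prompt, giving the attention matrix its "rectangular" shape.
import math
import zlib
import torch
import torch.nn.functional as F

D = 64  # hidden size of the toy model (assumption)

def encode_triple(name: str, prop: str, value: str):
    """Placeholder for the offline step: in KBLaM, a pretrained sentence
    encoder plus learned linear adapters map each (entity, property, value)
    triple to a key/value pair of vectors. Here a seeded random projection
    stands in so the sketch runs standalone."""
    seed = zlib.crc32(f"{name}|{prop}|{value}".encode())
    g = torch.Generator().manual_seed(seed)
    key = torch.randn(D, generator=g)
    val = torch.randn(D, generator=g)
    return key, val

def rectangular_attention(prompt_h: torch.Tensor,
                          kb_keys: torch.Tensor,
                          kb_vals: torch.Tensor) -> torch.Tensor:
    """One attention layer over [KB tokens | prompt tokens].
    prompt_h: (T, D) prompt hidden states; kb_keys/kb_vals: (M, D)."""
    T, M = prompt_h.shape[0], kb_keys.shape[0]
    q = prompt_h                               # queries come only from the prompt
    k = torch.cat([kb_keys, prompt_h], dim=0)  # (M + T, D)
    v = torch.cat([kb_vals, prompt_h], dim=0)
    scores = q @ k.T / math.sqrt(D)            # (T, M + T): rectangular
    # Prompt-to-prompt attention stays causal; prompt-to-KB is unmasked.
    causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores[:, M:].masked_fill_(causal, float("-inf"))
    return F.softmax(scores, dim=-1) @ v       # (T, D)

# Toy usage: three KB triples and a five-token prompt.
kb = [("Kepler-22b", "type", "exoplanet"),
      ("Kepler-22b", "discovered", "2011"),
      ("Kepler-22b", "host star", "Kepler-22")]
kb_keys = torch.stack([encode_triple(*t)[0] for t in kb])
kb_vals = torch.stack([encode_triple(*t)[1] for t in kb])
prompt_hidden = torch.randn(5, D)
out = rectangular_attention(prompt_hidden, kb_keys, kb_vals)
print(out.shape)  # torch.Size([5, 64])
```

Under this framing, adding or removing a fact only adds or removes one key/value row, with no retraining of the base model, which is what makes the external knowledge plug-and-play.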