Let’s Fix this Together: Conversational Debugging with GitHub Copilot

Despite advancements in IDE tooling, code understanding, code generation, and automated repair, debugging continues to present significant challenges. The debugging strategies available to developers in the literature are often too mechanical and rigid for day-to-day issues. Recent advances in Large Language Models (LLMs) promise practical solutions that allow for more free-form debugging strategies. While LLMs offer satisfactory assistance in some cases, they often leap to action without sufficient context, making implicit assumptions and providing inaccurate responses. Moreover, the dialogue between developers and LLMs predominantly takes the form of question-answer pairs, which places the burden of formulating the correct questions and sustaining multi-turn conversations on the developer.

We introduce ROBIN, a novel multi-agent conversational AI assistant within GitHub Copilot Chat, designed specifically for debugging. ROBIN moves beyond question-answer pairs by introducing the investigate & respond pattern, which uses information gathered automatically from the IDE or interactively from the developer before responding. ROBIN also incorporates a general debugging strategy that systematically analyzes bugs, sustaining collaborative interactions while ensuring that the conversation does not deviate from the debugging task at hand. Through a within-subjects user study with 16 industry professionals, we find that equipping ROBIN to (1) leverage the insert expansion interaction pattern, (2) facilitate turn-taking, and (3) utilize debugging strategies leads to lowered conversation barriers, a 2.5x improvement in bug localization, and a 3.5x improvement in bug resolution compared to AI-assisted debugging in Visual Studio prior to ROBIN.