Project Malmo: Reinforcement learning in a complex world

Published July 27, 2018

By Noboru Sean Kuno, Senior Research Program Manager

France’s victory over Croatia in the 2018 FIFA World Cup was as thrilling as sports competition gets. If you’re as much a fan of the game as I am, you enjoyed watching 32 national teams vie for the title over a beautiful month across 11 cities in Russia.

The riveting action on the pitch reminded us of another kind of competition, one that involves software agents instead of football teams. Two years ago, a cross-Microsoft Research team with participants from Redmond, Washington; New York City; and Cambridge, United Kingdom, launched Project Malmo, an open-ended platform to advance the state of the art in AI research, especially reinforcement learning in a complex world. The platform is designed to take what’s possible today and push our research toward more ambitious and more difficult tasks.

Last year we held our first competition, the Malmo Collaborative AI Challenge. It focused on human and software agents working together to tackle certain tasks. The competition attracted many students worldwide, and the winners were invited to AI Summer School 2017, hosted by Microsoft Research Cambridge. Winning teams received Azure for Research. An interesting discovery in Cambridge was the sheer diversity of approaches from participants. We were delighted to see students bring creative approaches and well-designed implementations to their agents. Indeed, in the wake of the competition, one of the winning teams, from Nanyang Technological University, published an AAAI paper on their approach titled “HogRider: Champion Agent of Microsoft Malmo Collaborative AI Challenge” that is absolutely worth a read.

Today we’re happy to share an additional milestone involving Project Malmo. Microsoft is partnering with Queen Mary University of London and CrowdAI to co-host a second competition, Learning to Play: The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition. This brand-new challenge proposes research on multi-agent reinforcement learning using multiple games. Participants create learning agents able to play multiple 3D games as defined in the Project Malmo platform. The aim of the competition is to encourage AI research on more general approaches through multi-player games. The challenge will consist of not one but several games, each involving tasks of varying difficulty and settings, which makes its approach distinctive.

Diego Perez-Liebana, Lecturer in Computer Games and Artificial Intelligence at Queen Mary University of London, United Kingdom, talked about the potential impact of the Malmo competitions on AI research. “Our research group has been running AI game-based competitions for many years and we are well aware of the multiple benefits these bring,” said Perez-Liebana. “They provide a common benchmark for multiple researchers across the globe to train their AI agents in a way that is comparable, allowing us to effectively contrast different techniques in a common domain,” he continued. Game AI competitions are a great resource for education, as they can be set as assignments or projects from undergraduate to PhD level. “Games are fun, and so is AI,” said Perez-Liebana. Indeed, combining the two helps popularize challenges and solutions faster and more broadly than other methods. As a clear example, Perez-Liebana pointed to the evolution of Monte Carlo Tree Search methods during successive Go competitions, which led to the use of this method in multiple other games and domains.

Sharada Prasanna Mohanty, a PhD student at EPFL, Switzerland, and co-founder of CrowdAI, expressed his expectations regarding the competition. “With this challenge, our principal goal is to make available a series of problems for the community of multi-agent reinforcement learning researchers to collaboratively work on,” said Sharada Mohanty. “With Minecraft as the main platform enabling this research, we also hope to inspire many other researchers and engineers from various domains to get involved in reinforcement learning research. The success of this challenge can help establish these tasks as standard benchmark tasks for all multi-agent reinforcement learning researchers to compare their approaches in the future and at the same time can potentially help us better measure our own progress in multi-agent reinforcement learning research as a community over time.”

The competition is open to anyone worldwide. Visit the competition page for more details about registration and rules. Qualifying rounds last until November 12th. The top 32 teams in the qualifying rounds move forward to the knockout rounds of the final tournament, where team agents compete against each other on an exciting set of games and tasks. The tournament will be a live competition at the MARLO workshop at the 14th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE’18), to be held at the University of Alberta in Edmonton, AB, Canada, on November 14, 2018. We’re also calling for papers at the workshop.

We hope to see as many students, researchers and engineers as possible share their innovative approaches and creative ideas for multi-agent reinforcement learning at the workshop in Edmonton. Let’s kick off! You are now in possession of the ball!