Project Malmo: Reinforcement learning in a complex world

By , Senior Research Program Manager

France’s victory over Croatia in the 2018 FIFA World Cup was as thrilling as sports competition gets. If you’re as much a fan of the game as I am, you enjoyed watching 32 national teams vie for the title over a beautiful month across 11 cities in Russia.

The riveting action on the pitch reminded us of another kind of competition, one that involves software agents instead of football teams. Two years ago, a collaborative team spanning Microsoft Research labs in Redmond, Washington; New York City; and Cambridge, United Kingdom, launched Project Malmo, an open-ended platform to advance the state of the art in AI research, especially reinforcement learning in a complex world. The platform is designed to take what’s possible today and push our research toward more ambitious and more difficult tasks.

Last year we held our first competition, the Malmo Collaborative AI Challenge. It focused on human and software agents working together to tackle certain tasks. The competition attracted many students worldwide, and the winners were invited to AI Summer School 2017, hosted by Microsoft Research Cambridge. Winning teams received Azure for Research. An interesting discovery in Cambridge was the sheer diversity of approaches from participants. We were delighted to see students bring creative approaches and well-designed implementations of their agents. Indeed, in the wake of the competition, one of the winning teams, from Nanyang Technological University, published an AAAI paper on their approach, “HogRider: Champion Agent of Microsoft Malmo Collaborative AI Challenge,” which is well worth a read.

Today we’re happy to share an additional milestone involving Project Malmo. Microsoft is partnering with Queen Mary University of London and CrowdAI to co-host a second competition, Learning to Play: The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition. This brand-new challenge invites research on multi-agent reinforcement learning across multiple games. Participants create learning agents able to play multiple 3D games defined in the Project Malmo platform. The aim of the competition is to encourage AI research on more general approaches through multi-player games. Rather than a single game, the challenge consists of several games, each involving tasks of varying difficulty and settings. This multi-game format is what makes the competition distinctive.

Diego Perez-Liebana, Lecturer in Computer Games and Artificial Intelligence at Queen Mary University of London, United Kingdom, talked about the potential impact of the Malmo competitions on AI research. “Our research group has been running AI game-based competitions for many years and we are well aware of the multiple benefits these bring,” said Perez-Liebana. “They provide a common benchmark for multiple researchers across the globe to train their AI agents in a way that is comparable, allowing us to effectively contrast different techniques in a common domain,” he continued. Game AI competitions are also a great resource for education, as they can be set as assignments or projects from undergraduate to PhD level. “Games are fun, and so is AI,” said Perez-Liebana. Indeed, combining the two helps popularize challenges and solutions faster and more broadly than other methods. As a clear example, Perez-Liebana pointed to the evolution of Monte Carlo Tree Search methods during successive Go competitions, which led to their use in multiple other games and domains.

Sharada Prasanna Mohanty, a PhD student at EPFL, Switzerland, and co-founder of CrowdAI, expressed his expectations regarding the competition. “With this challenge, our principal goal is to make available a series of problems for the community of multi-agent reinforcement learning researchers to collaboratively work on,” said Mohanty. “With Minecraft as the main platform enabling this research, we also hope to inspire many other researchers and engineers from various domains to get involved in reinforcement learning research. The success of this challenge can help establish these tasks as standard benchmark tasks for all multi-agent reinforcement learning researchers to compare their approaches in the future and at the same time can potentially help us better measure our own progress in multi-agent reinforcement learning research as a community over time.”

The competition is open to anyone worldwide. Visit the competition page for more detail about registration and rules. Qualifying rounds last until November 12th. The top 32 teams in the qualifying rounds move forward to the knockout rounds of the final tournament, where team agents compete against each other on an exciting set of games and tasks. The tournament will be a live competition at the MARLO workshop at the 14th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE’18), to be held at the University of Alberta in Edmonton, AB, Canada, on November 14, 2018. We’re also calling for papers for the workshop.

We hope to see as many students, researchers and engineers as possible share their innovative approaches and creative ideas for multi-agent reinforcement learning at the workshop in Edmonton. Let’s kick off! The ball is now at your feet!
