Artificial intelligence researchers at Elon Musk’s OpenAI project recently made a big advance by winning a video game. Unlike recent AI victories over top human players in the games of Go and poker, this AI breakthrough involved a game that many people haven’t heard of, Dota 2. But to the hundreds of millions of fans of this type of online multiplayer battle game, a computer that can beat a professional player is a big deal.


It’s also significant to AI researchers, especially those in companies such as Google, Facebook, Microsoft and IBM, which are investing millions of dollars in creating superhuman AI players for digital games. As AI becomes ever more important in our society, it could have wider implications for all of us because of what it demonstrates about computers’ ability to “think” strategically.


What was particularly remarkable about the Dota 2 victory, achieved by a bot created by the billion-dollar non-profit research company OpenAI, was that its developers didn’t program it with deep understanding of game strategies. Instead, they used an approach known as deep reinforcement learning, where the computer starts with only rudimentary knowledge of game strategy.


By playing against itself millions of times, the AI learns to differentiate good move decisions (that lead to victory) from bad ones. The knowledge is stored in a huge data matrix containing millions of numbers, updated after every self-play game. These numbers encode what’s known as a “function”, the instructions that specify the AI’s learned strategy for every possible game situation. So after the AI researchers programmed the method for learning, the machine effectively taught itself how to make good move decisions.
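To make the idea concrete, here is a minimal sketch of learning by self-play. It is not OpenAI's method: it swaps the deep neural network for a small lookup table and swaps Dota 2 for a toy stone-taking game (players alternately take one, two or three stones, and whoever takes the last stone wins). The game, the function names and all parameter values are assumptions chosen for illustration. What carries over is the loop described above: the program plays against itself, and after every game it nudges the stored numbers so that moves leading to a win score higher.

```python
import random

MOVES = (1, 2, 3)  # a player may take 1, 2 or 3 stones per turn


def legal_moves(stones):
    """Moves that do not take more stones than remain."""
    return [m for m in MOVES if m <= stones]


def train_by_self_play(max_pile=10, games=20000, explore=0.3, step=0.5, seed=0):
    """Learn a table of move values by playing the toy game against itself.

    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # value[(stones, move)]: how good `move` looks for the player to move,
    # from -1 (leads to a loss) to +1 (leads to a win). Starts with no knowledge.
    value = {(s, m): 0.0 for s in range(1, max_pile + 1) for m in legal_moves(s)}

    for _ in range(games):
        stones = rng.randint(1, max_pile)  # random start so every position is seen
        history = []
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < explore:
                move = rng.choice(moves)  # occasionally try something new
            else:
                move = max(moves, key=lambda m: value[(stones, m)])
            history.append((stones, move))
            stones -= move

        # Update the table after the game, working backwards from the final move.
        target = 1.0  # whoever made the last move took the final stone and won
        for stones, move in reversed(history):
            value[(stones, move)] += step * (target - value[(stones, move)])
            # A strong position for the player to move is a weak result
            # for the opponent who moved just before them.
            target = -max(value[(stones, m)] for m in legal_moves(stones))
    return value


def best_move(value, stones):
    """The learned strategy: pick the highest-valued move in this position."""
    return max(legal_moves(stones), key=lambda m: value[(stones, m)])
```

Calling `best_move(train_by_self_play(), 7)` asks the trained table what to do with seven stones left. Nobody told the program the known winning trick for this game (always leave your opponent a multiple of four stones), yet the table it learns should settle on exactly that. A table only works because this game has a handful of positions; a game as large as Dota 2 needs a function that generalises across situations it has never seen, which is where the "deep" part of deep reinforcement learning comes in.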


Dota 2 is part of the rapidly growing eSports movement, where hundreds of millions of fans watch their (human) heroes play video games, online or at large stadium events. The top human players at Dota 2 are really, really good. They are millionaires who practice for ten hours per day, six or seven days per week. They have lucrative sponsorship deals, professional trainers, sports psychologists, strict health and fitness regimes and many of the other things you would associate with professional players in football or tennis.


So as an AI achievement, beating top human professionals Dendi, Sumail and Arteezy ranks up there with beating human world champions in chess, Go and other games. This is especially true since Dota 2 involves a rich selection of tactics that play out on the screen in real time, meaning players have much less time to think than in turn-based board games.


There are some caveats. The OpenAI player won a two-player version of what is usually a ten-player team game. And each player could only play as one particular character out of the more than 100 typically available. So this is like beating an individual pro basketball player in a one-on-one game, a significant step that still falls short of the goal of beating a team of human professional players.


Shortly after the show match with Dendi, members of the large crowd were challenged to find ways to beat the AI player, with the first 50 being awarded prizes. All 50 prizes were claimed by humans adopting wacky strategies that the AI player had not previously seen, although the AI can now learn and adapt by itself so would avoid making the same mistakes again.


Why invest in game AI research?


The reason all this is of interest to blue-chip companies is that eSports games provide an easy performance measure that generates substantial public interest. Big firms have been investing vast sums in winning games for more than 20 years, since the triumph of IBM’s Deep Blue against the world chess champion, Garry Kasparov.


The real world is not that simple, and nor is reaching the goal of “artificial general intelligence” comparable to that of humans. But AI’s victory in Dota 2, just like in other games before it, could point to other exciting developments.


For one thing, games designers and players don’t just want AI that can win a game; they want AI that makes the game more fun. Games provide a unique way to understand how people behave and in particular how human psychology interacts with AI behaviour. By capturing the data for millions of players, as we’re doing at the UK’s Digital Creativity Labs, we can effectively run a huge online psychology experiment that informs us as to what people want from AI, as we research new AI techniques.


Developing AI that can learn to make the best decisions in games could also feed into AI for making other strategic choices in the real world. The Dota 2 AI learns the “function” that gives it the strategy to follow in any game situation. Similarly, we could imagine AI programs that learn functions for certain economic, environmental and health situations – for example a recession or an outbreak of disease. These functions would generate effective strategies for dealing with these situations, capable of suggesting good decisions in government or business.


One of the limitations of this kind of decision-making AI is that it can’t tell us why it makes a particular move. While AI may be able to help us make better decisions for some of the toughest strategic problems we face, we will still need humans in the decision loop to weigh wider ethical and social considerations. That will make getting humans and AI to work together more important than ever.


Peter Cowling, Director of IGGI and DC Labs, Professor of Computer Science, University of York