Google’s DeepMind today shared the results of research and experiments in which multiple AI systems were trained to play Capture the Flag in Quake III Arena, a multiplayer first-person shooter game. An AI trained in the process is now better than most human players in the game, regardless of whether it’s playing with a human or machine teammate.
The AI, named For the Win (FTW), played nearly 450,000 games of Quake III Arena to gain its dominance over human players and establish its understanding of how to effectively work with other machines and humans. DeepMind refers to the practice of training multiple independently operating agents to take collective action as multi-agent learning.
“We train agents that learn and act as individuals, but which must be able to play on teams with and against any other agents, artificial or human,” the company said today in a blog post. “From a multi-agent perspective, [Capture the Flag] requires players to both successfully cooperate with their teammates as well as compete with the opposing team, while remaining robust to any playing style they might encounter.”
DeepMind is perhaps best known as the creator of AlphaGo, an AI system that beat the top Go player in the world in May 2017. AlphaGo Zero, a descendant of AlphaGo, later learned to improve by playing games against itself.
Whereas some previous studies of video game play with reinforcement learning focus on environments with a handful of players, DeepMind’s experiment involved 30 agents training concurrently and playing four at a time in games against humans or machines.
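As a rough illustration of what drawing four players at a time from a pool of 30 agents could look like, here is a minimal Python sketch. The uniform random sampling, the two-versus-two split, and the names `sample_match`, `POPULATION_SIZE`, `PLAYERS_PER_GAME`, and `TEAM_SIZE` are assumptions made for this example; the article does not describe how DeepMind actually selected players for each game.

```python
import random

POPULATION_SIZE = 30   # from the article: 30 agents in the experiment
PLAYERS_PER_GAME = 4   # from the article: agents play four at a time
TEAM_SIZE = 2          # assumption: two teams of two agents each


def sample_match(rng):
    """Pick four distinct agent ids uniformly at random and split them into two teams."""
    players = rng.sample(range(POPULATION_SIZE), PLAYERS_PER_GAME)
    return players[:TEAM_SIZE], players[TEAM_SIZE:]


# Hypothetical usage: a seeded generator keeps the draw reproducible.
rng = random.Random(0)
team_a, team_b = sample_match(rng)
```

Each call yields two disjoint teams, so any agent may end up as a teammate or an opponent of any other agent in the pool.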
In a tournament with and against 40 human Capture the Flag players, machine-only teams went undefeated in games against human-only teams and stood a 95 percent chance of winning against teams in which humans played with a machine partner.
On average, human-machine teams captured 16 fewer flags per game than a team of two FTW agents.
Agents were found to be more efficient than humans at tagging, succeeding 80 percent of the time compared with humans’ 48 percent. FTW kept its edge over human players even when its tagging abilities were suppressed to levels comparable to humans.
Interestingly enough, a survey of human participants found FTW to be more collaborative than human teammates.
Authors of the study include DeepMind founder and CEO Demis Hassabis.
The research was carried out with some unique challenges.
Capture the Flag was played on randomly generated map layouts rather than in a static, consistent environment, so that agents would develop a general understanding of the game instead of memorizing a single map. Indoor environments with flat terrain and outdoor environments with varying elevations were also introduced. Agents also operated in slow or fast modes and developed their own internal reward systems.
The only signal used to teach the agents was whether their team won the game by capturing the most flags within five minutes.
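A sparse, outcome-only training signal of this kind can be sketched in a few lines of Python. The function name `team_reward`, the numeric values of +1 and -1, and the handling of draws are assumptions chosen for illustration; the article only says that the signal was whether the team won by capturing the most flags.

```python
def team_reward(own_flags, opponent_flags):
    """Return one end-of-game reward based only on the final flag counts."""
    if own_flags > opponent_flags:
        return 1.0    # win (value is an assumption)
    if own_flags < opponent_flags:
        return -1.0   # loss (value is an assumption)
    return 0.0        # draw (handling is an assumption)
```

Nothing in this signal rewards tagging, defending, or picking up a flag directly, which is why any such behavior has to be discovered by the agents themselves.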
No rules of the game were given to machines beforehand, yet over time, FTW learned basic strategies like home base defense, following a teammate, or camping out in an opponent’s base to tag them after a flag has been captured.
Tagging, the act of touching an opponent to send them back to their spawning point, was incorporated into tactics used to win matches as well.
DeepMind’s study is the latest in which AI researchers apply reinforcement learning to video games as a way to teach machines strategy, memory, or other traits that are common among humans but do not come naturally to computers.
Last month, OpenAI revealed it used reinforcement learning to train AI to beat talented teams of humans playing Dota 2.
Insights that can be drawn from multi-agent environments can be used to inform human-machine interaction and train AI systems to complement one another or work together.
SRI International, for example, as part of DARPA’s Lifelong-Learning Machines research program, is training AI systems to play the real-time strategy game StarCraft: Remastered in order to teach them to take collective action and act in swarms or travel in group formations as units do in the game.
DeepMind has also found much value in StarCraft. In August 2017, DeepMind announced the release of the StarCraft II API for reinforcement learning as part of a partnership with Blizzard.