DeepMind’s MuZero conquers, and learns the rules as it goes

Credit: Unsplash / CC0 Public Domain

Albert Einstein once said, “You need to learn the rules of the game and then play better than anyone.” That could well be DeepMind’s motto, as a new report reveals that it has developed a program that can master complex games, even without knowing the rules.

DeepMind, a subsidiary of Alphabet, has already made groundbreaking advances using reinforcement learning to teach programs to master the Chinese board game Go and the Japanese strategy game Shogi, as well as chess and challenging Atari video games. In all of these cases, the computers were given the rules of the game.

But Nature reported today that DeepMind’s MuZero has accomplished the same feats, and in some cases surpassed the earlier programs, without being given the rules first.

DeepMind’s programmers relied on a principle called “look-ahead search.” With this approach, MuZero assesses a series of potential moves based on how an opponent would respond. While there is likely to be a staggering number of potential moves in complex games like chess, MuZero prioritizes the most relevant and likely maneuvers, learning from successful maneuvers and avoiding failed ones.
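
The idea of judging a move by how an opponent would respond can be illustrated with a minimal sketch. This is not MuZero's method, which the paper describes as a tree search guided by a learned model; it is a plain depth-limited negamax search on an assumed toy game in which players alternately remove one to three stones and whoever takes the last stone wins. The game, the function names and the `depth` parameter are all hypothetical, and the sketch leaves out the prioritization of promising moves.

```python
from functools import lru_cache

MAX_TAKE = 3  # assumed toy rule: a player may remove 1 to 3 stones per turn


def legal_moves(stones):
    """Moves in the toy game: take 1..MAX_TAKE stones, never more than remain."""
    return list(range(1, min(MAX_TAKE, stones) + 1))


@lru_cache(maxsize=None)
def look_ahead(stones, depth):
    """Score a position for the player about to move.

    Returns +1 for a forced win, -1 for a forced loss, and 0 when the
    search runs out of depth before the game is decided.
    """
    if stones == 0:
        return -1  # the previous player took the last stone, so the mover has lost
    if depth == 0:
        return 0  # search horizon reached: outcome unknown
    # A position that is good for the opponent is equally bad for us (negamax).
    return max(-look_ahead(stones - take, depth - 1) for take in legal_moves(stones))


def best_move(stones, depth=10):
    """Pick the move that leaves the opponent in the worst position."""
    return max(legal_moves(stones),
               key=lambda take: -look_ahead(stones - take, depth - 1))
```

Called as `best_move(7)`, the search looks through the opponent's possible replies to each candidate move and settles on the one that leaves the opponent with no winning answer.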

Playing Atari’s Pac-Man, MuZero restricted itself to considering only six or seven potential future moves, yet it still performed admirably, according to the researchers.

“For the first time, we really have a system that is able to build its own understanding of how the world works and use that understanding to do the kind of sophisticated planning that you’ve seen previously for games like chess,” said the lead research scientist at DeepMind, David Silver. MuZero can “start from scratch, and just by trial and error, discover the rules of the world and use those rules to achieve a kind of superhuman performance.”
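
What Silver describes, discovering the rules by trial and error and then planning with them, can be sketched in miniature. The example below is an illustration under strong assumptions, not DeepMind's implementation: MuZero learns its model with neural networks, whereas this sketch records outcomes in a lookup table. The corridor environment, `env_step`, `learn_model` and `plan` are all hypothetical names, and the sketch assumes the agent may try any action from any state it has already discovered.

```python
from collections import deque

GOAL = 4  # hidden from the agent; used only inside the environment


def env_step(state, action):
    """Black-box toy environment (assumed): a corridor of cells 0..4.

    The agent never reads this code; it only observes the results.
    """
    next_state = min(max(state + action, 0), GOAL)
    reward = 1 if next_state == GOAL else 0
    return next_state, reward


def learn_model(start, actions):
    """Try every action in every state discovered so far and record what happens."""
    model = {}  # (state, action) -> (next_state, reward), filled in by experience
    frontier = deque([start])
    seen = {start}
    while frontier:
        state = frontier.popleft()
        for action in actions:
            next_state, reward = env_step(state, action)
            model[(state, action)] = (next_state, reward)
            if next_state not in seen:
                seen.add(next_state)
                frontier.append(next_state)
    return model


def plan(model, start, actions):
    """Search the learned model, not the environment, for the shortest rewarded path."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        for action in actions:
            if (state, action) not in model:
                continue  # the agent has no experience of this transition
            next_state, reward = model[(state, action)]
            if reward > 0:
                return path + [action]
            if next_state not in seen:
                seen.add(next_state)
                frontier.append((next_state, path + [action]))
    return None  # no rewarded path is known to the model
```

The point of the split is that `plan` never calls `env_step`: once `learn_model(0, (-1, 1))` has recorded what each action does, the planning step works entirely from the agent's own record of how its small world behaves.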

Silver envisions greater applications for MuZero than mere games. Progress has already been made on video compression, a challenging task given the large number of video formats and compression modes. So far, the researchers have achieved a 5% improvement in compression, a major achievement for a sister company of Google, which handles the giant cache of videos on the world’s second most popular website, YouTube, where a billion hours of content are watched daily. (The No. 1 site? Google.)

Silver says the lab is also looking into robotics programming and protein architecture design, which holds promise for personalized drug production.

It is a “significant step forward,” according to Wendy Hall, a professor of computer science at the University of Southampton and a member of the U.K.’s AI Council. “The results of DeepMind’s work are amazing and I am amazed at what they will be able to achieve in the future, given the resources they have,” she said.

But she also raised concerns about the potential for abuse. “My concern is that, while they constantly strive to improve the performance of their algorithms and apply the results for the benefit of society, DeepMind teams are not putting as much effort into thinking about the possible unintended consequences of their work,” she said.

In fact, the United States Air Force took advantage of the first research papers covering MuZero that were released last year and used the information to design an AI system that could launch missiles from a U-2 spy plane against specific targets.

When asked by Wired what he thought of such military applications, Silver left no doubt about his concerns.

“I oppose the use of AI in any deadly weapon and wish we had made more progress towards banning lethal autonomous weapons,” he said. He added that DeepMind and its co-founders signed the Lethal Autonomous Weapons Pledge, which affirms the belief that deadly technology should always remain under human control rather than that of AI-based algorithms.

Silver says the challenges ahead are to understand and implement algorithms as effective and powerful as the human brain. “We must aim to achieve this. The first step in making this journey is to try to understand what it means to achieve intelligence,” he said. “We think it really matters to enrich what AI can really do because the world is a messy place. It’s unknown – no one has given us this incredible rule book that says, ‘Oh, this is exactly how the world works,’” Silver said. “If we want our AI to go out into the world and be able to plan and look forward to problems where no one gives us the rule book, we really need that.”


More information:
Julian Schrittwieser et al. Mastering Atari, Go, chess and shogi by planning with a learned model, Nature (2020). DOI: 10.1038/s41586-020-03051-4

© 2020 Science X Network

Citation: DeepMind’s MuZero conquers, and learns the rules as it goes (2020, December 24) retrieved 25 December 2020 from https://techxplore.com/news/2020-12-deepmind-muzero-conquers.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
