10 Facts About MuZero

1.

MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules.

2.

MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases.

3.

MuZero builds its own model of each game from experience and learns to plan with it, rather than being handed the rules.

4.

MuZero is a combination of the high-performance planning of the AlphaZero algorithm with approaches to model-free reinforcement learning.

5.

MuZero was derived directly from the AlphaZero (AZ) code, sharing its rules for setting hyperparameters.

6.

MuZero surpassed both the mean and median performance of R2D2 across the suite of Atari games, though it did not do better in every game.

7.

For board games, MuZero used 16 third-generation tensor processing units (TPUs) for training and 1000 TPUs for self-play, with 800 simulations per step. For Atari games, it used 8 TPUs for training and 32 TPUs for self-play, with 50 simulations per step.

8.

MuZero matched AlphaZero's performance in chess and shogi after roughly 1 million training steps.

9.

MuZero was viewed as a significant advancement over AlphaZero, and a generalizable step forward in unsupervised learning techniques.

10.

MuZero has been used as a reference implementation in other work, for instance as a way to generate model-based behavior.

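The compute figures quoted in fact 7 are easier to compare side by side. The short Python sketch below collects them and does the arithmetic. The class and field names (ComputeSetup, training_tpus, selfplay_tpus, simulations_per_step) are illustrative assumptions, not names from DeepMind's code; only the numbers come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeSetup:
    """Hypothetical container for the hardware figures quoted in fact 7."""

    domain: str
    training_tpus: int
    selfplay_tpus: int
    simulations_per_step: int

    @property
    def total_tpus(self) -> int:
        # Training and self-play TPUs added together.
        return self.training_tpus + self.selfplay_tpus


# Numbers taken from fact 7; everything else is an illustrative assumption.
BOARD_GAMES = ComputeSetup("board games", 16, 1000, 800)
ATARI = ComputeSetup("Atari", 8, 32, 50)


def summarize(setup: ComputeSetup) -> str:
    """Return a one-line, human-readable summary of a setup."""
    return (
        f"{setup.domain}: {setup.total_tpus} TPUs in total "
        f"({setup.training_tpus} training, {setup.selfplay_tpus} self-play), "
        f"{setup.simulations_per_step} simulations per step"
    )


if __name__ == "__main__":
    for setup in (BOARD_GAMES, ATARI):
        print(summarize(setup))
```

Under these figures, the board-game setup used 1016 TPUs in total against 40 for Atari, and 16 times as many simulations per step.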