MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules.
MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases.
MuZero discovers for itself how to build a model of its environment and understand it purely from first principles.
MuZero combines the high-performance planning of the AlphaZero algorithm with approaches from model-free reinforcement learning, planning over a learned model rather than the true environment dynamics.
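The idea of planning with a learned model can be sketched with MuZero's three learned functions: a representation function h, a dynamics function g, and a prediction function f. The toy implementations below are illustrative assumptions only (the real functions are deep neural networks); only the roles of the three functions follow the published description.

```python
def representation(observation):
    # h: encode a raw observation into an internal hidden state.
    # Toy version: the hidden state is just the observation itself.
    return tuple(observation)

def dynamics(hidden_state, action):
    # g: predict the next hidden state and immediate reward for an action,
    # without ever consulting the real environment's rules.
    next_state = tuple(s + action for s in hidden_state)
    reward = float(action)  # placeholder reward model
    return next_state, reward

def prediction(hidden_state):
    # f: predict a policy (action probabilities) and a value estimate
    # for a hidden state.  Fixed toy policy over two actions.
    policy = {0: 0.4, 1: 0.6}
    value = float(sum(hidden_state))
    return policy, value

def plan(observation, depth=3):
    # Unroll the learned model a few steps entirely "in imagination",
    # accumulating predicted rewards -- the essence of MuZero-style
    # planning (the real algorithm uses Monte Carlo tree search).
    state = representation(observation)
    total_reward = 0.0
    for _ in range(depth):
        policy, _value = prediction(state)
        action = max(policy, key=policy.get)  # greedy w.r.t. toy policy
        state, reward = dynamics(state, action)
        total_reward += reward
    return total_reward
```

Because planning happens in hidden-state space, the agent never needs the game's rules: it only needs h, g, and f to agree with reality well enough for search to be useful.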
MuZero was derived directly from the AlphaZero (AZ) code, sharing its rules for setting hyperparameters.
MuZero surpassed both R2D2's mean and median performance across the suite of games, though it did not do better in every game.
For board games, MuZero used 16 third-generation tensor processing units (TPUs) for training and 1,000 TPUs for self-play, with 800 simulations per step; for Atari games, it used 8 TPUs for training and 32 TPUs for self-play, with 50 simulations per step.
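The compute budgets reported above can be summarized as follows; the numbers come from the text, while the dictionary layout itself is just an illustrative way to organize them.

```python
# Reported MuZero hardware and search budgets, per domain.
MUZERO_COMPUTE = {
    "board_games": {
        "training_tpus": 16,        # third-generation TPUs for training
        "selfplay_tpus": 1000,      # TPUs generating self-play games
        "simulations_per_step": 800,
    },
    "atari": {
        "training_tpus": 8,
        "selfplay_tpus": 32,
        "simulations_per_step": 50,  # far shallower search than board games
    },
}
```

The contrast is the interesting part: Atari used a 16x smaller search budget per step than the board games, yet the same algorithm worked in both domains.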
MuZero matched AlphaZero's performance in chess and shogi after roughly one million training steps.
MuZero was viewed as a significant advancement over AlphaZero, and a generalizable step forward in unsupervised learning techniques.
MuZero has been used as a reference implementation in other work, for instance as a way to generate model-based behavior.