
TL;DR: AlphaStar is the first AI to reach the top league of a widely popular esport without any game restrictions. This January, a preliminary version of AlphaStar challenged two of the world's top players in StarCraft II, one of the most enduring and popular real-time strategy video games of all time.

Since then, we have taken on a much greater challenge: playing the full game at a Grandmaster level under professionally approved conditions.

We chose to use general-purpose machine learning techniques (including neural networks, self-play via reinforcement learning, multi-agent learning, and imitation learning) to learn directly from game data. Using the advances described in our Nature paper, AlphaStar was ranked above 99.8% of active players and achieved Grandmaster level for all three StarCraft II races: Protoss, Terran, and Zerg. We expect these methods could be applied to many other domains.

Learning-based systems and self-play are elegant research concepts which have facilitated remarkable advances in artificial intelligence. In 1992, researchers at IBM developed TD-Gammon, combining a learning-based system with a neural network to play the game of backgammon. Instead of playing according to hard-coded rules or heuristics, TD-Gammon was designed to use reinforcement learning to figure out, through trial and error, how to play the game in a way that maximises its probability of winning. Its developers used the notion of self-play to make the system more robust: by playing against versions of itself, the system grew increasingly proficient at the game. When combined, the notions of learning-based systems and self-play provide a powerful paradigm of open-ended learning. Many advances since then have demonstrated that these approaches can be scaled to progressively more challenging domains.
