Foreword
For us, the members of the AlphaGo team, the AlphaGo story was the adventure of a lifetime. It began, as many great adventures do, with a small step: training a simple convolutional neural network on records of Go games played by strong human players. This led to pivotal breakthroughs in the recent development of machine learning, as well as a series of unforgettable events, including matches against the formidable Go professionals Fan Hui, Lee Sedol, and Ke Jie. We're proud to see the lasting impact of these matches on the way Go is played around the world, as well as their role in making more people aware of, and interested in, the field of artificial intelligence.
But why, you might ask, should we care about games? Just as children use games to learn about aspects of the real world, so researchers in machine learning use them to train artificial software agents. In this vein, the AlphaGo project is part of DeepMind's strategy to use games as simulated microcosms of the real world. This helps us study artificial intelligence and train learning agents, with the goal of one day building general-purpose learning systems capable of solving the world's most complex problems.
AlphaGo works in a way that is similar to the two modes of thinking that Nobel laureate Daniel Kahneman describes in his book on human cognition, Thinking, Fast and Slow. In the case of AlphaGo, the slow mode of thinking is carried out by a planning algorithm called Monte Carlo Tree Search, which plans from a given position by expanding the game tree that represents possible future moves and countermoves. But with roughly 10^170 (a 1 followed by 170 zeros) possible Go positions, searching through every sequence of a game proves impossible. To get around this and reduce the size of the search space, we paired Monte Carlo Tree Search with a deep learning component: two neural networks trained to estimate how likely each side is to win, and which moves are the most promising.
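To make this pairing concrete, here is a minimal sketch of that style of search on a toy counting game (players alternately add 1 or 2 to a running total; whoever reaches 10 wins). It is not AlphaGo's actual implementation: the game, the `policy_net` and `value_net` placeholder functions, and all constants are invented for illustration, standing in for the two trained networks.

```python
import math

WIN_TOTAL = 10  # toy game: first player to bring the total to 10 wins

def legal_moves(total):
    return [m for m in (1, 2) if total + m <= WIN_TOTAL]

def policy_net(total):
    # Stand-in for a trained policy network: uniform prior over legal moves.
    moves = legal_moves(total)
    return {m: 1.0 / len(moves) for m in moves}

def value_net(total):
    # Stand-in for a trained value network, scored from the perspective of
    # the player to move: if the opponent just reached WIN_TOTAL, the game
    # is lost (-1); otherwise guess the position is even (0).
    return -1.0 if total == WIN_TOTAL else 0.0

class Node:
    def __init__(self, total, prior):
        self.total = total      # game state: the running count
        self.prior = prior      # policy network's prior for this move
        self.visits = 0
        self.value_sum = 0.0    # accumulated value for the player to move here
        self.children = {}      # move -> Node

def select_child(node, c_puct=1.5):
    # PUCT-style selection: trade off the value estimate Q against a
    # prior-weighted exploration bonus U.
    def score(child):
        q = child.value_sum / child.visits if child.visits else 0.0
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return -q + u       # child's Q is from the opponent's perspective
    return max(node.children.values(), key=score)

def mcts(root_total, n_simulations=200):
    root = Node(root_total, prior=1.0)
    for _ in range(n_simulations):
        node, path = root, [root]
        # 1. Selection: walk down the tree along high-scoring children.
        while node.children:
            node = select_child(node)
            path.append(node)
        # 2. Expansion and evaluation: the networks replace random rollouts.
        if node.total < WIN_TOTAL:
            for move, p in policy_net(node.total).items():
                node.children[move] = Node(node.total + move, p)
        value = value_net(node.total)
        # 3. Backup: flip the sign at each level (zero-sum game).
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = -value
    # Play the most-visited move at the root.
    return max(root.children, key=lambda m: root.children[m].visits)
```

From a total of 8, the search concentrates its visits on adding 2, the immediately winning move, because the value network's loss signal at the terminal position is backed up through the tree.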
A later version, AlphaZero, uses reinforcement learning to play entirely against itself, eliminating the need for any human training data. It learned the game of Go (as well as chess and shogi) from scratch, often discovering (and later discarding) strategies that human players had developed over hundreds of years, and creating many unique strategies of its own along the way.
Over the course of this book, Max Pumperla and Kevin Ferguson take you on this fascinating journey from AlphaGo through to its later extensions. By the end, you will not only understand how to implement an AlphaGo-style Go engine, but you will also have a solid practical understanding of some of the most important building blocks of modern AI algorithms: Monte Carlo Tree Search, deep learning, and reinforcement learning. The authors have carefully tied these topics together, using the game of Go as an exciting and accessible running example. Along the way, you will also learn the basics of one of the most beautiful and challenging games ever invented.
Furthermore, the book empowers you from the beginning to build a working Go bot, one that develops over the course of the book from making entirely random moves into a sophisticated self-learning Go AI. The authors take you by the hand, providing both clear explanations of the underlying concepts and executable Python code. They don't hesitate to dive into the details of data formats, deployment, and cloud computing that you need to actually get your Go bot to work and play.
In summary, Deep Learning and the Game of Go is a highly readable and engaging introduction to modern artificial intelligence and machine learning. It succeeds in taking what has been described as one of the most exciting milestones in artificial intelligence and transforming it into an enjoyable first course in the subject. Any reader who follows this path will be equipped to understand and build modern AI systems, with possible applications in all those situations that require a combination of fast pattern matching and slow planning: the thinking, fast and slow, that basic cognition requires.
THORE GRAEPEL, RESEARCH SCIENTIST, DEEPMIND, ON BEHALF OF THE ALPHAGO TEAM AT DEEPMIND
Preface
When AlphaGo hit the news in early 2016, we were extremely excited about this groundbreaking advancement in computer Go. At the time, it was largely conjectured that human-level artificial intelligence for the game of Go was at least 10 years in the future. We followed the games meticulously and didn't shy away from waking up early or staying up late to watch them broadcast live. Indeed, we had good company: millions of people around the globe were captivated by the games against Fan Hui, Lee Sedol, and later Ke Jie and others.
Shortly after the emergence of AlphaGo, we started work on a little open source library we dubbed BetaGo (see http://github.com/maxpumperla/betago) to see if we could implement some of the core mechanisms running AlphaGo ourselves. The idea of BetaGo was to illustrate some of the techniques behind AlphaGo for interested developers. While we were realistic enough to accept that we didn't have the resources (time, computing power, or intelligence) to compete with DeepMind's incredible achievement, it was a lot of fun to create our own Go bot.