Machines That Play (Checkers)

series has been broken into 7 parts: This is Part 6 of the series. Machines That Play This series covers the history of Artificial Intelligence and games (until Deep Blue) and focuses on machines that played chess, checkers, and backgammon. The following topics are covered: how to build chess machines, Shannon’s work on chess, Turing’s work on chess, The Turk, El Ajedrecista, MANIAC, Bernstein chess program, Samuel’s checkers, Mac Hack VI, Cray Blitz, BKG, HiTech, Chinook, Deep Thought, TD-Gammon, and Deep Blue. Part 1: — this one Machines That Play (Overview) Part 2: Machines That Play (Building Chess Machines) Part 3: Machines That Play (Chess-Before Deep Blue) Part 4: Machines That Play (Deep Blue) Part 5: Machines That Play (Post Deep Blue) If you want a summary of the first 5 parts, focusing on the human elements, go here . Part 6: — this one Machines That Play (Checkers) Part 7: Machines That Play (Backgammon) Part 6: Machines That Play (Checkers) This part will cover Samuel’s checkers (1950s) and Jonathan Schaeffer’s Chinook (1990s). Samuel’s checkers was the first program to (before AI was coined). Chinook was the first computer program to win the world champion title against humans. Schaeffer also solved Checkers in 2007 — this is coming soon. learn Game complexity Before we begin, let’s look at some ways to measure . game complexity The of a game is the number of legal game positions reachable from the initial position of the game. state-space complexity The is the total number of possible games that can be played: the number of leaf nodes in the rooted at the game’s initial position. game tree size game tree The is the number of children at each node. For example, chess, suppose that a “node” is considered to be a legal position, then the average branching factor is estimated to be about 35. This means that, on average, a player has about 35 legal moves available at each turn. By comparison, the average branching factor for the game Go is 250! branching factor status: it is not possible to perform better (some of these entries were solved by humans) Optimal : performs better than all humans Super-human Minimax can be traced to a 1913 paper by , the father of modern set theory. The paper contained several errors and did not describe minimax correctly. He did, however, propose (but did not prove) what became known as : in any finite deterministic (i.e. in which chance does not affect the decision making process) perfect information two-player game (such as chess) there are three possibilities: either the first-player can force a win, or the second-player can force a win, or both players can force a draw. When applied to chess, Zermelo’s Theorem states “either White can force a win, or Black can force a win, or both sides can force at least a draw”. . (Checkers, on the other hand, : either player can force a draw.) Ernst Zermelo Zermelo’s Theorem We don’t (yet) know which it is for chess has been solved Improving the minimax procedure Von Neumann improved Zermelo’s minimax theorem to include games involving imperfect information and games with more than two players. In 1944, John Von Neumann and published . This is the groundbreaking book that created the research field of game theory. Originally, game theory addressed zero-sum games, but since then it has been applied to a wide range of behavioral relations, and has become a framework for modeling scenarios in which conflicts of interest exist among the participants. Oskar Morgenstern Theory of Games and Economic Behavior Each Player has complete knowledge of the game state. For two players games, they take alternate turns. Examples: Chess, Checkers, Connect-Four, Go, Othello What does it mean to say a game is perfect? Some of the game state is hidden. Examples: Poker, Bridge What does it mean to say a game is imperfect? Game involves an element of chance. Example: Backgammon (involves dice rolls) What does it mean to say a game is not deterministic? Now, let’s talk about the machines. AI before AI (1952–1962) Samuel’s checkers Game: Checkers IBM 701 — the first major commercial computer In 1949, Arthur Samuel joined IBM’s Poughkeepsie lab and in 1952 he programmed IBM’s first major computer — IBM 701 — to play checkers. It was the first game-playing program to run on a computer — it was the first checkers program On February 24, 1956, Arthur Samuel demonstrated the program and played a game of checkers on television.Before the demonstration, Thomas J. Watson, Sr., the founder and president of IBM, predicted that the demonstration would make a strong impression — and . It did! it would raise the price of IBM stock by 15 points Arthur Samuel playing checkers on IBM 701 [Side note. See titled The Computer Pioneers: The Development of the IBM 701: “ ”] videos The Computer Pioneers is a video oral history project produced by Richard Jay Solomon. This segment of the series discusses the development of the IBM 701 model computer, also known as the Defense Calculator, in the early 1950s. These interviews were conducted on July 12th, 1983 and feature several members of the IBM 701’s development team including Jerrier Haddad, Clarence Frizzell, Nathan Rochester, and Richard Whalen. Samuel’s checkers: The first machine to learn By 1955, Samuel had done something groundbreaking; he had created a program that could — something that no one had done before — and this was demonstrated on television in 1956. learn Samuel had been thinking about machine learning since he joined IBM and wanted to focus on building programs that could learn to play the game of checkers. In 1959 he coined the term machine learning as the “ ”. He later , field of study that gives computers the ability to learn without being explicitly programmed said “I became so intrigued with this general problem of writing a program that would appear to exhibit intelligence that it was to occupy my thoughts during almost every free moment for the entire duration of my employment by IBM and indeed for some years beyond.” He published a seminal paper in 1959, titled , where he talked about how a machine could look ahead “ ”. The computer started out losing to Samuel and eventually beat Samuel. Some Studies in Machine Learning Using the Game of Checkers by evaluating the resulting board positions much as a human player might do He had programmed the “ ” digital computer to behave in a way which, if done by human beings or animals, would be described as involving the process of learning. In fact, the program had beaten its creator and it had done that by in a relatively short amount of time — “ ” learning a computer can be programmed so that it will learn to play a better game of checkers that can be played by the person who wrote the program. Furthermore, it can learn to do this in a remarkably short period of time (8 or 10 hours of machine-playing time). Imagine this: 8AM: Your computer program isn’t any good at checkers. You teach it how to play. 9AM: You leave it alone and let it play by itself the entire day. It plays against itself and learns from its mistakes. 6PM: You come back and play it. It beats you. And that was in the 1950s — with slow computers of that time. In some sense, this is the earliest glimpse of what lies ahead in games: no matter what the rate of improvement is for humans, once machines learn to learn, it will be hard to keep up with machines, whose progress will end up being measured exponentially. And ours won’t. No matter what the rate of improvement is for humans, once machines learn to learn, it will be hard to keep up with machines, whose progress will end up being measured exponentially. And ours won’t. Insanely Immense Search Trees (solution: alpha-beta pruning) We know (from from Shannon’s work described in that the main driver of the machine that played games was a search tree of the set of all possible board positions reachable from the current state. Machines That Play (Chess) Checkers is extremely complex — it has roughly 500 billion billion possible positions ( ). About 10²⁰ possible board positions compared to Chess (10⁴⁷) or Go (10²⁵⁰)). Even though checkers is simpler than those games, it is still complex enough that a brute force only approach is impractical. 5 x 10²⁰ In his paper, Samuel said that “ “at our present stage of knowledge, the only practical approach, even with the help of the digital computer, will be through the development of heuristics which tend to ape human behavior. Samuel’s program was based on Shannon’s minimax strategy to find the best move from a given current position. Recall that in two-player games, the minimax algorithm determines the best move — minimax is a recursive algorithm and it chooses an optimal move based on the assumption that both players play optimally. Simply put, it’s trying to model what we do when we play: we think “ ” If I make this move, then my opponent can only make these moves, then I will make this move,… From Samuel’s paper explaining minimax The problem with a full minimax search algorithm is that it explores all parts of the tree, including the parts of the tree it doesn’t need to. In other words, given a board position, human experts tend to “know” that some moves are irrelevant and some moves are good. They don’t explore all moves, they prune out the irrelevant or bad ones up front. (full search): Circles represent the moves of the maximizing player, and squares represent the moves of the opponent (minimizing player). The values inside the circles and squares represent the value α of the minimax algorithm. The red arrows represent the chosen move, the numbers on the left represent the tree depth and the blue arrow the chosen move. Minimax example The rules of thumb that human experts use to prune out bad moves are called “heuristics”. In games, heuristics are implemented to cut down on computation time. An (easy) example of a heuristic in chess is that if a move leads to the player’s king being in checkmate, then the algorithm should not look further down that path because a player will never want to make that move. There is no point in exploring that branch — it is a waste of precious computational resources, but simple minimax search will still explore that path. g will not explore that path. Alpha-beta prunin To attack the complexity problem, Samuel implemented alpha-beta pruning — instead of searching each path to the game’s end, Samuel developed a scoring function based on the position of the board at any given time. is an optimization of minimax. At a given node n, if a player has a better choice a (at a node further up), then n will not be reached. By looking at some successors of n, we can know enough about n to prune it out. The idea is that the algorithm will maintain two values, alpha and beta, where alpha represents the minimum score that the maximizing player is guaranteed and beta represents the maximum score that the minimizing player is guaranteed. Both players start with their worst possible score: alpha is negative infinity and beta is positive infinity. So when to prune? Well, each time beta (the maximum score that the minimizing player is guaranteed) becomes less than or equal to alpha (the minimum score that the maximizing player is guaranteed) (or when alpha becomes greater than or equal to beta), the maximizing player can prune those nodes. The chart below explains this: Alpha-beta pruning example: There is no need need to explore any of the paths whose edges are crossed-out, since other moves (that will perform better) have been found Alpha-Beta pruning The alpha-beta algorithm produces the same result as the minimax algorithm, but is more efficient because entire branches can be eliminated from the search process. This allows the program to evaluate the minimax search tree much deeper, while using the same resources. This combination of the minimax algorithm and alpha-beta pruning significantly reduced the total number of branches of the tree, which made it feasible to play games on a computer. Here’s the (commonly referenced) basic idea: “If you have an idea that is surely bad, don’t take the time to see how truly awful it is.” — Pat Winston Introducing Machine Learning In addition to minimax and alpha-beta pruning, Samuel used a fundamentally new and important component: . He used two main learning methods to create the first competent AI program: a) and b) learning by generalization. learning rote learning Rote learning meant the program would save all of the board positions encountered during play, along with their scores. Then when those moves need to be evaluated further down the tree, they could simply be referenced instead of evaluated, thus saving some computing time. Even though this wasn’t an advanced form of learning, the idea is for the program to use the saved time to compute further in depth. And rote learning produced slow but continuous improvements, which were most effective for openings and endgames. From Samuel’s paper explaining rote-learning Learning by generalization meant modifying the evaluation function based on previous games played by the program. This was groundbreaking because it showed that a a program could learn by playing against previous versions of itself (which would be a key aspect of AlphaGo). This learning method came closest to the technique later dubbed temporal-difference learning ( ) — that the value of a state should equal the value of likely following states. (Side note: Samuel’s method was conceptually the same as that used much later by Tesauro in TD-Gammon.) The version using generalized learning was able to develop a good middle game but remained weak in opening and endgame play. Later in the series, we will see how some of the most successful programs would attack these issues. reinforcement learning TD-Lambda Samuel did his work despite poor computation power in that period, while working essentially alone and doing his own programming. The principles of machine learning verified by the program’s “ ” Today there is no field that is untouched by machine learning. experiments are, of course, applicable to many other situations. Successive enhancements made by Samuel allowed the computer to improve and achieve good skill — it eventually beat Samuel, but it still could not beat the experts consistently. But it did defeat an expert player however, there is a controversy about whether Nealy lost the game when a draw was available (i.e. he played poorly) or whether Samuel’s program won (i.e. it played well). Robert Nealy, The next year a six match rematch was won by Nealy 5–1. Samuel’s Checkers Program and Robert Nealy Samuel’s checkers demonstrated that a computer could be programmed to learn to play a game better than its creator. It could improve its performance doing something that humans could not — by playing thousands of games against itself to practice. This was a monumental achievement for his day. But Samuel himself had no illusions about the strength of his AI program. Even though it is a milestone in AI, it was blown out of proportion by the media. Some , “ .” said This work led to the false impression that checkers was a “solved” game. As a result, researchers moved on to chess, and largely ignored checkers for over 25 years Samuel’s work, while groundbreaking on its own, may have restricted research into Checkers until 1989 when began working on . Jonathan Schaeffer Chinook In 1990, Schaeffer reached out to Samuel to tell him about his program’s massive improvements, but Samuel had just died. It was the end of one of the founders, one who gave us the very first glimpse of machine intelligence. Miles ahead of the strongest human players Chinook (1989–1996) Game: Checkers Chinook meets human — Take 1 Is he really human or is he superhuman? The year was 1991. had agreed to a friendly checkers match against . Tinsley had been the world’s best checkers player for 40 years. Regarding Tinsley it was said that he was “ ” Marion Tinsley Chinook to checkers what Leonardo da Vinci was to science, what Michelangelo was to art and what Beethoven was to music. After Samuel’s work on checkers, there was a false impression that . As a result, researchers moved on to chess and mostly ignored checkers until began working on Chinook in 1989. Schaeffer’s was to develop a program capable of defeating the checkers player. checkers was a “solved” game Jonathan Schaeffer goal best Back to 1991, Chinook was playing the best player ever, Tinsley. The first nine games they played were all draws. In the tenth game, Chinook made a move that it thought was slightly advantageous. Seeing this Tinsley remarked, “ ” In an Schaeffer said. “ ” It turned out that Tinsley began to pull ahead and Chinook resigned. Schaeffer said (about Tinsley), “ ” You’re going to regret that. interview And at the time, I was thinking, what the heck does he know, what could possibly go wrong? In his notes to the game, he later wrote that he had seen all the way to the end of the game and he knew he was going to win. After the game when Schaeffer looked back into the database, he discovered that from that move to the end of the game, if both sides played perfectly, he would lose every time. But to see that, a computer or a human would have to look 64 moves ahead. Tinsley seemed to have picked the only strategy that could have defeated Chinook from that point — Tinsley was able to see the win 64 moves ahead. “ ” Schaeffer . I was absolutely stunned, said “ ” How do you compete with somebody whose understanding of the game is so deep that he immediately knows through experience or knowledge or doing some amazing search that he was gonna win that position? Chinook meets human — Take 2 It’s only a matter of time now Chinook and Marion Tinsley met again in 1992 at the Man-Machine World Championship. It was the first time in history a human was playing a computer for a world championship title. (August 1992). From left to right: Duane Szafron, Joe Culberson, Paul Lu, Brent Knight, Jonathan Schaeffer, Rob Lake, and Steve Sutphen. Our checkers expert, Norman Treloar, is missing Chinook team Because of Tinsley’s greatness, most players would play cautiously against Tinsley, hoping for a draw. Chinook, however, played a very different game. Regarding how Chinook played, , “ ’” Schaeffer said It played brash, aggressive moves — it walked on the edge of a precipice…It would do things people looked at and said, ‘Man, is that program crazy? Even though Chinook lost against Tinsley, it came close to defeating Tinsley. It played a stunning eighth game and won; it was Tinsley’s sixth loss in 40 years. According to , Schaffer felt sad about Tinsley’s loss and later wrote in this book: The Atlantic “ ” We’re still members of the human race and Chinook defeating Tinsley in a single game means that it will only be a matter of time before computers will be supreme in checkers, and eventually in other games like chess. But it wasn’t just single game. Chinook won again in game 16. No living player had defeated Tinsley more than once. Tinsley had lost only seven games in his entire 45 year career, two of them to Chinook. one Finally, a machine was becoming perfect. more But Chinook had some kind of an error, which forced Schaffer to resign the game. Schaeffer was devastated. Chinook meets human — Take 3 Losing his divine backing In 1994, Chinook, after having beaten another player, was ready to face Tinsley again. , “ Scientific American reported The night before the match, …Tinsley dreamt that God spoke to him and said, “I like Jonathan, too,” which had led him to believe that he might have lost exclusive divine backing. ” Chinook and since 1992, Schaeffer’s team had spent thousands of hours improving Chinook. hadn’t lost a game in its last 125 games Leading up to the match, Tinsley’s stomach had been hurting. It hurt again the day of the match. After six games, all of which were draws, he needed to see a doctor so Schaeffer took him to the hospital. The next day Tinsley was told there was a lump on his pancreas — he had pancreatic cancer. He withdrew. , rated the number two player in the world at the time, replaced Tinsley and played Chinook. Chinook won and became the first computer program in history to win a human world championship. Seven months later, Tinsley died. Don Lafferty Schaeffer’s program never got to beat the checkers player ever — and this was why he started it all in 1989. The best human player never truly lost a match to Chinook. By the late 1980s checkers programs had become more advanced. Chinook played its last tournament in 1996, where Chinook finished “ .” best miles ahead of the strongest human players in the U.S. Championship But that wasn’t enough for Schaeffer. He needed to make sure his program could beat the best player ever. Humans are only almost perfect. “ But my computer program is perfect.” From 1997 to 2001, Schaeffer suspended Chinook and began working on solving the game of checkers, which meant his program would always know the right move. It would be perfect. In an interview with , Schaeffer said, “ Scientific American From the end of the Tinsley saga in ’94–’95 until 2007, I worked obsessively on building a perfect checkers program. The reason was simple: I wanted to get rid of the ghost of Marion Tinsley. People said to me, ‘You could never have beaten Tinsley because he was perfect.’ “Well, yes, we would have beaten Tinsley because he was only almost perfect. But my computer program is perfect. ” In 2007 Schaeffer and his team published a in the journal Science titled : the program could no longer be defeated by anyone, human or otherwise. paper Checkers Is Solved Schaeffer said. .” “Eighteen years later, I’m finally done Solving Checkers (2007) Game: checkers In 2007, the makers of Chinook published a paper in the journal Science announcing that Chinook had completely solved Checkers: the program could no longer be defeated by any, human or otherwise. Jonathan Schaeffer and his team had been working to solve the checkers problem since 1989. The article stated, “ ” Checkers is the largest game that has been solved to date, with a search space of 5×10²⁰. “The number of calculations involved was , which were done over a period of 18 years. The process involved from 200 desktop computers at its peak down to around 50”. …checkers is now solved: Perfect play by both sides leads to a draw. This is the most challenging popular game to be solved to date, roughly one million times as complex as Connect Four. 10¹⁴ Coming Soon