The Machines That Play series has been broken into 7 parts. This is Part 2 of the series.
This series covers the history of Artificial Intelligence and games (until Deep Blue) and focuses on machines that played chess, checkers, and backgammon. The following topics are covered: how to build chess machines, Shannon’s work on chess, Turing’s work on chess, The Turk, El Ajedrecista, MANIAC, Bernstein chess program, Samuel’s checkers, Mac Hack VI, Cray Blitz, BKG, HiTech, Chinook, Deep Thought, TD-Gammon, and Deep Blue.
Part 1: Machines That Play (Overview)
Part 2: Machines That Play (Building Chess Machines) — this one
Part 3: Machines That Play (Chess — Before Deep Blue)
Part 4: Machines That Play (Deep Blue)
Part 5: Machines That Play (Post Deep Blue)
If you want a summary of the first 5 parts, focusing on the human elements, go here.
Part 6: Machines That Play (Checkers)
Part 7: Machines That Play (Backgammon)
This part covers 1) why we wanted to build chess machines, 2) high-level strategies for building chess machines, and 3) progress in this field from the 1940s to the 1990s.
Why would anyone want to teach a machine how to follow some arbitrary man-made rules to move a bunch of wooden pieces on a checkerboard, with the sole aim of cornering one special wooden piece? It’s so human to attack and capture pawns, knights, bishops and the queen to finally corner the king into an inescapable position. Those are our actions, that is our goal and we balance strategy and tactics to suit it. Then why teach machines to play chess?
Chess is an old game. It is believed to have originated in Eastern India during the Gupta period (c. 280–550 CE). It reached Western Europe and Russia in the 9th century, and by the year 1000 it had spread throughout Europe. It became popular, and writings about chess theory (how to play chess) began to appear in the 15th century.
Throughout history, many people have said many different things about chess and its connection to our intelligence and our mind.
Chess has long been regarded as the game of intellect. And many people argued that a machine that could successfully play chess would prove that thinking can be modeled/understood or that machines can be built to think. And that is exactly why people wanted to build a machine that could follow some arbitrary rules of our game and become so good at it that it could one day beat us at it.
Norbert Wiener, in his 1948 book Cybernetics, asked “whether it is possible to construct a chess-playing machine, and whether this sort of ability represents an essential difference between the potentialities of the machine and the mind.”
It turned out that we would build a successful chess-playing machine in 1997 and it would beat the best of us, but it wouldn’t necessarily help us understand our mind — its way of playing would range from being extremely human to being almost alien and it would surely challenge our conception of what to call intelligence.
Chess is a two-player zero-sum game (i.e. if one player wins, then the other player loses) with perfect information (i.e. each player knows, at every point, all moves that have taken place).
How do we usually play this game? We do the following:
From this perspective, almost all chess computers must deal with these fundamental steps. And in doing that, a chess computer would have to address the following key problems:
Let’s look at each in detail:
Representing the “board”: A chess computer would need to represent the chessboard and the pieces that occupy squares on the board, i.e., it would need to represent a single position in data structures. There are several ways of doing this — see here.
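As an illustration of one common approach (not any particular program's data structure), here is a minimal "mailbox" representation: the board as an 8×8 grid of piece codes. The rank strings and the `piece_at` helper below are hypothetical, purely for illustration.

```python
# A minimal "mailbox" board representation: 8 rank strings of piece codes.
# Uppercase = White, lowercase = Black, '.' = empty square.
# This is an illustrative sketch, not any particular engine's design.

def starting_position():
    """Return the standard chess starting position as a list of 8 rank strings."""
    return [
        "rnbqkbnr",  # black back rank (rank 8)
        "pppppppp",
        "........",
        "........",
        "........",
        "........",
        "PPPPPPPP",
        "RNBQKBNR",  # white back rank (rank 1)
    ]

def piece_at(board, file, rank):
    """Look up the piece on a square, given file (1-8 = a-h) and rank (1-8)."""
    return board[8 - rank][file - 1]

board = starting_position()
print(piece_at(board, 5, 1))  # white king on e1 -> 'K'
```

Real engines often prefer denser encodings (for instance bitboards, where each piece type gets a 64-bit mask), since speed of move generation matters enormously.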
Before we talk about the next two steps, let’s understand why building a chess computer is hard. A chess computer would receive a given chess position as input and it would need to calculate the next best move. A lot of difficulty is here because of the complexity of chess.
In other words,
The problem space for chess can be represented by a game tree, where nodes represent board positions and edges represent legal moves. The complete game tree for a game is the tree starting at the initial position (start of the game), containing all possible moves from each position, and ending at terminal positions (end of the game: win/lose/draw). But chess has about 10^120 possible games (and about 10^40 legal positions)!!! So the best way to think about a game tree is as a theoretical construct — it cannot be realized in the real world, it’s just too massive. But a player (human or computer) still needs to find a good move. This means a chess computer still needs to search some tree which is not too massive, but still has enough (relevant) nodes that can be examined to allow it to determine a next “good” move.
Games, including chess, are difficult because the search trees become astronomically large.
Here are some ways to measure game complexity.
The state-space complexity of a game is the number of legal game positions reachable from the initial position of the game.
The game tree size is the total number of possible games that can be played: the number of leaf nodes in the game tree rooted at the game’s initial position.
The branching factor is the number of children at each node. For chess, if a “node” is considered to be a legal position, the average branching factor is estimated to be about 35. This means that, on average, a player has about 35 legal moves available at each turn. By comparison, the average branching factor for the game Go is 250!
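A quick back-of-envelope sketch shows why the branching factor matters so much. Assuming a constant average branching factor (an idealization — real positions vary), the number of leaf positions grows exponentially with search depth:

```python
# Back-of-envelope: leaf positions when searching d plies (half-moves) ahead,
# assuming a constant average branching factor (roughly 35 for chess, 250 for Go).
# This idealizes away variation between positions; it is only an illustration.

def leaf_count(branching_factor, plies):
    return branching_factor ** plies

for plies in (2, 4, 6, 8):
    print(f"chess, {plies} plies: ~{leaf_count(35, plies):,} positions")
```

Even at 8 plies (4 full moves by each side), chess already has roughly 2.25 trillion leaf positions, which is why pruning and fast hardware both became central to computer chess.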
A quick look at game complexity of some commonly known games as well as their status:
Optimal: it is not possible to perform better (some of these games were solved by humans)
Super-human: performs better than all humans
Complexity of some games
What, then, does a chess computer have to do in order to beat the world champion?
Good game programs, therefore, must: 1) prune irrelevant branches of the game tree, 2) use good evaluation functions, and 3) look ahead as many moves as possible (and as fast as possible).
In particular, chess computers would need to do the following well:
Generating all legal next states: A chess computer would need to know how to identify possible moves and select the most promising ones for further analysis (i.e. prune out bad moves and keep good ones). Since it is impractical (actually impossible!) to consider every possible continuation, a computer must make choices through its search techniques — algorithms for selecting the best move. This is where many of the earlier artificial intelligence innovations took place. For example, a chess computer might look 2, 5, 10, or 20 moves ahead. The depth of the tree it can generate depends on the speed of the computer — the faster the computer generates moves, the better the performance. Hence the emphasis on computational power.
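The classic search technique here is minimax with alpha-beta pruning: explore the tree assuming both sides play their best, and cut off branches the opponent would never allow. Below is a generic sketch; `moves`, `apply`, and `evaluate` are hypothetical stand-ins for a real engine's move generator and evaluation function, not any specific program's API.

```python
# Sketch of minimax search with alpha-beta pruning over an abstract game.
# `moves(state)`, `apply(state, move)`, and `evaluate(state)` are hypothetical
# placeholders for a real engine's move generation and evaluation.

def alphabeta(state, depth, alpha, beta, maximizing, moves, apply, evaluate):
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)           # leaf: score the position
    if maximizing:
        best = float("-inf")
        for m in legal:
            best = max(best, alphabeta(apply(state, m), depth - 1,
                                       alpha, beta, False,
                                       moves, apply, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:            # opponent would avoid this line: prune
                break
        return best
    else:
        best = float("inf")
        for m in legal:
            best = min(best, alphabeta(apply(state, m), depth - 1,
                                       alpha, beta, True,
                                       moves, apply, evaluate))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best
```

With good move ordering, alpha-beta lets a program search roughly twice as deep as plain minimax in the same time, which is why it became the backbone of nearly every strong chess program.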
Evaluating a position: Once a chess computer generates the tree, it needs to evaluate the positions. When the search space is too large, the game tree can be created only to a certain depth, i.e. a computer cannot look ahead to all possible end (win/lose/draw) positions. It must, instead, look ahead a few plies (half-moves) and compare the resulting positions, i.e. it must correctly estimate the value of a board position given that no further search will be done from it. This is done via the evaluation function, which assigns real-number scores to board positions. This evaluation function (and the search algorithms) often differ vastly between chess programs. It is extremely hard to tell how good a position is. For example, Deep Blue examined 200 million positions per second and used very sophisticated evaluation — it had over 8000 features as part of its function.
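To make the idea concrete, here is a deliberately toy evaluation function that counts only material, using the conventional piece values. The board format (rank strings, uppercase for White) is a hypothetical convention for this sketch; real evaluation functions, as the Deep Blue example shows, weigh thousands of additional features such as mobility, king safety, and pawn structure.

```python
# A toy evaluation function: material count only, scored from White's side.
# Conventional piece values are assumed; real engines use far richer features.

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def material_eval(board_ranks):
    """Score a position given rank strings (uppercase = White,
    lowercase = Black, '.' = empty). Positive favors White."""
    score = 0
    for rank in board_ranks:
        for square in rank:
            if square.upper() in PIECE_VALUES:
                value = PIECE_VALUES[square.upper()]
                score += value if square.isupper() else -value
    return score

# White has a rook (5) and a pawn (1); Black has a knight (3): +3 for White.
print(material_eval(["R...P...", "....n..."]))  # -> 3
```

A function like this would be called at the leaves of the search tree, which is exactly where plain material counting fails badly (it cannot see a queen hanging one move later) — hence the decades of work on better evaluation.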
Many chess computers also added databases of pre-calculated position evaluations, drawn from hundreds (or thousands, or millions) of past games by top players. These databases were used mainly for the opening and endgame phases of the game.
There were two main philosophical approaches to developing chess computers: emulation vs. engineering — should computers emulate human knowledge and decision-making or should computers improve search via brute force? Those focusing on the first approach would build programs that had a lot of chess knowledge and a relatively smaller focus on search. Those focusing on the engineering approach would focus on computational power, by using special-purpose hardware, and search innovations. We’ll see that the best chess computers used the second approach, but even they ended up using a lot of chess knowledge and sophisticated evaluation heuristics.
Keep this framework in mind while reading through the remaining parts.
Optional: Below is a brief summary of progress in this direction. Key events. Key people. Key discoveries. Key machines. Details in other parts.
Computing power was limited in the 1950s, so machines could only play at a very basic level. In this period researchers developed the fundamental techniques for evaluating chess positions and for searching for the best possible moves while considering the opponent’s possible counter-moves. These ideas are still in use today.
In the 1960s, AI pioneers Herbert Simon and John McCarthy referred to chess as “the Drosophila of AI”, which meant that chess, like the common fruit fly, represented a relatively simple system that could also be used to explore larger, more complex real-world phenomena.
By the end of the 1960s, computer chess programs were good enough to occasionally beat club-level or amateur players.
In the 1950s and 1960s, early pioneers focused on chess heuristics (rules of thumb) to choose the best next moves. Programs in the 1970s also used heuristics; however, there was a much stronger focus on software improvements and on faster, more specialized hardware. Leading chess programs used transposition tables (hash tables) which stored information about positions the program had already searched. So, if the same position was reached again on the board, the program wouldn’t need to search it again; it would reuse the previously generated information. This customized hardware and software allowed programs to conduct much deeper searches of game trees (involving millions of chess positions), something humans did not do (because they could not).
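The transposition-table idea can be sketched very simply: cache search results keyed by position, so that a position reached via different move orders is only searched once. The `position_key` and `search` parameters below are hypothetical stand-ins (real engines use Zobrist hashing for keys and store richer entries, e.g. best move and bound type).

```python
# Sketch of a transposition table: cache search results keyed by position.
# `position_key` and `search` are hypothetical stand-ins; real engines key
# entries by a Zobrist hash and store more than just a score.

transposition_table = {}

def cached_search(position_key, depth, search):
    entry = transposition_table.get(position_key)
    if entry is not None and entry["depth"] >= depth:
        return entry["score"]            # reuse a previous, deep-enough result
    score = search(position_key, depth)  # otherwise, run the full search
    transposition_table[position_key] = {"depth": depth, "score": score}
    return score
```

The depth check matters: a score computed by a shallower search than the one now requested is not trustworthy enough to reuse, so it is recomputed.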
The 1980s brought an era of low cost chess computers. First microprocessor-based chess programs started becoming available. Because of the availability of home computers and these programs, anyone could now play chess (and improve their game) against a machine. By the mid-1980s, the sophistication of microprocessor-based chess software had improved so much they began winning tournaments — both against supercomputer based chess programs and some top-ranked human players.
Additionally, several search innovations were introduced: iterative deepening (repeatedly searching the game tree to increasing depths until time runs out), opening books (rules or sequences of moves, known by experts to be good, for starting a game), and endgame databases (sequences of moves that were good for ending a game).
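Iterative deepening sounds wasteful (the shallow levels are searched repeatedly) but isn't: because the tree grows exponentially, the final iteration dominates the cost, and the program always has a complete answer from the last finished depth when time expires. A minimal sketch, where `search_to_depth` is a hypothetical stand-in for a full fixed-depth search:

```python
# Sketch of iterative deepening: search to depth 1, then 2, then 3, ...
# until a time budget runs out, keeping the best move from the last
# completed depth. `search_to_depth` is a hypothetical fixed-depth search.

import time

def iterative_deepening(search_to_depth, time_budget_seconds, max_depth=64):
    deadline = time.monotonic() + time_budget_seconds
    best_move = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                         # out of time: keep last full result
        best_move = search_to_depth(depth)  # deeper search refines the answer
    return best_move
```

A side benefit, exploited by the programs of this era, is that each shallow iteration produces move-ordering information that makes the next, deeper iteration's alpha-beta pruning far more effective.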
In the 1990s, chess programs began challenging International Masters and later Grandmasters. These programs relied much more on memory and brute force than on strategic insight, and they started to consistently beat the best humans. Some dramatic moments in computer chess occurred in 1989 — two widely respected Grandmasters were defeated by CMU’s HiTech and Deep Thought.
Researchers felt machines could finally beat a World Chess Champion. This got IBM interested so they began working on this challenge in 1989 and built a specialized chess machine, named Deep Blue. In 1997, it would end up beating Garry Kasparov, the best human chess player. So, did it think? And was it intelligent?
IBM researchers were awarded the Fredkin Prize, a $100,000 award for the first program to beat a reigning world chess champion, which had gone unclaimed for 17 years. This and later victories in computer chess were disappointing to many AI researchers, who were interested in building machines that would succeed via “general intelligence” strategies rather than brute force. In this sense, chess had started to decouple from AI research.
Read about chess machines until Deep Blue in Part 3: Machines that Play (Chess — Before Deep Blue).
Read about Deep Blue here: Part 4: Machines That Play (Deep Blue) and Part 5: Machines That Play (Post Deep Blue)
Luke Muehlhauser: historical chess engines’ estimated Elo ratings