How to Play Chess Using a GPT-2 Model

OpenAI’s transformer-based language model GPT-2 definitely lives up to the hype. As part of the ongoing evolution of Artificial Intelligence (AI), this generative language model drew a lot of attention by being interviewed and by powering the online text adventure game AI Dungeon.

GPT-2 is built on transformer decoder blocks. It is trained to predict the next token in a sequence and generates text by repeatedly extending a prompt with its own predictions, based on what it has already learned. Its ability to quickly produce human-like text from minimal prompts is exceptional.

So it’s no surprise that it’s been deployed across applications like chatbots (of course), translation platforms, TL;DR generators (which shows its impressive cognitive abilities), music generators, poetry generators, and more.

Although more complex Natural Language Processing (NLP) models like GPT-3 and Google T5 were released recently, GPT-2 remains a popular choice. One of the primary reasons is that it runs efficiently on everyday hardware. For example, its largest version has 1.5 billion trainable parameters yet demands only 6.5 GB of storage.

So at Intersog, we decided to put it to the test and see what else we could train it to do. Someone suggested playing a game of chess, and we ran with it.

Machine Learning (ML) researcher Shawn Presser has already trained GPT-2 to play chess using Portable Game Notation (PGN) files. His model is clear evidence that GPT-2 can recognize known patterns within the game.

We decided to take it a step further by training the GPT-2 model on the current board state rather than on PGN move sequences. With this approach, the model predicts the next move from the present position alone, without going through the game's move history.

The code you’ll see in the steps below is inspired by Professor Blank’s Programming a Chess Player and by Presser’s Cryochess – GPT-2 1.5B chess engine.

Step 1: Set Up Your Dependencies

To play chess with smart algorithms, you need a powerful ML-ready machine with a robust Graphics Processing Unit (GPU).

Install CUDA 10.1, PyTorch, and TensorFlow 2. It’s best to work in a virtual environment with the JupyterLab extension installed.

Install python-chess [4] and aitextgen [5] modules:

!pip install python-chess
!pip install aitextgen
!pip install tqdm
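Before moving on, it can help to confirm that the packages are visible to the notebook kernel. The following is a minimal sketch that uses only the standard library. The helper name `missing_modules` is hypothetical; `chess` is the import name of python-chess and `torch` is the import name of PyTorch.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages this walkthrough relies on.
required = ["chess", "aitextgen", "tqdm", "torch"]
print("missing:", missing_modules(required))
```

An empty list means every package can be located. This check does not confirm that a GPU is usable.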

Download PGN files to the pgn folder. You also have the option of converting SCID databases (*.sg4) to PGN format. For this exercise, we trained on 100,000 games from PGN archives.

Some valuable PGN resources include the following:

import os

# create the folder that will hold the downloaded PGN files
if not os.path.exists("pgn"):
    os.mkdir("pgn")

Step 2: Generate Training Data

As the example below shows, each line of the training file pairs the current board state with the next move played from it.

Each line has the following format:

[Result] FEN-position-and-side-only - next_move

[1-0] r1bq3k/ppp2rpp/5b2/3n4/3P4/P4p2/BP1B1PPP/R2QR1K1 w - a2d5

[1-0] 2bQ4/p4kb1/6n1/q1p1p3/1rn1P3/N3BP2/1PP5/2KR2R1 w - a3c4

[0-1] 1r3rk1/p4nbp/1qppb1p1/4p3/PP2P3/4NN1P/2QB1PP1/2R1R1K1 b - f8c8
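To make the format concrete, a line can be split back into its four fields. This is a small sketch based only on the examples above; `TrainingLine` and `parse_training_line` are hypothetical names, not part of the original notebook.

```python
from typing import NamedTuple


class TrainingLine(NamedTuple):
    result: str     # "1-0" or "0-1"
    placement: str  # piece placement field of the FEN
    side: str       # "w" or "b", the side to move
    move: str       # next move in UCI notation


def parse_training_line(line):
    """Split '[Result] placement side - move' into its four fields."""
    state, move = line.strip().split(" - ")
    tag, placement, side = state.split(" ")
    return TrainingLine(tag.strip("[]"), placement, side, move)


example = "[1-0] r1bq3k/ppp2rpp/5b2/3n4/3P4/P4p2/BP1B1PPP/R2QR1K1 w - a2d5"
parsed = parse_training_line(example)
print(parsed.result, parsed.side, parsed.move)
```

The separator between the state and the move is " - " with spaces, so the hyphen inside the result tag does not interfere with the split.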

We used only games that ended in a win and skipped draws. From each game, only the winner's moves are kept. The code that generates the training file is as follows:

import os
import glob

import chess.pgn
from tqdm.auto import tqdm

MAX_IMPORT = 100000


def importPgn(filename, s, max_import):
    counter = 0
    total = 0
    # count the games first so the progress bar knows its total
    with open(filename) as f:
        for line in f:
            if "[Result" in line:
                total += 1
    if total > max_import:
        total = max_import

    pbar = tqdm(total=total, desc="read " + filename, unit=" games", mininterval=1)
    with open(filename) as pgn:
        while counter < max_import:
            game = chess.pgn.read_game(pgn)
            if not game:
                break
            board = game.board()
            moves = game.mainline_moves()
            count = sum(1 for _ in moves)
            # skip unfinished games
            if count <= 5:
                continue
            result = game.headers["Result"]
            # import only resultative games
            if result != "1-0" and result != "0-1":
                continue
            for move in moves:
                # keep only the moves made by the side that won
                if (board.turn == chess.WHITE and result == "1-0") or (
                    board.turn == chess.BLACK and result == "0-1"
                ):
                    line = (
                        "[" + result + "] "
                        + " ".join(board.fen().split(" ", 2)[:2])
                        + " - "
                        + move.uci()
                    ).strip()
                    s.add(line)
                board.push(move)
            counter += 1
            pbar.update(1)
    pbar.close()
    return counter


def convert():
    games = 0
    moves = 0
    max_import = MAX_IMPORT
    s = set()

    # load previous state
    if os.path.exists("fen.txt"):
        with open("fen.txt") as f:
            for line in tqdm(f, desc="read fen.txt", unit=" moves", mininterval=1):
                # strip the newline, otherwise it is written back twice
                line = line.strip()
                if line:
                    s.add(line)
                    max_import -= 1
                    if max_import <= 0:
                        break

    for file in glob.glob("pgn/*.pgn"):
        count = importPgn(file, s, max_import)
        games += count
        max_import -= count
        if max_import <= 0:
            break

    with open("fen.txt", "w") as f:
        for line in tqdm(s, desc="write fen.txt", unit=" moves", mininterval=1):
            f.write(line + "\n")
            moves += 1
    print("imported " + str(games) + " games, " + str(moves) + " moves")


convert()

It took us about 15 minutes to import 100,000 games. If you want to use more games for training, you’ll also need more RAM.
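The key step in importPgn is trimming the full FEN string to its first two fields and collecting the resulting lines in a set. The sketch below reproduces that step in isolation, using the standard starting-position FEN as a literal so that python-chess is not needed. `training_line` is a hypothetical helper introduced for illustration.

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def training_line(result, fen, uci_move):
    """Build one training line from a full FEN string and a UCI move."""
    # Keep only the piece placement and the side to move; castling rights,
    # the en passant square and the move counters are dropped.
    position_and_side = " ".join(fen.split(" ", 2)[:2])
    return "[" + result + "] " + position_and_side + " - " + uci_move


lines = set()
# The same position and move seen in two different games is stored once.
lines.add(training_line("1-0", START_FEN, "e2e4"))
lines.add(training_line("1-0", START_FEN, "e2e4"))
lines.add(training_line("1-0", START_FEN, "d2d4"))
print(len(lines))
for line in sorted(lines):
    print(line)
```

Because the whole set is held in memory until fen.txt is written, the memory needed grows with the number of distinct position and move pairs.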

Step 4: Training the GPT-2 Model

As described in the aitextgen documentation, we trained a small GPT-2 model from scratch using only the model memory. We chose a small model because it can be trained quickly on average hardware, which is not the case for larger models.

Larger models come with their own sets of demands and benefits, but they are far too complex for a simple demonstration.

Because model checkpoints are saved periodically, the training function can be run multiple times to resume training until the loss is acceptable. In this scenario, to save time, we stopped at a loss value close to 0.8. Even at this level, the GPT-2 model could predict moves with an acceptable level of accuracy.
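One way to read a loss of 0.8 is to convert it to perplexity. The sketch below assumes the reported loss is the mean cross-entropy per token measured in nats, which is the usual convention for GPT-2 training but is an assumption here, not something the training log confirms. `perplexity` is a hypothetical helper.

```python
import math


def perplexity(loss):
    """Convert a mean cross-entropy loss in nats per token to perplexity."""
    return math.exp(loss)


print(round(perplexity(0.8), 2))
```

Under that assumption, the model is on average about as uncertain as if it were choosing between two or three equally likely tokens at each step.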

To fit your GPU and avoid out-of-memory errors, tune batch_size and num_workers.

import os

from aitextgen import aitextgen
from aitextgen.utils import build_gpt2_config
from aitextgen.TokenDataset import TokenDataset
from aitextgen.tokenizers import train_tokenizer

file_name = "fen.txt"
model_dir = "trained_model"
config_file = os.path.join(model_dir, "config.json")
pytorch_model_file = os.path.join(model_dir, "pytorch_model.bin")
vocab_file = os.path.join(model_dir, "aitextgen-vocab.json")
merges_file = os.path.join(model_dir, "aitextgen-merges.txt")
dataset_cache_file = os.path.join(model_dir, "dataset_cache.tar.gz")
max_length = 100
vocab_size = 10000


def train():
    if not os.path.exists(model_dir):
        os.mkdir(model_dir)

    # train tokenizer if necessary
    if not os.path.exists(vocab_file):
        print("training tokenizer, please wait...")
        train_tokenizer(file_name, save_path=model_dir, vocab_size=vocab_size)

    if os.path.exists(dataset_cache_file):  # use cache
        data = TokenDataset(
            dataset_cache_file,
            vocab_file=vocab_file,
            merges_file=merges_file,
            block_size=max_length,
            from_cache=True,
        )
    else:  # or create token cache if necessary
        data = TokenDataset(
            file_name,
            vocab_file=vocab_file,
            merges_file=merges_file,
            block_size=max_length,
            line_by_line=True,
            save_cache=True,
            cache_destination=dataset_cache_file,
        )

    if not os.path.exists(pytorch_model_file):
        # no checkpoint yet: start from a freshly initialized small model
        config = build_gpt2_config(
            vocab_size=vocab_size,
            max_length=max_length,
            dropout=0.0,
            n_embd=512,
            n_head=16,
            n_layer=16,
        )
        ai = aitextgen(config=config, vocab_file=vocab_file, merges_file=merges_file, to_gpu=True)
    else:
        # resume from the last saved checkpoint
        ai = aitextgen(
            model=pytorch_model_file,
            config=config_file,
            vocab_file=vocab_file,
            merges_file=merges_file,
            to_gpu=True,
        )

    ai.train(
        data,
        num_steps=150000,
        generate_every=1000,
        save_every=1000,
        learning_rate=1e-4,
        batch_size=16,
        num_workers=4,
    )


train()

This process takes about eight hours. However, if you want to use a well-trained model, it’s best to give it a few days.
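To put "small" into numbers, the parameter count can be estimated from the configuration values used in the training code above (vocab_size=10000, max_length=100, n_embd=512, n_layer=16). The sketch assumes the standard GPT-2 block layout with tied input and output embeddings, and assumes the position table has max_length entries. `gpt2_param_count` is a hypothetical helper, not an aitextgen function.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Estimate trainable parameters of a GPT-2 style decoder.

    Assumes tied input/output embeddings, a feed-forward layer four times
    as wide as the embedding, and biases plus two LayerNorms per block.
    """
    d = n_embd
    attention = (d * 3 * d + 3 * d) + (d * d + d)         # qkv and output projections
    feed_forward = (d * 4 * d + 4 * d) + (4 * d * d + d)  # expand, then project back
    layer_norms = 2 * (2 * d)                             # weight and bias, twice
    per_block = attention + feed_forward + layer_norms
    embeddings = vocab_size * d + n_positions * d         # token and position tables
    final_layer_norm = 2 * d
    return n_layer * per_block + embeddings + final_layer_norm


total = gpt2_param_count(vocab_size=10000, n_positions=100, n_embd=512, n_layer=16)
print(f"{total:,} parameters")
```

Under these assumptions the estimate is roughly 55.6 million parameters, a small fraction of the 1.5 billion in the largest GPT-2. The number of attention heads does not change the count because the heads split the same projection matrices.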

Step 5: Assessment

5.1. Introduce a Random Player (random_player)

In this scenario, a random player is a simple (or novice) player who’s pretty poor at the game of chess. The function simply picks a move at random from the list of legal moves.

import random


def random_player(board):
    move = random.choice(list(board.legal_moves))
    # returns (uci move, isPredicted, isKnown)
    return move.uci(), False, False

5.2. Introduce a GPT-2 Player (gpt2_player)

This player uses GPT-2 to predict the next move. The model prompt is built from the result we expect plus the current board state and the side to move (since we want to win, white = 1-0 and black = 0-1). Given this prompt, the model completes the line with the next move.

The GPT-2 player is trained on a small dataset and can never evolve into a chess master. Most of the board states it meets during play are unknown, meaning they were not presented during the training phase, so its moves are predictions for new positions. Whenever the model generates an invalid move, it’s replaced with a random valid move.

import os
import random

import chess
from aitextgen import aitextgen
from tqdm.auto import tqdm

model_dir = "trained_model"
config_file = os.path.join(model_dir, "config.json")
pytorch_model_file = os.path.join(model_dir, "pytorch_model.bin")
vocab_file = os.path.join(model_dir, "aitextgen-vocab.json")
merges_file = os.path.join(model_dir, "aitextgen-merges.txt")
max_length = 100

ai = aitextgen(
    model=pytorch_model_file,
    config=config_file,
    vocab_file=vocab_file,
    merges_file=merges_file,
    from_cache=True,
    to_gpu=True,
    # to_fp16=True
)

# a set to find known states
db = set()
with open("fen.txt") as f:
    for line in tqdm(f, desc="read fen.txt", unit=" moves"):
        if line:
            # keep "[result] position side", the same shape as the prompt
            db.add(" ".join(line.split(" ", 3)[:3]))


def gpt2_player(board):
    # ask for a move from a game that the side to move went on to win
    if board.turn == chess.WHITE:
        prompt = "[1-0] " + " ".join(board.fen().split(" ", 2)[:2])
    else:
        prompt = "[0-1] " + " ".join(board.fen().split(" ", 2)[:2])
    isKnown = prompt in db
    prediction = ai.generate_one(
        prompt=prompt,
        max_length=max_length,
        temperature=0.9,
        top_k=0,
    )
    isPredicted = False
    try:
        uci = prediction.split(" - ")[1].strip()
        move = chess.Move.from_uci(uci)
        isPredicted = True
    except Exception:
        move = None
    if not move or move not in board.legal_moves:
        # give up and do random move
        move = random.choice(list(board.legal_moves))
        isPredicted = False
    return move.uci(), isPredicted, isKnown
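The parse-and-fall-back logic in gpt2_player can be tried without a trained model. The sketch below is a simplified stand-in. It checks the move's shape with a regular expression instead of chess.Move.from_uci, and it takes the legal moves as a plain list of strings. `extract_uci` and `choose_move` are hypothetical names, and the sample predictions are hand-written.

```python
import random
import re

# Simplified shape check: source square, target square, optional promotion.
UCI_PATTERN = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")


def extract_uci(prediction):
    """Return the move that follows ' - ' in the generated text, or None."""
    parts = prediction.split(" - ")
    if len(parts) < 2:
        return None
    candidate = parts[1].strip()
    return candidate if UCI_PATTERN.match(candidate) else None


def choose_move(prediction, legal_moves, rng=random):
    """Use the predicted move if it is legal, otherwise pick a random one.

    Returns (move, is_predicted), mirroring the isPredicted flag above.
    """
    move = extract_uci(prediction)
    if move in legal_moves:
        return move, True
    return rng.choice(legal_moves), False


legal = ["e2e4", "d2d4", "g1f3"]  # hand-written stand-in for board.legal_moves
print(choose_move("[1-0] <position> w - e2e4", legal))  # legal prediction
print(choose_move("[1-0] <position> w - e2e5", legal))  # well-formed but illegal
print(choose_move("[1-0] <position> w", legal))         # no move in the text
```

Only the first call keeps the model's move. The other two fall back to a random legal move and report is_predicted as False.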

Step 6: Play a Game of Chess

Now comes the fun part!

This function takes two players and lets them play chess against each other:

import time

import chess
from IPython.display import display, HTML, clear_output


def who(player):
    return "White" if player == chess.WHITE else "Black"


def display_board(board, use_svg):
    if use_svg:
        return board._repr_svg_()
    else:
        return "<pre>" + str(board) + "</pre>"


def play_game(player1, player2, visual="svg", pause=0.1):
    """
    player1, player2: functions that take a board and return
                      (uci move, isPredicted, isKnown)
    visual: "simple" | "svg" | None
    """
    use_svg = (visual == "svg")
    board = chess.Board()
    known1 = 0
    predicted1 = 0
    total1 = 0
    known2 = 0
    predicted2 = 0
    total2 = 0
    if visual is not None:
        # wrap in HTML so the markup is rendered instead of shown as a string
        display(HTML(display_board(board, use_svg)))
    try:
        while not board.is_game_over(claim_draw=True):
            if board.turn == chess.WHITE:
                uci, isPredicted, isKnown = player1(board)
                total1 += 1
                if isKnown:
                    known1 += 1
                if isPredicted:
                    predicted1 += 1
            else:
                uci, isPredicted, isKnown = player2(board)
                total2 += 1
                if isKnown:
                    known2 += 1
                if isPredicted:
                    predicted2 += 1
            name = who(board.turn)
            board.push_uci(uci)
            board_stop = display_board(board, use_svg)
            html = "<b>Move %s %s, Play '%s':</b><br/>%s<br/>Known/Predicted/Total moves: %s/%s/%s %s%% - %s/%s/%s %s%%" % (
                len(board.move_stack), name, uci, board_stop,
                known1, predicted1, total1, round(predicted1 / (total1 or 1) * 100),
                known2, predicted2, total2, round(predicted2 / (total2 or 1) * 100))
            if visual is not None:
                if visual == "svg":
                    clear_output(wait=True)
                display(HTML(html))
                if visual == "svg":
                    time.sleep(pause)
    except KeyboardInterrupt:
        msg = "Game interrupted!"
        return (None, msg, board)

    result = "1/2-1/2"
    if board.is_checkmate():
        msg = "checkmate: " + who(not board.turn) + " wins!"
        result = "1-0" if who(not board.turn) == "White" else "0-1"
    elif board.is_stalemate():
        msg = "draw: stalemate"
    elif board.is_fivefold_repetition():
        msg = "draw: 5-fold repetition"
    elif board.is_insufficient_material():
        msg = "draw: insufficient material"
    elif board.can_claim_draw():
        msg = "draw: claim"
    if visual is not None:
        print(msg)
    return (result, msg, board)

Let’s play the gpt2_player vs. random_player:

play_game(gpt2_player, random_player)

pass

Move 61 White, Play 'd2d7':

Known/Predicted/Total moves: 2/29/31 94% - 0/0/30 0%

checkmate: White wins!
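In the status line, each player's three numbers are known positions, model-predicted moves, and total moves, followed by the predicted share as a percentage. So 2/29/31 94% means White made 31 moves, 29 of which came from the model, and only 2 of the positions appeared in the training file. The percentage can be reproduced with the same expression play_game uses; `predicted_share` is a hypothetical name for it.

```python
def predicted_share(predicted, total):
    """Percentage of moves that came from the model, as in the status line."""
    # `total or 1` avoids a division by zero before a player has moved.
    return round(predicted / (total or 1) * 100)


print(predicted_share(29, 31))  # White in the sample output above
print(predicted_share(0, 30))   # Black, the random player
print(predicted_share(0, 0))    # a player that has not moved yet
```

The random player always reports 0% because random_player returns False for both flags.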

When you're working with a small dataset, these chess matches often end up in a stalemate. While we aren't a hundred percent certain, it could also result from the player not analyzing several candidate moves and choosing the best one.

Now let's play 100 games where the gpt2_player plays white:

from tqdm.auto import tqdm

plays = 100
white_wins = 0
black_wins = 0
pbar1 = None
pbar2 = None
for i in tqdm(range(plays), desc="Plays"):
    if not pbar1:
        pbar1 = tqdm(total=plays, desc="White wins")
    if not pbar2:
        pbar2 = tqdm(total=plays, desc="Black wins")
    result, _, _ = play_game(gpt2_player, random_player, visual=None)
    if result is None:
        break
    elif result == "1-0":
        white_wins += 1
        pbar1.update(1)
    elif result == "0-1":
        black_wins += 1
        pbar2.update(1)
pbar1.close()
pbar2.close()
print("Final score: %s-%s" % (white_wins, black_wins))

Final score: 52-0

Most often, the GPT-2 controlled player wins the game or draws. In this run, the GPT-2 controlled white player won 52 games, more than half, and lost none. We also noticed that the current board state was almost always new to the model.

The model also produced more valid moves than invalid ones. So we can conclude that the GPT-2 model was able to learn some basic patterns from the training data to predict the next move successfully.
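The loop above counts only decisive games, so draws are implied rather than reported. A small sketch of a fuller tally follows. `summarize` is a hypothetical helper, and the sample list is made up for illustration; it is not the outcome of our run.

```python
from collections import Counter


def summarize(results):
    """Count wins, losses and draws from a list of result strings."""
    counts = Counter(results)
    return {
        "white_wins": counts["1-0"],
        "black_wins": counts["0-1"],
        "draws": counts["1/2-1/2"],
    }


# Made-up results for illustration only.
sample = ["1-0", "1/2-1/2", "1-0", "0-1", "1/2-1/2"]
print(summarize(sample))
```

Collecting the first element of each play_game return value into a list and passing it to a helper like this would report the draws explicitly.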

Let’s play the gpt2_player vs. Human Player:

The following function effectively handles human input into the game:

def human_player(board):
    uci = get_move("%s's move [q to quit]> " % who(board.turn))
    legal_uci_moves = [move.uci() for move in board.legal_moves]
    # keep asking until the input is a legal move
    while uci not in legal_uci_moves:
        print("Legal moves: " + (",".join(sorted(legal_uci_moves))))
        uci = get_move("%s's move [q to quit]> " % who(board.turn))
    return uci, True, False


def get_move(prompt):
    uci = input(prompt)
    if uci and uci[0] == "q":
        raise KeyboardInterrupt()
    try:
        chess.Move.from_uci(uci)
    except ValueError:
        # not a well-formed UCI string
        uci = None
    return uci

Do you want to play against gpt2_player?

If you do, note that you must enter your move in Universal Chess Interface (UCI) notation, which names the square a piece starts on followed by the square it moves to. This means that typing something like "a2a4" moves the piece located at a2 to a4.
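The structure of such a move string can be shown with a few lines of plain Python. This is an illustrative sketch only; `split_uci` is a hypothetical helper, and the game itself relies on python-chess to validate input.

```python
def split_uci(move):
    """Split a UCI move string into (from_square, to_square, promotion)."""
    if len(move) not in (4, 5):
        raise ValueError("expected 4 or 5 characters, got %r" % move)
    from_square, to_square = move[:2], move[2:4]
    # A fifth character names the piece a pawn promotes to, for example "q".
    promotion = move[4] if len(move) == 5 else None
    return from_square, to_square, promotion


print(split_uci("a2a4"))
print(split_uci("e7e8q"))
```

Castling is entered as the king's move, for example "e1g1" for White castling kingside.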

play_game(human_player, gpt2_player)

pass

Move 10 Black, Play 'b7b6':

Known/Predicted/Total moves: 0/5/5 100% - 2/5/5 100%

Conclusion

In this experiment, we trained GPT-2 to learn and play a game of chess. Although it was nowhere near our grandmasters (at least not yet), it showed that it was capable of picking up the game's basics. With more training data and a larger model, it could theoretically reach the next level and possibly beat human players.

An interesting observation was that the model itself behaved quite erratically when playing the random player who made arbitrary moves. However, when we added a human player, the model made more calculated and confident moves. This suggests that a better player creates board states that are similar to the training data.

We also noticed that the GPT-2 model engaged in natural text generation and confidently generated different types of textual patterns. It was also able to leverage its training data to successfully handle unknown input, almost like it developed a new internal algorithm.

If you'd like to run your own experiments, this notebook is available HERE.