OpenAI’s transformer-based language model GPT-2 definitely lives up to the hype. Following the natural evolution of Artificial Intelligence (AI), this generative language model drew a lot of attention by engaging in interviews and appearing in the online text adventure game AI Dungeon.
GPT-2 is built from transformer decoder blocks. It is trained to predict the next word in a sequence, and it generates text by extending a prompt based on what it has already learned. Its ability to quickly produce human-like text from minimal prompts is exceptional.
So it’s no surprise that it’s been deployed across applications like chatbots (of course), translation platforms, TL;DR generators (which shows its impressive cognitive abilities), music generators, poetry generators, and more.
Although larger Natural Language Processing (NLP) models like GPT-3 and Google T5 have since been released, GPT-2 remains widely used. One of the primary reasons is that it runs efficiently on everyday hardware. For example, its full set of trainable parameters, about 1.5 billion, requires only about 6.5 GB of storage.
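As a rough sanity check on that storage figure, the sketch below multiplies the parameter count by the size of one parameter. It assumes the weights are stored as 32-bit floats (4 bytes each); the function name `weights_size_gb` is ours and purely illustrative. The raw weights come to about 6 GB, which is in the same ballpark as the 6.5 GB quoted above; this article does not break down the remainder.

```python
def weights_size_gb(n_params, bytes_per_param=4):
    """Approximate storage for raw model weights, in decimal gigabytes.

    Assumption: every parameter is stored at the same width
    (4 bytes for float32, 2 bytes for float16).
    """
    return n_params * bytes_per_param / 1e9


# GPT-2 1.5B stored as float32 (assumed), then as float16 for comparison
print(weights_size_gb(1_500_000_000))      # 6.0
print(weights_size_gb(1_500_000_000, 2))   # 3.0
```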
So at Intersog, we decided to put it to the test and see what else we could train it to do. Someone suggested playing a game of chess, and we ran with it.
Machine Learning (ML) researcher Shawn Presser already trained GPT-2 to play chess using Portable Game Notation (PGN) files. The model is clear evidence of its ability to recognize known patterns within the game.
We decided to take it a step further by training the GPT-2 model on the current board state rather than PGN sequences. This approach ensures that it can play the game based on the present without going through gaming history to predict the next best move.
The code you’ll see in the steps below is inspired by Professor Blank’s Programming a Chess Player and trained on Presser’s Cryochess – GPT-2 1.5B chess engine.
Step 1. Set up Your Dependencies
To play chess with smart algorithms, you need a powerful ML-ready machine with a robust Graphics Processing Unit (GPU).
Install CUDA 10.1, PyTorch, and TensorFlow 2. It’s best to engage in this activity in a virtual environment with a JupyterLab extension installed.
Install python-chess [4] and aitextgen [5] modules:
    !pip install python-chess
    !pip install aitextgen
    !pip install tqdm

Download PGN files to the pgn folder. You also have the option of converting SCID databases (*.sg4) to PGN format. For this exercise, we trained on 100,000 games from PGN archives.
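Before going further, it can help to confirm that the packages are actually importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is our own, and the module list is an assumption based on the packages named in this step (note that python-chess is imported as `chess`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names assumed from the install step above
required = ["chess", "aitextgen", "tqdm", "torch"]
missing = missing_modules(required)
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that each module can be located; it does not verify versions or GPU support.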
Some valuable PGN resources include the following:
    import os

    if not os.path.exists("pgn"):
        os.mkdir("pgn")

Step 2. Generate Training Data
As the example below shows, each line of the training file pairs the current board state with the move that was played next.
Each line uses the following format:

    [Result] FEN-position-and-side-only - next_move

    [1-0] r1bq3k/ppp2rpp/5b2/3n4/3P4/P4p2/BP1B1PPP/R2QR1K1 w - a2d5
    [1-0] 2bQ4/p4kb1/6n1/q1p1p3/1rn1P3/N3BP2/1PP5/2KR2R1 w - a3c4
    [0-1] 1r3rk1/p4nbp/1qppb1p1/4p3/PP2P3/4NN1P/2QB1PP1/2R1R1K1 b - f8c8

We used only games that ended in a win and skipped draws. For each game, only the winning side's moves are kept. The code that generates the training file is as follows:
    import os
    from tqdm.auto import tqdm
    import glob
    import chess.pgn

    MAX_IMPORT = 100000


    def importPgn(filename, s, max_import):
        counter = 0
        total = 0
        # count the games up front so the progress bar has a total
        with open(filename) as f:
            for line in f:
                if "[Result" in line:
                    total += 1
        if total > max_import:
            total = max_import
        pbar = tqdm(total=total, desc="read " + filename, unit=" games", mininterval=1)
        with open(filename) as pgn:
            while counter < max_import:
                game = chess.pgn.read_game(pgn)
                if not game:
                    break
                board = game.board()
                moves = game.mainline_moves()
                count = sum(1 for _ in moves)
                # skip unfinished games
                if count <= 5:
                    continue
                result = game.headers["Result"]
                # import only resultative games
                if result != "1-0" and result != "0-1":
                    continue
                for move in moves:
                    # keep only the moves made by the side that won
                    if board.turn == chess.WHITE and result == "1-0":
                        line = (
                            "[1-0] "
                            + " ".join(board.fen().split(" ", 2)[:2])
                            + " - "
                            + move.uci()
                        ).strip()
                        s.add(line)
                    elif board.turn == chess.BLACK and result == "0-1":
                        line = (
                            "[0-1] "
                            + " ".join(board.fen().split(" ", 2)[:2])
                            + " - "
                            + move.uci()
                        ).strip()
                        s.add(line)
                    board.push(move)
                counter += 1
                pbar.update(1)
        pbar.close()
        return counter


    def convert():
        games = 0
        moves = 0
        max_import = MAX_IMPORT
        s = set()
        # load previous state
        if os.path.exists("fen.txt"):
            with open("fen.txt") as f:
                for line in tqdm(f, desc="read fen.txt", unit=" moves", mininterval=1):
                    # strip the newline so rewritten lines are not double-spaced
                    line = line.strip()
                    if line:
                        s.add(line)
                        max_import -= 1
                        if max_import <= 0:
                            break
        for file in glob.glob("pgn/*.pgn"):
            count = importPgn(file, s, max_import)
            games += count
            max_import -= count
            if max_import <= 0:
                break
        with open("fen.txt", "w") as f:
            for line in tqdm(s, desc="write fen.txt", unit=" moves", mininterval=1):
                f.write(line + "\n")
                moves += 1
        print("imported " + str(games) + " games, " + str(moves) + " moves")


    convert()

Importing 100,000 games took us about 15 minutes. If you want to train on more games, you'll also need more RAM.
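To make the line format concrete, here is a minimal sketch that builds one training line from a full FEN string and splits it back into a prompt and a move, using plain string handling rather than python-chess. The helper names `make_training_line` and `split_training_line` are ours and do not appear in the notebook; the starting-position FEN is only used for illustration.

```python
def make_training_line(result, fen, uci_move):
    """Build '[Result] position side - move' from a full FEN string."""
    # A full FEN has six space-separated fields; keep only the piece
    # placement and the side to move, as the import code does.
    position_and_side = " ".join(fen.split(" ", 2)[:2])
    return "[" + result + "] " + position_and_side + " - " + uci_move


def split_training_line(line):
    """Split a training line into (prompt, move)."""
    prompt, move = line.rsplit(" - ", 1)
    return prompt, move.strip()


start_fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
line = make_training_line("1-0", start_fen, "e2e4")
print(line)
# [1-0] rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w - e2e4
print(split_training_line(line))
# ('[1-0] rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w', 'e2e4')
```

Dropping the castling, en passant, and move-counter fields keeps lines short, at the cost of the model not seeing that information.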
Step 4: Training the GPT-2 Model
As described in the aitextgen documentation, we trained a small GPT-2 model from scratch using only the model memory. We chose a small model because it trains quickly on basic or average hardware.
Larger models come with their own demands and benefits, but they are far too complex for a simple demonstration.
Because model checkpoints are saved periodically, the training function can be run multiple times to resume training until the loss is acceptable. In this scenario, to save time, we stopped at a loss value close to 0.8. Even at this level, the GPT-2 model could predict moves with an acceptable level of accuracy.
To fit your GPU and avoid out-of-memory errors, tune batch_size and num_workers.
    from aitextgen import aitextgen
    from aitextgen.utils import build_gpt2_config
    from aitextgen.TokenDataset import TokenDataset
    from aitextgen.tokenizers import train_tokenizer
    import os

    file_name = "fen.txt"
    model_dir = "trained_model"
    config_file = os.path.join(model_dir, "config.json")
    pytorch_model_file = os.path.join(model_dir, "pytorch_model.bin")
    vocab_file = os.path.join(model_dir, "aitextgen-vocab.json")
    merges_file = os.path.join(model_dir, "aitextgen-merges.txt")
    dataset_cache_file = os.path.join(model_dir, "dataset_cache.tar.gz")
    max_length = 100
    vocab_size = 10000


    def train():
        if not os.path.exists(model_dir):
            os.mkdir(model_dir)
        # train tokenizer if necessary
        if not os.path.exists(vocab_file):
            print("training tokenizer, please wait...")
            train_tokenizer(file_name, save_path=model_dir, vocab_size=vocab_size)
        if os.path.exists(dataset_cache_file):  # use cache
            data = TokenDataset(
                dataset_cache_file,
                vocab_file=vocab_file,
                merges_file=merges_file,
                block_size=max_length,
                from_cache=True,
            )
        else:  # or create token cache if necessary
            data = TokenDataset(
                file_name,
                vocab_file=vocab_file,
                merges_file=merges_file,
                block_size=max_length,
                line_by_line=True,
                save_cache=True,
                cache_destination=dataset_cache_file,
            )
        if not os.path.exists(pytorch_model_file):
            # no checkpoint yet: start from a fresh configuration
            config = build_gpt2_config(
                vocab_size=vocab_size,
                max_length=max_length,
                dropout=0.0,
                n_embd=512,
                n_head=16,
                n_layer=16,
            )
            ai = aitextgen(
                config=config, vocab_file=vocab_file, merges_file=merges_file, to_gpu=True
            )
        else:
            # resume from the saved checkpoint
            ai = aitextgen(
                model=pytorch_model_file,
                config=config_file,
                vocab_file=vocab_file,
                merges_file=merges_file,
                to_gpu=True,
            )
        ai.train(
            data,
            num_steps=150000,
            generate_every=1000,
            save_every=1000,
            learning_rate=1e-4,
            batch_size=16,
            num_workers=4,
        )


    train()

This process takes about eight hours. However, if you want to use a well-trained model, it’s best to give it a few days.
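To get a feel for how small this model is, the sketch below estimates its parameter count from the configuration values used above (vocab_size=10000, max_length=100, n_embd=512, n_layer=16). It assumes the standard GPT-2 layout: a feed-forward width of four times n_embd, biases on every linear layer, two layer norms per block plus a final one, and an output layer that shares weights with the token embedding. The function name `gpt2_param_count` is ours and is not part of aitextgen.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Estimate trainable parameters for a standard GPT-2 layout (assumed)."""
    d = n_embd
    attention = (d * 3 * d + 3 * d) + (d * d + d)      # QKV projection + output projection
    feed_forward = (d * 4 * d + 4 * d) + (4 * d * d + d)
    layer_norms = 2 * (2 * d)                          # two per block, weight and bias
    per_block = attention + feed_forward + layer_norms
    embeddings = vocab_size * d + n_positions * d      # token + position embeddings
    final_layer_norm = 2 * d
    return n_layer * per_block + embeddings + final_layer_norm


n = gpt2_param_count(vocab_size=10000, n_positions=100, n_embd=512, n_layer=16)
print(n)                         # 55610368
print(round(n * 4 / 1e6), "MB")  # about 222 MB as float32
```

At roughly 56 million parameters, this is a small fraction of the 1.5 billion in the largest GPT-2, which is consistent with training in hours on a single GPU.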
Step 5: Assessment
5.1. Introduce a Random Player (random_player)
In this scenario, a random player is a simple (or novice) player who’s pretty poor at the game of chess. The function basically makes an arbitrary choice based on a list of valid moves.
    import random


    def random_player(board):
        move = random.choice(list(board.legal_moves))
        return move.uci(), False, False

5.2. Introduce a GPT-2 Player (gpt2_player)
This player uses GPT-2 to predict the next move. The model prompt is built from the desired result plus the current board state and side to move (since we want to win, White's prompt starts with [1-0] and Black's with [0-1]). Given this prompt, the model completes the line with the next move.
The GPT-2 player is trained on a small dataset and will never become a chess master. Most predicted moves are for board states the model has not seen during training. Whenever the model generates an invalid move, we replace it with a random valid move.
    import os
    import random
    from aitextgen import aitextgen
    import chess
    from tqdm.auto import tqdm

    model_dir = "trained_model"
    config_file = os.path.join(model_dir, "config.json")
    pytorch_model_file = os.path.join(model_dir, "pytorch_model.bin")
    vocab_file = os.path.join(model_dir, "aitextgen-vocab.json")
    merges_file = os.path.join(model_dir, "aitextgen-merges.txt")
    max_length = 100

    ai = aitextgen(
        model=pytorch_model_file,
        config=config_file,
        vocab_file=vocab_file,
        merges_file=merges_file,
        from_cache=True,
        to_gpu=True,
        # to_fp16=True
    )

    # a set to find known states
    db = set()
    with open("fen.txt") as f:
        for line in tqdm(f, desc="read fen.txt", unit=" moves"):
            if line:
                # keep "[Result] position side", the same shape as the prompt
                db.add(" ".join(line.split(" ", 3)[:3]))


    def gpt2_player(board):
        if board.turn == chess.WHITE:
            prompt = "[1-0] " + " ".join(board.fen().split(" ", 2)[:2])
        else:
            prompt = "[0-1] " + " ".join(board.fen().split(" ", 2)[:2])
        isKnown = prompt in db
        prediction = ai.generate_one(
            prompt=prompt,
            max_length=max_length,
            temperature=0.9,
            top_k=0,
        )
        isPredicted = False
        try:
            # the move is whatever follows " - " in the generated line
            uci = prediction.split(" - ")[1].strip()
            move = chess.Move.from_uci(uci)
            isPredicted = True
        except Exception:
            move = None
        if not move or move not in board.legal_moves:
            # give up and do random move
            move = random.choice(list(board.legal_moves))
            isPredicted = False
        return move.uci(), isPredicted, isKnown

Step 6: Play a Game of Chess
Now comes the fun part!
This function takes two players and has them play chess against each other:
    import time
    from IPython.display import display, HTML, clear_output
    import chess


    def who(player):
        return "White" if player == chess.WHITE else "Black"


    def display_board(board, use_svg):
        if use_svg:
            return board._repr_svg_()
        else:
            return "<pre>" + str(board) + "</pre>"


    def play_game(player1, player2, visual="svg", pause=0.1):
        """
        player1, player2: functions that take a board and return
            (uci move, isPredicted, isKnown)
        visual: "simple" | "svg" | None
        """
        use_svg = (visual == "svg")
        board = chess.Board()
        known1 = 0
        predicted1 = 0
        total1 = 0
        known2 = 0
        predicted2 = 0
        total2 = 0
        if visual is not None:
            # wrap in HTML so the board is rendered rather than shown as raw markup
            display(HTML(display_board(board, use_svg)))
        try:
            while not board.is_game_over(claim_draw=True):
                if board.turn == chess.WHITE:
                    uci, isPredicted, isKnown = player1(board)
                    total1 += 1
                    if isKnown:
                        known1 += 1
                    if isPredicted:
                        predicted1 += 1
                else:
                    uci, isPredicted, isKnown = player2(board)
                    total2 += 1
                    if isKnown:
                        known2 += 1
                    if isPredicted:
                        predicted2 += 1
                name = who(board.turn)
                board.push_uci(uci)
                board_stop = display_board(board, use_svg)
                html = (
                    "<b>Move %s %s, Play '%s':</b><br/>%s<br/>"
                    "Known/Predicted/Total moves: %s/%s/%s %s%% - %s/%s/%s %s%%"
                    % (
                        len(board.move_stack), name, uci, board_stop,
                        known1, predicted1, total1, round(predicted1 / (total1 or 1) * 100),
                        known2, predicted2, total2, round(predicted2 / (total2 or 1) * 100),
                    )
                )
                if visual is not None:
                    if visual == "svg":
                        clear_output(wait=True)
                    display(HTML(html))
                    if visual == "svg":
                        time.sleep(pause)
        except KeyboardInterrupt:
            msg = "Game interrupted!"
            return (None, msg, board)
        result = "1/2-1/2"
        if board.is_checkmate():
            msg = "checkmate: " + who(not board.turn) + " wins!"
            result = "1-0" if who(not board.turn) == "White" else "0-1"
        elif board.is_stalemate():
            msg = "draw: stalemate"
        elif board.is_fivefold_repetition():
            msg = "draw: 5-fold repetition"
        elif board.is_insufficient_material():
            msg = "draw: insufficient material"
        elif board.can_claim_draw():
            msg = "draw: claim"
        if visual is not None:
            print(msg)
        return (result, msg, board)

Let’s play the gpt2_player vs. random_player:

    play_game(gpt2_player, random_player)
    pass

    Move 61 White, Play 'd2d7':
    Known/Predicted/Total moves: 2/29/31 94% - 0/0/30 0%
    checkmate: White wins!

When you're working with a small dataset, these chess matches often end in a stalemate. While we aren't a hundred percent certain, it could also be because the model does not analyze the candidate next moves and choose the best one.
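The status line in that output packs three counters per player. Based on the play_game and gpt2_player code, "known" counts prompts that already appear in fen.txt, "predicted" counts moves that came from the model and were legal, and "total" counts all moves by that player. The percentage is predicted over total. A minimal sketch of that calculation, with a helper name (`predicted_pct`) of our own:

```python
def predicted_pct(predicted, total):
    """Share of a player's moves that came from the model, as a whole percent.

    `total or 1` avoids division by zero before a player's first move,
    mirroring the expression used in play_game.
    """
    return round(predicted / (total or 1) * 100)


# White (gpt2_player) in the game above: 29 predicted out of 31 moves
print(predicted_pct(29, 31))  # 94
# Black (random_player) never reports a predicted move
print(predicted_pct(0, 30))   # 0
```

In that game, only 2 of White's 31 positions were found in the training file, so nearly all of the model's legal moves were produced for positions it had not seen.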
Now let's play 100 games where the gpt2_player plays white:
    from tqdm.auto import tqdm

    plays = 100
    white_wins = 0
    black_wins = 0
    pbar1 = None
    pbar2 = None
    for i in tqdm(range(plays), desc="Plays"):
        if not pbar1:
            pbar1 = tqdm(total=plays, desc="White wins")
        if not pbar2:
            pbar2 = tqdm(total=plays, desc="Black wins")
        result, _, _ = play_game(gpt2_player, random_player, visual=None)
        if result is None:
            break
        elif result == "1-0":
            white_wins += 1
            pbar1.update(1)
        elif result == "0-1":
            black_wins += 1
            pbar2.update(1)
    pbar1.close()
    pbar2.close()
    print("Final score: %s-%s" % (white_wins, black_wins))

    Final score: 52-0

Most often, the GPT-2 controlled player wins the game or draws. In this scenario, the GPT-2 controlled white player won more than half the games. We also noticed that the current board state was almost always new to the model.
The model also produced more valid moves than invalid ones. So we can conclude that the GPT-2 model was able to learn some basic patterns from the training data to predict the next move successfully.
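The final score only reports decisive games, so the draws are implicit. If you collect the result strings returned by play_game, a small tally makes all three outcomes explicit. This is a sketch with a hypothetical helper (`tally_results`); the sample list below assumes all 100 games finished and that every non-win was a draw, which matches the 52-0 score but is not stated directly in the output.

```python
from collections import Counter


def tally_results(results):
    """Count White wins, Black wins, and draws from play_game result strings."""
    counts = Counter(results)
    return {
        "white_wins": counts["1-0"],
        "black_wins": counts["0-1"],
        "draws": counts["1/2-1/2"],
    }


# Assumed breakdown consistent with a 52-0 score over 100 finished games
sample = ["1-0"] * 52 + ["1/2-1/2"] * 48
print(tally_results(sample))
# {'white_wins': 52, 'black_wins': 0, 'draws': 48}
```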
Let’s play the gpt2_player vs. Human Player:
The following function effectively handles human input into the game:
    def human_player(board):
        uci = get_move("%s's move [q to quit]> " % who(board.turn))
        legal_uci_moves = [move.uci() for move in board.legal_moves]
        # keep asking until the input is a legal move
        while uci not in legal_uci_moves:
            print("Legal moves: " + (",".join(sorted(legal_uci_moves))))
            uci = get_move("%s's move [q to quit]> " % who(board.turn))
        return uci, True, False


    def get_move(prompt):
        uci = input(prompt)
        if uci and uci[0] == "q":
            raise KeyboardInterrupt()
        try:
            chess.Move.from_uci(uci)
        except ValueError:
            # not a well-formed UCI move
            uci = None
        return uci

Do you want to play against gpt2_player?
If you do, note that you must enter your move in Universal Chess Interface (UCI) notation, which gives the origin square followed by the destination square. This means that typing something like "a2a4" moves the piece located at a2 to a4.
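For readers unfamiliar with the notation, the sketch below checks whether a string has the shape of an ordinary UCI move and splits it into its parts: origin square, destination square, and an optional promotion piece such as the "q" in "e7e8q". It uses only a regular expression, so it checks the format and not whether the move is legal on a given board; the helper name `parse_uci` is our own.

```python
import re

# origin square, destination square, optional promotion piece
UCI_MOVE = re.compile(r"^([a-h][1-8])([a-h][1-8])([qrbn])?$")


def parse_uci(text):
    """Return (from_square, to_square, promotion) or None if malformed."""
    match = UCI_MOVE.match(text.strip().lower())
    if not match:
        return None
    return match.group(1), match.group(2), match.group(3)


print(parse_uci("a2a4"))   # ('a2', 'a4', None)
print(parse_uci("e7e8q"))  # ('e7', 'e8', 'q')
print(parse_uci("a2a9"))   # None
```

Legality is still decided by the board, which is why human_player compares the input against board.legal_moves.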
    play_game(human_player, gpt2_player)
    pass

    Move 10 Black, Play 'b7b6':
    Known/Predicted/Total moves: 0/5/5 100% - 2/5/5 100%

Conclusion
In this experiment, we trained GPT-2 to learn and play a game of chess. Although it was nowhere near our grandmasters (at least not yet), it showed that it was capable of understanding the game's basics. With more training data and a larger model, it could theoretically go to the next level and possibly beat human players.
An interesting observation was that the model itself behaved quite erratically when playing the random player who made arbitrary moves. However, when we added a human player, the model made more calculated and confident moves. This suggests that a better player creates board states that are similar to the training data.
We also noticed that the GPT-2 model engaged in natural text generation and confidently generated different types of textual patterns. It was also able to leverage its training data to successfully handle unknown input, almost like it developed a new internal algorithm.
If you'd like to run your own experiments, this notebook is available HERE.
