OpenAI’s transformer-based language model GPT-2 definitely lives up to the hype. Following the natural evolution of Artificial Intelligence (AI), this generative language model drew a lot of attention by engaging in interviews and appearing in the online text adventure game AI Dungeon.
GPT-2 is built from transformer decoder blocks. It is trained to predict the next token in a sequence, so it generates text by repeatedly extending a prompt based on what it has already learned. Its ability to produce human-like text from minimal prompts is exceptional.
So it’s no surprise that it’s been deployed across applications like chatbots (of course), translation platforms, TL;DR generators (which shows its impressive cognitive abilities), music generators, poetry generators, and more.
Although more complex Natural Language Processing (NLP) models like GPT-3 and Google T5 have been released since, GPT-2 remains a popular choice. One of the primary reasons for this is its ability to run efficiently on everyday hardware. For example, its largest version has 1.5 billion trainable parameters, yet it only demands 6.5 GB of storage.
So at Intersog, we decided to put it to the test and see what else we could train it to do. Someone suggested playing a game of chess, and we ran with it.
Machine Learning (ML) researcher Shawn Presser already trained GPT-2 to play chess using Portable Game Notation (PGN) files. His model is clear evidence that GPT-2 can recognize known patterns within the game.
We decided to take it a step further by training the GPT-2 model on the current board state rather than PGN sequences. With this approach, the model picks its next move from the present position alone, without going through the game history.
The code you’ll see in the steps below is inspired by Professor Blank’s Programming a Chess Player and trained on Presser’s Cryochess – GPT-2 1.5B chess engine.
To play chess with smart algorithms, you need a powerful ML-ready machine with a robust Graphics Processing Unit (GPU).
Install CUDA 10.1, PyTorch, and TensorFlow 2. It’s best to engage in this activity in a virtual environment with a JupyterLab extension installed.
Install python-chess [4] and aitextgen [5] modules:
!pip install python-chess
!pip install aitextgen
!pip install tqdm
Download PGN files to the pgn folder. You also have the option of converting SCID databases (*.sg4) to PGN format. For this exercise, we trained on 100,000 games from PGN archives.
Some valuable PGN resources include the following:
import os

if not os.path.exists("pgn"):
    os.mkdir("pgn")
As the example below shows, each line of the training file pairs the current board state with the next move.
Each line uses the following format:
[Result] FEN-position-and-side-only - next_move
[1-0] r1bq3k/ppp2rpp/5b2/3n4/3P4/P4p2/BP1B1PPP/R2QR1K1 w - a2d5
[1-0] 2bQ4/p4kb1/6n1/q1p1p3/1rn1P3/N3BP2/1PP5/2KR2R1 w - a3c4
[0-1] 1r3rk1/p4nbp/1qppb1p1/4p3/PP2P3/4NN1P/2QB1PP1/2R1R1K1 b - f8c8
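A line in this format can be assembled from a full FEN string by keeping only its first two fields (piece placement and side to move). The sketch below is a minimal illustration of that idea using plain string handling; the helper name training_line is hypothetical and does not appear in the original notebook.

```python
def training_line(full_fen, uci_move, result):
    """Build one training line: "[result] placement side - move".

    full_fen is a standard six-field FEN string; only the first two
    fields (piece placement and side to move) are kept.
    """
    placement, side = full_fen.split(" ")[:2]
    return f"[{result}] {placement} {side} - {uci_move}"


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(training_line(start, "e2e4", "1-0"))
# [1-0] rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w - e2e4
```

Dropping castling rights, en passant square, and move counters keeps the prompt short, at the cost of losing that information about the position.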
We decided to use only chess matches that were won and skipped draws. The code that generates the training file is as follows:
import os
from tqdm.auto import tqdm
import glob
import chess.pgn

MAX_IMPORT = 100000


def importPgn(filename, s, max_import):
    counter = 0
    total = 0
    # count the games in the file to size the progress bar
    with open(filename) as f:
        for line in f:
            if "[Result" in line:
                total += 1
    if total > max_import:
        total = max_import
    pbar = tqdm(total=total, desc="read " + filename, unit=" games", mininterval=1)
    # open the file in a with block so it is closed when we are done
    with open(filename) as pgn:
        while counter < max_import:
            game = chess.pgn.read_game(pgn)
            if not game:
                break
            board = game.board()
            moves = game.mainline_moves()
            count = sum(1 for _ in moves)
            # skip unfinished games
            if count <= 5:
                continue
            result = game.headers["Result"]
            # import only resultative games
            if result != "1-0" and result != "0-1":
                continue
            for move in moves:
                # record only the moves made by the side that won the game
                if board.turn == chess.WHITE and result == "1-0":
                    line = (
                        "[1-0] "
                        + " ".join(board.fen().split(" ", 2)[:2])
                        + " - "
                        + move.uci()
                    ).strip()
                    s.add(line)
                elif board.turn == chess.BLACK and result == "0-1":
                    line = (
                        "[0-1] "
                        + " ".join(board.fen().split(" ", 2)[:2])
                        + " - "
                        + move.uci()
                    ).strip()
                    s.add(line)
                board.push(move)
            counter += 1
            pbar.update(1)
    pbar.close()
    return counter
def convert():
    games = 0
    moves = 0
    max_import = MAX_IMPORT
    s = set()
    # load previous state
    if os.path.exists("fen.txt"):
        with open("fen.txt") as f:
            for line in tqdm(f, desc="read fen.txt", unit=" moves", mininterval=1):
                # strip the trailing newline so reloaded lines match new ones
                line = line.strip()
                if line:
                    s.add(line)
                    # each stored line counts against the import budget
                    max_import -= 1
                    if max_import <= 0:
                        break
    for file in glob.glob("pgn/*.pgn"):
        count = importPgn(file, s, max_import)
        games += count
        max_import -= count
        if max_import <= 0:
            break
    with open("fen.txt", "w") as f:
        for line in tqdm(s, desc="write fen.txt", unit=" moves", mininterval=1):
            f.write(line + "\n")
            moves += 1
    print("imported " + str(games) + " games, " + str(moves) + " moves")


convert()
It took us about 15 minutes to import 100,000 games. If you want to use more games for training, you’ll also need more RAM.
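The filtering rule in the import code is easy to miss: a position is stored only when the side to move is the side that eventually won, so the model sees winners' moves only. A minimal sketch of that rule, with a hypothetical helper name (keeps_move) that is not part of the notebook:

```python
def keeps_move(result, white_to_move):
    """Return True when the side to move is the eventual winner.

    result is the PGN Result header: "1-0", "0-1", "1/2-1/2" or "*".
    """
    if result == "1-0":
        return white_to_move
    if result == "0-1":
        return not white_to_move
    # draws and unfinished games contribute nothing to the training file
    return False


print(keeps_move("1-0", True))       # True: White won and White is to move
print(keeps_move("1-0", False))      # False: Black's moves in a White win are skipped
print(keeps_move("1/2-1/2", True))   # False: draws are skipped
```

One consequence is that roughly half of the positions in each decisive game are discarded, which is part of why the number of stored moves grows more slowly than the number of imported games might suggest.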
As described in the aitextgen documentation, we trained a small GPT-2 model from scratch using only the model memory. We chose a small model because it can be trained quickly on average hardware.
Larger models come with their own sets of demands and benefits, but they are far too complex for a simple demonstration.
Because model checkpoints are saved periodically, the training function can be run multiple times: each run resumes from the last saved checkpoint and continues until the loss is acceptable. In this scenario, to save time, we stopped at a loss value close to 0.8. Even at this level, the GPT-2 model could predict moves with an acceptable level of accuracy.
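To put the loss value in perspective, it can be converted to perplexity. The sketch below assumes the reported loss is the mean cross-entropy per token in natural-log units, which is the usual convention for GPT-2 training in PyTorch but is an assumption about the training log rather than something stated here. The function name perplexity is ours.

```python
import math


def perplexity(loss):
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss)


print(round(perplexity(0.8), 2))  # 2.23
```

Under that assumption, a loss of 0.8 means the model is, on average, about as uncertain as if it were choosing between two or three equally likely tokens at each step. Note that these are tokenizer tokens, not whole chess moves.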
To fit your GPU and avoid out-of-memory errors, tune batch_size and num_workers.
from aitextgen import aitextgen
from aitextgen.utils import build_gpt2_config
from aitextgen.TokenDataset import TokenDataset
from aitextgen.tokenizers import train_tokenizer
import os

file_name = "fen.txt"
model_dir = "trained_model"
config_file = os.path.join(model_dir, "config.json")
pytorch_model_file = os.path.join(model_dir, "pytorch_model.bin")
vocab_file = os.path.join(model_dir, "aitextgen-vocab.json")
merges_file = os.path.join(model_dir, "aitextgen-merges.txt")
dataset_cache_file = os.path.join(model_dir, "dataset_cache.tar.gz")
max_length = 100
vocab_size = 10000


def train():
    if not os.path.exists(model_dir):
        os.mkdir(model_dir)
    # train tokenizer if necessary
    if not os.path.exists(vocab_file):
        print("training tokenizer, please wait...")
        train_tokenizer(file_name, save_path=model_dir, vocab_size=vocab_size)
    if os.path.exists(dataset_cache_file):  # use cache
        data = TokenDataset(
            dataset_cache_file,
            vocab_file=vocab_file,
            merges_file=merges_file,
            block_size=max_length,
            from_cache=True,
        )
    else:  # or create token cache if necessary
        data = TokenDataset(
            file_name,
            vocab_file=vocab_file,
            merges_file=merges_file,
            block_size=max_length,
            line_by_line=True,
            save_cache=True,
            cache_destination=dataset_cache_file,
        )
    if not os.path.exists(pytorch_model_file):
        # no checkpoint yet: start from a fresh, small GPT-2 configuration
        config = build_gpt2_config(
            vocab_size=vocab_size,
            max_length=max_length,
            dropout=0.0,
            n_embd=512,
            n_head=16,
            n_layer=16,
        )
        ai = aitextgen(
            config=config,
            vocab_file=vocab_file,
            merges_file=merges_file,
            to_gpu=True,
        )
    else:
        # resume from the last saved checkpoint
        ai = aitextgen(
            model=pytorch_model_file,
            config=config_file,
            vocab_file=vocab_file,
            merges_file=merges_file,
            to_gpu=True,
        )
    ai.train(
        data,
        num_steps=150000,
        generate_every=1000,
        save_every=1000,
        learning_rate=1e-4,
        batch_size=16,
        num_workers=4,
    )


train()
This process takes about eight hours. However, if you want to use a well-trained model, it’s best to give it a few days.
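For a sense of how small this model is, the weight count can be estimated from the configuration values used in the training code (vocab_size=10000, max_length=100, n_embd=512, n_layer=16). The sketch below is a rough approximation that counts only embedding and block weight matrices, ignores biases and layer-norm parameters, and assumes max_length sets the number of position embeddings. The function name approx_gpt2_params is ours.

```python
def approx_gpt2_params(vocab_size, max_length, n_embd, n_layer):
    """Rough GPT-2 weight count, ignoring biases and layer norms."""
    # token embeddings plus position embeddings
    embeddings = vocab_size * n_embd + max_length * n_embd
    # per block: 4 * d^2 for attention (q, k, v, output projection)
    # plus 8 * d^2 for the two feed-forward matrices (d -> 4d -> d)
    per_block = 12 * n_embd * n_embd
    return embeddings + n_layer * per_block


print(approx_gpt2_params(10000, 100, 512, 16))  # 55502848
```

That is roughly 55 million weights, a small fraction of the 1.5 billion parameters of the largest GPT-2. With n_head=16, each attention head works on 512 / 16 = 32 dimensions.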
5.1. Introduce a Random Player (random_player)
In this scenario, a random player is a simple (or novice) player who’s pretty poor at the game of chess. The function picks an arbitrary move from the list of legal moves.
import random


def random_player(board):
    move = random.choice(list(board.legal_moves))
    # returns (uci move, isPredicted, isKnown)
    return move.uci(), False, False
5.2. Introduce a GPT-2 Player (gpt2_player)
This player uses GPT-2 to predict the next move. The model prompt is constructed from the expected result, the current board state, and the side to move (since we want to win, white = 1-0 and black = 0-1). Given this prompt, the model completes the line with the next move.
The GPT-2 player is trained on a small dataset and cannot be expected to become a chess master. Most predicted moves are for board states that were not presented during the training phase. Whenever the model generates an invalid move, it is replaced with a random legal move.
import os
import random
from aitextgen import aitextgen
from aitextgen.utils import build_gpt2_config
import chess
from tqdm.auto import tqdm

model_dir = "trained_model"
config_file = os.path.join(model_dir, "config.json")
pytorch_model_file = os.path.join(model_dir, "pytorch_model.bin")
vocab_file = os.path.join(model_dir, "aitextgen-vocab.json")
merges_file = os.path.join(model_dir, "aitextgen-merges.txt")
dataset_cache_file = os.path.join(model_dir, "dataset_cache.tar.gz")
max_length = 100

ai = aitextgen(
    model=pytorch_model_file,
    config=config_file,
    vocab_file=vocab_file,
    merges_file=merges_file,
    from_cache=True,
    to_gpu=True,
    # to_fp16=True
)

# a set to find known states
db = set()
with open("fen.txt") as f:
    for line in tqdm(f, desc="read fen.txt", unit=" moves"):
        if line:
            # keep "[result] placement side", which is exactly the prompt format
            db.add(" ".join(line.split(" ", 3)[:3]))


def gpt2_player(board):
    if board.turn == chess.WHITE:
        prompt = "[1-0] " + " ".join(board.fen().split(" ", 2)[:2])
    else:
        prompt = "[0-1] " + " ".join(board.fen().split(" ", 2)[:2])
    isKnown = prompt in db
    prediction = ai.generate_one(
        prompt=prompt,
        max_length=max_length,
        temperature=0.9,
        top_k=0,
    )
    isPredicted = False
    try:
        uci = prediction.split(" - ")[1].strip()
        move = chess.Move.from_uci(uci)
        isPredicted = True
    except Exception as e:
        # print(str(e))
        move = None
    if not move or move not in board.legal_moves:
        # give up and do random move
        move = random.choice(list(board.legal_moves))
        isPredicted = False
    return move.uci(), isPredicted, isKnown
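The parsing step in gpt2_player (take whatever follows " - " in the generated text and treat it as a move) can be isolated and tested without a model. The sketch below is a slightly stricter variant, not the notebook's code: it takes only the first whitespace-separated token after the separator and checks that it looks like a UCI move. The names extract_uci and UCI_PATTERN are hypothetical.

```python
import re

# from-square, to-square, optional promotion piece (e.g. "e7e8q")
UCI_PATTERN = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def extract_uci(prediction):
    """Return the UCI move from generated text, or None if malformed."""
    parts = prediction.split(" - ", 1)
    if len(parts) < 2:
        return None  # the model never produced the " - " separator
    tokens = parts[1].split()
    if not tokens:
        return None  # nothing was generated after the separator
    candidate = tokens[0]
    return candidate if UCI_PATTERN.fullmatch(candidate) else None


print(extract_uci("[1-0] 2bQ4/p4kb1/6n1/q1p1p3/1rn1P3/N3BP2/1PP5/2KR2R1 w - a3c4"))  # a3c4
print(extract_uci("[1-0] 8/8/8/8/8/8/8/K6k w"))  # None
```

A well-formed string is still not necessarily a legal move in the current position, so the legality check against board.legal_moves and the random fallback remain necessary.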
Now comes the fun part!
This function takes two player functions and has them play chess against each other:
import time
from IPython.display import display, HTML, clear_output
import chess


def who(player):
    return "White" if player == chess.WHITE else "Black"


def display_board(board, use_svg):
    if use_svg:
        return board._repr_svg_()
    else:
        return "<pre>" + str(board) + "</pre>"


def play_game(player1, player2, visual="svg", pause=0.1):
    """
    player1, player2: functions that take a board and return
        (uci move, isPredicted, isKnown)
    visual: "simple" | "svg" | None
    """
    use_svg = (visual == "svg")
    board = chess.Board()
    known1 = 0
    predicted1 = 0
    total1 = 0
    known2 = 0
    predicted2 = 0
    total2 = 0
    if visual is not None:
        # wrap in HTML so the markup is rendered instead of shown as a string
        display(HTML(display_board(board, use_svg)))
    try:
        while not board.is_game_over(claim_draw=True):
            if board.turn == chess.WHITE:
                uci, isPredicted, isKnown = player1(board)
                total1 += 1
                if isKnown:
                    known1 += 1
                if isPredicted:
                    predicted1 += 1
            else:
                uci, isPredicted, isKnown = player2(board)
                total2 += 1
                if isKnown:
                    known2 += 1
                if isPredicted:
                    predicted2 += 1
            name = who(board.turn)
            board.push_uci(uci)
            board_stop = display_board(board, use_svg)
            html = "<b>Move %s %s, Play '%s':</b><br/>%s<br/>Known/Predicted/Total moves: %s/%s/%s %s%% - %s/%s/%s %s%%" % (
                len(board.move_stack), name, uci, board_stop,
                known1, predicted1, total1, round(predicted1 / (total1 or 1) * 100),
                known2, predicted2, total2, round(predicted2 / (total2 or 1) * 100))
            if visual is not None:
                if visual == "svg":
                    clear_output(wait=True)
                display(HTML(html))
                if visual == "svg":
                    time.sleep(pause)
    except KeyboardInterrupt:
        msg = "Game interrupted!"
        return (None, msg, board)
    result = "1/2-1/2"
    if board.is_checkmate():
        msg = "checkmate: " + who(not board.turn) + " wins!"
        result = "1-0" if who(not board.turn) == "White" else "0-1"
    elif board.is_stalemate():
        msg = "draw: stalemate"
    elif board.is_fivefold_repetition():
        msg = "draw: 5-fold repetition"
    elif board.is_insufficient_material():
        msg = "draw: insufficient material"
    elif board.can_claim_draw():
        msg = "draw: claim"
    if visual is not None:
        print(msg)
    return (result, msg, board)
Let’s play the gpt2_player vs. random_player:
play_game(gpt2_player, random_player)
pass
Move 61 White, Play 'd2d7':
Known/Predicted/Total moves: 2/29/31 94% - 0/0/30 0%
checkmate: White wins!
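The percentages in the status line come from the ratio of predicted moves to total moves for each player, guarded against division by zero before a player has moved. A minimal sketch of that arithmetic, mirroring the expression used in play_game (the helper name predicted_share is ours):

```python
def predicted_share(predicted, total):
    """Percentage of moves that came from the model rather than the fallback."""
    # "total or 1" avoids dividing by zero before the first move
    return round(predicted / (total or 1) * 100)


print(predicted_share(29, 31))  # 94
print(predicted_share(0, 30))   # 0
print(predicted_share(0, 0))    # 0
```

In the game above, 29 of White's 31 moves were valid model predictions, while only 2 of the 31 positions had appeared in the training file.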
When you're working with a small dataset, these chess matches often end up in a stalemate. While we aren't a hundred percent certain, this could also be because the player does not analyze candidate next moves and choose the best one.
Now let's play 100 games where the gpt2_player plays white:
from tqdm.auto import tqdm

plays = 100
white_wins = 0
black_wins = 0
pbar1 = None
pbar2 = None
for i in tqdm(range(plays), desc="Plays"):
    if not pbar1:
        pbar1 = tqdm(total=plays, desc="White wins")
    if not pbar2:
        pbar2 = tqdm(total=plays, desc="Black wins")
    result, _, _ = play_game(gpt2_player, random_player, visual=None)
    if result is None:
        # the game was interrupted
        break
    elif result == "1-0":
        white_wins += 1
        pbar1.update(1)
    elif result == "0-1":
        black_wins += 1
        pbar2.update(1)
pbar1.close()
pbar2.close()
print("Final score: %s-%s" % (white_wins, black_wins))
Final score: 52-0
Most often, the GPT-2 controlled player wins the game or draws. In this scenario, the GPT-2 controlled white player won more than half the games. We also noticed that the current board state was almost always new to the model.
The model also produced more valid moves than invalid ones. So we can conclude that the GPT-2 model was able to learn some basic patterns from the training data to predict the next move successfully.
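The final score lists only decisive games, so the remaining games can be derived from it. The sketch below assumes all 100 games ran to completion (the loop stops early if a game is interrupted) and that every non-decisive game was a draw. The helper name summarize is ours.

```python
def summarize(plays, white_wins, black_wins):
    """Break a series down into win rates and non-decisive games."""
    return {
        "white_win_rate": white_wins / plays,
        "black_win_rate": black_wins / plays,
        "draws": plays - white_wins - black_wins,
    }


print(summarize(100, 52, 0))
# {'white_win_rate': 0.52, 'black_win_rate': 0.0, 'draws': 48}
```

Under those assumptions, the GPT-2 player won 52 games, drew 48, and never lost to the random player.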
Let’s play the gpt2_player vs. Human Player:
The following function effectively handles human input into the game:
def human_player(board):
    uci = get_move("%s's move [q to quit]> " % who(board.turn))
    legal_uci_moves = [move.uci() for move in board.legal_moves]
    # keep asking until the input is a legal move in the current position
    while uci not in legal_uci_moves:
        print("Legal moves: " + (",".join(sorted(legal_uci_moves))))
        uci = get_move("%s's move [q to quit]> " % who(board.turn))
    return uci, True, False


def get_move(prompt):
    uci = input(prompt)
    if uci and uci[0] == "q":
        raise KeyboardInterrupt()
    try:
        chess.Move.from_uci(uci)
    except ValueError:
        # not a syntactically valid UCI move
        uci = None
    return uci
Do you want to play against gpt2_player?
If you do, note that you must enter your move in Universal Chess Interface (UCI) notation, which names the square a piece starts on followed by the square it moves to. This means that typing something like "a2a4" moves the piece located at a2 to a4.
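The structure of such a move string can be sketched in a few lines of plain Python. This is only an illustration with a hypothetical helper name (split_uci); the notebook itself relies on chess.Move.from_uci for validation.

```python
def split_uci(uci):
    """Split a UCI move into (from_square, to_square, promotion)."""
    if len(uci) not in (4, 5):
        raise ValueError("expected 4 or 5 characters, got %r" % uci)
    # an optional fifth character names the promotion piece (q, r, b or n)
    promotion = uci[4] if len(uci) == 5 else None
    return uci[0:2], uci[2:4], promotion


print(split_uci("a2a4"))   # ('a2', 'a4', None)
print(split_uci("e7e8q"))  # ('e7', 'e8', 'q')
```

The notation does not name the piece being moved; the starting square identifies it.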
play_game(human_player, gpt2_player)
pass
Move 10 Black, Play 'b7b6':
Known/Predicted/Total moves: 0/5/5 100% - 2/5/5 100%
In this experiment, we trained GPT-2 to learn and play a game of chess. Although it was nowhere near grandmaster level (at least not yet), it showed that it was capable of picking up the game's basics. With more training data and a larger model, it could theoretically go further and possibly beat human players.
An interesting observation was that the model itself behaved quite erratically when playing the random player who made arbitrary moves. However, when we added a human player, the model made more calculated and confident moves. This suggests that a better player creates board states that are similar to the training data.
We also noticed that the GPT-2 model engaged in natural text generation and confidently generated different types of textual patterns. It was also able to leverage its training data to successfully handle unknown input, almost like it developed a new internal algorithm.
If you'd like to run your own experiments, this notebook is available HERE.