Checkers, also known as draughts, is an ancient game that has remained popular for centuries. Like other classic board games, it has attracted the interest of computer scientists and AI researchers as a platform for experiments. One prominent method for teaching computers to play checkers is the genetic algorithm, which is based on the survival-of-the-fittest principle that drives evolution.
In this article, we'll explore the intersection of checkers and genetic algorithms, demonstrating how evolutionary computational techniques can be employed to develop proficient checkers-playing agents.
I covered checkers in some detail in my previous article, but let's recall the basic rules: the game is played on a 10x10 board, with each player starting with 20 pieces.
The goal is simple: capture all of your opponent's pieces or trap them so they cannot move. The primary strategies revolve around positioning, defense, offense, and foreseeing opponent moves.
Genetic algorithms (GAs) are optimization algorithms based on the process of natural selection. They mimic the evolutionary process observed in nature, comprising selection, crossover (recombination), and mutation.
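To make that loop concrete, here is a minimal sketch of a single generation, assuming some fitness function and the crossover and mutate helpers that appear later in this article:
import { crossover, mutate } from "shared";
/**
 * Minimal sketch of one GA generation (the real training loop appears later in the article)
 */
function evolveOneGeneration(
population: Float32Array[],
fitness: (genome: Float32Array) => number
): Float32Array[] {
/**
 * Selection: rank genomes by fitness and keep the best quarter
 */
const ranked = [...population].sort((a, b) => fitness(b) - fitness(a));
const parents = ranked.slice(0, Math.max(2, Math.floor(population.length / 4)));
/**
 * Crossover + mutation: refill the population from random pairs of parents
 */
const next = [...parents];
while (next.length < population.length) {
const parentA = parents[Math.floor(Math.random() * parents.length)];
const parentB = parents[Math.floor(Math.random() * parents.length)];
next.push(mutate(crossover(parentA, parentB)));
}
return next;
}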
This article will use a simple model with fully-connected layers. A layer is a collection of neurons that process inputs and produce outputs. These outputs are then used as the final prediction in the case of the last layer (output layer) or passed as input to the next layer in the network.
Each layer contains a set of weights, biases, an input and output size, and an activation function.
First, we have to figure out how to represent a potential solution, such as a checkers strategy, in a way that works with a genetic algorithm. In our case, we'll take all the possible moves for the current player, score each one with the neural network, and then pick the move with the highest score, as sketched below.
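A simplified sketch of that idea (the full implementation lives in the agent code further down; encode here is a hypothetical helper that turns the board plus a candidate move into the network's input vector):
import { Network, getMoves, FEN } from "shared";
/**
 * Score every legal move with the network and pick the best one (sketch)
 */
function chooseMove(
network: Network,
gameState: FEN,
encode: (state: FEN, move: { from: number; to: number }) => Float32Array
): string {
const moves = getMoves(gameState);
let best = moves[0];
let bestScore = -Infinity;
for (const move of moves) {
const score = network.predict(encode(gameState, move))[0];
if (score > bestScore) {
bestScore = score;
best = move;
}
}
return `${best.from}-${best.to}`;
}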
We also need to calculate a game score so we can choose the best agents in the population. In our case, the result of a game is a base of 250 points, plus 3 points for every man and 7 points for every king left on the board, minus the number of moves played; the value is positive if white won and negative if black won.
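For example, if a game lasts 60 moves and ends with five men and one king still on the board, the result is 250 + 5×3 + 7 − 60 = 212 points, credited to the winner and counted as −212 for the loser (a made-up illustration of the formula implemented in playMatch below).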
For each population, we run a tournament (a simplified version of a Swiss-system tournament). After each tournament, we keep the top quarter of the population for the next round of selection.
If an agent lands in that top quarter (and is therefore carried over to the next generation) in more than one tournament, we mark it as a top player. After all the epochs are finished, we run a final tournament between the top players and save the best one.
The crux of the algorithm lies in evolving the population of strategies.
There's a risk of the algorithm getting stuck in a local optimum, where the strategy it settles on is only the best within a small region of the solution space.
While GAs can be fast, simulating thousands or millions of games requires significant computational resources.
There's a risk that the evolved strategy becomes too tailored to the training scenarios and performs poorly against unseen strategies or tactics.
I'm going to use Node.js.
Why not Python?
I know that Python would probably be more efficient, but Node.js, which I am more familiar with, can easily be used on both the client side and the server side.
Why not TensorFlow.js (or whatever other library is dear to you)?
Initially, I planned to use such a library, but then I decided that since this project is primarily educational, I could build a minimal viable neural network myself and leave room for further improvements (for example, using WebAssembly for the calculations).
Almost any neural network is a set of layers: each layer receives some data as input, multiplies it by a set of weights, and returns some data as output.
For clarity, imagine you are looking for your dream home. You've found a promising house, and now you ask your friends what they think about it, and each of them gives an answer on, say, a 10-point scale.
You trust each friend's opinion to a certain extent, but you have to ask everyone, even those you don't particularly trust, so as not to offend anyone. So you sit down with your friends over your favorite drinks, look at different houses, and everyone says how much they like each one.
In the end, you know what your friends think about each house, and you make the final decision based on the overall result.
A neural network consists of just such “friends”, but the number of groups of friends can be quite large, and each group relies on the opinion of the previous one.
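More formally, each “group of friends” (a layer) computes, for every output neuron i: output[i] = activation(Σj input[j] × weights[i][j] + bias[i]); this is exactly what the predict method in the code below does.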
A schematic representation of the neural network:
import {createEmpty, createNew} from "./genetic";
export enum Activation {
Sigmoid,
Relu,
}
class Layer {
weights: Float32Array;
biases: Float32Array;
inputCount: number;
outputCount: number;
size: number;
activation: Activation;
/**
* Create a new layer
* @param inputCount
* @param outputCount
* @param activation
*/
constructor(
inputCount: number,
outputCount: number,
activation: Activation = Activation.Relu
) {
this.inputCount = inputCount;
this.outputCount = outputCount;
this.weights = createNew(inputCount * outputCount, 2);
this.biases = createNew(outputCount, 2);
this.activation = activation;
this.size = inputCount * outputCount + outputCount;
switch (activation) {
case Activation.Sigmoid:
this.activate = this.activateSigmoid.bind(this);
break;
case Activation.Relu:
this.activate = this.activationRelu.bind(this);
break;
}
}
/**
* Activation function (identity)
* @param val
*/
activate(val: Float32Array) {
return val;
}
/**
* Activation function (sigmoid)
* @param val
*/
activateSigmoid(val: Float32Array) {
return val.map((v) => 1 / (1 + Math.exp(-v)));
}
/**
* Activation function (relu)
* @param val
*/
activationRelu(val: Float32Array) {
return val.map((v) => Math.max(0, v));
}
/**
* Predict an output
* @param inputs
*/
predict(inputs: Float32Array) {
let result = createEmpty(this.outputCount);
for (let i = 0; i < this.outputCount; i++) {
for (let j = 0; j < this.inputCount; j++) {
result[i] += inputs[j] * this.weights[i * this.inputCount + j];
}
result[i] += this.biases[i];
}
return this.activate(result);
}
/**
* Get the weights of the layer
*/
getWeights() {
return Float32Array.from([...this.weights, ...this.biases]);
}
/**
* Get the current layer topology
*/
getTopology() {
return new Float32Array([
this.inputCount,
this.activation,
this.outputCount,
]);
}
/**
* Set the weights of the layer
* @param weights
*/
setWeights(weights: Float32Array) {
this.weights = weights.slice(0, this.weights.length);
this.biases = weights.slice(this.weights.length);
}
}
export class Network {
network: Layer[] = [];
inputs: number;
outputs: number;
/**
* Create a new network
* @param inputs
* @param outputs
* @param layer
*/
constructor(inputs: number, outputs: number, layer: (number[] | Layer)[]) {
this.inputs = inputs;
this.outputs = outputs;
for (const layerSize of layer) {
const l =
layerSize instanceof Layer
? layerSize
: new Layer(layerSize[0], layerSize[2], layerSize[1]);
this.network.push(l);
}
}
/**
* Predict an output
* @param input
*/
predict(input: Float32Array) {
return this.network.reduce((acc, layer) => layer.predict(acc), input);
}
/**
* Get topology for whole network
*/
getTopology() {
return new Float32Array(
[
this.inputs,
this.outputs,
this.network.length,
...this.network.map((layer) => [...layer.getTopology()]),
].flat()
);
}
/**
* Get the weights of the network
*/
getWeights() {
return this.network.reduce((acc, layer) => {
return new Float32Array([...acc, ...layer.getWeights()]);
}, new Float32Array([]));
}
/**
* Set the weights of the network
* @param weights
*/
setWeights(weights: Float32Array) {
let offset = 0;
for (const layer of this.network) {
layer.setWeights(weights.slice(offset, offset + layer.size));
offset += layer.size;
}
}
/**
* Get the size of the network
*/
size() {
return this.network.reduce((acc, layer) => acc + layer.size, 0);
}
/**
* Serialize the network
*/
toBinary() {
const topology = this.getTopology();
const weights = new Float32Array(topology.length + this.size());
weights.set(this.getTopology());
weights.set(this.getWeights(), topology.length);
return Buffer.from(weights.buffer);
}
/**
* Create a network from a binary representation
* @param buffer
*/
static fromBinary(buffer: Float32Array) {
const inputs = buffer[0];
const outputs = buffer[1];
const length = buffer[2];
const layers = Array.from({ length }).map((_, i) => {
const start = 3 + i * 3;
const end = start + 3;
const topology = buffer.subarray(start, end);
return new Layer(topology[0], topology[2], topology[1]);
});
const network = new Network(inputs, outputs, layers);
network.setWeights(buffer.subarray(3 + length * 3));
return network;
}
}
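A small usage sketch of the classes above (the import path and the layer sizes here are assumptions for illustration, not part of the project):
import { Network, Activation } from "./network";
/**
 * 100 inputs -> a hidden layer of 16 neurons (ReLU) -> 1 output (sigmoid).
 * Each layer is described as [inputs, activation, outputs].
 */
const net = new Network(100, 1, [
[100, Activation.Relu, 16],
[16, Activation.Sigmoid, 1],
]);
/**
 * The prediction is a Float32Array with a single value in (0, 1),
 * random until the weights are trained
 */
const input = new Float32Array(100).map(() => Math.random());
console.log(net.predict(input)[0]);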
An agent is an entity that decides what to do next and receives rewards or penalties for its actions. In other words, an agent is like an ordinary employee who makes decisions at work and is rewarded for them.
As part of our task, the agent contains a neural network and selects the move that is best from the network's point of view.
Like any good employee, our agent also remembers the results of its work and reports the average score to its superiors (the genetic algorithm), which makes the final decision about this employee.
import { Keccak } from "sha3";
import { Network, Agent, createEmpty, getMoves, FEN } from "shared";
export class AgentTrainer implements Agent {
historySize: number;
history: Float32Array;
id: string;
private _games: Set<string> = new Set();
games: number = 0;
wins: number = 0;
score: number = 0;
minScore = +Infinity;
maxScore = -Infinity;
age: number = 0;
network: Network;
taken: boolean = false;
_player: "B" | "W" = "W";
/**
* Create a new agent
* @param historySize
* @param buf
*/
constructor(historySize: number, buf: Float32Array) {
this.historySize = historySize;
this.network = Network.fromBinary(buf);
this.id = new Keccak(256).update(Buffer.from(buf.buffer)).digest("hex");
this.history = createEmpty(this.historySize);
}
/**
* Create a new epoch
*/
onNewEpoch() {
this.age += 1;
this.score = 0;
this.games = 0;
this._games = new Set();
this.maxScore = -Infinity;
this.minScore = +Infinity;
this.wins = 0;
this.reset();
}
/**
* Check if the player has played against the opponent
* @param player
*/
hasPlayedBefore(player: AgentTrainer) {
if (this.id === player.id) {
return false;
}
return this._games.has(player.id);
}
/**
* Set the result of a match
* @param score
* @param opponent
*/
setResult(score: number, opponent: AgentTrainer) {
this._games.add(opponent.id);
this.games += 1;
this.score += score;
this.minScore = Math.min(this.minScore, score);
this.maxScore = Math.max(this.maxScore, score);
if (score > 0) {
this.wins += 1;
}
}
/**
* Calculate the average score
* @returns number
*/
getAverageScore() {
return this.score / this.games;
}
/**
* Get the weights of the network
*/
getWeights() {
return this.network.getWeights();
}
getTopology() {
return this.network.getTopology();
}
/**
* Serialize the weights of the network
*/
serialize() {
return this.network.toBinary();
}
/**
* Reset history
*/
reset() {
this.history = new Float32Array(this.historySize);
this.taken = false;
}
toString() {
return `${this.id} with ${String(this.score).padStart(
6,
" "
)} points min: ${String(this.minScore).padStart(6, " ")} max: ${String(
this.maxScore
).padStart(6, " ")} avg: ${String(
this.getAverageScore().toFixed(2)
).padStart(9, " ")} ${((this.wins / this.games) * 100)
.toFixed(2)
.padStart(6, " ")}%`;
}
setPlayer(player: "B" | "W") {
this._player = player;
}
/**
* Calculate moves and return the best one
* @param gameState
* @returns
*/
getMove(gameState: FEN): string {
const board = new Float32Array(50);
const wMul = this._player === "W" ? 1 : -1;
/**
 * FEN squares are numbered 1..50, so shift them to 0-based indices
 */
for (let i = 0; i < gameState.white.length; i++) {
let isKing = gameState.white[i].startsWith("K");
let pos = isKing
? parseInt(gameState.white[i].slice(1), 10)
: parseInt(gameState.white[i], 10);
board[pos - 1] = wMul * (isKing ? 2 : 1);
}
for (let i = 0; i < gameState.black.length; i++) {
let isKing = gameState.black[i].startsWith("K");
let pos = isKing
? parseInt(gameState.black[i].slice(1), 10)
: parseInt(gameState.black[i], 10);
board[pos - 1] = -1 * wMul * (isKing ? 2 : 1);
}
this.history = new Float32Array([...board, ...this.history.slice(50)]);
const value = new Float32Array(this.network.inputs);
value.set(new Float32Array(50));
value.set(this.history, 50);
let pos = 0;
let posVal = -Infinity;
const moves = getMoves(gameState);
for (let i = 0; i < moves.length; i += 1) {
/**
* Create a new value for move
*/
const move = moves[i];
const val = value.slice();
val[move.from - 1] = -1;
val[move.to - 1] = 1;
const result = this.network.predict(val);
/**
* If the result is better than the previous one, save it
*/
if (result[0] > posVal) {
pos = moves.indexOf(move);
posVal = result[0];
}
}
/**
* Return the best move in the format from-to
*/
return `${moves[pos].from}-${moves[pos].to}`;
}
}
To evaluate the results of their work, we need to compare two agents under equal conditions: each of them plays each color, and the final result is summed up.
To avoid overloading the network with unnecessary information, each agent sees the board as if it were playing white.
import { Draughts } from "@jortvl/draughts";
import { Player, Position, getFen } from "shared";
import { AgentTrainer } from "./agent";
export function playMatch(white: AgentTrainer, black: AgentTrainer) {
const draughts = new Draughts();
white.setPlayer(Player.White);
black.setPlayer(Player.Black);
while (!draughts.gameOver()) {
/**
* Get current player
*/
const player = draughts.turn() === Player.White ? white : black;
/**
* Get the move from the player
*/
const move = player.getMove(getFen(draughts.fen()));
draughts.move(move);
}
/**
* Calculate the score
*/
const [winner, ...left] = draughts.position().split("");
const score =
250 +
left.reduce((acc: number, val: string) => {
switch (val) {
case Position.Black:
case Position.White:
return acc + 3;
case Position.BlackKing:
case Position.WhiteKing:
return acc + 7;
default:
return acc;
}
}, 0) -
draughts.history().length;
/**
* Set the result, if white won, the score is positive; if black won, the score is negative
*/
return winner === Player.White ? score : score * -1;
}
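As a usage sketch, comparing two serialized networks might look like this (compareAgents is a hypothetical helper written for illustration; the history size of 50 assumes one board of history, as in the training code below):
import { AgentTrainer } from "./agent";
import { playMatch } from "./play-match";
/**
 * Compare two serialized networks (topology + weights) under equal conditions (sketch)
 */
export function compareAgents(bufA: Float32Array, bufB: Float32Array): number {
const a = new AgentTrainer(50, bufA);
const b = new AgentTrainer(50, bufB);
/**
 * Each agent plays both colors; the reverse match is negated
 * so the sum stays from agent a's point of view
 */
const score = playMatch(a, b) - playMatch(b, a);
a.setResult(score, b);
b.setResult(-score, a);
return score;
}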
At each training epoch, we have a set of experimental agents that we pit against each other. Because playing every agent against every other agent would take far too long and be ineffective, we use a simplified chess-tournament algorithm.
At each round, we take the list of players sorted by points and pair them as follows: the first player plays against the first free opponent from the middle of the list; if they have already played each other, we pick the next one.
import {Agent} from "./agent";
import {playMatch} from "./play-match";
export function playTournament(playerList: Agent[]) {
let d = Math.floor(playerList.length / 2);
/**
* Count rounds
*/
let rounds = Math.ceil(Math.log2(playerList.length)) + 2;
for (let i = 0; i < rounds; i += 1) {
for (let j = 0; j < d; j += 1) {
let dj = d;
/**
* Find the next opponent
*/
let found = false;
while (dj < playerList.length && !found) {
if (playerList[dj].hasPlayedBefore(playerList[j]) || playerList[dj].games > i) {
dj += 1;
} else {
found = true;
}
}
if (found) {
let score = 0;
/**
* Play the match
*/
score += playMatch(playerList[j], playerList[dj]);
/**
* Play the reverse match
*/
score += playMatch(playerList[dj], playerList[j]) * -1;
playerList[j].setResult(score, playerList[dj]);
playerList[dj].setResult(-1 * score, playerList[j]);
}
}
playerList.sort((player1, player2) => player2.score - player1.score);
console.log('round', i, playerList[0].id, playerList[0].score.toFixed(1).padStart(6, ' '));
}
return playerList;
}
This is where all the magic of the genetic algorithm happens. Existing models are loaded, and new ones are created.
From these models, we get the genomes for the agents that will play in our game. These genomes change as training progresses: agents with the lowest scores are discarded, while agents with the highest scores pass their genes on to new generations and are compared against them.
Create a new genome:
A new set of genes (weights) is created within a given numerical range, with no inherited traits; this is where everything begins.
export function createNew(size: number, delta = 4): Float32Array {
return new Float32Array(size).map(() => Math.random() * delta - (delta / 2));
}
Crossover:
The mechanism is quite simple: we take two sets of genes (weights) from two agents and, with a given probability, take each gene from either the first or the second agent. The genes are thus mixed, and we get a new network with some knowledge from both parents.
This is exactly how everything works in real nature; each of us has genes from each of our parents.
export function crossover(first: Float32Array, second: Float32Array, prob = 0.25): Float32Array {
return new Float32Array(first.map((w, i) => Math.random() < prob ? second[i] : w))
}
Mutate:
The mutation function works as follows: we take a set of genes and, with some probability, add a little chaos to some of them. Again, in real life it works much the same way: each of us has genes that our parents don't have; otherwise, there would be no diseases or other puzzling processes in our bodies.
export function mutate(master: Float32Array, prob = 0.25, delta = 0.5): Float32Array {
return new Float32Array(master.map(w => Math.random() < prob ? w + (Math.random() * delta - (delta / 2)) : w))
}
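Put together, producing a child genome from two parents is just a composition of these two functions, which is exactly the combination the training loop below uses (a small sketch, assuming the helpers are exported from the shared package as in the rest of the code):
import { crossover, mutate } from "shared";
/**
 * Produce a child genome (network weights) from two parent genomes:
 * mix the parents' genes, then add a little random noise
 */
function makeChild(parentA: Float32Array, parentB: Float32Array): Float32Array {
return mutate(crossover(parentA, parentB));
}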
As a result of each tournament, we have a certain number of agents that turned out to be a little luckier than the others (genetic algorithms are modeled on our lives, where some are luckier and some are less so).
In the end, we take the list of these lucky ones and compare them against each other to find the best genome for our game; so that the result isn't lost, we must remember to save this genome.
Once we have enough such genomes, we can build new populations of agents from them, and each of those genomes already has some understanding of what it needs to do.
import * as fs from "fs";
import { playTournament } from "./play-tournament";
import { Network, createNew, crossover, mutate } from "shared";
import { playBattle } from "./play-battle";
import { AgentTrainer } from "./agent";
export async function tournaments(
historySize: number,
layers: number[] = [],
epoch = 64,
population = 32,
best = false
) {
const modelName = `${(historySize + 1) * 50}_${layers.join("_")}`;
const modelPath = `../models/${modelName}`;
/**
* Create the model if it does not exist
*/
if (!fs.existsSync(modelPath)) {
fs.mkdirSync(modelPath, { recursive: true });
}
let topPlayerList = [];
const topPlayerIds = new Set();
const bestModels = new Set();
let playerList: AgentTrainer[] = [];
const baseLayers = [];
let inp = historySize * 50 + 50;
for (let i = 0; i < layers.length; i++) {
baseLayers.push([inp, 1, layers[i]]);
inp = layers[i];
}
const baseNet = new Network(historySize * 50 + 50, 1, baseLayers);
const topologySize = baseNet.getTopology().length;
const size = baseNet.size() + topologySize;
/**
* Load the best models
*/
if (best) {
const weightFiles = fs
.readdirSync(modelPath)
.filter((file) => file.endsWith(".bin"));
for (const weightFile of weightFiles) {
const buf = fs.readFileSync(`${modelPath}/${weightFile}`);
const weights = new Float32Array(buf.buffer);
const agent = new AgentTrainer(historySize * 50, weights);
agent.age = 1;
bestModels.add(agent.id);
playerList.push(agent);
topPlayerList.push(agent);
topPlayerIds.add(agent.id);
}
const d = playerList.length;
let ind = 0;
/**
* Create new players by crossover and mutation from the best models.
* For the initial population, we need to ensure greater genetic diversity than in subsequent populations.
* This way we get a larger number of potentially viable models, from which subsequent generations will be built.
*/
if (d > 1) {
while (playerList.length < Math.max(population, d * 2)) {
const playerA = playerList[ind];
const playerB = playerList[Math.floor(Math.random() * d)];
if (playerA && playerB && playerA.id !== playerB.id) {
const newWeights = mutate(
crossover(playerA.getWeights(), playerB.getWeights())
);
const weights = new Float32Array(size);
weights.set(baseNet.getTopology());
weights.set(newWeights, topologySize);
const agent = new AgentTrainer(historySize * 50, weights);
playerList.push(agent);
ind += 1;
ind = ind % d;
}
}
}
}
/**
* Create the initial population
*/
while (playerList.length < population) {
const w = createNew(baseNet.size(), 2);
const weights = new Float32Array(size);
weights.set(baseNet.getTopology());
weights.set(w, topologySize);
const agent = new AgentTrainer(historySize * 50, weights);
playerList.push(agent);
}
/**
* Run the initial championship
*/
playerList = await playTournament(playerList);
console.log(
`0 ${playerList[0].id} (${playerList[0].age}) with ${playerList[0].score} points`
);
let currentEpoch = 0;
while (currentEpoch <= epoch) {
/**
* Keep the best 25% of the population
*/
playerList = playerList.slice(0, Math.floor(population / 4));
for (const player of playerList) {
player.onNewEpoch();
/**
* if the player is in the top 25% and has played at least one tournament, add it to the top players
*/
if (player.age > 1 && !topPlayerIds.has(player.id)) {
topPlayerIds.add(player.id);
topPlayerList.push(player);
console.log("add top player", player.id, topPlayerList.length);
}
}
const d = playerList.length;
/**
* Create new players by crossover and mutation
*/
let ind = 0;
while (playerList.length < population) {
const playerA = playerList[ind];
const playerB = playerList[Math.floor(Math.random() * d)];
if (playerA && playerB && playerA.id !== playerB.id) {
const newWeights = mutate(
crossover(playerA.getWeights(), playerB.getWeights())
);
const weights = new Float32Array(size);
weights.set(baseNet.getTopology());
weights.set(newWeights, topologySize);
const agent = new AgentTrainer(historySize * 50, weights);
playerList.push(agent);
ind += 1;
ind = ind % d;
}
}
/**
* Run the championship
*/
playerList = await playTournament(playerList);
currentEpoch += 1;
console.log(
`${currentEpoch} ${playerList[0].id} (${playerList[0].age}) with ${playerList[0].score} points`
);
}
/**
* Add the top players to the list from championship
*/
for (const player of playerList) {
if (player.age > 1 && !topPlayerIds.has(player.id)) {
topPlayerIds.add(player.id);
topPlayerList.push(player);
console.log("add top player", player.id, topPlayerList.length);
}
}
console.log("-----");
console.log(topPlayerList.length);
console.log("-----");
/**
* Reset agents
*/
for (const player of topPlayerList) {
player.onNewEpoch();
}
/**
* Run the final championship
*/
topPlayerList = await playBattle(topPlayerList);
let index = 1;
for (const player of topPlayerList) {
const code = bestModels.has(player.id) ? "\x1b[32m" : "\x1b[36m";
const reset = "\x1b[m";
console.log(
`${code}${String(index).padStart(4, " ")} ${player.toString()}${reset}`
);
index += 1;
}
/**
* Save the best player
*/
while (topPlayerList[0] && bestModels.has(topPlayerList[0].id)) {
/**
* Remove the best player if it is already in the best models
*/
console.log("remove", topPlayerList[0].id);
topPlayerList.shift();
}
if (topPlayerList[0]) {
let player = topPlayerList[0];
console.log(`${player.score} ${player.id}`);
const weights = player.serialize();
console.log(weights.length, weights.length / 4);
fs.writeFileSync(`${modelPath}/${player.id}.bin`, weights);
}
}
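A call to this training entry point might look like this (the import path and the concrete values are illustrative assumptions, not the configuration used for the results below):
import { tournaments } from "./tournaments";
/**
 * One board of history, a hidden layer of 16 neurons plus a single output neuron,
 * 64 epochs with a population of 32, starting from freshly generated genomes
 */
tournaments(1, [16, 1], 64, 32, false).catch(console.error);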
For the results, I used the most recent models, and this is what came out:
When two neural network models play against each other, they show quite good results, although they sometimes make illogical moves.
When the neural network plays against an alpha-beta search with a depth of one ply, it has a fairly good chance of winning.
Against an alpha-beta search with a depth of two plies, the neural network had no chance and lost every game.
I haven't looked at greater depths yet because it seems pointless. Perhaps after more training games it could produce more acceptable results, or if the neural network were trained not against the same kind of networks but against an alpha-beta search agent.
Applying genetic algorithms to the game of checkers exemplifies the transformative potential of biologically-inspired computation. While traditional game-playing algorithms like Minimax and its variants have proven effective, the evolutionary and adaptive nature of GAs offers a fresh perspective.
As with biological evolution, the strategies developed through this method may not always be the fittest right away. Still, given enough time and the right conditions, they can evolve into formidable opponents, demonstrating the power of nature-inspired computation.
Whether you're a checkers enthusiast, an AI researcher, or just someone fascinated by the convergence of the old and the new, there's no denying that the melding of this age-old game with cutting-edge algorithms is an exciting frontier in artificial intelligence.
As usual, all the code is available on GitHub.