Although it was a lot of fun to write a game, my main intention in tackling this project was less to make something anyone would actually bother to play and more to learn as many new technologies and techniques as possible (it’s a happy coincidence that it did end up pretty cool!). We delved into socket.io to enable real-time play and chat features, got our hands dirty learning the ins and outs of hexagon grids and SVG rendering for the board, spent a fair amount of time designing a server-side caching system that never quite came to fruition, and ultimately deployed our Dockerized project to an Amazon AWS EC2 to get familiar with all this micro-services stuff everyone’s talking about these days. The real joy for me, though, was in designing the game logic.
Of that game logic, there are three elements that I worked on directly and that I’m particularly proud of: one, our complex combat algorithm; two, our (admittedly super simple) procedural generation algorithm to create dynamic boards; and three, Hexbot, the rudimentary AI that you can play against in single player. This article is about him.
Picking An Approach
First, a quick disclaimer: this is the (long) account of a single naive solution to the infinitely complex challenges of game AI, and how I produced a working solution under time pressure. It may not have ultimately been entirely optimal, and I welcome any and all feedback, but this was my first stab and I hope it’s instructive for you! If you’re thinking about implementing your own AI algorithm or are just curious what kind of decisions a bot player in a game might be making, please read on.
That being said, I didn’t really know where to start when designing Hexbot. Most of my preliminary research focused around machine learning and TensorFlow, which I quickly realized was out of scope for a total newbie on a project with a one month time limit — even if I had managed to learn the platform, I didn’t have the time or resources to train up the bot itself. I spent a lot of time reading about finite state machines, path-finding, and A* search, but none of those felt quite right for my needs either, for various reasons beyond the scope of this article.
Finally, I happened upon the minimax algorithm with alpha-beta pruning, the algorithm used by Deep Blue to beat the world’s best chess players. In a nutshell, this strategy looks as many moves into the future as possible, creates a tree out of every possible board state, and then, using a heuristic which assigns positive values to moves that are advantageous for the AI player and negative values which are advantageous to its opponent, uses a bit of game theory to pick the move which has the best consequent board states, pruning any branches that would lead to sub-optimal results in order to save time as it goes.
For about a week, I believed that minimax was the right way to handle my problem, and indeed, its implementation seemed like something I could wrap my mind around. Ultimately, it’s a lot like how a human plays a game, just with access to more perfect and far-reaching information. When I actually sat down to get it working, however, it didn’t take long before I discovered two problems.
Firstly, I didn’t want Hexbot to be too smart. I wanted him to be engaging and fun to play against above all else, and in order for that to be true, players have to have a fighting chance. Losing to Deep Blue, or really losing in any game where you were never even a contender, is not a terribly fun experience. Good game AI needs to exist in that precarious middle ground where, yes, it’s smart and does clever things, but is also predictable and organically gives rise to interesting game states. Whether or not I succeeded on that front with Hexbot is a subjective question, but it was definitely something I thought about, and a reason I decided not to pursue minimax, which I thought might inadvertently result in a little too perfect a player.
In a perfect world, an adapted minimax implementation may still have been the way to approach this problem, but with the time I had available, I would need to find something more streamlined.
So, I decided to pivot. In my research, I had stumbled across this awesome GDC video in which the developers of 2014’s spectacular Shadow of Mordor discuss what they call “goal-oriented action planning,” or GOAP. To quickly and radically oversimplify, the thrust of this strategy is to reduce an AI to a set of world-state-dependent actions which are used to decide on an optimal path towards meeting a given goal.
In contrast to a simple finite state machine, which is merely a set of interconnected, discrete AI states triggered by changes in overall world state, an AI that uses GOAP formulates a goal, looks at the tools it has available, and then decides which ones, and in what order, would be best suited to accomplish that goal. It can also alter that strategy on the fly.
GOAP still seemed a bit too complex for my needs, since ultimately Hexbot could really only do a few things (purchase, move, deploy, attack), but it did help orient me towards a way of thinking that would eventually lead to my implementation of Hexbot: that is,
reduce each potential move down to its atomic elements, and all you really have is a set of composable sub-moves which can easily be assigned arbitrary point values and executed as necessary to reached a desired outcome.
If you then relate those sub-moves to the goal of eliminating the enemy’s units, with an awareness of the state of the game at any moment in time, and then simply evaluate which series of sub-moves ultimately yields the best numerical result, you have a radically tooled down but still effective and predictable AI. As a plus, you can also tweak its priorities by simply altering the heuristic it uses to generate values.
So! Enough theoretical stuff. What did all of this actually look like in practice?
The Code Part One: Boilerplate
If you’re interested in seeing how the whole game fits together, I encourage you to take a look at the repo, but let’s just take a zoomed in look at things to understand what makes our bot tick. First and foremost, Hexbot is actually just a single function that relies on a couple of utility functions. If you choose to play a game against him, the function that evaluates the end of every player turn will invoke the AI at the appropriate place. Not exactly rocket science so far!
Our Hexbot file is where all the magic happens. It’s a doozy, with almost 500 lines of code, plus another 100 of utility functions, but understanding the data flow isn’t really too bad. Indeed, the overwhelming majority of it is just collecting all of our data and constructing projections for the next two turns. Ultimately, all we’re doing is finding all current possible moves for the bot, as well as all possible subsequent moves opened up by each of those moves, and evaluating them using a heuristic which will assign a number value and determine what the bot’s action is going to be. Piece of cake.
The first thing we see in the file is a constant that represents all the hex relationships. Understanding coordinates and spatial geometry is a bit more challenging on a hex grid than in an a standard Cartesian coordinate system (I highly recommend this Red Blob article if you want more background), and although one of our project’s stretch goals was to add variable board sizes, we never quite got there. So as it stands, we simply hard-coded an object that lists out all 17 hexes and their neighbors.
Next, we have the hexbot function itself (aka the rest of the file). The first thing we do is grab/create all of the variables we’re going to need throughout this process. A few of them are really only related to housekeeping/passing around application state that the bot will need access to, but they all help us get an understanding of what it is we’re actually going to be doing here, so let’s take a look:
Socket and room are part of that housekeeping I mentioned. boardState is an array of objects, and is what actually represents what’s on the board at any given moment — we’ll be using it as our starting point. The unit totals, resources, and unit bank variables all similarly pull out individual pieces of salient application state.
Starting on line 35, though, we’ve got the most interesting stuff, and can start to understand our overall strategy: we have three objects, which we use to store 1) the possible moves the bot can make this turn (possibleMoves), 2) the possible moves the bot could make next turn (possibleNextTurnMoves), and 3) the heuristic values of each of the moves available this turn (moveValues).
It may seem strange that we don’t have a possibleNextTurnMovesValues object, but we’re not actually going to be evaluating those in isolation, since they don’t directly affect what the bot is going to do this turn. Instead, for each of the moves we can make this turn, we’re going to look at all of its consequent moves and simply use them as modifiers to that move’s values. The same is true of a human player considering her moves in any board game: you can only move this turn, but you always consider the possibility space that this turn’s move will open up, and that possibility space affects the weight of your decision this turn.
The other three variables that are truly of interest to our process are bestMove, worstSecondaryThreat, and purchase. The third of these, purchase, is the least interesting: it’s simply a storage value where we’ll keep track of what purchase (if any) will be made as part of the current best move! bestMove itself, on the other hand, will be a running record of whatever the best move is this turn, represented by its coordinates and a numeric value we’ll generate with our heuristic. We will ultimately use this array to actually execute the bot’s movement. worstSecondaryThreat is similar, except we’ll be using it to determine whether or not the bot should make any purchases to deal with encroaching enemy forces this turn that may become a problem next turn — the reason we don’t need a worstThreat variable is that immediate threats will already have been evaluated directly, as they themselves are possible moves.
All three variables will be overwritten multiple times at the end of our program as we loop through each possible move and discover alternatives with higher heuristic values. Notice that both bestMove and worstSecondaryThreat are initialized with values of negative infinity: any and all moves, with the exception of an immediate game lose-state, will return a numeric value that, even if negative, will be preferable to simply staying still.
The Code Part Two: Collecting Our Data
Next up, we’ve got some loops. So. Many. Loops.
The idea for phase one, as alluded to above, is just to walk through the board and collect information on its layout. We are only really interested in each hex where the bot currently has units plus its neighbors and secondary neighbors, so we’ll only investigate a hex if its occupied by player two, which will always be our bot in an AI game. We’ll also hold onto that hex for convenience in a storage object for our bot hexes. Also of note: we’re not actually storing the entire hex, just a reference to its index, since we can always just refer back to the board state to see what’s actually on that hex if necessary.
Next, whenever we find a bot hex, we want to walk through each neighbor of that hex to determine if it’s occupied by an enemy and if it has resources (and if so, which type). We are going to store these hexes in a list of immediate threats and a list of resource hexes — here’s the tricky part — relative to the bot-controlled hex we’re currently examining. It’s critical to preserve this relative relationship, because if we have multiple bot hexes on the board, we’ll want to know which one is threatened by a particular enemy force or is next to a resource, not just that there’s a neighbor of interest somewhere on the board. That is to say, if the AI controls hex six and hex seventeen and there are enemies on hex five, we want to know that hex five is a threat specifically to hex six, not just a threat in general. We’ll apply this pattern elsewhere, as well, so get your head around it!
Now that we are accounting for all bot hexes and all immediate neighbors, we also want to take into account secondary neighbors relative to each of those neighbor hexes. This process will be very similar to the above; the only difference is that we’ll collect our possibilities in an object for possibleNextTurnMoves and all of our storage is going to be nested one level deeper than before. Each of the up to six neighbor hexes of a given hex has six potential neighbors itself, and we want to store every one of those potential neighbors as part of the possibility space of the original neighbor in question… if your head is spinning by this point, that’s okay, the syntax is a little much, but the core concept, the idea that every secondary move is a consequence of a given move and we want to preserve that entire chain, current hex -> its immediate neighbors -> neighbors of those neighbors, is what’s important.
The Code Part Three: Begin Analysis
Cool! So, for those keeping score, we’ve collected all of our neighbor and secondary neighbor relationships, including resource and enemy placement. All that information is ripe for analysis! It’s time to — you guessed it — write even MORE loops to iterate over all of our options and begin our calculations. Here’s the general plan from 30,000 feet:
- Run mock versions of all potential combats, both this turn and next turn, using present unit totals. Return the results and store them as potential moves.
- Process those potential moves. For any combat that currently results in a tie or a loss, check if there is a purchase that the bot can make right now, or in the future, that will turn that loss into a tie or even a victory. Update the existing entry for each of these moves, storing the new outcomes and any purchases if necessary.
- For every possible move, perform our heuristic operation to get a number value and store it in our moveValues object. This heuristic will take into account the value of the move itself and its consequent moves, i.e., the moves that will be possible next turn if that move is picked.
- Once you’ve processed all possible moves, make the best one!
Alright, that’s not so bad! I’ll spare you a look at the combat simulation algorithms (points one and two), since they’re just chock full of variables and would require an explanation of the combat system as well, but let’s take a look at the function that actually walks through the possible moves once they’ve been given combat values and runs the heuristic. It’s pretty dang simple:
As a reminder, moveValues will be an object with keys that represent all currently bot-owned hexes and values that are nested objects with all possible moves from that hex and their related numerical scores. A sample object where the bot has units at hex seventeen, with options to move to hexes thirteen and sixteen at values of twenty-five and fifty respectively, would look like this.
So to reiterate, this function just walks each hex in the possibleMoves object, and for each move possible from that hex, runs our heuristic value on that move and its secondary consequents, and finally puts that final tabulated score into a final object to keep track of those scores. We’re almost there!
The Code Part Four: The Heuristic
Before we take a look at how the best move is selected (which is actually pretty simple, once we have all the values) let’s just take a break to look at the heuristic itself, since its pretty valuable to understand what’s going on under the hood. I’ve picked arbitrary values, essentially, that can be tweaked to create different game balance depending on your needs. An instant victory is worth infinity, a instant loss is worth negative infinity, winning a combat without winning the game is worth 75 points, acquiring a resource or getting a tie is worth 15 points, etc.
We will also be calling this heuristic recursively on each secondary move related to the move in question, weighting all of those values a little bit less and tabulating those values into our final score for the move itself — that’s the point of the flag variable in there, which keeps track of whether we’re currently looking at a primary or secondary move. Even though an instant victory as the result of a current move might be worth infinity, for example, the value of a secondary move that might eventually lead to a game victory might only be worth a hundred points, since a lot can change over the course of a turn. We take into account resource proximity, combat wins and losses, and game win/lose states when calculating these values.
Since we’ve already calculated the hypothetical result of a combat for each potential move and secondary move, all we do is run those outcomes through this heuristic to get our point value, and voila! Scores in hand, we are very nearly done!
Okay, home stretch. Let’s finally use that bestMove variable I mentioned at the beginning of the article. Here’s a little loop to check all our available options and find the optimal move:
As you can see, all we need to do is check each item’s heuristic value in our collected moveValues object and keep track of the highest scoring move (origin hex index, target hex index, and heuristic value) as we go. Couldn’t be simpler!
All that’s left once that’s done running is to actually enact our purchase of units, if there is one, before executing the move itself. There is just one little wrinkle: although we have been tracking next turn combats and have determined whether or not a purchase this turn could have an effect on them, we haven’t actually done anything with that information other than use it as a value modifier on our current move. We don’t want Hexbot to blindly stumble up to a battle it can’t win without buying units it could have afforded, however, so we have one last check, worstSecondaryThreatCheck, which determines that, if there is no purchase which is an inherent part of this turn’s move, the bot should check if its opponent’s reaction would necessitate purchasing troops. To rephrase: if the bot has ten units in a hex, its opponent has 15 units two spaces away, and it has enough resource to purchase 10 units, we need to make sure it does make that purchase even though all it’s going to do this turn is move closer, which does not require a purchase this turn.
So our best move has been picked, our purchase has been enacted… all we have left to do is emit a socket event to keep our server in the loop and get things moving, set up a little thinking animation, and we’ve done it!
Conclusion: We Done Made A Robot
So, three and a half thousand words and over 500 lines of code later, we have a working bot to play single player against. There is still a ton of functionality that could be added — you’ll notice that we didn’t ever give Hexbot the ability to split his armies, he doesn’t take into account his opponent’s resources or undeployed units, and we’re only ever looking two turns ahead, among other things. He also has a bad tendency to just go up and to the left if there is no single best option available (oops! ¯\_(ツ)_/¯). This MVP, however, is entirely serviceable, if a bit predictable. In fact, I’ve gotten the feedback that most find him a little bit too hard until they figure out the pattern, which seems to be a pretty good sweet spot in terms of challenge.
Thanks for coming along if you’ve made it this far, and good luck making your own bots! Here is another cute ferret as a thank you.