Facebook Messenger recently received an update that lets users play games within the app. Among them was a vertical scrolling game, Endless Lake, which was getting pretty competitive within my social group.
In this article I’ll be talking about the processes I used and challenges I faced whilst building a bot for the game Endless Lake.
I don’t like having free time
I recently finished a side project and was feeling rather empty inside, and so I decided to build a simple bot to play this game with the following two rules enforced:
- The inputs for the robot must either be the raw pixel values from the screen or a processed version of the raw pixel values.
- No hard-coded rules. In other words, the bot has to learn how to play the game by watching a user play it.
I chose Python because I was familiar with it and it has a ton of machine-learning libraries that integrate with it.
Mouse/Keyboard Event handler
This is crucial both for collecting data and for enabling the bot to play the game. Fortunately there’s a really easy-to-use library which fits the bill: pyuserinput.
Capturing a region of the screen at a good rate (at least 30 fps) was a challenge in itself because Python is an interpreted language. Well-known libraries such as pyscreenshot or PIL/Pillow were only able to take a screenshot of the region every 0.1 seconds or so (around 10 fps), which was too slow: only a third of the rate I originally wanted.
I was ready to give up until I stumbled onto a Korean blog talking about python-wx. Installation was a pain in the ass but it was worth it. I could capture a region of my screen at around 200 fps, a 20-fold increase!
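If you want to check what rate a capture method actually reaches on your machine, a rough timing loop is enough. This is only a sketch: `measure_fps` and `fake_grab` are names I made up for illustration, and `fake_grab` merely sleeps to stand in for a real screenshot call.

```python
import time


def measure_fps(grab, duration=1.0):
    """Call grab() repeatedly for about `duration` seconds and return calls per second."""
    frames = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        grab()
        frames += 1
    elapsed = time.perf_counter() - start
    return frames / elapsed


def fake_grab():
    # Hypothetical stand-in for a screenshot call: pretend a capture takes ~10 ms.
    time.sleep(0.01)


print("approx. fps:", round(measure_fps(fake_grab, duration=0.3), 1))
```

Swap `fake_grab` for whatever function grabs your screen region, and you get a number you can compare against the 30 fps target.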
If you would like to have fast screencapture using Python, have a look at the code.
I decided to use OpenCV as it had a suite of tools that suited my needs. After grabbing the screen region, I used Otsu’s thresholding to find the contour of the game window, then cropped the image so that it contains the game window and nothing more.
I then divided the game window into an NxN grid (N is configurable) and used some simple RGB thresholding to figure out where the platform and the player were relative to each other.
Since I didn’t really care about anything that was too far ahead of or too far behind the player, I decided to take into account only 6 rows in front of the player. These rows are used as the inputs to the neural network. A visualization of the inputs can be seen below. (Red=Water, Green=Platform, Blue=Player)
I decided to use the neural networks from scikit-learn because of their sheer simplicity and ease of use. Of course, you can use other techniques such as random trees, SVMs, KNN, etc.
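To give a feel for what Otsu's method computes, here is a plain-Python sketch (in OpenCV you would get the same threshold from `cv2.threshold` with the `cv2.THRESH_OTSU` flag). It picks the grayscale threshold that maximizes the between-class variance, then finds the bounding box of everything brighter than that threshold. The helper names, the tiny test image, and the assumption that the game window is brighter than its surroundings are mine, not taken from the original code.

```python
def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their intensities
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def bounding_box(image, threshold):
    """Return (top, left, bottom, right) of pixels brighter than threshold, or None."""
    rows = [y for y, row in enumerate(image) if any(p > threshold for p in row)]
    if not rows:
        return None
    cols = [x for x in range(len(image[0])) if any(row[x] > threshold for row in image)]
    return rows[0], cols[0], rows[-1], cols[-1]


# Hypothetical 6x6 grayscale "screenshot": dark desktop with a bright window inside.
image = [[20] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(2, 5):
        image[y][x] = 220

t = otsu_threshold([p for row in image for p in row])
top, left, bottom, right = bounding_box(image, t)
cropped = [row[left:right + 1] for row in image[top:bottom + 1]]
print(t, (top, left, bottom, right))
```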
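The grid step can be sketched roughly as follows: average the color of each cell, then label the cell with a couple of RGB threshold checks. The threshold values and the colors I use for water, platform and player are placeholders I picked for illustration (they are not the game's real colors), and the image is a plain list of rows of `(r, g, b)` tuples rather than an OpenCV array.

```python
WATER, PLATFORM, PLAYER = 0, 1, 2


def classify(rgb):
    """Label an average cell color using hypothetical RGB thresholds."""
    r, g, b = rgb
    if r > 200 and g < 100 and b < 100:   # assumed: player is strongly red
        return PLAYER
    if b > 150 and r < 100:               # assumed: water is strongly blue
        return WATER
    return PLATFORM                       # everything else counts as platform


def to_grid(image, n):
    """Split the image into n x n cells and label each by its mean color.

    Assumes the image is at least n pixels wide and n pixels tall.
    """
    height, width = len(image), len(image[0])
    grid = []
    for gy in range(n):
        top, bottom = gy * height // n, (gy + 1) * height // n
        row = []
        for gx in range(n):
            left, right = gx * width // n, (gx + 1) * width // n
            cell = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
            mean = tuple(sum(px[i] for px in cell) / len(cell) for i in range(3))
            row.append(classify(mean))
        grid.append(row)
    return grid


# Tiny 4x4 test image made of four 2x2 quadrants.
blue, grey, red = (0, 0, 255), (200, 200, 200), (255, 0, 0)
image = [
    [blue, blue, grey, grey],
    [blue, blue, grey, grey],
    [red, red, blue, blue],
    [red, red, blue, blue],
]
print(to_grid(image, 2))
```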
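One way to turn the labelled grid into a fixed-length input is to find the row containing the player and flatten a window of rows around it. The sketch below makes a few assumptions of its own: "in front" means towards the top of the screen, the window is the player's row plus the rows ahead of it (6 in total), and rows that fall off the top of the grid are padded with water. The names are hypothetical.

```python
WATER, PLATFORM, PLAYER = 0, 1, 2


def rows_ahead(grid, n_rows=6, pad=WATER):
    """Flatten the player's row and the rows ahead of it into one feature list.

    Returns None if the player is not found in the grid.
    """
    player_row = next((y for y, row in enumerate(grid) if PLAYER in row), None)
    if player_row is None:
        return None
    width = len(grid[0])
    start = player_row - n_rows + 1
    window = [
        grid[y] if y >= 0 else [pad] * width   # pad rows above the top edge
        for y in range(start, player_row + 1)
    ]
    return [cell for row in window for cell in row]


# Hypothetical 8x3 grid: a straight platform in the middle column, player on row 6.
grid = [[WATER, PLATFORM, WATER] for _ in range(8)]
grid[6] = [WATER, PLAYER, WATER]

features = rows_ahead(grid)
print(len(features), features[-3:])
```

Whatever window you pick, the important part is that the length of the feature list never changes, since most classifiers expect a fixed number of inputs.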
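To show the "learn by watching" idea without any dependencies, here is a tiny k-nearest-neighbours classifier in plain Python. Note that this is KNN, one of the alternatives mentioned above, not the scikit-learn neural network I actually used. The recorded samples, the click/no-click labels and the function names are all made up for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Count the positions where two equal-length feature lists differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(samples, features, k=3):
    """Predict a label by majority vote among the k closest recorded samples.

    `samples` is a list of (features, label) pairs recorded while a human plays.
    """
    nearest = sorted(samples, key=lambda s: hamming(s[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical recordings: grid cells as inputs, 1 = user clicked, 0 = user did nothing.
recorded = [
    ([0, 0, 1], 1),
    ([0, 0, 1], 1),
    ([1, 1, 1], 0),
    ([1, 1, 0], 0),
]

print(knn_predict(recorded, [0, 0, 1]))
print(knn_predict(recorded, [1, 1, 1]))
```

Hamming distance is a reasonable fit here because each cell is a category (water, platform, player) rather than a quantity.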
Does it even work?
Errr well, it does, with enough data.
Here’s how it performs with 1 training set:
6 training sets:
10 training sets:
More data = more performance
If you have any questions just email me at kendricktan0814 at gmail.com.