How to Make a Gaming Bot that Beats Human Using Python and OpenCV

Written by zetyquickly | Published 2021/09/26
Tech Story Tags: computer-vision | gaming | artificial-intelligence | python | gaming-bot | ai-vs-humans | algorithm-vs-human | python-tutorials | hackernoon-es

TL;DR: A Python bot piggybacks on the computer vision library OpenCV to beat Don’t Touch the Red, an endless runner from Addicting Games. The bot uses OpenCV’s template matching to find a matching patch on the target image; it is not a neural network approach, and it is much simpler and more limited. The other important parts of the bot are capturing screenshots of a particular part of the game for analysis and sending mouse clicks. It is also very handy to set a “break action” on some keyboard key for when your mouse is controlled by the bot.

Do you remember when Flappy Bird appeared on smartphone screens? That game started the era of casual games with very few in-game actions, where each wrong move means your game is over. Whoever stays alive longest heads the leaderboard.

Today we will see how to write a Python bot that piggybacks on computer vision library OpenCV to beat Don’t touch the red, an endless runner from Addicting Games.

Rules of the game

The game is pretty simple: green buttons fall, and the player needs to press them before they leave the screen. And of course, don’t touch the red!

There’s one crucial feature: in arcade mode, the buttons fall with increasing speed. That makes the game hard for a human player, but it is not an issue for our bot!

OpenCV Template Matching

The main part of our computer vision bot is the template matching available in the OpenCV library. It is not a neural network approach; it is much simpler and more limited. This algorithm is meant for searching for a patch on a target image, e.g., a “green button” on a “gaming screen.” It works the following way: the algorithm takes the template image and then, using a sliding window, tries to find a matching patch on the target image. From the result, we get a similarity measure for every possible position of the template.
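To build intuition, here is a toy, pure-Python sketch of the sliding-window idea (this is our own illustration, not OpenCV’s actual implementation). It scores every placement of the template with a sum of squared differences, similar in spirit to OpenCV’s TM_SQDIFF mode, where lower scores mean better matches:

```python
def match_template(target, template):
    """Naive sliding-window matching: score every placement of `template`
    inside `target` with a sum of squared differences (lower = better)."""
    th, tw = len(template), len(template[0])
    best_score, best_loc = float("inf"), None
    for y in range(len(target) - th + 1):
        for x in range(len(target[0]) - tw + 1):
            score = sum(
                (target[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th) for dx in range(tw)
            )
            if score < best_score:
                best_score, best_loc = score, (x, y)
    return best_loc, best_score

# a 3x3 "button" patch hidden inside a 6x6 "screen" of zeros
screen = [[0] * 6 for _ in range(6)]
patch = [[9, 9, 9], [9, 1, 9], [9, 9, 9]]
for dy in range(3):
    for dx in range(3):
        screen[2 + dy][1 + dx] = patch[dy][dx]

print(match_template(screen, patch))  # → ((1, 2), 0): exact match at x=1, y=2
```

OpenCV does the same conceptual scan, but vectorized (and optionally normalized), which is why it is fast enough to run inside a game loop.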

In code, an application of template matching looks like this:

import cv2

template = cv2.imread('template.png')
target = cv2.imread('target.png')
result = cv2.matchTemplate(target, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

As a result, max_val holds the maximum similarity found on the target image, and max_loc is the upper-left corner of the best match.

This algorithm is faster when it works with smaller target images and smaller templates. At first, I tried to match the whole green button, but then I switched to a smaller patch, which works faster; with that, I achieved higher scores.
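A rough back-of-the-envelope cost model shows why: naive matching does one multiply-add per template pixel at every valid window position, so shrinking both the target region and the template cuts the work multiplicatively. The sizes below are made-up examples, not the game’s actual dimensions:

```python
def match_cost(target_wh, template_wh):
    """Rough operation count for naive template matching: one multiply-add
    per template pixel at every valid window position."""
    (W, H), (w, h) = target_wh, template_wh
    positions = (W - w + 1) * (H - h + 1)
    return positions * (w * h)

# whole-screen target with a full-button template vs. a thin strip with a small patch
full = match_cost((960, 1080), (200, 120))
small = match_cost((960, 240), (60, 40))
print(full // small)  # → 40: roughly 40x less work per frame
```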

Taking Screenshots and Interaction

The other important parts of the bot are getting screenshots to analyze and sending mouse clicks to the game. It is worth mentioning that Addicting Games provides games you can play in your internet browser, so nothing extra needs to be installed.

There are two Python packages that help with the tasks above: mss and pyautogui. We use them to grab screenshots of a particular part of the screen and to send clicks to the browser window, respectively. I also use the keyboard library, as it’s very handy to set a “break action” on some key for the case when your mouse is controlled by a bot. The keyboard library (and possibly pyautogui) requires sudo rights, so run your Python script as an executable with a proper shebang header.

Here I provide code snippets on how to get screenshots and send clicks:

#!/hdd/anaconda2/envs/games_ai/bin/python

# ^ change above to your python path ^

import keyboard
import mss
import numpy
import pyautogui

pyautogui.PAUSE = 0.0

print("Press 's' to start")
print("Press 'q' to quit")
keyboard.wait('s')

# setup mss and get the full size of your monitor
sct = mss.mss()
mon = sct.monitors[0]

while True:
    # decide on the part of the screen
    roi = {
        "left": 0, 
        "top": int(mon["height"] * 0.2), 
        "width": int(mon["width"] / 2), 
        "height": int(mon["height"] * 0.23)
    }

    roi_crop = numpy.array(sct.grab(roi))[:,:,:3]
    
    # do something with `roi_crop`

    if keyboard.is_pressed('q'):
        break

One more thing: when you use pyautogui on Linux, you might face Xlib.error.DisplayConnectionError; it is possible to overcome it with the xhost + command.

My Algorithm

Based on the two parts above, I’ve created an algorithm that beats the previous human high score of 170 with a score of 445.

There are two parts to the program. The first tries to click the first three buttons available on the screen when the game starts. The game field doesn’t move until the player hits the first button, so we can treat the field as static while we click on the first three. For that purpose, we inspect three lines of the screen, searching for a small pattern (see the previous figure), and then click on the matches.

The first half of the code:

#!/hdd/anaconda2/envs/games_ai/bin/python

# if "Xlib.error.DisplayConnectionError" use "xhost +" on linux

import shutil
import os
import keyboard
import mss
import cv2
import numpy
from time import time, sleep
import pyautogui
from random import randint
import math

pyautogui.PAUSE = 0.0

print("Press 's' to start")
print("Press 'q' to quit")
keyboard.wait('s')

try:
    shutil.rmtree("./screenshots")
except FileNotFoundError:
    pass
os.mkdir("./screenshots")

# setup mss and get the full size of your monitor
sct = mss.mss()
mon = sct.monitors[0]

frame_id = 0
# decide where is the region of interest
for idx in range(3,0,-1):
    roi = {
        "left": 0, 
        "top": int(mon["height"] * (idx * 0.2)), 
        "width": int(mon["width"] / 2), 
        "height": int(mon["height"] * 0.23)
    }

    green_button = cv2.imread('green_button.png')
    offset_x = int(green_button.shape[1] / 2)  # half template width
    offset_y = int(green_button.shape[0] / 2)  # half template height

    roi_crop = numpy.array(sct.grab(roi))[:,:,:3]
    result = cv2.matchTemplate(roi_crop, green_button, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)

    print(max_val, max_loc)

    button_center = (max_loc[0] + offset_x, max_loc[1] + offset_y)
    roi_crop = cv2.circle(roi_crop.astype(float), button_center, 20, (255, 0, 0), 2)
    cv2.imwrite(f"./screenshots/{frame_id:03}.jpg", roi_crop)

    abs_x_roi = roi["left"] + button_center[0]
    abs_y_roi = roi["top"] + button_center[1]
    pyautogui.click(x=abs_x_roi, y=abs_y_roi)
    frame_id += 1

In the second part, we press the next 400 or so buttons. It is implemented as an infinite while loop that captures the screen and clicks on the pixel where a button is expected, given the current speed. The speed function was chosen as a logarithmic function of the iteration count: it provides the pixel offset needed to compensate for the time that has passed since the pattern was found.

The second half:

second_roi = {
    "left": 0, 
    "top": int(mon["height"] * 0.18), 
    "width": int(mon["width"] / 2), 
    "height": int(mon["height"] * 0.06)
}

btn = cv2.imread('center.png')
offset_y = int(btn.shape[0])
offset_x = int(btn.shape[1] / 2)

thresh = 0.9
frame_list = []
btn_cnt = 1
while True:
    frame_id += 1
    second_roi_crop = numpy.array(sct.grab(second_roi))[:,:,:3]
    result = cv2.matchTemplate(second_roi_crop, btn, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    
    # define the speed of the screen
    speed = math.floor(math.log(frame_id)**2.5)
    print(frame_id, max_val, max_loc, speed)
    frame_list.append(max_loc[0])
    if max_val > thresh:
        button_center = (max_loc[0] + offset_x, max_loc[1] + offset_y)
        second_roi_crop = cv2.circle(second_roi_crop.astype(float), button_center, 20, (255, 0, 0), 2)
        cv2.imwrite(f"./screenshots/{frame_id:03}.jpg", second_roi_crop)

        abs_x_sec = second_roi["left"] + button_center[0]
        abs_y_sec = second_roi["top"] + button_center[1] + speed
        pyautogui.click(x=abs_x_sec, y=abs_y_sec)
        btn_cnt += 1

    if keyboard.is_pressed('q'):
        break

As you can see, the speed is parameterized, and depending on your PC configuration, you may find better parameters that beat my high score. I encourage you to try! The code is very dependent on the speed of image processing, which varies from system to system.
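For instance, to explore the parameter space before a live run, you could print the pixel offsets that different exponents produce. The 2.5 exponent is the one used in the loop above; the alternatives are just hypothetical candidates:

```python
import math

def speed(frame_id, exponent=2.5):
    # pixel offset added below the matched pattern, as in the bot's main loop
    return math.floor(math.log(frame_id) ** exponent)

# compare candidate exponents at a few frame counts
for exp in (2.0, 2.5, 3.0):
    print(exp, [speed(f, exp) for f in (10, 50, 100, 400)])
```

A larger exponent shifts the click point down more aggressively as the game speeds up; too large and the bot clicks below the button, too small and it lags behind.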

Here’s a peek at one run, showing what it looks like when the bot is actually playing.

So as not to be unfounded, here is the leaderboard screenshot. I should mention that in this particular game, scores at all difficulty levels go to the same leaderboard, so you needn’t play “Hard”; the “Easy” level is just fine (by the way, once you reach 100 pressed buttons, you can’t call it easy anymore).

The code of the project is available on GitHub: https://github.com/zetyquickly/addicting-games-ai. It would be great to build an extensive library of hacked Addicting Games and keep all of these algorithms there, so you are invited to create pull requests!

Acknowledgments

This video inspired this project:

https://www.youtube.com/watch?v=vXqKniVe6P8

Here the author beats the leaderboard of the Kick Ya Chop game. It has similarities with Don’t Touch the Red, but there’s also a big difference: in Kick Ya Chop, the player decides on the speed of the game; the faster the human/bot clicks, the faster the tree falls. In Don’t Touch the Red, the game decides the speed of the upcoming buttons.


Written by zetyquickly | Machine learning enthusiast. Research engineer at Skoltech.
Published by HackerNoon on 2021/09/26