How AI can be used to replicate a game engine

by Febin John James, September 15th, 2017

This is a derived work from the research paper Game Engine Learning from Video. Credit goes to Matthew Guzdial, Boyang Li, and Mark O. Riedl of the Georgia Institute of Technology.

In the following video, Mario is played by an artificially intelligent agent that uses neuroevolution to play the game like a master player.

Original Mega Man game play video on the left. Cloned game engine on the right. Image from Georgia Tech

Overview

The system scans each frame to identify the objects present in it, then runs an algorithm across adjacent frames to see how the objects change between frames. Finally, an engine search algorithm is run whenever the detected change exceeds a set threshold.

Parsing Frames

The system needs two inputs: a sprite palette (the set of characters and objects in the game) and a gameplay video. Using these inputs and OpenCV (a computer vision library), we can determine the number of sprites and their spatial positions in each frame.
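In practice this step would use OpenCV's template matching (e.g. cv2.matchTemplate) to locate each palette sprite in a frame. The toy sketch below shows the same exact-match idea on small integer grids; the function name and data layout are illustrative, not from the paper.

```python
def find_sprite(frame, sprite):
    """Return (x, y) positions where `sprite` appears exactly in `frame`.
    Both are 2D lists of pixel values (rows of ints)."""
    fh, fw = len(frame), len(frame[0])
    sh, sw = len(sprite), len(sprite[0])
    hits = []
    for y in range(fh - sh + 1):
        for x in range(fw - sw + 1):
            if all(frame[y + dy][x + dx] == sprite[dy][dx]
                   for dy in range(sh) for dx in range(sw)):
                hits.append((x, y))
    return hits

# A 4x4 frame containing a 2x2 "sprite" at (2, 1)
frame = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
    [0, 0, 0, 0],
]
sprite = [[1, 2], [3, 4]]
print(find_sprite(frame, sprite))  # [(2, 1)]
```

Real frames would of course use OpenCV's optimized matching with a similarity threshold rather than exact pixel equality.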

Given the sequence of frames with sprites and their locations, an algorithm matches each sprite to its closest neighbour in the next frame. If the sprite counts don't match, blank sprites are created to pair with the remainder. This happens in some cases, e.g. when Mario jumps on an enemy and destroys it.

Parsing is completed when these sprite representations are converted to a list of facts. Each fact type requires a pre-written function to derive it from a given input frame. Here is the list of fact types.

Animation

This is a collection of all the sprite images, recorded by original filename, width, and height. Ex: If the image "mario1.png" with size [26, 26] was found at position 0,0, the animation fact would be {mario1, 26, 26}.

Spatial

This is the sprite’s filename with the x,y coordinates in the frame.

RelationshipX

This is the relationship between a pair of sprites in the x-dimension. Ex: (mario1, mario1's closest edge to pipe1, 3px, pipe1, pipe1's closest edge to mario1). This allows the system to learn collision rules, e.g. Mario's x-velocity changes to 0 when he hits a pipe.

RelationshipY

The same fact as above, in the y-dimension, e.g. when Mario hits a brick.

VelocityX

This records a sprite's velocity in the x-dimension by comparing its position in the previous and current frames. Ex: If Mario is at [0,0] in frame 1 and [10,0] in frame 2, the fact would be VelocityX: {mario, 10}.

VelocityY

The same fact mentioned above in the y-dimension.

CameraX

This stores how far the camera has scrolled through a level.
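To make the fact derivation concrete, here is a minimal sketch of how VelocityX facts could be derived from the Spatial facts of two consecutive frames. The tuple encoding and function name are assumptions for illustration, not the paper's representation.

```python
def velocity_x_facts(prev_frame, next_frame):
    """Derive VelocityX facts by differencing sprite x-positions across
    two consecutive frames.  Each frame maps sprite name -> (x, y)."""
    facts = []
    for name, (x1, _) in prev_frame.items():
        if name in next_frame:           # sprite survived into the next frame
            x2 = next_frame[name][0]
            facts.append(("VelocityX", name, x2 - x1))
    return facts

prev_frame = {"mario1": (0, 0), "pipe1": (40, 0)}
next_frame = {"mario1": (10, 0), "pipe1": (40, 0)}
print(velocity_x_facts(prev_frame, next_frame))
# [('VelocityX', 'mario1', 10), ('VelocityX', 'pipe1', 0)]
```

VelocityY would be the same computation on the y-coordinate.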

Engine Learning

The engine learning approach tries to build a game engine that can predict the changes observed in the parsed frames. A game engine here is a set of rules, each with an IF part and a THEN part. Ex: IF Mario collides with a pipe THEN change his x-velocity to 0. The approach consists of a frame scan algorithm and an engine search algorithm. The frame scan algorithm steps through the parsed frames and searches for rules that explain the difference between the predicted frame and the actual frame. If a game engine is found that reduces the difference sufficiently, another scan is made to ensure the new engine still accurately predicts the prior frames.
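The IF/THEN rule idea can be sketched as follows, with rules represented as sets of condition facts and effect facts. This is a hypothetical encoding for illustration; the paper's engine stores richer structures.

```python
def rule_fires(rule, frame_facts):
    """A rule fires when all of its IF condition facts hold in the frame."""
    return rule["if"].issubset(frame_facts)

def apply_rules(rules, frame_facts):
    """Return the frame's facts with the THEN effects of every firing rule
    applied (removing superseded facts, adding new ones)."""
    facts = set(frame_facts)
    for rule in rules:
        if rule_fires(rule, frame_facts):
            facts -= rule["remove"]
            facts |= rule["then"]
    return facts

# IF Mario touches a pipe while moving, THEN his x-velocity becomes 0.
stop_at_pipe = {
    "if": {("RelationshipX", "mario1", "pipe1", 0), ("VelocityX", "mario1", 5)},
    "remove": {("VelocityX", "mario1", 5)},
    "then": {("VelocityX", "mario1", 0)},
}
frame = {("RelationshipX", "mario1", "pipe1", 0), ("VelocityX", "mario1", 5)}
print(sorted(apply_rules([stop_at_pipe], frame)))
```

Predicting the next frame then amounts to applying every rule of the engine to the current frame's facts.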

Frame Scan Algorithm


engine = new Engine()
currentFrame = frames[0]
i = 1
while i < frameSize do
    // Check if this engine predicts within the threshold
    frameDist = Distance(engine, currentFrame, i + 1)
    if frameDist < threshold then
        currentFrame = Predict(engine, currentFrame, i + 1)
        i = i + 1
        continue
    // Update the engine and start the parse over
    engine = EngineSearch(engine, currentFrame, i + 1)
    i = 1
    currentFrame = frames[0]

The frame scan algorithm takes a set of parsed frames and a threshold as input and outputs a game engine. The Distance function gives a pixel-by-pixel distance between the actual frame and the predicted frame. If the distance is less than the set threshold, the Predict function returns the predicted frame closest to the actual frame.

Engine Search Algorithm



closed = []
open = PriorityQueue()
open.push(1, engine)
while open is not empty do
    node = open.pop()
    // node is a (priority, engine) pair
    if node[0] < threshold then
        return node[1]
    engine = node[1]
    closed.add(engine)
    for neighbour n of engine do
        if n in closed then
            continue
        distance = Distance(n, currentFrame, goalFrame)
        open.push(distance + n.rules.length, n)

The engine search algorithm searches for a set of rules that produces a predicted frame within some threshold of the actual frame. It does this by generating neighbours of a given engine.
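This is a best-first search and can be sketched with Python's heapq. The sketch below checks the prediction error directly (rather than the combined priority) for clarity, and represents an engine as a tuple of rule names; all of this is an illustrative assumption, not the paper's code.

```python
import heapq

def engine_search(engine, neighbours, distance, threshold):
    """Best-first search over engines: pop the most promising candidate,
    return it once its prediction error falls under `threshold`, otherwise
    expand its neighbours.  Ranking by error + engine size makes smaller
    engines win ties, preferring generic engines."""
    counter = 0                                  # tie-breaker for equal priorities
    open_q = [(distance(engine) + len(engine), counter, engine)]
    closed = set()
    while open_q:
        _, _, node = heapq.heappop(open_q)
        if distance(node) < threshold:
            return node
        if node in closed:
            continue
        closed.add(node)
        for n in neighbours(node):
            if n in closed:
                continue
            counter += 1
            heapq.heappush(open_q, (distance(n) + len(n), counter, n))
    return None

# Toy instantiation: the target engine is ("r1", "r2"); the "error" is how
# many of its rules a candidate engine is still missing.
target = ("r1", "r2")
neighbours = lambda e: [e + (r,) for r in target if r not in e]
distance = lambda e: len(set(target) - set(e))
print(engine_search((), neighbours, distance, 1))  # ('r1', 'r2')
```

In the real system, `distance` would be the pixel distance between the engine's predicted frame and the goal frame, and `neighbours` would generate the rule modifications described below.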

Neighbours are generated in the following ways:

Adding rules

This requires picking a pair of facts of the same kind, one from the current frame and the other from the goal frame. Ex: the current frame has a VelocityX fact of {mario1, 5} and the goal frame has {mario1, 0}. This pair of facts represents the change the rule handles: here, Mario's velocity dropped. The other facts in the current frame become the initial conditions for the rule. These will contain conditions that are not required for the change to happen. For example, Mario's velocity dropped because his spatial distance to the pipe was too small, but the conditions might also include a cloud being present above him. This set of initial conditions is minimised by modifying the rule's condition facts.

Modifying rule’s condition facts

The set of conditions for a given rule is minimised by taking the conditions common to an existing rule's conditions and the current set of conditions. If this neighbour reduces the pixel distance between the predicted and goal frames, it is more likely to be chosen than a neighbour that adds a new rule. This results in a preference for smaller, more generic engines.
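The intersection step can be sketched as follows, using the same hypothetical fact-tuple encoding as before: keeping only the shared condition facts strips out spurious conditions (like the cloud above Mario) so the rule fires in more situations.

```python
def generalise_rule(rule, current_conditions):
    """Generalise a rule by keeping only the condition facts it shares with
    the current frame's conditions (hypothetical representation)."""
    return {**rule, "if": rule["if"] & current_conditions}

rule = {
    "if": {("RelationshipX", "mario1", "pipe1", 0),
           ("Spatial", "cloud1", 30, 5)},        # spurious condition
    "then": {("VelocityX", "mario1", 0)},
}
seen = {("RelationshipX", "mario1", "pipe1", 0),
        ("VelocityX", "mario1", 5)}
print(generalise_rule(rule, seen)["if"])
# {('RelationshipX', 'mario1', 'pipe1', 0)}
```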

Modifying a rule to cover an additional sprite

This works like the one above, except the changes it can make are extended to other sprites. For example, when Mario jumps on an enemy, the enemy vanishes; technically, it goes from an Animation fact with values to one without. So one rule can handle multiple cases.

Modifying a rule into a control rule

This changes a rule from one that fires normally into a control rule: one that fires based on a player decision, such as the inputs that control the character (left, right, jump).

Here is an example of the output of a learned game engine.

Follow Hackernoon and me (Febin John James) for more stories. I am also writing a book to raise awareness of the Blue Whale Challenge, which has claimed the lives of many teenagers in several countries. It is intended to help parents understand the threat of the dark web and take action to ensure the safety of their children. The book, Fight The Blue Whale, is available for pre-order on Amazon. The title will be released on the 20th of this month.