This is the second part of a 5-part blog series on reinforcement learning. In part one we learned a bit about what reinforcement learning is all about and set the stage for implementing a handful of retro video game use cases. This post focuses on getting the retro gaming environment set up to train our models.
We'll be building and training agents in Python, so there are a number of key things you'll need in order to get up and running. I'll be assuming a basic knowledge of how to install Python and Python packages in a virtual environment, Anaconda or otherwise.
Python 3 is the language we'll be using, and it is most easily obtained through the Anaconda distribution. Specifically, I used Python 3.7 in an Anaconda virtual environment.
OpenAI Gym is an open source interface to a wide range of reinforcement learning environments and problems. Available as a Python package, it provides APIs for creating and interacting with environments. It forms the backbone of our RL system for all of the subsequent use cases. From their website:
Gym is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Pinball.
Once installed, a minimum working example can be stood up in a handful of lines of code:
```python
import gym

env = gym.make("CartPole-v1")
observation = env.reset()
for _ in range(1000):
    env.render()
    action = env.action_space.sample()  # your agent here (this takes random actions)
    observation, reward, done, info = env.step(action)
    if done:
        observation = env.reset()
env.close()
```
Gym-Retro is another Python library from OpenAI. It builds a layer on top of the OpenAI Gym ecosystem, providing implementations and APIs for retro gaming environments. It currently includes "integrations" (data, scenarios, states and reward function definitions) for over 1,000 retro games across a multitude of platforms, including various Nintendo, Sega and Atari systems. The library also provides tooling for creating integrations for new games, which we'll walk through shortly. First, a brief definition of some terminology.
ROM: a file that contains a copy of the read-only memory chip from a video game cartridge or disk. Legality of possession is questionable without owning an original copy of the game in question. These ROM files are playable on...
Emulators: software that emulates retro hardware and enables playing of retro games on modern computing platforms.
Gym-Retro ships with emulators for each of the platforms it supports. It does not ship with ROM files for commercial games; sourcing those is up to the user.
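As a rough sketch of what this looks like in practice: Gym-Retro does bundle a few non-commercial ROMs, Airstriker-Genesis among them, so you can smoke-test the install before sourcing anything else. ROMs you supply yourself are imported with `python -m retro.import /path/to/roms`. The helper names below are mine, not part of the library, and `run_random_agent` only assumes the classic Gym interface (`reset`, a `step` that returns four values, and `action_space.sample`).

```python
def run_random_agent(env, max_steps=1000):
    """Play one episode with random actions; return (steps taken, total reward)."""
    env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # random action, no learning yet
        _obs, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return steps, total_reward


def make_airstriker():
    """Create the bundled Airstriker environment (requires gym-retro)."""
    import retro  # imported lazily so the helper above works without gym-retro

    return retro.make(game="Airstriker-Genesis")
```

With Gym-Retro installed, `env = make_airstriker()` followed by `run_random_agent(env)` and `env.close()` should play through one random episode.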
Gym-Retro Integration UI
This UI allows you to interactively play the game in order to integrate it into the system. It lets you search for elements in RAM, track variables in real time as you play, and create and edit scenarios and game states. It's an essential tool in the process, and will be discussed further shortly. Note that you must place this executable in the same directory as your Gym-Retro Python installation for it to work correctly. The links to both the Windows and Mac UI packages appear to be broken right now, sadly.
Stable Baselines 2/3
Stable Baselines is a set of "reliable" implementations of common reinforcement learning algorithms. Yet another Python library, it's based on the OpenAI Baselines package and extends some of that code, aiming to be more streamlined and easier to use. The documentation is somewhat... lacking in my opinion, but I've learned a bunch of stuff through trial and error that I'll share over subsequent posts. Version 2 is backed by TensorFlow and version 3 is backed by PyTorch, but the developers have done a decent job of abstracting this away from the user. Both versions include TensorBoard integration for tracking model performance, and both support running multiple environments in parallel where the algorithms support it. Finally, the libraries come packaged with a bunch of pre-trained models (RL Zoo) that you can experiment with.
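To give a flavour of the API, here is a minimal sketch using Stable Baselines3 (the PyTorch-backed version) on the CartPole environment from earlier. It assumes `stable-baselines3` and `tensorboard` are installed; the function name, timestep budget and log directory are placeholders of mine rather than recommended settings.

```python
def train_cartpole(total_timesteps=10_000, log_dir="./tb_logs"):
    """Train a PPO agent on CartPole-v1 and return the trained model."""
    # Imported here so this file still loads if the library is missing.
    from stable_baselines3 import PPO

    # "MlpPolicy" is a small fully connected network; passing the environment
    # id as a string lets the library create the environment itself.
    model = PPO("MlpPolicy", "CartPole-v1", verbose=1, tensorboard_log=log_dir)
    model.learn(total_timesteps=total_timesteps)
    return model
```

Calling `train_cartpole()` and then pointing TensorBoard at `./tb_logs` should let you watch the learning curves as training progresses.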
Once you've stood up your Python 3 environment, installed the requisite packages and downloaded the integration UI, you're ready to start integrating a game of your choice! Here's a quick tutorial, using Super Mario Kart on SNES as an example.
Vague Order of Operations
Obtain a ROM for the game you wish to integrate.
Open the game in the integration tool.
Save the start state for the game: this will be the point at which you want the agent to start learning how to play.
Define a done condition: when is the simulation over?
Define a reward function: what constitutes good performance in the game? Scoring points, or some sort of level progression is a good starting point.
Pull variables from RAM to inform the reward function/done condition. This is generally done by some combination of playing the game and using the UI search functionality.
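The last three steps end up as two small JSON files in the game's integration folder: `data.json` maps variable names to RAM addresses and types, and `scenario.json` defines the reward and done condition in terms of those variables. The integration UI can produce these for you; the sketch below builds them by hand only to show their shape. The function name, variable names, addresses and types are made up for illustration and are not real Super Mario Kart values.

```python
import json
from pathlib import Path


def write_integration_files(folder):
    """Write example data.json and scenario.json files into `folder`."""
    # Hypothetical variables: the addresses and types below are placeholders.
    data = {
        "info": {
            "lap": {"address": 4321, "type": "|u1"},
            "checkpoint": {"address": 4322, "type": "|u1"},
        }
    }
    scenario = {
        # Reward is the per-step change in `checkpoint` times this weight
        # (my understanding of Gym-Retro's default "delta" measurement).
        "reward": {"variables": {"checkpoint": {"reward": 1.0}}},
        # End the episode once the hypothetical lap counter equals 5.
        "done": {"variables": {"lap": {"op": "equal", "reference": 5}}},
    }
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "data.json").write_text(json.dumps(data, indent=2))
    (folder / "scenario.json").write_text(json.dumps(scenario, indent=2))
    return data, scenario
```

Keeping the reward and done condition in data files rather than code means you can tweak them without touching your training script.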
Note that Gym-Retro supports Lua script integration for more complex reward/termination criteria. Screenshot of integration UI below.
The next post in this series will outline the first game we tried to integrate and teach an agent to play: Tetris.