Added core library files

zJean001 · Oct 29, 2018 · f10e239
1 parent 54b2c36
Showing 67 changed files with 25,115 additions and 122 deletions.
diff --git a/README.md b/README.md
@@ -7,7 +7,7 @@ This Python library has the to potential to train your reinforcement learning al
 The toolkit has currently been applied to Street Fighter III Third Strike: Fight for the Future, but can be modified for any game available on MAME. The following demonstrates how a random agent can be written for a Street Fighter environment.
 ```python
 import random
-from Main.SF_Environment.Environment import Environment
+from sf_environment.Environment import Environment
 
 env = Environment(difficulty=3, frame_ratio=3, frames_per_step=3)
 env.start()
@@ -25,7 +25,7 @@ The toolkit also supports hogwild training:
 ```Python
 from threading import Thread
 import random
-from Main.SF_Environment.Environment import Environment
+from src.sf_environment.Environment import Environment
 
 
 def run_env(env):
@@ -48,14 +48,15 @@ def main():
     [thread.start() for thread in threads]
 ```
 
-![](https://raw.githubusercontent.com/BombayCinema/MAMEToolkit/master/hogwild3.gif "Hogwild Random Agents")
+![](pics/hogwild3.gif "Hogwild Random Agents")
 
 ## Setting Up Your Own Game Environment
 It doesn't take much to interact with the emulator itself using the toolkit, however the challenge comes from finding the memory address values associated with the internal state you care about, and tracking said state with your environment class.
 The internal memory states of a game can be tracked using the [MAME Cheat Debugger](http://docs.mamedev.org/debugger/cheats.html), which allows you to track how the memory address values of the game change over time.
 To create an emulation of the game you must first have the ROM for the game you are emulating and know the game ID used by MAME, for example for this version of street fighter it is 'sfiii3n'. Once you have these and have determined the memory addresses you wish to track you can start the emulation:
 ```python
-from MAMEToolkit import Emulator
+from emulator.Emulator import Emulator
+from emulator.pipes.Address import Address
 
 game_id = "sfiii3n"
 memory_addresses = {
@@ -83,14 +84,14 @@ The step function returns the frame data as a NumPy matrix, along with all of th
 
 To send actions to the emulator you also need to determine which input ports and fields the game supports. For example, with street fighter to insert a coin the following code is required:
 ```python
-from MAMEToolkit import Action
+from emulator.Action import Action
 
 insert_coin = Action(':INPUTS', 'Coin 1')
 data = emulator.step([insert_coin])
 ```
 To identify which ports are available, use the list actions command:
 ```python
-from MAMEToolkit import list_actions
+from emulator.Emulator import list_actions
 
 game_id = "sfiii3n"
 print(list_actions(game_id))
@@ -133,7 +134,7 @@ There is also the problem of transitioning games between non-learnable gameplay
 
 The emulator class also has a frame_ratio argument which can be used for adjusting the frame rate seen by your algorithm. By default MAME generates frames at 60 frames per second, however, this may be too many frames for your algorithm. The toolkit by default will use a frame_ratio of 3, which means that 1 in 3 frames are sent through the toolkit, this converts the frame rate to 20 frames per second. Using a higher frame_ratio also increases the performance of the toolkit.
 ```Python
-from MAMEToolkit import Emulator
+from emulator.Emulator import Emulator
 
 emulator = Emulator("sfiii3n", memory_addresses, frame_ratio=3)
 ```
@@ -145,6 +146,6 @@ With a single random agent, the street fighter environment can be run at 600%+ t
 ## Simple ConvNet Agent
 To ensure that the toolkit is able to train algorithms, a simple 5 layer ConvNet was set up with minimal tuning. The algorithm was able to successfully learn some simple mechanics of Street Fighter, such as combos and blocking. The Street Fighter gameplay works by having the player fight different opponents across 10 stages of increasing difficulty. Initially, the algorithm would reach stage 2 on average, but eventually could reach stage 5 on average after 2200 episodes of training. The learning rate was tracked using the net damage done vs damage taken of a single playthrough for each episode.
 
-![](https://raw.githubusercontent.com/BombayCinema/MAMEToolkit/master/chart.png "ConvNet Results")
+![](pics/chart.png "ConvNet Results")
 
 
diff --git a/Steps.py b/Steps.py
diff --git a/chart.png → pics/chart.png b/chart.png → pics/chart.png
diff --git a/hogwild3.gif → pics/hogwild3.gif b/hogwild3.gif → pics/hogwild3.gif
diff --git a/src/emulator/Action.py b/src/emulator/Action.py
@@ -0,0 +1,8 @@
+class Action(object):
+
+    def __init__(self, port, field):
+        self.port = port
+        self.field = field
+
+    def get_lua_string(self):
+        return 'iop.ports["' + self.port + '"].fields["' + self.field + '"]'
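The following is a minimal sketch of how `Action.get_lua_string` and the `'+'`-joined form built by `actions_to_string` (defined in `src/emulator/Emulator.py` in this commit) fit together. The class and function bodies are copied from the diff. The `'1 Player Start'` field name is a hypothetical example and is not taken from the source.

```python
class Action(object):

    def __init__(self, port, field):
        self.port = port
        self.field = field

    def get_lua_string(self):
        # Lua expression that looks up the input field on the ioport manager
        return 'iop.ports["' + self.port + '"].fields["' + self.field + '"]'


def actions_to_string(actions):
    # Join the Lua expressions with '+', the separator the Lua callback splits on
    return '+'.join(action.get_lua_string() for action in actions)


insert_coin = Action(':INPUTS', 'Coin 1')
start = Action(':INPUTS', '1 Player Start')  # hypothetical field name

action_string = actions_to_string([insert_coin, start])
print(action_string)

# The Lua side recovers the individual expressions by splitting on '+'
parts = action_string.split('+')
print(parts[0])  # iop.ports[":INPUTS"].fields["Coin 1"]
```

Because `'+'` is the separator, this scheme assumes port and field names never contain a `+` character.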
diff --git a/src/emulator/Console.py b/src/emulator/Console.py
@@ -0,0 +1,83 @@
+import os
+from subprocess import Popen, PIPE
+from src.emulator.StreamGobbler import StreamGobbler
+import queue
+import logging
+
+
+# A class for starting the MAME emulator, and communicating with the Lua engine console
+class Console(object):
+
+    # Starts up an instance of MAME with Popen
+    # Uses a separate thread for reading from the console outputs
+    # render is for displaying the frames to the emulator window, disabling it has little to no effect
+    # throttle enabled will run any game at the intended gameplay speed, disabling it will run the game as fast as the computer can handle
+    # debug enabled will print everything that comes out of the Lua engine console
+    def __init__(self, game_id, render=True, throttle=False, debug=False):
+        self.logger = logging.getLogger("Console")
+
+        command = "exec ./mame -rompath roms -pluginspath plugins -skip_gameinfo -sound none -console "+game_id
+        if not render:
+            command += " -video none"
+        if throttle:
+            command += " -throttle"
+        else:
+            command += " -frameskip 10"
+
+        # Start lua console
+        script_path = os.path.dirname(os.path.abspath(__file__))
+        self.process = Popen(command, cwd=f"{script_path}/mame", shell=True, stdin=PIPE, stdout=PIPE)
+
+        # Start read queues
+        self.stdout_queue = queue.Queue()
+        self.gobbler = StreamGobbler(self.process.stdout, self.stdout_queue, debug=debug)
+        self.gobbler.wait_for_cursor()
+        self.gobbler.start()
+
+    # Read the oldest line which may have been output by the console
+    # Uses the FIFO principle, once a line is read it is removed from the queue
+    # timeout determines how long the function will wait for an output if there is nothing immediately available
+    def readln(self, timeout=0.5):
+        line = self.stdout_queue.get(timeout=timeout)
+        while len(line) > 0 and line[0] == 27:
+            line = line[19:]
+        return line.decode("utf-8")
+
+    # Read as many lines from the console as there are available
+    # timeout determines how long the function will wait for an output if there is nothing immediately available
+    def readAll(self, timeout=0.5):
+        lines = []
+        while True:
+            try:
+                lines.append(self.readln(timeout=timeout))
+            except queue.Empty:
+                break
+        return lines
+
+    def writeln(self, command, expect_output=False, timeout=0.5):
+        self.process.stdin.write(command.encode("utf-8") + b'\n')
+        self.process.stdin.flush()
+        output = self.readAll(timeout=timeout)
+
+        if expect_output and len(output) == 0:
+            error = "Expected output but received nothing from emulator after '" + command + "'"
+            self.logger.error(error)
+            raise IOError(error)
+        if not expect_output and len(output) > 0:
+            error = "No output expected from command '" + command + "', but received: " + "\n".join(output)
+            self.logger.error(error)
+            raise IOError(error)
+        if expect_output:
+            return output
+
+    # Mainly for testing
+    # Safely kills the emulator process
+    def close(self):
+        self.process.kill()
+        try:
+            self.process.wait(timeout=3)
+        except Exception as e:
+            error = "Failed to close emulator console"
+            self.logger.error("%s: %s", error, e)
+            raise EnvironmentError(error)
+        self.gobbler.stop()
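The prefix handling in `Console.readln` can be illustrated in isolation. This is a sketch only: the loop mirrors the one in the diff (drop 19 bytes whenever the line starts with byte 27, the ASCII escape character), but the prefix bytes used below are a made-up placeholder, since the source does not show what MAME actually emits. `strip_console_prefix` is a hypothetical helper name.

```python
ESC = 27            # ASCII escape byte that starts a terminal control sequence
PREFIX_LENGTH = 19  # prefix length hard-coded in Console.readln


def strip_console_prefix(line):
    # Repeatedly drop a fixed-length prefix while the line starts with ESC
    while len(line) > 0 and line[0] == ESC:
        line = line[PREFIX_LENGTH:]
    return line.decode("utf-8")


# Placeholder prefix: 19 bytes starting with ESC, NOT real MAME output
fake_prefix = b"\x1b" + b"?" * 18

with_prefix = strip_console_prefix(fake_prefix + b"384")
without_prefix = strip_console_prefix(b"224")
print(with_prefix)     # 384
print(without_prefix)  # 224
```

Note that the fixed length of 19 is an assumption baked into the original code; a console that emits prefixes of a different length would need a different approach.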
diff --git a/src/emulator/Emulator.py b/src/emulator/Emulator.py
@@ -0,0 +1,131 @@
+import atexit
+import os
+from src.emulator.Console import Console
+from src.emulator.pipes.Pipe import Pipe
+from src.emulator.pipes.DataPipe import DataPipe
+
+
+# Converts a list of Action objects into the relevant Lua engine representation
+def actions_to_string(actions):
+    action_strings = [action.get_lua_string() for action in actions]
+    return '+'.join(action_strings)
+
+
+def list_actions(game_id):
+    console = Console(game_id)
+    console.writeln('iop = manager:machine():ioport()')
+    actions = []
+    ports = console.writeln("for k,v in pairs(iop.ports) do print(k) end", expect_output=True, timeout=0.5)
+    for port in ports:
+        fields = console.writeln("for k,v in pairs(iop.ports['"+port+"'].fields) do print(k) end", expect_output=True)
+        for field in fields:
+            actions.append({"port": port, "field": field})
+    console.close()
+    return actions
+
+
+# An interface for using the Lua engine console functionality
+class Emulator(object):
+
+    # env_id - the unique id of the emulator
+    # game_id - the game id being used
+    # memory_addresses - The internal memory addresses of the game which this class will return the value of at every time step
+    # frame_ratio - the ratio of frames that will be returned, 3 means 1 out of every 3 frames will be returned. Note that this also affects how often memory addresses are read and actions are sent
+    # See console for render, throttle & debug
+    def __init__(self, env_id, game_id, memory_addresses, frame_ratio=3, render=True, throttle=False, debug=False):
+        self.memoryAddresses = memory_addresses
+        self.frameRatio = frame_ratio
+
+        # setup lua engine
+        self.console = Console(game_id, render=render, throttle=throttle, debug=debug)
+        atexit.register(self.close)
+        self.wait_for_resource_registration()
+        self.create_lua_variables()
+        screen_width = self.setup_screen_width()
+        screen_height = self.setup_screen_height()
+        self.screenDims = {"width": screen_width, "height": screen_height}
+
+        # open pipes
+        pipes_path = f"{os.path.dirname(os.path.abspath(__file__))}/mame/pipes"
+        self.actionPipe = Pipe(env_id, "action", 'w', pipes_path)
+        self.actionPipe.open(self.console)
+
+        self.dataPipe = DataPipe(env_id, self.screenDims, memory_addresses, pipes_path)
+        self.dataPipe.open(self.console)
+
+        # Connect inter process communication
+        self.setup_frame_access_loop()
+
+    def create_lua_variables(self):
+        self.console.writeln('iop = manager:machine():ioport()')
+        self.console.writeln('s = manager:machine().screens[":screen"]')
+        self.console.writeln('mem = manager:machine().devices[":maincpu"].spaces["program"]')
+        self.console.writeln('releaseQueue = {}')
+
+    def wait_for_resource_registration(self):
+        screen_registered = False
+        program_registered = False
+        while not screen_registered or not program_registered:
+            if not screen_registered:
+                screen_registered = "nil" not in self.console.writeln('print(manager:machine().screens[":screen"])', expect_output=True, timeout=3)
+            if not program_registered:
+                program_registered = "nil" not in self.console.writeln('print(manager:machine().devices[":maincpu"].spaces["program"])', expect_output=True, timeout=3)
+
+    # Gets the game screen width in pixels
+    def setup_screen_width(self):
+        output = self.console.writeln('print(s:width())', expect_output=True, timeout=1)
+        if len(output) != 1:
+            raise IOError('Expected one result from "print(s:width())", but received: ' + repr(output))
+        return int(output[0])
+
+    # Gets the game screen height in pixels
+    def setup_screen_height(self):
+        output = self.console.writeln('print(s:height())', expect_output=True, timeout=1)
+        if len(output) != 1:
+            raise IOError('Expected one result from "print(s:height())", but received: ' + repr(output))
+        return int(output[0])
+
+    # Pauses the emulator
+    def pause_game(self):
+        self.console.writeln('emu.pause()')
+
+    # Unpauses the emulator
+    def unpause_game(self):
+        self.console.writeln('emu.unpause()')
+
+    # Sets up the callback function written in Lua that the Lua engine will execute each time a frame is done
+    def setup_frame_access_loop(self):
+        pipe_data_func = 'function pipeData() ' \
+                            'if (math.fmod(tonumber(s:frame_number()),' + str(self.frameRatio) +') == 0) then ' \
+                                'for i=1,#releaseQueue do ' \
+                                    'releaseQueue[i](); ' \
+                                    'releaseQueue[i]=nil; ' \
+                                'end; ' \
+                                '' + self.dataPipe.get_lua_string() + '' \
+                                'actions = ' + self.actionPipe.get_lua_string() + '' \
+                                'if (string.len(actions) > 1) then ' \
+                                    'for action in string.gmatch(actions, "[^+]+") do ' \
+                                        'actionFunc = loadstring(action..":set_value(1)"); ' \
+                                        'actionFunc(); ' \
+                                        'releaseFunc = loadstring(action..":set_value(0)"); ' \
+                                        'table.insert(releaseQueue, releaseFunc); ' \
+                                    'end; ' \
+                                'end; ' \
+                            'end; ' \
+                        'end'
+        self.console.writeln(pipe_data_func)
+        self.console.writeln('emu.register_frame_done(pipeData, "data")')
+
+    # Steps the emulator along one time step
+    def step(self, actions):
+        data = self.dataPipe.read_data(timeout=10)  # gathers the frame data and memory address values
+        action_string = actions_to_string(actions)
+        self.actionPipe.writeln(action_string)  # sends the actions for the game to perform before the next step
+        return data
+
+    # Testing
+    # Safely stops all of the processes related to running the emulator
+    def close(self):
+        self.console.close()
+        self.actionPipe.close()
+        self.dataPipe.close()
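The callback built in `setup_frame_access_loop` only pipes data and reads actions when the frame number is a multiple of `frame_ratio`. A minimal Python sketch of that gating follows, assuming frame numbers are non-negative and counted from 0 (where Lua's `math.fmod` and Python's `%` agree). `forwarded_frames` is a hypothetical helper for illustration, not part of the toolkit.

```python
def forwarded_frames(total_frames, frame_ratio=3):
    # Frame numbers on which the callback would send data and read actions
    return [frame for frame in range(total_frames) if frame % frame_ratio == 0]


# One second of emulation at MAME's default 60 frames per second
one_second = forwarded_frames(60, frame_ratio=3)
print(len(one_second))  # 20
print(one_second[:4])   # [0, 3, 6, 9]
```

With the default `frame_ratio=3` this matches the README's figure of 20 frames per second reaching the algorithm.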
diff --git a/src/emulator/StreamGobbler.py b/src/emulator/StreamGobbler.py
@@ -0,0 +1,32 @@
+import threading
+
+
+# A thread used for reading data from a pipe
+# Pipes don't have very good timeout functionality, so this is used in combination with a queue
+class StreamGobbler(threading.Thread):
+
+    def __init__(self, pipe, queue, debug=False):
+        threading.Thread.__init__(self)
+        self.pipe = pipe
+        self.queue = queue
+        self.debug = debug
+        self._stop_event = threading.Event()
+        self.has_cursor = False
+
+    def run(self):
+        for line in iter(self.pipe.readline, b''):
+            if self.debug:
+                print(line)
+            self.queue.put(line[:-1])
+            if self._stop_event.is_set():
+                break
+
+    def wait_for_cursor(self):
+        new_line_count = 0
+        while new_line_count != 3:
+            line = self.pipe.readline()
+            if line == b'\n':
+                new_line_count += 1
+
+    def stop(self):
+        self._stop_event.set()
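The reader-thread-plus-queue pattern that `StreamGobbler` and `Console.readAll` rely on can be sketched without MAME. The example below is an illustration under stated assumptions: it uses an `os.pipe()` pair in place of the emulator's stdout, and `gobble` and `read_all` are hypothetical stand-ins for `StreamGobbler.run` and `Console.readAll`.

```python
import os
import queue
import threading


def gobble(pipe, out_queue):
    # Push each line (without its trailing newline) onto the queue until EOF
    for line in iter(pipe.readline, b""):
        out_queue.put(line.rstrip(b"\n"))


def read_all(out_queue, timeout=0.5):
    # Drain the queue; stop once nothing arrives within `timeout` seconds
    lines = []
    while True:
        try:
            lines.append(out_queue.get(timeout=timeout).decode("utf-8"))
        except queue.Empty:
            return lines


read_fd, write_fd = os.pipe()
out_queue = queue.Queue()
reader = threading.Thread(
    target=gobble, args=(os.fdopen(read_fd, "rb"), out_queue), daemon=True
)
reader.start()

# Stand-in for the emulator writing to its stdout
with os.fdopen(write_fd, "wb") as writer:
    writer.write(b"first\nsecond\n")

lines = read_all(out_queue)
reader.join(timeout=1)
print(lines)  # ['first', 'second']
```

The blocking `readline` call lives on the background thread, so the caller only ever waits on `queue.Queue.get`, which supports a timeout directly.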