Display OpenAI gym in Jupyter notebook only


I made a working example that you can fork: https://kyso.io/eoin/openai-gym-jupyter. It contains two ways of rendering in Jupyter: one as an mp4, and another as a realtime gif.

The .mp4 example is quite simple.

    import gym
    from gym import wrappers

    env = gym.make('SpaceInvaders-v0')
    env = wrappers.Monitor(env, "./gym-results", force=True)
    env.reset()
    for _ in range(1000):
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done: break
    env.close()

Then, in a new cell, embed the recorded video:

    import io
    import base64
    from IPython.display import HTML

    video = io.open('./gym-results/openaigym.video.%s.video000000.mp4' % env.file_infix, 'r+b').read()
    encoded = base64.b64encode(video)
    HTML(data='''
        <video width="360" height="auto" alt="test" controls><source src="data:video/mp4;base64,{0}" type="video/mp4" /></video>'''.format(encoded.decode('ascii')))
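
The embedding step relies on `env.file_infix`, so it only works while the `env` object is still alive in the kernel. As a rough sketch of an alternative, the snippet below picks the most recently modified `.mp4` in the results directory and builds the same kind of base64 `<video>` tag. The helper names `newest_mp4` and `video_tag` are hypothetical and not part of gym or IPython; the sketch assumes the recordings were written as `.mp4` files directly into that directory.

```python
import base64
from pathlib import Path


def newest_mp4(results_dir):
    """Return the most recently modified .mp4 in results_dir, or None if there is none."""
    videos = sorted(Path(results_dir).glob("*.mp4"), key=lambda p: p.stat().st_mtime)
    return videos[-1] if videos else None


def video_tag(video_bytes, width=360):
    """Build an HTML <video> element with the MP4 embedded as a base64 data URI."""
    encoded = base64.b64encode(video_bytes).decode("ascii")
    return (
        '<video width="{0}" height="auto" controls>'
        '<source src="data:video/mp4;base64,{1}" type="video/mp4" />'
        "</video>"
    ).format(width, encoded)


# In a notebook cell (assumes IPython is available and a recording exists):
# from IPython.display import HTML
# HTML(video_tag(newest_mp4("./gym-results").read_bytes()))
```

Because the whole file is inlined into the notebook, this approach is probably best kept to short recordings.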


This worked for me on Ubuntu 18.04 LTS for rendering gym locally, and I believe it will also work on remote Jupyter Notebook servers.

First, run the following installations in Terminal:

    pip install gym
    python -m pip install pyvirtualdisplay
    pip3 install box2d
    sudo apt-get install xvfb
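
Optionally, the installation can be sanity-checked from Python before going further. This is only a minimal sketch: `check_setup` is a hypothetical helper, and it assumes that the `box2d` package is imported as `Box2D` and that the `xvfb` package puts an `Xvfb` binary on the `PATH`.

```python
import importlib.util
import shutil


def check_setup(modules=("gym", "pyvirtualdisplay", "Box2D"), binaries=("Xvfb",)):
    """Map each requirement name to True if it can be found, else False."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not importable
        status[name] = importlib.util.find_spec(name) is not None
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH
        status[name] = shutil.which(name) is not None
    return status


for name, found in check_setup().items():
    print("{0}: {1}".format(name, "ok" if found else "MISSING"))
```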

That's all the setup. Use the following snippet to configure how matplotlib renders the environment:

    import random

    import gym
    import matplotlib.pyplot as plt
    from IPython import display
    from pyvirtualdisplay import Display

    %matplotlib inline

    # Start a virtual X display (needs xvfb) so gym can render without a screen.
    # Named virtual_display so it does not clash with IPython's display module.
    virtual_display = Display(visible=0, size=(1400, 900))
    virtual_display.start()
    plt.ion()

    # Load the gym environment
    env = gym.make('LunarLander-v2')
    env.seed(23)

    # Let's watch how an untrained agent moves around
    state = env.reset()
    img = plt.imshow(env.render(mode='rgb_array'))
    plt.axis('off')
    for j in range(200):
        # action = agent.act(state)
        action = random.choice(range(4))  # random action in place of a trained agent
        img.set_data(env.render(mode='rgb_array'))
        display.display(plt.gcf())
        display.clear_output(wait=True)
        state, reward, done, _ = env.step(action)
        if done:
            break

    env.close()