moolib.EnvPool

class moolib.EnvPool

A class to run sets of gym-like environments in different processes.

This class is typically used in an RL setting (see examples/impala/impala.py), with batching happening in a background thread.

The class maintains num_batches batches of environments, with batch_size environments each. The batches can therefore be stepped alternately for increased efficiency (cf. “double-buffering”). The whole EnvPool uses num_processes processes to run these environments.
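
The alternation between batches can be illustrated without moolib. The following is only a sketch of the scheduling pattern, built on the standard library's thread pool; fake_env_step and choose_actions are hypothetical stand-ins, and nothing here reflects how EnvPool is implemented internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor

NUM_BATCHES = 2  # plays the role of num_batches
BATCH_SIZE = 4   # plays the role of batch_size


def fake_env_step(batch_index, actions):
    """Hypothetical stand-in for stepping one batch of environments."""
    time.sleep(0.01)  # pretend the environments take a while
    return [(batch_index, action) for action in actions]


def choose_actions(step):
    """Hypothetical stand-in for a policy; a real agent would run a model."""
    return [step % 3] * BATCH_SIZE


def run(num_steps):
    results = []
    pending = [None] * NUM_BATCHES  # at most one in-flight step per batch
    with ThreadPoolExecutor(max_workers=NUM_BATCHES) as pool:
        for step in range(num_steps):
            batch_index = step % NUM_BATCHES
            # Collect this batch's previous step. The other batch keeps
            # running in the background while we wait and pick actions.
            if pending[batch_index] is not None:
                results.append(pending[batch_index].result())
            actions = choose_actions(step)
            pending[batch_index] = pool.submit(fake_env_step, batch_index, actions)
        # Drain whatever is still in flight.
        for future in pending:
            if future is not None:
                results.append(future.result())
    return results


outputs = run(6)
```

While one batch is being stepped, the caller is free to compute actions for the other one, which is the point of keeping more than one batch.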

Example

A minimal usage example:

import gym
import nle  # assumption: NLE is installed; importing it registers the NetHack envs
import torch

import moolib


def create_env():
    return gym.make('NetHackChallenge-v0')

batch_size = 32
n_actions = create_env().action_space.n

# Two batches of 32 envs each (64 in total), stepped alternately ("double-buffering").
batcher = moolib.EnvPool(create_env, num_processes=64, batch_size=batch_size, num_batches=2)

for i in range(20):
    # One random action per env in the batch.
    actions = (torch.rand(batch_size) * n_actions).int()
    batcher.step(i % 2, actions)

__init__()

Create the pool of environments.

Parameters
  • create_env (Callable[[], gym.Env]) – a user-defined function that returns a Gym environment.

  • num_processes (int) – how many processes should be used for running environments.

  • batch_size (int) – the number of environments in one batch.

  • num_batches (int) – the number of batches to maintain (for double-buffering).
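
The three integer parameters are related by simple arithmetic: the pool holds num_batches × batch_size environments in total. The sketch below makes that explicit. PoolConfig is a hypothetical helper, not part of moolib, and how environments are distributed across processes is not specified here, so the sketch does not model it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolConfig:
    """Hypothetical container for the EnvPool constructor arguments."""

    num_processes: int
    batch_size: int
    num_batches: int

    def __post_init__(self):
        # Assumption: all three values are expected to be positive integers.
        for name in ("num_processes", "batch_size", "num_batches"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")

    @property
    def total_envs(self):
        # num_batches batches of batch_size environments each
        return self.batch_size * self.num_batches


config = PoolConfig(num_processes=64, batch_size=32, num_batches=2)
print(config.total_envs)
```

With the values from the example above (batch_size=32, num_batches=2) this gives 64 environments, the same number as num_processes in that example.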

Methods

  • __init__ – Create the pool of environments.

  • step – Step through a batch of envs.

step()

Step through a batch of envs.

Parameters
  • batch_index (int) – index of the batch of envs to step.

  • action (torch.Tensor) – one action per env in the batch, with shape [BATCH_SIZE].
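
A caller may want to check these two arguments before handing them to step. The helper below is a hypothetical pre-flight check, not part of moolib. It assumes batch indices are zero-based and smaller than num_batches, as in the example above, and it accepts any sequence that supports len(), which includes a one-dimensional torch.Tensor.

```python
def check_step_args(batch_index, action, batch_size, num_batches):
    """Hypothetical pre-flight check for step(); not part of moolib."""
    # Assumption: valid batch indices are 0 .. num_batches - 1.
    if not 0 <= batch_index < num_batches:
        raise IndexError(
            f"batch_index {batch_index} is outside [0, {num_batches})"
        )
    # One action is expected per environment in the batch.
    if len(action) != batch_size:
        raise ValueError(f"expected {batch_size} actions, got {len(action)}")


# Passes silently: index 1 of 2 batches, 32 actions for 32 envs.
check_step_args(batch_index=1, action=[0] * 32, batch_size=32, num_batches=2)
```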