compiler_gym.wrappers

The compiler_gym.wrappers module provides a set of classes that can be used to transform an environment in a modular way.

For example:

>>> env = compiler_gym.make("llvm-v0")
>>> env = TimeLimit(env, max_episode_steps=10)
>>> env = CycleOverBenchmarks(
...     env,
...     benchmarks=[
...         "benchmark://cbench-v1/crc32",
...         "benchmark://cbench-v1/qsort",
...     ],
... )

Warning

CompilerGym environments are incompatible with the OpenAI Gym wrappers. This is because CompilerGym extends the environment API with additional arguments and methods. You must use the wrappers from this module when wrapping CompilerGym environments. We provide a set of base wrappers that are equivalent to those in OpenAI Gym that you can use to write your own wrappers.

Base wrappers

class compiler_gym.wrappers.CompilerEnvWrapper(env: CompilerEnv)[source]

Wraps a CompilerEnv environment to allow a modular transformation.

This is the base class for all wrappers. Use it instead of gym.Wrapper so that the CompilerGym API extensions, such as the fork() method, are supported.

__init__(env: CompilerEnv)[source]

Constructor.

Parameters

env – The environment to wrap.

Raises

TypeError – If env is not a CompilerEnv.

class compiler_gym.wrappers.ActionWrapper(env: CompilerEnv)[source]

Wraps a CompilerEnv environment to allow an action space transformation.

action(action: ActionType) ActionType[source]

Translate the action to the new space.

reverse_action(action: ActionType) ActionType[source]

Translate an action from the new space to the wrapped space.

class compiler_gym.wrappers.ObservationWrapper(env: CompilerEnv)[source]

Wraps a CompilerEnv environment to allow an observation space transformation.

observation()

A view of the available observation spaces that permits on-demand computation of observations.

class compiler_gym.wrappers.RewardWrapper(env: CompilerEnv)[source]

Wraps a CompilerEnv environment to allow a reward space transformation.

reward()

A view of the available reward spaces that permits on-demand computation of rewards.

Action space wrappers

class compiler_gym.wrappers.CommandlineWithTerminalAction(env: CompilerEnv, terminal=CommandlineFlag(name='end-of-episode', flag='# end-of-episode', description='End the episode'))[source]

Creates a new action space with a special “end of episode” terminal action at the start. If step() is called with it, the “done” flag is set.

__init__(env: CompilerEnv, terminal=CommandlineFlag(name='end-of-episode', flag='# end-of-episode', description='End the episode'))[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • terminal – The flag to use as the terminal action. Optional.

class compiler_gym.wrappers.ConstrainedCommandline(env: CompilerEnv, flags: Iterable[str], name: Optional[str] = None)[source]

Constrains a Commandline action space to a subset of the original space’s flags.

__init__(env: CompilerEnv, flags: Iterable[str], name: Optional[str] = None)[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • flags – A list of entries from env.action_space.flags denoting flags that are available in this wrapped environment.

  • name – The name of the new action space.
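Constraining an action space to a subset of flags implies a translation between indices in the constrained space and indices in the original space. The following is a minimal sketch of that translation only, not the library's implementation. The helper make_translators and the flag names are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def make_translators(
    all_flags: Sequence[str], allowed_flags: Sequence[str]
) -> Tuple[Callable[[int], int], Callable[[int], int]]:
    """Build index translations between a constrained and a full flag list."""
    # Position of each allowed flag in the original (unconstrained) list.
    # list.index() raises ValueError if a flag is not in the original list.
    to_full: List[int] = [list(all_flags).index(flag) for flag in allowed_flags]
    to_constrained = {full: i for i, full in enumerate(to_full)}

    def constrained_to_full(action: int) -> int:
        return to_full[action]

    def full_to_constrained(action: int) -> int:
        return to_constrained[action]

    return constrained_to_full, full_to_constrained


# Hypothetical flag names, for illustration only.
all_flags = ["-a", "-b", "-c", "-d"]
constrained_to_full, full_to_constrained = make_translators(all_flags, ["-b", "-d"])
print(constrained_to_full(1))  # 3, the index of "-d" in all_flags
```

Keeping both directions available mirrors the action() and reverse_action() pair described for ActionWrapper.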

class compiler_gym.wrappers.TimeLimit(env: CompilerEnv, max_episode_steps: Optional[int] = None)[source]

A step-limited wrapper that is compatible with CompilerGym.

Example usage:

>>> env = TimeLimit(env, max_episode_steps=3)
>>> env.reset()
>>> _, _, done, _ = env.step(0)
>>> _, _, done, _ = env.step(0)
>>> _, _, done, _ = env.step(0)
>>> done
True

__init__(env: CompilerEnv, max_episode_steps: Optional[int] = None)[source]

Constructor.

Parameters

env – The environment to wrap.

Raises

TypeError – If env is not a CompilerEnv.

class compiler_gym.wrappers.ForkOnStep(env: CompilerEnv)[source]

A wrapper that creates a fork of the environment before every step.

This wrapper creates a new fork of the environment before every call to env.step(). Because of this, the wrapped environment supports an additional env.undo() method that can be used to backtrack.

Example usage:

>>> env = ForkOnStep(compiler_gym.make("llvm-v0"))
>>> env.step(0)
>>> env.actions
[0]
>>> env.undo()
>>> env.actions
[]

Variables

stack (List[CompilerEnv]) – A fork of the environment before every previous call to env.step(), ordered oldest to newest.

__init__(env: CompilerEnv)[source]

Constructor.

Parameters

env – The environment to wrap.

undo() CompilerEnv[source]

Undo the previous action.

Returns

Self.
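The stack-based backtracking described above can be sketched with a toy environment. This is an illustration of the idea under stated assumptions, not the wrapper's actual code: ToyEnv and ToyForkOnStep are hypothetical stand-ins, a deep copy stands in for a real fork, and returning silently when there is nothing to undo is a choice made for this sketch.

```python
import copy
from typing import List


class ToyEnv:
    """Hypothetical stand-in for an environment; it only records actions."""

    def __init__(self) -> None:
        self.actions: List[int] = []

    def fork(self) -> "ToyEnv":
        # A real fork duplicates compiler state; a deep copy is enough here.
        return copy.deepcopy(self)

    def step(self, action: int) -> None:
        self.actions.append(action)


class ToyForkOnStep:
    """Keeps a snapshot from before each step so that steps can be undone."""

    def __init__(self, env: ToyEnv) -> None:
        self.env = env
        self.stack: List[ToyEnv] = []  # Snapshots, ordered oldest to newest.

    def step(self, action: int) -> None:
        self.stack.append(self.env.fork())  # Snapshot the pre-step state.
        self.env.step(action)

    def undo(self) -> "ToyForkOnStep":
        if self.stack:  # Assumption of this sketch: no-op on an empty stack.
            self.env = self.stack.pop()
        return self


wrapped = ToyForkOnStep(ToyEnv())
wrapped.step(0)
wrapped.step(5)
wrapped.undo()
print(wrapped.env.actions)  # [0]
```

Because one snapshot is kept per step, memory use grows with episode length.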

Datasets wrappers

class compiler_gym.wrappers.IterateOverBenchmarks(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]

Iterate over a (possibly infinite) sequence of benchmarks on each call to reset(). Will raise StopIteration on reset() once the iterator is exhausted. Use CycleOverBenchmarks or RandomOrderBenchmarks for wrappers which will loop over the benchmarks.

__init__(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • benchmarks – An iterable sequence of benchmarks.

  • fork_shares_iterator – If True, the benchmarks iterator will be shared by a forked environment created by env.fork(). This means that calling env.reset() with one environment will advance the iterator in the other. If False, forked environments will use itertools.tee() to create a copy of the iterator so that each iterator may advance independently. However, this requires shared buffers between the environments which can lead to memory overheads if env.reset() is called many times more in one environment than the other.
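The difference between the two fork_shares_iterator modes can be seen with plain iterators from the standard library. This small sketch uses no CompilerGym code. The variable names are illustrative, and each pair of names stands in for a parent environment and its fork.

```python
import itertools

uris = ["benchmark://cbench-v1/crc32", "benchmark://cbench-v1/qsort"]

# Shared iterator: both "environments" hold the same iterator object, so a
# reset in one consumes the benchmark the other would have seen next.
shared = iter(uris)
parent_shared, fork_shared = shared, shared
shared_results = (next(parent_shared), next(fork_shared))

# Independent iterators: itertools.tee() buffers items so that each copy
# advances on its own. The buffer grows while one copy runs ahead.
parent_tee, fork_tee = itertools.tee(iter(uris))
tee_results = (next(parent_tee), next(fork_tee))

print(shared_results)  # crc32, then qsort
print(tee_results)     # crc32 twice
```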

class compiler_gym.wrappers.CycleOverBenchmarks(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]

Cycle through a list of benchmarks on each call to reset(). Same as IterateOverBenchmarks except the list of benchmarks repeats once exhausted.

__init__(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], fork_shares_iterator: bool = False)[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • benchmarks – An iterable sequence of benchmarks.

  • fork_shares_iterator – If True, the benchmarks iterator will be shared by a forked environment created by env.fork(). This means that calling env.reset() with one environment will advance the iterator in the other. If False, forked environments will use itertools.tee() to create a copy of the iterator so that each iterator may advance independently. However, this requires shared buffers between the environments which can lead to memory overheads if env.reset() is called many times more in one environment than the other.

class compiler_gym.wrappers.CycleOverBenchmarksIterator(env: CompilerEnv, make_benchmark_iterator: Callable[[], Iterable[Union[str, Benchmark]]])[source]

Same as CycleOverBenchmarks except that the user generates the iterator.

__init__(env: CompilerEnv, make_benchmark_iterator: Callable[[], Iterable[Union[str, Benchmark]]])[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • make_benchmark_iterator – A callback that returns an iterator over a sequence of benchmarks. Once the iterator is exhausted, this callback is called to produce a new iterator.
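The callback contract described above, where a fresh iterator is requested each time the current one runs out, can be sketched as a generator. The function cycle_with_factory is a hypothetical name used for illustration and is not part of the library. The early return on an empty iterator is an assumption added to keep the sketch from looping forever.

```python
import itertools
from typing import Callable, Iterable, Iterator


def cycle_with_factory(make_iterator: Callable[[], Iterable[str]]) -> Iterator[str]:
    """Yield items indefinitely, calling make_iterator() again on exhaustion."""
    while True:
        produced = False
        for item in make_iterator():
            produced = True
            yield item
        if not produced:
            return  # Guard: an empty iterator would otherwise spin forever.


def make_benchmark_iterator() -> Iterable[str]:
    return iter(["benchmark://cbench-v1/crc32", "benchmark://cbench-v1/qsort"])


first_five = list(itertools.islice(cycle_with_factory(make_benchmark_iterator), 5))
print(first_five)
```

Unlike itertools.cycle(), which saves a copy of every item it has seen, this approach holds only the current iterator, which matters when the sequence of benchmarks is long.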

class compiler_gym.wrappers.RandomOrderBenchmarks(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], rng: Optional[Generator] = None)[source]

Select randomly from a list of benchmarks on each call to reset().

Note

Uniform random selection is provided by evaluating the input benchmarks iterator into a list and sampling randomly from the list. For very large or infinite iterables of benchmarks, you must use the IterateOverBenchmarks wrapper with your own random sampling iterator.

__init__(env: CompilerEnv, benchmarks: Iterable[Union[str, Benchmark]], rng: Optional[Generator] = None)[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • benchmarks – An iterable sequence of benchmarks. The entirety of this input iterator is evaluated during construction.

  • rng – A random number generator to use for random benchmark selection.
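For the very large or infinite case mentioned in the note above, a random sampling iterator can be an ordinary generator that never materializes a list. The sketch below is one possible shape for such an iterator. The function random_csmith_uris and its max_id bound are hypothetical, the URI form follows the generator://csmith-v0/99 example that appears elsewhere in this document, and the standard library's random.Random is used for self-containment.

```python
import itertools
import random
from typing import Iterator


def random_csmith_uris(seed: int, max_id: int = 2**32 - 1) -> Iterator[str]:
    """Yield an endless stream of randomly chosen benchmark URIs."""
    rng = random.Random(seed)  # Seeded so that the sequence is reproducible.
    while True:
        # max_id is an assumed upper bound, not a documented limit.
        yield f"generator://csmith-v0/{rng.randrange(max_id + 1)}"


# Take a few items from the infinite stream without exhausting it.
sample = list(itertools.islice(random_csmith_uris(seed=0), 3))
print(sample)
```

An iterator like this could be passed as the benchmarks argument of IterateOverBenchmarks.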

LLVM Environment wrappers

class compiler_gym.wrappers.RuntimePointEstimateReward(env: ~compiler_gym.envs.llvm.llvm_env.LlvmEnv, runtime_count: int = 30, warmup_count: int = 0, estimator: ~typing.Callable[[~typing.Iterable[float]], float] = <function median>)[source]

LLVM wrapper that uses a point estimate of program runtime as reward.

This class wraps an LLVM environment and registers a new runtime reward space. Runtime is estimated from one or more runtime measurements, after optionally running one or more warmup runs. At each step, reward is the change in runtime estimate from the runtime estimate at the previous step.

__init__(env: ~compiler_gym.envs.llvm.llvm_env.LlvmEnv, runtime_count: int = 30, warmup_count: int = 0, estimator: ~typing.Callable[[~typing.Iterable[float]], float] = <function median>)[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • runtime_count – The number of times to execute the binary when estimating the runtime.

  • warmup_count – The number of warmup runs of the binary to perform before measuring the runtime.

  • estimator – A function that takes a list of runtime measurements and produces a point estimate.
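How the three tuning parameters interact can be sketched without an LLVM environment. In the following illustration, estimate_runtime and the measure callback are hypothetical, the timings are canned values rather than real measurements, and the sign of the reward (positive when the estimate decreases) is an assumption of this sketch.

```python
import statistics
from typing import Callable, Iterable, List


def estimate_runtime(
    measure: Callable[[], float],
    runtime_count: int = 30,
    warmup_count: int = 0,
    estimator: Callable[[Iterable[float]], float] = statistics.median,
) -> float:
    """Reduce repeated runtime measurements to a single point estimate."""
    for _ in range(warmup_count):
        measure()  # Warmup runs are executed but their timings are discarded.
    samples: List[float] = [measure() for _ in range(runtime_count)]
    return estimator(samples)


# Canned timings in seconds; the first value of each list is a slow warmup run.
before = iter([9.0, 2.0, 2.5, 2.25])
after = iter([8.0, 1.0, 1.5, 1.25])

previous = estimate_runtime(lambda: next(before), runtime_count=3, warmup_count=1)
current = estimate_runtime(lambda: next(after), runtime_count=3, warmup_count=1)

# Assumed sign convention: reward is positive when the program gets faster.
reward = previous - current
print(previous, current, reward)  # 2.25 1.25 1.0
```

A median is less sensitive to occasional slow runs than a mean, which is one reason to prefer it when measurements are noisy.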

class compiler_gym.wrappers.SynchronousSqliteLogger(env: LlvmEnv, db_path: Path, commit_frequency_in_seconds: int = 300, max_step_buffer_length: int = 5000)[source]

A wrapper for an LLVM environment that logs all transitions to an sqlite database.

Wrap an existing LLVM environment and then use it as per normal:

>>> env = SynchronousSqliteLogger(
...     env=gym.make("llvm-autophase-ic-v0"),
...     db_path="example.db",
... )

Connect to the database file you specified:

$ sqlite3 example.db

There are two tables:

  1. States: records every unique combination of benchmark + actions. For each entry, records an identifying state ID, the episode reward, and whether the episode is terminated:

sqlite> .mode markdown
sqlite> .headers on
sqlite> select * from States limit 5;
|      benchmark_uri       | done | ir_instruction_count_oz_reward |                                 state_id | actions        |
|--------------------------|------|--------------------------------|------------------------------------------|----------------|
| generator://csmith-v0/99 | 0    | 0.0                            | d625b874e58f6d357b816e21871297ac5c001cf0 |                |
| generator://csmith-v0/99 | 0    | 0.0                            | d625b874e58f6d357b816e21871297ac5c001cf0 | 31             |
| generator://csmith-v0/99 | 0    | 0.0                            | 52f7142ef606d8b1dec2ff3371c7452c8d7b81ea | 31 116         |
| generator://csmith-v0/99 | 0    | 0.268005818128586              | d8c05bd41b7a6c6157b6a8f0f5093907c7cc7ecf | 31 116 103     |
| generator://csmith-v0/99 | 0    | 0.288621664047241              | c4d7ecd3807793a0d8bc281104c7f5a8aa4670f9 | 31 116 103 109 |

  2. Observations: records pickled, compressed, and text observation values for each unique state.

Caveats of this implementation:

  1. Only LlvmEnv environments may be wrapped.

  2. The wrapped environment must have an observation space and reward space set.

  3. The observation spaces and reward spaces that are logged to database are hardcoded. To change what is recorded, you must copy and modify this implementation.

  4. Writing to the database is synchronous and adds significant overhead to the compute cost of the environment.

__init__(env: LlvmEnv, db_path: Path, commit_frequency_in_seconds: int = 300, max_step_buffer_length: int = 5000)[source]

Constructor.

Parameters
  • env – The environment to wrap.

  • db_path – The path of the database to log to. This file may already exist. If it does, new entries are appended. If the file does not exist, it is created.

  • commit_frequency_in_seconds – The maximum amount of time to elapse before writing pending logs to the database.

  • max_step_buffer_length – The maximum number of calls to step() before writing pending logs to the database.

flush() None[source]

Flush the buffered steps and observations to database.
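The two flush triggers, elapsed time and buffer length, can be illustrated with the standard sqlite3 module. This is a minimal sketch under stated assumptions and not the logger's actual code: the BufferedStepLog class, its record() method, and the Steps table are hypothetical and do not reflect the real database schema.

```python
import sqlite3
import time
from typing import List, Tuple


class BufferedStepLog:
    """Hypothetical buffer that flushes on a step-count or elapsed-time limit."""

    def __init__(
        self,
        connection: sqlite3.Connection,
        commit_frequency_in_seconds: float = 300,
        max_step_buffer_length: int = 5000,
    ) -> None:
        self.connection = connection
        self.commit_frequency_in_seconds = commit_frequency_in_seconds
        self.max_step_buffer_length = max_step_buffer_length
        self.buffer: List[Tuple[str, str]] = []
        self.last_commit = time.monotonic()
        # Illustrative table, not the schema used by the real wrapper.
        connection.execute(
            "CREATE TABLE IF NOT EXISTS Steps (benchmark_uri TEXT, actions TEXT)"
        )

    def record(self, benchmark_uri: str, actions: str) -> None:
        self.buffer.append((benchmark_uri, actions))
        elapsed = time.monotonic() - self.last_commit
        if (
            len(self.buffer) >= self.max_step_buffer_length
            or elapsed >= self.commit_frequency_in_seconds
        ):
            self.flush()

    def flush(self) -> None:
        """Write all pending rows in one transaction and reset the timer."""
        self.connection.executemany("INSERT INTO Steps VALUES (?, ?)", self.buffer)
        self.connection.commit()
        self.buffer.clear()
        self.last_commit = time.monotonic()


log = BufferedStepLog(sqlite3.connect(":memory:"), max_step_buffer_length=2)
log.record("generator://csmith-v0/99", "31")
log.record("generator://csmith-v0/99", "31 116")      # Buffer is full: flushed.
log.record("generator://csmith-v0/99", "31 116 103")  # Stays in the buffer.
rows = log.connection.execute("SELECT COUNT(*) FROM Steps").fetchone()[0]
print(rows, len(log.buffer))  # 2 1
```

Steps still in the buffer are lost if the process exits before a flush, so calling flush() explicitly before shutdown is a sensible precaution in a design like this.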