Moolib Documentation

moolib - a communications library for distributed ML training

moolib offers general-purpose RPC with automatic transport
selection (shared memory, TCP/IP, InfiniBand), allowing models
to data-parallelise their training and synchronise gradients
and model weights across many nodes.

Getting Started

Install from GitHub

pip install git+https://github.com/facebookresearch/moolib

Build from source: Linux

git clone --recursive git@github.com:facebookresearch/moolib
cd moolib && pip install .

Build from source: macOS

git clone --recursive git@github.com:facebookresearch/moolib
cd moolib && USE_CUDA=0 pip install .
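
Whichever install route is used, a quick way to confirm that the package landed in the active Python environment is to ask the import system whether it can locate it. The sketch below uses only the standard library; the helper name `is_installed` is made up for this example and is not part of moolib.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    # find_spec returns None when no such top-level module exists.
    return importlib.util.find_spec(module_name) is not None


if is_installed("moolib"):
    print("moolib is importable")
else:
    print("moolib was not found; re-run one of the install commands above")
```

This only checks that the module can be found, not that its compiled extension loads correctly; running `import moolib` in an interpreter is the stronger test.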

How to host docs:

# after installation
pip install sphinx==4.1.2
cd docs && ./run_docs.sh

API

Classes

Accumulator

Accumulate and synchronize gradients and state from multiple peers in the cohort.

Batcher

An auxiliary class to asynchronously batch tensors into a chosen batch size on a given device.
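
To illustrate what "batching into a chosen batch size" means, here is a minimal pure-Python sketch of the idea. It is not moolib's Batcher interface: the function `batch_items` is hypothetical, it works on plain lists of numbers rather than tensors, and it leaves out the asynchronous and device-placement parts entirely.

```python
from typing import Iterable, Iterator, List


def batch_items(items: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Yield lists of exactly `batch_size` items; a trailing partial batch is not emitted."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    pending: List[float] = []
    for item in items:
        pending.append(item)
        if len(pending) == batch_size:
            # A full batch is ready: hand it out and start collecting a new one.
            yield pending
            pending = []


batches = list(batch_items([1.0, 2.0, 3.0, 4.0, 5.0], batch_size=2))
print(batches)  # [[1.0, 2.0], [3.0, 4.0]]
```

The fifth item never appears in the output because it does not fill a batch. A real batcher would keep it until more items arrive.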

Broker

A class to coordinate a cohort during training.

EnvPool

A class to run sets of gym-like environments in different processes.

EnvStepper

A helper class for EnvPool.

Group

A group of Rpc objects.

Rpc

A class to execute Remote Procedure Calls.
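
As a rough illustration of a round trip, the sketch below has one Rpc object expose a function under a name and a second Rpc object call it. It assumes that `Rpc` provides `set_name`, `listen`, `define`, `connect` and `async_` with the signatures used here, so check the class reference for the installed version before relying on it. The address, the peer name "host" and the function `greet` are arbitrary choices for this example, and the wrapper returns None when moolib is not installed.

```python
def rpc_roundtrip(address="127.0.0.1:4411"):
    """Call a function on a named peer and return its result, or None if moolib is missing."""
    try:
        import moolib
    except ImportError:
        return None

    def greet(name):
        return "hello " + name

    # Assumed API: the host names itself, listens, and registers a callable.
    host = moolib.Rpc()
    host.set_name("host")
    host.listen(address)
    host.define("greet", greet)

    # Assumed API: the client connects and calls the function by peer and function name.
    client = moolib.Rpc()
    client.connect(address)
    future = client.async_("host", "greet", "world")
    return future.get()  # block until the remote call has finished


result = rpc_roundtrip()
print(result)
```

Both peers live in one process here only to keep the example short; in practice the host and client would usually run on different processes or machines.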

Methods

create_uid

Generate a unique user id.

set_logging

Set logging using the Python logging module.

set_log_level

Set the level to log at.

set_max_threads

Set the maximum number of threads used by moolib.

Futures

AllReduce

A future result of an AllReduce operation.
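
For readers unfamiliar with the term, an all-reduce is a collective operation in which every participant contributes a value and every participant receives the same reduced result. The pure-Python sketch below simulates that for a list of peers holding gradient vectors. The function `all_reduce_mean` is hypothetical, the choice of a mean as the reduction is an assumption made for illustration, and no networking or future object is modelled.

```python
from typing import List


def all_reduce_mean(peer_values: List[List[float]]) -> List[List[float]]:
    """Return what each peer holds after an all-reduce with a mean reduction."""
    if not peer_values:
        return []
    length = len(peer_values[0])
    if any(len(values) != length for values in peer_values):
        raise ValueError("all peers must contribute vectors of the same length")
    n = len(peer_values)
    # Reduce element-wise across peers (mean is an assumption, not moolib's documented op).
    mean = [sum(column) / n for column in zip(*peer_values)]
    # Every peer ends up with its own copy of the same reduced vector.
    return [list(mean) for _ in peer_values]


grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(all_reduce_mean(grads))  # [[3.0, 4.0], [3.0, 4.0], [3.0, 4.0]]
```

In a distributed setting the same result is reached without any single peer seeing all inputs at once, and the future described above would resolve once the reduced value is available.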

Future

A future result.

EnvStepperFuture

A future result from an EnvStepper step.

Examples

Some examples are in the ./examples directory.