The instructions in this document are for users who want to use fairseq2 on a
system for which no pre-built fairseq2 package is available, or for users who
want to work on the C++/CUDA code of fairseq2.
Note
If you plan to modify only the Python portions of fairseq2, and fairseq2
provides a pre-built nightly package for your system, we recommend using an
editable pip installation as described in
Setting up Development Environment.
When cloning the repository, pass the --recurse-submodules option to git clone;
it asks Git to clone the third-party dependencies along with fairseq2. If you
have already cloned fairseq2 without --recurse-submodules, you can run the
following command in your cloned repository to achieve the same effect:
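The command itself is missing from the text above. A minimal sketch, assuming it is the standard Git invocation for fetching submodules after a plain clone:

```shell
# Run from the root of your existing fairseq2 clone.
# --init registers submodules that were never initialized;
# --recursive also handles submodules nested inside other submodules.
git submodule update --init --recursive
```

In a repository whose submodules are already up to date, the command prints nothing and exits successfully.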
In the simplest case, you can run the following command to create an empty
Python virtual environment (shown for Python 3.8):
python3.8 -m venv ~/myvenv
Then activate it:
source ~/myvenv/bin/activate
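To confirm that the environment is active before installing anything, a quick check along these lines may help. It relies on the fact that `sys.prefix` and `sys.base_prefix` differ only inside a virtual environment; the printed messages are illustrative.

```shell
# sys.prefix differs from sys.base_prefix only inside a virtual environment.
python3 -c 'import sys; print("venv active" if sys.prefix != sys.base_prefix else "no venv")'
```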
You can check out the Python documentation to learn more about other
environment options.
Important
We strongly recommend creating a new environment from scratch instead of
reusing an existing one to avoid dependency conflicts.
Important
Manually building fairseq2 or any other C++ project in a Conda environment can
become tricky and fail due to environment-specific conflicts with the host
system libraries. Unless necessary, we recommend using a Python virtual
environment to build fairseq2.
If you plan to build fairseq2 in a CUDA environment, you first have to install
a version of the CUDA Toolkit that matches the CUDA version of PyTorch. The
instructions for different toolkit versions can be found on NVIDIA’s website.
Note
If you are on a compute cluster with module support (e.g. FAIR Cluster),
you can typically activate a specific CUDA Toolkit version by running
module load cuda/<VERSION>.
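Since the toolkit version has to match the CUDA version of PyTorch, it can be worth comparing the two before building. The following is a sketch under some assumptions: PyTorch is already installed in the active environment, `nvcc` is on the `PATH`, and `cuda_versions_match` is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: succeeds only if both versions are non-empty and identical.
cuda_versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# CUDA version PyTorch was built with (prints "None" for CPU-only builds).
torch_cuda=$(python3 -c 'import torch; print(torch.version.cuda)' 2>/dev/null || true)

# CUDA Toolkit version reported by nvcc, e.g. "12.1".
toolkit_cuda=$(nvcc --version 2>/dev/null | sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' || true)

if cuda_versions_match "$torch_cuda" "$toolkit_cuda"; then
    echo "OK: PyTorch and the CUDA Toolkit both use CUDA $torch_cuda"
else
    echo "Mismatch: PyTorch=${torch_cuda:-unknown}, toolkit=${toolkit_cuda:-unknown}"
fi
```

If either tool is missing, the corresponding value is reported as unknown rather than aborting the script.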
The final step before installing fairseq2 is to build fairseq2n, fairseq2’s C++
library. Run the following commands at the root directory of your repository to
configure the build:
cd native
cmake -GNinja -B build
Once the configuration step is complete, build fairseq2n using:
cmake --build build
fairseq2 uses reasonable defaults, so the command above is sufficient for a
standard installation; however, if you are familiar with CMake, you can check
out the advanced build options in
native/CMakeLists.txt.
If you would like to build fairseq2’s CUDA kernels, set the FAIRSEQ2N_USE_CUDA
option to ON. When turned on, the version of the CUDA Toolkit installed on
your machine and the version of CUDA that was used to build PyTorch must match:
cmake -GNinja -DFAIRSEQ2N_USE_CUDA=ON -B build
As with the CPU-only build, follow this command with:
cmake --build build
By default, fairseq2 builds its CUDA kernels only for the Volta architecture.
You can override this setting using the CMAKE_CUDA_ARCHITECTURES option. For
instance, the following configuration generates binary and PTX code for the
Ampere architecture (e.g. for A100):
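The configuration referred to here is not shown in the text. It would look roughly like the sketch below, which assumes compute capability 8.0 for the A100 and uses CMake's `-real` and `-virtual` architecture suffixes. `CUDA_ARCHS` is a hypothetical shell variable introduced only for readability.

```shell
# Assumption: the A100 has compute capability 8.0, written "80" in CMake.
# "80-real" generates binary code, "80-virtual" generates PTX.
CUDA_ARCHS="80-real;80-virtual"

# Run from the native/ directory of the repository.
cmake -GNinja \
      -DCMAKE_CUDA_ARCHITECTURES="${CUDA_ARCHS}" \
      -DFAIRSEQ2N_USE_CUDA=ON \
      -B build
```

Shipping PTX alongside the binary code lets newer GPUs compile the kernels at load time, at the cost of a larger library.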
If you want to modify and test fairseq2, installing it in editable mode is
more convenient:
cd native/python
pip install -e .
cd -
pip install -e .
Optionally, you can also install the development tools (e.g. linters,
formatters) if you plan to contribute to fairseq2. See
Contributing to fairseq2 for more information: