fairseq2 consists of two packages: the user-facing fairseq2 package, implemented
in pure Python, and the fairseq2n package, which contains the C++ and CUDA
portions of the library. If pre-built fairseq2n nightly packages are available
for your system (check the [README](.#nightlies)) and you only intend to modify
the Python portions of fairseq2, you can use an editable pip installation as
described below. Otherwise, if you plan to work on C++ or CUDA code, or if
fairseq2n is not available as a pre-built package for your system, follow the
[installation instructions](INSTALL_FROM_SOURCE.md) for building from source.

For an editable installation, first install a nightly build of fairseq2n (shown
here for PyTorch 2.4.0 and the cu121 variant):
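
As a rough sketch of this step, the snippet below assembles the install command from the PyTorch version and variant named above. The index URL pattern is an assumption and should be confirmed against the nightlies section of the README; `--pre`, `--upgrade`, and `--extra-index-url` are standard pip options. The command is only printed so that it can be inspected first.

```shell
# PyTorch version and variant from the text above; change these to match your setup.
PT_VERSION="2.4.0"
VARIANT="cu121"

# ASSUMPTION: the nightly wheels are served from an index laid out like this.
# Verify the URL against the README before relying on it.
INDEX_URL="https://fair.pkg.atmeta.com/fairseq2/whl/nightly/pt${PT_VERSION}/${VARIANT}"

# Print the command for inspection instead of running it.
echo pip install fairseq2n --pre --upgrade --extra-index-url "${INDEX_URL}"
```

Once the printed command looks right for your system, remove the leading `echo` to run it.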

fairseq2n relies on the C++ API of PyTorch, which does not guarantee API/ABI
compatibility between releases. This means you have to install the fairseq2n
variant that exactly matches your PyTorch version. Otherwise, you might
experience issues like immediate process crashes or spurious segfaults. For the
same reason, if you upgrade your PyTorch version, you must also upgrade your
fairseq2n installation.
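
Because the variant has to match exactly, it can help to read it off the installed PyTorch instead of guessing. A minimal sketch, assuming the version string has the form `<version>+<variant>` (e.g. `2.4.0+cu121`); not every PyTorch build carries the `+<variant>` suffix, so the snippet falls back to `unknown` when it is missing. The version string is hard-coded here for illustration.

```shell
# In practice, read the string from the active environment:
#   TORCH_VERSION="$(python -c 'import torch; print(torch.__version__)')"
TORCH_VERSION="2.4.0+cu121"  # hard-coded example value

# Everything before the first "+" is the PyTorch release.
PT_VERSION="${TORCH_VERSION%%+*}"

# Everything after the "+" is the build variant, if one is present.
case "${TORCH_VERSION}" in
    *+*) VARIANT="${TORCH_VERSION#*+}" ;;
    *)   VARIANT="unknown" ;;
esac

echo "PyTorch ${PT_VERSION}, variant ${VARIANT}"
```

If the variant comes out as `unknown`, check how PyTorch was installed before choosing a fairseq2n build.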
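
Then, clone the fairseq2 repository to your machine with
`git clone https://github.com/facebookresearch/fairseq2.git` and change into
the new `fairseq2` directory.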

Next, install the fairseq2 package in editable mode:

    pip install -e .

Finally, make sure to install the development tools (e.g. linters and
formatters):

    pip install -r requirements-devel.txt

Note: Any time you pull the latest fairseq2 commits from GitHub, re-run the
fairseq2n installation command above to get the most up-to-date binary. If you
observe runtime or test failures after the installation, the latest nightlies
may not have been published yet. If the problem persists for more than 12
hours, please create a
[GitHub issue](https://github.com/facebookresearch/fairseq2/issues/new/choose).

Any work that you plan to contribute should ideally be covered by a unit or
integration test. Once you have all your tests in place, ensure the full test
suite passes:

    pytest

By default, the tests will be run on CPU; pass the `--device` (short form
`-d`) option to run them on a specific device (e.g. GPU):

    pytest --device cuda:0

If your changes touch C++ or CUDA code, also run the native tests in addition
to `pytest`:
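
A hedged sketch of this step follows. It assumes the native build directory is `native/build` and that the build produces a test runner at `tests/run-tests` inside it; both paths are assumptions, so adjust them to the layout you used when following INSTALL_FROM_SOURCE.md.

```shell
# ASSUMPTION: location of the native test runner relative to the repository root.
NATIVE_TEST_RUNNER="native/build/tests/run-tests"

if [ -x "${NATIVE_TEST_RUNNER}" ]; then
    # Run the native (C++/CUDA) test suite.
    "${NATIVE_TEST_RUNNER}"
else
    # The runner only exists after fairseq2n has been built from source.
    echo "Native test runner not found at ${NATIVE_TEST_RUNNER}; build fairseq2n first." >&2
fi
```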

Any new or revised user-facing feature included in your work should have
accompanying documentation. Depending on the scope of the work, the
documentation can be just docstrings in Python code or, for larger features,
one or more Sphinx RST files. For docstrings, make sure to follow our formatting
conventions. You can check out any Python file in our code base to study how we
format our docstrings.
To build and test out the library documentation, run the following commands:
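
One possible shape for these commands is sketched below. It assumes the Sphinx sources live in a `doc` directory that has its own `requirements.txt` and a standard Sphinx `Makefile`; the directory and file names are assumptions, while `make html` is the usual Sphinx target for HTML output.

```shell
# ASSUMPTION: directory holding the Sphinx sources, relative to the repository root.
DOC_DIR="doc"

if [ -d "${DOC_DIR}" ]; then
    (
        cd "${DOC_DIR}" &&
        # Install the documentation dependencies (assumed file name).
        pip install -r requirements.txt &&
        # Build the HTML documentation with the Sphinx Makefile.
        make html
    )
else
    echo "Documentation directory ${DOC_DIR} not found; run this from the repository root." >&2
fi
```

Where the HTML output lands depends on the build directory configured in the Makefile. Once you have located it, you can preview the pages locally by running Python's built-in `python -m http.server` from that directory.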
For C++ and CUDA, we do not enforce our coding conventions via a tool (e.g.
clang-format), but we expect you to follow them. You can check out any C++ file
in our code base to study our conventions. Since C++ syntax can become pretty
complex at times, refrain from being too pedantic and prioritize readability
over convention.