fairseq2.datasets
The datasets module provides pre-built datasets and dataset utilities for common NLP and speech tasks.
Coming soon: this documentation is still being written. The datasets module includes:
Common benchmark datasets
Dataset loading utilities
Data preprocessing pipelines
Please refer to the source code and examples in the meantime.