In fairseq2, “assets” refer to the various components that make up a sequence or language modeling task, such as datasets, models, and tokenizers. These assets are essential for training, evaluating, and deploying models.
fairseq2.assets provides an API for loading models using “model cards” from different “stores”.
To organize these assets, fairseq2 uses a concept called “cards,” which are essentially YAML files that describe the assets and their relationships.
For example, you can find all the “cards” in fairseq2 here.
Cards provide a flexible way to define and manage the various components of an NLP task, making it easier to reuse, share, and combine different assets.
A store is a place where all the model cards are stored. In fairseq2, a store is accessed via
fairseq2.assets.AssetStore. Multiple stores are allowed. By default, fairseq2 will look up the following stores:
System asset store: Cards that are shared by all users. By default, the system store is /etc/fairseq2/assets,
but this can be changed via the environment variable FAIRSEQ2_ASSET_DIR.
User asset store: Cards that are only available to the current user. By default, the user store is
~/.config/fairseq2/assets, but this can be changed via the environment variable FAIRSEQ2_USER_ASSET_DIR.
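The default locations and their environment-variable overrides can be sketched in plain Python. This is only an illustration of the lookup rule described above, not fairseq2's own code; the helper name `resolve_store_dirs` is hypothetical.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Defaults as stated in the text above.
DEFAULT_SYSTEM_DIR = "/etc/fairseq2/assets"
DEFAULT_USER_DIR = "~/.config/fairseq2/assets"


def resolve_store_dirs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Path]:
    """Return the system and user card directories (hypothetical helper)."""
    if env is None:
        env = os.environ

    # Each environment variable, when set, replaces the matching default.
    system_dir = Path(env.get("FAIRSEQ2_ASSET_DIR", DEFAULT_SYSTEM_DIR))
    user_dir = Path(env.get("FAIRSEQ2_USER_ASSET_DIR", DEFAULT_USER_DIR))

    # Expand "~" so the user directory is an absolute path.
    return {"system": system_dir, "user": user_dir.expanduser()}
```

Passing the environment as a mapping, instead of always reading `os.environ`, makes the rule easy to check without touching the real environment.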
To register a new store, implement a fairseq2.assets.AssetMetadataProvider and add it to
fairseq2.assets.asset_store. A new directory, for example, can be registered as a model store this way.
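The registration pattern (a store that holds a list of metadata providers and queries them in turn) can be sketched without fairseq2 itself. Everything in this block is a hypothetical stand-in: `DirectoryProvider`, `Store`, and `register_directory` are not fairseq2 classes or functions, and the one-card-per-file layout (`<directory>/<name>.yaml`) is a simplifying assumption made for the sketch.

```python
from pathlib import Path
from typing import Dict, List


class DirectoryProvider:
    """Hypothetical stand-in for a metadata provider backed by a directory."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory

    def get_metadata(self, name: str) -> Dict[str, str]:
        # Assumption: a card named "foo" lives in "<directory>/foo.yaml".
        card_file = self.directory / f"{name}.yaml"
        if not card_file.is_file():
            raise LookupError(f"no card named {name!r} in {self.directory}")
        # Only report where the card is; parsing YAML is out of scope here.
        return {"name": name, "path": str(card_file)}


class Store:
    """Hypothetical stand-in for an asset store that chains providers."""

    def __init__(self) -> None:
        self.metadata_providers: List[DirectoryProvider] = []

    def retrieve(self, name: str) -> Dict[str, str]:
        # Ask each registered provider in turn; the first match wins.
        for provider in self.metadata_providers:
            try:
                return provider.get_metadata(name)
            except LookupError:
                continue
        raise LookupError(f"no provider knows a card named {name!r}")


def register_directory(store: Store, directory: Path) -> None:
    """Register a directory as an additional source of cards."""
    store.metadata_providers.append(DirectoryProvider(directory))
```

With this shape, registering a new store is a single append to the provider list, and lookups by card name do not need to know which directory a card came from.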
A model card is a YAML file that contains information about a model and instructs a
fairseq2.models.utils.generic_loaders.ModelLoader on how to load the model into memory. Each model card
must have two mandatory attributes: name and checkpoint. name is used to identify the model card, and it must
be unique across all
fairseq2 provides example cards for different LLMs in
fairseq2.assets.cards.
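As a minimal sketch, the two mandatory attributes of a model card can be checked once the card's YAML has been parsed into a dictionary. The helper `missing_attributes` is hypothetical, and the card values below are placeholders, not a real model or checkpoint location.

```python
from typing import Any, List, Mapping

# The two attributes every model card must define, per the text above.
MANDATORY_ATTRIBUTES = ("name", "checkpoint")


def missing_attributes(card: Mapping[str, Any]) -> List[str]:
    """Return the mandatory attributes that are absent or empty (hypothetical helper)."""
    return [key for key in MANDATORY_ATTRIBUTES if not card.get(key)]


# A card as it might look after parsing; both values are placeholders.
example_card = {
    "name": "my_model",
    "checkpoint": "/path/to/my_model/checkpoint.pt",
}
```

For `example_card` the helper returns an empty list; a card lacking `checkpoint` would be reported before any attempt to load the model.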
In fairseq2, a model card is accessed via fairseq2.assets.AssetCard. Alternatively, one can call
fairseq2.assets.AssetMetadataProvider.get_metadata(name: str) to get the metadata of the model card with the given name.