fairseq2.models.hg

The fairseq2.models.hg module integrates HuggingFace Transformers models into the fairseq2 framework. It allows you to load and use any HuggingFace model with fairseq2’s training and inference pipelines.

API Reference

High-Level API

fairseq2.models.hg.load_hg_model_simple(name: str, *, model_type: str = 'auto', use_processor: bool = False, device: str = 'cpu', trust_remote_code: bool = False, dtype: dtype | None = None, **kwargs: Any) Any[source]

Load a HuggingFace model with simplified configuration.

This is the main entry point for users who want to load HuggingFace models into fairseq2 with minimal configuration.

Parameters:
  • name – HuggingFace model identifier (e.g., ‘gpt2’, ‘microsoft/DialoGPT-small’)

  • model_type – Type of AutoModel to use (‘auto’, ‘causal_lm’, ‘seq2seq_lm’, ‘custom’)

  • use_processor – Whether to use AutoProcessor instead of AutoTokenizer

  • device – Device placement (‘cpu’, ‘cuda:0’, or ‘auto’ for HF accelerate)

  • trust_remote_code – Whether to trust remote code for custom architectures

  • dtype – PyTorch dtype to use. None means ‘auto’ (let HuggingFace decide)

  • kwargs – Additional kwargs passed to from_pretrained

Returns:

The loaded HuggingFace model

Examples:

Load a standard causal language model:

model = load_hg_model_simple("gpt2")

Load a seq2seq model:

model = load_hg_model_simple("t5-small", model_type="seq2seq_lm")

Load a multimodal model with processor:

model = load_hg_model_simple(
    "Qwen/Qwen2.5-Omni-7B",
    use_processor=True,
    trust_remote_code=True
)

fairseq2.models.hg.load_hg_tokenizer_simple(name: str, *, unk_token: str | None = None, bos_token: str | None = None, eos_token: str | None = None, pad_token: str | None = None, boh_token: str | None = None, eoh_token: str | None = None) HgTokenizer[source]

Load a HuggingFace tokenizer with custom special tokens.

Parameters:
  • name – HuggingFace tokenizer identifier (same as model name)

  • unk_token – Custom unknown token

  • bos_token – Custom beginning of sequence token

  • eos_token – Custom end of sequence token

  • pad_token – Custom padding token

  • boh_token – Custom beginning-of-header token

  • eoh_token – Custom end-of-header token

Returns:

The loaded tokenizer with custom tokens

Examples:

Load a tokenizer with default settings:

tokenizer = load_hg_tokenizer_simple("gpt2")

Load with custom tokens:

tokenizer = load_hg_tokenizer_simple(
    "gpt2",
    pad_token="<pad>",
    eos_token="<end>"
)

Convenience Functions

fairseq2.models.hg.load_causal_lm(name: str, **kwargs: Any) Any[source]

Load a causal language model (GPT-style).

Convenience function for loading causal language models like GPT-2, DialoGPT, or LLaMA.

Parameters:
  • name – HuggingFace model identifier

  • kwargs – Additional arguments passed to load_hg_model_simple

Returns:

A causal language model

Example:

Load GPT-2 for text generation:

model = load_causal_lm("gpt2")

fairseq2.models.hg.load_seq2seq_lm(name: str, **kwargs: Any) Any[source]

Load a sequence-to-sequence model (T5-style).

Convenience function for loading seq2seq models like T5, BART, or Pegasus for tasks like translation, summarization, and question answering.

Parameters:
  • name – HuggingFace model identifier

  • kwargs – Additional arguments passed to load_hg_model_simple

Returns:

A sequence-to-sequence model

Example:

Load T5 for translation:

model = load_seq2seq_lm("t5-small")

fairseq2.models.hg.load_multimodal_model(name: str, **kwargs: Any) Any[source]

Load a multimodal model with processor.

Convenience function for loading multimodal models that require processors instead of tokenizers (e.g., vision-language models).

Parameters:
  • name – HuggingFace model identifier

  • kwargs – Additional arguments passed to load_hg_model_simple

Returns:

A multimodal model

Example:

Load a multimodal model:

model = load_multimodal_model(
    "Qwen/Qwen2.5-Omni-3B",
    trust_remote_code=True
)

Configuration

class fairseq2.models.hg.HuggingFaceModelConfig(*, hf_name: str, model_type: str = 'auto', use_processor: bool = False, device: str = 'cpu', custom_model_class: str | None = None, custom_processor_class: str | None = None, trust_remote_code: bool = False, dtype: dtype | None = None, load_kwargs: dict[str, Any] | None = None, enable_gradient_checkpointing: bool = False)[source]

Bases: object

Configuration for loading HuggingFace models.

This dataclass contains all the parameters needed to configure how a HuggingFace model should be loaded, including device placement, dtype, custom classes, and special loading options.

Parameters:
  • hf_name – The HuggingFace model identifier (e.g., ‘gpt2’)

  • model_type – Type of AutoModel (‘auto’, ‘causal_lm’, ‘seq2seq_lm’, ‘custom’)

  • use_processor – Whether to use AutoProcessor for multimodal models

  • device – Device placement (‘cpu’, ‘cuda:0’, or ‘auto’)

  • custom_model_class – Custom model class name for special cases

  • custom_processor_class – Custom processor class name for special cases

  • trust_remote_code – Whether to trust remote code for custom architectures

  • dtype – PyTorch dtype to use. None means ‘auto’ (let HuggingFace decide)

  • load_kwargs – Additional kwargs to pass to from_pretrained

  • enable_gradient_checkpointing – Whether to enable gradient checkpointing to reduce memory usage during training (only for causal_lm models)

Example:

Create a configuration for GPT-2:

config = HuggingFaceModelConfig(
    hf_name="gpt2",
    model_type="causal_lm",
    device="cuda:0"
)

hf_name: str

The HuggingFace model identifier (e.g., ‘gpt2’).

model_type: str = 'auto'

Type of AutoModel (‘auto’, ‘causal_lm’, ‘seq2seq_lm’, ‘custom’).

use_processor: bool = False

Whether to use AutoProcessor for multimodal models.

device: str = 'cpu'

Device placement: ‘cpu’, ‘cuda:0’, or ‘auto’ for HF accelerate.

custom_model_class: str | None = None

Custom model class name for special cases.

custom_processor_class: str | None = None

Custom processor class name for special cases.

trust_remote_code: bool = False

Whether to trust remote code for custom architectures.

dtype: dtype | None = None

PyTorch dtype to use. None means ‘auto’ (let HuggingFace decide).

load_kwargs: dict[str, Any] | None = None

Additional kwargs to pass to from_pretrained.

enable_gradient_checkpointing: bool = False

Whether to enable gradient checkpointing to reduce memory usage (causal_lm only).

class fairseq2.models.hg.HgTokenizerConfig(*, unk_token: str | None = None, bos_token: str | None = None, eos_token: str | None = None, pad_token: str | None = None, boh_token: str | None = None, eoh_token: str | None = None)[source]

Bases: object

Configuration for HuggingFace tokenizers.

unk_token: str | None = None

The unknown token.

bos_token: str | None = None

The beginning-of-sequence token.

eos_token: str | None = None

The end-of-sequence token.

pad_token: str | None = None

The padding token.

boh_token: str | None = None

The beginning-of-header token.

eoh_token: str | None = None

The end-of-header token.

Factory Functions

fairseq2.models.hg.create_hg_model(config: HuggingFaceModelConfig) Any[source]

Create a HuggingFace model from configuration.

This factory loads models directly from HuggingFace Hub with transformers.

Parameters:

config – HuggingFace model configuration

Returns:

HuggingFace PreTrainedModel

Raises:
  • OperationalError – If the transformers library is not available.

  • HuggingFaceModelError – If model loading fails.
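For instance, a configuration can be built once and handed to the factory (a minimal sketch; the model identifier and loading options are illustrative):

```python
from fairseq2.models.hg import HuggingFaceModelConfig, create_hg_model

# Illustrative configuration; any HuggingFace model identifier works here.
config = HuggingFaceModelConfig(
    hf_name="gpt2",
    model_type="causal_lm",
    device="cpu",
    load_kwargs={"low_cpu_mem_usage": True},  # forwarded to from_pretrained
)

model = create_hg_model(config)
```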

fairseq2.models.hg.register_hg_model_class(config_class_name: str, model_class: Type[PreTrainedModel] | str, tokenizer_class: Type[PreTrainedTokenizer] | str | None = None, processor_class: str | None = None) None[source]

Register a custom model class for models not supported by Auto classes.

This function allows registration of custom model classes that cannot be loaded automatically by HuggingFace’s Auto classes. This is useful for new or experimental model architectures.

Parameters:
  • config_class_name – The name of the config class (e.g., ‘Qwen2_5OmniConfig’)

  • model_class – The model class or its string name

  • tokenizer_class – The tokenizer class or its string name (optional)

  • processor_class – The processor class or its string name (optional)

Raises:

OperationalError – If the transformers library is not available.

Example:

Register a custom model:

register_hg_model_class(
    "Qwen2_5OmniConfig",
    "Qwen2_5OmniForConditionalGeneration",
    processor_class="Qwen2_5OmniProcessor",
)

Tokenizer Classes

class fairseq2.models.hg.HgTokenizer(model: HuggingFaceTokenModel)[source]

Bases: Tokenizer

HuggingFace tokenizer adapter for fairseq2.

This class wraps a HuggingFace tokenizer to make it compatible with fairseq2’s Tokenizer interface. It provides access to both fairseq2 tokenizer methods and the underlying HuggingFace tokenizer.

Example:

Create a tokenizer from a model:

model = load_hg_token_model("gpt2")
tokenizer = HgTokenizer(model)

# Use fairseq2 interface
tokens = tokenizer.encode("Hello world")
text = tokenizer.decode(tokens)

# Access underlying HuggingFace tokenizer
hf_tokenizer = tokenizer.raw

create_encoder(*, task: str | None = None, lang: str | None = None, mode: str | None = None, device: device | None = None, pin_memory: bool = False) TokenEncoder[source]
create_raw_encoder(*, device: device | None = None, pin_memory: bool = False) TokenEncoder[source]
create_decoder(*, skip_special_tokens: bool = False) TokenDecoder[source]
encode(text: str, *, device: device | None = None, pin_memory: bool = False) Tensor[source]
decode(token_indices: Tensor, *, skip_special_tokens: bool = False) str[source]
convert_tokens_to_ids(tokens: list[str] | str) int | list[int][source]
property vocab_info: VocabularyInfo
property unk_token: str | None
property bos_token_id: int | None
property bos_token: str | None
property eos_token_id: int | None
property eos_token: str | None
property pad_token_id: int | None
property pad_token: str | None
property boh_token: str | None
property eoh_token: str | None
property chat_template: str | None
property raw: PreTrainedTokenizer | PreTrainedTokenizerFast
property model: HuggingFaceTokenModel

fairseq2.models.hg.load_hg_tokenizer(path: Path, config: HgTokenizerConfig) HgTokenizer[source]

Load a HuggingFace tokenizer.

Parameters:
  • path – Path to the tokenizer files on disk

  • config – Tokenizer configuration

Returns:

HgTokenizer instance
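Load with an explicit configuration (a sketch; the local path is hypothetical and must contain previously downloaded tokenizer files):

```python
from pathlib import Path

from fairseq2.models.hg import HgTokenizerConfig, load_hg_tokenizer

config = HgTokenizerConfig(pad_token="<pad>", eos_token="<end>")

# Hypothetical path to tokenizer files on disk.
tokenizer = load_hg_tokenizer(Path("/checkpoints/gpt2-tokenizer"), config)
```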

Hub Integration

fairseq2.models.hg.get_hg_model_hub() ModelHub[ModelT, ModelConfigT]

Creates a ModelHub instance when called.

This accessor provides a strongly-typed way to access model hubs; its direct use is meant for model authors rather than library users. See src/fairseq2/models/llama/hub.py for an example.

The use of ModelHubAccessor for model authors
from fairseq2.models import ModelHubAccessor

# Defined in the Python module where the model is implemented.
get_my_model_hub = ModelHubAccessor(
    family_name="my_model_family", kls=MyModel, config_kls=MyModelConfig
)

# `get_my_model_hub()` is treated as a standalone function by the model
# users in other parts of the code like below:
model_config = MyModelConfig()

model = get_my_model_hub().create_new_model(model_config)

fairseq2.models.hg.get_hg_tokenizer_hub() TokenizerHub[TokenizerT, TokenizerConfigT]

Creates a TokenizerHub instance when called, the tokenizer counterpart of get_hg_model_hub().

Exceptions

exception fairseq2.models.hg.HuggingFaceModelError(model_name: str, message: str)[source]

Bases: Exception

Exception raised when HuggingFace model loading fails.

Examples

Basic Model Loading

Use DialoGPT for conversational AI:

from fairseq2.models.hg import load_causal_lm, load_hg_tokenizer_simple
import torch

# Load DialoGPT model
model = load_causal_lm("microsoft/DialoGPT-small")
tokenizer = load_hg_tokenizer_simple("microsoft/DialoGPT-small")

# Conversation
user_input = "How are you doing today?"
inputs = tokenizer.encode(user_input + tokenizer.eos_token).unsqueeze(0)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_length=inputs.shape[1] + 20,
        num_return_sequences=1,
        pad_token_id=tokenizer.vocab_info.eos_idx,
        do_sample=True,
        temperature=0.7,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Bot: {response[len(user_input):]}")

Sequence-to-Sequence Tasks

Use T5 for translation and summarization:

from fairseq2.models.hg import load_seq2seq_lm, load_hg_tokenizer_simple
import torch

# Load T5 model
model = load_seq2seq_lm("t5-small")
tokenizer = load_hg_tokenizer_simple("t5-small")

# Translation task
text = "translate English to French: Hello, how are you?"
inputs = tokenizer.encode(text).unsqueeze(0)

with torch.no_grad():
    outputs = model.generate(inputs, max_length=50)

translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Translation: {translation}")

Custom Model Registration

Register custom models not supported by Auto classes:

from fairseq2.models.hg import register_hg_model_class, load_hg_model_simple

# Register a custom model class
register_hg_model_class(
    config_class_name="Qwen2_5OmniConfig",
    model_class="Qwen2_5OmniForConditionalGeneration",
    processor_class="Qwen2_5OmniProcessor",
)

# Now load the custom model
model = load_hg_model_simple(
    "Qwen/Qwen2.5-Omni-3B",
    model_type="custom",
    use_processor=True,
    trust_remote_code=True,
)

Hub Loading

Load models/tokenizers using the fairseq2 hub system:

from fairseq2.models.hg import get_hg_model_hub, get_hg_tokenizer_hub

name = "hg_qwen25_omni_3b"

# Load a pre-configured model
model_hub = get_hg_model_hub()
model = model_hub.load_model(name)

# Load the corresponding tokenizer
tokenizer_hub = get_hg_tokenizer_hub()
tokenizer = tokenizer_hub.load_tokenizer(name)

Note

This module requires the transformers library. Install it with: pip install transformers

Warning

Some models require trust_remote_code=True for custom architectures. Only use this with trusted model sources.

Module Structure

fairseq2.models.hg.api

High-level API for loading HuggingFace models and tokenizers.

fairseq2.models.hg.config

Configuration classes for HuggingFace model integration.

fairseq2.models.hg.factory

Factory functions for creating HuggingFace models.

fairseq2.models.hg.hub

Hub integration for HuggingFace models and tokenizers.

fairseq2.models.hg.tokenizer

HuggingFace tokenizer integration for fairseq2.

ABCs

class fairseq2.models.hg.HuggingFaceConverter[source]

Bases: ABC

Converts the state dict and configuration of a fairseq2 model to its Hugging Face Transformers equivalent.

Model authors must register their converter implementations with fairseq2 as part of library initialization as shown below:

from fairseq2.models.hg import HuggingFaceConverter
from fairseq2.runtime.dependency import DependencyContainer, register_model_family

class MyModelConverter(HuggingFaceConverter):
    ...

def register_my_model(container: DependencyContainer) -> None:
    register_model_family(container, name="my_model_family", ...)

    container.register_type(
        HuggingFaceConverter, MyModelConverter, key="my_model_family",
    )

abstract to_hg_config(config: object) HuggingFaceConfig[source]

Converts the specified fairseq2 model configuration to its Hugging Face Transformers equivalent.

Raises:

TypeError – config is not of a valid type. The expected type is one registered as part of the ModelFamily.

abstract to_hg_state_dict(state_dict: dict[str, object], config: object) dict[str, object][source]

Converts the specified fairseq2 state dict to its Hugging Face Transformers equivalent.

config is the fairseq2 model configuration and can be used to adjust the converted state dict when necessary.

Raises:

TypeError – config is not of a valid type. The expected type is one registered as part of the ModelFamily.
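Much of a typical to_hg_state_dict implementation is mechanical key renaming. A self-contained sketch of that pattern (the prefix mapping is illustrative and not tied to any real model family):

```python
def rename_keys(state_dict, prefix_map):
    """Rewrite parameter names by prefix substitution (longest prefix wins)."""
    renamed = {}
    for key, value in state_dict.items():
        # Try longer prefixes first so the most specific mapping applies.
        for old, new in sorted(prefix_map.items(), key=lambda p: -len(p[0])):
            if key.startswith(old):
                key = new + key[len(old):]
                break
        renamed[key] = value
    return renamed

# Illustrative: map fairseq2-style decoder keys to HF-style keys.
fs2_state = {"decoder.layers.0.self_attn.q_proj.weight": "w"}
hg_state = rename_keys(fs2_state, {"decoder.layers.": "model.layers."})
# hg_state now has the key 'model.layers.0.self_attn.q_proj.weight'
```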

Classes

class fairseq2.models.hg.HuggingFaceConfig(data: Mapping[str, object], kls_name: str, arch: str | Sequence[str])[source]

Bases: object

Represents the configuration of a Hugging Face Transformers model.

This class is part of the HuggingFaceConverter interface which converts fairseq2 models to their Hugging Face equivalents.

data: Mapping[str, object]

Configuration data.

Each key in this mapping must correspond to an attribute of the actual configuration class in Hugging Face Transformers.

kls_name: str

Name of the configuration class in Hugging Face Transformers. For instance, Qwen3Config or LlamaConfig.

arch: str | Sequence[str]

Architecture(s) of the model as defined in Hugging Face Transformers. For instance, Qwen3ForCausalLM, LlamaForCausalLM.
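For example, a converter targeting a LLaMA-style model might return a value like this (the field values are illustrative; a real converter derives them from the fairseq2 configuration):

```python
from fairseq2.models.hg import HuggingFaceConfig

hg_config = HuggingFaceConfig(
    data={"hidden_size": 4096, "num_hidden_layers": 32, "vocab_size": 32000},
    kls_name="LlamaConfig",
    arch="LlamaForCausalLM",
)
```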

Functions

fairseq2.models.hg.get_hugging_face_converter(family_name: str) HuggingFaceConverter[source]

Returns the HuggingFaceConverter of the specified model family.

Raises:

NotSupportedError – The model family does not support Hugging Face conversion.

fairseq2.models.hg.save_hugging_face_model(save_dir: Path, state_dict: dict[str, object], config: HuggingFaceConfig) None[source]

Saves the state dict and configuration of a Hugging Face Transformers model to the specified directory.

Raises:
  • TypeError – config.kls_name does not correspond to the expected PretrainedConfig subclass of the Hugging Face model.

  • TypeError – state_dict contains non-tensor values, which are not supported by the Safetensors format.

  • ValueError – A key in config does not have a corresponding attribute in Hugging Face model configuration class.

  • OSError – The state dict or configuration cannot be saved to the file system.
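Putting the conversion pieces together, an export workflow might look like the following sketch (the family name and output path are hypothetical, and model / model_config stand for an already-loaded fairseq2 model and its configuration):

```python
from pathlib import Path

from fairseq2.models.hg import get_hugging_face_converter, save_hugging_face_model

# Hypothetical family name; assumes a HuggingFaceConverter was registered for it.
converter = get_hugging_face_converter("my_model_family")

# model and model_config are assumed to be an already-loaded fairseq2 model
# and its configuration.
hg_config = converter.to_hg_config(model_config)
hg_state_dict = converter.to_hg_state_dict(model.state_dict(), model_config)

# Hypothetical output directory.
save_hugging_face_model(Path("/checkpoints/my_model_hg"), hg_state_dict, hg_config)
```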