fairseq2.models.hg.config¶
Configuration classes for HuggingFace model integration.
Functions

Register predefined HuggingFace model configurations.

Classes

HuggingFaceModelConfig – Configuration for loading HuggingFace models.
- class fairseq2.models.hg.config.HuggingFaceModelConfig(*, hf_name: str, model_type: str = 'auto', use_processor: bool = False, device: str = 'cpu', custom_model_class: str | None = None, custom_processor_class: str | None = None, trust_remote_code: bool = False, dtype: dtype | None = None, load_kwargs: dict[str, Any] | None = None, enable_gradient_checkpointing: bool = False)[source]¶
Bases: object

Configuration for loading HuggingFace models.
This dataclass contains all the parameters needed to configure how a HuggingFace model should be loaded, including device placement, dtype, custom classes, and special loading options.
- Parameters:
hf_name – The HuggingFace model identifier (e.g., ‘gpt2’)
model_type – Type of AutoModel (‘auto’, ‘causal_lm’, ‘seq2seq_lm’, ‘custom’)
use_processor – Whether to use AutoProcessor for multimodal models
device – Device placement (‘cpu’, ‘cuda:0’, or ‘auto’)
custom_model_class – Custom model class name for special cases
custom_processor_class – Custom processor class name for special cases
trust_remote_code – Whether to trust remote code for custom architectures
dtype – PyTorch dtype to use. None means ‘auto’ (let HuggingFace decide)
load_kwargs – Additional kwargs to pass to from_pretrained
enable_gradient_checkpointing – Whether to enable gradient checkpointing to reduce memory usage during training (only for causal_lm models)
- Example:
Create a configuration for GPT-2:

config = HuggingFaceModelConfig(
    hf_name="gpt2",
    model_type="causal_lm",
    device="cuda:0",
)
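On device placement, the documented 'auto' value plausibly corresponds to transformers' `device_map="auto"` dispatch, while an explicit device such as 'cuda:0' would map to an ordinary post-load `model.to(device)` move. A minimal sketch of that branching, assuming this split; the helper `resolve_device_kwargs` and the `post_load_device` key are hypothetical names, not part of fairseq2 or transformers:

```python
from typing import Any


def resolve_device_kwargs(device: str) -> dict[str, Any]:
    """Illustrative only: decide whether the documented device field
    becomes a from_pretrained kwarg or a post-load .to() target."""
    if device == "auto":
        # Let accelerate shard the model across available devices.
        return {"device_map": "auto"}
    # Explicit placement: load normally, then move with model.to(device).
    # 'post_load_device' is a made-up key for this sketch.
    return {"post_load_device": device}


print(resolve_device_kwargs("auto"))     # {'device_map': 'auto'}
print(resolve_device_kwargs("cuda:0"))   # {'post_load_device': 'cuda:0'}
```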