fairseq2.recipes.lm

Overview

The fairseq2.recipes.lm module provides recipes and utilities for training and fine-tuning language models, covering pre-training, instruction fine-tuning, preference optimization, and text generation.

Key Features

  • Language model pre-training utilities

  • Instruction fine-tuning support

  • CLI setup for language model training

  • Common training recipes and configurations

Instruction Fine-Tuning

Classes

class fairseq2.recipes.lm.InstructionFinetuneConfig(*, model: 'ModelSection' = <factory>, dataset: 'InstructionFinetuneDatasetSection' = <factory>, gang: 'GangSection' = <factory>, trainer: 'TrainerSection' = <factory>, optimizer: 'OptimizerSection' = <factory>, lr_scheduler: 'LRSchedulerSection' = <factory>, regime: 'RegimeSection' = <factory>, common: 'CommonSection' = <factory>)[source]

Bases: object

final class fairseq2.recipes.lm.InstructionFinetuneCriterion(model)[source]

Bases: object

class fairseq2.recipes.lm.InstructionFinetuneDatasetSection(*, name: 'str' = 'foo', path: 'Path | None' = None, family: 'str' = 'generic_instruction', train_split: 'str' = 'default', valid_split: 'str | None' = None, source_encode_mode: 'str' = 'prompt', target_encode_mode: 'str' = 'prompt_response', min_seq_len: 'int' = 1, max_seq_len: 'int' = 8192, max_num_tokens: 'int' = 16384, batch_size: 'int | None' = None, max_num_valid_tokens: 'int | None' = None, example_shuffle_window: 'int' = 10000, batch_shuffle_window: 'int' = 1000, num_prefetch: 'int' = 4, extras: 'dict[str, object]' = <factory>)[source]

Bases: DatasetSection

source_encode_mode: str = 'prompt'

The encode mode for the prompt; determines which special tokens to add.

target_encode_mode: str = 'prompt_response'

The encode mode for the target; determines which special tokens to add.

min_seq_len: int = 1

The minimum sequence length.

max_seq_len: int = 8192

The maximum sequence length.

max_num_tokens: int = 16384

The maximum number of tokens per batch.

batch_size: int | None = None

If not None, max_num_tokens is ignored and each batch contains batch_size examples.

max_num_valid_tokens: int | None = None

The maximum number of tokens per validation batch.

example_shuffle_window: int = 10000

The size of the sliding window for shuffling examples.

batch_shuffle_window: int = 1000

The size of the sliding window for shuffling batches.

num_prefetch: int = 4

The number of batches to prefetch in the background.

extras: dict[str, object]

The dataset-specific extra options.

final class fairseq2.recipes.lm.InstructionFinetuneUnit(criterion, gangs)[source]

Bases: AbstractTrainUnit[SequenceBatch]

property metric_bag: SequenceMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.InstructionLossEvalUnit(criterion, gangs)[source]

Bases: AbstractEvalUnit[SequenceBatch]

property metric_bag: SequenceMetricBag

The evaluation-related metrics.

Functions

fairseq2.recipes.lm.load_instruction_finetuner(context, config, output_dir)[source]
Return type:

Trainer[SequenceBatch]
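
A minimal sketch of driving this loader programmatically. It assumes that setup_fairseq2() returns the RuntimeContext the loader expects and that the returned Trainer is invoked as a callable, as the fairseq2 CLI does; paths are placeholders.

    from pathlib import Path

    from fairseq2 import setup_fairseq2
    from fairseq2.recipes.lm import (
        InstructionFinetuneConfig,
        load_instruction_finetuner,
    )

    context = setup_fairseq2()  # assumed to yield a RuntimeContext

    config = InstructionFinetuneConfig()
    config.dataset.path = Path("/data/my_instructions")  # placeholder path
    config.dataset.max_seq_len = 4096

    trainer = load_instruction_finetuner(context, config, Path("/checkpoints/run0"))
    trainer()  # assumed callable entry point, mirroring the CLI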

Preference Fine-Tuning

Classes (PO)

class fairseq2.recipes.lm.POFinetuneConfig(*, model: 'ModelSection' = <factory>, dataset: 'POFinetuneDatasetSection' = <factory>, gang: 'GangSection' = <factory>, trainer: 'TrainerSection' = <factory>, criterion: 'POCriterionSection' = <factory>, optimizer: 'OptimizerSection' = <factory>, lr_scheduler: 'LRSchedulerSection' = <factory>, regime: 'RegimeSection' = <factory>, common: 'CommonSection' = <factory>)[source]

Bases: object

class fairseq2.recipes.lm.POFinetuneMetricBag(gang)[source]

Bases: SequenceMetricBag

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_logps(batch, chosen_logps, rejected_logps)[source]

Update the Chosen Sequence Log Probabilities and Rejected Sequence Log Probabilities metrics.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • chosen_logps (Tensor) – The log probabilities for each sequence in batch.chosen.

  • rejected_logps (Tensor) – The log probabilities for each sequence in batch.rejected.

update_sequence_lengths(batch)[source]

Update the Chosen Sequence Length and Rejected Sequence Length metrics.

Parameters:

batch (PreferenceBatch) – The batch processed by the model.
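
A hypothetical call-site helper showing how the two update methods above are typically driven from a preference finetune unit; the import path of PreferenceBatch is an assumption, and the surrounding unit/trainer wiring is omitted.

    from torch import Tensor

    from fairseq2.datasets.preference import PreferenceBatch  # assumed path
    from fairseq2.recipes.lm import POFinetuneMetricBag

    def record_po_metrics(
        bag: POFinetuneMetricBag,
        batch: PreferenceBatch,
        chosen_logps: Tensor,
        rejected_logps: Tensor,
    ) -> None:
        # Per-sequence log probabilities of the chosen and rejected targets.
        bag.update_logps(batch, chosen_logps, rejected_logps)

        # Lengths of the chosen and rejected sequences in the batch.
        bag.update_sequence_lengths(batch)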

class fairseq2.recipes.lm.POCriterionSection(*, name: 'str', config: 'object')[source]

Bases: object

class fairseq2.recipes.lm.POFinetuneDatasetSection(*, name: 'str' = 'gsm8k_dpo', path: 'Path | None' = None, family: 'str' = 'generic_preference', source_encode_mode: 'str' = 'prompt', target_encode_mode: 'str' = 'prompt_response', mask_source_tokens: 'bool' = True, min_seq_len: 'int' = 1, max_seq_len: 'int' = 8192, max_num_tokens: 'int' = 16384, batch_size: 'int | None' = None, example_shuffle_window: 'int' = 10000, batch_shuffle_window: 'int' = 1000, num_prefetch: 'int' = 4, extras: 'dict[str, object]' = <factory>)[source]

Bases: DatasetSection

source_encode_mode: str = 'prompt'

The encode mode for the prompt; determines which special tokens to add.

target_encode_mode: str = 'prompt_response'

The encode mode for the target; determines which special tokens to add.

mask_source_tokens: bool = True

If False, the loss is computed on the src tokens as well as the tgt tokens.

min_seq_len: int = 1

The minimum total length of src + tgt_chosen and of src + tgt_rejected. Shorter examples will be dropped.

max_seq_len: int = 8192

The maximum total length of src + tgt_chosen and of src + tgt_rejected. Longer examples will be dropped.

max_num_tokens: int = 16384

The maximum number of total src, tgt_chosen, and tgt_rejected tokens per batch.

batch_size: int | None = None

If not None, max_num_tokens is ignored and each batch contains batch_size examples.

example_shuffle_window: int = 10000

The size of the sliding window for shuffling examples.

batch_shuffle_window: int = 1000

The size of the sliding window for shuffling batches.

num_prefetch: int = 4

The number of batches to prefetch in the background.

extras: dict[str, object]

The dataset-specific extra options.

class fairseq2.recipes.lm.CpoFinetuneConfig(*, beta: 'float' = 1.0, nll_scale: 'float' = 1.0)[source]

Bases: object

beta: float = 1.0

The coefficient applied to the difference between preferred and dispreferred sequences.

nll_scale: float = 1.0

The coefficient of NLL loss added to the CPO loss.
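
For orientation, the CPO objective from the paper cited below (arXiv:2401.08417), written in terms of these fields; this is a paraphrase of the paper, not a statement of fairseq2's exact implementation:

    \mathcal{L}_{\mathrm{CPO}} = -\log \sigma\bigl(\beta\,[\log \pi_\theta(y_w \mid x) - \log \pi_\theta(y_l \mid x)]\bigr) + \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)

where y_w and y_l are the preferred and dispreferred completions and \sigma is the logistic sigmoid.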

final class fairseq2.recipes.lm.CpoFinetuneUnit(model, gangs, beta=1.0, nll_scale=1.0)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model CPO-finetuning unit. Paper: https://arxiv.org/abs/2401.08417.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: CpoFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.CpoFinetuneUnitHandler[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.CpoFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of a CPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_cpo_loss(batch, loss)[source]

Update the CPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The CPO loss of batch.

class fairseq2.recipes.lm.DpoFinetuneConfig(*, reference_model: 'ReferenceModelSection' = <factory>, reference_dtype: 'DataType' = torch.bfloat16, beta: 'float' = 0.1, nll_scale: 'float' = 0.0, length_normalization: 'bool' = False)[source]

Bases: object

reference_model: ReferenceModelSection

The reference model. If None, the recipe expects reference log-probabilities for the chosen and rejected targets to be provided as float values in each data example (fields reference_score_rejected and reference_score_chosen).

reference_dtype: dtype = torch.bfloat16

The data type of the reference model.

beta: float = 0.1

The coefficient of regularization towards the reference model.

nll_scale: float = 0.0

The coefficient of NLL loss added to the DPO loss.

length_normalization: bool = False

If True, uses length-normalized DPO, which takes the average per-token log probability of a sequence as the implicit reward.
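
For orientation, the DPO objective from the paper cited below (arXiv:2305.18290), written in terms of these fields (a paraphrase, not fairseq2's exact implementation):

    \mathcal{L}_{\mathrm{DPO}} = -\log \sigma\!\left(\beta\left[\log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right]\right) + \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)

With length_normalization = True, the sequence log probabilities are replaced by their per-token averages.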

final class fairseq2.recipes.lm.DpoFinetuneUnit(model, reference_model, gangs, beta=0.1, nll_scale=1.0, length_normalization=False)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model DPO-finetuning unit. Paper: https://arxiv.org/abs/2305.18290.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: DpoFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.DpoFinetuneUnitHandler(context)[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.DpoFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of a DPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_dpo_loss(batch, loss)[source]

Update the DPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The DPO loss of batch.

class fairseq2.recipes.lm.OrpoFinetuneConfig(*, orpo_lambda: 'float' = 1.0, nll_scale: 'float' = 1.0)[source]

Bases: object

orpo_lambda: float = 1.0

The coefficient of the odds-ratio component of the ORPO loss.

nll_scale: float = 1.0

The coefficient of the NLL component of ORPO loss.
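
For orientation, the ORPO objective from the paper cited below (arXiv:2403.07691), with \lambda = orpo_lambda (a paraphrase, not fairseq2's exact implementation):

    \mathcal{L}_{\mathrm{ORPO}} = \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x) - \lambda\,\log \sigma\!\left(\log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}\right), \qquad \mathrm{odds}_\theta(y \mid x) = \frac{\pi_\theta(y \mid x)}{1 - \pi_\theta(y \mid x)}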

final class fairseq2.recipes.lm.OrpoFinetuneUnit(model, gangs, orpo_lambda=1.0, nll_scale=1.0)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model ORPO-finetuning unit. Paper: https://arxiv.org/abs/2403.07691.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: OrpoFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.OrpoFinetuneUnitHandler[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.OrpoFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of an ORPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_orpo_loss(batch, loss)[source]

Update the ORPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The ORPO loss of batch.

class fairseq2.recipes.lm.SimPOFinetuneConfig(*, beta: 'float' = 1, gamma: 'float' = 0.5, nll_scale: 'float' = 0.0)[source]

Bases: object

beta: float = 1

The scaling coefficient of the length-normalized log-probability reward.

gamma: float = 0.5

The target reward margin between positive and negative completions.

nll_scale: float = 0.0

The coefficient of NLL loss added to the SimPO loss.
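
For orientation, the SimPO objective from the paper cited below (arXiv:2405.14734), written in terms of these fields (a paraphrase, not fairseq2's exact implementation); beta scales the average per-token log probability used as the implicit reward, and gamma is the target margin:

    \mathcal{L}_{\mathrm{SimPO}} = -\log \sigma\!\left(\frac{\beta}{|y_w|}\log \pi_\theta(y_w \mid x) - \frac{\beta}{|y_l|}\log \pi_\theta(y_l \mid x) - \gamma\right) + \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)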

final class fairseq2.recipes.lm.SimPOFinetuneUnit(model, gangs, beta=0.1, gamma=0.5, nll_scale=1.0)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model SimPO-finetuning unit. Paper: https://arxiv.org/abs/2405.14734.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: SimPOFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.SimPOFinetuneUnitHandler[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.SimPOFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of a SimPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_simpo_loss(batch, loss)[source]

Update the SimPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The SimPO loss of batch.

Functions (PO)

fairseq2.recipes.lm.load_po_finetuner(context, config, output_dir)[source]
Return type:

Trainer[PreferenceBatch]
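
A sketch of configuring and running preference finetuning with the DPO criterion, under the same assumptions as the instruction finetuning sketch above. The criterion name "dpo" is an assumption and must match the registered handler name; the other values mirror the documented config fields.

    from pathlib import Path

    from fairseq2 import setup_fairseq2
    from fairseq2.recipes.lm import (
        DpoFinetuneConfig,
        POFinetuneConfig,
        load_po_finetuner,
    )

    context = setup_fairseq2()  # assumed to yield a RuntimeContext

    config = POFinetuneConfig()

    # Select and parameterize the criterion; "dpo" is the assumed name.
    config.criterion.name = "dpo"
    config.criterion.config = DpoFinetuneConfig(beta=0.1, nll_scale=0.0)

    config.dataset.path = Path("/data/my_preferences")  # placeholder path

    trainer = load_po_finetuner(context, config, Path("/checkpoints/dpo0"))
    trainer()  # assumed callable entry point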

Text Generation

Classes (Text Generation)

class fairseq2.recipes.lm.TextGenerateConfig(*, model: 'ReferenceModelSection' = <factory>, dataset: 'TextGenerateDatasetSection' = <factory>, gang: 'GangSection' = <factory>, generator: 'GeneratorSection' = <factory>, seq_generator: 'SequenceGeneratorSection' = <factory>, common: 'CommonSection' = <factory>)[source]

Bases: object

class fairseq2.recipes.lm.TextGenerateDatasetSection(*, name: 'str' = 'foo', path: 'Path | None' = None, family: 'str' = 'generic_instruction', split: 'str' = 'default', min_seq_len: 'int' = 1, max_seq_len: 'int' = 8192, num_prefetch: 'int' = 4, extras: 'dict[str, object]' = <factory>)[source]

Bases: DatasetSection

min_seq_len: int = 1

The minimum sequence length.

max_seq_len: int = 8192

The maximum sequence length.

num_prefetch: int = 4

The number of batches to prefetch in the background.

extras: dict[str, object]

The dataset-specific extra options.

final class fairseq2.recipes.lm.TextGenerateUnit(generator, tokenizer, gangs, text_output_stream, json_output_stream)[source]

Bases: AbstractGeneratorUnit[SequenceBatch]

Represents a text generation unit.

property metric_bag: SequenceGenerationMetricBag

The generation-related metrics.

Functions (Text Generation)

fairseq2.recipes.lm.load_text_generator(context, config, output_dir)[source]
Return type:

Generator[SequenceBatch]
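
A sketch of running the generation recipe, under the same assumptions as the training sketches above; generated text and JSON records are presumably written under the given output directory.

    from pathlib import Path

    from fairseq2 import setup_fairseq2
    from fairseq2.recipes.lm import TextGenerateConfig, load_text_generator

    context = setup_fairseq2()  # assumed to yield a RuntimeContext

    config = TextGenerateConfig()
    config.dataset.path = Path("/data/my_prompts")  # placeholder path

    generator = load_text_generator(context, config, Path("/generations/run0"))
    generator()  # assumed callable entry point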

Usage Examples
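
All recipes in this module follow the same pattern: construct the config dataclass, override its fields, and hand it to the matching load_* function, as in the per-recipe sketches above. The sketch below illustrates the batching-related dataset fields; the values are illustrative, and batch_size taking precedence over max_num_tokens follows the field docs above.

    from fairseq2.recipes.lm import InstructionFinetuneConfig

    config = InstructionFinetuneConfig()

    # Token-based batching: pack up to this many tokens into each batch.
    config.dataset.max_num_tokens = 16384
    config.dataset.batch_size = None

    # Alternatively, fixed-size batching: when batch_size is not None,
    # max_num_tokens is ignored and each batch has batch_size examples.
    # config.dataset.batch_size = 8

    # Sequence length bounds (see the field docs above).
    config.dataset.min_seq_len = 1
    config.dataset.max_seq_len = 8192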