fairseq2.recipes.lm

Overview

The fairseq2.recipes.lm module provides recipes and utilities for training and fine-tuning language models, covering pre-training, instruction fine-tuning, preference optimization, and text generation.

Key Features

  • Language model pre-training utilities

  • Instruction fine-tuning support

  • CLI setup for language model training

  • Common training recipes and configurations

Instruction Fine-Tuning

Classes

class fairseq2.recipes.lm.InstructionFinetuneConfig(*, model: 'ModelSection' = <factory>, dataset: 'InstructionFinetuneDatasetSection' = <factory>, gang: 'GangSection' = <factory>, trainer: 'TrainerSection' = <factory>, optimizer: 'OptimizerSection' = <factory>, lr_scheduler: 'LRSchedulerSection' = <factory>, regime: 'RegimeSection' = <factory>, common: 'CommonSection' = <factory>)[source]

Bases: object

final class fairseq2.recipes.lm.InstructionFinetuneCriterion(model)[source]

Bases: object

class fairseq2.recipes.lm.InstructionFinetuneDatasetSection(*, name: 'str' = 'foo', path: 'Path | None' = None, family: 'str' = 'generic_instruction', train_split: 'str' = 'default', valid_split: 'str | None' = None, source_encode_mode: 'str' = 'prompt', target_encode_mode: 'str' = 'prompt_response', min_seq_len: 'int' = 1, max_seq_len: 'int' = 8192, max_num_tokens: 'int' = 16384, batch_size: 'int | None' = None, max_num_valid_tokens: 'int | None' = None, example_shuffle_window: 'int' = 10000, batch_shuffle_window: 'int' = 1000, num_prefetch: 'int' = 4, extras: 'dict[str, object]' = <factory>)[source]

Bases: DatasetSection

source_encode_mode: str = 'prompt'

The encode mode for the prompt; determines which special tokens to add.

target_encode_mode: str = 'prompt_response'

The encode mode for the target; determines which special tokens to add.

min_seq_len: int = 1

The minimum sequence length.

max_seq_len: int = 8192

The maximum sequence length.

max_num_tokens: int = 16384

The maximum number of tokens per batch.

batch_size: int | None = None

If not None, max_num_tokens is ignored and each batch contains batch_size examples.

max_num_valid_tokens: int | None = None

The maximum number of tokens per validation batch.

example_shuffle_window: int = 10000

The size of the sliding window for shuffling examples.

batch_shuffle_window: int = 1000

The size of the sliding window for shuffling batches.

num_prefetch: int = 4

The number of batches to prefetch in the background.

extras: dict[str, object]

The dataset-specific extra options.

final class fairseq2.recipes.lm.InstructionFinetuneUnit(criterion, gangs)[source]

Bases: AbstractTrainUnit[SequenceBatch]

property metric_bag: SequenceMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.InstructionLossEvalUnit(criterion, gangs)[source]

Bases: AbstractEvalUnit[SequenceBatch]

property metric_bag: SequenceMetricBag

The evaluation-related metrics.

Functions

fairseq2.recipes.lm.load_instruction_finetuner(context, config, output_dir)[source]
Return type:

Trainer[SequenceBatch]
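
A minimal sketch of driving this loader programmatically. It assumes that setup_fairseq2() returns the RuntimeContext the loader expects and that the returned Trainer is invoked as a callable, as the fairseq2 CLI does; paths are placeholders.

    from pathlib import Path

    from fairseq2 import setup_fairseq2
    from fairseq2.recipes.lm import (
        InstructionFinetuneConfig,
        load_instruction_finetuner,
    )

    context = setup_fairseq2()  # assumed to yield a RuntimeContext

    config = InstructionFinetuneConfig()
    config.dataset.path = Path("/data/my_instructions")  # placeholder path
    config.dataset.max_seq_len = 4096

    trainer = load_instruction_finetuner(context, config, Path("/checkpoints/run0"))
    trainer()  # assumed callable entry point, mirroring the CLI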

Preference Fine-Tuning

Classes (PO)

class fairseq2.recipes.lm.POFinetuneConfig(*, model: 'ModelSection' = <factory>, dataset: 'POFinetuneDatasetSection' = <factory>, gang: 'GangSection' = <factory>, trainer: 'TrainerSection' = <factory>, criterion: 'POCriterionSection' = <factory>, optimizer: 'OptimizerSection' = <factory>, lr_scheduler: 'LRSchedulerSection' = <factory>, regime: 'RegimeSection' = <factory>, common: 'CommonSection' = <factory>)[source]

Bases: object

class fairseq2.recipes.lm.POFinetuneMetricBag(gang)[source]

Bases: SequenceMetricBag

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_logps(batch, chosen_logps, rejected_logps)[source]

Update the Chosen Sequence Log Probabilities and Rejected Sequence Log Probabilities metrics.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • chosen_logps (Tensor) – The log probabilities for each sequence in batch.chosen.

  • rejected_logps (Tensor) – The log probabilities for each sequence in batch.rejected.

update_sequence_lengths(batch)[source]

Update the Chosen Sequence Length and Rejected Sequence Length metrics.

Parameters:

batch (PreferenceBatch) – The batch processed by the model.
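
A hypothetical call-site helper showing how the two update methods above are typically driven from a preference finetune unit; the import path of PreferenceBatch is an assumption, and the surrounding unit/trainer wiring is omitted.

    from torch import Tensor

    from fairseq2.datasets.preference import PreferenceBatch  # assumed path
    from fairseq2.recipes.lm import POFinetuneMetricBag

    def record_po_metrics(
        bag: POFinetuneMetricBag,
        batch: PreferenceBatch,
        chosen_logps: Tensor,
        rejected_logps: Tensor,
    ) -> None:
        # Per-sequence log probabilities of the chosen and rejected targets.
        bag.update_logps(batch, chosen_logps, rejected_logps)

        # Lengths of the chosen and rejected sequences in the batch.
        bag.update_sequence_lengths(batch)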

class fairseq2.recipes.lm.POCriterionSection(*, name: 'str', config: 'object')[source]

Bases: object

class fairseq2.recipes.lm.POFinetuneDatasetSection(*, name: 'str' = 'gsm8k_dpo', path: 'Path | None' = None, family: 'str' = 'generic_preference', source_encode_mode: 'str' = 'prompt', target_encode_mode: 'str' = 'prompt_response', mask_source_tokens: 'bool' = True, min_seq_len: 'int' = 1, max_seq_len: 'int' = 8192, max_num_tokens: 'int' = 16384, batch_size: 'int | None' = None, example_shuffle_window: 'int' = 10000, batch_shuffle_window: 'int' = 1000, num_prefetch: 'int' = 4, extras: 'dict[str, object]' = <factory>)[source]

Bases: DatasetSection

source_encode_mode: str = 'prompt'

The encode mode for the prompt; determines which special tokens to add.

target_encode_mode: str = 'prompt_response'

The encode mode for the target; determines which special tokens to add.

mask_source_tokens: bool = True

If False, the loss is computed on the src tokens as well as the tgt tokens.

min_seq_len: int = 1

The minimum total length of src + tgt_chosen and of src + tgt_rejected. Shorter examples will be dropped.

max_seq_len: int = 8192

The maximum total length of src + tgt_chosen and of src + tgt_rejected. Longer examples will be dropped.

max_num_tokens: int = 16384

The maximum number of total src, tgt_chosen, and tgt_rejected tokens per batch.

batch_size: int | None = None

If not None, max_num_tokens is ignored and each batch contains batch_size examples.

example_shuffle_window: int = 10000

The size of the sliding window for shuffling examples.

batch_shuffle_window: int = 1000

The size of the sliding window for shuffling batches.

num_prefetch: int = 4

The number of batches to prefetch in the background.

extras: dict[str, object]

The dataset-specific extra options.

class fairseq2.recipes.lm.CpoFinetuneConfig(*, beta: 'float' = 1.0, nll_scale: 'float' = 1.0)[source]

Bases: object

beta: float = 1.0

The coefficient applied to the difference between preferred and dispreferred sequences.

nll_scale: float = 1.0

The coefficient of NLL loss added to the CPO loss.
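
For orientation, the CPO objective from the paper cited below (arXiv:2401.08417), written in terms of these fields; this is a paraphrase of the paper, not a statement of fairseq2's exact implementation:

    \mathcal{L}_{\mathrm{CPO}} = -\log \sigma\bigl(\beta\,[\log \pi_\theta(y_w \mid x) - \log \pi_\theta(y_l \mid x)]\bigr) + \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)

where y_w and y_l are the preferred and dispreferred completions and \sigma is the logistic sigmoid.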

final class fairseq2.recipes.lm.CpoFinetuneUnit(model, gangs, beta=1.0, nll_scale=1.0)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model CPO-finetuning unit. Paper: https://arxiv.org/abs/2401.08417.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: CpoFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.CpoFinetuneUnitHandler[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.CpoFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of a CPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_cpo_loss(batch, loss)[source]

Update the CPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The CPO loss of batch.

class fairseq2.recipes.lm.DpoFinetuneConfig(*, reference_model: 'ReferenceModelSection' = <factory>, reference_dtype: 'DataType' = torch.bfloat16, beta: 'float' = 0.1, nll_scale: 'float' = 0.0, length_normalization: 'bool' = False)[source]

Bases: object

reference_model: ReferenceModelSection

The reference model. If None, the recipe expects reference log-probabilities for the chosen and rejected targets to be provided as float values in each data example (fields reference_score_rejected and reference_score_chosen).

reference_dtype: dtype = torch.bfloat16

The data type of the reference model.

beta: float = 0.1

The coefficient of regularization towards the reference model.

nll_scale: float = 0.0

The coefficient of NLL loss added to the DPO loss.

length_normalization: bool = False

If True, uses length-normalized DPO, which takes the average per-token log probability of a sequence as the implicit reward.
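
For orientation, the DPO objective from the paper cited below (arXiv:2305.18290), written in terms of these fields (a paraphrase, not fairseq2's exact implementation):

    \mathcal{L}_{\mathrm{DPO}} = -\log \sigma\!\left(\beta\left[\log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right]\right) + \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)

With length_normalization = True, the sequence log probabilities are replaced by their per-token averages.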

final class fairseq2.recipes.lm.DpoFinetuneUnit(model, reference_model, gangs, beta=0.1, nll_scale=1.0, length_normalization=False)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model DPO-finetuning unit. Paper: https://arxiv.org/abs/2305.18290.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: DpoFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.DpoFinetuneUnitHandler(context)[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.DpoFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of a DPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_dpo_loss(batch, loss)[source]

Update the DPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The DPO loss of batch.

class fairseq2.recipes.lm.OrpoFinetuneConfig(*, orpo_lambda: 'float' = 1.0, nll_scale: 'float' = 1.0)[source]

Bases: object

orpo_lambda: float = 1.0

The coefficient of the odds-ratio component of the ORPO loss.

nll_scale: float = 1.0

The coefficient of the NLL component of ORPO loss.
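
For orientation, the ORPO objective from the paper cited below (arXiv:2403.07691), with \lambda = orpo_lambda (a paraphrase, not fairseq2's exact implementation):

    \mathcal{L}_{\mathrm{ORPO}} = \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x) - \lambda\,\log \sigma\!\left(\log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}\right), \qquad \mathrm{odds}_\theta(y \mid x) = \frac{\pi_\theta(y \mid x)}{1 - \pi_\theta(y \mid x)}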

final class fairseq2.recipes.lm.OrpoFinetuneUnit(model, gangs, orpo_lambda=1.0, nll_scale=1.0)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model ORPO-finetuning unit. Paper: https://arxiv.org/abs/2403.07691.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: OrpoFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.OrpoFinetuneUnitHandler[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.OrpoFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of an ORPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_orpo_loss(batch, loss)[source]

Update the ORPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The ORPO loss of batch.

class fairseq2.recipes.lm.SimPOFinetuneConfig(*, beta: 'float' = 1, gamma: 'float' = 0.5, nll_scale: 'float' = 0.0)[source]

Bases: object

beta: float = 1

The scaling coefficient of the length-normalized log-probability reward.

gamma: float = 0.5

The target reward margin between positive and negative completions.

nll_scale: float = 0.0

The coefficient of NLL loss added to the SimPO loss.
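
For orientation, the SimPO objective from the paper cited below (arXiv:2405.14734), written in terms of these fields (a paraphrase, not fairseq2's exact implementation); beta scales the average per-token log probability used as the implicit reward, and gamma is the target margin:

    \mathcal{L}_{\mathrm{SimPO}} = -\log \sigma\!\left(\frac{\beta}{|y_w|}\log \pi_\theta(y_w \mid x) - \frac{\beta}{|y_l|}\log \pi_\theta(y_l \mid x) - \gamma\right) + \mathrm{nll\_scale} \cdot \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)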

final class fairseq2.recipes.lm.SimPOFinetuneUnit(model, gangs, beta=0.1, gamma=0.5, nll_scale=1.0)[source]

Bases: AbstractTrainUnit[PreferenceBatch]

Represents the language model SimPO-finetuning unit. Paper: https://arxiv.org/abs/2405.14734.

set_step_nr(step_nr)[source]

Set the current training step number.

property metric_bag: SimPOFinetuneMetricBag

The training-related metrics.

final class fairseq2.recipes.lm.SimPOFinetuneUnitHandler[source]

Bases: POFinetuneUnitHandler

class fairseq2.recipes.lm.SimPOFinetuneMetricBag(gang)[source]

Bases: POFinetuneMetricBag

Holds the metrics of a SimPO preference finetuning task.

Parameters:

gang (Gang) – The gang over which to sync the metrics.

update_simpo_loss(batch, loss)[source]

Update the SimPO loss metric.

Parameters:
  • batch (PreferenceBatch) – The batch processed by the model.

  • loss (Tensor) – The SimPO loss of batch.

Functions (PO)

fairseq2.recipes.lm.load_po_finetuner(context, config, output_dir)[source]
Return type:

Trainer[PreferenceBatch]
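
A sketch of configuring and running preference finetuning with the DPO criterion, under the same assumptions as the instruction finetuning sketch above. The criterion name "dpo" is an assumption and must match the registered handler name; the other values mirror the documented config fields.

    from pathlib import Path

    from fairseq2 import setup_fairseq2
    from fairseq2.recipes.lm import (
        DpoFinetuneConfig,
        POFinetuneConfig,
        load_po_finetuner,
    )

    context = setup_fairseq2()  # assumed to yield a RuntimeContext

    config = POFinetuneConfig()

    # Select and parameterize the criterion; "dpo" is the assumed name.
    config.criterion.name = "dpo"
    config.criterion.config = DpoFinetuneConfig(beta=0.1, nll_scale=0.0)

    config.dataset.path = Path("/data/my_preferences")  # placeholder path

    trainer = load_po_finetuner(context, config, Path("/checkpoints/dpo0"))
    trainer()  # assumed callable entry point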

Text Generation

Classes (Text Generation)

class fairseq2.recipes.lm.TextGenerateConfig(*, model: 'ReferenceModelSection' = <factory>, dataset: 'TextGenerateDatasetSection' = <factory>, gang: 'GangSection' = <factory>, generator: 'GeneratorSection' = <factory>, seq_generator: 'SequenceGeneratorSection' = <factory>, common: 'CommonSection' = <factory>)[source]

Bases: object

class fairseq2.recipes.lm.TextGenerateDatasetSection(*, name: 'str' = 'foo', path: 'Path | None' = None, family: 'str' = 'generic_instruction', split: 'str' = 'default', min_seq_len: 'int' = 1, max_seq_len: 'int' = 8192, num_prefetch: 'int' = 4, extras: 'dict[str, object]' = <factory>)[source]

Bases: DatasetSection

min_seq_len: int = 1

The minimum sequence length.

max_seq_len: int = 8192

The maximum sequence length.

num_prefetch: int = 4

The number of batches to prefetch in the background.

extras: dict[str, object]

The dataset-specific extra options.

final class fairseq2.recipes.lm.TextGenerateUnit(generator, tokenizer, gangs, text_output_stream, json_output_stream)[source]

Bases: AbstractGeneratorUnit[SequenceBatch]

Represents a text generation unit.

property metric_bag: SequenceGenerationMetricBag

The generation-related metrics.

Functions (Text Generation)

fairseq2.recipes.lm.load_text_generator(context, config, output_dir)[source]
Return type:

Generator[SequenceBatch]
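
A sketch of running the generation recipe, under the same assumptions as the training sketches above; generated text and JSON records are presumably written under the given output directory.

    from pathlib import Path

    from fairseq2 import setup_fairseq2
    from fairseq2.recipes.lm import TextGenerateConfig, load_text_generator

    context = setup_fairseq2()  # assumed to yield a RuntimeContext

    config = TextGenerateConfig()
    config.dataset.path = Path("/data/my_prompts")  # placeholder path

    generator = load_text_generator(context, config, Path("/generations/run0"))
    generator()  # assumed callable entry point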

Usage Examples
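
All recipes in this module follow the same pattern: construct the config dataclass, override its fields, and hand it to the matching load_* function, as in the per-recipe sketches above. The sketch below illustrates the batching-related dataset fields; the values are illustrative, and batch_size taking precedence over max_num_tokens follows the field docs above.

    from fairseq2.recipes.lm import InstructionFinetuneConfig

    config = InstructionFinetuneConfig()

    # Token-based batching: pack up to this many tokens into each batch.
    config.dataset.max_num_tokens = 16384
    config.dataset.batch_size = None

    # Alternatively, fixed-size batching: when batch_size is not None,
    # max_num_tokens is ignored and each batch has batch_size examples.
    # config.dataset.batch_size = 8

    # Sequence length bounds (see the field docs above).
    config.dataset.min_seq_len = 1
    config.dataset.max_seq_len = 8192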