Working with Presets

What you will learn
  • What presets are and why they are useful

  • How to use built-in presets

  • How to create custom presets

  • How to override preset configurations

Prerequisites

Overview

Presets are predefined configurations that help you get started quickly with common training scenarios. They encapsulate best practices and tested hyperparameters for specific use cases, and they make it easy to run quick hyperparameter sweeps.

The key benefits of using presets are:

  • Reduce boilerplate configuration code

  • Start with proven configurations

  • Easily customize for your needs

  • Share configurations across experiments

Using Built-in Presets

fairseq2 comes with several built-in presets for common scenarios. To use a preset:

  1. List available presets:

fairseq2 lm instruction_finetune --list-presets
  2. Use a preset:

fairseq2 lm instruction_finetune $OUTPUT_DIR --preset base_10h

The preset will set default values for all configuration parameters. You can override any of these values using --config.

Creating Custom Presets

To create a custom preset:

  1. Define a configuration class (if you are not using an existing one):

from dataclasses import dataclass

@dataclass(kw_only=True)
class MyTrainConfig:
    """Configuration for my training task."""

    learning_rate: float = 1e-4
    """The learning rate."""

    batch_size: int = 32
    """The batch size."""

    profile: tuple[int, int] | None = None
    """The number of steps that the PyTorch profiler should skip and then record."""
  2. Create a preset registry:

# The import path below matches fairseq2 v0.2.x and may differ in other versions.
from fairseq2.config_registry import ConfigRegistry

my_train_presets = ConfigRegistry[MyTrainConfig]()

my_train_preset = my_train_presets.decorator
  3. Define presets using the decorator (a small lookup sketch follows these steps):

@my_train_preset("fast")
def _fast() -> MyTrainConfig:
    return MyTrainConfig(
        learning_rate=1e-3,
        batch_size=64,
        profile=(1000, 10),  # skip 1000 steps then record 10 steps
    )

@my_train_preset("accurate")
def _accurate() -> MyTrainConfig:
    return MyTrainConfig(
        learning_rate=1e-5,
        batch_size=16,
        profile=(1000, 10),  # skip 1000 steps then record 10 steps
    )
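
Once these presets are registered, they can be looked up by name from the registry. The snippet below is a minimal sketch of doing so programmatically; the names() and get() methods are assumptions based on the ConfigRegistry API in fairseq2 v0.2.x and may differ in your installed version.

# Minimal sketch: iterate over the registered presets and inspect their values.
# NOTE: names() and get() are assumed ConfigRegistry methods; verify them
# against your fairseq2 version.
for name in my_train_presets.names():
    config = my_train_presets.get(name)
    print(f"{name}: learning_rate={config.learning_rate}, batch_size={config.batch_size}")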

For complete, real-world preset implementations, see:

  • fairseq2.recipes.wav2vec2.train

  • fairseq2.recipes.lm.instruction_finetune

Overriding Preset Values

You can override any preset values in two ways:

  1. Using command line arguments:

fairseq2 lm instruction_finetune $OUTPUT_DIR \
    --preset llama3_1_instruct \
    --config learning_rate=2e-4 batch_size=16
  2. Using a YAML configuration file:

# my_config.yaml
learning_rate: 2e-4
batch_size: 16

fairseq2 lm instruction_finetune $OUTPUT_DIR \
    --preset llama3_1_instruct \
    --config-file my_config.yaml

The override precedence is:

  1. Command line overrides (highest priority)

  2. Config file values

  3. Preset defaults (lowest priority)
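
For example, if my_config.yaml sets batch_size: 16 but the command line also passes --config batch_size=8, training runs with a batch size of 8; any parameter set in neither place keeps its preset default.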

Best Practices

  • Start with an existing preset close to your use case

  • Create custom presets for configurations you use frequently

  • Document preset parameters and their effects

  • Use meaningful preset names that indicate their purpose

  • Keep presets focused on specific scenarios

  • Version control your custom presets

Go Beyond

Once you are familiar with presets, you can go a step further and run hyperparameter sweeps with little extra effort.

A dummy Slurm example:
presets=(
    "preset_fast"
    "preset_accurate"
    "preset_default"
)

batch_sizes=(
    "16"
    "32"
    "64"
)

output_dir=<your_output_dir>

for preset in "${presets[@]}"; do
    for batch_size in "${batch_sizes[@]}"; do
        echo "Running preset::$preset | batch_size::$batch_size"
        srun fairseq2 <your_recipe> train $output_dir/$preset/batch_size_$batch_size \
            --preset $preset \
            --config batch_size=$batch_size
    done
done

This makes it much easier to manage your experiments and to benchmark training speed across multiple nodes.

Benchmark

See Also