fairseq2.nn.utils

Masking Utilities

fairseq2.nn.utils.mask.apply_mask(seqs, mask, *, fill_value=0)[source]

Applies the specified boolean mask to seqs.

Parameters:
  • seqs (Tensor) – The sequences to mask. Shape: \((N,S,*)\), where \(N\) is the batch size, \(S\) is the sequence length, and \(*\) is any number of sequence-specific dimensions including none.

  • mask (Tensor) – The boolean mask. Must be broadcastable against seqs; positions where the mask is False are filled with fill_value.

  • fill_value (int | float) – The value assigned to the masked-out (i.e. False) positions. Defaults to 0.

Returns:

The input sequences with mask applied. Shape: Same as seqs.

Return type:

Tensor

Apply boolean masks to sequences with proper broadcasting.
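The broadcasting behavior can be illustrated with plain PyTorch. The sketch below is an illustrative stand-in for apply_mask, not fairseq2's actual implementation: it expands an \((N,S)\) mask over the trailing dimensions of seqs and fills the False positions.

```python
import torch

def apply_mask_sketch(seqs: torch.Tensor, mask: torch.Tensor,
                      fill_value: float = 0.0) -> torch.Tensor:
    # Unsqueeze the (N, S) mask so it broadcasts over the trailing
    # sequence-specific dimensions of seqs, i.e. (N, S, *).
    while mask.dim() < seqs.dim():
        mask = mask.unsqueeze(-1)
    # Keep elements where the mask is True; fill the rest with fill_value.
    return seqs.where(mask, torch.full_like(seqs, fill_value))

seqs = torch.ones(2, 3, 4)  # (N=2, S=3, model dim 4)
mask = torch.tensor([[True, True, False],
                     [True, False, False]])
out = apply_mask_sketch(seqs, mask)
print(out[0, 2].tolist())  # → [0.0, 0.0, 0.0, 0.0]
```

The output has the same shape as seqs; only the sequence positions whose mask entry is False are overwritten.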

fairseq2.nn.utils.mask.compute_row_mask(shape, span_len, max_mask_prob, row_lens=None, min_num_spans=0, device=None)[source]

Implements the RowMaskFactory protocol.

Note that, due to mask span overlap, the effective mask probability will be lower than max_mask_prob. The implementation also guarantees that there will always be at least one unmasked element in each row.

Return type:

Tensor | None

Generate random row masks for training objectives like MLM.
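To make the span-overlap note concrete, here is a minimal illustrative sketch of span-based row masking. It is not fairseq2's actual implementation: it samples span start indices per row, marks span_len consecutive positions, caps the number of spans so at most max_mask_prob of each row can be covered (overlap makes the effective rate lower still), and keeps at least one element unmasked per row.

```python
import torch

def row_mask_sketch(shape, span_len, max_mask_prob, row_lens=None):
    # Illustrative span masking: for each row, sample span start indices
    # and mark `span_len` consecutive positions as masked (True).
    num_rows, max_len = shape
    if row_lens is None:
        row_lens = torch.full((num_rows,), max_len)
    mask = torch.zeros(shape, dtype=torch.bool)
    for i, row_len in enumerate(row_lens.tolist()):
        # Cap the span count so that at most max_mask_prob of the row
        # can be covered; overlap only lowers the effective rate.
        num_spans = int(max_mask_prob * row_len / span_len)
        if num_spans == 0:
            continue
        starts = torch.randint(0, row_len - span_len + 1, (num_spans,))
        for s in starts.tolist():
            mask[i, s : s + span_len] = True
        # Guarantee at least one unmasked element in the row.
        if mask[i, :row_len].all():
            mask[i, torch.randint(0, row_len, (1,))] = False
    return mask

mask = row_mask_sketch((4, 16), span_len=2, max_mask_prob=0.65,
                       row_lens=torch.tensor([16, 14, 15, 16]))
```

Positions past each row's length are never masked, and each row always retains at least one unmasked element, mirroring the guarantees stated above.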

Example Usage:

import torch

from fairseq2.nn import BatchLayout
from fairseq2.nn.utils.mask import apply_mask, compute_row_mask

# Create batch with layout
batch_tensor = torch.randn(4, 16, 512)
batch_layout = BatchLayout((4, 16), seq_lens=[16, 14, 15, 16])

# Apply padding mask
padding_mask = batch_layout.position_indices >= 0
masked_batch = apply_mask(batch_tensor, padding_mask, fill_value=0.0)

# Generate random mask for MLM training
random_mask = compute_row_mask(
    shape=(4, 16),
    span_len=2,  # must be no greater than the shortest row length (14 here)
    max_mask_prob=0.65,
    device="cpu",
    row_lens=batch_layout.seq_lens_pt
)