Normalization Layers

class fairseq2.nn.LayerNorm(*args, **kwargs)[source]

Bases: Module, ABC

Applies Layer Normalization to incoming data.
abstract forward(x)[source]
Parameters:

x (Tensor) – The input to normalize. Shape: \((*,H)\), where \(H\) is normalized_shape.

Returns:

The normalized output. Shape: Same as x.

Return type:

Tensor

final class fairseq2.nn.StandardLayerNorm(normalized_shape, bias, *, eps=1e-05, elementwise_affine=True, cast_fp32=False, init_fn=None, device=None, dtype=None)[source]

Bases: LayerNorm

Applies Layer Normalization to incoming data as described in Ba et al. [1].

Parameters:
  • normalized_shape (int | Sequence[int] | Size) – The shape over which to normalize incoming data. For example, if the shape is (3, 5), the incoming data is normalized over the last 2 dimensions (i.e. input.mean((-2, -1))).

  • bias (bool) – If True, learns an additive bias. Ignored if elementwise_affine is False.

  • eps (float) – The value to add to the denominator for numerical stability.

  • elementwise_affine (bool) – If True, learns an affine transformation.
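The computation follows the standard layer-normalization formula from Ba et al.: each input vector is centered by its mean and scaled by its standard deviation over normalized_shape, then optionally passed through the learned affine transformation. A minimal plain-Python sketch of that math for a single vector (hypothetical helper name; the actual module operates on batched tensors):

```python
import math

def layer_norm(x, gamma, beta, eps=1e-05):
    """Normalize one vector x of length H, as in StandardLayerNorm
    with elementwise_affine=True and bias=True."""
    h = len(x)
    mean = sum(x) / h
    var = sum((v - mean) ** 2 for v in x) / h  # biased variance, as in LayerNorm
    inv_std = 1.0 / math.sqrt(var + eps)       # eps stabilizes the denominator
    return [gamma[i] * (x[i] - mean) * inv_std + beta[i] for i in range(h)]

x = [1.0, 2.0, 3.0, 4.0]
y = layer_norm(x, gamma=[1.0] * 4, beta=[0.0] * 4)
```

With identity affine parameters, the output has the same shape as x and is standardized to zero mean and unit variance.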

reset_parameters()[source]
forward(x)[source]
Parameters:

x (Tensor) – The input to normalize. Shape: \((*,H)\), where \(H\) is normalized_shape.

Returns:

The normalized output. Shape: Same as x.

Return type:

Tensor

extra_repr()[source]

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type:

str

final class fairseq2.nn.RMSNorm(normalized_shape, bias, *, eps=1e-05, elementwise_affine=True, init_fn=None, device=None, dtype=None)[source]

Bases: LayerNorm

Applies Root Mean Square Layer Normalization to incoming data as described in Zhang and Sennrich [6].

Parameters:
  • normalized_shape (int | Sequence[int] | Size) – The shape over which to normalize incoming data. For example, if the shape is (3, 5), the incoming data is normalized over the last 2 dimensions (i.e. input.mean((-2, -1))).

  • bias (bool) – If True, learns an additive bias. Ignored if elementwise_affine is False.

  • eps (float) – The value to add to the denominator for numerical stability.

  • elementwise_affine (bool) – If True, learns an affine transformation.
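Unlike StandardLayerNorm, RMSNorm does not subtract the mean: per Zhang and Sennrich, the input is scaled only by its root mean square, which saves the mean computation. A minimal plain-Python sketch of that math for a single vector (hypothetical helper name; the actual module operates on batched tensors):

```python
import math

def rms_norm(x, gamma, eps=1e-05):
    """Normalize one vector x of length H by its root mean square,
    as in RMSNorm with elementwise_affine=True and bias=False."""
    h = len(x)
    ms = sum(v * v for v in x) / h       # mean of squares; no mean subtraction
    inv_rms = 1.0 / math.sqrt(ms + eps)  # eps stabilizes the denominator
    return [gamma[i] * x[i] * inv_rms for i in range(h)]

x = [1.0, 2.0, 3.0, 4.0]
y = rms_norm(x, gamma=[1.0] * 4)
```

The output keeps the shape of x; because no centering is performed, the result is not zero-mean, only rescaled to unit root mean square.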

reset_parameters()[source]
forward(x)[source]
Parameters:

x (Tensor) – The input to normalize. Shape: \((*,H)\), where \(H\) is normalized_shape.

Returns:

The normalized output. Shape: Same as x.

Return type:

Tensor

extra_repr()[source]

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type:

str