TransformerNormOrder

class fairseq2.nn.transformer.TransformerNormOrder(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Specifies the Layer Normalization order.

classmethod __iter__()

Return members in definition order.

POST = 0

Apply Layer Normalization after each layer’s residual connection as described in Vaswani et al. [VSP+17].

PRE = 1

Apply Layer Normalization at the beginning of each layer as described in Xiong et al. [XYH+20].

PRE_WITH_NORMFORMER = 2

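Apply Layer Normalization as described in Shleifer et al. [SWO21] (NormFormer), which extends the pre-normalization order with additional normalization operations inside each layer.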
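
The difference between the three orders is easiest to see on a single residual sub-layer. The following is a minimal pure-Python sketch, not fairseq2's implementation: `NormOrder` is a local stand-in that mirrors the documented member values, `layer_norm` omits the learnable scale and bias, and `residual_block` is a hypothetical helper. The NormFormer branch shows only the extra normalization of the sub-layer output; the other changes described in the paper (head-wise scaling, normalization inside the feed-forward network) are left out.

```python
from enum import Enum
from math import sqrt


class NormOrder(Enum):
    """Local stand-in mirroring the documented members (assumption)."""

    POST = 0
    PRE = 1
    PRE_WITH_NORMFORMER = 2


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / sqrt(var + eps) for v in x]


def residual_block(x, sublayer, order):
    """Apply one sub-layer with a residual connection in the given norm order."""
    if order is NormOrder.POST:
        # Post-LN: normalize after adding the residual.
        y = sublayer(x)
        return layer_norm([a + b for a, b in zip(x, y)])

    # Pre-LN: normalize the input of the sub-layer; the residual path is untouched.
    y = sublayer(layer_norm(x))

    if order is NormOrder.PRE_WITH_NORMFORMER:
        # NormFormer-style: additionally normalize the sub-layer output
        # before it is added back to the residual.
        y = layer_norm(y)

    return [a + b for a, b in zip(x, y)]


def double(v):
    """Toy sub-layer standing in for attention or a feed-forward network."""
    return [2.0 * t for t in v]


x = [1.0, 2.0, 3.0, 4.0]
post = residual_block(x, double, NormOrder.POST)
pre = residual_block(x, double, NormOrder.PRE)
normformer = residual_block(x, double, NormOrder.PRE_WITH_NORMFORMER)
```

With `POST` the block output is itself normalized, whereas with `PRE` and `PRE_WITH_NORMFORMER` the unnormalized input `x` passes straight through the residual path. Iterating over the enum, for example `list(NormOrder)`, yields the members in definition order.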