All

ABCs and Protocols

gang.Gang

Represents a set of processes that work collectively.

Classes

optim.lr_scheduler.CosineAnnealingLR

Represents the learning rate schedule described in Loshchilov and Hutter [LH17].
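
As a rough illustration of the schedule in [LH17], the sketch below computes the cosine-annealed learning rate within a single cycle. The function name and parameters (`cosine_annealing_lr`, `cycle_len`, `max_lr`, `min_lr`) are hypothetical and are not the constructor arguments of the class listed here; warm restarts are omitted.

```python
import math


def cosine_annealing_lr(step: int, cycle_len: int, max_lr: float, min_lr: float = 0.0) -> float:
    """Return the learning rate at `step` within one cosine annealing cycle.

    Hypothetical helper for illustration only; not part of the library.
    """
    # Clamp the step into [0, cycle_len] so the result stays within [min_lr, max_lr].
    t = min(max(step, 0), cycle_len)
    # eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T))
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * t / cycle_len))
```

The rate starts at `max_lr` when `step` is 0 and reaches `min_lr` at the end of the cycle.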

optim.lr_scheduler.MyleLR

Represents a scaled version of NoamLR that preserves the base learning rate of the associated optimizer.
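
One plausible reading of "scaled version of NoamLR that preserves the base learning rate" is to rescale the Noam factor so that it peaks at exactly 1 when warmup ends. The sketch below follows that reading; it is an assumption, not a confirmed description of the class, and `myle_lr` and its parameters are hypothetical names.

```python
def myle_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Return a Noam-style learning rate rescaled to peak at `base_lr`.

    Hypothetical helper for illustration only; not part of the library.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Linear warmup (step / warmup_steps) followed by inverse square root
    # decay (sqrt(warmup_steps / step)); both terms equal 1 at step == warmup_steps.
    factor = min((warmup_steps / step) ** 0.5, step / warmup_steps)
    return base_lr * factor
```

Under this assumption, the optimizer's base learning rate is reached at the last warmup step and decays afterwards.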

optim.lr_scheduler.NoamLR

Represents the learning rate schedule described in Section 5.3 of Vaswani et al. [VSP+17].
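
For reference, the formula given in Section 5.3 of [VSP+17] can be sketched as below. The function name and parameters (`noam_lr`, `model_dim`, `warmup_steps`) are hypothetical; the class listed here may expose the schedule differently, for example as a factor applied to the optimizer's learning rate.

```python
def noam_lr(step: int, model_dim: int, warmup_steps: int) -> float:
    """Return the learning rate at 1-based `step` using the Noam formula.

    Hypothetical helper for illustration only; not part of the library.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    return model_dim ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

The rate grows linearly during the first `warmup_steps` steps and then decays in proportion to the inverse square root of the step number.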

optim.lr_scheduler.PolynomialDecayLR

Represents the polynomial decay learning rate schedule.
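
A minimal sketch of polynomial decay in its common form is shown below. The names `polynomial_decay_lr`, `total_steps`, `final_lr`, and `power` are hypothetical, and any warmup phase the class may support is omitted.

```python
def polynomial_decay_lr(
    step: int,
    total_steps: int,
    base_lr: float,
    final_lr: float = 0.0,
    power: float = 1.0,
) -> float:
    """Return the learning rate at `step` under polynomial decay.

    Hypothetical helper for illustration only; not part of the library.
    """
    # Clamp the step so the rate stays at final_lr once decay has finished.
    t = min(max(step, 0), total_steps)
    remaining = 1.0 - t / total_steps
    # lr = (base_lr - final_lr) * (1 - t / T)^power + final_lr
    return (base_lr - final_lr) * remaining ** power + final_lr
```

With `power=1.0` this reduces to linear decay from `base_lr` to `final_lr`.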

Enums

nn.transformer.TransformerNormOrder

Specifies the Layer Normalization order, i.e. whether normalization is applied before or after each sublayer.

Functions

nn.utils.mask.to_float_mask

Convert a boolean mask to a float mask.
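
The conversion can be illustrated with plain Python lists instead of tensors. The sketch assumes that `True` marks a masked position and maps it to negative infinity, following the usual PyTorch convention for additive attention masks; the helper name `to_float_mask_sketch` is hypothetical and the actual function may differ.

```python
import math
from typing import List


def to_float_mask_sketch(mask: List[List[bool]]) -> List[List[float]]:
    """Convert a 2-D boolean mask to an additive float mask.

    Assumption: True means "masked". Masked positions become -inf and all
    other positions become 0.0, so the result can be added to attention scores.
    """
    return [[-math.inf if masked else 0.0 for masked in row] for row in mask]
```

For example, `to_float_mask_sketch([[False, True]])` yields a row whose first entry is `0.0` and whose second entry is negative infinity.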