All
ABCs and Protocols
Represents a set of processes that work collectively. |
Classes
Represents the learning rate schedule described in Loshchilov and Hutter [LH17]. |
|
Represents a scaled version of |
|
Represents the learning rate schedule described in Section 5.3 of Vaswani et al. [VSP+17]. |
|
Represents the polynomial decay learning rate schedule. |
Enums
Specifies the Layer Normalization order. |
Functions
Convert a boolean mask to a float mask. |