PolynomialDecayLR

final class fairseq2.optim.lr_scheduler.PolynomialDecayLR(optimizer, num_steps, num_warmup_steps, *, power=1.0, start_lr=0.0, final_lr=0.0, last_epoch=-1)[source]

Bases: AbstractLRScheduler

Represents the polynomial decay learning rate schedule.

During warmup:

\[\eta_t = \eta_{base} \frac{t}{T_{warmup}}\]

After warmup:

\[\eta_t = \eta_{final} + (\eta_{base} - \eta_{final}) \left(\frac{T - t}{T - T_{warmup}}\right)^{p}\]

This corresponds to increasing the learning rate linearly for the first \(T_{warmup}\) training steps to the base learning rate, and decreasing it thereafter for \(T - T_{warmup}\) steps to the final learning rate using a polynomial of degree \(p\).
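The two formulas above can be restated as a short plain-Python sketch. The helper name `polynomial_decay_lr` is illustrative only, not part of the fairseq2 API, and the sketch assumes that `start_lr` enters the warmup ramp as a linear interpolation toward the base rate (the warmup formula above shows the default `start_lr=0` case):

```python
def polynomial_decay_lr(
    t: int,
    base_lr: float,
    num_steps: int,
    num_warmup_steps: int,
    power: float = 1.0,
    start_lr: float = 0.0,
    final_lr: float = 0.0,
) -> float:
    """Illustrative restatement of the schedule formulas, not the fairseq2 code."""
    if t < num_warmup_steps:
        # Warmup: linear ramp toward base_lr over T_warmup steps. With the
        # default start_lr=0 this reduces to base_lr * t / T_warmup as above.
        return start_lr + (base_lr - start_lr) * t / num_warmup_steps

    if t >= num_steps:
        return final_lr

    # Decay: polynomial of degree `power` from base_lr down to final_lr.
    remaining = (num_steps - t) / (num_steps - num_warmup_steps)
    return final_lr + (base_lr - final_lr) * remaining**power
```

For example, with `base_lr=1.0`, `num_steps=100`, `num_warmup_steps=10`, and `power=1.0`, the rate is `1.0` at step 10, `0.5` at step 55 (halfway through the decay phase), and `0.0` at step 100.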

Note

This scheduler is not chainable.

Parameters:
  • optimizer (Optimizer) – The associated optimizer.

  • num_steps (int) – The total number of steps, including warmup, over which to decay the learning rate.

  • num_warmup_steps (int) – The number of warmup steps.

  • power (float) – The exponent of the polynomial used for decay.

  • start_lr (Union[float, Sequence[float]]) – The initial warmup learning rate of all parameter groups, or of each parameter group respectively.

  • final_lr (Union[float, Sequence[float]]) – The final learning rate of all parameter groups, or of each parameter group respectively.

  • last_epoch (int) – The index of the last epoch.

get_last_lr()

Return the last learning rate computed by the current scheduler.

Return type:

List[float]

get_lr()

Compute the learning rate using the chainable form of the scheduler.

Return type:

List[float]

load_state_dict(state_dict)

Load the scheduler’s state.

Parameters:
  • state_dict (dict) – The scheduler state. Should be an object returned from a call to state_dict().

print_lr(is_verbose, group, lr, epoch=None)

Display the current learning rate.

Deprecated since version 2.4: print_lr() is deprecated. Please use get_last_lr() to access the learning rate.

state_dict()

Return the state of the scheduler as a dict.

It contains an entry for every variable in self.__dict__ which is not the optimizer.

step(epoch=None)

Advance the scheduler by one step, updating the learning rate of each parameter group.