qhoptim: Quasi-hyperbolic optimization¶
The qhoptim library provides PyTorch and TensorFlow implementations of the quasi-hyperbolic momentum (QHM) and quasi-hyperbolic Adam (QHAdam) optimization algorithms from Facebook AI Research.
Quickstart¶
Use this one-liner for installation:
$ pip install qhoptim
Then, you can instantiate the optimizers in PyTorch:
>>> from qhoptim.pyt import QHM, QHAdam
# something like this for QHM
>>> optimizer = QHM(model.parameters(), lr=1.0, nu=0.7, momentum=0.999)
# or something like this for QHAdam
>>> optimizer = QHAdam(
... model.parameters(), lr=1e-3, nus=(0.7, 1.0), betas=(0.995, 0.999))
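For orientation, here is a minimal end-to-end sketch of how one of these optimizers drops into an ordinary PyTorch training loop. The tiny linear model, random data, and loss function are illustrative placeholders, not part of qhoptim; only the QHAdam call itself comes from the library:
>>> import torch
>>> from qhoptim.pyt import QHAdam
>>> # Toy setup for illustration only: a tiny linear model on random data.
>>> model = torch.nn.Linear(10, 1)
>>> inputs, targets = torch.randn(32, 10), torch.randn(32, 1)
>>> loss_fn = torch.nn.MSELoss()
>>> optimizer = QHAdam(
...     model.parameters(), lr=1e-3, nus=(0.7, 1.0), betas=(0.995, 0.999))
>>> for _ in range(10):
...     optimizer.zero_grad()                    # clear stale gradients
...     loss = loss_fn(model(inputs), targets)   # forward pass
...     loss.backward()                          # compute gradients
...     optimizer.step()                         # apply the QHAdam update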
Or in TensorFlow:
>>> from qhoptim.tf import QHMOptimizer, QHAdamOptimizer
# something like this for QHM
>>> optimizer = QHMOptimizer(
... learning_rate=1.0, nu=0.7, momentum=0.999)
# or something like this for QHAdam
>>> optimizer = QHAdamOptimizer(
... learning_rate=1e-3, nu1=0.7, nu2=1.0, beta1=0.995, beta2=0.999)
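Assuming the TensorFlow optimizers implement the standard TensorFlow 1.x tf.train.Optimizer interface, a graph-mode usage sketch might look like the following. The scalar toy objective and the small learning rate are illustrative placeholders, not recommendations:
>>> import tensorflow as tf
>>> from qhoptim.tf import QHMOptimizer
>>> # Toy scalar objective for illustration: minimize (w - 3)^2.
>>> w = tf.Variable(0.0)
>>> loss = tf.square(w - 3.0)
>>> optimizer = QHMOptimizer(learning_rate=0.01, nu=0.7, momentum=0.999)
>>> train_op = optimizer.minimize(loss)          # standard tf.train.Optimizer call
>>> with tf.Session() as sess:
...     sess.run(tf.global_variables_initializer())
...     for _ in range(100):                     # take some optimization steps
...         sess.run(train_op)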
Please refer to the links on the menubar for detailed installation instructions and API references.
Choosing QHM parameters¶
For those who use momentum or Nesterov’s accelerated gradient with momentum constant \(\beta = 0.9\), we recommend trying out QHM with \(\nu = 0.7\) and momentum constant \(\beta = 0.999\). You’ll need to normalize the learning rate by dividing by \(1 - \beta_{old}\).
Similarly, for those who use Adam with \(\beta_1 = 0.9\), we recommend trying out QHAdam with \(\nu_1 = 0.7\), \(\beta_1 = 0.995\), \(\nu_2 = 1\), and all other parameters unchanged.
Below is a handy widget to help convert from SGD with (Nesterov) momentum to QHM:
QHM Hyperparameter Advisor (interactive widget, available in the online documentation): given your current SGD/NAG learning rate and momentum, it outputs the corresponding QHM learning rate (alpha), immediate discount (nu), and momentum (beta).
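The same conversion can also be done by hand. The helper below is a hypothetical illustration of the rule above; it is not part of the qhoptim API:
>>> def qhm_params_from_momentum(old_lr, old_beta):
...     # Hypothetical helper, not part of qhoptim: apply the conversion rule above,
...     # dividing the old learning rate by (1 - old_beta) and using nu=0.7, beta=0.999.
...     return {"lr": old_lr / (1.0 - old_beta), "nu": 0.7, "momentum": 0.999}
>>> # SGD with lr=0.1 and momentum=0.9 maps to roughly lr=1.0, nu=0.7,
>>> # momentum=0.999, matching the QHM example in the quickstart above.
>>> params = qhm_params_from_momentum(old_lr=0.1, old_beta=0.9)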
Reference¶
QHM and QHAdam were proposed in the ICLR 2019 paper “Quasi-hyperbolic momentum and Adam for deep learning”. We recommend reading the paper for both theoretical insights into and empirical analyses of the algorithms.
If you find the algorithms useful in your research, we ask that you cite the paper as follows:
@inproceedings{ma2019qh,
  title={Quasi-hyperbolic momentum and Adam for deep learning},
  author={Jerry Ma and Denis Yarats},
  booktitle={International Conference on Learning Representations},
  year={2019}
}