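The Adai (adaptive inertia) optimizer [2] has been added to gdtuo, the code accompanying "Gradient descent: The ultimate optimizer" [1].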
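
For orientation, here is a minimal sketch of one Adai update step, assuming the update rule as published in [2]. It is plain Python over flat lists of floats and is not the code added to gdtuo. The function name `adai_step`, the `state` dictionary layout, the hyperparameter values `beta0=0.1`, `beta2=0.99`, `eps=1e-3`, and the handling of an all-zero gradient are assumptions made for this illustration.

```python
def adai_step(params, grads, state, lr, beta0=0.1, beta2=0.99, eps=1e-3):
    """One illustrative Adai step. Returns new params; updates `state` in place."""
    n = len(params)
    if not state:
        # m: first moment, v: second moment, mu_prod: running product of mu per parameter
        state.update(t=0, m=[0.0] * n, v=[0.0] * n, mu_prod=[1.0] * n)
    state["t"] += 1
    t = state["t"]

    # Bias-corrected second-moment estimate, as in Adam.
    state["v"] = [beta2 * v + (1 - beta2) * g * g for v, g in zip(state["v"], grads)]
    v_hat = [v / (1 - beta2 ** t) for v in state["v"]]
    v_mean = sum(v_hat) / n

    new_params = []
    for i in range(n):
        # Assumption for this sketch: if every gradient is zero, use a ratio of 1.
        ratio = v_hat[i] / v_mean if v_mean > 0 else 1.0
        # Per-parameter inertia (momentum coefficient), clipped to [0, 1 - eps].
        mu = min(max(1 - beta0 * ratio, 0.0), 1 - eps)
        state["m"][i] = mu * state["m"][i] + (1 - mu) * grads[i]
        state["mu_prod"][i] *= mu
        # Bias correction uses the running product of the inertia values.
        m_hat = state["m"][i] / (1 - state["mu_prod"][i])
        # The learning rate is not rescaled per parameter; only the inertia adapts.
        new_params.append(params[i] - lr * m_hat)
    return new_params
```

In this sketch the bias correction makes the very first step equal to a plain gradient step, `params - lr * grads`. Parameters with a larger second moment than average receive smaller inertia on later steps.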
[1] Chandra, K., Xie, A., Ragan-Kelley, J., & Meijer, E. (2022). Gradient descent: The ultimate optimizer. Advances in Neural Information Processing Systems, 35, 8214-8225.
[2] Xie, Z., Wang, X., Zhang, H., Sato, I., & Sugiyama, M. (2022, June). Adaptive inertia: Disentangling the effects of adaptive learning rate and momentum. In International Conference on Machine Learning (pp. 24430-24459). PMLR.
