A Momentum Method Based on the Polyak Step-Size
Abstract:
Momentum methods have recently been widely adopted for training machine learning models. Building on the Polyak step-size and the Moving Average Gradient (MAG) method, this paper proposes a new momentum method, LAGP, and combines it with stochastic gradients to obtain the SLAGP method. Linear convergence is established for both LAGP and SLAGP under the semi-strongly convex condition. Numerical experiments show that LAGP and SLAGP have clear advantages over other popular algorithms.