OALib Journal (ISSN: 2333-9721)

A New Stochastic Recursive Gradient Algorithm with a BB-Like Stepsize

DOI: 10.12677/PM.2023.1311328, PP. 3165-3175

Keywords: Stochastic Recursive Gradient Algorithm, BB Stepsize, Adaptive Computing, Stochastic Optimization


Abstract:

The stochastic recursive gradient algorithm (SARAH) has attracted much interest recently. It admits a simple recursive framework for updating stochastic gradient estimates. Combining SARAH with an importance sampling strategy yields the SARAH-I algorithm. Building on this, this paper proposes a new stochastic recursive gradient method that equips SARAH-I with a BB-like stepsize possessing a two-dimensional quadratic termination property, so that the stepsizes of SARAH-I are computed adaptively and good numerical performance is obtained. Numerical experiments show that the new algorithm is insensitive to the choice of the initial stepsize and is able to generate the optimal stepsize automatically.
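To make the recursive gradient update and the role of an adaptively computed BB stepsize concrete, the following is a minimal Python sketch, not the paper's algorithm: it uses uniform inner sampling and a plain BB1 rule refreshed at each outer iteration, whereas SARAH-I samples by importance and the paper's stepsize is a BB-like rule with a two-dimensional quadratic termination property. The names sarah_bb, grad_i, and full_grad are assumptions introduced for illustration only.

import numpy as np

def sarah_bb(grad_i, full_grad, w0, n, eta0=0.1, outer_iters=20, inner_iters=100, seed=0):
    # Sketch of a SARAH-type loop whose stepsize is refreshed by a
    # Barzilai-Borwein (BB1) rule at each outer iteration.
    #   grad_i(w, i): stochastic gradient of component i at w
    #   full_grad(w): full gradient at w
    # The uniform inner sampling and the plain BB1 rule are illustrative
    # assumptions, not the paper's exact SARAH-I / BB-like stepsize.
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    eta = eta0
    w_snap_old, g_snap_old = None, None
    for _ in range(outer_iters):
        g = full_grad(w)                       # full gradient at the current snapshot
        if w_snap_old is not None:
            sk, yk = w - w_snap_old, g - g_snap_old
            denom = inner_iters * abs(sk @ yk)
            if denom > 1e-12:
                eta = (sk @ sk) / denom        # BB1 stepsize scaled by the inner loop length
        w_snap_old, g_snap_old = w.copy(), g.copy()
        v, w_old = g, w.copy()
        w = w - eta * v                        # first inner step uses the full gradient
        for _ in range(inner_iters):
            i = rng.integers(n)                # uniform sampling; SARAH-I samples by importance
            v = grad_i(w, i) - grad_i(w_old, i) + v   # SARAH recursive gradient estimate
            w_old = w.copy()
            w = w - eta * v
    return w

For a least-squares objective f(w) = (1/n) * sum_i (a_i^T w - b_i)^2, for instance, grad_i(w, i) would return 2 * a_i * (a_i^T w - b_i) and full_grad(w) the average of these terms; the BB stepsize is then recomputed from successive snapshots without any manual tuning beyond eta0.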

