|
Pure Mathematics 2024
一种非精确邻近梯度算法
|
Abstract:
邻近点算法(PPA)是求解非光滑优化问题的一种有效的迭代算法,对特殊结构问题的求解非常高效,但在实际问题中求解大规模可分离问题时花费很大。为解决上述问题且同时又保持PPA算法的优点,本文给出了一种非精确邻近梯度算法。该算法结合了线搜索法与邻近梯度下降算法的思想,在子问题的求解过程中采用近似的梯度,且不需要Lipschitz常数已知。基于以上思想,首先我们给出算法的伪代码,然后建立了算法收敛性的充分条件,最后证明在该条件下,算法迭代所产生序列的每个极限点是原问题的临界点。
The Proximity Algorithm (PPA) is an effective iterative algorithm for solving non-smooth optimization problems, which is very efficient in solving special structural problems, but it is expensive to solve large-scale separable problems in practical problems. In order to solve the above problems and maintain the advantages of PPA algorithm, an inexact proximity gradient algorithm is proposed. The algorithm combines the ideas of the line search method and the proximity gradient descent algorithm, and adopts the approximate gradient in the solution of the sub-problem, and does not need the Lipschitz constant to be known. Based on the above ideas, firstly, we give the pseudocode of the algorithm, then establish the sufficient conditions for the convergence of the algorithm, and finally prove that under this condition, each limit point of the sequence generated by the algorithm iteration is the critical point of the original problem.
[1] | Robbins, H. and Monro, S. (1951) A Stochastic Approximation Method. Annals of Mathematical Statistics, 22, 400-407. https://doi.org/10.1214/aoms/1177729586 |
[2] | Johnson, R. and Zhang, T. (2013) Accelerating Stochastic Gradient Descent Using Predictive Variance Reduction. Advances in Neural Information Processing Systems, 1, 315-323. |
[3] | Erdogdu, M.A. and Montanari, A. (2015) Convergence Rates of Sub-Sampled Newton Methods. International Conference on Neural Information Processing Systems, MIT Press, 28. |
[4] | Agarwal, N., Bullins, B. and Hazan, E. (2017) Second-Order Stochastic Optimization for Machine Learning in Linear Time. Journal of Machine Learning Research, 18, 1-14. |
[5] | Moreau, J.J. (1962) Fonctions convexes duales et points proximaux dans un espacehilbertien. Comptes rendushebdomadaires des séances de l’Académie des Sciences, 255, 2897-2899. |
[6] | Yuan, X. (2012) Alternating Direction Methods for Sparse Covariance Selection. Journal of Scientific Computing, 51, 261-273. https://doi.org/10.1007/s10915-011-9507-1 |
[7] | Beck, A. and Teboulle, M. (2009) Gradient-Based Algorithms with Applications to Signal Recovery. Convex Optimization in Signal Processing and Communications, 42-88. https://doi.org/10.1017/CBO9780511804458.003 |
[8] | Machart, P., Anthoine, S. and Baldassarre, L. (2012) Optimal Computational Tradeoff of Inexact Proximal Methods. |
[9] | Moreau, J.J. (1965) Proximité et dualité dans un espacehilbertien. Bulletin de la Société mathématique de France, 93, 273-299. https://doi.org/10.24033/bsmf.1625 |
[10] | Krasnoselkii, M.A. (1957) Two Observations about the Method of Successive Approximations, Uspehi Math. Nauk, 10, 131-140. |
[11] | Rockafellar, R.T. (1976) Monotone Operators and the Proximal Point Algorithm. SIAM Journal on Control and Optimization, 14, 877-898. https://doi.org/10.1137/0314056 |
[12] | Rockafellar, R.T. (1976) Augmented Lagrangians and Applications of the Proximal Point Algorithm in Convex Programming. Mathematics of Operations Research, 1, 97-116. https://doi.org/10.1287/moor.1.2.97 |
[13] | Bello Cruz, J.Y. and Nghia, T.T.A. (2016) On the Convergence of the Forward-Backward Splitting Method with Linesearches. Optimization Methods and Software, 31, 1209-1238. https://doi.org/10.1080/10556788.2016.1214959 |
[14] | Polyak, B.T. and Juditsky, A. B. (1992) Acceleration of Stochastic Approximation by Averaging. SIAM Journal on Control and Optimization, 30, 838-855. https://doi.org/10.1137/0330046 |
[15] | Bauschke, H.H. and Combettes, P.L. (2011) Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York. https://doi.org/10.1007/978-1-4419-9467-7 |
[16] | Roosta-Khorasani, F. and Mahoney, M.W. (2019) Sub-Sampled Newton Methods. Mathematical Programming, 174, 293-326. https://doi.org/10.1007/s10107-018-1346-5 |
[17] | Iusem, A.N., Svaiter, B.F. and Teboulle, M. (1994) Entropy-Like Proximal Methods in Convex Programming. Mathematics of Operations Research, 19, 790-814. https://doi.org/10.1287/moor.19.4.790 |