A Class of Stochastic Optimization Algorithms for Solving DC Problems

DOI: 10.12677/orf.2024.144403, pp. 342-357

Keywords: DC Problems, Stochastic Gradient, l1-2 Regularized Least Squares Problems


Abstract:

In this paper, we study a class of DC problems in finite-sum form, whose objective is the sum of a smooth convex function in finite-sum form and a continuous convex function, minus a proper closed convex function. The traditional proximal difference-of-convex algorithm (pDCA) is computationally expensive for such problems because every iteration requires the full gradient of the smooth part of the objective. We therefore incorporate the stochastic recursive gradient estimator SARAH into pDCA, propose a stochastic proximal DC algorithm (pDCA-SARAH), and give its explicit iterative scheme, thereby reducing the computational cost. In the nonconvex setting, we provide convergence and convergence-rate analyses for pDCA-SARAH: we bound the expected decrease of the objective function and establish a sublinear convergence rate. Finally, we apply pDCA-SARAH to the l1-2 regularized least squares problem and compare it numerically with pDCA, demonstrating the efficiency of the proposed algorithm.
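To make the described iteration concrete, the following is a minimal NumPy sketch of a pDCA-SARAH-style update applied to the l1-2 regularized least squares problem, viewed as a DC program with a finite-sum smooth part. The step size, batch size, inner-loop length, and function names here are illustrative assumptions, not the exact scheme or tuned parameters analyzed in the paper.

```python
# Sketch of a pDCA-SARAH-style iteration for
#     min_x  (1/(2n)) * ||A x - b||^2 + lam * ||x||_1 - lam * ||x||_2,
# with smooth finite-sum part g(x) = (1/(2n))||Ax - b||^2,
# convex part h(x) = lam*||x||_1 (prox = soft-thresholding),
# and closed convex part lam*||x||_2 subtracted (linearized via a subgradient).
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def subgrad_l2(x):
    """A subgradient of ||x||_2 (the zero vector is valid at x = 0)."""
    nrm = np.linalg.norm(x)
    return x / nrm if nrm > 0 else np.zeros_like(x)

def pdca_sarah(A, b, lam=0.1, step=None, outer_iters=20, inner_iters=50,
               batch=10, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    if step is None:
        # 1/L, with L an estimate of the Lipschitz constant of grad g.
        step = n / (np.linalg.norm(A, 2) ** 2)
    x = np.zeros(d)
    for _ in range(outer_iters):
        # Linearize the subtracted convex term lam*||x||_2 at the current point.
        xi = lam * subgrad_l2(x)
        # SARAH outer step: start from the full gradient of g.
        v = A.T @ (A @ x - b) / n
        x_prev = x.copy()
        x = soft_threshold(x - step * (v - xi), step * lam)
        for _ in range(inner_iters):
            idx = rng.integers(0, n, size=batch)
            Ai = A[idx]
            # Recursive SARAH update of the gradient estimator.
            v = v + Ai.T @ (Ai @ x - Ai @ x_prev) / batch
            x_prev = x.copy()
            x = soft_threshold(x - step * (v - xi), step * lam)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 50))
    x_true = np.zeros(50)
    x_true[:5] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(200)
    x_hat = pdca_sarah(A, b, lam=0.05)
    print("nonzeros recovered:", np.count_nonzero(np.abs(x_hat) > 1e-3))
```

In this sketch the subgradient of the subtracted term is frozen over each outer loop and only the smooth finite-sum gradient is estimated stochastically, which is what allows the per-iteration cost to drop from a full gradient to a mini-batch.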

