
Convergence Analysis of a Class of Nonconvex Functions with Even-Powered Penalty Norms and the Periodic ADMM Algorithm

DOI: 10.12677/aam.2024.136252, PP. 2641-2652

Keywords: Machine Learning, Nonconvex Function, Lp Norm, Alternating Direction Method of Multipliers


Abstract:

In machine learning and related fields, existing algorithms for nonconvex optimization problems cannot, in theory, effectively guarantee convergence or global stability. This paper introduces the Lp norm (with p even) into the nonconvex objective and, on that basis, designs a Periodic Alternating Direction Method of Multipliers (PADMM) for the convergence analysis of such nonconvex functions. We prove that when the penalty parameter is sufficiently large, a nonconvex function with an even-powered penalty norm is guaranteed to converge, and it converges to the global minimum. In addition, PADMM imposes no special requirement on the order in which the variables are updated, a property that greatly broadens its applicability to a wide range of nonconvex optimization problems.
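As a rough, self-contained illustration of the setting described in the abstract (not the authors' formulation), recall that ADMM splits a problem of the form "minimize f(x) + g(z) subject to Ax + Bz = c" and alternately minimizes the augmented Lagrangian L_rho(x, z, lambda) = f(x) + g(z) + lambda^T(Ax + Bz - c) + (rho/2)*||Ax + Bz - c||^2 over x and z, followed by a dual update. The Python sketch below applies this generic template to a toy least-squares problem with an even-powered penalty g(z) = mu*||z||_4^4 under the constraint x = z; the problem, parameter values, and function names are illustrative assumptions, and the swapped update order is only a loose stand-in for the periodic ordering flexibility claimed for PADMM.

```python
import numpy as np

def even_penalty_admm(D, b, mu=0.1, rho=5.0, iters=200, newton_steps=20):
    """Toy ADMM sketch: minimize 0.5*||D x - b||^2 + mu*||z||_4^4  s.t.  x = z.

    Illustrative only: the even-power (p = 4) penalty and the alternating
    update order mimic the paper's setting but do not reproduce it.
    """
    n = D.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                      # scaled dual variable
    A = D.T @ D + rho * np.eye(n)        # factor for the quadratic x-subproblem
    Dtb = D.T @ b

    def update_x(z, u):
        # argmin_x 0.5*||D x - b||^2 + (rho/2)*||x - z + u||^2  (closed form)
        return np.linalg.solve(A, Dtb + rho * (z - u))

    def update_z(x, u):
        # argmin_z mu*||z||_4^4 + (rho/2)*||z - (x + u)||^2, solved
        # coordinate-wise: a few Newton steps on 4*mu*z^3 + rho*(z - v) = 0.
        v = x + u
        z = v.copy()
        for _ in range(newton_steps):
            grad = 4.0 * mu * z**3 + rho * (z - v)
            hess = 12.0 * mu * z**2 + rho
            z = z - grad / hess
        return z

    for k in range(iters):
        if k % 2 == 0:                   # swap which block is updated first on
            x = update_x(z, u)           # alternate iterations, mimicking the
            z = update_z(x, u)           # order-insensitivity claimed for PADMM
        else:
            z = update_z(x, u)
            x = update_x(z, u)
        u = u + x - z                    # dual ascent on the constraint x = z
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((30, 10))
    b = rng.standard_normal(30)
    x = even_penalty_admm(D, b)
    print("residual norm:", np.linalg.norm(D @ x - b))
```

Because the penalty power is even, the z-subproblem stays smooth and each coordinate reduces to the unique root of a strictly increasing cubic, which the few Newton steps above handle reliably.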
