OALib Journal
ISSN: 2333-9721
Adaptive Group Sparse Learning of Pairwise Markov Networks

DOI: 10.16383/j.aas.2015.c140682, PP. 1419-1437

Keywords: undirected graphical model, pairwise Markov network, sparse learning, adaptive group sparsity


Abstract:

Sparse learning significantly reduces the complexity of both parameter learning and structure learning in undirected graphical models, and thus handles their learning problems effectively. In a pairwise Markov network with multi-valued variables, each edge carries multiple parameters; this paper therefore formulates group sparse learning over the edge parameter vectors and proposes adaptive group sparsity, which adjusts the strength of the penalty according to the norm of each parameter vector. Besides comparing sparse-learning performance under different edge potentials, the paper approximates the objective function with the pseudo-likelihood, the mean-field free energy, and the Bethe free energy in order to accelerate training on complex networks. It also derives the optimal solutions of the adaptive group-sparse objective under the spectral projected gradient algorithm and the projected quasi-Newton algorithm, and compares the sparse-learning performance of the two optimization algorithms. Experiments show that adaptive group sparsity performs well.
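The adaptive group penalty described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: the function names, the weight form w_e = 1/(||θ̂_e|| + ε)^γ (a common adaptive group-lasso choice), and the group soft-thresholding step are assumptions for demonstration.

```python
import numpy as np

def adaptive_group_penalty(theta_groups, theta_init_groups,
                           lam=1.0, gamma=1.0, eps=1e-8):
    """Sum over edges e of lam * w_e * ||theta_e||_2, where each edge's
    parameter vector theta_e forms one group and the adaptive weight
    w_e = 1 / (||theta_init_e||_2 + eps)**gamma penalizes edges whose
    initial (e.g. unpenalized) estimate has a small norm more heavily."""
    total = 0.0
    for theta_e, init_e in zip(theta_groups, theta_init_groups):
        w_e = 1.0 / (np.linalg.norm(init_e) + eps) ** gamma
        total += lam * w_e * np.linalg.norm(theta_e)
    return total

def group_soft_threshold(theta_e, tau):
    """Proximal operator of tau * ||.||_2: shrinks the whole edge
    parameter vector toward zero, setting it exactly to zero when its
    norm falls below tau -- the building block a proximal/projected
    gradient solver would apply per edge."""
    norm = np.linalg.norm(theta_e)
    if norm <= tau:
        return np.zeros_like(theta_e)
    return (1.0 - tau / norm) * theta_e
```

With init estimate [1, 0] (norm 1) the weight is 1, so the penalty of edge vector [3, 4] is simply its norm 5; and thresholding [3, 4] at tau = 5 zeroes the entire group at once, which is what produces edge-level (structural) sparsity.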

