全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

概率图模型中的变分近似推理方法

DOI: 10.3724/SP.J.1004.2012.01721, PP. 1721-1734

Keywords: 概率图模型,贝叶斯网,马尔科夫随机场,近似推理,变分法,对偶分解

Full-Text   Cite this paper   Add to My Lib

Abstract:

?概率图模型将图论和概率论相结合,为多个变量之间复杂依赖关系的表示提供了统一的框架,在计算机视觉、自然语言处理和计算生物学等领域有着广泛的应用.概率推理(包括计算边缘概率和计算最大概率状态等问题)是概率图模型研究及应用的核心问题.本文主要介绍概率图模型近似推理方法中变分推理的最新研究成果.在变分近似推理的框架下,系统地归纳了概率图模型推理问题的基本研究思路,综述了目前主要的近似推理方法,并分析了近似算法的单调性、收敛性和全局性等性质.最后,对概率图模型近似推理方法的研究方向和应用前景作了展望.

References

[1]  Koller D, Friedman N. Probabilistic Graphical Models Principles and Techniques. Cambridge, MA: MIT Press, 2009
[2]  Chu Yi-Ping, Zhang Yin, Ye Xiu-Zi, Zhang San-Yuan. Adaptive video segmentation algorithm using hidden conditional random fields. Acta Automatica Sinica, 2007, 33(12): 1252-1258 (褚一平, 张引, 叶修梓, 张三元. 基于隐条件随机场的自适应视频 分割算法. 自动化学报, 2007, 33(12): 1252-1258)
[3]  Freeman W T, Pasztor E C, Carmichael O T. Learning low-level vision. International Journal of Computer Vision, 2000, 40(1): 25-47
[4]  Komodakis N, Tziritas G. Image completion using global optimization. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2006. 442-452
[5]  Rush A M, Sontag D, Collins M, Jaakkola T. On dual decomposition and linear programming relaxations for natural language processing. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Massachusetts, USA: Association for Computational Linguistics, 2010. 1-11
[6]  Kalman R E. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 1960, 82(1): 35-45
[7]  McAuliffe J D, Pachter L, Jordan M I. Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. Bioinformatics, 2004, 20(12): 1850-1860
[8]  Yanover C, Meltzer T, Weiss Y. Linear programming relaxations and belief propagation—— an empirical study. Journal of Machine Learning Research, 2006, 7: 1887-1907
[9]  Jordan M I, Ghahramani Z, Jaakkola T, Saul L K. An introduction to variational methods for graphical models. Machine Learning, 1999, 37(2): 183-233
[10]  Pearl J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo: Morgan Kaufmann, 1988
[11]  Aji S M, McEliece R J. The generalized distributive law. IEEE Transactions on Information Theory, 2000, 46(2): 325-343
[12]  Park J D, Darwiche A. Complexity results and approximation strategies for MAP explanations. Journal of Artificial Intelligence Research, 2004, 21(1): 101-133
[13]  Huang J B, Chavira M, Darwiche A. Solving MAP exactly by searching on complied arithmetic circuits. In: Proceedings of the 21st National Conference on Artificial Intelligence. Menlo Park, USA: AAAI, 2006. 1143-1148
[14]  Liu Q, Ihler A. Variational algorithms for marginal MAP. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence. Barcelona, Spain: AUAI, 2011. 453-462
[15]  Cheng Q, Chen F, Dong J W, Xu W L. Approximating the sum operation for marginal-MAP inference. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence. Toronto, Canada, 2012, 1882-1887
[16]  Schraudolph N N, Kamenetsky D. Efficient exact inference in planar Ising models. In: Proceedings of the Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2009. 1417-1424
[17]  Schraudolph N N. Polynomial-time exact inference in NP-hard binary MRFs via reweighted perfect matching. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. Sardinia, Italy: JMLR W & CP, 2010. 717-724
[18]  Wainwright M J, Jaakkola T S, Willsky A S. MAP estimation via agreement on trees: message-passing and linear programming. IEEE Transactions on Information Theory, 2005, 51(11): 3697-3717
[19]  Cover T M, Thomas J A. Elements of Information Theory. New York: Wiley-Interscience, 2006
[20]  Cheng Q, Chen F, Xu W L, Wang S. Recursive sum-product algorithm for generalized outer-planar graphs. Information Processing Letters, 2012, 112(11): 449-456
[21]  Kolmogorov V, Zabih R. What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 147-159
[22]  Globerson A, Jaakkola T. Approximate inference using planar graph decomposition. In: Proceedings of the 2007 Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2007. 473-480
[23]  Van Engelen R A. Approximating Bayesian belief networks by arc removal. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(8): 916-920
[24]  D'Ambrosio B. Incremental probabilistic inference. In: Proceedings of the 9th Conference on Uncertainty in Artificial Intelligence. Washington, USA: AUAI, 1993. 301-308
[25]  Poole D. Average-case analysis of a search algorithm for estimating prior and posterior probabilities in Bayesian networks with extreme probabilities. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence. Chambery, France: AAAI, 1993. 606-612
[26]  Wainwright M J, Jaakkola T S, Willsky A S. A new class of upper bounds on the log partition function. IEEE Transactions on Information Theory, 2005, 51(7): 2313-2335
[27]  Heskes T. Convexity arguments for efficient minimization of the Bethe and Kikuchi free energies. Journal of Artificial Intelligence Research, 2006, 26(1): 153-190
[28]  Sontag D, Globerson A, Jaakkola T. Introduction to dual decomposition for inference. Optimization for Machine Learning. Cambridge, USA: MIT Press, 2011. 1-37
[29]  Xing E P, Jordan M I, Russell S. A generalized mean field algorithm for variational inference in exponential families. In: Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence. Acapulco, Mexico: AUAI, 2003. 538-591
[30]  Wiegerinck W. Variational approximations between mean field theory and the junction tree algorithm. In: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence. Stanford, USA: AUAI, 2000. 626-633
[31]  Saul L K, Jordan M I. Exploiting tractable substructures in intractable networks. In: Proceedings of the 1997 Conference on Neural Information Processing Systems. Denver, USA: MIT Press, 1997. 486-492
[32]  Leisink M A R, Kappen H J. Learning in higher order Boltzmann machines using linear response. Neural Networks, 2000, 13(3): 329-335
[33]  Minka T P. A Family of Algorithms for Approximate Bayesian Inference [Ph.D. dissertation], Massachusetts Institute of Technology, Cambridge, MA, USA, 2001
[34]  Pakzad P, Anantharam V. Estimation and marginalization using the Kikuchi approximation methods. Neural Computation, 2005, 17(8): 1836-1873
[35]  Chen Feng, Liu Hong, Xu Wen-Li. Recursive belief propagation—— belief propagation in a well-order. Acta Automatica Sinica, 2010, 36(8): 1091-1098 (陈峰, 刘红, 徐文立. 递推信度传播算法——按良序的信度传播. 自动化学报, 2010, 36(8): 1091-1098)
[36]  Globerson A, Jaakkola T. Fixing max-product: convergent message passing algorithms for MAP LP-relaxations. In: Proceedings of the 2008 Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2008. 553-560
[37]  Hazan T, Shashua A. Convergent message-passing algorithms for inference over general graphs with convex free energies. In: Proceedings of the 2008 Conference on Uncertainty in Artificial Intelligence. Helsinki, Finland: AUAI, 2008. 264-273
[38]  Shimony S E. Finding MAPs for belief networks is NP-hard. Journal of Artificial Intelligence, 1994, 68(2): 399-410
[39]  Alizadeh F. Interior point methods in semidefinite programming with applications to combinatorial optimization. Journal on Optimization, 1995, 5(1): 13-51
[40]  Ravikumar P, Lafferty J. Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. In: Proceedings of the 23rd International Conference on Machine Learning. New York, USA: ACM, 2006. 737-744
[41]  Kumar A, Zilberstein S, Toussaint M. Message-passing algorithms for MAP estimation using DC programming. In: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics. Canaries, Spanish: JMLR W & CP, 2012. 656-664
[42]  Werner T. High-arity interactions, polyhedral relaxations, and cutting plane algorithm for soft constraint optimisation (MAP-MRF). In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, USA: IEEE, 2008. 1-8
[43]  Jancsary J, Matz G, Trost H. An incremental subgradient algorithm for approximate MAP estimation in graphical models. In: Proceedings of the 2010 Conference on Neural Information Processing Systems. Vancouver, Canada, 2010
[44]  Sontag D, Jaakkola T. Tree block coordinate descent for MAP in graphical models. In: Proceedings of the 2009 International Conference on Artificial Intelligence and Statistics. Florida, USA: JMLR W & CP, 2009. 544-551
[45]  Osokin A, Vetrov D, Kolmogorov V. Submodular decomposition framework for inference in associative Markov networks with global constraints. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEE, 2011. 1889-1896
[46]  Yarkony J, Fowlkes C, Ihler A. Covering trees and lowerbounds on quadratic assignment. In: Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 887-894
[47]  Sontag D, Meltzer T, Globerson A, Jaakkola T, Weiss Y. Tightening LP relaxations for MAP using message passing. In: Proceedings of the 2008 Conference on Uncertainty in Artificial Intelligence. Helsinki, Finland: AUAI, 2008. 503-510
[48]  Geoffrion A. Lagrangean relaxation for integer programming. Approaches to Integer Programming, 1974, 2: 82-114
[49]  Johnson J. Convex Relaxation Methods for Graphical Models: Lagrangian and Maximum Entropy Approaches [Ph.D. dissertation], Massachusetts Institute of Technology, Cambridge, MA, USA, 2008
[50]  Bertsekas D. Auction algorithms for network flow problems: a tutorial introduction. Computational Optimization and Applications, 1992, 1(1): 7-66
[51]  Gimpel K, Smith N A. Softmax-margin CRFs: training log-linear models with cost functions. In: Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics-Human Language Technologies. Los Angeles, USA: Association for Computational Linguistics, 2010. 733-736
[52]  Jojic V, Gould S, Koller D. Accelerated dual decomposition for MAP inference. In: Proceedings of the 2010 International Conference on Machine Learning. Haifa, Israel: ACM, 2010. 503-510
[53]  Martins A L, Figueiredo M A T, Aguiar P M Q, Smith N A, Xing E P. An augmented Lagrangian approach to constrained MAP inference. In: Proceedings of the 28th International Conference on Machine Learning. Bellevue, USA: ACM, 2011. 1-8
[54]  Rossi F, Van Beek P, Walsh T. Handbook of Constraint Programming. Amsterdam: Elsevier, 2006
[55]  Marinescu R, Dechter R. AND/OR branch-and-bound search for combinatorial optimization in graphical models. Artificial Intelligence, 2009, 173(16-17): 1457-1491
[56]  Meltzer T, Yanover C, Weiss Y. Globally optimal solutions for energy minimization in stereo vision using reweighted belief propagation. In: Proceedings of the 10th IEEE International Conference on Computer Vision. Beijing, China: IEEE, 2005. 428-435
[57]  Wainwright M J, Jordan M. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 2008, 1(1-2): 1-305
[58]  Tappen M F, Freeman W T. Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: Proceedings of the 9th IEEE International Conference on Computer Vision. Beijing, China: IEEE, 2003. 900-906
[59]  Roth S, Black M J. Fields of experts: a framework for learning image priors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Diego, USA: IEEE, 2005. 860-867
[60]  Kwatra V, Sch?dl A, Essa I, Turk G, Bobick A. Graphcut textures: image and video synthesis using graph cuts. ACM Transactions on Graphics, 2003, 22(3): 277-286
[61]  Huang X D, Acero A, Hon H W. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. New Jersey: Prentice Hall PTR, 2001
[62]  McEliece R J, MacKay D J C, Cheng J F. Turbo decoding as an instance of Pearl's "belief propagation" algorithm. IEEE Journal on Selected Areas in Communications, 1998, 16(2): 140-152
[63]  Gallager R G. Low density parity check codes. IEEE Transactions on Information Theory, 1962, 8(1): 21-28
[64]  Kschischang F R, Frey B J, Loeliger H A. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 2001, 47(2): 498-519
[65]  Siepel A, Haussler D. Combining phylogenetic and hidden Markov models in biosequence analysis. Journal of Computational Biology, 2004, 11(2-3): 413-428
[66]  Yanover C, Schueler-Furman O, Weiss Y. Minimizing and learning energy functions for side-chain prediction. Journal of Computational Biology, 2008, 15(7): 899-911
[67]  Gilks W R, Richardson S, Spiegelhalter D J. Markov Chain Monte Carlo in Practice. London: Chapman and Hall, 1996
[68]  Yedidia J S, Freeman W T, Weiss Y. Constructing free energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory, 2005, 51(7): 2282-2312
[69]  Doucet A, Godsill S J, Robert C P. Marginal maximum a posteriori estimation using Markov chain Monte Carlo. Statistics and Computing, 2002, 12: 77-84
[70]  Jiang J R, Rai P, Hal D. Message-passing for approximate MAP inference with latent variables. In: Proceedings of the Conference on Neural Information Processing Systems. Granada, Spain: MIT Press, 2011. 1197-1205
[71]  Weiss Y. Correctness of local probability propagation in graphical models with loops. Neural Computation, 2000, 12(1): 1-41
[72]  Kolmogorov V, Wainwright M. On the optimality of tree-reweighted max-product message passing. In: Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence. Edinburgh, Scotland: AUAI, 2005. 316-323
[73]  Cowell R G, Dawid P, Lauritzen S L, Spiegelhalter D J. Probabilistic Networks and Expert Systems. Berlin: Springer, 1999
[74]  Wainwright M J, Jaakkola T S, Willsky A S. Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Transactions on Information Theory, 2003, 49(5): 1120-1146
[75]  Aji S M, McEliece R J. The generalized distributive law and free energy minimization. In: Proceedings of the 39th Allerton Conference on Communication, Control, and Computing. Monticello, USA: IEEE, 2001. 672-681
[76]  Kolmogorov V. Convergent Tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(10): 1568-1583
[77]  Wiegerinck W, Heskes T. Fractional belief propagation. In: Proceedings of the 2003 Conference on Neural Information Processing Systems. Whistler, Canada: MIT Press, 2003. 438-445
[78]  Heskes T. On the uniqueness of loopy belief propagation fixed points. Neural Computation, 2004, 16(11): 2379-2413
[79]  Meshi O, Jaimovich A, Globerson A, Friedman N. Convexifying the Bethe free energy. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. Montreal, Canada: AUAI, 2009. 402-410
[80]  Komodakis N, Paragios N, Tziritas G. MRF energy minimization and beyond via dual decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(3): 531-552
[81]  Opper M, Saad D. Advanced Mean Field Methods: Theory and Practice. Cambridge, USA: MIT Press, 2001
[82]  Plefka T. Convergence condition of the TAP equation for the infinite-ranged Ising spin glass model. Journal of Physics A: Mathematical and General, 1982, 15(6): 1971-1978
[83]  Minka T P. Expectation propagation for approximate Bayesian inference. In: Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence. Seattle, USA: AUAI, 2001. 362-369
[84]  Minka T, Qi Y. Tree-structured approximations by expectation propagation. In: Proceedings of the 2003 Conference on Neural Information Processing Systems. Whistler, Canada: MIT Press, 2003. 1-8
[85]  Pakzad P, Anantharam V. Belief propagation and statistical physics. In: Proceedings of the 2002 Conference on Information Sciences and Systems. Princeton, USA: IEEE, 2002. 1-3
[86]  Tatikonda S C, Jordan M I. Loopy belief propagation and Gibbs measures. In: Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence. Alberta, Canada: AUAI, 2002. 493-500
[87]  Ihler A T, Fisher J W, Willsky A S, Chickering M. Loopy belief propagation: convergence and effects of message errors. Journal of Machine Learning Research, 2006, 6(1): 905-936
[88]  Meltzer T, Globerson A, Weiss Y. Convergent message passing algorithms: a unifying view. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. Montreal, Canada: AUAI, 2009. 393-401
[89]  Werner T. A linear programming approach to max-sum problem: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(7): 1165-1179
[90]  Yuille A L. CCCP algorithms to minimize the Bethe and Kikuchi free energies: convergent alternatives to belief propagation. Neural Computation, 2002, 14(7): 1691-1722
[91]  Pretti M, Pelizzola A. Stable propagation algorithm for the minimization of the Bethe free energy. Journal of Physics A: Mathematical and General, 2003, 36: 11201-11211
[92]  Sontag D A. Approximate Inference in Graphical Models Using LP Relaxations [Ph.D. dissertation], Massachusetts Institute of Technology, Cambridge, MA, USA, 2010
[93]  Mitchell J E. Cutting plane methods and subgradient methods. Tutorials in Operations Research, 2005. 34-61
[94]  Bland R G, Goldfarb D, Todd M J. The ellipsoid method: a survey. Operations Research, 1981, 29(6): 1039-1091
[95]  van de Ven J, Fabio R. Distributed anytime MAP inference. In: Proceedings of the 2011 Conference on Uncertainty in Artificial Intelligence. Barcelona, Spain: AUAI, 2011. 708-716
[96]  Peng J, Hazan T, Srebro N, Xu J B. Approximate inference by intersecting semidefinite bound and local polytope. In: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics. Canaries, Spanish: JMLR W & CP, 2012. 868-876
[97]  Weiss Y, Yanover C, Meltzer T. MAP estimation, linear programming and belief propagation with convex free energies. In: Proceedings of the 2007 Conference on Uncertainty in Artificial Intelligence. Vancouver, Canada: AUAI, 2007. 416-425
[98]  Batra D, Gallagher A C, Parikh D, Chen T. Beyond trees: MRF inference via outer-planar decomposition. In: Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 2496-2503
[99]  Sontag D, Jaakkola T. New outer bounds on the marginal polytope. In: Proceedings of the 2007 Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2007. 1393-1400
[100]  Komodakis N, Paragios N. Beyond pairwise energies: Efficient optimization for higher-order MRFs. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE, 2009. 2985-2992
[101]  Hazan T, Shashua A. Norm-product belief propagation: primal-dual message-passing for approximate inference. IEEE Transactions on Information Theory, 2010, 56(12): 6294-6316
[102]  Johnson J K, Malioutov D M, Willsky A S. Lagrangian relaxation for map estimation in graphical models. In: Proceedings of the 2007 Annual Allerton Conference on Communication, Control, and Computing. Illinois, USA: IEEE, 2007. 1-10
[103]  Savchynskyy B, Kappes J, Schmidt S, Schn?rr C. A study of Nesterov's scheme for Lagrangian decomposition and MAP labeling. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEEE, 2011. 1817-1823
[104]  Ravikumar P, Agarwal A, Wainwright M J. Message-passing for graph-structured linear programs: proximal projections, convergence and rounding schemes. In: Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland: ACM, 2008. 800-807
[105]  Meshi O, Globerson A. An alternating direction method for dual MAP LP relaxation. In: Proceedings of the 2011 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Athens, Greece: Springer, 2011. 470-483
[106]  Canzar S, Toussaint N C, Klau G W. An exact algorithm for side-chain placement in protein design. Optimization Letters, 2011, 5(3): 393-406
[107]  Sun M, Telaprolu M, Lee H, Savarese S. Efficient and exact MAP-MRF Inference using branch and bound. In: Proceedings of the 2012 International Conference on Artificial Intelligence and Statistics. Canaries, Spanish: JMLR W & CP, 2012. 1134-1142
[108]  Goldberger J. Solving Sudoku Using Combined Message Passing Algorithms, Technical ReportT R-BIU-ENG-2007-05-03, Bar-Ilan University, Israel, 2007

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133