OALib Journal期刊
ISSN: 2333-9721
费用：99美元

投递稿件

查看量	下载量

相关文章
更多...

自动化学报 2012

概率图模型中的变分近似推理方法

DOI: 10.3724/SP.J.1004.2012.01721, PP. 1721-1734

程强, 陈峰, 董建武, 徐文立

Keywords: 概率图模型,贝叶斯网,马尔科夫随机场,近似推理,变分法,对偶分解

Full-Text Cite this paper Add to My Lib

Abstract:

？概率图模型将图论和概率论相结合,为多个变量之间复杂依赖关系的表示提供了统一的框架,在计算机视觉、自然语言处理和计算生物学等领域有着广泛的应用.概率推理(包括计算边缘概率和计算最大概率状态等问题)是概率图模型研究及应用的核心问题.本文主要介绍概率图模型近似推理方法中变分推理的最新研究成果.在变分近似推理的框架下,系统地归纳了概率图模型推理问题的基本研究思路,综述了目前主要的近似推理方法,并分析了近似算法的单调性、收敛性和全局性等性质.最后,对概率图模型近似推理方法的研究方向和应用前景作了展望.

References

[1]	Koller D, Friedman N. Probabilistic Graphical Models Principles and Techniques. Cambridge, MA: MIT Press, 2009
[2]	Chu Yi-Ping, Zhang Yin, Ye Xiu-Zi, Zhang San-Yuan. Adaptive video segmentation algorithm using hidden conditional random fields. Acta Automatica Sinica, 2007, 33(12): 1252-1258 (褚一平, 张引, 叶修梓, 张三元. 基于隐条件随机场的自适应视频分割算法. 自动化学报, 2007, 33(12): 1252-1258)
[3]	Freeman W T, Pasztor E C, Carmichael O T. Learning low-level vision. International Journal of Computer Vision, 2000, 40(1): 25-47
[4]	Komodakis N, Tziritas G. Image completion using global optimization. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2006. 442-452
[5]	Rush A M, Sontag D, Collins M, Jaakkola T. On dual decomposition and linear programming relaxations for natural language processing. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Massachusetts, USA: Association for Computational Linguistics, 2010. 1-11
[6]	Kalman R E. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 1960, 82(1): 35-45
[7]	McAuliffe J D, Pachter L, Jordan M I. Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. Bioinformatics, 2004, 20(12): 1850-1860
[8]	Yanover C, Meltzer T, Weiss Y. Linear programming relaxations and belief propagation—— an empirical study. Journal of Machine Learning Research, 2006, 7: 1887-1907
[9]	Jordan M I, Ghahramani Z, Jaakkola T, Saul L K. An introduction to variational methods for graphical models. Machine Learning, 1999, 37(2): 183-233
[10]	Pearl J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo: Morgan Kaufmann, 1988
[11]	Aji S M, McEliece R J. The generalized distributive law. IEEE Transactions on Information Theory, 2000, 46(2): 325-343
[12]	Park J D, Darwiche A. Complexity results and approximation strategies for MAP explanations. Journal of Artificial Intelligence Research, 2004, 21(1): 101-133
[13]	Huang J B, Chavira M, Darwiche A. Solving MAP exactly by searching on complied arithmetic circuits. In: Proceedings of the 21st National Conference on Artificial Intelligence. Menlo Park, USA: AAAI, 2006. 1143-1148
[14]	Liu Q, Ihler A. Variational algorithms for marginal MAP. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence. Barcelona, Spain: AUAI, 2011. 453-462
[15]	Cheng Q, Chen F, Dong J W, Xu W L. Approximating the sum operation for marginal-MAP inference. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence. Toronto, Canada, 2012, 1882-1887
[16]	Schraudolph N N, Kamenetsky D. Efficient exact inference in planar Ising models. In: Proceedings of the Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2009. 1417-1424
[17]	Schraudolph N N. Polynomial-time exact inference in NP-hard binary MRFs via reweighted perfect matching. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. Sardinia, Italy: JMLR W & CP, 2010. 717-724
[18]	Wainwright M J, Jaakkola T S, Willsky A S. MAP estimation via agreement on trees: message-passing and linear programming. IEEE Transactions on Information Theory, 2005, 51(11): 3697-3717
[19]	Cover T M, Thomas J A. Elements of Information Theory. New York: Wiley-Interscience, 2006
[20]	Cheng Q, Chen F, Xu W L, Wang S. Recursive sum-product algorithm for generalized outer-planar graphs. Information Processing Letters, 2012, 112(11): 449-456
[21]	Kolmogorov V, Zabih R. What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 147-159
[22]	Globerson A, Jaakkola T. Approximate inference using planar graph decomposition. In: Proceedings of the 2007 Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2007. 473-480
[23]	Van Engelen R A. Approximating Bayesian belief networks by arc removal. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(8): 916-920
[24]	D'Ambrosio B. Incremental probabilistic inference. In: Proceedings of the 9th Conference on Uncertainty in Artificial Intelligence. Washington, USA: AUAI, 1993. 301-308
[25]	Poole D. Average-case analysis of a search algorithm for estimating prior and posterior probabilities in Bayesian networks with extreme probabilities. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence. Chambery, France: AAAI, 1993. 606-612
[26]	Wainwright M J, Jaakkola T S, Willsky A S. A new class of upper bounds on the log partition function. IEEE Transactions on Information Theory, 2005, 51(7): 2313-2335
[27]	Heskes T. Convexity arguments for efficient minimization of the Bethe and Kikuchi free energies. Journal of Artificial Intelligence Research, 2006, 26(1): 153-190
[28]	Sontag D, Globerson A, Jaakkola T. Introduction to dual decomposition for inference. Optimization for Machine Learning. Cambridge, USA: MIT Press, 2011. 1-37
[29]	Xing E P, Jordan M I, Russell S. A generalized mean field algorithm for variational inference in exponential families. In: Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence. Acapulco, Mexico: AUAI, 2003. 538-591
[30]	Wiegerinck W. Variational approximations between mean field theory and the junction tree algorithm. In: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence. Stanford, USA: AUAI, 2000. 626-633
[31]	Saul L K, Jordan M I. Exploiting tractable substructures in intractable networks. In: Proceedings of the 1997 Conference on Neural Information Processing Systems. Denver, USA: MIT Press, 1997. 486-492
[32]	Leisink M A R, Kappen H J. Learning in higher order Boltzmann machines using linear response. Neural Networks, 2000, 13(3): 329-335
[33]	Minka T P. A Family of Algorithms for Approximate Bayesian Inference [Ph.D. dissertation], Massachusetts Institute of Technology, Cambridge, MA, USA, 2001
[34]	Pakzad P, Anantharam V. Estimation and marginalization using the Kikuchi approximation methods. Neural Computation, 2005, 17(8): 1836-1873
[35]	Chen Feng, Liu Hong, Xu Wen-Li. Recursive belief propagation—— belief propagation in a well-order. Acta Automatica Sinica, 2010, 36(8): 1091-1098 (陈峰, 刘红, 徐文立. 递推信度传播算法——按良序的信度传播. 自动化学报, 2010, 36(8): 1091-1098)
[36]	Globerson A, Jaakkola T. Fixing max-product: convergent message passing algorithms for MAP LP-relaxations. In: Proceedings of the 2008 Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2008. 553-560
[37]	Hazan T, Shashua A. Convergent message-passing algorithms for inference over general graphs with convex free energies. In: Proceedings of the 2008 Conference on Uncertainty in Artificial Intelligence. Helsinki, Finland: AUAI, 2008. 264-273
[38]	Shimony S E. Finding MAPs for belief networks is NP-hard. Journal of Artificial Intelligence, 1994, 68(2): 399-410
[39]	Alizadeh F. Interior point methods in semidefinite programming with applications to combinatorial optimization. Journal on Optimization, 1995, 5(1): 13-51
[40]	Ravikumar P, Lafferty J. Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. In: Proceedings of the 23rd International Conference on Machine Learning. New York, USA: ACM, 2006. 737-744
[41]	Kumar A, Zilberstein S, Toussaint M. Message-passing algorithms for MAP estimation using DC programming. In: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics. Canaries, Spanish: JMLR W & CP, 2012. 656-664
[42]	Werner T. High-arity interactions, polyhedral relaxations, and cutting plane algorithm for soft constraint optimisation (MAP-MRF). In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, USA: IEEE, 2008. 1-8
[43]	Jancsary J, Matz G, Trost H. An incremental subgradient algorithm for approximate MAP estimation in graphical models. In: Proceedings of the 2010 Conference on Neural Information Processing Systems. Vancouver, Canada, 2010
[44]	Sontag D, Jaakkola T. Tree block coordinate descent for MAP in graphical models. In: Proceedings of the 2009 International Conference on Artificial Intelligence and Statistics. Florida, USA: JMLR W & CP, 2009. 544-551
[45]	Osokin A, Vetrov D, Kolmogorov V. Submodular decomposition framework for inference in associative Markov networks with global constraints. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEE, 2011. 1889-1896
[46]	Yarkony J, Fowlkes C, Ihler A. Covering trees and lowerbounds on quadratic assignment. In: Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 887-894
[47]	Sontag D, Meltzer T, Globerson A, Jaakkola T, Weiss Y. Tightening LP relaxations for MAP using message passing. In: Proceedings of the 2008 Conference on Uncertainty in Artificial Intelligence. Helsinki, Finland: AUAI, 2008. 503-510
[48]	Geoffrion A. Lagrangean relaxation for integer programming. Approaches to Integer Programming, 1974, 2: 82-114
[49]	Johnson J. Convex Relaxation Methods for Graphical Models: Lagrangian and Maximum Entropy Approaches [Ph.D. dissertation], Massachusetts Institute of Technology, Cambridge, MA, USA, 2008
[50]	Bertsekas D. Auction algorithms for network flow problems: a tutorial introduction. Computational Optimization and Applications, 1992, 1(1): 7-66
[51]	Gimpel K, Smith N A. Softmax-margin CRFs: training log-linear models with cost functions. In: Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics-Human Language Technologies. Los Angeles, USA: Association for Computational Linguistics, 2010. 733-736
[52]	Jojic V, Gould S, Koller D. Accelerated dual decomposition for MAP inference. In: Proceedings of the 2010 International Conference on Machine Learning. Haifa, Israel: ACM, 2010. 503-510
[53]	Martins A L, Figueiredo M A T, Aguiar P M Q, Smith N A, Xing E P. An augmented Lagrangian approach to constrained MAP inference. In: Proceedings of the 28th International Conference on Machine Learning. Bellevue, USA: ACM, 2011. 1-8
[54]	Rossi F, Van Beek P, Walsh T. Handbook of Constraint Programming. Amsterdam: Elsevier, 2006
[55]	Marinescu R, Dechter R. AND/OR branch-and-bound search for combinatorial optimization in graphical models. Artificial Intelligence, 2009, 173(16-17): 1457-1491
[56]	Meltzer T, Yanover C, Weiss Y. Globally optimal solutions for energy minimization in stereo vision using reweighted belief propagation. In: Proceedings of the 10th IEEE International Conference on Computer Vision. Beijing, China: IEEE, 2005. 428-435
[57]	Wainwright M J, Jordan M. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 2008, 1(1-2): 1-305
[58]	Tappen M F, Freeman W T. Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: Proceedings of the 9th IEEE International Conference on Computer Vision. Beijing, China: IEEE, 2003. 900-906
[59]	Roth S, Black M J. Fields of experts: a framework for learning image priors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Diego, USA: IEEE, 2005. 860-867
[60]	Kwatra V, Sch？dl A, Essa I, Turk G, Bobick A. Graphcut textures: image and video synthesis using graph cuts. ACM Transactions on Graphics, 2003, 22(3): 277-286
[61]	Huang X D, Acero A, Hon H W. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. New Jersey: Prentice Hall PTR, 2001
[62]	McEliece R J, MacKay D J C, Cheng J F. Turbo decoding as an instance of Pearl's "belief propagation" algorithm. IEEE Journal on Selected Areas in Communications, 1998, 16(2): 140-152
[63]	Gallager R G. Low density parity check codes. IEEE Transactions on Information Theory, 1962, 8(1): 21-28
[64]	Kschischang F R, Frey B J, Loeliger H A. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 2001, 47(2): 498-519
[65]	Siepel A, Haussler D. Combining phylogenetic and hidden Markov models in biosequence analysis. Journal of Computational Biology, 2004, 11(2-3): 413-428
[66]	Yanover C, Schueler-Furman O, Weiss Y. Minimizing and learning energy functions for side-chain prediction. Journal of Computational Biology, 2008, 15(7): 899-911
[67]	Gilks W R, Richardson S, Spiegelhalter D J. Markov Chain Monte Carlo in Practice. London: Chapman and Hall, 1996
[68]	Yedidia J S, Freeman W T, Weiss Y. Constructing free energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory, 2005, 51(7): 2282-2312
[69]	Doucet A, Godsill S J, Robert C P. Marginal maximum a posteriori estimation using Markov chain Monte Carlo. Statistics and Computing, 2002, 12: 77-84
[70]	Jiang J R, Rai P, Hal D. Message-passing for approximate MAP inference with latent variables. In: Proceedings of the Conference on Neural Information Processing Systems. Granada, Spain: MIT Press, 2011. 1197-1205
[71]	Weiss Y. Correctness of local probability propagation in graphical models with loops. Neural Computation, 2000, 12(1): 1-41
[72]	Kolmogorov V, Wainwright M. On the optimality of tree-reweighted max-product message passing. In: Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence. Edinburgh, Scotland: AUAI, 2005. 316-323
[73]	Cowell R G, Dawid P, Lauritzen S L, Spiegelhalter D J. Probabilistic Networks and Expert Systems. Berlin: Springer, 1999
[74]	Wainwright M J, Jaakkola T S, Willsky A S. Tree-based reparameterization framework for analysis of sum-product and related algorithms. IEEE Transactions on Information Theory, 2003, 49(5): 1120-1146
[75]	Aji S M, McEliece R J. The generalized distributive law and free energy minimization. In: Proceedings of the 39th Allerton Conference on Communication, Control, and Computing. Monticello, USA: IEEE, 2001. 672-681
[76]	Kolmogorov V. Convergent Tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(10): 1568-1583
[77]	Wiegerinck W, Heskes T. Fractional belief propagation. In: Proceedings of the 2003 Conference on Neural Information Processing Systems. Whistler, Canada: MIT Press, 2003. 438-445
[78]	Heskes T. On the uniqueness of loopy belief propagation fixed points. Neural Computation, 2004, 16(11): 2379-2413
[79]	Meshi O, Jaimovich A, Globerson A, Friedman N. Convexifying the Bethe free energy. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. Montreal, Canada: AUAI, 2009. 402-410
[80]	Komodakis N, Paragios N, Tziritas G. MRF energy minimization and beyond via dual decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(3): 531-552
[81]	Opper M, Saad D. Advanced Mean Field Methods: Theory and Practice. Cambridge, USA: MIT Press, 2001
[82]	Plefka T. Convergence condition of the TAP equation for the infinite-ranged Ising spin glass model. Journal of Physics A: Mathematical and General, 1982, 15(6): 1971-1978
[83]	Minka T P. Expectation propagation for approximate Bayesian inference. In: Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence. Seattle, USA: AUAI, 2001. 362-369
[84]	Minka T, Qi Y. Tree-structured approximations by expectation propagation. In: Proceedings of the 2003 Conference on Neural Information Processing Systems. Whistler, Canada: MIT Press, 2003. 1-8
[85]	Pakzad P, Anantharam V. Belief propagation and statistical physics. In: Proceedings of the 2002 Conference on Information Sciences and Systems. Princeton, USA: IEEE, 2002. 1-3
[86]	Tatikonda S C, Jordan M I. Loopy belief propagation and Gibbs measures. In: Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence. Alberta, Canada: AUAI, 2002. 493-500
[87]	Ihler A T, Fisher J W, Willsky A S, Chickering M. Loopy belief propagation: convergence and effects of message errors. Journal of Machine Learning Research, 2006, 6(1): 905-936
[88]	Meltzer T, Globerson A, Weiss Y. Convergent message passing algorithms: a unifying view. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. Montreal, Canada: AUAI, 2009. 393-401
[89]	Werner T. A linear programming approach to max-sum problem: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(7): 1165-1179
[90]	Yuille A L. CCCP algorithms to minimize the Bethe and Kikuchi free energies: convergent alternatives to belief propagation. Neural Computation, 2002, 14(7): 1691-1722
[91]	Pretti M, Pelizzola A. Stable propagation algorithm for the minimization of the Bethe free energy. Journal of Physics A: Mathematical and General, 2003, 36: 11201-11211
[92]	Sontag D A. Approximate Inference in Graphical Models Using LP Relaxations [Ph.D. dissertation], Massachusetts Institute of Technology, Cambridge, MA, USA, 2010
[93]	Mitchell J E. Cutting plane methods and subgradient methods. Tutorials in Operations Research, 2005. 34-61
[94]	Bland R G, Goldfarb D, Todd M J. The ellipsoid method: a survey. Operations Research, 1981, 29(6): 1039-1091
[95]	van de Ven J, Fabio R. Distributed anytime MAP inference. In: Proceedings of the 2011 Conference on Uncertainty in Artificial Intelligence. Barcelona, Spain: AUAI, 2011. 708-716
[96]	Peng J, Hazan T, Srebro N, Xu J B. Approximate inference by intersecting semidefinite bound and local polytope. In: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics. Canaries, Spanish: JMLR W & CP, 2012. 868-876
[97]	Weiss Y, Yanover C, Meltzer T. MAP estimation, linear programming and belief propagation with convex free energies. In: Proceedings of the 2007 Conference on Uncertainty in Artificial Intelligence. Vancouver, Canada: AUAI, 2007. 416-425
[98]	Batra D, Gallagher A C, Parikh D, Chen T. Beyond trees: MRF inference via outer-planar decomposition. In: Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 2496-2503
[99]	Sontag D, Jaakkola T. New outer bounds on the marginal polytope. In: Proceedings of the 2007 Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2007. 1393-1400
[100]	Komodakis N, Paragios N. Beyond pairwise energies: Efficient optimization for higher-order MRFs. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE, 2009. 2985-2992
[101]	Hazan T, Shashua A. Norm-product belief propagation: primal-dual message-passing for approximate inference. IEEE Transactions on Information Theory, 2010, 56(12): 6294-6316
[102]	Johnson J K, Malioutov D M, Willsky A S. Lagrangian relaxation for map estimation in graphical models. In: Proceedings of the 2007 Annual Allerton Conference on Communication, Control, and Computing. Illinois, USA: IEEE, 2007. 1-10
[103]	Savchynskyy B, Kappes J, Schmidt S, Schn？rr C. A study of Nesterov's scheme for Lagrangian decomposition and MAP labeling. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEEE, 2011. 1817-1823
[104]	Ravikumar P, Agarwal A, Wainwright M J. Message-passing for graph-structured linear programs: proximal projections, convergence and rounding schemes. In: Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland: ACM, 2008. 800-807
[105]	Meshi O, Globerson A. An alternating direction method for dual MAP LP relaxation. In: Proceedings of the 2011 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Athens, Greece: Springer, 2011. 470-483
[106]	Canzar S, Toussaint N C, Klau G W. An exact algorithm for side-chain placement in protein design. Optimization Letters, 2011, 5(3): 393-406
[107]	Sun M, Telaprolu M, Lee H, Savarese S. Efficient and exact MAP-MRF Inference using branch and bound. In: Proceedings of the 2012 International Conference on Artificial Intelligence and Statistics. Canaries, Spanish: JMLR W & CP, 2012. 1134-1142
[108]	Goldberger J. Solving Sudoku Using Combined Message Passing Algorithms, Technical ReportT R-BIU-ENG-2007-05-03, Bar-Ilan University, Israel, 2007

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133