Research on the Theory and Application of Bayesian Nonparametric Methods in Big Data

DOI: 10.12677/SA.2023.122030, PP. 283-292

Keywords: Bayesian Nonparametrics, Big Data, Machine Learning, Dirichlet Process, Posterior Inference


Abstract:

In an era of rapid progress in artificial intelligence, machine learning occupies an important position. Machine learning is, at its core, the analysis of and learning from massive data, and so it depends on building statistical models and performing inference on them. Bayesian methods, a principal and mature approach to statistical modelling, combine the information in the sample with prior information about the parameters; this accounts for parameter uncertainty and makes model inference more reasonable. Nonparametric methods within the Bayesian framework extend this uncertainty further: the prior is placed on a space of distributions rather than on a finite-dimensional parameter space and is represented by a stochastic process, so the prior space is infinite-dimensional. Bayesian nonparametric modelling has attracted wide attention for its flexibility and robustness, and, as artificial intelligence has developed, researchers in machine learning have studied these methods in depth and obtained many notable results. This paper reviews part of the basic theory of Bayesian nonparametrics and examines its practical applications in the big-data setting, together with prospects for future work.
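
As a concrete illustration of a prior placed on a space of distributions, the sketch below draws one random discrete distribution from a Dirichlet process using a truncated stick-breaking construction. It is not taken from the paper. The function names, the truncation level `num_atoms`, the concentration value `alpha=2.0`, and the standard normal base distribution are all assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking weights for DP(alpha, G0).

    Returns the first `num_atoms` weights and the leftover stick length,
    i.e. the total mass of all atoms beyond the truncation.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any atom
    for _ in range(num_atoms):
        beta = rng.betavariate(1.0, alpha)  # fraction broken off the remainder
        weights.append(remaining * beta)
        remaining *= 1.0 - beta
    return weights, remaining


def sample_dp(alpha, num_atoms, base_sampler, rng):
    """Approximate draw G ~ DP(alpha, G0) as atom locations plus weights."""
    weights, leftover = stick_breaking_weights(alpha, num_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(num_atoms)]  # i.i.d. from G0
    return atoms, weights, leftover


# Hypothetical settings: alpha = 2.0, G0 = N(0, 1), truncation at 50 atoms.
rng = random.Random(0)
atoms, weights, leftover = sample_dp(
    alpha=2.0,
    num_atoms=50,
    base_sampler=lambda r: r.gauss(0.0, 1.0),
    rng=rng,
)
print(f"mass captured by 50 atoms: {sum(weights):.6f}, leftover: {leftover:.2e}")
```

Each call to `sample_dp` returns a different discrete distribution, so the random object is the distribution itself rather than a fixed-length parameter vector. The exact construction has infinitely many atoms; the truncation here only makes it computable, and the leftover mass shows how much is discarded. A smaller `alpha` concentrates the mass on fewer atoms.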

