A Bi-Path Online Topic Evolution Model Based on LDA

DOI: 10.3724/SP.J.1004.2014.02877, PP. 2877-2886

Keywords: timeliness, intensity inheritance, Gibbs sampling, LDA model


Abstract:

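Online public-opinion analysis must process large volumes of highly time-sensitive text streams. For such online text streams, this paper proposes a bi-path evolution online topic model based on LDA (latent Dirichlet allocation), called BPE-OLDA (bi-path evolution online-LDA). When generating the text of the next time slice, the model takes into account both content inheritance and intensity inheritance, which closely mimics how people produce highly time-sensitive text. The Gibbs sampling algorithm is simplified for parameter estimation. Experiments show that, with the simplified online Gibbs resampling algorithm, BPE-OLDA is clearly effective at extracting topics from highly time-sensitive text streams.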
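The abstract gives no formulas, so the following is only a rough sketch of the general idea, not the BPE-OLDA model itself. It runs a standard collapsed Gibbs sampler for LDA on each time slice (not the simplified online resampling algorithm the paper describes) and then passes two priors on to the next slice: a topic-word prior built from the previous topic-word counts (standing in for content inheritance) and a topic prior built from the previous topic sizes (standing in for intensity inheritance). The function names, the weights `content_weight` and `intensity_weight`, and the inheritance rule are all assumptions made for illustration.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha, beta, n_iters=30, seed=0):
    """Standard collapsed Gibbs sampling for LDA on one time slice.

    docs  : list of documents, each a list of integer word ids
    alpha : list of n_topics document-topic prior values (may be asymmetric)
    beta  : n_topics x vocab_size topic-word prior values
    Returns (topic-word counts, per-topic token totals).
    """
    rng = random.Random(seed)
    n_kw = [[0] * vocab_size for _ in range(n_topics)]
    n_k = [0] * n_topics
    n_dk = [[0] * n_topics for _ in docs]
    beta_sum = [sum(row) for row in beta]

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_kw[k][w] += 1
            n_k[k] += 1
            n_dk[d][k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_kw[k][w] -= 1
                n_k[k] -= 1
                n_dk[d][k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha[t])
                    * (n_kw[t][w] + beta[t][w])
                    / (n_k[t] + beta_sum[t])
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_kw[k][w] += 1
                n_k[k] += 1
                n_dk[d][k] += 1
    return n_kw, n_k


def inherit_priors(n_kw, n_k, base_alpha, base_beta,
                   content_weight=0.5, intensity_weight=0.5):
    """Hypothetical inheritance rule (an assumption, not from the paper)."""
    total = sum(n_k) or 1
    # Intensity path: topics that were larger get a larger topic prior.
    alpha = [base_alpha + intensity_weight * c / total for c in n_k]
    # Content path: each topic's word distribution seeds its next prior.
    beta = [
        [base_beta + content_weight * c / max(n_k[k], 1) for c in row]
        for k, row in enumerate(n_kw)
    ]
    return alpha, beta


def run_stream(slices, n_topics, vocab_size, base_alpha=0.1, base_beta=0.01):
    """Fit each time slice in order, carrying priors from slice to slice."""
    alpha = [base_alpha] * n_topics
    beta = [[base_beta] * vocab_size for _ in range(n_topics)]
    results = []
    for t, docs in enumerate(slices):
        n_kw, n_k = gibbs_lda(docs, n_topics, vocab_size, alpha, beta, seed=t)
        results.append((n_kw, n_k))
        alpha, beta = inherit_priors(n_kw, n_k, base_alpha, base_beta)
    return results
```

Calling `run_stream(slices, n_topics=2, vocab_size=4)` on a list of time slices returns one pair of count tables per slice. Keeping the two priors separate lets the word content of a topic and its overall strength be carried forward with independent weights, which is the intuition behind treating them as two paths.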

