An Efficient Implementation of the LDA Algorithm on Mahout
pp. 118-130
Keywords: Latent Dirichlet Allocation, Gibbs sampling, Mahout, distributed parallel computing, MapReduce computing framework
Abstract:
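Based on a detailed study of the Latent Dirichlet Allocation (LDA) algorithm with Gibbs sampling and of the MapReduce computing framework, we implemented distributed parallel computation of the LDA algorithm on Mahout. We examined the computational performance of this distributed parallel program in detail and discussed in depth several key issues that affect it.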