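Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2013, pp. 695-700

Top-k Query Computation over Uncertain Data in the MapReduce Framework

Lu Xin, Chen Huahui, Dong Yihong, Qian Jiangbo (卢鑫, 陈华辉, 董一鸿, 钱江波)

Keywords: uncertain data, Top-k query, MapReduce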


Abstract:

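Top-k queries are a widely used technique in uncertain data management. The Top-k query semantics based on parameterized ranking functions unifies the various query semantics proposed in recent years. For massive uncertain data, this paper proposes an efficient method for Top-k computation based on the MapReduce framework. By analyzing the Top-k query semantics for uncertain data under parameterized ranking functions, the authors design an algorithm that obtains an upper bound on the ranking function values of the tuples not yet evaluated. This avoids computing the ranking function value of every tuple and solves the pruning problem in Top-k computation. Two different strategies are proposed for implementing the algorithm in the MapReduce computing model. Two sets of comparative experiments are conducted, one in a single-machine environment and one on the Hadoop distributed computing platform. The experiments show that, when processing massive uncertain data, the algorithm achieves a considerable reduction in computation time.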
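The pruning idea in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' algorithm, and it does not reproduce either of their MapReduce strategies, which the abstract does not describe. It assumes a tuple-independent model in which each tuple has a score and an existence probability, and a weight function over ranks that is non-negative and non-increasing. Under those assumptions, the expected weight seen by the current tuple is an upper bound on the ranking function value of every tuple with a lower score, so the scan can stop early. The names `topk_with_pruning`, `Record`, and `weight` are hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical record layout: (tuple id, score, existence probability).
Record = Tuple[str, float, float]


def topk_with_pruning(
    records: List[Record],
    k: int,
    weight: Callable[[int], float],
) -> Tuple[List[Tuple[str, float]], int]:
    """Return (top-k tuples by ranking function value, tuples evaluated).

    Assumes independent tuples and a weight(rank) that is non-negative
    and non-increasing in rank (ranks start at 1).
    """
    ordered = sorted(records, key=lambda r: r[1], reverse=True)
    # dist[c] = Pr(exactly c of the tuples scanned so far exist)
    dist = [1.0]
    best: List[Tuple[str, float]] = []
    evaluated = 0
    for tid, _score, prob in ordered:
        # Expected weight of the rank this tuple takes if it exists.
        expected_w = sum(pc * weight(c + 1) for c, pc in enumerate(dist))
        # Because weight is non-increasing, expected_w also bounds the
        # value of every later (lower-scored) tuple, so we can stop.
        if len(best) == k and best[-1][1] >= expected_w:
            break
        evaluated += 1
        best.append((tid, prob * expected_w))
        best.sort(key=lambda item: item[1], reverse=True)
        del best[k:]
        # Fold this tuple's existence into the count distribution.
        updated = [0.0] * (len(dist) + 1)
        for c, pc in enumerate(dist):
            updated[c] += pc * (1.0 - prob)
            updated[c + 1] += pc * prob
        dist = updated
    return best, evaluated
```

For example, calling `topk_with_pruning(records, 2, lambda rank: 1.0 if rank <= 2 else 0.0)` ranks tuples by their probability of appearing among the first two positions. The second return value shows how many tuples were evaluated before the bound allowed the scan to stop.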
