Acta Electronica Sinica (电子学报), 2015

流水行云: A Scalable Parallel Distributed Stream Processing System

DOI: 10.3969/j.issn.0372-2112.2015.04.003, PP. 639-646

Keywords: stream processing system, scalability, stateful operator, load balancing, reconfiguration


Abstract:

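Data stream processing systems, whether centralized or distributed, must overcome single-point bottlenecks. In addition, a statically configured stream processing system can end up with either too few or too many processing nodes. To address both problems, this paper proposes 流水行云, a scalable parallel distributed data stream processing system. The system uses stateful operators as boundaries to partition the query topology into subqueries that are processed in parallel. Distributors and collectors attached to the stateful operators preserve the order of the data stream while minimizing the communication overhead of parallel processing. Scalability techniques that combine load balancing with reconfiguration allow the system to adjust both the load on each processing node and the number of nodes dynamically, according to the input load. Experiments on a 60-node cluster demonstrate the system's scalability.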
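The abstract does not say how the distributors and collectors preserve order. The following is a minimal sketch of one common approach, not the paper's implementation. It assumes the distributor tags each tuple with a sequence number and partitions by key hash, and that the collector buffers results until the next expected sequence number arrives. The names `Distributor`, `Collector`, `CountWorker`, and `run` are hypothetical.

```python
import heapq
import zlib
from collections import defaultdict


class Distributor:
    """Hypothetical distributor: tags tuples with a sequence number
    and routes them to a worker by key hash."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self.next_seq = 0

    def route(self, key):
        # A stable hash sends the same key to the same worker every time,
        # so the stateful operator's per-key state stays on one node.
        worker_id = zlib.crc32(key.encode("utf-8")) % self.num_workers
        seq = self.next_seq
        self.next_seq += 1
        return worker_id, seq


class CountWorker:
    """Hypothetical stateful operator: a running count per key."""

    def __init__(self):
        self.counts = defaultdict(int)

    def process(self, seq, key):
        self.counts[key] += 1
        return seq, (key, self.counts[key])


class Collector:
    """Hypothetical collector: releases results strictly in sequence order."""

    def __init__(self):
        self.pending = []  # min-heap of (seq, result)
        self.next_seq = 0

    def accept(self, seq, result):
        heapq.heappush(self.pending, (seq, result))
        released = []
        # Emit only when the next expected sequence number is at the top.
        while self.pending and self.pending[0][0] == self.next_seq:
            released.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return released


def run(keys, num_workers=3):
    """Distribute keys, process them per worker, and collect in order."""
    distributor = Distributor(num_workers)
    workers = [CountWorker() for _ in range(num_workers)]
    collector = Collector()
    queues = [[] for _ in range(num_workers)]
    for key in keys:
        worker_id, seq = distributor.route(key)
        queues[worker_id].append((seq, key))
    output = []
    # Drain workers in reverse order to simulate out-of-order arrival.
    for worker_id in reversed(range(num_workers)):
        for seq, key in queues[worker_id]:
            output.extend(collector.accept(*workers[worker_id].process(seq, key)))
    return output
```

Calling `run(["a", "b", "a", "c", "b"])` returns the running counts in the original input order, whichever worker finishes first. In this sketch only the sequence number travels with each tuple, so the workers never need to talk to each other.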

