- 2015
A Performance-Feature-Based Energy Consumption Estimation Method for MapReduce Environments
Abstract:
In MapReduce systems, the diversity of workload energy-consumption characteristics makes it hard for a cost-based scheduler to match workloads to nodes, which in turn makes it hard to improve cluster energy efficiency. To address this problem, a power estimation method based on the performance features of workloads is proposed. The method estimates the power consumption of online workloads from operating-system performance events (performance monitoring counters) on each worker node during MapReduce job execution. Machine learning is used to improve estimation accuracy. First, performance features are collected while workloads run and used to build a sample set. Second, attribute reduction from rough set theory selects the performance attributes that strongly affect workload energy consumption. Third, a power estimation model based on least squares support vector machines is built on the reduced attributes. Experimental results show that the method estimates the power consumption of workloads in MapReduce systems with high accuracy: the mean relative error is 4% when a single job is running and 4.5% when multiple jobs share the cluster.
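The abstract does not give the model's formulation, kernel, or hyperparameters, so the following is only a minimal sketch of the final step, assuming the standard least squares SVM regression formulation with an RBF kernel. The synthetic features (stand-ins for reduced performance attributes such as CPU, disk, and network activity), the linear "true" power function, and the values of `gamma` and `sigma` are all illustrative assumptions, not the authors' data or settings. The sketch also shows how a mean relative error like the one reported above can be computed.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve the LS-SVM linear system for the bias b and the weights alpha.

    [ 0   1^T         ] [ b     ]   [ 0 ]
    [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Predict f(x) = sum_i alpha_i * K(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def mean_relative_error(y_true, y_pred):
    """Average of |prediction - truth| / |truth| over all samples."""
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

# Synthetic sample set (assumption): three normalized performance features
# per sample and a made-up node power in watts with measurement noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(260, 3))
power = 100.0 + 80.0 * X[:, 0] + 20.0 * X[:, 1] + 10.0 * X[:, 2]
power = power + rng.normal(0.0, 2.0, size=260)

X_train, y_train = X[:200], power[:200]
X_test, y_test = X[200:], power[200:]

b, alpha = lssvm_fit(X_train, y_train, gamma=100.0, sigma=1.0)
y_hat = lssvm_predict(X_train, b, alpha, X_test, sigma=1.0)
mre = mean_relative_error(y_test, y_hat)
print(f"mean relative error on held-out samples: {mre:.3%}")
```

Unlike a classical SVM, which requires solving a quadratic program, the least squares variant reduces training to one linear solve, which keeps retraining cheap when new samples are collected. In practice the features would come from the attribute-reduction step, and `gamma` and `sigma` would be tuned, for example by cross-validation.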