%0 Journal Article
%T Optimization on MapReduce algorithm based on Hash table [基于哈希表的MapReduce算法优化]
%A LI Rui-xia [李瑞霞]
%A LIU Ren-jin [刘仁金]
%A ZHOU Xian-cun [周先存]
%J Journal of Shandong University (Natural Science) [山东大学学报(理学版)]
%D 2015
%R 10.6040/j.issn.1671-9352.0.2014.461
%X Abstract: Distributed parallel computing is a common way to improve computing performance. However, there is no uniform model or method for designing parallel programs across different requirements, so writing them depends entirely on the developer's experience. MapReduce, the distributed parallel programming model proposed by Google, supports the development and execution of certain classes of parallel programs. This paper optimizes the MapReduce model with a hash table, which reduces fragmentation in the intermediate results, makes the call to the intermediate Combiner function unnecessary, lowers the transmission load, and improves running efficiency. The optimization also preserves the interface attributes of the Map and Reduce functions, so the parallelism of the MapReduce model is maintained.
%K distributed
%K Hash table
%K Hadoop
%K MapReduce
%K Map function
%K parallel
%U http://lxbwk.njournal.sdu.edu.cn/CN/10.6040/j.issn.1671-9352.0.2014.461
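
The abstract does not give the authors' implementation, so the following is only a minimal sketch of the general idea it describes: aggregating values in a hash table inside the Map step so that each mapper emits one pair per distinct key and no separate Combiner pass is needed. The word-count task and every function name here (`map_with_hash_table`, `shuffle`, `reduce_counts`, `run_job`) are assumptions chosen for illustration, and the job runs in a single process rather than on Hadoop.

```python
from collections import defaultdict


def map_with_hash_table(lines):
    """Map step (hypothetical): count words in a hash table before emitting.

    A plain mapper would emit (word, 1) for every occurrence. Aggregating
    locally means one (word, count) pair is emitted per distinct word.
    """
    counts = defaultdict(int)
    for line in lines:
        for word in line.split():
            counts[word] += 1
    return list(counts.items())


def shuffle(mapped_outputs):
    """Group the values from all mappers by key, as the framework would."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: the (key, list of values) interface is left unchanged."""
    return key, sum(values)


def run_job(splits):
    """Run the sketch: one mapper per input split, then shuffle and reduce."""
    mapped = [map_with_hash_table(split) for split in splits]
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    splits = [["a b a", "b a"], ["a c"]]
    print(run_job(splits))
```

In this sketch the first split contains five word occurrences but its mapper emits only two pairs, which is the kind of reduction in intermediate data that the abstract attributes to the hash table. The Reduce function still receives a key and a list of values, so its interface is the same as in an unoptimized job.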