|
Evaluation of Data Processing Using MapReduce Framework in Cloud and Stand-Alone Computing

Keywords: MapReduce, Hadoop, Cloud Computing, Data Processing, Parallel and Distributed Processing

Abstract: The MapReduce framework is an effective technique for processing and analysing large amounts of data. It is a programming model used to rapidly process vast amounts of data in parallel and distributed mode on a large cluster of machines. Hadoop is an open-source implementation of MapReduce for writing and running MapReduce applications. The problem addressed is to determine which computing environment improves the performance of MapReduce when processing large amounts of data. Stand-alone and cloud computing implementations are used in an experiment to evaluate whether running a MapReduce system in cloud computing mode performs better than running it in stand-alone mode, with respect to processing speed, response time and cost efficiency. The comparison uses datasets of different sizes to show how MapReduce processes large datasets in both modes. The finding is that running a MapReduce program to process and analyse large datasets in a cloud computing environment is more efficient than running it in stand-alone mode.
|
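To make the programming model named in the abstract concrete, the following is a minimal, single-process sketch of the map, shuffle and reduce phases applied to word counting. It is an illustration only, not the implementation evaluated in this work: it does not use Hadoop or any cluster, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce sequentially in one process.

    A real framework would distribute the map and reduce calls across
    many machines; here they run in a single loop for clarity.
    """
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data big cluster", "data in the cloud"]
    print(word_count(sample))
```

Because each `map_phase` call depends only on its own input line and each `reduce_phase` call only on one key, both steps can in principle be spread over many workers, which is the property a cluster-based framework exploits.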