|
- 2018
Performance Evaluation and Resource Optimization of Parallel Hadoop Clusters with an Intelligent Scheduler
Abstract: Data generated by real-time information systems grows incrementally and takes varied representations across today's industries. Processing data at this scale requires a parallel processing system such as a Hadoop cluster. A major challenge in a cluster-based system is evaluating the performance of the system and optimizing its resources. This research proposes a model for a Hadoop cluster with a super node that manages the cluster and a mediation manager that monitors performance. The super node is equipped with an intelligent scheduler that schedules each job with optimal resources. The intelligent scheduler applies the crossover and mutation principle of a genetic algorithm to find the best-matching resource. The mediation node deploys the Ganglia monitor to collect and record the performance parameters of the Hadoop cluster. Overall, the system schedules different jobs with optimal use of resources, achieving better efficiency than the native Hadoop scheduler.
|
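The abstract does not give the scheduler's encoding, fitness function, or operators, so the following is only a minimal sketch of how a genetic algorithm with crossover and mutation could match jobs to cluster nodes. Every name here (`makespan`, `crossover`, `mutate`, `schedule`, `job_costs`, `node_speeds`) is hypothetical, and the fitness used (finish time of the busiest node) is an assumption rather than the paper's metric.

```python
import random


def makespan(assignment, job_costs, node_speeds):
    """Finish time of the busiest node (assumed fitness; lower is better)."""
    load = [0.0] * len(node_speeds)
    for job, node in enumerate(assignment):
        load[node] += job_costs[job] / node_speeds[node]
    return max(load)


def crossover(a, b, rng):
    """Single-point crossover: head of parent a joined to tail of parent b."""
    if len(a) < 2:
        return a[:]
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(assignment, n_nodes, rate, rng):
    """Move each job to a random node with probability `rate`."""
    return [rng.randrange(n_nodes) if rng.random() < rate else node
            for node in assignment]


def schedule(job_costs, node_speeds, pop_size=30, generations=100,
             rate=0.1, seed=0):
    """Return a job -> node assignment found by a simple elitist GA."""
    rng = random.Random(seed)
    n_jobs, n_nodes = len(job_costs), len(node_speeds)

    def fitness(individual):
        return makespan(individual, job_costs, node_speeds)

    # Each individual is a list: index = job, value = node chosen for it.
    population = [[rng.randrange(n_nodes) for _ in range(n_jobs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[:pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), n_nodes, rate, rng))
        population = elite + children
    return min(population, key=fitness)
```

Calling `schedule([4, 8, 2, 6, 3, 7], [1.0, 2.0, 4.0])` returns a list whose i-th entry is the node index chosen for job i. Keeping the better half of each generation unchanged means the best assignment found so far is never lost; a real scheduler would presumably score candidates with live metrics (for example those collected by Ganglia) instead of the fixed costs assumed here.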