A Job Utility and Size-based Scheduler for Meeting the Client’s Job Requirements in Hadoop

Aditi Jain; Sanjay Jain; DA Mehta

Research Article Open Access

Abstract

Hadoop, an implementation of the MapReduce paradigm, is a powerful open-source parallel processing framework for handling big data on distributed clusters of commodity hardware, such as clouds. Proper scheduling of jobs on such a distributed cluster is an important factor in determining the cluster’s performance. It requires efficient algorithms that focus on meeting the job requirements provided by clients, such as job deadline and job priority, and on improving the average job response time in the cluster. A client’s requirement on job completion is an important measure of the service quality that the client obtains from the cloud. The utility of a job denotes the quality-of-service requirements between client and service provider. Existing job schedulers in Hadoop (viz., FIFO, Fair Scheduler, Capacity Scheduler) usually ignore the job requirements (such as job deadline and job priority) specified by clients. There is a need for a scheduler that schedules jobs efficiently while considering the clients’ job requirements. This work addresses the problem of scheduling jobs while taking into account the job requirements specified by the client. To satisfy these requirements, the proposed scheduling algorithm calculates a utility value for each job from the client-specified job requirements and the estimated job size. The results show an increase in the percentage of jobs that meet the client’s job requirements under the proposed scheduler compared with the default scheduler in Hadoop.
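
The abstract does not state the utility formula, so the following is only a minimal sketch of the general idea of ranking jobs by a utility value derived from deadline, priority and estimated size. The `Job` fields, the `utility` formula (priority divided by the product of slack and estimated size) and the `schedule` helper are all hypothetical and chosen for illustration; they are not the authors' algorithm and do not use any Hadoop API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    deadline_s: float   # client-specified deadline, in seconds from now
    priority: int       # client-specified priority, larger means more important
    est_size_s: float   # estimated job size, expressed here as run time in seconds


def utility(job: Job) -> float:
    """Hypothetical utility: grows with priority, shrinks with slack and size.

    This formula is an assumption made for the sketch, not the paper's.
    """
    # Slack is the spare time before the deadline; clamp to avoid division by zero.
    slack = max(job.deadline_s - job.est_size_s, 1.0)
    return job.priority / (slack * job.est_size_s)


def schedule(jobs: List[Job]) -> List[Job]:
    """Return jobs ordered from highest to lowest utility."""
    return sorted(jobs, key=utility, reverse=True)


if __name__ == "__main__":
    queue = [
        Job("A", deadline_s=600, priority=1, est_size_s=300),
        Job("B", deadline_s=200, priority=3, est_size_s=100),
        Job("C", deadline_s=1000, priority=3, est_size_s=800),
    ]
    for job in schedule(queue):
        print(job.name, utility(job))
```

Under this assumed formula, a small, high-priority job with little slack is served first, which is one plausible way to combine client requirements with job size. A real scheduler would also need a size estimator and a way to refresh utilities as deadlines approach.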
