Fairness in Job Scheduling on Cplant
The primary focus of research in job scheduling has been to increase utilization and improve desired user metrics such as turnaround time and slowdown. Very little research has addressed fairness. The Cplant scheduler uses a "fair share" measure to order jobs in the queue, but how fair is this scheduler?
One possible approach to assessing fairness is as follows: for each job, count the jobs with a higher user usage-count that are serviced while the job waits. The difficulty with that approach is accounting for "benign" backfilling, which uses slots in the schedule that the waiting job could not have used. Instead, we propose assigning a "fair-start" time to each job when it is submitted by generating a non-backfilling, in-order schedule based on fairness priority. If a job's actual start time does not exceed its fair-start time, the job is considered to have been treated fairly. Otherwise, it is considered to have been treated unfairly.
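The fair-start idea can be illustrated with a minimal sketch. It assumes a simplified machine model (a single pool of identical processors, exact runtimes known in advance) and a numeric priority where a lower value is served first; the names Job, fair_start_times, and treated_fairly are hypothetical and are not taken from the Cplant scheduler.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int        # processors requested
    runtime: float    # assumed known exactly (a simplification)
    priority: float   # assumption: lower value = served first


def fair_start_times(total_procs, running, queued, now=0.0):
    """Build a non-backfilling, in-order schedule and return each
    queued job's start time in it.

    running: list of (end_time, procs) for jobs already on the machine.
    queued:  list of Job, including the newly submitted one.
    """
    free = total_procs - sum(procs for _, procs in running)
    releases = list(running)          # min-heap of (end_time, procs)
    heapq.heapify(releases)
    clock = now                       # never moves backward: no backfilling
    starts = {}
    # sorted() is stable, so equal priorities keep their submission order.
    for job in sorted(queued, key=lambda j: j.priority):
        if job.procs > total_procs:
            raise ValueError(f"{job.name} can never fit on the machine")
        # Wait for running jobs to finish until this job fits.
        while free < job.procs:
            end, procs = heapq.heappop(releases)
            clock = max(clock, end)
            free += procs
        starts[job.name] = clock
        free -= job.procs
        heapq.heappush(releases, (clock + job.runtime, job.procs))
    return starts


def treated_fairly(actual_start, fair_start):
    """A job is treated fairly if it starts no later than its fair-start time."""
    return actual_start <= fair_start
```

Because the clock only moves forward as the priority-ordered queue is walked, a lower-priority job never starts before a higher-priority one in this schedule, which is what makes it a backfill-free reference point.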
The original Cplant scheduler implemented no-guarantee backfilling, in which no submitted job receives a scheduler reservation. Consequently, no job that can backfill onto open processors is ever blocked from doing so.
To prevent starvation, the Cplant scheduler drains the machine when a job has been waiting for more than twenty-four hours. While Cplant is being drained, many processors can sit idle even though waiting jobs would fit on the machine. To increase processor utilization and decrease job waiting times, aggressive (EASY) backfilling was added to the Cplant scheduler drain. With aggressive backfilling, a job that fits on Cplant is scheduled while the machine is being drained as long as the time requested for the job is no greater than the maximum remaining time of the jobs running on the machine. As a result, one job gets a reservation, and that job can block backfilling.
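The drain trigger and the backfill condition described above can be written as two small predicates. This is a sketch under stated assumptions, not the Cplant implementation: times are in seconds, the function and parameter names are hypothetical, and the behavior when no jobs are running (no backfill window) is an assumption the text does not specify.

```python
STARVATION_LIMIT = 24 * 60 * 60  # twenty-four hours, in seconds


def should_drain(longest_wait):
    """Start draining once some job has waited more than twenty-four hours."""
    return longest_wait > STARVATION_LIMIT


def can_backfill_during_drain(job_procs, job_requested_time,
                              free_procs, running_remaining):
    """Decide whether a waiting job may start while the machine drains.

    running_remaining: remaining times of the jobs still running.
    The job must fit on the free processors, and its requested time must
    not exceed the maximum remaining time of the running jobs, so it
    cannot delay the point at which the machine is fully drained.
    """
    if not running_remaining:
        # Assumption: nothing is running, so the drain is already complete.
        return False
    fits = job_procs <= free_procs
    finishes_in_time = job_requested_time <= max(running_remaining)
    return fits and finishes_in_time
```

For example, with 16 free processors and running jobs that have 1800 and 7200 seconds remaining, a 16-processor job requesting 3600 seconds qualifies, while the same job requesting 10800 seconds does not.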
When Cplant became oversubscribed, the combination of no-guarantee backfilling and the starvation drain made Cplant "unfair". We studied how various alternatives to the above scheduler affect "fairness".