New Challenges in Scheduling Theory

September, 12-17 2010, Centre CNRS "La Villa Clythia", Frejus

scheduling DAG of tasks with communications on large scale machines

SpeakerJean-Noel Quintin

Co-authorFrederic Wagner

We consider the problem of scheduling DAG of tasks with communications on large scale machines. This problem derives from a practical implementation of a distributed \emph{make} program. Some heuristics like Dominant Sequence Clustering (DSC) and Earliest Task First (ETF) solve this problem by using a high amount of application information such as the task execution time and the data transfer time between tasks. Such information is difficult to obtain in practical cases. Moreover supplied information can easily turn false especially with the time sharing on computers and the network contention. We introduce two online scheduling algorithms trying to balance load while reducing communications costs relying only on DAG topology. Load-balancing is achieved with modified versions of the classical work-stealing algorithm enabling us to avoid centralization points. We exhibit reasonable examples showing that our algorithms on the online problem can outperform DSC and ETF on the offline problem. Moreover our schedule can react if the network or computer performance change during the execution. Algorithms are evaluated through simulations.