On the complexity of task graph scheduling with transient and fail-stop failures
Co-authorAnne Benoit, Emmanuel Jeannot, Yves Robert
This paper deals with the complexity of task graph scheduling with transient and fail-stop failures. While computing the reliability of a given schedule is easy in the absence of task replication, the problem becomes much more difficult when task replication is used. We fill an important complexity gap of the scheduling literature: our main result is that this reliability problem is #P'-Complete (hence at least as hard as NP-Complete problems), both for transient and for fail-stop processor failures. We also study the evaluation of a restricted class of schedules, where a task cannot be scheduled before all replicas of all its predecessors have completed their execution. Although the complexity in this case with fail-stop failures remains open, we provide a method to estimate the reliability while limiting evaluation costs, and we validate it empirically.