Processor-oblivious parallel algorithms and scheduling: illustration on parallel prefix
The modified randomised work-stealing proposed by Bender&Rabin schedules a parallel program that performs W1 operations with a critical path Winf in time W1/(p.πave) +O((p.Winf)/(πave)) on p processors with average speed πave. Thus, it is well suited to schedule a fine-grain parallel algorihm that performs a nearly optimal number W1 of operations on processors with changing speeds. However, numerous parallel algorithms require to increase the number W1 of operations with respect to the one Ws of an optimal sequential algorithm.
In this talk, we present a generic adaptive algorithm based on the coupling of both a sequential optimal algorithm and a fine grain parallel one, but with non optimal work W1. The coupling is performed on-line, directed by a work-stealing schedule; it adapts the number of operations to the number and speeds of available processors. Also, the algorithm can be considered as processor-oblivious.
This generic adaptive scheme is illustrated on processor oblivious prefix computation, where W1 is strictly lower bounded by 2n - Winf; this implies a lower bound 2n/(p+1) for the parallel time on p identical processors. Extending this lower bound to processors with changing speeds, we prove that the processor-oblivious prefix computation achieves a nearly optimal parallel time 2n/((p+1).πave) + O(p.log n/(πave)). Experimentation on a 8-SMP machine exhibits this result: the processor-oblivious prefix compares favorably to an optimal parallel prefix algorithm for p identical processors.