My PhD Thesis


My thesis was distinguished by the AFIT in 2001. Here is a link to the list of theses distinguished that same year.

The topics I addressed during my PhD thesis (defended in 2001) mostly cover code transformations and their application to parallelism or memory optimization. My work focused on instruction shifting: I tried to formulate and classify the possibilities offered by this transformation. Most of this research was conducted with Alain Darte.

Software pipelining

I studied the decomposed approach to software pipelining. This approach formulates the problem as the composition of an instruction shift and an acyclic schedule (loop compaction). Building on the previous work of Calland, Darte and Robert, I tried to clarify the importance of the choice of a particular shift, and I proposed efficient solutions to some problems that were until then poorly understood.

From a theoretical point of view, I proposed a graph algorithm that minimizes the total number of constraints remaining for the compaction phase. This algorithm can easily be combined with other objectives such as the minimization of the critical path. This result was presented at the LCPC international workshop. From a practical point of view, I tested my technique on many example graphs. These tests were important for two reasons: the relative absence of experimental results on decomposed software pipelining, and the need for a study of the relationship between instruction shifting and loop compaction. To perform these tests, I implemented the PASTAGA platform (for Plateforme d'Analyse Statistique et de Tests d'Algorithmes sur Graphes Aléatoires). This software is available on request under the GPL license. These experimental results were published in the journal IJPP.
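To make the effect of a shift concrete, here is a minimal sketch in Python. It is not the algorithm of the thesis: it only evaluates a given shift. It assumes the usual retiming convention, in which an edge (u, v) with dependence distance d gets the new distance d + shift[v] - shift[u]; the function names and the three-statement example graph are hypothetical. Edges whose new distance is zero are the ones that still constrain the compaction of the loop body.

```python
def shifted_distances(edges, shift):
    """Apply an instruction shift to a dependence graph.

    edges: dict mapping (u, v) to the dependence distance (in iterations).
    shift: dict mapping each statement to an integer shift.
    Assumed convention: new distance = d + shift[v] - shift[u].
    """
    return {(u, v): d + shift[v] - shift[u] for (u, v), d in edges.items()}


def is_legal(distances):
    """A shift is legal only if no dependence distance becomes negative."""
    return all(d >= 0 for d in distances.values())


def compaction_constraints(distances):
    """Edges with distance 0 stay inside the loop body and must be
    respected by the acyclic schedule (loop compaction)."""
    return sorted(e for e, d in distances.items() if d == 0)


# Hypothetical example: a cycle A -> B -> C -> A of total distance 2.
edges = {("A", "B"): 0, ("B", "C"): 0, ("C", "A"): 2}
shift = {"A": 0, "B": 1, "C": 1}

new = shifted_distances(edges, shift)
remaining = compaction_constraints(new)
```

In this example the original loop body has two zero-distance edges (a chain A, B, C), while the shifted body has only one (B to C). The total distance along the cycle is unchanged by the shift, so at least one zero-distance edge must remain here.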

Stage scheduling

The use of a large number of registers is a problem with most software pipelining algorithms (and a problem intrinsic to parallelism). Usually, the more parallelism a program contains, the larger the number of registers necessary to store the intermediate values of the computation. But a badly chosen transformation is also an unnecessary source of huge register requirements.

Stage scheduling is a technique that modifies a cyclic schedule, without altering its performance, in order to reduce its register requirements. Until now, the only exact solutions to this problem relied on the resolution of a (potentially) exponential number of integer linear programs. Furthermore, the complexity of the problem was not known. In my most recent research I proved that this problem is indeed NP-complete, and I proposed an improvement over the previous exact solution: a single integer linear program containing a polynomial number of constraints. I also proposed a polynomial heuristic providing a guaranteed approximation.
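The idea of moving an operation without changing the performance of the schedule can be illustrated by a small sketch. It is only an illustration under simplifying assumptions, not the integer linear program or the heuristic mentioned above: operations are moved by whole multiples of the initiation interval (called `ii` here), so their position modulo `ii` does not change, and the lifetime of a value is modelled simply as the span from the issue of its producer to the issue of its last consumer. All names and the example schedule are hypothetical.

```python
def is_valid(times, edges, ii):
    """Each dependence (u, v, latency, distance) requires
    times[v] + ii * distance >= times[u] + latency."""
    return all(times[v] + ii * dist >= times[u] + lat
               for u, v, lat, dist in edges)


def total_lifetime(times, edges, ii):
    """Sum, over all producers, of the span from issue to last use
    (a simplified proxy for the register requirements)."""
    last_use = {}
    for u, v, _lat, dist in edges:
        span = times[v] + ii * dist - times[u]
        last_use[u] = max(last_use.get(u, 0), span)
    return sum(last_use.values())


def move_by_stages(times, op, k, ii):
    """Return a new schedule where `op` is delayed by k stages (k * ii cycles)."""
    moved = dict(times)
    moved[op] += k * ii
    return moved


# Hypothetical schedule with initiation interval 2: c consumes a and b.
ii = 2
edges = [("a", "c", 1, 0), ("b", "c", 1, 0)]
before = {"a": 0, "b": 6, "c": 7}
after = move_by_stages(before, "a", 3, ii)
```

In this example the value produced by `a` is kept alive for 7 cycles in the first schedule and for 1 cycle after the move, while `a` keeps the same slot modulo `ii` and the initiation interval is unchanged.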

Loop parallelization

From the point of view of iteration scheduling, one might wonder whether instruction shifting can be of any use. In fact, instruction shifting can be used as a standalone technique for loop parallelization, and I proposed a formulation of this problem. The advantages of such an approach are the simplicity of the resulting parallel code and its regular and fair behaviour with respect to other optimization objectives (mapping of data and computations, synchronisations, ...).

I proved that the problem is NP-complete, and consequently I proposed an exact formulation for its resolution as a system of integer linear inequalities. Additionally, I gave three polynomial heuristics (graph algorithms) solving particular cases, and a sufficient condition for the nonexistence of a solution. These results were presented at the international conference STACS.

Locality

Software pipelining, through instruction shifting, tends to increase parallelism while reducing the reuse of memory. On a machine for which memory usage is critical, or in the context of circuit synthesis, it can be interesting to address the inverse problem: increasing locality by increasing data reuse within the loop body (temporal locality).

I proved that this problem is NP-complete and I proposed an exact method to solve it by integer linear programming. Although this initial problem is not an entirely realistic objective, it is useful for the overall understanding of instruction shifting.

Additionally, I proved that this problem is in fact equivalent to a more realistic and important problem: array contraction. Thus array contraction with fusion and instruction shifting is NP-complete, and I proposed an exact method (by integer linear programming) to solve it, even in the case of non-uniform dependences.
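As an illustration of what array contraction with fusion and shifting means, here is a small hand-written example in Python. It is only a sketch on a hypothetical pair of loops, not the exact method mentioned above, and it assumes that the temporary array `tmp` is not used after the two loops. In the original code the second loop reads `tmp[i - 1]`; shifting the consumer by one iteration makes the dependence stay within a single iteration of the fused loop, so the array can be replaced by a scalar.

```python
def original(a):
    """Two loops communicating through a temporary array."""
    n = len(a)
    tmp = [0] * n
    b = [0] * n
    for i in range(n):
        tmp[i] = 2 * a[i]
    for i in range(1, n):
        b[i] = tmp[i - 1] + 1
    return b


def fused_shifted_contracted(a):
    """Same computation after fusion, a shift of the consumer by one
    iteration, and contraction of tmp into the scalar t.
    Assumption: tmp is dead after the loops, so tmp[n - 1] is not computed."""
    n = len(a)
    b = [0] * n
    for i in range(n - 1):
        t = 2 * a[i]      # scalar replaces tmp[i]
        b[i + 1] = t + 1  # consumer shifted by one iteration
    return b
```

Both functions return the same result for any input list; the second one needs no temporary array.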