Runtime Interprocedural Data Placement Optimisation for
Lazy Parallel Libraries
Paul H. J. Kelly and Olav Beckmann
We are developing a lazy, self-optimising parallel library of
vector-matrix routines. The aim is to allow users to parallelise
certain computationally expensive parts of numerical programs by
simply linking with a parallel rather than sequential library of
subroutines. The library performs interprocedural data placement
optimisation at runtime, which requires the optimiser itself to be
very efficient. We achieve this firstly by working from aggregate
loop nests which have been optimised in isolation, and secondly by
using a carefully constructed mathematical formulation for data
distributions and the distribution requirements of library operators,
which together make the optimisation algorithm both simple and
efficient.
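
The lazy evaluation idea described above can be illustrated with a minimal sketch. The abstract does not give the library's interface, so every name here (Lazy, vector, add, scale, count_ops, force) is hypothetical, and the code runs sequentially on plain Python lists. It shows only that library calls can record operations rather than execute them, so that a whole chain of calls is visible at the moment a result is demanded. That moment is where a runtime optimiser could, in principle, choose data placements across several calls.

```python
class Lazy:
    """A deferred vector operation; nothing is computed until force()."""

    def __init__(self, op, args=(), data=None):
        self.op = op        # "leaf", "add" or "scale" (hypothetical operator set)
        self.args = args    # operands: Lazy nodes, or a scalar for "scale"
        self.data = data    # the computed value, or None while still deferred


def vector(xs):
    """Wrap concrete data as a leaf node."""
    return Lazy("leaf", data=list(xs))


def add(a, b):
    """Record an element-wise addition without performing it."""
    return Lazy("add", (a, b))


def scale(k, a):
    """Record a scalar multiplication without performing it."""
    return Lazy("scale", (k, a))


def count_ops(node):
    """Count deferred operations in the expression tree rooted at node."""
    if node.op == "leaf":
        return 0
    return 1 + sum(count_ops(a) for a in node.args if isinstance(a, Lazy))


def force(node):
    """Evaluate the recorded operations on demand and cache the result."""
    if node.data is None:
        if node.op == "add":
            x, y = force(node.args[0]), force(node.args[1])
            node.data = [p + q for p, q in zip(x, y)]
        elif node.op == "scale":
            k, a = node.args
            node.data = [k * p for p in force(a)]
    return node.data


r = add(scale(2, vector([1, 2, 3])), vector([10, 20, 30]))
pending = count_ops(r)   # both operations are still deferred here
result = force(r)        # evaluation happens only at this point
```

In this sketch the user-facing calls look like ordinary library calls, which matches the goal of parallelising by relinking rather than rewriting. The placement optimisation itself is not modelled.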