Data Parallel Skeletons

Herbert Kuchen

Algorithmic skeletons are abstractions of common parallel programming patterns. We are particularly interested in using them for distributed memory machines. In order to reach the required generality, they are typically defined as polymorphic higher-order functions. It is possible to distinguish data parallel, task parallel, and application-oriented skeletons. Typically, data parallelism offers better scalability and is easier to implement efficiently than task parallelism. Thus, we focus (at least for the time being) on data parallel skeletons, i.e. skeletons working on a distributed data structure, in our case arrays.

These skeletons can mainly be divided into monolithic computation operations (like map and fold), which work in parallel on the elements of an array, and monolithic communication operations, which change the mapping of the partitions of an array onto the (local stores of the) available processors. The communication operations perform a coordinated overall communication (rather than the transmission of individual messages), and they can thus be implemented in a way which guarantees that typical communication problems known from low-level message passing, such as deadlocks, cannot occur.
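As a minimal sketch (not the author's actual library; all class and function names are hypothetical), the following C++/MPI program illustrates both kinds of skeletons on a block-distributed array: map and fold as monolithic computation operations, and a rotate operation as a monolithic communication operation whose paired send/receive makes the deadlocks of hand-written message passing impossible.

\begin{verbatim}
#include <mpi.h>
#include <cstdio>
#include <functional>
#include <vector>

// Block-distributed array: each processor owns one partition.
class DistArray {
  std::vector<double> local;  // this processor's local share
  int rank, np;
public:
  explicit DistArray(int localSize) : local(localSize) {
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    for (int i = 0; i < localSize; ++i)   // globally distinct values
      local[i] = rank * localSize + i;
  }

  // Computation skeleton: elementwise, purely local, no communication.
  void map(const std::function<double(double)>& f) {
    for (double& x : local) x = f(x);
  }

  // Computation skeleton: local reduction, then one coordinated global
  // combine of the per-processor partial results.
  double fold(const std::function<double(double, double)>& f,
              double neutral) const {
    double part = neutral;
    for (double x : local) part = f(part, x);
    std::vector<double> parts(np);
    MPI_Allgather(&part, 1, MPI_DOUBLE,
                  parts.data(), 1, MPI_DOUBLE, MPI_COMM_WORLD);
    double result = neutral;       // combine partial results locally;
    for (double p : parts)         // a real library would rather use a
      result = f(result, p);       // tree-shaped reduction
    return result;
  }

  // Communication skeleton: shift every partition to the next processor.
  // The send and receive are paired into one overall exchange, so the
  // deadlocks possible with individual low-level messages cannot occur.
  void rotate() {
    std::vector<double> in(local.size());
    int dest = (rank + 1) % np, src = (rank + np - 1) % np;
    MPI_Sendrecv(local.data(), (int)local.size(), MPI_DOUBLE, dest, 0,
                 in.data(), (int)in.size(), MPI_DOUBLE, src, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    local.swap(in);
  }
};

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  DistArray a(4);                             // 4 elements per processor
  a.map([](double x) { return x * x; });      // square elementwise
  double s = a.fold([](double x, double y) { return x + y; }, 0.0);
  a.rotate();                                 // remap the partitions
  int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) std::printf("fold result: %f\n", s);
  MPI_Finalize();
}
\end{verbatim}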

A couple of techniques have been developed which allow for runtimes similar to those of C with message passing. Among them is a technique which instantiates higher-order functions to a set of corresponding first-order functions. Moreover, a decentralized execution model is used, in which ``sequential computations'' are replicated on every processor and skeletons are split in such a way that every processor takes care of the computations concerning its local share of the data.
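As a sketch of the instantiation idea (the concrete mechanism in the system described here may differ), C++ templates achieve a comparable effect: the functional argument becomes a template parameter, so the compiler generates a specialized first-order instance of the skeleton for each argument function and can inline the call, removing the overhead of a genuine higher-order invocation.

\begin{verbatim}
#include <cstdio>

// Higher-order skeleton: the argument function F is a template
// parameter, so each use below instantiates a specialized
// first-order version in which the call to f can be inlined.
template <typename F>
void mapLocal(double* data, int n, F f) {
  for (int i = 0; i < n; ++i) data[i] = f(data[i]);
}

double square(double x) { return x * x; }

int main() {
  double data[4] = {1.0, 2.0, 3.0, 4.0};
  mapLocal(data, 4, square);                           // first instance
  mapLocal(data, 4, [](double x) { return x + 1.0; }); // second instance
  for (double x : data) std::printf("%g ", x);         // prints: 2 5 10 17
  std::printf("\n");
}
\end{verbatim}

Note also that the main function of the earlier MPI sketch already follows the decentralized execution model: the same program is replicated on every processor, and each skeleton call performs only the part of the computation concerning the local partition.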