Exploiting Maximum Parallelism in Hierarchical Numerical Applications

Alexander Pfaffinger

Using hierarchical basis functions for the d-dimensional multilinear representation of a function, the number of grid points can be reduced drastically from n^d to n (log n)^(d-1) without a significant increase of the approximation error. This leads to so-called sparse grids. Their natural implementation uses binary trees and d-dimensional product graphs of binary trees instead of flat arrays.
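
As an illustration only (the C++ sketch and all names below are mine, not taken from the paper), the local structure of these trees can be written as follows: a 1D node stores one hierarchical surplus and its two refinement children, and a node of the d-dimensional product graph carries one child pair per coordinate direction. In the actual product graph a node can be reached along several refinement orders, so shared ownership (or a level/index hash table) would be needed; the sketch shows only the local links.

    #include <array>
    #include <memory>

    // 1D hierarchical basis: each node holds the hierarchical surplus of one
    // grid point; its two children refine the two halves of its support.
    struct Node1D {
        double surplus = 0.0;
        std::shared_ptr<Node1D> left, right;
    };

    // d-dimensional product graph of binary trees: a node can be refined
    // independently in each of the d coordinate directions. shared_ptr hints
    // that a child may be reachable along several refinement paths.
    template <int d>
    struct NodeND {
        double surplus = 0.0;
        std::array<std::array<std::shared_ptr<NodeND>, 2>, d> children;
    };

    int main() {
        NodeND<2> root;                                        // 2D example
        root.children[0][0] = std::make_shared<NodeND<2>>();   // refine in x
        root.children[1][1] = std::make_shared<NodeND<2>>();   // refine in y
    }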

This product graph also reflects the dependency structure of the algorithm. Because of the complexity of this structure, exploiting the maximum inherent parallelism is tedious. An intuitive domain decomposition formulation of a sparse grid algorithm leads to a parallel complexity of O((log n)^d), whereas an optimal implementation would achieve O(log n) complexity. The intuitive algorithm also results in an inefficient communication and synchronization pattern.
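
To make the dependency structure concrete, the following sketch (my own C++ illustration with hypothetical names, not code from the paper) hierarchizes nodal values on a one-dimensional dyadic grid by recursive bisection. The surplus of each midpoint depends only on the nodal values at the endpoints of its interval, so the two half-interval recursions are independent and the parallel depth of this 1D step is O(log n); the intuitive d-dimensional domain decomposition chains such sweeps over the dimensions and subdomains, which is, roughly, where the O((log n)^d) depth quoted above comes from.

    #include <cstdio>
    #include <vector>

    // Convert nodal values u[0..n] on a dyadic 1D grid (n a power of two,
    // boundary points included) into hierarchical surpluses, in place.
    // uleft/uright are the original nodal values at the interval endpoints.
    void hierarchize(std::vector<double>& u, int left, int right,
                     double uleft, double uright) {
        int mid = (left + right) / 2;
        if (mid == left || mid == right) return;     // no interior point left
        double umid = u[mid];                        // nodal value before overwrite
        u[mid] = umid - 0.5 * (uleft + uright);      // surplus: value minus interpolant
        hierarchize(u, left, mid, uleft, umid);      // the two halves are
        hierarchize(u, mid, right, umid, uright);    // completely independent
    }

    int main() {
        // samples of f(x) = 4x(1-x) on a grid with 8 intervals
        std::vector<double> u = {0.0, 0.4375, 0.75, 0.9375, 1.0,
                                 0.9375, 0.75, 0.4375, 0.0};
        hierarchize(u, 0, (int)u.size() - 1, u.front(), u.back());
        for (double s : u) std::printf("%g ", s);    // surpluses 1, 0.25, 0.0625, ...
        std::printf("\n");
        return 0;
    }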

On the other hand, coding an optimal program in a conventional imperative language (e.g. C with PVM) is a hard task for general dimension d. In the new data flow language FASAN, the programmer only has to specify the mathematical data dependencies between the parts of the algorithm. The semantics of "wrapper streams" automatically generates direct communication channels between the dependent nodes, while the data flow semantics sends data as soon as they are produced. Thus, the optimal parallel complexity can be obtained even from an intuitive divide-and-conquer description.
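
No FASAN source appears in this abstract, so the following is only a loose analogy in C++ (my own sketch, with hypothetical names): futures act like direct producer-to-consumer channels, and each half of the divide-and-conquer starts as soon as the values it depends on have been produced, rather than after a global synchronization step.

    #include <cstdio>
    #include <future>
    #include <vector>

    // Divide-and-conquer hierarchization in which the left half is spawned
    // as an asynchronous task. Each task owns a disjoint index range; the
    // only data exchanged are the endpoint values passed down and the
    // future handed back, loosely resembling a direct channel between the
    // producing and the consuming node.
    void hierarchize_par(std::vector<double>& u, int left, int right,
                         double uleft, double uright) {
        int mid = (left + right) / 2;
        if (mid == left || mid == right) return;
        double umid = u[mid];
        u[mid] = umid - 0.5 * (uleft + uright);
        // Both halves depend only on umid, uleft, uright, which are already
        // available, so they can run concurrently (depth O(log n) in 1D).
        auto lhs = std::async(std::launch::async, hierarchize_par,
                              std::ref(u), left, mid, uleft, umid);
        hierarchize_par(u, mid, right, umid, uright);  // right half runs here
        lhs.get();                                     // join only the left half
    }

    int main() {
        std::vector<double> u = {0.0, 0.4375, 0.75, 0.9375, 1.0,
                                 0.9375, 0.75, 0.4375, 0.0};
        hierarchize_par(u, 0, (int)u.size() - 1, u.front(), u.back());
        for (double s : u) std::printf("%g ", s);
        std::printf("\n");
        return 0;
    }

In the sketch the scheduling is still spelled out by the programmer (spawning one task per subinterval); according to the abstract, FASAN derives this behaviour from the wrapper stream semantics instead.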