next up previous contents index
Next: Symmetric Partitioning Technique Up: Parallel Joins Previous: Parallel Joins

Fragment-And-Replicate Technique

       

The fragment-and-replicate (f-a-r) strategy partitions only one relation and replicates the other for joining it with each fragment:  
 \begin{displaymath}
R \Join_{\scriptscriptstyle C}Q \;=\;
R_1 \Join_{\scriptscri...
 ... C}Q \,\cup\, \cdots \,\cup\, R_m \Join_{\scriptscriptstyle C}Q\end{displaymath} (8)
This method is particularly useful if R is huge and Q is small. This is a situation that occurs in what is frequently referred to as a star-join   [Red Brick Systems, 1995b]. An advantage is that there are no constraints on the Rk. Any partition of R into subsets $R_1, \dots, R_m$ will suit, especially a partition of R that might reside on the disks in the case of a parallel I/O system. This also means that this technique need not be affected by data skew . For this reason, load balancing  is fairly easy to achieve.

The major drawback, however, is that Q is required to be small in order to keep replication  costs marginal. If this is not the case then shipping Q over the interconnect of the parallel hardware can become the major bottleneck.

Depending on the join algorithm that is employed for processing the partial joins $R_k \Join_{\scriptscriptstyle C}Q$ we get various search patterns. Figures 3.17 and 3.18 show the ones that arise when using a nested-loop and a sort-merge join respectively.




  
Figure: Search strategy of the fragment-and-replicate technique with the partial joins performed as nested-loops.
\begin{figure}
 \epsfxsize=0.9\textwidth
 \centerline{
 \epsffile{/home/tz/work/thesis/fig/j-far-1.ps}}
 \centerline{
 } \end{figure}




  
Figure: Search strategy of the fragment-and-replicate technique with the partial joins performed as sort-merge.
\begin{figure}
 \epsfxsize=0.9\textwidth
 \centerline{
 \epsffile{/home/tz/work/thesis/fig/j-far-2.ps}}
 \centerline{
 } \end{figure}


next up previous contents index
Next: Symmetric Partitioning Technique Up: Parallel Joins Previous: Parallel Joins

Thomas Zurek