next up previous contents index
Next: A Short Summary Up: Explicit-Partitioning Join Algorithms Previous: Partitioned Temporal Join for

Spatially Partitioned Temporal Join

  

In this section, we describe a partition scheme which was proposed by [Lu et al., 1994]. It creates disjoint fragments for both relations. Hence, it is a symmetric partitioning  approach. On the other side however, a fragment of one relation needs to be joined not only with one fragment of the other but - depending on its index number - with a set of fragments. Thus, if this method is to be parallelised, fragments need to be replicated in order to achieve independent partial joins. From this point of view, one could also speak of a fragment-and-replicate  approach.

Lu et al. use a spatial rendition which was described in [Hinrichs and Nievergelt, 1983]. An interval [ts,te] is mapped to a point in a two-dimensional grid. Its coordinates, x and y, are calculated by the equations:

This means

The spatial rendition is then divided into m strips: assume that $t_{\min}$ is the minimum of the startpoints of all intervals. Then the time domain between $t_{\min}$ and $t_{\max}$ is divided into m ranges [t0,t1], (t1,t2], (t2,t3], ..., (tm-1,tm] with $t_0=t_{\min}$ and $t_m=t_{\max}$. This creates m strips with the k-th strip being bounded by the lines x=0, y=0, x+y=tk-1 and x+y=tk. Finally, each strip is divided by lines x=t1, x=t2, ..., x=tm-1. This creates $\frac{m \cdot (m+1)}{2}$fragmentation areas. Figure 4.13 shows the result of this process for m=4 and the example that has been used throughout this chapter. The figure also demonstrates the way in which the resulting fragments are numbered.

For processing the join $R
\Join_{\scriptscriptstyle C}Q$, a fragment Rk of R has to be joined with a certain set of fragments of Q. The members of this set are determined by k. The latter describes the position of the corresponding fragmentation area on the grid. This position indicates certain characteristics: for example, fragments[*] that are positioned in the upper part of the grid hold tuples whose interval timestamp  is quite long. These tuples are likely to join with many others. In contrast, short-lived tuples are situated in the lower part of the grid. These are likely to join with only a few tuples. Intuitively, this means that fragments in the upper part need to be joined with more fragments than those located in the lower part of the grid. For the situation in figure 4.13 the necessary combination of fragments is shown in figure 4.14(a). For a more formal description of how to calculate these combinations see [Lu et al., 1994]. The example of figure 4.13, however, has several empty fragments which reduces its matrix of necessary joins to the one shown in figure 4.14(b).

Figure 4.15 shows the resulting search strategy. It is quite efficient in avoiding unnecessary comparisons in this case. This, however, is partially due to the fact that the example has only a few long-lived tuples which causes fragments R6, Q6, R7, Q7, Q9, R10, Q10 to be empty. These are all fragments which are usually involved in many partial joins (see figure 4.14(a)). This reduces the number of necessary partial joins significantly (see figure 4.14(b)). Furthermore we note that tuples from one relation may be loaded sequentially but tuples from the second relation might require several accesses.

Alternatively, the partial joins can be processed in parallel. This, however, requires tuples of one relation to be replicated. Assume that this relation is Q. Consider now the example that we have used throughout the chapter. Regarding the matrix in figure 4.14(b) we note that R5 and R9 are to be joined with the same set of fragments of Q, namely Q2, Q3, Q4, Q5, Q8. Thus we can pack R5 and R9 together into one `super-fragment' which is to be joined with those fragments of Q. This avoids the need to provide both R5 and R9 with a copy each of those fragments, thus reducing tuple replication. Similarly, R4 and R8 can be packed together. Even R1 and R3 can form a super-fragment as R1 has to join with a subset of R3's Q-fragments. This groups the partial joins of figure 4.14(b) into three[*] `super-partial joins'. This is illustrated in figure 4.16.

A major problem arises in the case that three or more relations participate in the join. Then, the structures depicted as matrices in figure 4.14 would have to be cubic or of higher order. This means that a huge number of fragment combinations have to be joined in order to compute the global result. For m=4, for example, there are 70 partial joins for a global two-way join (see matrix in figure 4.14(a)). For m=4 and a three-way join, however, there are already 528 partial joins to be computed.

The approach taken by Lu et al. is similar to many that have been used in the area of spatial join processing. Many spatial join  algorithms are based on transforming an approximation of a spatial object into another domain, and performing filtering (e.g. in the form of partitioning) in the new domain [Patel and DeWitt, 1996]. Examples for such algorithms can be found in [Orenstein, 1986], [Orenstein and Manola, 1988] or [Becker et al., 1993].




  
Figure: Spatial rendition and numbering of fragments for the example.
\begin{figure}
 \epsfxsize=0.9\textwidth
 \centerline{
 \epsffile{/home/tz/work/thesis/fig/spatial-rendition.ps}}
 \centerline{
 } \end{figure}


    
Figure: Partial joins that are to be computed for processing the spatially partitioned join.
\begin{figure}
% latex2html id marker 4018
\begin{center}

\mbox{
\subfigure[In ...
 ...e $\bullet$}& {\large $\bullet$}\\ \hline\end{tabular}}}\end{center}\end{figure}




  
Figure: Search strategy for the spatially partitioned temporal join being processed sequentially. Partial joins are processed as nested loops.
\begin{figure}
 \epsfxsize=0.9\textwidth
 \centerline{
 \epsffile{/home/tz/work/thesis/fig/tj-spatial.ps}}
 \centerline{
 } \end{figure}




  
Figure: Search strategy for the spatially partitioned temporal join being processed in parallel. Partial joins are processed as nested loops.
\begin{figure}
 \epsfxsize=0.9\textwidth
 \centerline{
 \epsffile{/home/tz/work/thesis/fig/tj-sp-par.ps}}
 \centerline{
 } \end{figure}


next up previous contents index
Next: A Short Summary Up: Explicit-Partitioning Join Algorithms Previous: Partitioned Temporal Join for

Thomas Zurek