Preliminaries

Next: Temporal Join Processing Up: Temporal Join Processing Model Previous: Temporal Join Processing Model

Preliminaries

Before we can start to describe how a temporal join $R \Join_{\scriptscriptstyle C}Q$ is processed, we have to lay out the starting situation that we assume when such a join is processed on a hardware architecture as in figure 8.7.

Firstly, we have to decide on the temporal join algorithm that is to be used. We will model the performance of the algorithm presented in section 4.4.3 for the following reasons:

It is suitable for any configuration of the hybrid architecture that has been described in section 8.2.3; the algorithm presented by Soo et al., for example, is restricted to sequential processing (see section 4.4.4).
It is based on a symmetric partitioning approach and thus easily applicable to n-way joins for $n \ge 3$ , too. In sections 4.4.5 and 4.5 we have already seen that Lu et al.'s spatially partitioned join can be expected to perform poorly in such a situation.
It performs better than the basic algorithm of section 4.4.2 as it avoids redundancies.

Therefore it is the most versatile of the algorithms that are based on explicit partitioning. Furthermore, it can be expected to perform well according to the analysis of section 4.5.

Secondly, the temporal relations R and Q are assumed to be physically distributed over the disk systems of the nodes: the disks of node i hold fragments and of R and Q respectively () such that

These fragments are supposed to be pairwise disjoint, i.e.

for . Later, we see that the first processing stage converts these initial fragments into the R'_k, R''_k, Q'_k, Q''_k for $k=1,\dots,m$ , which are required for processing $R \Join_{\scriptscriptstyle C}Q$ based on symmetric partitioning. We will refer to this as the repartitioning stage (see figure 8.9).

Thirdly, there is a partitioning strategy which produces a partition which is defined as in chapter 5. It is used within the repartitioning stage for creating the fragments R'_k, R''_k, Q'_k, Q''_k for $k=1,\dots,m$ . Please note that it is P that provides the parameter m.

The repartitioning stage is followed by a joining stage in which the partial joins are computed. We assume that

(37)

are processed sequentially on one processor as proposed in section 4.4.3. We refer to the computation of the three joins in (8.1) as RQ_k ( $k=1,\dots,m$ ).

The machine has nodes each with processors. The nodes are numbered from 1 to and the processors from 1 to such that all processors of node i have numbers from to . A function gives the number of the node to which processor j belongs

So there are processors for performing m computations RQ_k with $k=1,\dots,m$ . We now look at the way in which the RQ_k will be distributed over the processors.

If then processors perform one computation RQ_k (for ) each. Processors remain idle in that case. If then the workload distribution is as follows: the processors are divided into two sets. One has A processors, each of which performs computations RQ_k (for ) with

(38)

The other processors perform computations RQ_k (for ) with

(39)

This is illustrated in figure 8.10. Please note that the processors work concurrently but each processor processes its `load' sequentially.

The values of A and B can easily be computed from the values of and from the following two constraints:

From (8.4) follows that . If this is used to replace B in (8.5) then we get

(40)

But if m is not a multiple of then we have

Therefore (8.6) works out to be

(41)

and consequently

(42)

because of (8.4). If m is a multiple of then we have and we do not need to divide the processors, i.e. it is and B = 0 in this case. Actually, we do not need to separate a situation with from one with as it is and in that case which leads to A = m and B = 0 because of (8.7) and (8.8) respectively. And if then it is which leads to A = m and B = 0, too.

Now, we look at the opposite side of the coin, i.e. we want to determine the number j of the processor that performs computation RQ_k with . To that end we have to assume that the RQ_k are assigned to the processors as in figure 8.10. We note that this might not be the optimal assignment, for example if the most expensive computations coincide to hold subsequent numbers and therefore happen to be performed subsequently by the same processor. This causes the respective processor's load to be the most expensive one and the one that determines the overall join processing performance. In order to keep our performance model simple and manageable for a query optimiser, we do not optimise such a situation at this stage by rearranging the computations' order. Such a rearrangement could imply overhead costs if an optimal placement cannot be determined beforehand. This means that data might have to be transferred through the network and one would need to analyse the tradeoff between this overhead and the optimised costs in order to see if such a rearrangement was worth while.

The function is used to give the number of the processor that performs computation RQ_k with according to an assignment as in figure 8.10. If RQ_k is among the first computations, i.e. , then it is performed by processor

Similarly, if then it is performed by a processor j > A, i.e. by processor

This leads to being defined as

(43)

is the lifespan covered by the tuples of R and Q. In this context $t_{\min}$ and $t_{\max}$ are

As mentioned above, P is an m-way partition of , i.e. a set of m-1 breakpoints

$\begin{displaymath} \{p_1, \dots, p_{m-1}\} \end{displaymath}$

with for . (or $-\infty$ ) and (or $+\infty$ ) are used as the left and the right delimiters of the plot. P divides into m partition ranges

for $k=1,\dots,m$ . The function determines the number of the fragment (partition range respectively) to which a timepoint belongs with respect to partition P

This means that a tuple r with timestamp [t_s,t_e] is put into and the R''_k with the k given by the set

We will use the functions and to refer to the index of the first and the last computation that is performed on processor j. According to figure 8.10, these functions are defined as

Thus the first and last computations at a node i are

respectively. Later, it will be useful to refer to the first fragments R'_k, R''_k, Q'_k, Q''_k on each node, i.e. the first fragments on the first processor of this node. The indices k of these fragments are collected in the set :

**Figure:** Repartitioning of the .

**Figure:** Workload distribution among processors.

Next: Temporal Join Processing Up: Temporal Join Processing Model Previous: Temporal Join Processing Model

Thomas Zurek