Before we can start to describe how a temporal join is
processed, we have to lay out the starting situation that we assume
for processing such a join on a hardware architecture as in
figure 8.7.
Firstly, we have to decide on the temporal join algorithm that is to be used. We will model the performance of the algorithm presented in section 4.4.3.
Secondly, the temporal relations R and Q are assumed to be physically distributed over the disk systems of the nodes: the disks of node $i$ hold fragments $R_i$ and $Q_i$ of R and Q respectively ($1 \le i \le n$) such that

$$\bigcup_{i=1}^{n} R_i = R \qquad\text{and}\qquad \bigcup_{i=1}^{n} Q_i = Q$$

These fragments are supposed to be pairwise disjoint, i.e.
$R_{i_1} \cap R_{i_2} = \emptyset$ and $Q_{i_1} \cap Q_{i_2} = \emptyset$ for $i_1 \ne i_2$. Later, we see that the first processing
stage converts these initial fragments into the $R'_k$, $R''_k$,
$Q'_k$, $Q''_k$ for $1 \le k \le m$, which are required for processing
based on symmetric partitioning. We will refer to this
as the repartitioning stage (see
figure 8.9).
Thirdly, there is a partitioning strategy which produces a partition $P$,
defined as in chapter 5.
It is used within the repartitioning stage for creating the fragments
$R'_k$, $R''_k$, $Q'_k$, $Q''_k$ for $1 \le k \le m$. Please note that
it is $P$ that provides the parameter $m$.
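To make the repartitioning stage more concrete, the following Python sketch shows one possible way of deriving the fragments $R'_k$ and $R''_k$ from the breakpoints provided by $P$. It is only an illustration under assumptions that are not spelled out in this section: tuples are reduced to their timestamp intervals (start, end), a tuple is placed into $R'_k$ for the range $k$ in which its interval starts, and it is replicated into $R''_l$ for every later range $l$ that the interval still overlaps. The function name repartition and this particular replication rule are illustrative; the authoritative definitions are those of chapter 5.

```python
from bisect import bisect_right

def repartition(tuples, breakpoints):
    """Split one relation fragment into primary fragments (R'_k) and
    replicated fragments (R''_k), one pair per partition range k.

    Assumed (not stated in this section): a tuple is a pair (start, end)
    of timestamps; the breakpoints p_1 < ... < p_{m-1} split the lifespan
    into m ranges numbered 1..m; a tuple goes into R'_k for the range k in
    which its interval starts and is replicated into R''_l for every later
    range l that it still overlaps."""
    m = len(breakpoints) + 1
    primary = {k: [] for k in range(1, m + 1)}     # R'_k
    replicated = {k: [] for k in range(1, m + 1)}  # R''_k
    for start, end in tuples:
        k = bisect_right(breakpoints, start) + 1   # range containing 'start'
        primary[k].append((start, end))
        for l in range(k + 1, m + 1):
            if end < breakpoints[l - 2]:           # ends before range l begins
                break
            replicated[l].append((start, end))
    return primary, replicated

# Example with m = 3 ranges separated by breakpoints 10 and 20:
primary, replicated = repartition([(2, 12), (15, 16), (8, 25)], [10, 20])
# (2, 12)  -> R'_1 and R''_2
# (15, 16) -> R'_2
# (8, 25)  -> R'_1, R''_2 and R''_3
```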
The repartitioning stage is followed by a joining stage in which the partial joins are computed. We assume that the temporal join decomposes into $m$ computations $RQ_k$, one for each of the $m$ partition ranges:

$$R \bowtie Q \;=\; \bigcup_{k=1}^{m} RQ_k \tag{37}$$
The machine has $n$ nodes, each with $c$ processors. The nodes are numbered from 1 to $n$ and the processors from 1 to $n \cdot c$ such that all processors of node $i$ have numbers from $(i-1) \cdot c + 1$ to $i \cdot c$. A function $node(j) = \lceil j / c \rceil$ gives the number of the node to which processor $j$ belongs.
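As a small illustration of this numbering (using the reconstructed symbols $n$ and $c$ from above), the node of a given processor can be computed as in the following sketch; apart from the function $node$ introduced above, everything here is illustrative.

```python
import math

def node(j, c):
    """Node number of processor j, assuming c processors per node and
    the numbering (i-1)*c + 1 .. i*c for the processors of node i."""
    return math.ceil(j / c)

# With c = 4 processors per node, processors 5..8 belong to node 2:
assert [node(j, 4) for j in range(5, 9)] == [2, 2, 2, 2]
```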
So there are $n \cdot c$ processors for performing the $m$ computations $RQ_k$ with $1 \le k \le m$. If $m \le n \cdot c$ then $m$ processors perform one computation $RQ_k$ (for $1 \le k \le m$) each. The remaining $n \cdot c - m$ processors remain idle in that case. If $m > n \cdot c$ then the workload distribution is as follows: the processors are divided into two sets. One has $A$ processors, each of which performs $a$ computations $RQ_k$ with

$$a = \left\lceil \frac{m}{n \cdot c} \right\rceil \tag{38}$$

The other has $B$ processors, each of which performs $b$ computations $RQ_k$ with

$$b = \left\lfloor \frac{m}{n \cdot c} \right\rfloor \tag{39}$$
The values of $A$ and $B$ can easily be computed from the values
of $m$, $n$ and $c$ from the following two constraints:

$$A + B = n \cdot c \tag{40}$$

$$A \cdot a + B \cdot b = m \tag{41}$$

which yield

$$A = m - n \cdot c \cdot \left\lfloor \frac{m}{n \cdot c} \right\rfloor \qquad B = n \cdot c - A \tag{42}$$
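Assuming the reconstruction of equations (38) to (42) above, the workload split can be computed and checked as in the following sketch; the helper name workload_split is illustrative.

```python
import math

def workload_split(m, n, c):
    """Compute a, b, A and B as in the reconstruction of (38)-(42):
    A processors perform a = ceil(m/(n*c)) computations each, the
    remaining B = n*c - A processors perform b = floor(m/(n*c)) each."""
    total = n * c
    a = math.ceil(m / total)   # (38)
    b = m // total             # (39)
    A = m - total * b          # (42): A = m mod (n*c)
    B = total - A
    assert A + B == total      # constraint (40)
    assert A * a + B * b == m  # constraint (41)
    return a, b, A, B

# Example: m = 10 partial joins on n = 2 nodes with c = 2 processors each:
# 2 processors perform 3 computations, the other 2 perform 2 computations.
print(workload_split(10, 2, 2))   # (3, 2, 2, 2)
```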
Now, we look at the opposite side of the coin, i.e. we want to determine the number $j$ of the processor that performs computation $RQ_k$ with $1 \le k \le m$. To that end we have to assume that the $RQ_k$ are assigned to the processors as in figure 8.10. We note that this might not be the optimal assignment, for example if the most expensive computations happen to have consecutive numbers and are therefore performed consecutively by the same processor. This makes the respective processor's load the most expensive one and hence the one that determines the overall join processing performance. In order to keep our performance model simple and manageable for a query optimiser, we do not optimise such a situation at this stage by rearranging the computations' order. Such a rearrangement could imply overhead costs if an optimal placement cannot be determined beforehand. This means that data might have to be transferred through the network, and one would need to analyse the tradeoff between this overhead and the optimised costs in order to see whether such a rearrangement was worthwhile.
The function $proc(k)$ is used to give the number of the processor that performs computation $RQ_k$ with $1 \le k \le m$ according to an assignment as in figure 8.10. If $RQ_k$ is among the first $A \cdot a$ computations, i.e. $k \le A \cdot a$, then it is performed by processor $\lceil k / a \rceil$. Similarly, if $k > A \cdot a$ then it is performed by a processor $j > A$, i.e. by processor $A + \lceil (k - A \cdot a) / b \rceil$. This leads to $proc$ being defined as

$$proc(k) = \begin{cases} \left\lceil \dfrac{k}{a} \right\rceil & \text{if } k \le A \cdot a \\[2ex] A + \left\lceil \dfrac{k - A \cdot a}{b} \right\rceil & \text{otherwise} \end{cases} \tag{43}$$
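The following sketch implements the reconstruction of equation (43) under the block-wise assignment assumed for figure 8.10, and verifies that processors $1$ to $A$ receive $a$ computations each while processors $A+1$ to $A+B$ receive $b$ each.

```python
import math

def proc(k, a, b, A):
    """Processor that performs computation RQ_k, following the
    reconstruction of equation (43) above (block-wise assignment as
    assumed for figure 8.10)."""
    if k <= A * a:
        return math.ceil(k / a)
    return A + math.ceil((k - A * a) / b)

# Example with m = 10 computations on n*c = 4 processors:
m, total = 10, 4
a, b = math.ceil(m / total), m // total   # (38), (39)
A = m - total * b                         # (42)
B = total - A
loads = {}
for k in range(1, m + 1):
    j = proc(k, a, b, A)
    loads[j] = loads.get(j, 0) + 1
# processors 1..A carry a computations each, processors A+1..A+B carry b each
assert all(loads[j] == a for j in range(1, A + 1))
assert all(loads[j] == b for j in range(A + 1, A + B + 1))
```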
The interval $[t_{\min}, t_{\max}]$ is the lifespan covered by the tuples of R and Q. In this context $t_{\min}$ and $t_{\max}$ are the earliest starting point and the latest ending point of the tuples' timestamps in R and Q. As mentioned above, $P$ is an $m$-way partition of $[t_{\min}, t_{\max}]$, i.e. a set of $m - 1$ breakpoints

$$p_1 < p_2 < \dots < p_{m-1}$$

which divide the lifespan into $m$ ranges.
Thus the first and last computations at a node $i$ are the first computation of processor $(i-1) \cdot c + 1$ and the last computation of processor $i \cdot c$,
respectively. Later, it will be useful to refer to the first fragments $R'_k$, $R''_k$, $Q'_k$, $Q''_k$ on each node, i.e. the first fragments on the first processor of this node. The indices $k$ of these fragments are collected in the set

$$\{\, k : k \text{ is the index of the first computation at node } i \text{ for some } i,\ 1 \le i \le n \,\}$$
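The following sketch derives this index set from the reconstructed assignment above; the formula for a processor's first computation is an illustrative derivation and assumes $m > n \cdot c$, so that every node receives at least one computation.

```python
import math

def first_computation(j, a, b, A):
    """Index of the first computation RQ_k assigned to processor j under
    the block-wise assignment assumed above: processors 1..A take blocks
    of a computations, processors A+1.. take blocks of b computations."""
    if j <= A:
        return (j - 1) * a + 1
    return A * a + (j - A - 1) * b + 1

def first_fragment_indices(m, n, c):
    """Set of indices k of the first fragments R'_k, R''_k, Q'_k, Q''_k
    on each node, i.e. the first computation of each node's first
    processor (assumes m > n*c so that no node is idle)."""
    total = n * c
    a, b = math.ceil(m / total), m // total   # (38), (39)
    A = m - total * b                         # (42)
    return {first_computation((i - 1) * c + 1, a, b, A) for i in range(1, n + 1)}

# Example: m = 10, n = 2 nodes, c = 2 processors per node.
# Node 1 starts with processor 1 (first computation 1),
# node 2 starts with processor 3 (first computation 7).
print(first_fragment_indices(10, 2, 2))   # {1, 7}
```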