Basic Strategy

Next: Variations Up: Minimum-Overlaps Strategies Previous: Minimum-Overlaps Strategies

Basic Strategy

Similar to the underflow strategy, the goal of this one is to create fragments R₁, R₂, ...(Q₁, Q₂, ..., respectively) such that a given number X of tuples is not exceeded and that the sum of intervals overlapping the breakpoints, i.e.

is minimal at the same time. But this looks very much like IP (see chapter 5). In fact, we can use a variation of IP-opt to compute such a partition. We note that the original version of IP-opt limits the total number of intervals (tuples) that fall into a segment, i.e. in terms of a join $R \Join_{\scriptscriptstyle C}Q$ it would use

as the constraint to determine the breakpoints rather than

for all $k=1,\dots,m$ as the basic underflow strategy.

The variation of IP-opt can be based on the information stored in the IP-table . First, two arrays and are initialised according to equation (5.10). Then the timepoints of are processed from 1 to N. For each t_i, a partition is computed for the span [t₁,t_i] that has minimal sum of overlaps and has fragment loads less than X. More precisely, for each t_i a predecessor is determined which represents the preceding breakpoint that leads to the partition with a minimum number of overlapping tuples. If no such predecessor can be found then the original IP-opt stops with the message that there is no partition that satisfies the constraints. For practical purpose, we propose to weaken this and to use the timepoint t_i-1 (that precedes t_i) as despite the fact that the fragment resulting from that has a size larger than X. This is in line with the basic underflow strategy which copes with such an extreme situation in the same way.

Finally, a minimising partition can be obtained from the sequence , , ...(until a delimiting dummy point t₀ appears). For a detailed discussion of IP-opt refer to section 5.5. Figure 9.10 summarises the algorithm in a form that makes use of IP-tables.

Figure 9.11 shows how the example scenario is partitioned using the minimum-overlaps strategy for X=10. In comparison to the underflow strategy it reduces the total number of overlaps from 13 to 9. Therefore, one can expect this strategy to perform at least as well as the basic underflow strategy for single-processor systems. In the case of a multiprocessor setting, the advantage of having a reduced total number of overlaps might not necessarily pay off: the minimum-overlaps algorithm chooses breakpoints that reduce overlaps possibly at the expense of achieving well balanced fragments. In contrast, the underflow strategy aims to create equally filled fragments. Finally, another disadvantage is the algorithms run time complexity of O(N²) compared to O(N) of the underflow strategy. This implies that the minimum-overlaps strategy should only be applied to small IP-tables or to very big joins, i.e. in a situation in which the time spent on the optimisation through the minimum-overlaps strategy does contribute only marginally to the overall join costs.

**Figure:** Basic algorithm of the minimum-overlaps strategy for relations R and Q.

**Figure:** The partition for the example of figure 5.2 using the minimum-overlaps strategy with a maximum load of X=10.

Next: Variations Up: Minimum-Overlaps Strategies Previous: Minimum-Overlaps Strategies

Thomas Zurek