In section 7.3.3 we proposed to `condense' IP-tables in order to reduce their size. The idea was to collapse a of the original entries into one new entry. The new IP-table would have only of the entries and therefore only of the size of the original IP-table. A special form of condensation is the endpoint IP-table in which only those entries appear which correspond to interval endpoints. Theoretically, condensation can be expected to have two effects:
The effect on the minimum-overlaps strategies is difficult to predict. As mentioned before, their prime goal is to reduce the total number of overlaps. During this process, they possibly make concessions that worsen the load balance . It will be interesting to see whether the minimum-overlaps strategies perform better or worse with an increasing value for a.
First, we analyse the graph of figure 10.48 which shows the
performances for primary underflow partitioning on a parallel
architecture. As expected, the performances become worse for an
increasing a, but only for the joins and .For the join there is a rather unexpected effect: for most
values of a, the performance is slightly better in comparison to
a=1 or . This is rather surprising and cannot be
explained from an algorithmic point of view. Its origin lies either in
the data itself or in the fact that the chosen XR and XQ values
(see above) were not optimal for
and that the summarising
effect of condensation corrected this a little bit as outlined above.
Also, when processing ,
and , the
strategies use the timepoint sets
respectively. However, the set obviously contains much more timepoints to choose from than V'(R,a) or V'(Q,a). Because of condensation there are hardly coinciding timepoints which leads to
for . This advantage translates into a much better resistance against the negative effects of condensation for the join
Now we turn to figure 10.49 which shows the averaged
performances on the parallel architecture for the primary
minimum-overlaps strategy depending on a. We note that there is a
rather positive effect of condensation in the first part of the chart.
The best performances are achieved around a=16 (for ),
a=22 (for ) and a=19 (for ). Previously, we
observed that there is a relatively severe penalty caused by the
concessions that the strategy makes with respect to the load balance
in order to achieve a minimum number of overlaps. It seems that the
strategy can make less concessions (a) if there are less timepoints
from which it can choose and (b) if the values
and
are larger. Naturally, if the choice becomes too
restricted then there is a negative impact on the performance.
Therefore, there is a tradeoff between restricting the choice a little
bit but not too much. This is exactly what can be seen in the graphs
of figure 10.49.
The scenery changes when it comes to the single-processor architecture
(figures 10.46 and 10.47). For the joins (join 1) and (join 3), the impact of an
increasing condensation of the IP-tables on the performance is
generally negative. Again, processing of the join
(join 2) resists much better against the negative effects of
condensation because of the reasons mentioned above. We can even
observe a slight performance benefit which is probably caused by the
same reasons as mentioned earlier.
Finally, we turn our attention to the effect that condensed IP-tables have on the performance of the optimisation itself. Figure 10.50 shows the times that the optimisation process took when being run on a two-processor Sun SS20 computing server. As expected, we see that the primary minimum-overlaps partitioning benefits most of it due to its time complexity of O(N2). It is interesting that - in terms of elapsed time - it is roughly as fast as primary underflow partitioning for .
In summary, it is fair to say that experiments show that condensation does not only have positive effects on the performance the optimisation process itself but also that the anticipated negative effects are not that severe at all. The primary minimum-overlaps strategy in particular can draw performance benefits when using condensed IP-tables. Generally, it is possible to say that a condensation factor a between approx. 10 and 20 can improve the performances in many cases or at least does not severely penalise the performances in the other cases. This is an interesting result with respect to section 7.3 in which we addressed the problem of IP-table size. It means that the sizes of IP-tables which were already comparable or lower than those of data samples could be further decreased to between to of the original size without paying a severe performance penalty and sometimes even improving the join performances.
3c!width 2ptPrimary Underflow | 3c|Primary Min.-Overlaps | |||||||||||
a | ![]() |
![]() |
||||||||||
1c|(Join 1) | 1c|(Join 2) | 1c!width 2pt(Join 3) | 1c|(Join 1) | 1c|(Join 2) | 1c|(Join 3) | |||||||
end | 716 | 1141 | 896 | 1111 | 1336 | 1114 | ||||||
1 | 716 | 1141 | 896 | 1111 | 1336 | 1114 | ||||||
2 | 807 | 1131 | 976 | 1085 | 1327 | 1109 | ||||||
3 | 795 | 1178 | 962 | 1026 | 1346 | 1063 | ||||||
4 | 801 | 1140 | 957 | 1007 | 1302 | 1044 | ||||||
5 | 777 | 1105 | 947 | 979 | 1234 | 1014 | ||||||
6 | 758 | 1144 | 921 | 968 | 1237 | 948 | ||||||
7 | 819 | 1083 | 925 | 981 | 1229 | 1021 | ||||||
8 | 780 | 1107 | 889 | 932 | 1299 | 1063 | ||||||
9 | 787 | 1138 | 1026 | 920 | 1302 | 1028 | ||||||
10 | 823 | 1063 | 950 | 935 | 1139 | 926 | ||||||
11 | 825 | 1091 | 942 | 986 | 1282 | 975 | ||||||
12 | 791 | 1096 | 1004 | 931 | 1318 | 1045 | ||||||
13 | 814 | 1095 | 1020 | 908 | 1205 | 1117 | ||||||
14 | 749 | 1089 | 1067 | 848 | 1180 | 1058 | ||||||
15 | 802 | 1160 | 1041 | 853 | 1217 | 966 | ||||||
16 | 771 | 1038 | 1006 | 936 | 1176 | 970 | ||||||
17 | 793 | 1050 | 913 | 952 | 1207 | 1039 | ||||||
18 | 792 | 1179 | 933 | 964 | 1122 | 960 | ||||||
19 | 820 | 1117 | 966 | 985 | 1132 | 1019 | ||||||
20 | 788 | 1242 | 1002 | 959 | 1128 | 1055 | ||||||
21 | 755 | 1090 | 1041 | 1049 | 1159 | 1101 | ||||||
22 | 804 | 1055 | 1027 | 1024 | 1111 | 1144 | ||||||
23 | 774 | 1016 | 1061 | 1011 | 1301 | 1100 | ||||||
24 | 864 | 1036 | 1047 | 1079 | 1274 | 1141 | ||||||
25 | 859 | 999 | 1027 | 1065 | 1172 | 1166 | ||||||
26 | 762 | 1030 | 1089 | 1076 | 1188 | 1112 | ||||||
27 | 856 | 1061 | 1098 | 1097 | 1175 | 1154 | ||||||
28 | 918 | 1102 | 1179 | 1117 | 1147 | 1173 | ||||||
29 | 1000 | 1009 | 1092 | 1087 | 1206 | 1096 | ||||||
30 | 959 | 1073 | 1114 | 1054 | 1067 | 1152 | ||||||
32 | 871 | 1103 | 1236 | 969 | 1069 | 1227 | ||||||
36 | 1015 | 1057 | 1249 | 1134 | 1071 | 1238 | ||||||
40 | 971 | 1064 | 1217 | 1161 | 1196 | 1196 | ||||||
44 | 1070 | 1016 | 1327 | 1331 | 1283 | 1283 | ||||||
48 | 1058 | 1116 | 1260 | 1334 | 1154 | 1265 | ||||||
52 | 1139 | 1043 | 1352 | 1096 | 1277 | 1299 | ||||||
56 | 1140 | 1165 | 1420 | 1410 | 1201 | 1420 | ||||||
60 | 1237 | 1235 | 1288 | 1643 | 1212 | 1350 |
3c!width 2ptPrimary Underflow | 3c|Primary Min.-Overlaps | |||||||||||
a | ![]() |
![]() |
||||||||||
1c|(Join 1) | 1c|(Join 2) | 1c!width 2pt(Join 3) | 1c|(Join 1) | 1c|(Join 2) | 1c|(Join 3) | |||||||
a | Join 1 | Join 2 | Join 3 | Join 1 | Join 2 | Join 3 | ||||||
end | 7898 | 5560 | 10410 | 7877 | 5362 | 10410 | ||||||
1 | 7898 | 5560 | 10410 | 7877 | 5361 | 10410 | ||||||
2 | 7897 | 5555 | 10406 | 7880 | 5572 | 10406 | ||||||
3 | 7934 | 5550 | 10402 | 7882 | 5564 | 10400 | ||||||
4 | 7922 | 5554 | 10394 | 7888 | 5567 | 10394 | ||||||
5 | 7976 | 5550 | 10394 | 7888 | 5564 | 10389 | ||||||
6 | 7970 | 5541 | 10390 | 7900 | 5371 | 10383 | ||||||
7 | 7961 | 5531 | 10381 | 7898 | 5371 | 10379 | ||||||
8 | 7985 | 5532 | 10378 | 7894 | 5562 | 10366 | ||||||
9 | 7993 | 5377 | 10370 | 7896 | 5373 | 10379 | ||||||
10 | 7957 | 5386 | 10366 | 7898 | 5370 | 10361 | ||||||
11 | 7991 | 5378 | 10365 | 7914 | 5546 | 10345 | ||||||
12 | 8005 | 5377 | 10372 | 7913 | 5380 | 10355 | ||||||
13 | 8013 | 5363 | 10368 | 7916 | 5376 | 10365 | ||||||
14 | 8009 | 5381 | 10354 | 7922 | 5389 | 10359 | ||||||
15 | 8037 | 5380 | 10344 | 7915 | 5388 | 10341 | ||||||
16 | 8034 | 5386 | 10331 | 7556 | 5390 | 10320 | ||||||
17 | 8065 | 5388 | 10331 | 7922 | 5389 | 10311 | ||||||
18 | 8088 | 5396 | 10332 | 7928 | 5402 | 10311 | ||||||
19 | 8098 | 5382 | 10333 | 7952 | 5389 | 10314 | ||||||
20 | 8107 | 5360 | 10330 | 7977 | 5371 | 10311 | ||||||
21 | 8144 | 5377 | 10332 | 7999 | 5400 | 10313 | ||||||
22 | 8177 | 5367 | 10344 | 8023 | 5397 | 10322 | ||||||
23 | 8191 | 5368 | 10353 | 8047 | 5404 | 10331 | ||||||
24 | 8204 | 5377 | 10363 | 8075 | 5419 | 10342 | ||||||
25 | 8214 | 5368 | 10369 | 8087 | 5414 | 10353 | ||||||
26 | 8245 | 5367 | 10382 | 8115 | 5415 | 10364 | ||||||
27 | 8282 | 5374 | 10393 | 8140 | 5421 | 10377 | ||||||
28 | 8314 | 5354 | 10405 | 8163 | 5404 | 10392 | ||||||
29 | 8321 | 5350 | 10416 | 8188 | 5376 | 10403 | ||||||
30 | 8335 | 5354 | 10429 | 8215 | 5404 | 10418 | ||||||
32 | 8395 | 5358 | 10457 | 8259 | 5390 | 10441 | ||||||
36 | 8490 | 5362 | 10511 | 8370 | 5380 | 10499 | ||||||
40 | 8583 | 5335 | 10567 | 8468 | 5344 | 10554 | ||||||
44 | 8687 | 5306 | 10627 | 8574 | 5313 | 10617 | ||||||
48 | 8785 | 5321 | 10685 | 8669 | 5316 | 10677 | ||||||
52 | 8859 | 5306 | 10744 | 8770 | 5302 | 10735 | ||||||
56 | 8961 | 5325 | 10807 | 8863 | 5314 | 10796 | ||||||
60 | 9044 | 5293 | 10871 | 8952 | 5281 | 10863 |