In the experiment of previous section, the number m of fragments / partial joins was fixed to a value of 16 which matched the number of processors in the parallel architecture. Here, we want to find out whether this value was a good choice or whether higher values of m lead to better performances. In other words: we want to find out whether it is better to have a small number of relatively big partial joins or a large number of relatively small ones. There are no indications neither from the performance model (chapter 8) nor from the strategies themselves (chapter 9) whether one way or the other is better. Therefore, we have to rely on the observations made in the experiments. In order to reduce the complexity of the experiments and to concentrate on the main issues, we chose four of the eleven strategies on offer, namely the uniform lifespan, basic underflow, primary underflow and the primary minimum-overlaps strategies. The choice was led by the goal to select at least one member of each family and also those that are the most promising with respect to good performances.
Six experiments were conducted to test the dependency on m of these strategies, i.e. one per combination of join and hardware. The performance results are given in tables 10.4, 10.5 and 10.6 and visualised in figures 10.10 to 10.15.
First, we want to analyse the behaviour of the strategies on the parallel architecture. In all cases, the primary underflow strategy is a narrow winner before the primary minimum-overlaps strategy. Both strategies show relatively high costs on low values of m and perform better for higher values, with almost constant costs in the second half of the chart. The costs of the uniform lifespan strategy are between 2 and 3 times (join ), between 1.3 and 2 times (join ) and between 2.2 and 3.5 times (join )higher than those of the primary underflow strategy. In general, it also performs better for higher values of m. However, there are some exceptions to this rule such as a sharp rise from m=896 to m=1024 for the join or some significant shaking in the area between m=384 and m=1024 for the join . A much more irregular behaviour is shown by the basic underflow strategy. The exact shape of its cost function can only be explained by looking into details of the data. However, it seems that it performs best for lower values of m, which stands in contrast to the other three strategies. In general, it performs worse than the primary underflow and minimum-overlaps strategies.
Now, we turn our attention to the results for the single-processor hardware architecture. Here, the scene looks different: for lower values of m the uniform lifespan strategy has between 10% and 30% higher costs in comparison to the primary underflow strategy, whereas for high values of m it performs between 1% and 4% better. This is a surprising result. It means in other words that the uniform lifespan strategy outperforms all other strategies if simply the value of m is chosen to be high enough. However, the advantage in comparison to other strategies is only very minor (see joins and ) or does not exist (see join ). A further interesting result is that the primary minimum-overlaps strategy is better (joins and ) or at least as good (join ) as the primary underflow strategy. This is due to the facts that (a) a good load balance between the partial joins is far less important on a single-processor machine in comparison to a parallel one and that (b) the advantage of having a reduced total number of overlapping intervals materialises in the single-processor case.
From the results in tables 10.4, 10.5 and 10.6 we can also derive some information about the speed-up . It is defined as the ratio between the processing times that are required for the same problem on a machine with n processors and on one with 1 processor. Ideally, the speed-up is n in this case. In realistic situations, however, such a speed-up is never achieved because of communication and synchronisation overheads. If we compare the respective lowest costs on the parallel and single-processor machines for the three joins, then we get speed-ups of 9.5 for , 4.6 for and 10.6 for .
4c!width 2pt, | 4c|, | |||||||||||||
m | 1c|Uniform | 1c|Primary | 1c|Primary | 1c!width 2ptBasic | 1c|Uniform | 1c|Primary | 1c|Primary | 1c|Basic | ||||||
1c|Lifespan | 1c|Underflow | 1c|Min.-O. | 1c!width 2ptUnderflow | 1c|Lifespan | 1c|Underflow | 1c|Min.-O. | 1c|Underflow | |||||||
16 | 3189 | 1077 | 1988 | 1490 | 12783 | 11370 | 11286 | 11807 | ||||||
32 | 2111 | 891 | 1359 | 1326 | 10580 | 9589 | 9464 | 9852 | ||||||
48 | 1930 | 843 | 1141 | 1439 | 9675 | 8839 | 8750 | 9650 | ||||||
64 | 1837 | 797 | 1178 | 1384 | 9117 | 8458 | 8394 | 9437 | ||||||
80 | 1743 | 788 | 1011 | 1529 | 8783 | 8228 | 8177 | 9299 | ||||||
96 | 1699 | 777 | 967 | 1749 | 8559 | 8009 | 8029 | 9389 | ||||||
112 | 1668 | 776 | 986 | 1713 | 8395 | 7952 | 7929 | 9328 | ||||||
128 | 1700 | 763 | 963 | 1791 | 8241 | 7847 | 7855 | 9133 | ||||||
256 | 1613 | 757 | 926 | 2884 | 7194 | 7633 | 7618 | 9209 | ||||||
384 | 1587 | 752 | 910 | 2808 | 7258 | 7580 | 7569 | 8980 | ||||||
512 | 1630 | 770 | 916 | 2465 | 7310 | 7578 | 7568 | 8773 | ||||||
640 | 1629 | 788 | 903 | 2143 | 7351 | 7586 | 7399 | 8708 | ||||||
768 | 1577 | 788 | 898 | 1612 | 7379 | 7586 | 7493 | 8633 | ||||||
896 | 1554 | 788 | 923 | 1657 | 7416 | 7586 | 7543 | 8649 | ||||||
1024 | 2501 | 788 | 935 | 1582 | 7093 | 7586 | 7594 | 8666 |
4c!width 2pt, | 4c|, | |||||||||||||
m | 1c|Uniform | 1c|Primary | 1c|Primary | 1c!width 2ptBasic | 1c|Uniform | 1c|Primary | 1c|Primary | 1c|Basic | ||||||
1c|Lifespan | 1c|Underflow | 1c|Min.-O. | 1c!width 2ptUnderflow | 1c|Lifespan | 1c|Underflow | 1c|Min.-O. | 1c|Underflow | |||||||
16 | 3779 | 1901 | 2458 | 1884 | 10322 | 9248 | 9029 | 8629 | ||||||
32 | 2771 | 1454 | 1747 | 1647 | 7793 | 6966 | 6995 | 7229 | ||||||
48 | 2431 | 1339 | 1556 | 1531 | 6942 | 6395 | 6451 | 6693 | ||||||
64 | 2238 | 1295 | 1380 | 1468 | 6522 | 6111 | 6109 | 6320 | ||||||
80 | 2160 | 1253 | 1424 | 1476 | 6258 | 5930 | 5931 | 6235 | ||||||
96 | 2089 | 1224 | 1351 | 1575 | 6096 | 5814 | 5814 | 5612 | ||||||
112 | 2040 | 1209 | 1389 | 1697 | 5974 | 5740 | 5736 | 5616 | ||||||
128 | 1914 | 1202 | 1396 | 1613 | 5874 | 5679 | 5685 | 5625 | ||||||
256 | 1795 | 1169 | 1324 | 2398 | 5333 | 5501 | 5336 | 5688 | ||||||
384 | 1763 | 1147 | 1301 | 2274 | 5335 | 5376 | 5358 | 5704 | ||||||
512 | 1523 | 1170 | 1239 | 2544 | 5353 | 5402 | 5390 | 5760 | ||||||
640 | 1551 | 1174 | 1247 | 3251 | 5375 | 5436 | 5425 | 5827 | ||||||
768 | 1758 | 1171 | 1260 | 3432 | 5393 | 5472 | 5462 | 5863 | ||||||
896 | 1598 | 1171 | 1263 | 3051 | 5420 | 5503 | 5474 | 5886 | ||||||
1024 | 1699 | 1171 | 1263 | 2417 | 5462 | 5503 | 5474 | 5939 |
4c!width 2pt, | 4c|, | |||||||||||||
m | 1c|Uniform | 1c|Primary | 1c|Primary | 1c!width 2ptBasic | 1c|Uniform | 1c|Primary | 1c|Primary | 1c|Basic | ||||||
1c|Lifespan | 1c|Underflow | 1c|Min.-O. | 1c!width 2ptUnderflow | 1c|Lifespan | 1c|Underflow | 1c|Min.-O. | 1c|Underflow | |||||||
16 | 4327 | 1243 | 1753 | 2122 | 19029 | 14551 | 14621 | 14739 | ||||||
32 | 3274 | 1058 | 1277 | 2165 | 14506 | 12153 | 12150 | 12878 | ||||||
48 | 2940 | 1001 | 1217 | 2281 | 12954 | 11389 | 11394 | 12330 | ||||||
64 | 2800 | 975 | 1111 | 2354 | 12189 | 10989 | 10986 | 12126 | ||||||
80 | 2683 | 964 | 1102 | 2590 | 11724 | 10773 | 10769 | 12076 | ||||||
96 | 2622 | 949 | 1060 | 2875 | 11413 | 10619 | 10616 | 12074 | ||||||
112 | 2575 | 936 | 1053 | 3111 | 11191 | 10514 | 10519 | 12030 | ||||||
128 | 2559 | 941 | 1047 | 4434 | 11010 | 10442 | 10441 | 11995 | ||||||
256 | 2451 | 938 | 1021 | 5912 | 10449 | 10214 | 10213 | 11973 | ||||||
384 | 2426 | 948 | 1013 | 6503 | 9964 | 10180 | 10179 | 12145 | ||||||
512 | 2447 | 963 | 1015 | 6522 | 9973 | 10195 | 10194 | 12062 | ||||||
640 | 2413 | 968 | 1025 | 6371 | 9992 | 10229 | 10228 | 12080 | ||||||
768 | 2430 | 997 | 1031 | 6117 | 10009 | 10272 | 10269 | 12095 | ||||||
896 | 2476 | 1009 | 1032 | 5626 | 10033 | 10323 | 10322 | 12001 | ||||||
1024 | 2252 | 1009 | 1048 | 2122 | 10072 | 10374 | 10376 | 12013 |