Dependency on m

In the experiment of previous section, the number m of fragments / partial joins was fixed to a value of 16 which matched the number of processors in the parallel architecture. Here, we want to find out whether this value was a good choice or whether higher values of m lead to better performances. In other words: we want to find out whether it is better to have a small number of relatively big partial joins or a large number of relatively small ones. There are no indications neither from the performance model (chapter 8) nor from the strategies themselves (chapter 9) whether one way or the other is better. Therefore, we have to rely on the observations made in the experiments. In order to reduce the complexity of the experiments and to concentrate on the main issues, we chose four of the eleven strategies on offer, namely the uniform lifespan, basic underflow, primary underflow and the primary minimum-overlaps strategies. The choice was led by the goal to select at least one member of each family and also those that are the most promising with respect to good performances.

Six experiments were conducted to test the dependency on m of these strategies, i.e. one per combination of join and hardware. The performance results are given in tables 10.4, 10.5 and 10.6 and visualised in figures 10.10 to 10.15.

First, we want to analyse the behaviour of the strategies on the parallel architecture. In all cases, the primary underflow strategy is a narrow winner before the primary minimum-overlaps strategy. Both strategies show relatively high costs on low values of m and perform better for higher values, with almost constant costs in the second half of the chart. The costs of the uniform lifespan strategy are between 2 and 3 times (join ), between 1.3 and 2 times (join $R \Join_{\scriptscriptstyle C}Q$ ) and between 2.2 and 3.5 times (join )higher than those of the primary underflow strategy. In general, it also performs better for higher values of m. However, there are some exceptions to this rule such as a sharp rise from m=896 to m=1024 for the join or some significant shaking in the area between m=384 and m=1024 for the join $R \Join_{\scriptscriptstyle C}Q$ . A much more irregular behaviour is shown by the basic underflow strategy. The exact shape of its cost function can only be explained by looking into details of the data. However, it seems that it performs best for lower values of m, which stands in contrast to the other three strategies. In general, it performs worse than the primary underflow and minimum-overlaps strategies.

Now, we turn our attention to the results for the single-processor hardware architecture. Here, the scene looks different: for lower values of m the uniform lifespan strategy has between 10% and 30% higher costs in comparison to the primary underflow strategy, whereas for high values of m it performs between 1% and 4% better. This is a surprising result. It means in other words that the uniform lifespan strategy outperforms all other strategies if simply the value of m is chosen to be high enough. However, the advantage in comparison to other strategies is only very minor (see joins and ) or does not exist (see join $R \Join_{\scriptscriptstyle C}Q$ ). A further interesting result is that the primary minimum-overlaps strategy is better (joins and $R \Join_{\scriptscriptstyle C}Q$ ) or at least as good (join ) as the primary underflow strategy. This is due to the facts that (a) a good load balance between the partial joins is far less important on a single-processor machine in comparison to a parallel one and that (b) the advantage of having a reduced total number of overlapping intervals materialises in the single-processor case.

From the results in tables 10.4, 10.5 and 10.6 we can also derive some information about the speed-up . It is defined as the ratio between the processing times that are required for the same problem on a machine with n processors and on one with 1 processor. Ideally, the speed-up is n in this case. In realistic situations, however, such a speed-up is never achieved because of communication and synchronisation overheads. If we compare the respective lowest costs on the parallel and single-processor machines for the three joins, then we get speed-ups of 9.5 for , 4.6 for $R \Join_{\scriptscriptstyle C}Q$ and 10.6 for .

**Table:** Performance results (in sec.) for the join depending on m.
	4c!width 2pt,	4c\|,
m	1c\|Uniform	1c\|Primary	1c\|Primary	1c!width 2ptBasic	1c\|Uniform	1c\|Primary	1c\|Primary	1c\|Basic
	1c\|Lifespan	1c\|Underflow	1c\|Min.-O.	1c!width 2ptUnderflow	1c\|Lifespan	1c\|Underflow	1c\|Min.-O.	1c\|Underflow
16	3189	1077	1988	1490	12783	11370	11286	11807
32	2111	891	1359	1326	10580	9589	9464	9852
48	1930	843	1141	1439	9675	8839	8750	9650
64	1837	797	1178	1384	9117	8458	8394	9437
80	1743	788	1011	1529	8783	8228	8177	9299
96	1699	777	967	1749	8559	8009	8029	9389
112	1668	776	986	1713	8395	7952	7929	9328
128	1700	763	963	1791	8241	7847	7855	9133
256	1613	757	926	2884	7194	7633	7618	9209
384	1587	752	910	2808	7258	7580	7569	8980
512	1630	770	916	2465	7310	7578	7568	8773
640	1629	788	903	2143	7351	7586	7399	8708
768	1577	788	898	1612	7379	7586	7493	8633
896	1554	788	923	1657	7416	7586	7543	8649
1024	2501	788	935	1582	7093	7586	7594	8666

**Table:** Performance results (in sec.) for the join $R \Join_{\scriptscriptstyle C}Q$ depending on m.
	4c!width 2pt,	4c\|,
m	1c\|Uniform	1c\|Primary	1c\|Primary	1c!width 2ptBasic	1c\|Uniform	1c\|Primary	1c\|Primary	1c\|Basic
	1c\|Lifespan	1c\|Underflow	1c\|Min.-O.	1c!width 2ptUnderflow	1c\|Lifespan	1c\|Underflow	1c\|Min.-O.	1c\|Underflow
16	3779	1901	2458	1884	10322	9248	9029	8629
32	2771	1454	1747	1647	7793	6966	6995	7229
48	2431	1339	1556	1531	6942	6395	6451	6693
64	2238	1295	1380	1468	6522	6111	6109	6320
80	2160	1253	1424	1476	6258	5930	5931	6235
96	2089	1224	1351	1575	6096	5814	5814	5612
112	2040	1209	1389	1697	5974	5740	5736	5616
128	1914	1202	1396	1613	5874	5679	5685	5625
256	1795	1169	1324	2398	5333	5501	5336	5688
384	1763	1147	1301	2274	5335	5376	5358	5704
512	1523	1170	1239	2544	5353	5402	5390	5760
640	1551	1174	1247	3251	5375	5436	5425	5827
768	1758	1171	1260	3432	5393	5472	5462	5863
896	1598	1171	1263	3051	5420	5503	5474	5886
1024	1699	1171	1263	2417	5462	5503	5474	5939

**Table:** Performance results (in sec.) for the join depending on m.
	4c!width 2pt,	4c\|,
m	1c\|Uniform	1c\|Primary	1c\|Primary	1c!width 2ptBasic	1c\|Uniform	1c\|Primary	1c\|Primary	1c\|Basic
	1c\|Lifespan	1c\|Underflow	1c\|Min.-O.	1c!width 2ptUnderflow	1c\|Lifespan	1c\|Underflow	1c\|Min.-O.	1c\|Underflow
16	4327	1243	1753	2122	19029	14551	14621	14739
32	3274	1058	1277	2165	14506	12153	12150	12878
48	2940	1001	1217	2281	12954	11389	11394	12330
64	2800	975	1111	2354	12189	10989	10986	12126
80	2683	964	1102	2590	11724	10773	10769	12076
96	2622	949	1060	2875	11413	10619	10616	12074
112	2575	936	1053	3111	11191	10514	10519	12030
128	2559	941	1047	4434	11010	10442	10441	11995
256	2451	938	1021	5912	10449	10214	10213	11973
384	2426	948	1013	6503	9964	10180	10179	12145
512	2447	963	1015	6522	9973	10195	10194	12062
640	2413	968	1025	6371	9992	10229	10228	12080
768	2430	997	1031	6117	10009	10272	10269	12095
896	2476	1009	1032	5626	10033	10323	10322	12001
1024	2252	1009	1048	2122	10072	10374	10376	12013