Next: Summary, Conclusions and Future Up: Using IP-Tables for Selectivity Previous: Parallel and Other Partitioned

Summary

In this chapter, we have shown an analytical way of calculating temporal join result sizes or, correspondingly, temporal join selectivities. To our knowledge, only one paper has discussed selectivity estimation for temporal joins [Segev et al., 1993]. Its approach requires that the statistical process that creates the timestamps is either well understood or follows certain standard probability distributions, such as the Poisson distribution for interval startpoints or the Erlang-n distribution for interval lengths. The first case is quite rare: consider the distribution and lengths of telephone calls, which depend on many statistical processes that are influenced by holidays, pricing, marketing or TV campaigns, and even the weather. It is difficult to incorporate all these effects into a thorough statistical model for a query optimiser. In the second case, the assumed distributions can be erroneous for the same reasons.

In contrast, our technique is based on the information stored in IP-tables. For a set of elementary temporal joins, exact result sizes can be computed (section 11.3.1). For temporal joins that arise from a composition of the elementary join conditions, we gave formulas (11.3) and (11.4). These allow the result sizes of composite temporal joins to be derived from those of the elementary joins involved (section 11.3.2). Finally, we also provided a way to calculate the result sizes of partial temporal joins that occur in parallel join processing (section 11.3.3).

The advantages of our analytical approach as opposed to statistical ones are



Thomas Zurek