Optimisation of Partitioned Temporal Joins

Author:	Thomas Zurek
Date:	July 1997
Published in:	Proc. of the BNCOD'97 Conference, London, United Kingdom
Publisher:	Springer, LNCS 1271
Pages:	101 - 115

Download

Full Paper (120 kB)
HTML-Version of the conference talk (90 kB)
slides of my conference talk: landscape, portrait slides

Abstract

Partitioning data for temporal join processing is not trivial because tuples have to be replicated between data fragments. This causes three types of overheads: (a) an overhead caused by the replication process itself, and (b) a processing overhead caused by the `additional' work that has to be done and (c) an overhead for removing duplicates in the result. Previous work has mainly concentrated on avoiding (a) but still suffers from the consequences of (b) and (c).

In this paper, we show how partitioned temporal joins can be optimised for sequential and parallel processing by reducing tuple replication, thereby reducing the total overhead. For that purpose, a new data structure, namely the IP-table, is introduced. The idea is to have IP-tables of individual temporal relations stored in the database system's catalog from which they can be retrieved for the optimisation process. IP-tables of two or more temporal relations might be required for optimisations, too. These can be created by merging IP-tables of individual relations at optimisation time -- a fast and straightforward process.

IP-tables can be used for creating and analysing partitions over interval timestamps. Three strategies for partitioning interval data are presented, each of which can be easily and efficiently implemented using IP-tables. The performance determining parameters of a partition can also be derived from IP-tables. Finally, we give an example of the optimisation working for a parallel temporal join.

Keywords

temporal databases, temporal join, explicit partitioning, parallel processing

Thomas Zurek, <tz@dcs.ed.ac.uk>