Data Partitioning in Temporal Databases


Author: Thomas Zurek
Date: September 1995
Type: Proposal for a PhD Thesis on Temporal Databases
Institution: Department of Computer Science, Edinburgh University

Abstract

Temporal relational query processing can impose considerable demands on the performance of a database management system. The main reasons are the huge size of temporal relations and the complexity of temporal operators. Parallelism can offer solutions to achieve the necessary performance.

In order to successfully implement parallel algorithms for temporal query operators one needs to partition the data over the temporal attribute(s). Latter are usually represented by intervals or rectangles. This makes the data partitioning problem much more complex because data has to be replicated in most cases in order to guarantee the correctness of the respective operation. This can cause significant overhead costs. Such an effect does not exist in conventional parallel database systems.

This paper motivates and proposes a PhD research project for finding and designing techniques to partition temporal relations over their temporal attribute(s). These techniques should minimise the overhead costs caused by the replication of tuples. Only expensive algorithms can find optimal solutions to this problem if they can be found at all (this depends on the actual cost function). Therefore it is necessary to find fast strategies that offer near-optimal solutions. I believe that these can be based on heuristics using statistical information obtained from the database catalog and data sampling.

The paper concludes with a description of the stages and methods of the project. It also gives a rudimentary timetable for performing the stages.


Thomas Zurek, <tz@dcs.ed.ac.uk>