...NAME="547"> 
While CPU clock speeds double every 18 month on average (i.e. a 60% increase per year), the bandwidth of single-disk I/O devices increases by around 10% every year [Gray, 1995].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...values
For a detailed discussion, we refer the reader to chapter 3 where several partitioning methods, such as sorting and hashing, are described and analysed.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...time
See section 2.3 for a description of this concept.  
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...conventional
By the term conventional DBMS we refer to DBMS which do not provide specific support for temporal data processing. 
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...aspects
There are critics who argue that time is just another attribute in all respects. The reader might look at [Davies et al., 1995].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...offices
For simplicity, we assume that these are the current offices for persons who are still working in the department and the last occupied offices for persons who have already left.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...period
For our purposes, the definitions of a period and an interval are identical. TSQL2, for example, uses the term period to refer to our notion of interval because the term interval has already been used in SQL92 for a different concept.    
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...databases
An operational database manages the data that is required for day-to-day operations of a company, such as reservation systems in the case of a travel company. This stands in contrast to data warehouses whose purpose is to provide information to support decision-taking in the management of a company, such as customer behaviour and market trends.  
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...snapshots
We note that this is a gross over-simplification. The data warehouse, for example, must also take into account changes in the schemata and in the semantics of the data over time.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...$R \times
Q$
In strict terms this is not true because the cartesian product results in tuple pairs (r,q) whereas the join creates tuples that origin in concatenations $r \circ q$ of tuples. This formality, however, is usually ignored by many authors. For our purposes, it can be ignored too.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...model
See [CODASYL, 1971], [Codd, 1970] and [Cattell, 1996] for details of these models.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...condition
Please remember that, in the scenario, names are assumed to be unique for simplicity.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...similar
The notion of `similar' depends on the data type; for numeric values this might be a low difference in values whereas for strings it can be same prefixes etc.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...tuples
or references to tuples to reduce the amount of memory required.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...attributes
or, more generally, two foreign key attribute sets, as a key can consist of more than one attribute.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...algebra
See [Korth and Silberschatz, 1991] or [Lockemann et al., 1993] for example.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...attribute
Exceptions are those temporal join conditions that involve an equality relationship between interval start- and endpoints, such as the start or meet joins in table 4.1. These can be processed by one of the conventional equi-join techniques.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...joins
Alternative names for the temporal intersection join are T-join  [Segev, 1993] and time-join  [Rana and Fotouhi, 1993]. Segev's TE-join  includes, apart from the intersection condition, an equi-join condition, as does the valid-time natural join  [Clifford and Croker, 1987], [Soo et al., 1994], the natural time-join  [Clifford and Croker, 1987] and the time-intersection equi-join   [Segev, 1993].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...time
Actually this is only a special situation for the case that the cause is immediately followed by the effect. There are many examples for which the effect is delayed for a certain constant period. Such constant delays are not a problem and can be incorporated in the join condition without changing any of the issues that have been elaborated for the `same time' notion.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...applications
See discussion in section 10.1 on this issue.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...fragments
In this section, we use the term fragment for both, the actual subset of tuples and the fragmentation area of the two dimensional grid that determines which tuples become members of that subset. This is intended to simplify the description and to help the reader rather than to disturb him/her.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...three
This allows the comparison of the resulting situation with those that are illustrated in figures 4.8 and 4.11.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...comparisons
or tuple accesses in this particular case.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...homogeneous
A heterogeneous cluster of processors, e.g. a network of workstations, will require a specific load balancing  approach that creates higher loads for more powerful nodes and smaller loads for less powerful ones.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...replication
Please remember that by this we mean a logical replication  which might, but does not necessarily translate into a physical replication . Soo et al.'s join algorithm, for example, does not physically replicate tuples although several tuples are logically replicated because they are present in more than one fragment of the relation (see section 4.4.4).  
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...ordered
As a slight abuse of set theory, we require some sets to be ordered. This is for naming purposes only and simplifies the notation in general. Therefore we refrain from introducing separate brackets and operator symbols to distinguish between conventional and ordered sets as this would probably confuse the reader more than our slightly abusive notation. 
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...fragment
Therefore, an interval falls into both, the left and the right fragment if it starts at the breakpoint.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...HREF="node46.htm#eq:ip:1">5.6)
Later, after having proved that also satisfies the second constraint of IP, we can conclude that the two sums are equal; otherwise P would not be a minimising partition as required by the lemma.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...endpoints
The reason for including S(R) is convenience. In principle, the reduction that is presented later in this section can be modified to concentrate on the endpoints set E(R) only.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...partition
In the sense as it was defined in chapter 5.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...Q
This does not imply a restriction to 2-way temporal joins. The techniques that we propose in this thesis can also be applied to n-way joins with $n \ge 3$.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...endpoints
As you might remember, theorem 1 says that an optimal partition  can be found within the set E(R) of endpoints of intervals in R. To simplify the definition of an IP-table we concentrate on at the moment and show a reduction of an IP-table to values in section 7.3.4.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...that
Please remember the comment made in the footnote on page [*] with respect to the notation for conventional and ordered sets.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...6 bytes
e.g. one byte per day, month, year, hour, minute, second.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...usually
Considering the majority of compilers.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...`resolution'
By analogy with the sense in which this term is used for images.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
In the remainder of this subsection we will only use the notation for condensed IP-tables. This is for improving the readability of the text. Nevertheless, everything that applies to condensed IP-tables equally applies to endpoint IP-tables in this context.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...I'(Q,b)
As mentioned earlier, we use the notation for condensed IP-tables in this case although the techniques apply to endpoint IP-tables too.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...discussion
See summary in section 8.2.2.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...memory
See discussion about shared-disk.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...machines
The interested reader might refer to papers such as [Teradata Corporation, 1983], [Teradata Corporation, 1985], [Carino and Kostamaa, 1992], [Sloan, 1992], [Witkowski, 1993] and many others to get details about Teradata machines and their successors.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...components
This argument his entirely historical as many SM systems are now entirely built of commodity parts whereas SN systems frequently have proprietary (and therefore expensive) interconnect.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...tuples
It is for simplicity reasons that we assume that these buffers keep tuples. Alternatively, they might hold references to tuples or one could create 2m index structures for describing the fragments. This would require very detailed performance modeling without providing any benefit for our purposes. Therefore we assume that the fragments are to be materialised.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...sorting
One could argue that the relation might originally be sorted. However, it is difficult to maintain such a sort-order during repartitioning, especially when communication over the interconnect is involved. Most protocols do not guarantee that messages sent in a certain order also arrive in this order. One would then need to restore this order on arrival of the messages. Thus the usage of a sort-merge join algorithm would impose an overhead under any circumstances.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...i.e.
We note that repartitioning applies to all relations that participate in the join, i.e. R and Q in the prototypical case. Therefore the cost components C1(a), C1(b), C1(c), C1(d) are computed for R using R's parameters - this is indicated by C1(a)(R), C1(b)(R) etc. - and for Q using Q's parameters - this is indicated by C1(a)(Q), C1(b)(Q) etc.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...blockwise
or - in other terms - pagewise.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...approach
See section 3.4.1.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...strategies
In the remainder, we will frequently use the term strategy  as a shortcut for partitioning strategy.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...approach
See algorithm ChooseIntervals in [Soo et al., 1994].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...function
Here, we refer to $o_{\scriptscriptstyle R}(t)$ assuming that R is to be partitioned. However, the technique that we describe can be applied to any temporal relation, in particular also to $R \cup Q$ in which case would have to be used rather than $o_{\scriptscriptstyle R}(t)$.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
The number of minutes per day is .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
Sometimes we will refer to these joins also as join 1 , join 2  and join 3  respectively.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...fragment
See if-statement in figures 9.7 (page [*]) and 9.9 (page [*]).
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
...or and in the case of condensed IP-tables.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...negative
with one single exception, i.e. for join 1, a=16 and the primary minimum-overlaps strategy (figure 10.47).
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...endpoints
In the context of the phone calls scenario this would mean that the time when a phone call starts should not imply a certain length of the phone call.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

Thomas Zurek