- ...NAME="547">
- While CPU clock speeds double every 18 month on
average (i.e. a 60% increase per year), the bandwidth of single-disk
I/O devices increases by around 10% every year [Gray, 1995].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...values
- For a detailed discussion, we refer the reader to
chapter 3 where several partitioning methods, such as
sorting and hashing, are described and analysed.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...time
- See section 2.3 for a
description of this concept.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...conventional
- By the term conventional DBMS we refer to
DBMS which do not provide specific support for temporal data
processing.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...aspects
- There are critics who argue that time
is just another attribute in all respects. The reader might look at
[Davies et al., 1995].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...offices
- For simplicity, we assume that these
are the current offices for persons who are still working in the
department and the last occupied offices for persons who have already
left.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...period
- For our purposes, the definitions of a
period and an interval are identical. TSQL2, for example,
uses the term period to refer to our notion of interval because
the term interval has already been used in SQL92 for a different
concept.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...databases
- An operational database
manages the data that is required for day-to-day operations of a
company, such as reservation systems in the case of a travel company.
This stands in contrast to data warehouses whose purpose is to provide
information to support decision-taking in the management of a company,
such as customer behaviour and market trends.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...snapshots
- We note that this is a
gross over-simplification. The data warehouse, for example, must also
take into account changes in the schemata and in the semantics of the
data over time.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...
- In strict terms this is not true because the cartesian
product results in tuple pairs (r,q) whereas the join creates tuples
that origin in concatenations
of tuples. This formality,
however, is usually ignored by many authors. For our purposes, it can
be ignored too.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...model
- See [CODASYL, 1971], [Codd, 1970] and
[Cattell, 1996] for details of these models.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...condition
- Please
remember that, in the scenario, names are assumed to be unique for
simplicity.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...similar
- The notion of `similar' depends on
the data type; for numeric values this might be a low difference in
values whereas for strings it can be same prefixes etc.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...tuples
- or references to
tuples to reduce the amount of memory required.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...attributes
- or, more generally, two foreign key attribute sets,
as a key can consist of more than one attribute.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...algebra
- See [Korth and Silberschatz, 1991] or
[Lockemann et al., 1993] for example.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...attribute
- Exceptions are those temporal
join conditions that involve an equality relationship between interval
start- and endpoints, such as the start or meet joins in
table 4.1. These can be processed by one of the
conventional equi-join techniques.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...joins
- Alternative names for the temporal intersection
join are T-join
[Segev, 1993] and time-join [Rana and Fotouhi, 1993].
Segev's TE-join includes, apart from the
intersection condition, an equi-join condition, as does the
valid-time natural join
[Clifford and Croker, 1987], [Soo et al., 1994], the natural
time-join [Clifford and Croker, 1987] and the
time-intersection equi-join
[Segev, 1993].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...time
- Actually this is only a special situation for the case
that the cause is immediately followed by the effect. There are many
examples for which the effect is delayed for a certain constant
period. Such constant delays are not a problem and can be incorporated
in the join condition without changing any of the issues that have
been elaborated for the `same time' notion.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...applications
- See discussion in
section 10.1 on this issue.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...fragments
- In this
section, we use the term fragment for both, the actual subset of
tuples and the fragmentation area of the two dimensional grid
that determines which tuples become members of that subset. This is
intended to simplify the description and to help the reader rather
than to disturb him/her.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...three
- This
allows the comparison of the resulting situation with those that are
illustrated in figures 4.8 and 4.11.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...comparisons
- or tuple accesses
in this particular case.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...homogeneous
- A heterogeneous
cluster of processors, e.g. a network of workstations, will require a
specific load balancing approach that creates
higher loads for more powerful nodes and smaller loads for less
powerful ones.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...replication
- Please remember that by this we mean a
logical replication which might, but does
not necessarily translate into a physical replication . Soo et al.'s join algorithm, for example, does not
physically replicate tuples although several tuples are logically
replicated because they are present in more than one fragment of the
relation (see section 4.4.4).
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...ordered
- As a slight abuse of set theory, we require some
sets to be ordered. This is for naming purposes only and simplifies
the notation in general. Therefore we refrain from introducing
separate brackets and operator symbols to distinguish between
conventional and ordered sets as this would probably confuse the
reader more than our slightly abusive
notation.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...fragment
- Therefore, an interval falls into
both, the left and the right fragment if it starts at the
breakpoint.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...HREF="node46.htm#eq:ip:1">5.6)
- Later, after having proved that also satisfies the second constraint of IP, we can conclude that the
two sums are equal; otherwise P would not be a minimising partition
as required by the lemma.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...endpoints
- The reason for including S(R) is convenience.
In principle, the reduction that is presented later in this section can
be modified to concentrate on the endpoints set E(R) only.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...partition
- In the sense as it was
defined in chapter 5.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...Q
- This does not imply a restriction to 2-way temporal
joins. The techniques that we propose in this thesis can also be
applied to n-way joins with
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...endpoints
- As you might remember,
theorem 1 says that an optimal partition can be found within the set E(R) of endpoints of
intervals in R. To simplify the definition of an IP-table we
concentrate on at the moment and show a reduction of
an IP-table to values in
section 7.3.4.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...that
- Please remember the comment made in the footnote
on page
with respect to the notation for
conventional and ordered sets.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...6 bytes
- e.g. one byte per day, month, year, hour, minute,
second.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...usually
- Considering the majority of compilers.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...`resolution'
- By analogy with the sense
in which this term is used for images.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...
- In the remainder of this subsection we will only use
the notation for condensed IP-tables. This is for improving the
readability of the text. Nevertheless, everything that applies to
condensed IP-tables equally applies to endpoint IP-tables in this
context.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...I'(Q,b)
- As mentioned earlier, we use the notation
for condensed IP-tables in this case although the techniques apply to
endpoint IP-tables too.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...discussion
- See summary in
section 8.2.2.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...memory
- See discussion
about shared-disk.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...machines
- The interested reader might refer to papers such as
[Teradata Corporation, 1983], [Teradata Corporation, 1985], [Carino and Kostamaa, 1992], [Sloan, 1992],
[Witkowski, 1993] and many others to get details about Teradata
machines and their successors.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...components
- This argument his
entirely historical as many SM systems are now entirely built of
commodity parts whereas SN systems frequently have proprietary (and
therefore expensive) interconnect.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...tuples
- It is for simplicity reasons that we assume that
these buffers keep tuples. Alternatively, they might hold references
to tuples or one could create 2m index structures for describing the
fragments. This would require very detailed performance modeling
without providing any benefit for our purposes. Therefore we assume
that the fragments are to be materialised.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...sorting
- One could argue that the relation
might originally be sorted. However, it is difficult to maintain such
a sort-order during repartitioning, especially when communication over
the interconnect is involved. Most protocols do not guarantee that
messages sent in a certain order also arrive in this order. One would
then need to restore this order on arrival of the messages. Thus the
usage of a sort-merge join algorithm would impose an overhead under
any circumstances.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...i.e.
- We note that repartitioning applies to all
relations that participate in the join, i.e. R and Q in the
prototypical case. Therefore the cost components C1(a), C1(b),
C1(c), C1(d) are computed for R using R's parameters -
this is indicated by C1(a)(R), C1(b)(R) etc. - and for Q
using Q's parameters - this is indicated by C1(a)(Q),
C1(b)(Q) etc.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...blockwise
- or - in other terms - pagewise.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...approach
- See
section 3.4.1.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...strategies
- In the remainder, we will frequently use the term
strategy as a shortcut for partitioning
strategy.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...approach
- See algorithm ChooseIntervals in
[Soo et al., 1994].
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...function
- Here, we refer to
assuming that R is to
be partitioned. However, the technique that we describe can be applied
to any temporal relation, in particular also to
in which
case would have to be used rather than
..
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...
- The number of minutes per day is .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...
- Sometimes we will refer to
these joins also as join 1 ,
join 2 and join 3
respectively.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...fragment
- See
if-statement in figures 9.7
(page
) and 9.9
(page
).
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...
- ...or and in the case of
condensed IP-tables.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...negative
- with one single exception, i.e. for
join 1, a=16 and the primary minimum-overlaps strategy
(figure 10.47).
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
- ...endpoints
- In the context of the phone calls
scenario this would mean that the time when a phone call starts should
not imply a certain length of the phone call.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.