Next: Sequential Join Algorithms
Up: Join Processing
Previous: Join Performance Issues
The join condition plays a very important role in join processing.
Certain types of conditions allow certain processing and optimisation
techniques. For that purpose, it has proved to be very useful to
classify joins depending on the respective join condition. This
section summarises the most prominent types in traditional relational
queries. Please note that these categories are not disjoint but
emphasise certain features of the join condition, such as the data
types of the join attributes or the predicates that are involved.
- Theta-Joins:
- Many join conditions are based on the pattern
|  |
(5) |
where
is one of the following
predicates/operators : =,
, <, >,
or
. Joins with such a condition are called
theta-joins. Naturally R.A and Q.B must be attributes that
are comparable by these
operators; it would not make sense to
have strings in R.A and integers in Q.B and compare them by
. More generally, theta-conditions consist of various simple
ones of the form described in (3.2) which are connected by
logical operator, such as AND and OR.
- Equi-Joins:
-
An equi-join has a join condition that is based on the equality
predicate =. It therefore is a special case of a theta-join.
The majority of join conditions are believed to be based on an equality
predicate, such as in `key = foreign key' conditions. Consequently,
most join algorithms have been optimised for equi-conditions. See
section 3.4 for details.
- Nonequi-Joins:
-
Theta-joins which are not equi-joins are called nonequi-joins
[Mishra and Eich, 1992]. Nonequi-joins have attracted less research efforts than
equi-joins due to their rare usage. However, new data types, such as
timestamps , intervals ,
rectangles , polygons etc., have become
more interesting with the growing requirements and ambitions of data
modeling. In the context of relational query processing this means
that there are many possible new join conditions that describe
relationships between objects of one of these data types. These
relationships very often can be described in the form of a
nonequi-join condition. This has triggered some interest in new types
of data-type-specific joins, such as those described in the following
two paragraphs. Most of them can be considered as traditional
nonequi-joins.
- Temporal Joins:
-
This class of joins involves temporal data types, such as time
intervals or dates. The interval type is predominant in most temporal
data models. Therefore, temporal join conditions essentially describe
relationships between intervals. In the context of time representation
in natural language processing, Allen identified 13 possible
relationships between two intervals [Allen, 1983]. Each of them
implies its own subtype of temporal join. As an example, you might
refer to the before join presented in
section 3.2.1. Temporal joins are discussed in
detail in the next chapter.
- Spatial Joins:
-
Spatial data types, such as rectangles or polygons, are used in
geographic information systems (GIS) . Imagine two spatial relations, one
holding areas (polygons) with nuclear plants and the other one storing
areas with high cancer rates. If we wanted to investigate possible
spatial connections between nuclear plants and cancer rates we would
need to join these two relations using `intersection of areas' as the
(spatial) join condition. Spatial joins are more dynamic than temporal
joins in the sense that many geometric data types have to be catered
for rather than one or two. On one hand, this makes them more general
but, on the other hand, allows hardly any data-type-specific
optimisation. For further details on spatial joins the reader might
refer to [Günther, 1993], [Lo and Ravishankar, 1996] or [Patel and DeWitt, 1996], for example.
Next: Sequential Join Algorithms
Up: Join Processing
Previous: Join Performance Issues
Thomas Zurek