Next: Sequential Join Algorithms Up: Join Processing Previous: Join Performance Issues

Types of Joins

The join condition plays a very important role in join processing. Certain types of conditions allow certain processing and optimisation techniques. For that purpose, it has proved to be very useful to classify joins depending on the respective join condition. This section summarises the most prominent types in traditional relational queries. Please note that these categories are not disjoint but emphasise certain features of the join condition, such as the data types of the join attributes or the predicates that are involved.

Theta-Joins:

Many join conditions are based on the pattern

$\begin{displaymath} R.A \;\;\theta\;\; Q.B\end{displaymath}$

(5)

where $\theta$ is one of the following predicates/operators : =, $\not=$ , <, >, $\le$ or $\ge$ . Joins with such a condition are called theta-joins. Naturally R.A and Q.B must be attributes that are comparable by these $\theta$ operators; it would not make sense to have strings in R.A and integers in Q.B and compare them by $\le$ . More generally, theta-conditions consist of various simple ones of the form described in (3.2) which are connected by logical operator, such as AND and OR.

Equi-Joins:

An equi-join has a join condition that is based on the equality predicate =. It therefore is a special case of a theta-join. The majority of join conditions are believed to be based on an equality predicate, such as in `key = foreign key' conditions. Consequently, most join algorithms have been optimised for equi-conditions. See section 3.4 for details.

Nonequi-Joins:

Theta-joins which are not equi-joins are called nonequi-joins [Mishra and Eich, 1992]. Nonequi-joins have attracted less research efforts than equi-joins due to their rare usage. However, new data types, such as timestamps , intervals , rectangles , polygons etc., have become more interesting with the growing requirements and ambitions of data modeling. In the context of relational query processing this means that there are many possible new join conditions that describe relationships between objects of one of these data types. These relationships very often can be described in the form of a nonequi-join condition. This has triggered some interest in new types of data-type-specific joins, such as those described in the following two paragraphs. Most of them can be considered as traditional nonequi-joins.

Temporal Joins:

This class of joins involves temporal data types, such as time intervals or dates. The interval type is predominant in most temporal data models. Therefore, temporal join conditions essentially describe relationships between intervals. In the context of time representation in natural language processing, Allen identified 13 possible relationships between two intervals [Allen, 1983]. Each of them implies its own subtype of temporal join. As an example, you might refer to the before join presented in section 3.2.1. Temporal joins are discussed in detail in the next chapter.

Spatial Joins:

Spatial data types, such as rectangles or polygons, are used in geographic information systems (GIS) . Imagine two spatial relations, one holding areas (polygons) with nuclear plants and the other one storing areas with high cancer rates. If we wanted to investigate possible spatial connections between nuclear plants and cancer rates we would need to join these two relations using `intersection of areas' as the (spatial) join condition. Spatial joins are more dynamic than temporal joins in the sense that many geometric data types have to be catered for rather than one or two. On one hand, this makes them more general but, on the other hand, allows hardly any data-type-specific optimisation. For further details on spatial joins the reader might refer to [Günther, 1993], [Lo and Ravishankar, 1996] or [Patel and DeWitt, 1996], for example.

Next: Sequential Join Algorithms Up: Join Processing Previous: Join Performance Issues

Thomas Zurek