Uniform Workloads

Next: Experiments Up: Evaluation of Characteristics Previous: Evaluation of Characteristics

Uniform Workloads

In practice, uniform workloads are a very rare exception. Therefore they cannot be used to draw realistic conclusions. However, they can help to simplify the evaluation of the components of the performance model that are not susceptible to data skew and other forms of non-uniformity. To prepare such a preliminary evaluation is the focus of this section. Under the notion of uniformity of the workload we assume the following:

All intervals in R and Q have the same length .
The startpoints of these intervals are uniformly distributed over the timeline, i.e. at any time there is the same number of intervals starting as at a time with .
The only input parameter to define a partition is m. The (m-1) breakpoints are supposed to be at equal distances.
Both relations have the same span, i.e.
The previous points imply that
We also assume that both relations have the same number of tuples, i.e.
Tuples in R and Q are supposed to be of an equal size, i.e.
for $r \in R$ and $q \in Q$ .

Following these assumptions on uniform data, we can derive the necessary parameters for the costs model.

First, we want to approximate the size of a fragment R_k with . For that purpose we can use which gives the average number of fragments to which a tuple $r \in R$ is assigned. Assuming a partition of into m equally sized segments, i.e. any two subsequent breakpoints, p_k and p_k+1, are at equal distance. The latter can be calculated by . If is the average length of a timestamp interval then the interval occupies a portion of

(61)

of a segment. As interval startpoints are uniformly distributed over a segment, there are some intervals starting in close enough proximity to the end of a segment to overlap a breakpoint. Therefore we have to add 1 to the value obtained by (8.30) in order to get an estimate for :

Figure 8.14 shows an example for , and one interval in R starting at each point . (8.30) gives a value of 0.4 which means that 40% of the intervals starting within a segment overlap the right breakpoint. In the example of figure 8.14, these comprise 4 out of 10 intervals. From a global point of view this implies that 40% of the relation's tuples overlap a breakpoint. Therefore it is

Thus in this example.

The quotient in (8.30) can also result in values beyond 1 which means that which means that the (average) length of a segment is smaller than the average length of an interval. Consequently, most (in the general case) or all (under the assumption of uniformity) intervals overlap the right breakpoint. This is a bad choice for a partition.

**Figure:** An example for the approximation of for chronons and chronons.

Now we can approximate the size of a fragment R_k for :

As each tuple can only be assigned to one R'_k, the sizes of these are given by

From |R_k| = |R'_k| + |R''_k| we can conclude that

In the same way, the values for the |Q_k|, |Q'_k| and |Q''_k| can be derived. Table 8.8 summarises the approximations made under the assumption of uniformity.

Next: Experiments Up: Evaluation of Characteristics Previous: Evaluation of Characteristics

Thomas Zurek