Condensation of IP-Tables

Next: Endpoint IP-Tables Up: Size Considerations Previous: Realistic Examples

Subsections

Condensation of IP-Tables

We now present one possibility to reduce the size of an IP-table. The idea is to collapse a certain number of IP-table entries, say a , into one. We call this a condensation of an IP-table by a and refer to it as I'(R,a) if the original IP-table is I(R). The parameter a is called the condensation factor . Condensation means that N' new timepoints are created with

(27)

For simplicity we assume N' = N / a for a moment. Later we will come back to the situation when this constraint does not hold. The timepoints form a new timepoint t'₁, the timepoints form t'₂ etc. In general, the timepoints form a new timepoint t'_j which gets the value of :

(28)

with . This set of new timepoints is referred to as V'(R,a) . See figure 7.6 for a condensation by a=2 for the example of figure 7.2. The definition of the t'_j implies that

This conserves the notion of an overlap: if an interval ends at one of the points within then it logically ends now at t'_j because of the collapse. Therefore it cannot overlap t'_j.

**Figure:** Condensation of timepoints with a=2 for the example of figure 7.2.

The collapse of also implies that - logically - all intervals that started at one of these points are now considered to start at t'_j. We use a new function which describes this fact:

(29)

for all and for all other values it is

Using instead of $s_{\scriptscriptstyle R}$ in the formula (7.1) still delivers a correct result: if I'(R,a) is used for partitioning rather than I(R) then the resulting partition is a subset of V'(R,a). We prove this in the following.

Theorem 29365

Let be a partition for R with p_k-1 < p_k for , and let and p_m=t'_N'. Then the following holds

(30)

for all $k=1,\dots,m$ .

Proof:

For a let p_k-1 = t'_x and p_k = t'_y with . For convenience, we define . Then it is

Now we can show the following

Corollary 29381

Let be a partition for R with p_k-1 < p_k for , and let and p_m=t'_N'. Then the following holds

(31)

for all $k=1,\dots,m$ .

Proof:

Trivial because of theorem 5.

In the case that all definitions remain the same as shown above for . For j=N' we define

for the same reasons as above. However, it pays attention to the fact that t'_N' does not comprise the same number a of the original timepoints but only . The proof of theorem 5 is not changed by this situation. Therefore theorem 5 and corollary 1 hold for this case too.

The condensation process divides the size of the IP-table by a. As a consequence, the information in the IP-table becomes coarser and less precise. This might decrease the quality of the resulting partitions but, for example, can make the process of deriving the partition more efficient. Just consider the optimal partitioning algorithm IP-opt whose runtime is a function of N. In chapter 10, we will perform experiments with different values of a and look at the impact it has on the quality of the partitions that are derived.

Figure 7.7 shows the condensed IP-table I'(R,2) for the example of figure 7.2. The notion of condensation can be applied to create a function with providing the number of intervals that ended within the condensed timepoint's range, i.e. within (t'_j-1,t'_j]. Similarly, we can define a function with providing the number of intervals that intersect with the range of t'_j, i.e. (t'_j-1,t'_j]. Then the formulas (5.1), (5.2) and (5.3) can be used by replacing $s_{\scriptscriptstyle R}$ , $e_{\scriptscriptstyle R}$ and $i_{\scriptscriptstyle R}$ through , and respectively. This is straightforward for the same reasons that applied in the case of $s_{\scriptscriptstyle R}$ , $e_{\scriptscriptstyle R}$ and $i_{\scriptscriptstyle R}$ (see section 5.2), just that condensed timepoints are used. Please note that condensation assumes that

for . This is another expression of the fact that condensation makes the `resolution' of the timeline coarser. In summary, the formulas of figure 5.1 apply too as they were derived from (5.1), (5.2) and (5.3). This can be verified for the example of figure 7.7.

**Figure:** The IP-table I'(R,2) (in bold typeface) for the intervals in figure 7.2 plus the values and .

Next: Endpoint IP-Tables Up: Size Considerations Previous: Realistic Examples

Thomas Zurek