Note on functions over streams.

A function over streams is a function whose domain and codomain are both streams, where by a stream is meant an infinite sequence. The type of such a function has the form f :: Str A -> Str B where A is called the input alphabet and B is called the output alphabet. Here Str A denotes the type of streams (infinite lists) over A. One should think of Str as defined coinductively. I use Haskell notations, particularly `::' for typing. .

Many programs and scripts that are written to be `filters' in a Unix environment provide examples of functions on streams. For example, the function that copies the input through, except that lowercase characters are replaced by their uppercase equivalent is a stream processing function.

A continuous function is however something more than a function on streams: to produce the next character of output, it should be necessary to examine at most a finite prefix of the input. By providing longer and longer portions of the input, one can be sure of obtaining output eventually.

Note that the filter `grep p' is not continuous according to this definition: it need produce no output when there is no instance of pattern p in the input. Not all Unix filters are continuous. A continuous function is `productive' or non-null in a certain sense.

There is more than one way to skin a cat, or formulate the notion of continuity. One can express it by quantification over streams. The following formulation, which is the subject of this note uses inductive and coinductive definitions rather than quantification over streams.

The central operator is a binary operator on types which I shall write with an infix '&'.

   T A B = (mu X) B + (A -> X) 
The values of a type A&B are well-founded trees in which there are leaves of type B, and node-branches are indexed by A. We can define a function which interprets such a tree as a program of a certain kind. We write this function with an infix `':
 
   () :: T A B  -> Str A -> (B, Str A) 
   (inl b)  input = (b, input) 
   (inr f)  (a:input') = (f a)  input' 
The program reads a prefix of the input stream, then emits an output character and stops, returning a value of B, paired with the unconsumed suffix of the input stream.

One should think of the definition as using induction over T(A,B). (A catamorphism.)

Now to model stream processing functions Stream A -> Stream B, I propose the following coinductively defined type Prog(A,B).

   Prog(A,B) = (nu X) T A (B,X)
Such an program consumes some finite prefix of the input, then delivers a character and a new version of itself to apply to the unconsumed input. We express this with an infix `%':
    (%%) :: Prog(A,B) -> Str A -> Str B 
    p %% input = let ((c,p'),input') = p  input
                 in c : p' %% input' 
A continuous function :: Str A -> Str B is one which is obtained as the image of a program in Prog(A,B) under the interpretation %%.

One should think of this definition as using co-induction into Str(B). (An anamorphism.)

It is interesting that the type Prog(A,B) has both inductive and co-inductive aspects, with the former "inside" the latter.

     Prog(A,B) = (nu X)(mu Y)((B,X) + (A -> Y)

Note that the alphabets need not be finite. The cardinal of the alphabets plays no role above. However it may be interesting to look at the case when the input and output alphabets are finite, eg the binary digits. In that case we are looking at continuous functions over Cantor space. We have a (relatively well-known) inductive description of the functions from Str(2) to 2, and we use the trick above to describe the functions from Str(2) to Str(2).

Cantor space

There is a `piping' combinator on continuous programs that appears to reflect composition. A suitable operation is defined here in a Haskell program, written for the sake of type-checking the definition. That the operation reflects composition remains to be proved.


Peter G. Hancock
Last modified: Thu Oct 30 09:38:47 GMT 2003