On an idea of Peter Dybjer's

[There is also some latex source that should compile for some of these definitions if you are interested. Here is the dvi.] A bit more than a couple of years ago, Peter Dybjer first suggested the idea of using families of sets in constructive type theories to model ordinals "a la von Neumann". Earlier this year (at a TYPES workshop ) Peter gave a talk about "ordinals as universes". He sketched an analogue of the Veblen hierarchy along these lines. As far as I know, no-one has had the opportunity yet to explore fully Peter's suggestion. I intend to accumulate on this page notes about this interesting idea. I expect to rewrite at least parts of it rather frequently. It is likely that the organisation will deteriorate from time to time.

For a long time I have felt that a general theory of ordinals comparable to that of the von Neumann ordinals in classical set theory is lacking in constructive type theory, and indeed that this is something of a "scandal". Ordinal notations are one thing; but "the" ordinals are another - meaning by "the" ordinals canonical representatives of well-orderings such as the von Neumann ordinals. (There is a typical lament for example in this page: search for the term "spine-chilling".) On the other hand, it's unlikely one needs to fill exactly the rôle of the notion of ordinal in classical set theory.

It seems important to `go slowly'. At certain points one stares at the ordinals defined so far, then formulates explicitly a principle that underlies their definition, and then exploits the principle to push further. If one is too hasty, the patterns of definition will come out slightly awry, the reflection principles will accordingly come out slightly wrong, and the whole enterprise will veer off in the wrong direction. (I expect there will be quite a lot of veering-off.)

It is also important and difficult to settle on some workable notation. I have not done so yet; so what follows may be less easy to follow than it needs to be.


Recall the main landscape features of small countable ordinals.


Arguably the first few `watershed' ordinals are

Various Veblen hierarchies

Given a normal (ie. continuous and strictly increasing) ordinal function f, by the Veblen hierarchy over f, I mean the 2 place function V_f(a,b) defined so that for each a, V_f(a) enumerates the common fixed points of the family of functions { V_f(a') | a' < a }.

There are lots of interesting normal functions to start with, some of them unfamiliar to me. Here are a few.

As mentioned already, there are the `n-place' Veblen functions, and beyond that Shütte's `Klammersymbolen'. I think Anton Setzer has a way of extending the Klammersymbolen (multi-dimensionally?) up to the Bachmann-Howard ordinal. There may even be some definitions of these things in his drafts on ordinal systems.

Finite ordinals

Represent the finite ordinals by families.

And so on.

Successor operations: two candidates

Now we reflect on what is going on in the construction of the finite ordinals, and try to formulate a successor operation. There seem to be two candidates.

Suppose we are given a family A,B. What should the successor be? In either case the index set needs to be extended by a new element, so we take A' = A + 1. Here + means the binary disjoint union operation on sets, and 1 means the singleton set { 0 } with sole element 0.

Let us call the first operation (with the new element at the top) Succ^ and the second (with the empty set at the bottom) Succ_.

In both definitions above, the operation (on sets) X |-> X + 1 is of key importance. Martin-Löf has suggested that this operation deserves its own name. (See pp. 103-104 of Nordström, Petersson and Smith.) So let's take S to be a unary type constructor with the following formation and introduction rules:

     X : Set                                x : X
     --------- formation      0 : S X      ---------
     S X : Set                             s x : S X

I would prefer to write this information as follows, using a more compact notation vaguely like Agda's.

     S : Set -> Set 
     S X = data 0 | S (x : X )

Here `data' is a form of coproduct operator, such that one has the opportunity to associate constructor names with the various cofactors: in this case the names are 0 (with no components), and s (with one component, which is an X). To analyse such coproducts, there is a `case' construction.

        case z of 0    -> a
               |  S x  -> b x

Sometimes I miss out the `z of': in that case I mean \z->case z of ... . Sometimes I use pattern-matching definitions rather than the case construction - these should always be reducible to case constructions.


Note that when we step from (U,T) to Succ^ (U,T), or Succ_ (U,T), there is no `interesting' interaction between the two components. (I don't mean anything disparaging by this!) That is, U' is defined in terms of U, but not T -- it is really an operation on sets, `artificially' made into an operator on families. (Compare this to the closure of a universe under a quantifier like Pi, Sigma, or W, where there is an `interesting' interaction.) However it is true that in the definition of Succ^, T' depends on U (as well of course as T), which is perhaps `mildly' interesting.

The ordinal w

Now the ordinal w is the least ordinal which contains 0 and is closed under the successor operation Succ_. It reflects the property of the class of ordinals that it is non-empty (hence there is a least ordinal), and for any ordinal there is a greater (hence a least ordinal which is strictly greater). Clearly, one way to define w is as follows.
       Uw = Nat = S Nat     -- the set of natural numbers
       Tw 0     = {}
       Tw (s x) = S (Tw x)
But perhaps this is already to leap ahead too far. That is what we expect.

Containing versus extending

Something occurs to me here: the distinction between

Isn't it the case that in the presence of the successor operation (on families), we can get the latter from the former, simply by starting with (ie. extending) the successor Succ^ (A,B)? Yes!

In this connection it may be interesting to look at one way of expressing the Veblen derivative of a single function f : On -> On:

       f' a  =  (f^w . (+1))^a (f^w 0)
The `(+1)' which occurs here is perhaps the correlate of the successor operation Succ^ on families.

Closing under zero and successor

Let us (for the time being) consider a `universe' to be a family of sets containing the empty set and closed under (set) successor S. (This is a reasonable notion of limit ordinal.) Then the operation which given a family (A,B) produces the least universe (A',B') extending (A,B) is defined as follows.

     A' = A + S A' 
     B' (i a)      = B a
     B' (j 0)      = {}
     B' (j (s a')) = S (B' a')
Let us call this operation Sw. (Another name might be Univ, if we are calling a universe a family which contains the empty set and is closed under successor.) Note that it is an operator on families. In a sense, it takes a family to the least limit family extending it. If we start with the empty family, then it takes this to the family of finite ordinals. There is something monadic about Sw. It is a closure operator, in the sense that it is inflationary, monotone, and idempotent.

Finite multiples of w

The ordinal w*2 is the least ordinal which contains both 0 and w, and is closed under the successor operation. Similarly w*(k+1) is the least ordinal closed under the successor operation, which contains all of 0, w, w*2, ... , w*k. This gets us (arbitrarily close) to w^2. But what really is the step to w^2 itself? Somehow, we need to close under the general process of throwing in the last closure ordinal, and then closing again under zero and successor.

If one wants to go really slowly one should probably say something like this: "after w we have sw, ssw, ... then we can build w*2 as the least universe closed under 0, s, w. Similarly we can go on and build the ordinals up to w*3 and w*3 itself is the universe closed under 0, s, w, w*2. And similarly we can define w*n for finite n. Then we wish to internalize the process which brings us from w*a to w*(a+1) as a construction on general ordinals a. Now, w*(a+1) is the ordinal closed under 0, s, extending and containing w*a."

Finite powers of w

Now we can consider Sw as an operator under which families can be closed, in addition to S. So we can define a set Uw^2 as follows:

     Uw^2 = data $0    
               | $S  (a : Uw^2) 
               | $Sw (a : Uw^2, b : Tw^2 a -> Uw^2) 
The family Tw^2 is defined as follows:
     Tw^2 $0        = {}
     Tw^2 ($S a)    = S (Tw^2 a)
     Tw^2 ($Sw a b) = Sw (Tw^2 a) (Tw^2 . b)
It seems reasonable to regard (Uw^2,Tw^2) as one analogue of the ordinal w^2 -- which can be regarded as the first ordinal which is greater than 0, and closed under both +1 and +w. But this may not be the right point of view.

Now w^2 is surely closed under Sw -- but should we throw in also 0 and closure under S (yet) again? Maybe this is a point at which we can `veer off', and we should take care not to leap ahead too quickly.

The operator involved here may be this. Given a family of sets (A,B), define (A',B') by

     A'   = data $b (a : A)
               | $0    
               | $S  (a' : A') 
               | $Sw (a' : A', b' : B' a' -> A') 
The family B' is defined as follows:
     B' ($b a)      = B a
     B' $0          = {}
     B' ($S a')     = S (B' a')
     B' ($Sw a' b') = Sw (B' a') (B' . b')
Perhaps we might call this operator Sw^2.

Addition, multiplication, sum


Let's consider (binary) addition. If we are given two families of sets (A1,B1) and (A2,B2), we can define another family (A,B) which is in some sense the ordered sum of the first two: (A1,B1) + (A2,B2) (with (A2,B2) `above' (A1,B1)).

       A = A1 + A2
       B (i a1) = B1 a1
       B (j a2) = A1 + B2 a2
This agrees with the definitions of successor, and `one'. (The definition of addition makes more sense, perhaps, if we model ordinals as transition systems rather than just families of sets.) The equations for the predecessor are
       (i a1)[b1]   = i (a1[b1])
       (j a2)[i a1] = i a1
       (j a2)[j b2] = j (a2[b2])


Let's consider (binary) multiplication. If we are given two families of sets (A,B) and (C,D), we can define another family (E,F) which is in some sense the ordered product of the first two: (C,D) * (A,B) (with (A,B) `controlling' (C,D) so the operands of * are `the wrong way round' for the usual sense of ordinal arithmetic).

       E       = A * C
       F (a,c) = D c + B a * C
If we model ordinals as transition systems rather than just families of sets, the equations for the predecessor are The equations for the predecessor are
       (a,c)[i d]      = (a,c[d])
       (a,c)[j (b,c')] = (a[b],c')

Sum (of an ordered series)

Let's consider a general ordered sum. If we are given a family of sets (A,B), and for each a : A a family of sets (C a,D a), we can define another family (E,F) which is in some sense the ordered product of the family. This ought to generalise both binary addition (when the index family is the family `2') and multiplication (when the indexing is in fact constant).

       E       = (Sigma a : A) C a
       F (a,c) = D a c + (Sigma b : B a) C (a[b])
If we model ordinals as transition systems rather than just families of sets, the equations for the predecessor are The equations for the predecessor are (just as above)
       (a,c)[i d]      = (a,c[d])
       (a,c)[j (b,c')] = (a[b],c')


It is quite challenging to give a definition of exponentiation. Perhaps one can start with the `two to the _' operation, which in the case of linear orderings is the lexicographic order of chains which are descending in the exponent order. (I've not succeeded in defining this nicely as a transition system.) Then one might think of the `w to the _' operation, in which the descending chains are `decorated' with numerical `coefficients'. (One should also consider how to define the closely related multi-set order, better suited to non-linear orders: Thierry knows a great deal about this.) Finally, one could think about the general exponential.

General exponentiation is an enormously subtle operation. Almost everyone overlooks the cases in which the base is zero or one. The functions sgbar a = 0^a and sg a = 0^(0^a) are very important.

Closure functional: singleton version

Suppose we have any family (U : Set, T : U -> Set) and any operator Theta from families of sets to families of sets. The operator Theta can be split apart into two simpler operators, as follows.

        Phi : Fam Set -> Set        
        Psi : (Pi (A,B) : Fam Set) Phi(A,B) -> Set
        Theta (A,B) = (Phi(A,B),Psi(A,B)) : Fam Set
The least family of sets extending (U,T) and closed under (Phi,Psi) could perhaps be defined as follows.
        U'  = data $t (u : U)
                 | $phi (a : U', b : T' a -> U')
                 | $psi (a : U', b : T' a -> U', c : Phi(T' a, T' . b))

        T' ($t u)       = T u 
        T' ($phi a b)   = Phi (T' a, T' . b)
        T' ($psi a b c) = Psi (T' a, T' . b) c
Let us call (U',T') Cl(Phi,Psi)(U,T). The type of Cl is as follows.
       Cl : (Fam Set -> Fam Set) -> Fam Set -> Fam Set
The existence of this functional is somehow equivalent to the `Mahloness' of the universe of sets. (Maybe one should think about whether this is some kind of `free' construction with a given `indeterminate' family. There is something monadic about this.)

It is natural to throw in zero and successor.

       Deriv : (Fam Set -> Fam Set) -> Fam Set
       Deriv Theta = (U',T') 
                 U' = data $0
                         | $s (a : U')
                         | $phi (a : U', b : T' a -> U')
                         | $psi (a : U', b : T' a -> U', c : Phi (T' a, T' . b) )
                 T' = case $0         -> U0
                         | $s a       -> S (T' a)
                         | $phi a b   -> Phi (T' a, T' . b)
                         | $psi a b c -> Psi (T' a, T' . b) c 
This is the least family containing the empty set, and closed under both the successor operation, and the given operator Theta. The step from Theta to Der (Cl Theta) is then something very like the Veblen derivative.

The least family of sets containing (U,T) and closed under Theta = \(A,B)->(Phi(A,B),Psi(A,B)) could perhaps be defined as follows.

         (U',T') = Cl Theta (Succ (U,T))

How does it look if we take (U0,T0) for the `contained' family, and von-Neumann successor for the operator on families? Then

        U = U0, which is empty,
        (so T is irrelevant), 
        Phi(A,B)       = S A,
        Psi(A,B) 0     = A
        Psi(A,B) (s a) = B a,
Hmm. I don't know what to make of that.

Closure functional: family version

Slight generalisation of the `singleton' functional. (I have the feeling that this can be subsumed in the simpler functional.) Given a family of families of sets, and a family of operators from families of sets to families of sets, to define the least family of sets containing each of the given family of sets, and closed under each of the given families of operators. (This is roughly speaking what Peter Dybjer appears to suggest gives us an analogue of the Veblen hierarchy.)

Suppose we have

The least family of sets containing each of the given families of sets and closed under each of the given family of operators from families of sets to families of sets could perhaps be defined as follows.
   U'  = data $u (i : I)
            | $t (i : I, u : U i)
            | $phi (j : J, a : U', b : T' u -> U')
            | $psi (j : J, a : U', b : T' u -> U', c : Phi j (T' u, T' . t) )

        T' ($u i)          = U i
        T' ($t i u)        = T i u 
        T' ($phi j a b)    = Phi j (T' a) (T' . b)
        T' ($psi j a b c)  = Psi j (T' a) (T' . b) c  

A transition structure on families

A family of sets is an ordered pair (A,B) where A : Set and B : A -> Set. So Fam Set is a type, or large set. Let us however ignore this "size consideration" for the moment and pretend that F = Fam Set is a set, and so can be the state space of a transition system. We can put the following transition structure on F. The transitions from (A,B) are defined to have type
        Tfam (A,B) = (Sigma a : A) B a -> A : Set
The destination states of the transitions are given by
        (A,B)[a,b] = (B a, B . b) : F 
where `_ . _' refers to composition. To see that this type checks, remember that a : A, b : B a -> A and B : A -> Set, so B a : Set and B . b : B a -> Set.

A chain starting from (A,B) : Fam Set has the form

              | (a,b) : (Sigma a : A) B a -> A
            (A',B') = (B a, B . b) 
              | (a',b') : (Sigma a' : B a) (B . b) a' -> B a
            (A'',B'') = ((B . b) a', B . b . b')
              | (a'',b'') : (Sigma a'' : (B . b) a') (B . b . b') a'' -> (B . b) a'
             (A''',B''') = ((B . b . b') a'', B . b . b' . b'')
and so on. Each successive family of sets is a "internal refinement" or restriction of its predecessor.

Now, what about size questions? I have defined a transition system to have a set of states. The situation appears to be this. For each fixed family of sets (A,B) we can put a transition structure on the set of "internal-families" of (A,B), ie the set

            sfam (A,B) = (Sigma a : A) B a -> A 
Namely, we take as transition sets
             T : sfam (A,B) -> Set
             T(a,b) =  (Sigma c : B a) B (b c) -> B a
                    =  sfam (B a, B . b) 
and as destinations
             _[_] : (Pi (a,b) : sfam (A,B)) T (a,b) -> sfam (A,B) 
             (a,b)[c,d] = (b c, b . d) 
                          where a : A
                                b : B a -> A
                                c : B a
                                d : B (b c) -> B a
In other words, we can replace talk of "sets" by talk of "small sets".

The ordinals below the Bachmann-Howard ordinal

I've not thought a great deal about it, but Peter Dybjer's idea for a `Veblen' functional suggests to me investigating closure functionals of higher type. (cf. Erik Palmgren's hierarchy.) With functionals of finite type, maybe one could approach the Bachmann-Howard ordinal?

It may be interesting to chase up a result of Michael Rathjen's on the strength of Martin-Löf type theory with a super-universe, parts I and II. Part I is the Archive for Mathematical Logic, somewhere; part II is on his web page. I think it concerns a version of Martin-Löf type theory without generalised inductive definitions, but with something like a super-universe closed under the `next universe' operator. (And natural numbers?) If I recall correctly, he located the strength of this system using the Veblen hierarchy based upon the G (Gamma) function: he calls the ordinal Phi (G_0) 0 , and I guess that Phi is the Veblen hierarchy on G_a.

Another thing here is Peter Aczel's hierarchy of `diagonalisation' functionals from his JSL 1972 paper on describing ordinals by means of functionals of transfinite type. It could be that the functionals considered here are in some way analogous to these diagonalisation functions. Peter Aczel identified a certain ordinal above the Bachmann-Howard ordinal (as it were, using not epsilon numbers in the third number class, but Gamma> numbers to index the diagonalisations) with the least countable ordinal that cannot be described by diagonalisation functionals of autonomously transfinite types. Who knows, maybe this ordinal will turn up here if one thinks about functionals of transfinite type.

Another thing to look at may be Herman Jervell's constructions from the 70's. None of the Edinburgh libraries seem to have the relevant volume of LNM.

Misc, detritus

Is one really restricted to countable ordinals? (There is a remark about this in Peter's slides.)

Can't one formulate in this framework a notion of regularity, and deal with regular ordinals? After all, an ordinal is regular if (it is infinite and) closed under sums of sequences which are indexed by a smaller ordinal; the property presents itself as a closure property.

There is a (very possibly completely confused) suggestion for a regularity-based definition of `the ordinals' (up to, I presume the first inaccessible) as a transition system in the middle of the following page.

How does one account for recursive definitions in Peter's style? Are these universes merely fixed points, or least fixed points?

Peter Hancock
Last modified: Sun Sep 19 10:16:50 BST 1999