# On an idea of Peter Dybjer's

[There is also some latex source that should compile for some of these definitions if you are interested. Here is the dvi.] A bit more than a couple of years ago, Peter Dybjer first suggested the idea of using families of sets in constructive type theories to model ordinals "a la von Neumann". Earlier this year (at a TYPES workshop ) Peter gave a talk about "ordinals as universes". He sketched an analogue of the Veblen hierarchy along these lines. As far as I know, no-one has had the opportunity yet to explore fully Peter's suggestion. I intend to accumulate on this page notes about this interesting idea. I expect to rewrite at least parts of it rather frequently. It is likely that the organisation will deteriorate from time to time.

For a long time I have felt that a general theory of ordinals comparable to that of the von Neumann ordinals in classical set theory is lacking in constructive type theory, and indeed that this is something of a "scandal". Ordinal notations are one thing; but "the" ordinals are another - meaning by "the" ordinals canonical representatives of well-orderings such as the von Neumann ordinals. (There is a typical lament for example in this page: search for the term "spine-chilling".) On the other hand, it's unlikely one needs to fill exactly the rôle of the notion of ordinal in classical set theory.

It seems important to `go slowly'. At certain points one stares at the ordinals defined so far, then formulates explicitly a principle that underlies their definition, and then exploits the principle to push further. If one is too hasty, the patterns of definition will come out slightly awry, the reflection principles will accordingly come out slightly wrong, and the whole enterprise will veer off in the wrong direction. (I expect there will be quite a lot of veering-off.)

It is also important and difficult to settle on some workable notation. I have not done so yet; so what follows may be less easy to follow than it needs to be.

# Ordinals

Recall the main landscape features of small countable ordinals.

## Watersheds

Arguably the first few `watershed' ordinals are

• `0` zero, the first ordinal, with no predecessors. Closed under practically anything: addition, multiplication, exponentiation, ..., for trivial reasons. (In some ways zero is the most mysterious ordinal of all!)
• `1` one, the first non-zero ordinal, with predecessor `0`. Closed under addition and multiplication, but not exponentiation. Closed under sums (but not products) of series of length < 1.
• `2` two. Closed under multiplication, and exponentiation, but not binary addition. Closed under sums (and products) of series of length < 2.
• `w` omega, the first infinite ordinal, with predecessors the finite ordinals. Closed under addition, multiplication, exponentiation, etc etc. Closed under sums (and products) of series of finite length.
• `w^w` omega-to-the-omega, the ordinal of primitive recursive arithmetic (or arithmetic with recursion restricted to lowest type). This is (for example) the order type of the unary polynomial functions from the natural numbers to the natural numbers ordered by eventual dominance (so, highest different exponent). The predecessors of `w^w` are describable by terms built up from `0`, `1` and ``` w``` by addition and multiplication.
• `e_0` epsilon_zero, the ordinal of first order arithmetic, with recursion allowed at all finite types. The predecessors of `e_0` are describable by terms built up from `0` and `w` by addition and (multiplication and) exponentiation.
• `phi_w(0)`, the ordinal of primitive recursive arithmetic including (transfinite) recursion over well-founded countably branching trees. Actually, maybe this one's just for real obsessives.
• `G_0` Gamma_zero, the ordinal of type theory with finitary data types, and an external sequence of universes closed under the `Pi` quantifier. (Also ATR0, and dozens of other systems.) Its predecessors are described by terms built up from the terms below `e_0` by closing them additionally under a binary operator for Veblen's 2-place function (based on the function `w^a` which enumerates the non-zero ordinals closed under binary addition).
• The Bachmann-Howard ordinal. The description of terms below this ordinal involves the use of certain ordinals in the third number class. I shall not bother to say any more at the moment, because I expect to be occupied with the smaller ordinals for some time.

## Various Veblen hierarchies

Given a normal (ie. continuous and strictly increasing) ordinal function `f`, by the Veblen hierarchy over ``` f```, I mean the 2 place function `V_f(a,b)` defined so that for each `a`, `V_f(a)` enumerates the common fixed points of the family of functions ```{ V_f(a') | a' < a }```.

There are lots of interesting normal functions to start with, some of them unfamiliar to me. Here are a few.

• Usually one starts with the function `w^a`, giving the classical Veblen hierarchy. So `V_f(1,0)` is ``` e_0```, the first epsilon number, and the proof-theoretic ordinal of first order arithmetic. Many weak subsystems of second order arithmetic crop up in this hierarchy. The only one I can remember is sigma-1-1 axiom of choice, which has ordinal ``` V_f(e_0,0)```. Perhaps ramified analysis of level 1 has ordinal `V_f(2,0)`; this was all explored by Feferman and the late Kurt Schütte while dinosaurs were roaming the earth, in the 50's.
• For any normal `f`, `V_f(a,0)` is again normal, so one may form a normal hierarchy (which I call the second Veblen hierarchy) on that function -- and so on. (This takes us into the realms of the so-called "n-place" Veblen functions; I can't remember the exact definition, or even where to look it up.)
• One may also start with the function `1 + a`, which is surely the simplest non-trivial normal function. It enumerates the non-zero ordinals. In this case `V_f(a,b)` equals `w^a + b`. The second Veblen hierarchy on this function is again the classical (first) Veblen hierarchy.
• `w.(1+a)` which enumerates the limit ordinals. Its derivative is `(w^w).(1+a)`, unless I am mistaken. The derivative of that is `(w^w^2).(1+a)`, UIAM. In general, the `b`-th derivative `V_f(b,a)` stands some chance of being `(w^(w^b)).(1+a)`. (I should check this.) If that's true, then the second Veblen hierarchy is based on `w^(w^a)` which (strangely enough) enumerates the infinite ordinals closed under binary multiplication (the finite ones being `0`, `1` and `2`).

As mentioned already, there are the `n-place' Veblen functions, and beyond that Shütte's `Klammersymbolen'. I think Anton Setzer has a way of extending the Klammersymbolen (multi-dimensionally?) up to the Bachmann-Howard ordinal. There may even be some definitions of these things in his drafts on ordinal systems.

# Finite ordinals

Represent the finite ordinals by families.
• The ordinal zero.

Let `U0 = {} : Set` denote the empty set, and ``` T0 : U0 -> Set``` the empty set-valued function with domain `U0`. There is surely no other choice.

• The ordinal one.

Let `U1 = {0} : Set` denote the set with a single element `0`, and `T1 : U1 -> Set` the set-valued function with domain `U1` whose value at `0` is the set `U0 = {}`.

• The ordinal two.

Let `U2 : Set` denote the set with a two elements `0, 1`, and `T2 : U2 -> Set` the set-valued function with domain `U2` whose value at `0` is the set `U0 = {}`, and whose value at `1` is the set `U1 = {0}`. This family of sets is sometimes called the boolean universe, or Jan Smith's microscopic universe.

And so on.

# Successor operations: two candidates

Now we reflect on what is going on in the construction of the finite ordinals, and try to formulate a successor operation. There seem to be two candidates.

Suppose we are given a family `A,B`. What should the successor be? In either case the index set needs to be extended by a new element, so we take `A' = A + 1`. Here ``` +``` means the binary disjoint union operation on sets, and `1` means the singleton set `{ 0 }` with sole element `0`.

• `B'` is defined by
```                  B' (i a) = B a + 1
B' (j 0) = {}
```

One alters the function `B` to `B'`, having one more new point in its domain. However the new element `goes at the bottom'. The old elements are themselves `shifted up by 1'.

• `B'` is defined by
```
B' (i a) = B a
B' (j 0) = A
```

One extends the function `B` to `B'`, having one more new point in its domain, and at this point taking as its value the entire domain of `B`.

This operation is the analogue of the von Neumann successor operation which maps a set `x` to ```x' = x U {x}```. One extends `x` by one new element, namely the singleton `{x}`.

This is the same as what one gets if one looks at Erik Palmgren's next-universe operator, (the one under which the superuniverse is closed) and strips out all the closure conditions pertaining to quantifiers, given sets, logical constants, the identity type etc.

Let us call the first operation (with the new element at the top) `Succ^` and the second (with the empty set at the bottom) `Succ_`.

In both definitions above, the operation (on sets) ```X |-> X + 1``` is of key importance. Martin-Löf has suggested that this operation deserves its own name. (See pp. 103-104 of Nordström, Petersson and Smith.) So let's take `S` to be a unary type constructor with the following formation and introduction rules:

```     X : Set                                x : X
--------- formation      0 : S X      ---------
S X : Set                             s x : S X
```

I would prefer to write this information as follows, using a more compact notation vaguely like Agda's.

```     S : Set -> Set
S X = data 0 | S (x : X )
```

Here `data' is a form of coproduct operator, such that one has the opportunity to associate constructor names with the various cofactors: in this case the names are `0` (with no components), and `s` (with one component, which is an `X`). To analyse such coproducts, there is a `case' construction.

```        case z of 0    -> a
|  S x  -> b x
```

Sometimes I miss out the `z of': in that case I mean ``` \z->case z of ...``` . Sometimes I use pattern-matching definitions rather than the case construction - these should always be reducible to case constructions.

# Uninterestingness

Note that when we step from `(U,T)` to ```Succ^ (U,T)```, or `Succ_ (U,T)`, there is no `interesting' interaction between the two components. (I don't mean anything disparaging by this!) That is, `U'` is defined in terms of `U`, but not `T` -- it is really an operation on sets, `artificially' made into an operator on families. (Compare this to the closure of a universe under a quantifier like `Pi`, `Sigma`, or `W`, where there is an `interesting' interaction.) However it is true that in the definition of `Succ^`, `T'` depends on `U` (as well of course as `T`), which is perhaps `mildly' interesting.

# The ordinal `w`

Now the ordinal `w` is the least ordinal which contains `0` and is closed under the successor operation ``` Succ_```. It reflects the property of the class of ordinals that it is non-empty (hence there is a least ordinal), and for any ordinal there is a greater (hence a least ordinal which is strictly greater). Clearly, one way to define `w` is as follows.
```       Uw = Nat = S Nat     -- the set of natural numbers
Tw 0     = {}
Tw (s x) = S (Tw x)
```
But perhaps this is already to leap ahead too far. That is what we expect.

# Containing versus extending

Something occurs to me here: the distinction between

• an operator on families on sets which takes us from ``` (A,B)``` to `(A',B')`, where the latter is the least family extending `(A,B)` enjoying certain closure properties.
```    (A',B') : Fam Set
A' = A + ...
B' (i a) = B a
B' ...   = ...
```
• an operator on families on sets which takes us from ``` (A,B)``` to `(A',B')`, where the latter is the least family containing `(A,B)` enjoying certain closure properties.
```    (A',B') : Fam Set
A' = S A + ...
B' (i 0)     = A
B' (i (s a)) = B a
B' ...       = ...
```
Isn't it the case that in the presence of the successor operation (on families), we can get the latter from the former, simply by starting with (ie. extending) the successor ```Succ^ (A,B)```? Yes!

In this connection it may be interesting to look at one way of expressing the Veblen derivative of a single function `f : On -> On`:

```       f' a  =  (f^w . (+1))^a (f^w 0)
```
The ``(+1)`' which occurs here is perhaps the correlate of the successor operation `Succ^` on families.

# Closing under zero and successor

Let us (for the time being) consider a `universe' to be a family of sets containing the empty set and closed under (set) successor `S`. (This is a reasonable notion of limit ordinal.) Then the operation which given a family `(A,B)` produces the least universe `(A',B')` extending `(A,B)` is defined as follows.

```     A' = A + S A'
B' (i a)      = B a
B' (j 0)      = {}
B' (j (s a')) = S (B' a')
```
Let us call this operation `Sw`. (Another name might be `Univ`, if we are calling a universe a family which contains the empty set and is closed under successor.) Note that it is an operator on families. In a sense, it takes a family to the least limit family extending it. If we start with the empty family, then it takes this to the family of finite ordinals. There is something monadic about `Sw`. It is a closure operator, in the sense that it is inflationary, monotone, and idempotent.

# Finite multiples of `w`

The ordinal `w*2` is the least ordinal which contains both `0` and `w`, and is closed under the successor operation. Similarly `w*(k+1)` is the least ordinal closed under the successor operation, which contains all of `0`, `w`, `w*2`, ... , ``` w*k```. This gets us (arbitrarily close) to `w^2`. But what really is the step to `w^2` itself? Somehow, we need to close under the general process of throwing in the last closure ordinal, and then closing again under zero and successor.

If one wants to go really slowly one should probably say something like this: "after w we have sw, ssw, ... then we can build w*2 as the least universe closed under 0, s, w. Similarly we can go on and build the ordinals up to w*3 and w*3 itself is the universe closed under 0, s, w, w*2. And similarly we can define w*n for finite n. Then we wish to internalize the process which brings us from w*a to w*(a+1) as a construction on general ordinals a. Now, w*(a+1) is the ordinal closed under 0, s, extending and containing w*a."

# Finite powers of `w`

Now we can consider `Sw` as an operator under which families can be closed, in addition to ``` S```. So we can define a set `Uw^2` as follows:

```     Uw^2 = data \$0
| \$S  (a : Uw^2)
| \$Sw (a : Uw^2, b : Tw^2 a -> Uw^2)
```
The family `Tw^2` is defined as follows:
```     Tw^2 \$0        = {}
Tw^2 (\$S a)    = S (Tw^2 a)
Tw^2 (\$Sw a b) = Sw (Tw^2 a) (Tw^2 . b)
```
It seems reasonable to regard `(Uw^2,Tw^2)` as one analogue of the ordinal `w^2` -- which can be regarded as the first ordinal which is greater than `0`, and closed under both `+1` and `+w`. But this may not be the right point of view.

Now `w^2` is surely closed under `Sw` -- but should we throw in also `0` and closure under ``` S``` (yet) again? Maybe this is a point at which we can `veer off', and we should take care not to leap ahead too quickly.

The operator involved here may be this. Given a family of sets `(A,B)`, define `(A',B')` by

```     A'   = data \$b (a : A)
| \$0
| \$S  (a' : A')
| \$Sw (a' : A', b' : B' a' -> A')
```
The family `B'` is defined as follows:
```     B' (\$b a)      = B a
B' \$0          = {}
B' (\$S a')     = S (B' a')
B' (\$Sw a' b') = Sw (B' a') (B' . b')
```
Perhaps we might call this operator `Sw^2`.

Let's consider (binary) addition. If we are given two families of sets `(A1,B1)` and `(A2,B2)`, we can define another family `(A,B)` which is in some sense the ordered sum of the first two: `(A1,B1) + (A2,B2)` (with `(A2,B2)` `above' `(A1,B1)`).

```       A = A1 + A2
B (i a1) = B1 a1
B (j a2) = A1 + B2 a2
```
This agrees with the definitions of successor, and `one'. (The definition of addition makes more sense, perhaps, if we model ordinals as transition systems rather than just families of sets.) The equations for the predecessor are
```       (i a1)[b1]   = i (a1[b1])
(j a2)[i a1] = i a1
(j a2)[j b2] = j (a2[b2])
```

## Multiplication

Let's consider (binary) multiplication. If we are given two families of sets `(A,B)` and `(C,D)`, we can define another family `(E,F)` which is in some sense the ordered product of the first two: `(C,D) * (A,B)` (with `(A,B)` `controlling' `(C,D)` so the operands of `*` are `the wrong way round' for the usual sense of ordinal arithmetic).

```       E       = A * C
F (a,c) = D c + B a * C
```
If we model ordinals as transition systems rather than just families of sets, the equations for the predecessor are The equations for the predecessor are
```       (a,c)[i d]      = (a,c[d])
(a,c)[j (b,c')] = (a[b],c')
```

## Sum (of an ordered series)

Let's consider a general ordered sum. If we are given a family of sets `(A,B)`, and for each `a : A` a family of sets `(C a,D a)`, we can define another family `(E,F)` which is in some sense the ordered product of the family. This ought to generalise both binary addition (when the index family is the family `2') and multiplication (when the indexing is in fact constant).

```       E       = (Sigma a : A) C a
F (a,c) = D a c + (Sigma b : B a) C (a[b])
```
If we model ordinals as transition systems rather than just families of sets, the equations for the predecessor are The equations for the predecessor are (just as above)
```       (a,c)[i d]      = (a,c[d])
(a,c)[j (b,c')] = (a[b],c')
```

## Exponential?

It is quite challenging to give a definition of exponentiation. Perhaps one can start with the `two to the _' operation, which in the case of linear orderings is the lexicographic order of chains which are descending in the exponent order. (I've not succeeded in defining this nicely as a transition system.) Then one might think of the `w to the _' operation, in which the descending chains are `decorated' with numerical `coefficients'. (One should also consider how to define the closely related multi-set order, better suited to non-linear orders: Thierry knows a great deal about this.) Finally, one could think about the general exponential.

General exponentiation is an enormously subtle operation. Almost everyone overlooks the cases in which the base is zero or one. The functions `sgbar a = 0^a` and ```sg a = 0^(0^a)``` are very important.

# Closure functional: singleton version

Suppose we have any family ```(U : Set, T : U -> Set)``` and any operator `Theta` from families of sets to families of sets. The operator ``` Theta``` can be split apart into two simpler operators, as follows.

```        Phi : Fam Set -> Set
Psi : (Pi (A,B) : Fam Set) Phi(A,B) -> Set
Theta (A,B) = (Phi(A,B),Psi(A,B)) : Fam Set
```
The least family of sets extending `(U,T)` and closed under `(Phi,Psi)` could perhaps be defined as follows.
```        U'  = data \$t (u : U)
| \$phi (a : U', b : T' a -> U')
| \$psi (a : U', b : T' a -> U', c : Phi(T' a, T' . b))

T' (\$t u)       = T u
T' (\$phi a b)   = Phi (T' a, T' . b)
T' (\$psi a b c) = Psi (T' a, T' . b) c
```
Let us call `(U',T')` `Cl(Phi,Psi)(U,T)`. The type of `Cl` is as follows.
```       Cl : (Fam Set -> Fam Set) -> Fam Set -> Fam Set
```
The existence of this functional is somehow equivalent to the `Mahloness' of the universe of sets. (Maybe one should think about whether this is some kind of `free' construction with a given `indeterminate' family. There is something monadic about this.)

It is natural to throw in zero and successor.

```       Deriv : (Fam Set -> Fam Set) -> Fam Set
Deriv Theta = (U',T')
where
U' = data \$0
| \$s (a : U')
| \$phi (a : U', b : T' a -> U')
| \$psi (a : U', b : T' a -> U', c : Phi (T' a, T' . b) )
T' = case \$0         -> U0
| \$s a       -> S (T' a)
| \$phi a b   -> Phi (T' a, T' . b)
| \$psi a b c -> Psi (T' a, T' . b) c
```
This is the least family containing the empty set, and closed under both the successor operation, and the given operator ``` Theta```. The step from `Theta` to ```Der (Cl Theta)``` is then something very like the Veblen derivative.

The least family of sets containing `(U,T)` and closed under ```Theta = \(A,B)->(Phi(A,B),Psi(A,B))``` could perhaps be defined as follows.

```         (U',T') = Cl Theta (Succ (U,T))
```

How does it look if we take `(U0,T0)` for the `contained' family, and von-Neumann successor for the operator on families? Then

```        U = U0, which is empty,
(so T is irrelevant),
Phi(A,B)       = S A,
Psi(A,B) 0     = A
Psi(A,B) (s a) = B a,
```
Hmm. I don't know what to make of that.

# Closure functional: family version

Slight generalisation of the `singleton' functional. (I have the feeling that this can be subsumed in the simpler functional.) Given a family of families of sets, and a family of operators from families of sets to families of sets, to define the least family of sets containing each of the given family of sets, and closed under each of the given families of operators. (This is roughly speaking what Peter Dybjer appears to suggest gives us an analogue of the Veblen hierarchy.)

Suppose we have

• a family `(I,\i->(U i,T i))` of families of sets `(U i : Set, T i : U -> Set)` where ```i : I```.
• a family of operators from families of sets to families of sets. These operators can be split apart into the form:
```        Phi j : (Pi A : Set) (A -> Set) -> Set
Psi j : (Pi A : Set, B : A -> Set) Phi j (A,B) -> Set
```
where `j : J` and `J : Set`.
The least family of sets containing each of the given families of sets and closed under each of the given family of operators from families of sets to families of sets could perhaps be defined as follows.
```   U'  = data \$u (i : I)
| \$t (i : I, u : U i)
| \$phi (j : J, a : U', b : T' u -> U')
| \$psi (j : J, a : U', b : T' u -> U', c : Phi j (T' u, T' . t) )

T' (\$u i)          = U i
T' (\$t i u)        = T i u
T' (\$phi j a b)    = Phi j (T' a) (T' . b)
T' (\$psi j a b c)  = Psi j (T' a) (T' . b) c
```

# A transition structure on families

A family of sets is an ordered pair `(A,B)` where `A : Set` and `B : A -> Set`. So `Fam Set` is a type, or large set. Let us however ignore this "size consideration" for the moment and pretend that `F = Fam Set` is a set, and so can be the state space of a transition system. We can put the following transition structure on `F`. The transitions from `(A,B)` are defined to have type
```        Tfam (A,B) = (Sigma a : A) B a -> A : Set
```
The destination states of the transitions are given by
```        (A,B)[a,b] = (B a, B . b) : F
```
where ``_ . _`' refers to composition. To see that this type checks, remember that `a : A`, ```b : B a -> A``` and `B : A -> Set`, so `B a : Set` and `B . b : B a -> Set`.

A chain starting from `(A,B) : Fam Set` has the form

```            (A,B)
|
|
| (a,b) : (Sigma a : A) B a -> A
|
V
(A',B') = (B a, B . b)
|
|
| (a',b') : (Sigma a' : B a) (B . b) a' -> B a
|
V
(A'',B'') = ((B . b) a', B . b . b')
|
|
| (a'',b'') : (Sigma a'' : (B . b) a') (B . b . b') a'' -> (B . b) a'
|
V
(A''',B''') = ((B . b . b') a'', B . b . b' . b'')
```
and so on. Each successive family of sets is a "internal refinement" or restriction of its predecessor.

Now, what about size questions? I have defined a transition system to have a set of states. The situation appears to be this. For each fixed family of sets `(A,B)` we can put a transition structure on the set of "internal-families" of `(A,B)`, ie the set

```            sfam (A,B) = (Sigma a : A) B a -> A
```
Namely, we take as transition sets
```             T : sfam (A,B) -> Set
T(a,b) =  (Sigma c : B a) B (b c) -> B a
=  sfam (B a, B . b)
```
and as destinations
```             _[_] : (Pi (a,b) : sfam (A,B)) T (a,b) -> sfam (A,B)
(a,b)[c,d] = (b c, b . d)
where a : A
b : B a -> A
c : B a
d : B (b c) -> B a
```
In other words, we can replace talk of "sets" by talk of "small sets".

# The ordinals below the Bachmann-Howard ordinal

I've not thought a great deal about it, but Peter Dybjer's idea for a `Veblen' functional suggests to me investigating closure functionals of higher type. (cf. Erik Palmgren's hierarchy.) With functionals of finite type, maybe one could approach the Bachmann-Howard ordinal?

It may be interesting to chase up a result of Michael Rathjen's on the strength of Martin-Löf type theory with a super-universe, parts I and II. Part I is the Archive for Mathematical Logic, somewhere; part II is on his web page. I think it concerns a version of Martin-Löf type theory without generalised inductive definitions, but with something like a super-universe closed under the `next universe' operator. (And natural numbers?) If I recall correctly, he located the strength of this system using the Veblen hierarchy based upon the `G` (Gamma) function: he calls the ordinal `Phi (G_0) 0` , and I guess that `Phi` is the Veblen hierarchy on ``` G_a```.

Another thing here is Peter Aczel's hierarchy of `diagonalisation' functionals from his JSL 1972 paper on describing ordinals by means of functionals of transfinite type. It could be that the functionals considered here are in some way analogous to these diagonalisation functions. Peter Aczel identified a certain ordinal above the Bachmann-Howard ordinal (as it were, using not epsilon numbers in the third number class, but Gamma> numbers to index the diagonalisations) with the least countable ordinal that cannot be described by diagonalisation functionals of autonomously transfinite types. Who knows, maybe this ordinal will turn up here if one thinks about functionals of transfinite type.

Another thing to look at may be Herman Jervell's constructions from the 70's. None of the Edinburgh libraries seem to have the relevant volume of LNM.