For a long time I have felt that a general theory of ordinals
comparable to that of the von Neumann ordinals in classical set theory
is lacking in constructive type theory, and indeed that this is
something of a "scandal". Ordinal notations are one thing; but "the"
ordinals are another - meaning by "the" ordinals canonical
representatives of well-orderings such as the von Neumann ordinals. (There is
a typical lament for example in this page:
search for the term "spine-chilling".) On the other hand, it's
unlikely one needs to fill *exactly* the rôle of the notion
of ordinal in classical set theory.

It seems important to `go slowly'. At certain points one stares at the ordinals defined so far, then formulates explicitly a principle that underlies their definition, and then exploits the principle to push further. If one is too hasty, the patterns of definition will come out slightly awry, the reflection principles will accordingly come out slightly wrong, and the whole enterprise will veer off in the wrong direction. (I expect there will be quite a lot of veering-off.)

It is also important and difficult to settle on some workable notation. I have not done so yet; so what follows may be less easy to follow than it needs to be.

Arguably the first few `watershed' ordinals are

`0`

zero, the first ordinal, with no predecessors. Closed under practically anything: addition, multiplication, exponentiation, ..., for trivial reasons. (In some ways zero is the most mysterious ordinal of all!)`1`

one, the first non-zero ordinal, with predecessor`0`

. Closed under addition and multiplication, but not exponentiation. Closed under sums (but not products) of series of length < 1.`2`

two. Closed under multiplication, and exponentiation, but not binary addition. Closed under sums (and products) of series of length < 2.`w`

omega, the first infinite ordinal, with predecessors the finite ordinals. Closed under addition, multiplication, exponentiation, etc etc. Closed under sums (and products) of series of finite length.`w^w`

omega-to-the-omega, the ordinal of primitive recursive arithmetic (or arithmetic with recursion restricted to lowest type). This is (for example) the order type of the unary polynomial functions from the natural numbers to the natural numbers ordered by eventual dominance (so, highest different exponent). The predecessors of`w^w`

are describable by terms built up from`0`

,`1`

and`w`

by addition and multiplication.`e_0`

epsilon_zero, the ordinal of first order arithmetic, with recursion allowed at all finite types. The predecessors of`e_0`

are describable by terms built up from`0`

and`w`

by addition and (multiplication and) exponentiation.`phi_w(0)`

, the ordinal of primitive recursive arithmetic including (transfinite) recursion over well-founded countably branching trees. Actually, maybe this one's just for real obsessives.`G_0`

Gamma_zero, the ordinal of type theory with finitary data types, and an external sequence of universes closed under the`Pi`

quantifier. (Also ATR0, and dozens of other systems.) Its predecessors are described by terms built up from the terms below`e_0`

by closing them additionally under a binary operator for Veblen's 2-place function (based on the function`w^a`

which enumerates the non-zero ordinals closed under binary addition).- The Bachmann-Howard ordinal. The description of terms below
this ordinal involves the use of certain ordinals in the
*third*number class. I shall not bother to say any more at the moment, because I expect to be occupied with the smaller ordinals for some time.

Given a normal (ie. continuous and strictly increasing) ordinal
function `f`

, by the Veblen hierarchy over ```
f
```

, I mean the 2 place function `V_f(a,b)`

defined
so that for each `a`

, `V_f(a)`

enumerates the
common fixed points of the family of functions ```
{ V_f(a') | a'
< a }
```

.

There are lots of interesting normal functions to start with, some of them unfamiliar to me. Here are a few.

- Usually one starts with the function
`w^a`

, giving the classical Veblen hierarchy. So`V_f(1,0)`

is`e_0`

, the first epsilon number, and the proof-theoretic ordinal of first order arithmetic. Many weak subsystems of second order arithmetic crop up in this hierarchy. The only one I can remember is sigma-1-1 axiom of choice, which has ordinal`V_f(e_0,0)`

. Perhaps ramified analysis of level 1 has ordinal`V_f(2,0)`

; this was all explored by Feferman and the late Kurt Schütte while dinosaurs were roaming the earth, in the 50's. - For any normal
`f`

,`V_f(a,0)`

is again normal, so one may form a normal hierarchy (which I call the second Veblen hierarchy) on that function -- and so on. (This takes us into the realms of the so-called "n-place" Veblen functions; I can't remember the exact definition, or even where to look it up.) - One may also start with the function
`1 + a`

, which is surely the simplest non-trivial normal function. It enumerates the non-zero ordinals. In this case`V_f(a,b)`

equals`w^a + b`

. The second Veblen hierarchy on this function is again the classical (first) Veblen hierarchy. `w.(1+a)`

which enumerates the limit ordinals. Its derivative is`(w^w).(1+a)`

, unless I am mistaken. The derivative of that is`(w^w^2).(1+a)`

, UIAM. In general, the`b`

-th derivative`V_f(b,a)`

stands some chance of being`(w^(w^b)).(1+a)`

. (I should check this.) If that's true, then the second Veblen hierarchy is based on`w^(w^a)`

which (strangely enough) enumerates the infinite ordinals closed under binary multiplication (the finite ones being`0`

,`1`

and`2`

).

As mentioned already, there are the `n-place' Veblen functions, and beyond that Shütte's `Klammersymbolen'. I think Anton Setzer has a way of extending the Klammersymbolen (multi-dimensionally?) up to the Bachmann-Howard ordinal. There may even be some definitions of these things in his drafts on ordinal systems.

- The ordinal zero.
Let

`U0 = {} : Set`

denote the empty set, and`T0 : U0 -> Set`

the empty set-valued function with domain`U0`

. There is surely no other choice. - The ordinal one.
Let

`U1 = {0} : Set`

denote the set with a single element`0`

, and`T1 : U1 -> Set`

the set-valued function with domain`U1`

whose value at`0`

is the set`U0 = {}`

. - The ordinal two.
Let

`U2 : Set`

denote the set with a two elements`0, 1`

, and`T2 : U2 -> Set`

the set-valued function with domain`U2`

whose value at`0`

is the set`U0 = {}`

, and whose value at`1`

is the set`U1 = {0}`

. This family of sets is sometimes called the boolean universe, or Jan Smith's microscopic universe.

And so on.

Suppose we are given a family `A,B`

. What should the
successor be? In either case the index set needs to be extended by
a new element, so we take `A' = A + 1`

. Here ```
+
```

means the binary disjoint union operation on sets, and
`1`

means the singleton set `{ 0 }`

with sole
element `0`

.

`B'`

is defined byB' (i a) = B a + 1 B' (j 0) = {}

One alters the function

`B`

to`B'`

, having one more new point in its domain. However the new element `goes at the bottom'. The old elements are themselves `shifted up by 1'.`B'`

is defined byB' (i a) = B a B' (j 0) = A

One extends the function

`B`

to`B'`

, having one more new point in its domain, and at this point taking as its value the entire domain of`B`

.This operation is the analogue of the von Neumann successor operation which maps a set

`x`

to`x' = x U {x}`

. One extends`x`

by one new element, namely the singleton`{x}`

.This is the same as what one gets if one looks at Erik Palmgren's next-universe operator, (the one under which the superuniverse is closed) and strips out all the closure conditions pertaining to quantifiers, given sets, logical constants, the identity type etc.

`Succ^`

and the second (with the empty set at the
bottom) `Succ_`

.
In both definitions above, the operation (on sets) ```
X
|-> X + 1
```

is of key importance. Martin-Löf has
suggested that this operation deserves its own name. (See pp.
103-104 of Nordström, Petersson and Smith.) So let's take
`S`

to be a unary type constructor with the following
formation and introduction rules:

X : Set x : X --------- formation 0 : S X --------- S X : Set s x : S X

I would prefer to write this information as follows, using a more compact notation vaguely like Agda's.

S : Set -> Set S X = data 0 | S (x : X )

Here `data' is a form of coproduct operator, such that one has the
opportunity to associate constructor names with the various
cofactors: in this case the names are `0`

(with no components), and
`s`

(with one component, which is an `X`

). To
analyse such coproducts, there is a `case' construction.

case z of 0 -> a | S x -> b x

Sometimes I miss out the `z of': in that case I mean ```
\z->case z of ...
```

. Sometimes I use pattern-matching
definitions rather than the case construction - these should always
be reducible to case constructions.

Note that when we step from `(U,T)`

to ```
Succ^
(U,T)
```

, or `Succ_ (U,T)`

, there is no `interesting'
interaction between the two components. (I don't mean anything
disparaging by this!) That is, `U'`

is defined in terms of
`U`

, but not `T`

-- it is really an operation on
sets, `artificially' made into an operator on families. (Compare this to
the closure of a universe under a quantifier like `Pi`

,
`Sigma`

, or `W`

, where there is an
`interesting' interaction.) However it is true that in the definition
of `Succ^`

, `T'`

depends on `U`

(as
well of course as `T`

), which is perhaps `mildly'
interesting.

`w`

`w`

is the least ordinal which contains
`0`

and is closed under the successor operation ```
Succ_
```

. It reflects the property of the class of ordinals
that it is non-empty (hence there is a least ordinal), and for any
ordinal there is a greater (hence a least ordinal which is strictly
greater). Clearly, one way to define `w`

is as follows.
Uw = Nat = S Nat -- the set of natural numbers Tw 0 = {} Tw (s x) = S (Tw x)But perhaps this is already to leap ahead too far. That is what we

Something occurs to me here: the distinction between

- an operator on families on sets which takes us from
`(A,B)`

to`(A',B')`

, where the latter is the least family*extending*`(A,B)`

enjoying certain closure properties.(A',B') : Fam Set A' = A + ... B' (i a) = B a B' ... = ...

- an operator on families on sets which takes us from
`(A,B)`

to`(A',B')`

, where the latter is the least family*containing*`(A,B)`

enjoying certain closure properties.(A',B') : Fam Set A' = S A + ... B' (i 0) = A B' (i (s a)) = B a B' ... = ...

```
Succ^
(A,B)
```

? Yes!
In this connection it may be interesting to look at one way of
expressing the Veblen derivative of a *single* function
`f : On -> On`

:

f' a = (f^w . (+1))^a (f^w 0)The `

`(+1)`

' which occurs here is perhaps the correlate
of the successor operation `Succ^`

on families.
Let us (for the time being) consider a `universe' to be a family
of sets containing the empty set and closed under (set) successor
`S`

. (This is a reasonable notion of limit ordinal.)
Then the operation which given a family `(A,B)`

produces
the least universe `(A',B')`

extending `(A,B)`

is defined as follows.

A' = A + S A' B' (i a) = B a B' (j 0) = {} B' (j (s a')) = S (B' a')Let us call this operation

`Sw`

. (Another name might be
`Univ`

, if we are calling a universe a family which
contains the empty set and is closed under successor.) Note that it is
an operator on families. In a sense, it takes a family to the least
limit family extending it. If we start with the empty family, then it
takes this to the family of finite ordinals. There is something monadic
about `Sw`

. It is a closure operator, in the sense that
it is inflationary, monotone, and idempotent.
`w`

The ordinal `w*2`

is the least ordinal which contains
both `0`

and `w`

, and is closed under the
successor operation. Similarly `w*(k+1)`

is the least
ordinal closed under the successor operation, which contains all of
`0`

, `w`

, `w*2`

, ... , ```
w*k
```

. This gets us (arbitrarily close) to `w^2`

.
But what really is the step to `w^2`

itself? Somehow, we
need to close under the general process of throwing in the last
closure ordinal, and then closing again under zero and successor.

If one wants to go really slowly one should probably say something like this: "after w we have sw, ssw, ... then we can build w*2 as the least universe closed under 0, s, w. Similarly we can go on and build the ordinals up to w*3 and w*3 itself is the universe closed under 0, s, w, w*2. And similarly we can define w*n for finite n. Then we wish to internalize the process which brings us from w*a to w*(a+1) as a construction on general ordinals a. Now, w*(a+1) is the ordinal closed under 0, s, extending and containing w*a."

`w`

Now we can consider `Sw`

as an
operator under which families can be closed, in addition to ```
S
```

. So we can define a set `Uw^2`

as follows:

Uw^2 = data $0 | $S (a : Uw^2) | $Sw (a : Uw^2, b : Tw^2 a -> Uw^2)The family

`Tw^2`

is defined as follows:
Tw^2 $0 = {} Tw^2 ($S a) = S (Tw^2 a) Tw^2 ($Sw a b) = Sw (Tw^2 a) (Tw^2 . b)It seems reasonable to regard

`(Uw^2,Tw^2)`

as one
analogue of the ordinal `w^2`

-- which can be regarded
as the first ordinal which is greater than `0`

, and
closed under both `+1`

and `+w`

. But this may
not be the right point of view.
Now `w^2`

is surely closed under `Sw`

--
but should we throw in also `0`

and closure under ```
S
```

(yet) again? Maybe this is a point at which we can `veer
off', and we should take care not to leap ahead too quickly.

The operator involved here may be this. Given a family of sets
`(A,B)`

, define `(A',B')`

by

A' = data $b (a : A) | $0 | $S (a' : A') | $Sw (a' : A', b' : B' a' -> A')The family

`B'`

is defined as follows:
B' ($b a) = B a B' $0 = {} B' ($S a') = S (B' a') B' ($Sw a' b') = Sw (B' a') (B' . b')Perhaps we might call this operator

`Sw^2`

.
Let's consider (binary) addition. If we are given two families
of sets `(A1,B1)`

and `(A2,B2)`

, we can
define another family `(A,B)`

which is in some sense the
ordered sum of the first two: `(A1,B1) + (A2,B2)`

(with
`(A2,B2)`

`above' `(A1,B1)`

).

A = A1 + A2 B (i a1) = B1 a1 B (j a2) = A1 + B2 a2This agrees with the definitions of successor, and `one'. (The definition of addition makes more sense, perhaps, if we model ordinals as transition systems rather than just families of sets.) The equations for the predecessor are

(i a1)[b1] = i (a1[b1]) (j a2)[i a1] = i a1 (j a2)[j b2] = j (a2[b2])

Let's consider (binary) multiplication. If we are given two
families of sets `(A,B)`

and `(C,D)`

, we can
define another family `(E,F)`

which is in some sense the
ordered product of the first two: `(C,D) * (A,B)`

(with
`(A,B)`

`controlling' `(C,D)`

so the operands
of `*`

are `the wrong way round' for the usual sense of
ordinal arithmetic).

E = A * C F (a,c) = D c + B a * CIf we model ordinals as transition systems rather than just families of sets, the equations for the predecessor are The equations for the predecessor are

(a,c)[i d] = (a,c[d]) (a,c)[j (b,c')] = (a[b],c')

Let's consider a general ordered sum. If we are given a family
of sets `(A,B)`

, and for each `a : A`

a
family of sets `(C a,D a)`

, we can define another family
`(E,F)`

which is in some sense the ordered product of
the family. This ought to generalise both binary addition (when the
index family is the family `2') and multiplication (when the
indexing is in fact constant).

E = (Sigma a : A) C a F (a,c) = D a c + (Sigma b : B a) C (a[b])If we model ordinals as transition systems rather than just families of sets, the equations for the predecessor are The equations for the predecessor are (just as above)

(a,c)[i d] = (a,c[d]) (a,c)[j (b,c')] = (a[b],c')

It is quite challenging to give a definition of exponentiation.
Perhaps one can start with the `two to the _' operation, which in
the case of linear orderings is the lexicographic order of chains
which are descending in the exponent order. (I've not succeeded in
defining this *nicely* as a transition system.) Then one
might think of the `w to the _' operation, in which the descending
chains are `decorated' with numerical `coefficients'. (One should
also consider how to define the closely related multi-set order,
better suited to non-linear orders: Thierry knows a great deal
about this.) Finally, one could think about the general
exponential.

General exponentiation is an enormously subtle operation. Almost
everyone overlooks the cases in which the base is zero or one. The
functions `sgbar a = 0^a`

and ```
sg a =
0^(0^a)
```

are very important.

Suppose we have *any* family ```
(U : Set, T : U ->
Set)
```

and *any* operator `Theta`

from
families of sets to families of sets. The operator ```
Theta
```

can be split apart into two simpler operators, as
follows.

Phi : Fam Set -> Set Psi : (Pi (A,B) : Fam Set) Phi(A,B) -> Set Theta (A,B) = (Phi(A,B),Psi(A,B)) : Fam SetThe least family of sets

`(U,T)`

and
`(Phi,Psi)`

could perhaps be
defined as follows.
U' = data $t (u : U) | $phi (a : U', b : T' a -> U') | $psi (a : U', b : T' a -> U', c : Phi(T' a, T' . b)) T' ($t u) = T u T' ($phi a b) = Phi (T' a, T' . b) T' ($psi a b c) = Psi (T' a, T' . b) cLet us call

`(U',T')`

`Cl(Phi,Psi)(U,T)`

. The
type of `Cl`

is as follows.
Cl : (Fam Set -> Fam Set) -> Fam Set -> Fam SetThe existence of this functional is somehow equivalent to the `Mahloness' of the universe of sets. (Maybe one should think about whether this is some kind of `free' construction with a given `indeterminate' family. There is something monadic about this.)

It is natural to throw in zero and successor.

Deriv : (Fam Set -> Fam Set) -> Fam Set Deriv Theta = (U',T') where U' = data $0 | $s (a : U') | $phi (a : U', b : T' a -> U') | $psi (a : U', b : T' a -> U', c : Phi (T' a, T' . b) ) T' = case $0 -> U0 | $s a -> S (T' a) | $phi a b -> Phi (T' a, T' . b) | $psi a b c -> Psi (T' a, T' . b) cThis is the least family containing the empty set, and closed under both the successor operation, and the given operator

```
Theta
```

. The step from `Theta`

to ```
Der (Cl
Theta)
```

is then something very like the Veblen derivative.
The least family of sets *containing* `(U,T)`

and *closed under* ```
Theta =
\(A,B)->(Phi(A,B),Psi(A,B))
```

could perhaps be defined as
follows.

(U',T') = Cl Theta (Succ (U,T))

How does it look if we take `(U0,T0)`

for the
`contained' family, and von-Neumann successor for the operator on
families? Then

U = U0, which is empty, (so T is irrelevant), Phi(A,B) = S A, Psi(A,B) 0 = A Psi(A,B) (s a) = B a,Hmm. I don't know what to make of that.

Slight generalisation of the `singleton' functional. (I have the feeling that this can be subsumed in the simpler functional.) Given a family of families of sets, and a family of operators from families of sets to families of sets, to define the least family of sets containing each of the given family of sets, and closed under each of the given families of operators. (This is roughly speaking what Peter Dybjer appears to suggest gives us an analogue of the Veblen hierarchy.)

Suppose we have

- a family
`(I,\i->(U i,T i))`

of families of sets`(U i : Set, T i : U -> Set)`

where`i : I`

. - a family of operators from families of sets to families of
sets. These operators can be split apart into the form:
Phi j : (Pi A : Set) (A -> Set) -> Set Psi j : (Pi A : Set, B : A -> Set) Phi j (A,B) -> Set

where`j : J`

and`J : Set`

.

U' = data $u (i : I) | $t (i : I, u : U i) | $phi (j : J, a : U', b : T' u -> U') | $psi (j : J, a : U', b : T' u -> U', c : Phi j (T' u, T' . t) ) T' ($u i) = U i T' ($t i u) = T i u T' ($phi j a b) = Phi j (T' a) (T' . b) T' ($psi j a b c) = Psi j (T' a) (T' . b) c

`(A,B)`

where `A : Set`

and `B : A -> Set`

. So `Fam Set`

is a type, or large set. Let us however ignore this "size consideration" for the moment
and pretend that `F = Fam Set`

is a set, and so can be the state space
of a transition system. We can put the following transition structure on `F`

.
The `(A,B)`

are defined to have type
Tfam (A,B) = (Sigma a : A) B a -> A : SetThe

(A,B)[a,b] = (B a, B . b) : Fwhere `

`_ . _`

' refers to composition. To see that this
type checks, remember that `a : A`

, ```
b : B a ->
A
```

and `B : A -> Set`

, so `B a : Set`

and
`B . b : B a -> Set`

.
A chain starting from `(A,B) : Fam Set`

has the form

(A,B) | | | (a,b) : (Sigma a : A) B a -> A | V (A',B') = (B a, B . b) | | | (a',b') : (Sigma a' : B a) (B . b) a' -> B a | V (A'',B'') = ((B . b) a', B . b . b') | | | (a'',b'') : (Sigma a'' : (B . b) a') (B . b . b') a'' -> (B . b) a' | V (A''',B''') = ((B . b . b') a'', B . b . b' . b'')and so on. Each successive family of sets is a "internal refinement" or restriction of its predecessor.

Now, what about size questions? I have defined a transition system to
have a *set* of states. The situation appears to be this. For
each fixed family of sets `(A,B)`

we can put a transition
structure on the set of "internal-families" of `(A,B)`

, ie
the set

sfam (A,B) = (Sigma a : A) B a -> ANamely, we take as transition sets

T : sfam (A,B) -> Set T(a,b) = (Sigma c : B a) B (b c) -> B a = sfam (B a, B . b)and as destinations

_[_] : (Pi (a,b) : sfam (A,B)) T (a,b) -> sfam (A,B) (a,b)[c,d] = (b c, b . d) where a : A b : B a -> A c : B a d : B (b c) -> B aIn other words, we can replace talk of "sets" by talk of "small sets".

It may be interesting to chase up a result of
Michael Rathjen's on the strength of Martin-Löf type theory
with a super-universe, parts I and II. Part I is the Archive for
Mathematical Logic, somewhere; part II is on his web page. I think it
concerns a version of Martin-Löf type theory without
generalised inductive definitions, but with something like a
super-universe closed under the `next universe' operator. (And natural
numbers?) If I recall correctly, he located the strength of this
system using the Veblen hierarchy based upon the `G`

(Gamma) function: he calls the ordinal `Phi (G_0) 0`

, and
I guess that `Phi`

is the Veblen hierarchy on ```
G_a
```

.

Another thing here is Peter Aczel's hierarchy of
`diagonalisation' functionals from his JSL 1972 paper on describing
ordinals by means of functionals of transfinite type. It could be
that the functionals considered here are in some way analogous to these
diagonalisation functions. Peter Aczel identified a certain ordinal
above the Bachmann-Howard ordinal (as it were, using not epsilon
numbers in the third number class, but *Gamma*> numbers
to index the diagonalisations) with the least countable ordinal
that cannot be described by diagonalisation functionals of
autonomously transfinite types. Who knows, maybe this ordinal will
turn up here if one thinks about functionals of transfinite
type.

Another thing to look at may be Herman Jervell's constructions from the 70's. None of the Edinburgh libraries seem to have the relevant volume of LNM.

Can't one formulate in this framework a notion of regularity, and deal with regular ordinals? After all, an ordinal is regular if (it is infinite and) closed under sums of sequences which are indexed by a smaller ordinal; the property presents itself as a closure property.

There is a (very possibly completely confused) suggestion for a regularity-based definition of `the ordinals' (up to, I presume the first inaccessible) as a transition system in the middle of the following page.

How does one account for recursive definitions in Peter's style? Are these universes merely fixed points, or least fixed points?

Peter Hancock Last modified: Sun Sep 19 10:16:50 BST 1999