What exactly are the encodings of the DOS code pages?

DOS uses `code pages' for `IBM OEM' encoding of fonts. There are six code pages supplied with DOS 5.0:

  437 (English)
  850 (Multilingual - Latin I)
  852 (Slavic - Latin II)
  860 (Portugal)
  863 (Canadian French)
  865 (Nordic)

(The character code range 0 - 127 is the same in all code pages.)
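
As a rough check of that claim, the following Python sketch compares the lower half of each code page, assuming that the codec tables bundled with Python (cp437, cp850, and so on) are an acceptable stand-in for the DOS definitions. The helper names lower_half and upper_half_differences are made up for this example. Note that these codecs treat codes 0 - 31 as control characters, not as the glyphs DOS displays at those positions.

```python
# The six code pages supplied with DOS 5.0, under the names Python's codecs use.
CODE_PAGES = ["cp437", "cp850", "cp852", "cp860", "cp863", "cp865"]


def lower_half(codec):
    """Decode codes 0-127 with the given codec (hypothetical helper)."""
    return bytes(range(128)).decode(codec)


def upper_half_differences(codec_a, codec_b):
    """List the codes 128-255 that the two codecs decode differently."""
    differing = []
    for code in range(128, 256):
        a = bytes([code]).decode(codec_a, errors="replace")
        b = bytes([code]).decode(codec_b, errors="replace")
        if a != b:
            differing.append(code)
    return differing


reference = lower_half(CODE_PAGES[0])
for codec in CODE_PAGES:
    verdict = "same as" if lower_half(codec) == reference else "differs from"
    changed = len(upper_half_differences(CODE_PAGES[0], codec))
    print(f"{codec}: 0-127 {verdict} {CODE_PAGES[0]}; "
          f"{changed} codes in 128-255 differ")
```

This only shows that Python's tables agree with each other on 0 - 127; it says nothing about whether those tables match what the DOS fonts actually draw.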

The problem is that Microsoft's idea of defining a code page is to show a low-resolution printout of the glyphs! That is fine for the letters of the alphabet, numerals and the obvious punctuation marks, but worthless for accents (is it `cedilla' or `ogonek'? is it `caron' or `breve'?) and many other characters. For example, 249 is a small dot, while 250 is a slightly larger dot. Is one of these supposed to be `bullet' (which already occurs at 7)? Or is one of them maybe supposed to be `middot' or `dotcentered'? Is 228 supposed to be `Sigma' or `summation'? Is 225 supposed to be `beta' or `germandbls'? And so on.

And what is the character that looks like `Pt' in code position 158?

Anyway, surely there is a table somewhere that defines precisely what these encodings are supposed to be. That is, a table that gives for each code number the name and/or a description of the character.
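
One way to get something like such a table today is to ask a Unicode-aware language how it decodes each code. The Python sketch below does that for the doubtful positions mentioned above. It assumes that Python's bundled cp437 codec is a reasonable interpretation of the code page, which is one mapping and not necessarily the definition the vendor had in mind. The function name describe is invented for this example.

```python
import unicodedata


def describe(code, codec="cp437"):
    """Return (code point, Unicode name) for one code in a code page.

    Relies on Python's codec tables, which are an assumption here,
    not an official statement of what the DOS glyph is meant to be.
    """
    char = bytes([code]).decode(codec, errors="replace")
    name = unicodedata.name(char, "<no Unicode name>")
    return "U+%04X" % ord(char), name


# The positions questioned in the text: 7, 158, 225, 228, 249 and 250.
for code in (7, 158, 225, 228, 249, 250):
    code_point, name = describe(code)
    print(f"{code:3d}  {code_point}  {name}")
```

With the codec tables in current Python releases, 158 is reported as PESETA SIGN, 225 as LATIN SMALL LETTER SHARP S, 228 as GREEK CAPITAL LETTER SIGMA, 249 as BULLET OPERATOR and 250 as MIDDLE DOT. Code 7 comes back as a control character with no name, because the codec does not model the glyphs DOS shows for codes below 32. Passing a different codec name, such as "cp852", gives the same kind of listing for the other code pages.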


Excerpted from The comp.fonts FAQ, Copyright © 1992-96 by Norman Walsh