Index
A
- abort on exception
- C example,  117
 
 - accuracy,  230
- floating-point operations,  4
 - significant digits (number of),  19
 - threshold,  29
 
 - adb,  61
 - addrans
- random number utilities,  49
 
 - argument reduction
- trigonometric functions,  48
 
 
B
- base conversion
- base 10 to base 2,  50
 - base 2 to base 10,  50
 - formatted I/O,  50
 
 - Bessel functions,  230
 
C
- C driver
- example, call FORTRAN subroutines from C,  119
 
 - clock speed,  136
 - compiler option
- -Xa
- X/Open behavior for libm,  229
 
 - -Xt
- SVID behavior for libm,  229
 
 
 - conversion between number sets,  20
 - conversions between decimal strings and binary floating-point numbers,  4
 - convert_external
- binary floating-point,  49
 - data conversion,  49
 
 
D
- data types
- relation to IEEE formats,  5
 
 - dbx,  61
 - decimal representation
- maximum positive normal number,  18
 - minimum positive normal number,  18
 - precision,  18
 - ranges,  18
 
 - double-precision representation
- C example,  90
 - FORTRAN example,  91
 
 
E
- errno.h
- define values for errno,  229
 
 - examine the accrued exception bits
- C example,  104
 
 - examine the accrued exception flags
- C example,  106
 
 
F
- f77_floatingpoint.h
- define handler types
- FORTRAN,  69
 
 
 - -fast,  135
 - floating-point
- exceptions list,  4
 - rounding direction,  4
 - rounding precision,  4
 - tutorial,  147
 
 - floating-point accuracy
- decimal strings and binary floating-point numbers,  4
 
 - floating-point exceptions,  2, 133
- abort on exceptions,  117
 - accrued exception bits,  104
 - common exceptions,  54
 - default result,  55
 - definition,  54
 - flags,  57
- accrued,  57
 - current,  57
 
 - ieee_functions,  39
 - ieee_retrospective,  45
 - list of exceptions,  54
 - priority,  57
 - trap precedence,  57
 
 - floating-point options,  130
 - floating-point queue (FQ),  132
 - floating-point status register (FSR),  127, 132
 - floating-point unit,  137
- disable on SPARC,  137
 - enable on SPARC,  137
 
 - floating-point unit (FPU),  130, 131
 - floatingpoint.h
- define handler types
- C and C++,  69
 
 
 - flush to zero (see Store 0),  23
 - fmod,  230
 - -fnonstd,  135
 - fpversion,  136
 
G
- generate an array of numbers
- FORTRAN example,  92
 
 - Goldberg paper,  147
- abstract,  148
 - acknowledgments,  216
 - details,  201
 - IEEE standard,  165
 - IEEE standards,  161
 - introduction,  148
 - references,  216
 - rounding error,  149
 - summary,  215
 - systems aspects,  187
 
 - gradual underflow
- error properties,  25
 
 
H
- HUGE
- compatibility with IEEE standard,  226
 
 - HUGE_VAL
- compatibility with IEEE standard,  226
 
 
I
- IEEE double extended format
- biased exponent
- x86 architecture,  14
 
 - bit-field assignment
- x86 architecture,  14
 
 - fraction
- x86 architecture,  14
 
 - Inf
- SPARC architecture,  13
 - x86 architecture,  16
 
 - NaN
- x86 architecture,  18
 
 - normal number
- SPARC architecture,  13
 - x86 architecture,  16
 
 - quadruple precision
- SPARC architecture,  12
 
 - sign bit
- x86 architecture,  15
 
 - significand
- explicit leading bit
- x86 architecture,  14
 
 
 - subnormal number
- SPARC architecture,  13
 - x86 architecture,  16
 
 
 - IEEE double format
- biased exponent,  8
 - bit patterns and equivalent values,  10
 - bit-field assignment,  8
 - denormalized number,  10
 - fraction,  8
- storage on SPARC,  8
 - storage on x86,  8
 
 - implicit bit,  10
 - Inf, infinity,  10
 - NaN, not a number,  11
 - normal number,  10
 - precision,  10
 - sign bit,  9
 - significand,  10
 - subnormal number,  10
 
 - IEEE formats
- relation to language data types,  5
 
 - IEEE single format
- biased exponent,  6
 - biased exponent, implicit bit,  7
 - bit assignments,  6
 - bit patterns and equivalent values,  7
 - bit-field assignment,  6
 - denormalized number,  7
 - fraction,  6
 - Inf, positive infinity,  7
 - mixed number, significand,  7
 - NaN, not a number,  8
 - normal number
- maximum positive,  7
 
 - normal number bit pattern,  6
 - precision, normal number,  7
 - sign bit,  6
 - subnormal number bit pattern,  6
 
 - IEEE Standard 754
- double extended format,  4
 - double format,  3
 - single format,  3
 
 - ieee_flags,  42
- accrued exception flag,  42
 - examine accrued exception bits
- C example,  104
 
 - rounding direction,  42
 - rounding precision,  42, 44
 - set exception flags
- C example,  107
 
 - truncate rounding,  43
 
 - ieee_functions
- bit mask operations,  38
 - floating-point exceptions,  39
 
 - ieee_handler,  69
- abort on exception
- FORTRAN example,  117
 
 - example, calling sequence,  62
 - trap on common exceptions,  54
 - trap on exception
- C example,  109
 
 
 - ieee_retrospective
- check underflow exception flag,  135
 - floating-point exceptions,  44
 - floating-point status register (FSR),  45
 - getting information about nonstandard IEEE modes,  44
 - getting information about outstanding exceptions,  44
 - nonstandard_arithmetic in effect,  45
 - precision,  44
 - rounding,  44
 - suppress exception messages,  46
 
 - ieee_sun
- IEEE classification functions,  38
 
 - ieee_values
- quadruple-precision values,  40
 - representing floating-point values,  40
 - representing Inf,  40
 - representing NaN,  40
 - representing normal number,  40
 - single-precision values,  40
 
 - ieee_values functions
- C example,  99
 
 - Inf,  2, 227
- default result of divide by zero,  55
 
 
L
- lcrans
- random number utilities,  49
 
 - libm
- SVID compliance,  227
 - default directories
- executables,  32
 - header files,  32
 
 - list of functions,  32
 - standard installation,  32
 
 - libm functions
- double precision,  37
 - quadruple precision,  37
 - single precision,  37
 
 - libmil (see also in-line templates),  132, 227
 - libsunmath
- default directories
- executables,  33
 - header files,  33
 
 - list of functions,  34
 - standard installation,  33
 
 
M
- MAXFLOAT,  229
 
N
- NaN,  2, 14, 227, 230
 - nonstandard_arithmetic,  135
- turn off IEEE gradual underflow,  135
 - underflow,  46
- gradual,  46
 
 
 - normal number
- maximum positive,  7
 - minimum positive,  23, 27
 
 - number line
- binary representation,  19
 - decimal representation,  19
 - powers of 2,  26
 
 
O
- operating system math library
- libm.a,  32
 - libm.so,  32
 
 
P
- pi
- infinitely precise value,  48
 
 - PowerPC
- bit pattern values,  13
 - double format,  8
 - double-extended format,  12
 - IEEE arithmetic,  3
 - quad-precision values,  41
 - ranges and precisions,  18
 - the FPSCR register,  57
 - underflow thresholds,  22
 
 
Q
- quadruple-precision representation
- FORTRAN example,  91
 
 - quiet NaN
- default result of invalid operation,  55
 
 
R
- random number generators,  92
 - random number utilities
- shufrans,  49
 
 - represent double-precision value
- C example,  90
 - FORTRAN example,  92
 
 - represent single-precision value
- C example,  90
 
 - rounding direction,  4, 25
- C example,  102
 - ulp (unit in the last place),  25
 
 - rounding precision,  4
 - roundoff error
- accuracy
- loss of,  24
 
 
 
S
- set exception flags
- C example,  107
 
 - shufrans
- shuffle pseudo-random numbers,  49
 
 - single format,  6
 - single-precision representation
- C example,  90
 
 - SPARC
- FPU,  137
 
 - square root instruction,  133, 227
 - standard_arithmetic
- turn on IEEE behavior,  135
 
 - Store 0,  23
- flush underflow results,  27, 28
 
 - subnormal number,  27, 132
- floating-point calculations,  23
 
 - SVID behavior of libm
- -Xt compiler option,  229
 
 - SVID exceptions
- errno set to EDOM
- improper operands,  226
 
 - errno set to ERANGE
- overflow or underflow,  226
 
 - matherr,  226
 - PLOSS,  230
 - TLOSS,  230
 
 - System V Interface Definition (SVID),  225
 
T
- trap,  131
- abort on exception,  117
 - ieee_retrospective,  45
 
 - trap on exception
- C example,  109, 110
 
 - trap on floating-point exceptions
- C example,  109
 
 - trigonometric functions
- argument reduction,  48
 
 - tutorial, floating-point,  147
 
U
- underflow
- floating-point calculations,  22
 - gradual,  23, 132
 - nonstandard_arithmetic,  46
 - threshold,  27
 
 - underflow thresholds
- double extended precision,  22
 - double precision,  22
 - single precision,  22
 
 - unordered comparison
- floating-point values,  56
 - NaN,  56
 
 
V
- values.h
- define error messages,  229
 
 
X
- X/Open behavior of libm
- -Xa compiler option,  229
 
 - X_TLOSS,  229
 - -Xa,  229
 - -Xc,  229
 - -Xs,  229
 - -Xt,  229