Index

A

abort on exception

: C example, 117

accuracy, 230

: floating-point operations, 4
: significant digits (number of), 19
: threshold, 29

adb, 61

addrans

: random number utilities, 49

argument reduction

: trigonometric functions, 48

B

base conversion

: base 10 to base 2, 50
: base 2 to base 10, 50
: formatted I/O, 50

Bessel functions, 230

C

C driver

: example, call FORTRAN subroutines from C, 119

clock speed, 136

compiler option

-Xa

: X/Open behavior for libm, 229

-Xt

: SVID behavior for libm, 229

conversion between number sets, 20

conversions between decimal strings and binary floating-point numbers, 4

convert_external

: binary floating-point, 49
: data conversion, 49

D

data types

: relation to IEEE formats, 5

dbx, 61

decimal representation

: maximum positive normal number, 18
: minimum positive normal number, 18
: precision, 18
: ranges, 18

double-precision representation

: C example, 90
: FORTRAN example, 91

E

errno.h

: define values for errno, 229

examine the accrued exception bits

: C example, 104

examine the accrued exception flags

: C example, 106

F

f77_floatingpoint.h

define handler types

: FORTRAN, 69

-fast, 135

floating-point

: exceptions list, 4
: rounding direction, 4
: rounding precision, 4
: tutorial, 147

floating-point accuracy

: decimal strings and binary floating-point numbers, 4

floating-point exceptions, 2, 133

abort on exceptions, 117

accrued exception bits, 104

common exceptions, 54

default result, 55

definition, 54

flags, 57

: accrued, 57
: current, 57

ieee_functions, 39

ieee_retrospective, 45

list of exceptions, 54

priority, 57

trap precedence, 57

floating-point options, 130

floating-point queue (FQ), 132

floating-point status register (FSR), 127, 132

floating-point unit, 137

: disable on SPARC, 137
: enable on SPARC, 137

floating-point unit (FPU), 130, 131

floatingpoint.h

define handler types

: C and C++, 69

flush to zero (see Store 0), 23

fmod, 230

-fnonstd, 135

fpversion, 136

G

generate an array of numbers

: FORTRAN example, 92

Goldberg paper, 147

: abstract, 148
: acknowledgments, 216
: details, 201
: IEEE standard, 165
: IEEE standards, 161
: introduction, 148
: references, 216
: rounding error, 149
: summary, 215
: systems aspects, 187

gradual underflow

: error properties, 25

H

HUGE

: compatibility with IEEE standard, 226

HUGE_VAL

: compatibility with IEEE standard, 226

I

IEEE double extended format

biased exponent

: x86 architecture, 14

bit-field assignment

: x86 architecture, 14

fraction

: x86 architecture, 14

Inf

: SPARC architecture, 13
: x86 architecture, 16

NaN

: x86 architecture, 18

normal number

: SPARC architecture, 13
: x86 architecture, 16

quadruple precision

: SPARC architecture, 12

sign bit

: x86 architecture, 15

significand

explicit leading bit

: x86 architecture, 14

subnormal number

: SPARC architecture, 13
: x86 architecture, 16

IEEE double format

biased exponent, 8

bit patterns and equivalent values, 10

bit-field assignment, 8

denormalized number, 10

fraction, 8

: storage on SPARC, 8
: storage on x86, 8

implicit bit, 10

Inf, infinity, 10

NaN, not a number, 11

IEEE formats

: relation to language data types, 5

IEEE single format

biased exponent, 6

biased exponent, implicit bit, 7

bit assignments, 6

bit patterns and equivalent values, 7

bit-field assignment, 6

denormalized number, 7

fraction, 6

Inf, positive infinity, 7

mixed number, significand, 7

NaN, not a number, 8

normal number

: maximum positive, 7

normal number bit pattern, 6

precision, normal number, 7

sign bit, 6

subnormal number bit pattern, 6

IEEE Standard 754

: double extended format, 4
: double format, 3
: single format, 3

ieee_flags, 42

accrued exception flag, 42

examine accrued exception bits

: C example, 104

rounding direction, 42

rounding precision, 42, 44

set exception flags

: C example, 107

truncate rounding, 43

ieee_functions

: bit mask operations, 38
: floating-point exceptions, 39

ieee_handler, 69

abort on exception

: FORTRAN example, 117

example, calling sequence, 62

trap on common exceptions, 54

trap on exception

: C example, 109

ieee_retrospective

: check underflow exception flag, 135
: floating-point exceptions, 44
: floating-point status register (FSR), 45
: getting information about nonstandard IEEE modes, 44
: getting information about outstanding exceptions, 44
: nonstandard_arithmetic in effect, 45
: precision, 44
: rounding, 44
: suppress exception messages, 46

ieee_sun

: IEEE classification functions, 38

ieee_values

: quadruple-precision values, 40
: representing floating-point values, 40
: representing Inf, 40
: representing NaN, 40
: representing normal number, 40
: single-precision values, 40

ieee_values functions

: C example, 99

Inf, 2, 227

: default result of divide by zero, 55

L

lcrans

: random number utilities, 49

libm

: SVID compliance, 227

default directories

: executables, 32
: header files, 32

list of functions, 32

standard installation, 32

libm functions

: double precision, 37
: quadruple precision, 37
: single precision, 37

libsunmath

default directories

: executables, 33
: header files, 33

list of functions, 34

standard installation, 33

M

MAXFLOAT, 229

N

NaN, 2, 14, 227, 230

nonstandard_arithmetic, 135

turn off IEEE gradual underflow, 135

underflow, 46

: gradual, 46

normal number

: maximum positive, 7
: minimum positive, 23, 27

number line

: binary representation, 19
: decimal representation, 19
: powers of 2, 26

O

operating system math library

: libm.a, 32
: libm.so, 32

P

pi

: infinitely precise value, 48

PowerPC

: bit pattern values, 13
: double format, 8
: double-extended format, 12
: IEEE arithmetic, 3
: quad-precision values, 41
: ranges and precisions, 18
: the FPSCR register, 57
: underflow thresholds, 22

Q

quadruple-precision representation

: FORTRAN example, 91

quiet NaN

: default result of invalid operation, 55

R

random number generators, 92

random number utilities

: shufrans, 49

represent double-precision value

: C example, 90
: FORTRAN example, 92

represent single-precision value

: C example, 90

rounding direction, 4, 25

: C example, 102
: ulp (unit in the last place), 25

rounding precision, 4

roundoff error

accuracy

: loss of, 24

S

set exception flags

: C example, 107

shufrans

: shuffle pseudo-random numbers, 49

single format, 6

single precision representation

: C example, 90

SPARC

: FPU, 137

square root instruction, 133, 227

standard_arithmetic

: turn on IEEE behavior, 135

Store 0, 23

: flush underflow results, 27, 28

subnormal number, 27, 132

: floating-point calculations, 23

SVID behavior of libm

: -Xt compiler option, 229

SVID exceptions

errno set to EDOM

: improper operands, 226

errno set to ERANGE

: overflow or underflow, 226

matherr, 226

PLOSS, 230

TLOSS, 230

System V Interface Definition (SVID), 225

T

trap, 131

: abort on exception, 117
: ieee_retrospective, 45

trap on exception

: C example, 109, 110

trap on floating-point exceptions

: C example, 109

trigonometric functions

: argument reduction, 48

tutorial, floating-point, 147

U

underflow

: floating-point calculations, 22
: gradual, 23, 132
: nonstandard_arithmetic, 46
: threshold, 27

underflow thresholds

: double extended precision, 22
: double precision, 22
: single precision, 22

unordered comparison

: floating-point values, 56
: NaN, 56

V

values.h

: define error messages, 229

X

X/Open behavior of libm

: -Xa compiler option, 229

X_TLOSS, 229

-Xa, 229

-Xc, 229

-Xs, 229

-Xt, 229