Exceptions and Exception Handling

4

This chapter describes IEEE floating-point exceptions and shows how to detect, locate, and handle them.

This chapter is organized into the following sections:

What Is an Exception?
Detecting Exceptions
Locating an Exception
Handling Exceptions

The floating point environment provided by Sun WorkShop Compilers 4.2 and the Solaris operating system on SPARC, Intel, and PowerPC systems supports all of the exception handling facilities required by the IEEE standard as well as many of the recommended optional facilities. One objective of these facilities is explained in the IEEE 854 Standard (IEEE 854, page 18):

... to minimize for users the complications arising from exceptional conditions. The arithmetic system is intended to continue to function on a computation as long as possible, handling unusual situations with reasonable default responses, including setting appropriate flags.

To achieve this objective, the standards specify default results for exceptional operations and require that an implementation provide status flags, which may be sensed, set, or cleared by a user, to indicate that exceptions have occurred. The standards also recommend that an implementation provide a means for a program to trap (i.e., interrupt normal control flow) when an exception occurs. The program can optionally supply a trap handler that handles the exception in an appropriate manner, for example by providing an alternate result for the exceptional operation and resuming execution. This chapter lists the exceptions defined by IEEE 754 along with their default results and describes the features of the floating point environment that support status flags, trapping, and exception handling.

What Is an Exception?

It is hard to define exceptions. To quote W. Kahan,

An arithmetic exception arises when an attempted atomic arithmetic operation has no result that would be acceptable universally. The meanings of atomic and acceptable vary with time and place. (See Handling Arithmetic Exceptions by W. Kahan.)

For example, an exception arises when a program attempts to take the square root of a negative number. (This example is one case of an invalid operation exception.) When such an exception occurs, the system responds in one of two ways:

If the exception's trap is disabled (the default case), the system records the fact that the exception occurred and continues executing the program using the default result specified by IEEE 754 for the excepting operation.
If the exception's trap is enabled, the system generates a SIGFPE signal. If the program has installed a SIGFPE signal handler, the system transfers control to that handler; otherwise, the program aborts.

IEEE 754 defines five basic types of floating point exceptions: invalid operation, division by zero, overflow, underflow and inexact. The first three (invalid, division, and overflow) are sometimes collectively called common exceptions. These exceptions can seldom be ignored when they occur. ieee_handler(3m) gives an easy way to trap on common exceptions only. The other two exceptions (underflow and inexact) are seen more often--in fact, most floating point operations incur the inexact exception--and they can usually, though not always, be safely ignored.

Table 4-1 condenses information found in IEEE Standard 754-1985. It describes the five floating-point exceptions and the default response of an IEEE arithmetic environment when these exceptions are raised.

Table 4-1 IEEE Floating-Point Exceptions
IEEE Exception	Reason Why This Arises	Example	Default Result When Trap is Disabled
Invalid operation	An operand is invalid for the operation about to be performed. For Intel this exception can also occur because of a stack fault.	sqrt(-x) for x > 0; fp_op(signaling_NaN); 0 × ∞; 0 / 0; ∞ / ∞; x REM 0; Unordered comparison (see note (1)); Invalid conversions (see note (2))	Quiet NaN
Division by zero	Divisor is zero, and dividend is a finite nonzero number, or, more generally, when an exact infinite result is delivered by an operation on finite operands.	x / 0 for x ≠ 0, ±∞, or NaN; log(0)	Correctly signed infinity
Overflow	Correctly rounded result is larger in magnitude than the largest number in the destination precision (that is, the range of the exponent is exceeded).	Double precision: MAXDOUBLE + 1.0e294; exp(709.8). Single precision: (float)MAXDOUBLE; MAXFLOAT + 1.0e32; expf(88.8)	Depends on rounding mode (RM) and the sign of the intermediate result (positive / negative): nearest: +∞ / -∞; tozero: +max / -max; positive: +∞ / -max; negative: +max / -∞
Underflow	Underflow occurs whenever a nonzero result computed, as though the exponent range and the precision were unbounded, lies strictly between ±2^E_min (see note (3)).	Double precision: nextafter(min_normal, -∞); nextafter(min_subnormal, -∞); MINDOUBLE / 3.0; exp(-708.5). Single precision: (float)MINDOUBLE; nextafterf(MINFLOAT, -∞); expf(-87.4)	Subnormal or zero
Inexact	Rounded result of a valid operation is different from the infinitely precise result.	2.0 / 3.0; (float)1.12345678; log(1.1); MAXDOUBLE + MAXDOUBLE, when no overflow trap	The result of the operation (rounded, overflowed, or underflowed, as the case may be)
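The overflow row of Table 4-1 can be restated as a small lookup. The following C sketch is illustrative only: the enum and function names are hypothetical, it covers double precision, and it models the table rather than querying the rounding mode actually in effect.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Rounding modes named as in Table 4-1 (hypothetical enum for this sketch). */
enum rounding_mode { RM_NEAREST, RM_TOZERO, RM_POSITIVE, RM_NEGATIVE };

/*
 * Default double-precision result of an overflowed operation when the
 * overflow trap is disabled. `sign` is +1 or -1, the sign of the
 * intermediate result.
 */
double overflow_default(enum rounding_mode rm, int sign)
{
    assert(sign == 1 || sign == -1);

    switch (rm) {
    case RM_TOZERO:
        /* Rounding toward zero never reaches infinity. */
        return sign > 0 ? DBL_MAX : -DBL_MAX;
    case RM_POSITIVE:
        /* Rounding up: positive results go to +inf, negative stop at -max. */
        return sign > 0 ? INFINITY : -DBL_MAX;
    case RM_NEGATIVE:
        /* Rounding down: negative results go to -inf, positive stop at +max. */
        return sign > 0 ? DBL_MAX : -INFINITY;
    case RM_NEAREST:
    default:
        return sign > 0 ? INFINITY : -INFINITY;
    }
}
```

For instance, `overflow_default(RM_TOZERO, -1)` yields `-DBL_MAX`, matching the "tozero" entry for a negative intermediate result.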

Notes for Table 4-1

: 1. Unordered comparison: Any pair of floating-point values can be compared, even if they are not of the same format. Four mutually exclusive relations are possible: less than, greater than, equal, or unordered. Unordered means that at least one of the operands is a NaN (not a number).

Every NaN compares "unordered" with everything, including itself. Table 4-2 shows which predicates cause the invalid operation exception when the relation is unordered.

Table 4-2 Unordered Comparisons
Predicate (math)	Predicate (C, C++)	Predicate (f77)	Invalid Exception if Unordered
=	==	.EQ.	no
≠	!=	.NE.	no
>	>	.GT.	yes
≥	>=	.GE.	yes
<	<	.LT.	yes
≤	<=	.LE.	yes

: 2. Invalid conversion: Attempt to convert NaN or infinity to an integer, or integer overflow on conversion from floating-point format.
: 3. E_min: the minimum exponent is -126, -1022 and -16382, for IEEE single, double, and extended precisions. See Chapter 2, "IEEE Arithmetic," for a description of the IEEE floating-point formats.
: 4. The Intel floating point environment provides another exception not mentioned in the IEEE standards: the denormal operand exception. This exception is raised whenever a floating point operation is performed on a subnormal number.
: 5. Exceptions are prioritized in the following order: invalid (highest priority), overflow, division, underflow, inexact. On Intel only, the denormal operand exception ranks below all of these and so has the lowest priority.

The only combinations of exceptions that can occur simultaneously are overflow with inexact and underflow with inexact (and denormal operand on Intel). If trapping on overflow, underflow, and inexact is enabled, the overflow and underflow traps take precedence over the inexact trap; they all take precedence over a denormal operand trap on Intel.
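The behavior of the predicates in note 1 and Table 4-2 can be observed directly in C. The sketch below is a minimal illustration with hypothetical helper names; it shows only the truth value of each predicate when an operand is a NaN, not whether the invalid flag is raised.

```c
#include <assert.h>

/* Produce a quiet NaN at run time; volatile keeps the compiler from
 * folding the division away. */
double make_nan(void)
{
    volatile double zero = 0.0;
    double n = zero / zero;

    assert(n != n);   /* a NaN is the only value unequal to itself */
    return n;
}

/* Two values are unordered when at least one of them is a NaN. */
int is_unordered(double a, double b)
{
    return a != a || b != b;
}

/* Bit i of the result is the truth value of the i-th predicate,
 * in the row order of Table 4-2: ==, !=, >, >=, <, <=. */
unsigned predicate_bits(double a, double b)
{
    return (unsigned)(a == b)
         | (unsigned)(a != b) << 1
         | (unsigned)(a >  b) << 2
         | (unsigned)(a >= b) << 3
         | (unsigned)(a <  b) << 4
         | (unsigned)(a <= b) << 5;
}
```

When either operand is a NaN, `predicate_bits` returns 0x02: only `!=` is true, and every ordered predicate is false.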

Detecting Exceptions

As required by the IEEE standard, the floating point environments on SPARC, Intel, and PowerPC systems provide status flags that record the occurrence of floating point exceptions. These flags can be tested by a program in order to detect which exceptions have occurred. The flags can also be explicitly set and cleared. The ieee_flags function provides one way to access these flags.

For example, to clear the overflow exception flag from FORTRAN, use:

          character*8 out
          call ieee_flags('clear', 'exception', 'overflow', out)

To determine whether an exception has occurred from C or C++, use:

i = ieee_flags("get", "exception", in, out);

When the action is "get", the string returned in out is:

"not available" -- if information on exceptions is not available
"" (an empty string) -- if there are no accrued exceptions or, in the case of Intel, the denormal operand is the only accrued exception
the name of the exception named in the third argument, in, if that argument names an exception that has occurred
otherwise, the name of the highest priority exception that has occurred.
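The selection rules in this list can be modeled in a few lines of C. This is a sketch of the documented behavior, not the library's implementation: the flag constants and `model_get_out` are hypothetical, the priority order is the one given in note 5 of Table 4-1, and the "not available" and Intel denormal operand cases are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for this model only; the real bit positions are
 * machine-dependent. */
enum {
    X_INVALID = 1, X_OVERFLOW = 2, X_DIVISION = 4,
    X_UNDERFLOW = 8, X_INEXACT = 16
};

/* Exceptions in priority order, highest first. */
static const struct { const char *name; unsigned bit; } by_priority[] = {
    { "invalid",   X_INVALID   },
    { "overflow",  X_OVERFLOW  },
    { "division",  X_DIVISION  },
    { "underflow", X_UNDERFLOW },
    { "inexact",   X_INEXACT   }
};

/*
 * Model of the string left in out by a "get" "exception" call:
 *  - the name in `in`, when `in` names an exception that has accrued;
 *  - otherwise the highest-priority accrued exception;
 *  - "" when nothing has accrued.
 */
const char *model_get_out(unsigned accrued, const char *in)
{
    size_t i, n = sizeof by_priority / sizeof by_priority[0];

    assert(in != NULL);

    for (i = 0; i < n; i++)
        if (strcmp(in, by_priority[i].name) == 0 &&
            (accrued & by_priority[i].bit))
            return by_priority[i].name;

    for (i = 0; i < n; i++)
        if (accrued & by_priority[i].bit)
            return by_priority[i].name;

    return "";
}
```

With invalid and inexact accrued, `model_get_out(X_INVALID | X_INEXACT, "division")` returns "invalid", because division has not occurred and invalid has the highest priority.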

For example, in the FORTRAN call:


          character*8 out
          i = ieee_flags('get', 'exception', 'division', out)

the string returned in out is "division" if the division-by-zero exception has occurred; otherwise it is the name of the highest priority exception that has occurred. Note that in is ignored unless it names a particular exception; for example, the argument "all" is ignored in the C call:

i = ieee_flags("get", "exception", "all", out);


Besides returning the name of an exception in out, ieee_flags returns an integer value that combines all of the exception flags currently raised. This value is the bitwise "or" of all the accrued exception flags, where each flag is represented by a single bit as shown in Table 4-3. The positions of the bits corresponding to each exception are given by the fp_exception_type values defined in the file sys/ieeefp.h. (Note that these bit positions are machine-dependent and need not be contiguous.)


Table 4-3 Exception Bits
Exception	Bit Position	Accrued Exception Bit
invalid	fp_invalid	i & (1 << fp_invalid)
overflow	fp_overflow	i & (1 << fp_overflow)
division	fp_division	i & (1 << fp_division)
underflow	fp_underflow	i & (1 << fp_underflow)
inexact	fp_inexact	i & (1 << fp_inexact)
denormalized	fp_denormalized	i & (1 << fp_denormalized) (Intel only)

This fragment of a C or C++ program shows one way to decode the return value.

    /*
     * Decode integer that describes all accrued exceptions.
     * fp_inexact etc. are defined in <sys/ieeefp.h>
     */
    char *out;
    int code, invalid, division, overflow, underflow, inexact;

    code = ieee_flags("get", "exception", "", &out);
    printf("out is %s, code is %d, in hex: 0x%08X\n", out, code, code);
    inexact   = (code >> fp_inexact)   & 0x1;
    division  = (code >> fp_division)  & 0x1;
    underflow = (code >> fp_underflow) & 0x1;
    overflow  = (code >> fp_overflow)  & 0x1;
    invalid   = (code >> fp_invalid)   & 0x1;
    printf("%d %d %d %d %d \n", invalid, division, overflow, underflow, inexact);

Locating an Exception

Often, programmers do not write programs with exceptions in mind, so when an exception is detected, the first question asked is: Where did the exception occur? Of course, one way to locate where an exception occurs is to test the exception flags at various points throughout a program, but to isolate an exception precisely by this approach can require many tests and carry a significant overhead.

An easier way to determine where an exception occurs is to enable its trap. When an exception whose trap is enabled occurs, the Solaris operating system notifies the program by sending a SIGFPE signal (see the signal(5) manual page). Thus, by enabling trapping for an exception, you can determine where the exception occurs either by running under a debugger and stopping on receipt of a SIGFPE signal or by establishing a SIGFPE handler that prints the address of the instruction where the exception occurred. Note that trapping must be enabled for an exception to generate a SIGFPE signal; when trapping is disabled and an exception occurs, the corresponding flag is set and execution continues with the default result specified in Table 4-1, but no signal is delivered.

Using the Debuggers to Locate an Exception

This section gives examples showing how to use dbx (source-level debugger) and adb (assembly-level debugger) to investigate the cause of a floating-point exception and locate the instruction that raised it. Recall that in order to use the source-level debugging features of dbx, programs should be compiled with the -g flag. Refer to the WorkShop: Command-Line Utilities guide, the chapters on debugging, for more information.

Consider the following program:

          program ex
          double precision x, y, routine
          x = -4.2d0
          y = routine(x)
          print * , x, y
          end

          double precision function routine(x)
          double precision x
          routine = sqrt(x) - 1.0d0
          return
          end

Compiling and running this program produces:

     -4.2000000000000 NaN
     Note: IEEE floating-point exception flags raised:
        Inexact; Invalid Operation;
     See the Numerical Computation Guide, ieee_flags(3M)

To determine the cause of the invalid operation exception, you can recompile with the -ftrap option to enable trapping on invalid operations and use either dbx or adb to locate the site at which a SIGFPE signal is delivered. Alternatively, you can use adb or dbx without recompiling the program by linking with a startup routine that enables the invalid operation trap or by manually enabling the trap.

Using dbx to Locate the Instruction Causing an Exception

The simplest way to locate the code that causes a floating-point exception is to recompile with the -g and -ftrap flags and then use dbx to track down the location where the exception occurs. First, recompile the program as follows:


example% `f77 -g -ftrap=invalid ex.f`

Compiling with -g allows you to use the source-level debugging features of dbx. Specifying -ftrap=invalid causes the program to run with trapping enabled for invalid operation exceptions.

Next, invoke dbx, issue the catch fpe command to stop when a SIGFPE is issued, and run the program. The result is:

    example% `dbx a.out`
    Reading symbolic information for a.out
    Reading symbolic information for rtld /usr/lib/ld.so.1
    Reading symbolic information for libF77.so.3
    Reading symbolic information for libsunmath.so.1
    Reading symbolic information for libm.so.1
    Reading symbolic information for libc.so.1
    Reading symbolic information for libdl.so.1
    (dbx) `catch fpe`
    (dbx) `run`
    Running: a.out
    (process id 17516)
    signal FPE (invalid floating point operation) in routine at line 10 in file "ex.f"
       10         routine = sqrt(x) - 1.0d0
    (dbx) `print x`
    x = -4.2
    (dbx)

The output shows that the exception occurred in the routine function as a result of attempting to take the square root of a negative number.

Using adb to Locate the Instruction Causing an Exception

You can also use adb to identify the cause of an exception, although adb can't locate the source file and line number as dbx can. Again, the first step is to recompile with -ftrap:


example% `f77 -ftrap=invalid ex.f`

Now invoke adb and run the program. When an invalid operation exception occurs, adb stops at an instruction following the one that caused the exception. To find the instruction that caused the exception, disassemble several instructions and look for the last floating point instruction prior to the instruction at which adb has stopped. On SPARC, the result might resemble the following transcript.

    example% `adb a.out`
    `:r`
    SIGFPE 8: numerical exception (invalid floating point operation)
    stopped at      routine_+0x20:  ldd     [%l0], %f2
    `routine_+10?5i`
    routine_+0x10:  ld      [%l0 + 0x4], %f3
                    fsqrtd  %f2, %f4
                    sethi   %hi(0x11c00), %l0
                    or      %l0, 0x1d8, %l0
                    ldd     [%l0], %f2
    `<f2=F`
                    -4.2000000000000002e+00

The output shows that the exception was caused by an fsqrtd instruction. Examining the source register shows that the exception was a result of attempting to take the square root of a negative number.

On Intel, because instructions do not have a fixed length, finding the correct address from which to disassemble the code may involve some trial and error. In this example, the exception occurs close to the beginning of a function, so we can disassemble from there. The following might be a typical result.

    example% `adb a.out`
    `:r`
    SIGFPE: Arithmetic Exception (invalid floating point operation)
    stopped at      routine_+0x13:  faddp   %st,%st(1)
    `routine_?8i`
    routine_:
    routine_:       pushl   %ebp
                    movl    %esp,%ebp
                    subl    $0x24,%esp
                    fldl    $0x8048b88
                    movl    0x8(%ebp),%eax
                    fldl    (%eax)
                    fsqrt
                    faddp   %st,%st(1)
    `$x`
    80387 chip is present.
    cw      0x137e
    sw      0x3000
    cssel 0x17  ipoff 0x8acd        datasel 0x1f  dataoff 0x0

    st[0]   -4.2000000000000001776356839    VALID
    st[1]   -1.0                            VALID
    st[2]   +0.0                            EMPTY
    st[3]   +0.0                            EMPTY
    st[4]   +0.0                            EMPTY
    st[5]   +0.0                            EMPTY
    st[6]   +0.0                            EMPTY
    st[7]   +0.0                            EMPTY

The output reveals that the exception was caused by a fsqrt instruction; examination of the floating point registers reveals that the exception was a result of attempting to take the square root of a negative number.

On PowerPC, adb always stops on the instruction immediately after the one that causes a trap, as the following sample session shows. The example also shows, however, that locating an invalid square root operation on PowerPC can be more difficult than locating a similar exception on SPARC or Intel because current PowerPC processors do not provide a floating point square root instruction in hardware. Finding the root cause of the exception, therefore, requires more diligence.

First, locate the instruction that caused the exception and examine the source registers:

    example% `adb a.out`
    `:r`
    SIGFPE: Arithmetic Exception (invalid floating point operation)
    stopped at      __ieee754_sqrt+0x7c:    b       __ieee754_sqrt+0x2a0
    `__ieee754_sqrt+78?2i`
    __ieee754_sqrt+0x78:    fdiv    %f1,%f0,%f0
    __ieee754_sqrt+0x7c:    b       __ieee754_sqrt+0x2a0
    `$x`
    F0:     +0.0000000e+00  -4.2000000e+00  -NaN
    F3:     -NaN            -NaN            -NaN
    F6:     -NaN            -NaN            -NaN
    F9:     -NaN            -NaN            -NaN
    F12:    -NaN            -NaN            -NaN
    F15:    -NaN            -NaN            -NaN
    F18:    -NaN            -NaN            -NaN
    F21:    -NaN            -NaN            -NaN
    F24:    -NaN            -NaN            -NaN
    F27:    -NaN            -NaN            -NaN
    F30:    -NaN            -4.2000000e+00

The preceding output indicates that the exception occurred as a result of attempting to divide 0 by 0. The function name __ieee754_sqrt, however, suggests that the exception arose in a subroutine to compute a square root. A stack backtrace confirms that __ieee754_sqrt was called from __d_sqrt, which is the internal name for the function d_sqrt. We set a breakpoint at the location where __d_sqrt is called, rerun the program, and determine the argument passed to it:

    `$c`
    __ieee754_sqrt() + 7c
    sqrt() + 2c
    __d_sqrt() + 14
    routine_() + 14
    MAIN_() + 44
    main() + 44
    `routine_+10:b`
    `:r`
    breakpoint      routine+0x10:   bl      __d_sqrt
    `<r3/F`
    GPB.driver.x:
    GPB.driver.x:   -4.2000000000000002e+00

The preceding output shows that the exception was caused by attempting to take the square root of a negative number.

Enabling Traps Without Recompilation

In the preceding examples, trapping on invalid operation exceptions was enabled by recompiling the main subprogram with the -ftrap flag. In some cases, recompiling the main program may not be possible, so you may need to resort to other means to enable trapping. There are several ways to do this.

If the object files and libraries that comprise the program are available, you can enable trapping merely by relinking the program with an appropriate initialization routine. First, create a C source file similar to the following.

    #include <ieeefp.h>

    #pragma init (trapinvalid)

    void trapinvalid()
    {
        /* FP_X_INV et al are defined in ieeefp.h */
        fpsetmask(FP_X_INV);
    }

Now compile this file to create an object file and link the original program with this object file:

    example% `cc -c init.c`
    example% `f77 ex.o init.o`
    example% `a.out`
    Floating point exception 7, invalid operand, occurred at address 8048afd.
    Abort

Even if relinking is not possible, you can still enable traps manually while running dbx or adb by directly modifying the floating point status register. This can be somewhat tricky because the Solaris operating system does not enable the floating point unit until the first time it is used within a program, at which point the floating point status register is reset with all traps disabled. Thus, you cannot manually enable trapping until after the program has executed at least one floating point instruction.

In our example, the floating point unit has already been accessed by the time the routine function is called, so we can set a breakpoint on entry to that function, enable trapping on invalid operation exceptions, instruct dbx to stop on the receipt of a SIGFPE signal, and continue execution. On SPARC, the steps are as follows.

    example% `dbx a.out`
    Reading symbolic information for a.out
    Reading symbolic information for rtld /usr/lib/ld.so.1
    Reading symbolic information for libF77.so.3
    Reading symbolic information for libsunmath.so.1
    Reading symbolic information for libm.so.1
    Reading symbolic information for libc.so.1
    Reading symbolic information for libdl.so.1
    (dbx) `stop in routine_`
    dbx: warning: 'routine_' has no debugger info -- will trigger on first instruction
    (2) stop in routine_
    (dbx) `run`
    Running: a.out
    (process id 6631)
    stopped in routine_ at 0x10c00
    0x00010c00: routine_       :        save    %sp, -0x68, %sp
    (dbx) `assign $fsr=0x08000000`
    dbx: warning: unknown language, 'fortran' assumed
    (dbx) `catch fpe`
    (dbx) `cont`
    signal FPE (invalid floating point operation) in routine_ at 0x10c20
    0x00010c20: routine_+0x0020:    ldd     [%l0], %f2
    (dbx)
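The value 0x08000000 assigned to $fsr in this session sets a single bit in the trap enable mask (TEM) field of the SPARC floating point status register. The following C sketch shows how such a value can be composed; the bit positions are taken from the SPARC architecture definition of the FSR rather than from this chapter, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Trap enable mask (TEM) bit positions in the SPARC FSR, bits 27..23.
 * These come from the SPARC architecture manual, not from this chapter. */
enum {
    FSR_TEM_INEXACT   = 23,  /* NXM */
    FSR_TEM_DIVISION  = 24,  /* DZM */
    FSR_TEM_UNDERFLOW = 25,  /* UFM */
    FSR_TEM_OVERFLOW  = 26,  /* OFM */
    FSR_TEM_INVALID   = 27   /* NVM */
};

/* Return fsr with the trap for the given TEM bit position enabled. */
uint32_t fsr_enable_trap(uint32_t fsr, int tem_bit)
{
    assert(tem_bit >= FSR_TEM_INEXACT && tem_bit <= FSR_TEM_INVALID);
    return fsr | (UINT32_C(1) << tem_bit);
}

/* Return nonzero if the trap for the given TEM bit position is enabled. */
int fsr_trap_enabled(uint32_t fsr, int tem_bit)
{
    assert(tem_bit >= FSR_TEM_INEXACT && tem_bit <= FSR_TEM_INVALID);
    return (int)((fsr >> tem_bit) & 1u);
}
```

`fsr_enable_trap(0, FSR_TEM_INVALID)` gives 0x08000000, the value used in the session. Note that the dbx assign command overwrites the whole register, whereas this sketch preserves the other bits of whatever value is passed in.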

Using a Signal Handler to Locate an Exception

The previous section presented several methods for enabling trapping at the outset of a program in order to locate the first occurrence of an exception. In contrast, you can isolate any particular occurrence of an exception by enabling trapping within the program itself. If you enable trapping but do not install a SIGFPE handler, the program will abort on the next occurrence of the trapped exception. Alternatively, if you install a SIGFPE handler, the next occurrence of the trapped exception will cause the system to transfer control to the handler, which can then print diagnostic information, such as the address of the instruction where the exception occurred, and either abort or resume execution. (In order to resume execution with any prospect for a meaningful outcome, the handler may need to supply a result for the exceptional operation as described in the next section.)

You can use ieee_handler to simultaneously enable trapping on any of the five IEEE floating-point exceptions and either request that the program abort when the specified exception occurs or establish a SIGFPE handler. You can also install a SIGFPE handler using one of the lower-level functions sigfpe(3), signal(3c), or sigaction(2); however, these functions do not enable trapping as ieee_handler does. (Remember that a floating point exception triggers a SIGFPE signal only when its trap is enabled.)

`ieee_handler(3m)`

The syntax of a call to ieee_handler is:

             i = ieee_handler(action, exception, handler)

The two input parameters action and exception are strings. The third input parameter, handler, is of type sigfpe_handler_type, which is defined in floatingpoint.h (or f77_floatingpoint.h for FORTRAN programs).

The three input parameters can take the following values:

Input Parameter	C or C++ Type	Possible Value
action	`char *`	get, set, clear
exception	`char *`	invalid, division, overflow, underflow, inexact, all, common
handler	`sigfpe_handler_type`	user-defined routine, `SIGFPE_DEFAULT`, `SIGFPE_IGNORE`, `SIGFPE_ABORT`

When the requested action is "set", ieee_handler establishes the handling function specified by handler for the specified exception. The handling function may be SIGFPE_DEFAULT or SIGFPE_IGNORE, both of which select the default IEEE behavior, SIGFPE_ABORT, which causes the program to abort on the occurrence of any of the named exceptions, or the address of a user-supplied subroutine, which causes that subroutine to be invoked (with the parameters described in the sigaction(2) manual page for a signal handler installed with the SA_SIGINFO flag set) when any of the named exceptions occurs. If the handler is SIGFPE_DEFAULT or SIGFPE_IGNORE, ieee_handler also disables trapping on the specified exceptions; for any other handler, ieee_handler enables trapping.

When the requested action is "clear", ieee_handler revokes whatever handling function is currently installed for the specified exception and disables its trap. (This is the same as "set"ting SIGFPE_DEFAULT.) The third parameter is ignored when action is "clear".

For both the "set" and "clear" actions, ieee_handler returns 0 if the requested action is available and a nonzero value otherwise.

When the requested action is "get", ieee_handler returns the address of the handler currently installed for the specified exception (or SIGFPE_DEFAULT, if no handler is installed).
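The rules above for when ieee_handler enables or disables trapping can be summarized as a small state model. This is a sketch of the documented behavior for a single exception, not the library's implementation; all names in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the handler argument (hypothetical names for this model). */
enum handler_kind { H_DEFAULT, H_IGNORE, H_ABORT, H_USER };

/* Per-exception state: which handler is installed and whether the
 * exception's trap is enabled. */
struct exc_state {
    enum handler_kind handler;
    int trap_enabled;
};

/* "set": install the handler; only H_ABORT and H_USER enable trapping. */
void model_set(struct exc_state *s, enum handler_kind h)
{
    assert(s != NULL);
    s->handler = h;
    s->trap_enabled = (h == H_ABORT || h == H_USER);
}

/* "clear": the same as "set" with the default handler. */
void model_clear(struct exc_state *s)
{
    model_set(s, H_DEFAULT);
}

/* "get": report the handler currently installed. */
enum handler_kind model_get(const struct exc_state *s)
{
    assert(s != NULL);
    return s->handler;
}
```

In this model, setting `H_ABORT` or `H_USER` turns the trap on, and `model_clear` always returns the exception to the default, untrapped state.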

The following examples show a few code fragments illustrating the use of ieee_handler. This C code causes the program to abort on division by zero:

    #include <sunmath.h>

    if (ieee_handler("set", "division", SIGFPE_ABORT) != 0)
        printf("ieee trapping not supported here \n");

Here is the equivalent FORTRAN code:

    #include <f77_floatingpoint.h>
          i = ieee_handler('set', 'division', SIGFPE_ABORT)
          if(i.ne.0) print *,'ieee trapping not supported here'

This C fragment restores IEEE default exception handling for all exceptions:

    #include <sunmath.h>

    if (ieee_handler("clear", "all", 0) != 0)
        printf("could not clear exception handlers\n");

Here is the same action in FORTRAN:

          i = ieee_handler('clear', 'all', 0)
          if (i.ne.0) print *, 'could not clear exception handlers'

Reporting an Exception from a Signal Handler

When a SIGFPE handler installed via ieee_handler is invoked, the operating system provides additional information indicating the type of exception that occurred, the address of the instruction that caused it, and the contents of the machine's integer and floating point registers. The handler can examine this information and print a message identifying the exception and the location at which it occurred.

To access the information supplied by the system, declare the handler as follows. (The remainder of this chapter presents sample code in C; see Appendix A, "Examples," for examples of SIGFPE handlers in FORTRAN.)

#include <siginfo.h>
#include <ucontext.h>

void handler(int sig, siginfo_t *sip, ucontext_t *uap)
{
    ...
}

When the handler is invoked, the sig parameter contains the number of the signal that was sent. Signal numbers are defined in sys/signal.h; the SIGFPE signal number is 8.

The sip parameter points to a structure that records additional information about the signal. For a SIGFPE signal, the relevant members of this structure are sip->si_code and sip->si_addr (see sys/siginfo.h). The significance of these members depends on the system and on what event triggered the SIGFPE signal.

The sip->si_code member is one of the SIGFPE signal types listed in Table 4-4. (The tokens shown are defined in sys/machsig.h.)

Table 4-4 Types for Arithmetic Exceptions

SIGFPE Type	IEEE Type
FPE_INTDIV
FPE_INTOVF
FPE_FLTRES	inexact
FPE_FLTDIV	division
FPE_FLTUND	underflow
FPE_FLTINV	invalid
FPE_FLTOVF	overflow

As the table shows, each type of IEEE floating point exception has a corresponding SIGFPE signal type. Integer division by zero (FPE_INTDIV) and integer overflow (FPE_INTOVF) are also included among the SIGFPE types, but because they are not IEEE floating point exceptions you cannot install handlers for them via ieee_handler. (You can install handlers for these SIGFPE types via sigfpe(3); note, though, that integer division by zero is either ignored or generates SIGILL on PowerPC systems, and integer overflow is ignored on all SPARC, Intel, and PowerPC systems. On SPARC and Intel systems, special instructions can cause the delivery of a SIGFPE signal of type FPE_INTOVF, but Sun compilers do not generate these instructions.)
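The correspondence in Table 4-4 can be written as a small lookup function. The following is a minimal sketch, not part of the Sun libraries: the function name fpe_ieee_name is hypothetical, and the sketch assumes a POSIX system on which the FPE_ tokens are available from <signal.h> (on Solaris they are defined in sys/machsig.h, as noted above).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper: map a SIGFPE si_code value to the IEEE
 * exception name used in Table 4-4.  Returns NULL for codes that
 * do not correspond to an IEEE floating point exception.
 */
const char *fpe_ieee_name(int si_code)
{
    switch (si_code) {
    case FPE_FLTRES: return "inexact";
    case FPE_FLTDIV: return "division";
    case FPE_FLTUND: return "underflow";
    case FPE_FLTINV: return "invalid";
    case FPE_FLTOVF: return "overflow";
    case FPE_INTDIV:            /* integer exceptions: no IEEE type */
    case FPE_INTOVF:
    default:
        return NULL;
    }
}
```

A handler could pass sip->si_code to such a function to print a name instead of a raw number.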

For a SIGFPE signal corresponding to an IEEE floating point exception, the sip->si_code member indicates which exception occurred on SPARC and PowerPC systems, while on Intel systems it indicates the highest priority exception whose flag is raised (excluding the denormal operand flag). The sip->si_addr member holds the address of the instruction that caused the exception on SPARC and PowerPC systems, and on Intel systems it holds the address of the instruction at which the trap was taken (usually the next floating point instruction following the one that caused the exception).

Finally, the uap parameter points to a structure that records the state of the system at the time the trap was taken. The contents of this structure are system-dependent; see sys/reg.h for definitions of some of its members.
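On systems that do not provide ieee_handler, a handler with the same three-parameter shape can be installed through the POSIX sigaction interface with the SA_SIGINFO flag. The following is a minimal sketch under that assumption, not the mechanism this chapter describes. The names install_fpe_handler, last_fpe_signal, and last_fpe_code are hypothetical. Note that POSIX declares the third parameter as void * (it points to a ucontext_t), and that installing a handler this way does not by itself enable trapping on any floating point exception.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* values recorded by the handler; sig_atomic_t is safe to write there */
static volatile sig_atomic_t fpe_sig = 0;
static volatile sig_atomic_t fpe_code = 0;

/* three-parameter handler: signal number, signal info, saved context */
static void fpe_handler(int sig, siginfo_t *sip, void *uap)
{
    (void)uap;                  /* context is not examined in this sketch */
    fpe_sig = sig;
    fpe_code = sip->si_code;
}

/* install fpe_handler for SIGFPE; returns 0 on success, -1 on failure */
int install_fpe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fpe_handler;
    sa.sa_flags = SA_SIGINFO;   /* request the siginfo_t and context */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}

/* accessors for the values recorded by the most recent SIGFPE */
int last_fpe_signal(void) { return (int)fpe_sig; }
int last_fpe_code(void)   { return (int)fpe_code; }
```

The handler only records what it was given. Formatting and printing are left to ordinary code, because few library functions are safe to call from a signal handler.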

Using the information provided by the operating system, we can write a SIGFPE handler that reports the type of exception that occurred and the address of the instruction that caused it. The following example shows such a handler.

#include <stdio.h>
#include <sys/ieeefp.h>
#include <sunmath.h>
#include <siginfo.h>
#include <ucontext.h>

void handler(int sig, siginfo_t *sip, ucontext_t *uap)
{
    unsigned code, addr;

#ifdef i386
    unsigned sw;

    /* raised flags that are not masked by the control word */
    sw = uap->uc_mcontext.fpregs.fp_reg_set.fpchip_state.status &
        ~uap->uc_mcontext.fpregs.fp_reg_set.fpchip_state.state[0];
    if (sw & (1 << fp_invalid))
        code = FPE_FLTINV;
    else if (sw & (1 << fp_division))
        code = FPE_FLTDIV;
    else if (sw & (1 << fp_overflow))
        code = FPE_FLTOVF;
    else if (sw & (1 << fp_underflow))
        code = FPE_FLTUND;
    else if (sw & (1 << fp_inexact))
        code = FPE_FLTRES;
    else
        code = 0;
    addr = uap->uc_mcontext.fpregs.fp_reg_set.fpchip_state.state[3];
#else
    code = sip->si_code;
    addr = (unsigned) sip->si_addr;
#endif
    fprintf(stderr, "fp exception %x at address %x\n", code, addr);
}

int main()
{
    double x;

    /* trap on common floating point exceptions */
    if (ieee_handler("set", "common", handler) != 0)
        printf("Did not set exception handler\n");

    /* cause an underflow exception (will not be reported) */
    x = min_normal();
    printf("min_normal = %g\n", x);
    x = x / 13.0;
    printf("min_normal / 13.0 = %g\n", x);

    /* cause an overflow exception (will be reported) */
    x = max_normal();
    printf("max_normal = %g\n", x);
    x = x * x;
    printf("max_normal * max_normal = %g\n", x);

    ieee_retrospective(stderr);
    return 0;
}

On SPARC and PowerPC systems, the output from this program resembles the following:

min_normal = 2.22507e-308
min_normal / 13.0 = 1.7116e-309
max_normal = 1.79769e+308
fp exception 4 at address 10d0c
max_normal * max_normal = 1.79769e+308
Note: IEEE floating-point exception flags raised:
    Inexact; Underflow;
IEEE floating-point exception traps enabled:
    overflow; division by zero; invalid operation;
See the Numerical Computation Guide, ieee_flags(3M), ieee_handler(3M)

On Intel systems, the operating system saves a copy of the accrued exception flags and then clears them before invoking a SIGFPE handler. Unless the handler takes steps to preserve them, the accrued flags are lost once the handler returns. Thus, the output from the preceding program does not indicate that an underflow exception was raised:

min_normal = 2.22507e-308
min_normal / 13.0 = 1.7116e-309
max_normal = 1.79769e+308
fp exception 4 at address 8048fe6
max_normal * max_normal = 1.79769e+308
Note: IEEE floating-point exception traps enabled:
    overflow; division by zero; invalid operation;
See the Numerical Computation Guide, ieee_handler(3M)

Note that the instruction that causes the exception need not deliver the IEEE default result when trapping is enabled: in the preceding outputs, the value reported for max_normal * max_normal is not the default result for an operation that overflows (i.e., a correctly signed infinity). In general, a SIGFPE handler must supply a result for an operation that causes a trapped exception in order to continue the computation with meaningful values. The next section shows how this can be done.

Handling Exceptions

Historically, most numerical software has been written without regard to exceptions (for a variety of reasons), and many programmers have become accustomed to environments in which exceptions cause a program to abort immediately. Now, some high-quality software packages such as LAPACK are being carefully designed to avoid exceptions such as division by zero and invalid operations and to scale their inputs aggressively to preclude overflow and potentially harmful underflow. Neither of these approaches to dealing with exceptions is appropriate in every situation, however. Ignoring exceptions can pose problems when one person writes a program or subroutine that is intended to be used by someone else (perhaps someone who does not have access to the source code). Attempting to avoid all exceptions can require many defensive tests and branches and carry a significant cost (see Demmel and Li, "Faster Numerical Algorithms via Exception Handling", IEEE Trans. Comput. 43 (1994), pp. 983-992).

The default exception response, status flags, and optional trapping facility of IEEE arithmetic are intended to provide a third alternative: continuing a computation in the presence of exceptions and either detecting them after the fact or intercepting and handling them as they occur. As noted above, ieee_flags can be used to detect exceptions after the fact, and ieee_handler can be used to enable trapping and install a SIGFPE handler to intercept exceptions as they occur. In order to continue the computation, however, the IEEE standard recommends that a trap handler be able to provide a result for the operation that incurred an exception. In the Solaris operating system, this can be accomplished using the uap parameter supplied to a signal handler.

Recall that a signal handler may be declared in C as follows:

#include <siginfo.h>
#include <ucontext.h>

void handler(int sig, siginfo_t *sip, ucontext_t *uap)
{
    ...
}

When a SIGFPE signal handler is invoked as a result of a trapped floating point exception, the uap parameter points to a data structure that contains a copy of the machine's integer and floating point registers as well as other system-dependent information describing the exception. If the signal handler returns normally, the saved data are restored and the program resumes execution at the point at which the trap was taken. Thus, by accessing and decoding the information in the data structure that describes the exception and possibly modifying the saved data, a SIGFPE handler can substitute a user-supplied value for the result of an exceptional operation and continue computation. As an illustration, the following examples show how to substitute a scaled result for an operation that underflows or overflows.

Substituting IEEE Trapped Under/Overflow Results

The IEEE standard recommends that when underflow and overflow are trapped, the system should provide a way for a trap handler to substitute an exponent-wrapped result, i.e., a value that agrees with what would have been the rounded result of the operation that underflowed or overflowed except that the exponent is wrapped around the end of its usual range, thereby effectively scaling the result by a power of two. The scale factor is chosen to map underflowed and overflowed results as nearly as possible to the middle of the exponent range so that subsequent computations will be less likely to underflow or overflow further. By keeping track of the number of underflows and overflows that occur, a program can scale the final result to compensate for the exponent wrapping. This under/overflow "counting mode" can be used to produce accurate results in computations that would otherwise exceed the range of the available floating point formats. (See P. Sterbenz, Floating-Point Computation.)
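The arithmetic of exponent wrapping can be imitated in software, without any trap handler. The following sketch is an illustration of counting mode only, not the mechanism the hardware or the handlers in this section use. It assumes finite, nonzero double operands and the double precision exponent adjustment of 1536 specified by IEEE 754-1985, which matches the two scalings by 2^768 in the Intel handler later in this section. The names counted and counted_mul are hypothetical.

```c
#include <float.h>
#include <math.h>

#define WRAP 1536   /* exponent adjustment for double (IEEE 754-1985) */

/* a value together with the net number of exponent wraps applied */
typedef struct {
    double value;
    int wraps;      /* +1 per overflow wrap, -1 per underflow wrap */
} counted;

/*
 * Multiply c.value by x.  If the product would overflow or fall
 * below the normal range, wrap its exponent by WRAP and adjust the
 * count instead.  The true product is value * 2^(WRAP * wraps).
 */
counted counted_mul(counted c, double x)
{
    int ea, eb, em, e;
    double ma = frexp(c.value, &ea);    /* c.value = ma * 2^ea */
    double mb = frexp(x, &eb);          /* x = mb * 2^eb */
    double m = frexp(ma * mb, &em);     /* fraction product cannot overflow */

    e = ea + eb + em;                   /* exponent of the true product */
    if (e > DBL_MAX_EXP) {              /* would overflow */
        e -= WRAP;
        c.wraps++;
    } else if (e < DBL_MIN_EXP) {       /* would underflow */
        e += WRAP;
        c.wraps--;
    }
    c.value = ldexp(m, e);
    return c;
}
```

Starting from the value 1.0e300 with a count of zero, multiplying by 1.0e300 yields approximately 4.14884e+137 with a count of 1. Multiplying twice by 1.0e-300 then yields approximately 1 with a count of 0, the second multiplication having wrapped in the other direction.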

On SPARC, a program that substitutes exponent-wrapped results for trapped underflow and overflow in this way produces the following output:

159.309
1.59309e-28
1
4.14884e+137
4.14884e-163
1
Note: IEEE floating-point exception traps enabled:
    underflow; overflow;
See the Numerical Computation Guide, ieee_handler(3M)

On Intel, the floating point hardware provides the exponent-wrapped result when a floating point instruction incurs a trapped underflow or overflow and its destination is a register. When trapped underflow or overflow occurs on a floating point store instruction, however, the hardware traps without completing the store (and without popping the stack, if the store instruction is a store-and-pop). Thus, in order to implement counting mode, an under/overflow handler must generate the scaled result and fix up the stack when a trap occurs on a store instruction. The following example illustrates such a handler. Note that the sample program must be compiled with the following inline template file, which supplies dummy functions that force traps to be taken in proper sequence.

.inline dummyf,1
flds (%esp)
.end
.inline dummy,2
fldl (%esp)
.end

#include <stdio.h>
#include <ieeefp.h>
#include <math.h>
#include <sunmath.h>
#include <siginfo.h>
#include <ucontext.h>

/* offsets into the saved fp environment */
#define CW 0 /* control word */
#define SW 1 /* status word */
#define TW 2 /* tag word */
#define OP 4 /* opcode */
#define EA 5 /* operand address */

#define FPenv(x) uap->uc_mcontext.fpregs.fp_reg_set.fpchip_state.state[(x)]
#define FPreg(x) *(long double *)(10*(x)+(char*)&uap-> \
    uc_mcontext.fpregs.fp_reg_set.fpchip_state.state[7])

/*
 * Supply the IEEE 754 default result for trapped under/overflow
 */
void
ieee_trapped_default(int sig, siginfo_t *sip, ucontext_t *uap)
{
    double dscl;
    float fscl;
    unsigned sw, op, top;
    int mask, e;

    /* preserve flags for untrapped exceptions */
    sw = uap->uc_mcontext.fpregs.fp_reg_set.fpchip_state.status;
    FPenv(SW) |= (sw & (FPenv(CW) & 0x3f));

    /* if the excepting instruction is a store, scale the stack
       top, store it, and pop the stack if need be */
    fpsetmask(0);
    op = FPenv(OP) >> 16;
    switch (op & 0x7f8) {
    case 0x110:
    case 0x118:
    case 0x150:
    case 0x158:
    case 0x190:
    case 0x198:
        fscl = scalbnf(1.0f, (sip->si_code == FPE_FLTOVF)? -96 : 96);
        *(float *)FPenv(EA) = (FPreg(0) * fscl) * fscl;
        if (op & 8) {
            /* pop the stack */
            FPreg(0) = FPreg(1);
            FPreg(1) = FPreg(2);
            FPreg(2) = FPreg(3);
            FPreg(3) = FPreg(4);
            FPreg(4) = FPreg(5);
            FPreg(5) = FPreg(6);
            FPreg(6) = FPreg(7);
            top = (FPenv(SW) >> 10) & 0xe;
            FPenv(TW) |= (3 << top);
            top = (top + 2) & 0xe;
            FPenv(SW) = (FPenv(SW) & ~0x3800) | (top << 10);
        }
        break;

    case 0x510:
    case 0x518:
    case 0x550:
    case 0x558:
    case 0x590:
    case 0x598:
        dscl = scalbn(1.0, (sip->si_code == FPE_FLTOVF)? -768 : 768);
        *(double *)FPenv(EA) = (FPreg(0) * dscl) * dscl;
        if (op & 8) {
            /* pop the stack */
            FPreg(0) = FPreg(1);
            FPreg(1) = FPreg(2);
            FPreg(2) = FPreg(3);
            FPreg(3) = FPreg(4);
            FPreg(4) = FPreg(5);
            FPreg(5) = FPreg(6);
            FPreg(6) = FPreg(7);
            top = (FPenv(SW) >> 10) & 0xe;
            FPenv(TW) |= (3 << top);
            top = (top + 2) & 0xe;
            FPenv(SW) = (FPenv(SW) & ~0x3800) | (top << 10);
        }
        break;
    }
}

extern float dummyf(float);
extern double dummy(double);

int main()
{
    volatile float a, b;
    volatile double x, y;

    ieee_handler("set", "underflow", ieee_trapped_default);
    ieee_handler("set", "overflow", ieee_trapped_default);

    a = b = 1.0e30f;
    a = dummyf(a * b);
    printf( "%g\n", a );
    a = dummyf(a / b);
    printf( "%g\n", a );
    a = dummyf(a / b);
    printf( "%g\n", a );

    x = y = 1.0e300;
    x = dummy(x * y);
    printf( "%g\n", x );
    x = dummy(x / y);
    printf( "%g\n", x );
    x = dummy(x / y);
    printf( "%g\n", x );

    ieee_retrospective(stdout);
    return 0;
}

As on SPARC, the output from the preceding program on Intel is:

159.309
1.59309e-28
1
4.14884e+137
4.14884e-163
1
Note: IEEE floating-point exception traps enabled:
    underflow; overflow;
See the Numerical Computation Guide, ieee_handler(3M)

On PowerPC, the floating point hardware supplies the exponent-wrapped result for a floating point instruction that incurs a trapped underflow or overflow. An under/overflow handler need not take any action to provide the scaled result, as the following example shows.

#include <stdio.h>
#include <sunmath.h>

void ieee_trapped_default()
{
}

int main()
{
    volatile float a, b;
    volatile double x, y;

    ieee_handler("set", "underflow", ieee_trapped_default);
    ieee_handler("set", "overflow", ieee_trapped_default);

    a = b = 1.0e30f;
    a *= b;
    printf( "%g\n", a );
    a /= b;
    printf( "%g\n", a );
    a /= b;
    printf( "%g\n", a );

    x = y = 1.0e300;
    x *= y;
    printf( "%g\n", x );
    x /= y;
    printf( "%g\n", x );
    x /= y;
    printf( "%g\n", x );

    ieee_retrospective(stdout);
    return 0;
}

As expected, the output is:

159.309
1.59309e-28
1
4.14884e+137
4.14884e-163
1
Note: IEEE floating-point exception traps enabled:
    underflow; overflow;
See the Numerical Computation Guide, ieee_handler(3M)
