

Sun ANSI C Compiler-Specific Information

3


The Sun ANSI C compiler is compatible with the C language described in the American National Standard for Programming Language--C, ANSI/ISO 9899-1990. This chapter documents those areas specific to the Sun ANSI C compiler.


Environment Variables

TMPDIR

cc normally creates temporary files in the directory /tmp. You can specify another directory by setting the environment variable TMPDIR to the directory of your choice. However, if TMPDIR is not a valid directory, cc uses /tmp. The -xtemp option has precedence over the TMPDIR environment variable.

If you use a Bourne shell, type:

$ TMPDIR=dir; export TMPDIR

If you use a C shell, type:

% setenv TMPDIR dir

SUNPRO_SB_INIT_FILE_NAME

The absolute path name of the directory containing the .sbinit(5) file. This variable is used only if the -xsb or -xsbfast flag is used.

PARALLEL

(SPARC) Refer to "Environment Variable" in the MP C (SPARC) section of this chapter for details.


Global Behavior: Value versus unsigned Preserving

A program that depends on unsigned preserving arithmetic conversions behaves differently under ANSI C. This is considered to be the most serious change made by ANSI C.

In the first edition of K&R, The C Programming Language (Prentice-Hall, 1978), unsigned specified exactly one type; there were no unsigned chars, unsigned shorts, or unsigned longs, but most C compilers added these very soon thereafter.

In previous C compilers, the unsigned preserving rule is used for promotions: when an unsigned type needs to be widened, it is widened to an unsigned type; when an unsigned type mixes with a signed type, the result is an unsigned type.

The other rule, specified by ANSI C, came to be called "value preserving," in which the result type depends on the relative sizes of the operand types. When an unsigned char or unsigned short is widened, the result type is int if an int is large enough to represent all the values of the smaller type. Otherwise, the result type is unsigned int. The value preserving rule produces the least surprising arithmetic result for most expressions.

Only in the -Xt and -Xs modes does the compiler use the unsigned preserving promotions; in the other modes, -Xc and -Xa, the value preserving promotion rules are used. When the -xtransition option is used, the compiler warns about each expression whose behavior might depend on the promotion rules used.
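The difference between the two promotion rules can be seen in a small sketch. The function names below are illustrative only, and the sketch assumes a platform where int is wider than unsigned char, so that the value preserving rule promotes unsigned char to int. The second function emulates the unsigned preserving rule with an explicit cast instead of relying on a compilation mode.

```c
#include <limits.h>

/* Value preserving (ANSI C): uc is promoted to int, so for uc == 0
 * the expression uc - 1 is the int value -1, which is negative. */
int minus_one_is_negative(unsigned char uc)
{
    return (uc - 1) < 0;
}

/* Emulation of unsigned preserving: widening to unsigned int first
 * makes 0 - 1 wrap around to UINT_MAX instead of becoming negative. */
unsigned int unsigned_preserving_result(unsigned char uc)
{
    return (unsigned int)uc - 1u;
}
```

Under these assumptions, minus_one_is_negative(0) returns 1, while unsigned_preserving_result(0) returns UINT_MAX. Code that compares such a result against zero is the kind of expression whose behavior depends on the promotion rule in effect.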


Keywords

asm Keyword

The _asm keyword is a synonym for the asm keyword. asm is available under all compilation modes, although a warning is issued when it is used under the -Xc mode.

The asm statement has the form:

asm("string");

where string is a valid assembly language statement.

For example:

#include <stdio.h>

int main()
{
	int i;

	/* i = 10 (SPARC assembly; assumes i is stored at [%fp-8]) */
	asm("mov 10,%l0");
	asm("st  %l0,[%fp-8]");

	printf("i = %d\n",i);
	return 0;
}

% cc foo.c
% a.out
i = 10
%

asm statements must appear within function bodies.

_Restrict Keyword

For a compiler to effectively perform parallel execution of a loop, it needs to determine if certain lvalues designate distinct regions of storage. Aliases are lvalues whose regions of storage are not distinct. Determining if two pointers to objects are aliases is a difficult and time-consuming process because it could require analysis of the entire program.

Example: the function vsq()

void vsq(int n, double * a, double * b)
{
	int i;
	for (i=0; i<n; i++) b[i] = a[i] * a[i];
}

The compiler can parallelize the execution of the different iterations of the loops if it knows that pointers a and b access different objects. If there is an overlap in objects accessed through pointers a and b then it would be unsafe for the compiler to execute the loops in parallel. At compile time, the compiler does not know if the objects accessed by a and b overlap by simply analyzing the function vsq(); the compiler may need to analyze the whole program to get this information.

Restricted pointers are used to specify pointers which designate distinct objects so that the compiler can perform pointer alias analysis. To support restricted pointers, the keyword _Restrict is recognized by the Sun ANSI C compiler as an extension. Below is an example of declaring function parameters of vsq() as restricted pointers:

void vsq(int n, double * _Restrict a, double * _Restrict b)

Pointers a and b are declared as restricted pointers, so the compiler knows that the regions of storage pointed to by a and b are distinct. With this alias information, the compiler is able to parallelize the loop.

The _Restrict keyword is a type qualifier, like volatile, and it qualifies pointer types only. _Restrict is recognized as a keyword only for compilation modes -Xa (default) and -Xt. For these two modes, the compiler defines the macro __RESTRICT to enable users to write portable code with restricted pointers.

For example, the following code works on the Sun ANSI C compiler in all compilation modes, and should work on other compilers which do not support restricted pointers:

#ifdef __RESTRICT
#define restrict _Restrict
#else
#define restrict
#endif

void vsq(int n, double * restrict a, double * restrict b)
{
	int i;
	for (i=0; i<n; i++) b[i] = a[i] * a[i];
}

If restricted pointers become a part of the ANSI C Standard, it is likely that "restrict" will be the keyword. Users may want to write code with restricted pointers using:

#define restrict _Restrict

as in vsq() because this way there will be minimal changes should "restrict" become a keyword in the ANSI C Standard. The Sun ANSI C compiler uses _Restrict as the keyword because it is in the implementor's name space, so there is no conflict with identifiers in the user's name space.

There are situations where a user may not want to change the source code. One can specify pointer-valued function parameters to be treated as restricted pointers with the command-line option -xrestrict; refer to "-xrestrict=f" in Chapter 2, "cc Compiler Options," for details.

If a function list is specified, pointer parameters in the specified functions are treated as restricted; otherwise, all pointer parameters in the entire C file are treated as restricted. For example, -xrestrict=vsq would qualify the pointers a and b of the function vsq() shown earlier with the keyword _Restrict.

It is critical that _Restrict be used correctly. If pointers qualified as restricted pointers point to objects which are not distinct, loops may be incorrectly parallelized, resulting in undefined behavior. For example, assume that pointers a and b of function vsq() point to objects which overlap, such that b[i] and a[i+1] are the same object. If a and b are not declared as restricted pointers, the loops will be executed serially. If a and b are incorrectly qualified as restricted pointers, the compiler may parallelize the execution of the loops; this is not safe, because b[i+1] should only be computed after b[i] has been computed.
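The overlap described above can be reproduced with a short serial sketch. The helper overlapping_call() and the array contents are hypothetical; the loop is the vsq() loop without any restrict qualification, called so that b[i] and a[i+1] are the same object.

```c
/* Same loop as vsq(), with unqualified pointers. */
void vsq_serial(int n, double *a, double *b)
{
    int i;
    for (i = 0; i < n; i++)
        b[i] = a[i] * a[i];
}

/* Hypothetical helper: calls the loop with b == a + 1, so each
 * iteration reads the element written by the previous iteration. */
void overlapping_call(double x[4])
{
    x[0] = 2.0;
    x[1] = x[2] = x[3] = 0.0;
    vsq_serial(3, x, x + 1);
}
```

Executed serially, the array ends up as 2, 4, 16, 256, because every iteration squares the value stored by the one before it. If the iterations ran concurrently on the original contents, x[2] and x[3] could instead be computed from the stale zeros, which is why such pointers must not be qualified as restricted.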


long long Data Type

The Sun ANSI C compiler includes the data types long long and unsigned long long, which are similar to the data type long. long long can store 64 bits of information; long can store 32 bits of information. long long is not available in -Xc mode.

Printing long long Data Types

To print or scan long long data types, prefix the conversion specifier with the letters "ll." For example, to print llvar, a variable of long long data type, in signed decimal format, use:

printf("%lld\n", llvar);
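A slightly fuller sketch of printing and scanning with the ll prefix follows. The function names are illustrative, and the sketch assumes a C library whose printf and scanf families accept %lld.

```c
#include <stdio.h>

/* Formats v in signed decimal using the ll prefix. buf must have room
 * for at least 21 characters (sign, 19 digits, terminating null). */
int format_ll(char *buf, long long v)
{
    return sprintf(buf, "%lld", v);
}

/* Scans a signed decimal long long from text; returns 1 on success. */
int parse_ll(const char *text, long long *out)
{
    return sscanf(text, "%lld", out) == 1;
}
```

For example, formatting 1099511627776LL (2 to the 40th power, which does not fit in a 32-bit long) and parsing the result back yields the same value.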

Usual Arithmetic Conversions

Some binary operators convert the types of their operands to yield a common type, which is also the type of the result. These are called the usual arithmetic conversions:


Constants

This section contains information related to constants that is specific to the Sun ANSI C compiler.

Integral Constants

Decimal, octal, and hexadecimal integral constants can be suffixed to indicate type, as shown in Table 3-1.

Table 3-1 Data Type Suffixes

Suffix                                    Type
u or U                                    unsigned
l or L                                    long
ll or LL                                  long long (1)
lu, LU, Lu, lU, ul, uL, Ul, or UL         unsigned long
llu, LLU, LLu, llU, ull, ULL, uLL, Ull    unsigned long long (1)

(1) long long and unsigned long long are not available in -Xc mode.
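The effect of the suffixes in Table 3-1 can be checked with sizeof and with simple arithmetic. This is a minimal sketch with illustrative function names; it assumes a compiler that accepts the ll suffix.

```c
#include <stddef.h>

/* Size of the type selected by each suffix. */
size_t size_of_unsigned_constant(void)  { return sizeof(10U); }
size_t size_of_long_constant(void)      { return sizeof(10L); }
size_t size_of_long_long_constant(void) { return sizeof(10LL); }

/* 10U is unsigned, so 11 is converted to unsigned and the
 * subtraction wraps around to a large positive value. */
int unsigned_suffix_wraps(void) { return (10U - 11) > 0; }

/* Without a suffix both operands are int and the result is -1. */
int unsuffixed_goes_negative(void) { return (10 - 11) < 0; }
```

Each size function returns the size of the type named in the table, and both of the last two functions return 1.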

When assigning types to unsuffixed constants, the compiler uses the first of this list in which the value can be represented, depending on the size of the constant:

Character Constants

A multiple-character constant that is not an escape sequence has a value derived from the numeric values of each character. For example, the constant '123' has a value of:

Table 3-2 Multiple-character Constant (ANSI)

0    '3'    '2'    '1'

or 0x333231.

With the -Xs option and in other, non-ANSI versions of C, the value is:

Table 3-3 Multiple-character Constant (non-ANSI)

0    '1'    '2'    '3'

or 0x313233.
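The two layouts can be computed by hand from the individual characters. The sketch below uses illustrative function names and assumes 8-bit ASCII characters and at most four characters per constant. It does not use a multiple-character constant itself, because the value of such a constant is implementation-defined and other compilers may choose either layout.

```c
/* Layout of Table 3-2: the first character occupies the low-order byte. */
unsigned long pack_first_char_low(const char *s)
{
    unsigned long v = 0;
    int shift = 0;
    while (*s) {
        v |= (unsigned long)(unsigned char)*s++ << shift;
        shift += 8;
    }
    return v;
}

/* Layout of Table 3-3: the first character ends up in the
 * highest-order occupied byte. */
unsigned long pack_first_char_high(const char *s)
{
    unsigned long v = 0;
    while (*s)
        v = (v << 8) | (unsigned char)*s++;
    return v;
}
```

With the string "123", the first function returns 0x333231 and the second returns 0x313233, matching the two tables.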


Include Files

To include any of the standard header files supplied with the C compilation system, use this format:

#include <stdio.h> 

The angle brackets (<>) cause the preprocessor to search for the header file in the standard place for header files on your system, usually the /usr/include directory.

The format is different for header files that you have stored in your own directories:

#include "header.h"

The quotation marks (" ") cause the preprocessor to search for header.h first in the directory of the file containing the #include line.

If your header file is not in the same directory as the source files that include it, specify the path of the directory in which it is stored with the -I option to cc. Suppose, for instance, that you have included both stdio.h and header.h in the source file mycode.c:

#include <stdio.h>
#include "header.h"

Suppose further that header.h is stored in the directory ../defs. The command:

% cc -I../defs mycode.c

directs the preprocessor to search for header.h first in the directory containing mycode.c, then in the directory ../defs, and finally in the standard place. It also directs the preprocessor to search for stdio.h first in ../defs, then in the standard place. The difference is that the current directory is searched only for header files whose names you have enclosed in quotation marks.

You can specify the -I option more than once on the cc command-line. The preprocessor searches the specified directories in the order they appear. You can specify multiple options to cc on the same command-line:

% cc -o prog -I../defs mycode.c


Nonstandard Floating Point

IEEE 754 floating-point default arithmetic is "nonstop." Underflows are "gradual." A brief explanation follows. See the Numerical Computation Guide for details.

Nonstop means that execution does not halt on occurrences like division by zero, floating-point overflow, or invalid operation exceptions. For example, consider the following, where x is zero and y is positive:

z = y / x;

By default, z is set to the value +Inf, and execution continues. With the -fnonstd option, however, this code causes an exit, such as a core dump.
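The default nonstop behavior can be observed with a short sketch. The function names are illustrative, and the sketch assumes IEEE 754 arithmetic with the default (nonstop) exception handling. The volatile copies are there only to keep the division from being evaluated at compile time.

```c
#include <float.h>

/* Divides y by x at run time. */
double divide(double y, double x)
{
    volatile double num = y;
    volatile double den = x;
    return num / den;
}

/* Returns 1 if v is +Inf: only +Inf compares greater than DBL_MAX. */
int is_positive_infinity(double v)
{
    return v > DBL_MAX;
}
```

Under these assumptions, is_positive_infinity(divide(1.0, 0.0)) returns 1 and the program keeps running after the division.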

Here is how gradual underflow works. Suppose you have the following code:

x = 10;
for (i = 0; i < LARGE_NUMBER; i++)
	x = x / 10;

The first time through the loop, x is set to 1; the second time through, to 0.1; the third time through, to 0.01; and so on. Eventually, x reaches the lower limit of the machine's capacity to represent its value. What happens the next time the loop runs?

Suppose that the smallest number the machine can represent at full precision is:

1.234567e-38

The next time the loop runs, the number is modified by "stealing" from the mantissa and "giving" to the exponent:

1.23456e-39

and, subsequently,

1.2345e-40

and so on. This is known as "gradual underflow," which is the default behavior. In nonstandard behavior, none of this "stealing" takes place; typically, x is simply set to zero.
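The loop above can be instrumented to count how many results fall into the gradual underflow range. This is a sketch with an illustrative function name; it assumes single-precision IEEE 754 arithmetic with gradual underflow enabled, and it uses volatile so that each result is actually stored as a float.

```c
#include <float.h>

/* Divides x by 10 repeatedly, starting from 10, and counts the results
 * that are nonzero but smaller than FLT_MIN, the smallest float that
 * still has full precision. */
int count_subnormal_steps(void)
{
    volatile float x = 10.0f;
    int count = 0;
    int i;

    for (i = 0; i < 100 && x != 0.0f; i++) {
        x = x / 10.0f;
        if (x > 0.0f && x < FLT_MIN)
            count++;
    }
    return count;
}
```

With gradual underflow the function returns a positive count, because several values below FLT_MIN are still representable with reduced precision. With the nonstandard behavior, where x is set directly to zero, the count would be expected to be 0.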


Preprocessing Directives

This section describes assertions, pragmas, and predefined names.

Assertions

A line of the form:

#assert predicate (token-sequence) 

associates the token-sequence with the predicate in the assertion name space (separate from the space used for macro definitions). The predicate must be an identifier token.

#assert predicate 

asserts that predicate exists, but does not associate any token sequence with it.

The compiler provides the following predefined predicates by default (not in -Xc mode):

#assert system (unix)

#assert machine (sparc) (SPARC)

#assert machine (i386) (Intel)

#assert machine (ppc) (PowerPC)

#assert cpu (sparc) (SPARC)

#assert cpu (i386) (Intel)

#assert cpu (ppc) (PowerPC)

lint provides the following predefined predicate by default (not in -Xc mode):

#assert lint (on)       

Any assertion may be removed by using #unassert, which uses the same syntax as #assert. Using #unassert with no argument deletes all assertions on the predicate; specifying an assertion deletes only that assertion.

An assertion may be tested in a #if statement with the following syntax:

#if #predicate(non-empty token-list) 

For example, the predefined predicate system can be tested with the following line:

#if #system(unix) 

which evaluates true.

Pragmas

Preprocessing lines of the form:

#pragma pp-tokens 

specify implementation-defined actions.

The following #pragmas are recognized by the compilation system:

If pragma redefine_extname is encountered after the first use of "old_extname", as a function definition, an initializer, or an expression, the effect is undefined. (Not supported in -Xs and -Xc modes.)

#pragma weak symbol

defines symbol to be a weak symbol. The linker does not produce an error message if it does not find a definition for symbol.

#pragma weak symbol1 = symbol2

defines symbol1 to be a weak symbol, which is an alias for the symbol symbol2. This form of the pragma can only be used in the same translation unit where symbol2 is defined, either in the source file or in one of its included header files. Otherwise, a compilation error will result.

If your program calls but does not define symbol1, and symbol1 is a weak symbol in a library being linked, the linker uses the definition from that library. However, if your program defines its own version of symbol1, then the program's definition is used and the weak global definition of symbol1 in the library is not used. If the program directly calls symbol2, the definition from the library is used; a duplicate definition of symbol2 causes an error.

The compiler ignores unrecognized pragmas. Using the -v option will give a warning on unrecognized pragmas.

Predefined Names

The following identifier is predefined as an object-like macro:

Table 3-4 Predefined Identifier

Identifier    Description
__STDC__      __STDC__ 1     -Xc
              __STDC__ 0     -Xa, -Xt
              Not defined    -Xs

The compiler will issue a warning if __STDC__ is undefined (#undef __STDC__). __STDC__ is not defined in -Xs mode.

Predefinitions (not valid in -Xc mode):

The following predefinitions are valid in all modes:

The compiler also predefines the object-like macro

__PRAGMA_REDEFINE_EXTNAME

to indicate that the pragma will be recognized.

The following is predefined in -Xa and -Xt modes only:

__RESTRICT


MP C (SPARC)

SunSoft MP C is an extended ANSI C compiler that can optimize code to run on SPARC shared-memory multiprocessor machines. The process is called parallelizing. The compiled code can execute in parallel using the multiple processors on the system.

The SunSoft WorkShop includes the license required to use the features of MP C.

This section contains an overview and example of using MP C, and documents the environment variable, keyword, pragmas, and options used with MP C.

Refer to the "MP C" white paper, located in /opt/SUNWspro/READMEs/mpc.ps, for examples on using MP C and for further reference information.

Overview

The MP C compiler generates parallel code for those loops that it determines are safe to parallelize. Typically, these loops have iterations that are independent of each other. For such loops, it does not matter in what order the iterations are executed or if they are executed in parallel. Many, although not all, vector loops fall into this category.

Because of the way aliasing works in C, it is difficult to determine the safety of parallelization. To help the compiler, MP C offers pragmas and additional pointer qualifications to provide aliasing information known to the programmer that the compiler cannot determine.

Example of Use

The following example illustrates the use of MP C and how parallel execution can be controlled. To enable parallelization of the target program, the -xautopar option can be used as follows:

% cc -fast -xO4 -xautopar example.c -o example

This generates an executable called example, which can be executed normally.

Environment Variable

If multiprocessor execution is desired, the PARALLEL environment variable needs to be set. It specifies the number of processors available to the program:

% setenv PARALLEL 2

This will enable the execution of the program on two threads. If the target machine has multiple processors, the threads can map to independent processors.

% example

Running the program will lead to creation of two threads that will execute the parallelized portions of the program.

Keyword

The keyword _Restrict can be used with MP C. Refer to the section "_Restrict Keyword" earlier in this chapter for details.

Explicit Parallelization and Pragmas

Often, there is not enough information available for the compiler to make a decision on the legality or profitability of parallelization. MP C supports pragmas that allow the programmer to effectively parallelize loops that otherwise would be too difficult or impossible for the compiler to handle.

Serial Pragmas

There are two serial pragmas, and both apply to "for" loops:

The #pragma MP serial_loop pragma indicates to the compiler that the next for loop is not to be implicitly/automatically parallelized.

The #pragma MP serial_loop_nested pragma indicates to the compiler that the next for loop and any for loops nested within the scope of this for loop are not to be implicitly/automatically parallelized. The scope of the serial_loop_nested pragma does not extend beyond the scope of the loop to which it applies.
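As an illustration, the sketch below places the serial_loop pragma in front of a loop that must run in order. The function prefix_sum() is a hypothetical example, not taken from the MP C documentation. Compilers that do not recognize the pragma are expected to ignore it, so the code also builds as ordinary serial C.

```c
/* Running sum: iteration i reads the value written by iteration i-1,
 * so the iterations are not independent of each other. */
void prefix_sum(int n, double *a)
{
    int i;

#pragma MP serial_loop
    for (i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

Called on the array 1, 2, 3, 4, the function leaves 1, 3, 6, 10 in place.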

Parallel Pragmas

There is one parallel pragma: #pragma MP taskloop [options].

The MP taskloop pragma can, optionally, take one or more of the following arguments.

Only one option can be specified per MP taskloop pragma; however, the pragmas are cumulative and apply to the next for loop encountered within the current block in the source code:

#pragma MP taskloop maxcpus(4)
#pragma MP taskloop shared(a,b)
#pragma MP taskloop storeback(x)

These options may appear multiple times prior to the for loop to which they apply. In case of conflicting options, the compiler will issue a warning message.

Nesting of for loops
An MP taskloop pragma applies to the next for loop within the current block. There is no nesting of parallelized for loops by MP C.

Eligibility for Parallelizing
An MP taskloop pragma suggests to the compiler that, unless otherwise disallowed, the specified for loop should be parallelized.

For loops with irregular control flow and unknown loop iteration increment are not eligible for parallelization. For example, for loops containing setjmp, longjmp, exit, abort, return, goto, labels, and break should not be considered as candidates for parallelization.

It is particularly important to note that for loops with inter-iteration dependencies can be eligible for explicit parallelization. This means that if an MP taskloop pragma is specified for such a loop, the compiler will simply honor it, unless the for loop is disqualified. It is the user's responsibility to make sure that such explicit parallelization will not lead to incorrect results.

If both a serial pragma (serial_loop or serial_loop_nested) and the taskloop pragma are specified for a for loop, the last one specified will prevail.

Consider the following example:

#pragma MP serial_loop_nested
for (i=0; i<100; i++) {
#pragma MP taskloop
    for (j=0; j<1000; j++) {
        ...
    }
}

The i loop will not be parallelized but the j loop might be.

Number of Processors
#pragma MP taskloop maxcpus (number_of_processors) specifies the number of processors to be used for this loop, if possible.

The value of maxcpus must be a positive integer. If maxcpus equals 1, then the specified loop will be executed in serial. (Note that setting maxcpus to be 1 is equivalent to specifying the serial_loop pragma.) The smaller of the values of maxcpus or the interpreted value of the PARALLEL environment variable will be used. When the environment variable PARALLEL is not specified, it is interpreted as having the value 1.

If more than one maxcpus pragma is specified for a for loop, the last one specified will prevail.
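The rule for combining maxcpus with the PARALLEL environment variable can be sketched as follows. The function names are illustrative, and the handling of a non-numeric or non-positive PARALLEL value (treated as 1 here) is an assumption of this sketch, not documented behavior.

```c
#include <limits.h>
#include <stdlib.h>

/* Interprets a PARALLEL setting. An unset variable (NULL) counts as 1,
 * as stated above. Assumption: invalid or non-positive text also
 * counts as 1. */
int parallel_value(const char *setting)
{
    long n;

    if (setting == NULL)
        return 1;
    n = strtol(setting, NULL, 10);
    if (n < 1)
        return 1;
    return n > INT_MAX ? INT_MAX : (int)n;
}

/* The smaller of maxcpus and the interpreted PARALLEL value is used. */
int processors_for_loop(int maxcpus, const char *parallel_setting)
{
    int p = parallel_value(parallel_setting);
    return maxcpus < p ? maxcpus : p;
}
```

A caller would typically pass getenv("PARALLEL") as the second argument. With maxcpus(4) and PARALLEL set to 2 the result is 2; with PARALLEL unset the result is 1, so the loop runs serially.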

Classifying Variables
A variable used in a loop is classified as being either a "private", "shared", "reduction", or "readonly" variable. The variable will belong to only one of these classifications. A variable can only be classified as a reduction or readonly variable via an explicit pragma. See #pragma MP taskloop reduction and #pragma MP taskloop readonly. A variable can be classified as being either a "private" or "shared" variable via an explicit pragma or through the following default scoping rules.

Default Scoping Rules for Private and Shared Variables
A private variable is one whose value is private to each processor processing some iterations of a for loop. In other words, the value assigned to a private variable in one iteration of a for loop is not propagated to other processors processing other iterations of that for loop. A shared variable, on the other hand, is a variable whose current value is accessible by all processors processing iterations of a for loop. The value assigned to a shared variable by one processor working on iterations of a loop may be seen by other processors working on other iterations of the loop. Loops being explicitly parallelized through use of #pragma MP taskloop directives, that contain references to shared variables, must ensure that such sharing of values does not cause any correctness problems (such as race conditions). No synchronization is provided by the compiler on updates and accesses to shared variables in an explicitly parallelized loop.

In analyzing explicitly parallelized loops, the compiler uses the following "default scoping rules" to determine whether a variable is private or shared:

It is highly recommended that all variables used in an explicitly parallelized for loop be explicitly classified as one of shared, private, reduction, or readonly, to avoid the "default scoping rules."

Since the compiler does not perform any synchronization on accesses to shared variables, extreme care must be exercised before using an MP taskloop pragma for a loop that contains, for example, array references. If inter-iteration data dependencies exist in such an explicitly parallelized loop, then its parallel execution may give erroneous results. The compiler may or may not be able to detect such a potential problem situation and issue a warning message. In any case, the compiler will not disable the explicit parallelization of loops with potential shared variable problems.

Private Variables
#pragma MP taskloop private (list_of_private_variables) specifies all the variables that should be treated as private variables for this loop. All other variables used in the loop that are not explicitly specified as shared, readonly, or reduction variables, will be either shared or private as defined by the default scoping rules.

A private variable is one whose value is private to each processor processing some iterations of a loop. In other words, the value assigned to a private variable by one of the processors working on iterations of a loop is not propagated to other processors processing other iterations of that loop. A private variable has no initial value at the start of each iteration of a loop and must be set to a value within the iteration of a loop prior to its first use within that iteration. Execution of a program with a loop containing an explicitly declared private variable whose value is used prior to being set will result in undefined behavior.
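A loop with an explicitly private scratch variable might look like the sketch below. The function shifted_squares() and its variables are hypothetical. Each pragma carries a single option, following the rule given earlier, and compilers that do not recognize the pragmas are expected to ignore them and run the loop serially.

```c
/* Computes b[i] = (a[i] + 1) squared, using t as per-iteration scratch. */
void shifted_squares(int n, const double *a, double *b)
{
    int i;
    double t;

#pragma MP taskloop private(t)
#pragma MP taskloop shared(a,b)
    for (i = 0; i < n; i++) {
        t = a[i] + 1.0;   /* t is set before its first use in every iteration */
        b[i] = t * t;
    }
}
```

Because t is assigned at the top of each iteration, no iteration depends on a value of t left over from another one, which is the condition the text above places on private variables.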

Shared Variables
#pragma MP taskloop shared (list_of_shared_variables) specifies all the variables that should be treated as shared variables for this loop. All other variables used in the loop that are not explicitly specified as private, readonly, storeback or reduction variables, will be either shared or private as defined by the default scoping rules.

A shared variable is a variable whose current value is accessible by all processors processing iterations of a for loop. The value assigned to a shared variable by one processor working on iterations of a loop may be seen by other processors working on other iterations of the loop.

Read-only Variables
Read-only variables are a special class of shared variables that are not modified in any iteration of a loop. #pragma MP taskloop readonly (list_of_readonly_variables) indicates to the compiler that it may use a separate copy of that variable's value for each processor processing iterations of the loop.

Storeback Variables
#pragma MP taskloop storeback (list_of_storeback_variables) specifies all the variables to be treated as storeback variables.

A storeback variable is one whose value is computed in a loop, and this computed value is then used after the termination of the loop. The last loop iteration values of storeback variables are available for use after the termination of the loop. Such a variable is a good candidate to be declared explicitly via this directive as a storeback variable when the variable is a private variable, whether by explicitly declaring the variable private or by the default scoping rules.

Note that the storeback operation for a storeback variable occurs at the last iteration of the explicitly parallelized loop, regardless of whether or not that iteration updates the value of the storeback variable. In other words the processor that processes the last iteration of a loop may not be the same processor that currently contains the last updated value for a storeback variable. Consider the following example:

#pragma MP taskloop private(x)
#pragma MP taskloop storeback(x)
for (i=1; i <= n; i++) {
    if (...) {
        x =...
    }
}
printf ("%d", x);

In the above example the value of the storeback variable x printed out via the printf() call may not be the same as that printed out by a serial version of the i loop, because in the explicitly parallelized case, the processor that processes the last iteration of the loop (when i==n), which performs the storeback operation for x, may not be the same processor that currently contains the last updated value for x. The compiler will attempt to issue a warning message to alert the user of such potential problems.

In an explicitly parallelized loop, variables referenced as arrays are not treated as storeback variables. Hence it is important to include them in the list_of_storeback_variables if such storeback operation is desired (for example, if the variables referenced as arrays have been declared as private variables).

Savelast
#pragma MP taskloop savelast specifies that all the private variables of a loop be treated as storeback variables. The syntax of this pragma is as follows:

#pragma MP taskloop savelast

It is often convenient to use this form, rather than listing each private variable of a loop in a storeback pragma.

Reduction Variables
#pragma MP taskloop reduction (list_of_reduction_variables) specifies that all the variables appearing in the reduction list will be treated as reduction variables for the loop. A reduction variable is one whose partial values can be individually computed by each of the processors processing iterations of the loop, and whose final value can be computed from all its partial values. The presence of a list of reduction variables can facilitate the compiler in identifying that the loop is a reduction loop, allowing generation of parallel reduction code for it.

Consider the following example:

#pragma MP taskloop reduction(x)
for (i=0; i<n; i++) {
    x = x + a[i];
}

In this example, the variable x is a (sum) reduction variable and the i loop is a (sum) reduction loop.
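The idea of partial values can be emulated in serial code. The sketch below is not the code the compiler generates; it is a hypothetical model in which each of nproc workers sums a contiguous slice of the array and the partial sums are then combined.

```c
/* Reference result: plain serial sum. */
double sum_serial(int n, const double *a)
{
    double x = 0.0;
    int i;
    for (i = 0; i < n; i++)
        x = x + a[i];
    return x;
}

/* Emulated sum reduction: one partial value per worker, combined at the end.
 * nproc must be at least 1. */
double sum_by_partials(int n, const double *a, int nproc)
{
    double total = 0.0;
    int p;

    for (p = 0; p < nproc; p++) {
        int lo = (int)((long)n * p / nproc);        /* first index of slice */
        int hi = (int)((long)n * (p + 1) / nproc);  /* one past the last    */
        double partial = 0.0;
        int i;
        for (i = lo; i < hi; i++)
            partial += a[i];
        total += partial;
    }
    return total;
}
```

For data such as small integers both functions return the same value. With general floating-point data, grouping the additions differently can change the rounding, so a reduction result may differ slightly from the serial one.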

Scheduling Control
The MP C compiler supports several pragmas that can be used in conjunction with the taskloop pragma to control the loop scheduling strategy for a given loop. The syntax for this pragma is:

#pragma MP taskloop schedtype (scheduling_type)
This pragma can be used to specify the specific scheduling_type to be used to schedule the parallelized loop. Scheduling_type can be one of the following:

In static scheduling all the iterations of the loop are uniformly distributed among all the participating processors.

Example:

#pragma MP taskloop maxcpus(4)
#pragma MP taskloop schedtype(static)
for (i=0; i<1000; i++) {
    ...
}

In the above example, each of the four processors will process 250 iterations of the loop.
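The arithmetic of a uniform split can be sketched as follows. The function name is illustrative, and the way leftover iterations are handed out when the count is not a multiple of the number of processors is an assumption of this sketch; the text above only covers the evenly divisible case.

```c
/* Iterations given to processor p (numbered from 0) when n iterations
 * are spread as evenly as possible over nproc processors.
 * Assumption: any remainder goes to the lowest-numbered processors. */
int static_share(int n, int nproc, int p)
{
    return n / nproc + (p < n % nproc ? 1 : 0);
}
```

For 1000 iterations on four processors every share is 250, which matches the example above.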

In self scheduling, each participating processor processes a fixed number of iterations (called the "chunk size") until all the iterations of the loop have been processed. The optional chunk_size parameter specifies the "chunk size" to be used. Chunk_size must be a positive integer constant, or variable of integral type. If specified as a variable chunk_size must evaluate to a positive integer value at the beginning of the loop. If this optional parameter is not specified or its value is not positive, the compiler will select the chunk size to be used.

Example:

#pragma MP taskloop maxcpus(4)
#pragma MP taskloop schedtype(self(120))
for (i=0; i<1000; i++) {
    ...
}

In the above example, the number of iterations of the loop assigned to each participating processor, in order of work request, are:

120, 120, 120, 120, 120, 120, 120, 120, 40.
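The sequence above follows from handing out fixed-size chunks until the iterations run out. A minimal sketch, with an illustrative function name and output buffer:

```c
/* Writes successive chunk sizes for n iterations into out (capacity cap)
 * and returns how many chunks were written. chunk must be positive. */
int self_chunks(int n, int chunk, int *out, int cap)
{
    int count = 0;

    if (chunk < 1)
        return 0;
    while (n > 0 && count < cap) {
        int c = n < chunk ? n : chunk;   /* the last chunk may be short */
        out[count++] = c;
        n -= c;
    }
    return count;
}
```

For 1000 iterations and a chunk size of 120 this produces nine chunks: eight of 120 and a final one of 40.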

In guided self scheduling, each participating processor processes a variable number of iterations at a time until all the iterations of the loop have been processed. The optional min_chunk_size parameter (the "min chunk size") specifies that each variable chunk size used must be at least min_chunk_size in size. Min_chunk_size must be a positive integer constant or a variable of integral type. If specified as a variable, min_chunk_size must evaluate to a positive integer value at the beginning of the loop. If this optional parameter is not specified or its value is not positive, the compiler will select the chunk size to be used.

Example:

#pragma MP taskloop maxcpus(4)
#pragma MP taskloop schedtype(gss(10))
for (i=0; i<1000; i++) {
    ...
}

In the above example, the number of iterations of the loop assigned to each participating processor, in order of work request, are:

250, 188, 141, 106, 79, 59, 45, 33, 25, 19, 14, 11, 10, 10, 10.
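One common formulation of guided self scheduling gives each request the remaining iterations divided by the number of processors, rounded up, but never less than the minimum chunk size. The sketch below uses that formulation, with an illustrative function name; it reproduces the numbers listed above, but whether MP C computes its chunks in exactly this way is an assumption.

```c
/* Writes successive chunk sizes into out (capacity cap); returns the count.
 * Assumed rule: chunk = max(ceil(remaining / nproc), min_chunk). */
int gss_chunks(int n, int nproc, int min_chunk, int *out, int cap)
{
    int count = 0;

    if (nproc < 1 || min_chunk < 1)
        return 0;
    while (n > 0 && count < cap) {
        int c = (n + nproc - 1) / nproc;   /* remaining / nproc, rounded up */
        if (c < min_chunk)
            c = min_chunk;
        if (c > n)
            c = n;
        out[count++] = c;
        n -= c;
    }
    return count;
}
```

For 1000 iterations, four processors, and a minimum chunk size of 10, the function yields 15 chunks beginning 250, 188, 141 and ending 11, 10, 10, 10.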

In factoring scheduling, each participating processor processes a variable number of iterations at a time until all the iterations of the loop have been processed. The optional min_chunk_size parameter (the "min chunk size") specifies that each variable chunk size used must be at least min_chunk_size in size. Min_chunk_size must be a positive integer constant or a variable of integral type. If specified as a variable, min_chunk_size must evaluate to a positive integer value at the beginning of the loop. If this optional parameter is not specified or its value is not positive, the compiler will select the chunk size to be used.

Example:

#pragma MP taskloop maxcpus(4)
#pragma MP taskloop schedtype(factoring(10))
for (i=0; i<1000; i++) {
    ...
}

In the above example, the number of iterations of the loop assigned to each participating processor, in order of work request, are:

125, 125, 125, 125, 62, 62, 62, 62, 32, 32, 32, 32, 16, 16, 16, 16, 10, 10, 10, 10, 10, 10.
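Factoring is usually described as working in batches: each batch hands every processor one chunk of the same size, computed as the remaining iterations divided by twice the number of processors. The sketch below follows that description, with illustrative function names. The rounding rule (round to nearest, ties to even) is not documented here; it is an assumption chosen because it reproduces the numbers listed above.

```c
/* Assumed rounding: nearest integer, ties to the even neighbor. */
static int div_round_half_even(int num, int den)
{
    int q = num / den;
    int r = num % den;
    if (2 * r > den || (2 * r == den && (q & 1)))
        q++;
    return q;
}

/* Writes successive chunk sizes into out (capacity cap); returns the count. */
int factoring_chunks(int n, int nproc, int min_chunk, int *out, int cap)
{
    int count = 0;

    if (nproc < 1 || min_chunk < 1)
        return 0;
    while (n > 0 && count < cap) {
        int c = div_round_half_even(n, 2 * nproc);   /* size for this batch */
        int k;
        if (c < min_chunk)
            c = min_chunk;
        for (k = 0; k < nproc && n > 0 && count < cap; k++) {
            int take = c < n ? c : n;
            out[count++] = take;
            n -= take;
        }
    }
    return count;
}
```

For 1000 iterations, four processors, and a minimum chunk size of 10, the function yields 22 chunks: four each of 125, 62, 32, and 16, followed by six of 10.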

Compiler Options

The following compiler options can be used in MP C. Refer to Chapter 2, "cc Compiler Options" for complete descriptions of the options.

-xautopar
-xdepend
-xexplicitpar
-xloopinfo
-xparallel
-xreduction
-xrestrict=f
-xvpara
-Zlp

