Chapter 9. Implementation-defined Behavior

This chapter describes compiler behavior that is defined by the implementation according to the C and/or C++ standards. The standards require that the behavior of each particular implementation be documented.

9.1. Implementation-defined Behavior

The C and C++ standards define implementation-defined behavior as behavior, for a correct program construct and correct data, that depends on the characteristics of the implementation. The behavior of the Cray Standard C and Cray C++ compilers for these cases is summarized in this section.

9.1.1. Messages

All diagnostic messages issued by the Cray compilers are reported through the UNICOS message system. For information on messages issued by the compilers and for information about the UNICOS message system, see Appendix C.

9.1.2. Environment

When argc and argv are used as parameters to the main function, the array members argv[0] through argv[argc-1] contain pointers to strings that are set by the command shell. The shell sets these arguments to the list of words on the command line used to invoke the program (the argument list). For further information on how the words in the argument list are formed, refer to the documentation on the shell in which you are running. For information on UNICOS shells, see the sh, csh, or ksh man pages.

A third parameter, char **envp, provides access to environment variables. The value of the parameter is a pointer to the first element of an array of pointers to null-terminated strings; the strings match the output of the env command. The array of pointers is terminated by a null pointer.

The compiler does not distinguish between interactive devices and other, noninteractive devices. The library, however, may determine that stdin, stdout, and stderr (cin, cout, and cerr in C++) refer to interactive devices and buffer them accordingly. For further information, see the description of I/O in the UNICOS System Libraries Reference Manual.

9.1.2.1. Identifiers

An identifier (as defined by the standards) is merely a sequence of letters and digits. Specific uses of identifiers are called names.

In C, the compiler treats the first 255 characters of a name as significant, regardless of whether it is an internal or external name. The case of names, including external names, is significant. In C++, all characters of a name are significant.

9.1.2.2. Types

Table 9-1 summarizes data types supported on Cray Research systems and the characteristics of each type. Representation is the number of bits used to represent values for each data type. Memory is the number of storage bits that the data type occupies.

For the Cray Research implementation, size, in the context of the sizeof operator, refers to the size allocated to store the operands in memory; it does not refer to representation, as specified in Table 9-1. Thus, the sizeof operator will return a size that is equal to the value in the Memory column of Table 9-1 divided by 8 (the number of bits in a byte).

Table 9-1. Cray Research systems data type mapping

| Type | Cray PVP systems: Representation (bits) | Cray PVP systems: Memory (bits) | Cray MPP systems: Representation (bits) | Cray MPP systems: Memory (bits) |
| --- | --- | --- | --- | --- |
| bool (C++ only) | 8 | 8 | 8 | 8 |
| char | 8 | 8 | 8 | 8 |
| wchar_t (C++ only) | 64 | 64 | 64 | 64 |
| short | 32; CRAY T90: 46/64 (See Footnote 1) | 64 | 32 | 32 |
| int | 46/64 (See Footnote 1) | 64 | 64 | 64 |
| long | 64 | 64 | 64 | 64 |
| long long (See Footnote 2) | 64 | 64 | 64 | 64 |
| float | 64 | 64 | 32 | 32 |
| double | 64 | 64 | 64 | 64 |
| long double | 128 | 128 | 64 | 64 |
| float complex (See Footnote 3) | 128 (64 each part) | 128 | 64 (32 each part) | 64 |
| double complex (See Footnote 3) | 128 (64 each part) | 128 | 128 (64 each part) | 128 |
| long double complex (See Footnote 3) | 256 (128 each part) | 256 | 128 (64 each part) | 128 |
| void and char pointers | 64 | 64 | 64 | 64 |
| Other pointers | 32; CRAY T90: 64 | 64 | 64 | 64 |

Footnote 1: Depends on use of the -h [no]fastmd option. This option is described in Section 2.14.2.

Footnote 2: Available in extended mode only.

Footnote 3: Cray Research extension (Cray Standard C only).

9.1.2.3. Characters

The full 8-bit ASCII code set can be used in source files. Characters not in the character set defined in the standard are permitted only within character constants, string literals, and comments. The -h [no]calchars option allows the use of the @ sign and $ sign in identifier names. For more information on the -h [no]calchars option, see Section 2.7.3.

A character consists of 8 bits. Up to 8 characters can be packed into a Cray word. A plain char type, one that is declared without a signed or unsigned keyword, is treated as an unsigned type.

Character constants and string literals can contain any characters defined in the 8-bit ASCII code set. The characters are represented in their full 8-bit form. A character constant can contain up to 8 characters. The integer value of a character constant is the value of the characters packed into a word from left to right, with the result right-justified, as shown in the following table:

Table 9-2.

| Character constant | Integer value |
| --- | --- |
| 'a' | 0x61 |
| 'ab' | 0x6162 |

In a character constant or string literal, if an escape sequence is not recognized, the \ character that initiates the escape sequence is ignored, as shown in the following table:

Table 9-3.

| Character constant | Integer value | Explanation |
| --- | --- | --- |
| '\a' | 0x7 | Recognized as the ASCII BEL character |
| '\8' | 0x38 | Not recognized; ASCII value for 8 |
| '\[' | 0x5b | Not recognized; ASCII value for [ |
| '\c' | 0x63 | Not recognized; ASCII value for c |

9.1.2.4. Wide Characters

Wide characters are treated as signed 64-bit integer types. Wide character constants cannot contain more than one multibyte character. Multibyte characters in wide character constants and wide string literals are converted to wide characters in the compiler by calling the mbtowc function. The current locale in effect at the time of compilation determines the method by which mbtowc converts multibyte characters to wide characters, and the shift states required for the encoding of multibyte characters in the source code. If a wide character, as converted from a multibyte character or as specified by an escape sequence, cannot be represented in the extended execution character set, it is truncated.

9.1.2.5. Integers

All integral values are represented in two's complement format. For representation and memory storage requirements for integral types on Cray Research systems, see Table 9-1.

When an integer is converted to a shorter signed integer, and the value cannot be represented, the result is the truncated representation treated as a signed quantity. When an unsigned integer is converted to a signed integer of equal length, and the value cannot be represented, the result is the original representation treated as a signed quantity.

The bitwise operators (unary operator ~ and binary operators <<, >>, &, ^, and |) operate on signed integers in the same manner in which they operate on unsigned integers. The result of E1 >> E2, where E1 is a negative-valued signed integral value, is E1 right-shifted E2 bit positions; vacated bits are filled with 0's on UNICOS systems and 1's on UNICOS/mk systems. On UNICOS/mk systems, this behavior can be modified by using the -h nosignedshifts option (see Section 2.7.4).

On UNICOS/mk systems, the shift operators (>> and <<) use only the rightmost six bits of the second operand. For example, shifting by 65 is the same as shifting by 1. On CRAY Y-MP systems, bits higher than the sixth bit are not ignored. Values higher than 63 cause the result to be 0.

The result of the / operator is truncated toward zero. When either operand is negative and the result is a nonnegative value, the result is the largest integer less than or equal to the algebraic quotient. If the result is a negative value, it is the smallest integer greater than or equal to the algebraic quotient. The / operator behaves the same way in C/C++ as in Fortran.

The sign of the result of the % operator is the sign of the first operand.

Integer overflow is ignored. Because some integer arithmetic uses the floating-point instructions on UNICOS systems, floating-point overflow can occur during integer operations. Division by 0 and all floating-point exceptions, if not detected as an error by the compiler, can cause a run-time abort.

9.1.2.6. Floating-point Arithmetic

Cray Research systems use either Cray floating-point arithmetic or IEEE floating-point arithmetic. These types of floating-point representation are described in the sections that follow.

9.1.2.6.1. Cray Floating-point Representation

Types float and double represent Cray single-precision (64-bit) floating-point values; long double represents Cray double-precision (128-bit) floating-point values.

An integral number that is converted to a floating-point number that cannot exactly represent the original value is truncated toward 0. A floating-point number that is converted to a narrower floating-point number is also truncated toward 0.

Floating-point arithmetic depends on implementation-defined ranges for types of data. The values of the minimums and maximums for these ranges are defined by macros in the standard header file float.h. All floating-point operations on operands that are within the defined range yield results that are also in this range if the true mathematical result is in the range. The results are accurate to within the ability of the hardware to represent the true value.

The maximum positive value for types float, double, and long double is approximately as follows:

2.7 × 10^2456

Several math functions return this upper limit if the true value equals or exceeds it.

The minimum positive value for types float, double, and long double is approximately as follows:

3.67 × 10^-2466

These numbers define a range that is slightly smaller than the value that can be represented by Cray Research hardware, but use of numbers outside this range may not yield predictable results. For exact values, use the values defined in the header file, float.h.

A floating-point value, when rounded off, can be accurately represented to approximately 14 decimal places for types float and double, and to approximately 28 decimal places for type long double as determined by the following equation:

\[ \text{number of decimal digits} = \frac{\text{number of bits}}{\log_2 10.0} \]

Digits beyond these precisions may not be accurate. It is safest to assume only 14 or 28 decimal places of accuracy.

Epsilon, the difference between 1.0 and the smallest value greater than 1.0 that is representable in the given floating-point type, is approximately 7.1 × 10^-15 for types float and double, and approximately 2.5 × 10^-29 for type long double.

9.1.2.6.2. IEEE Floating-point Representation

On UNICOS/mk systems, float represents IEEE single-precision (32-bit) floating-point values; double and long double represent double-precision (64-bit) floating-point values. IEEE extended double precision (128-bit) is not available on UNICOS/mk systems.

On UNICOS systems with IEEE floating-point hardware, float and double represent IEEE double-precision (64-bit) floating-point values. The long double represents IEEE extended double-precision (128-bit) floating-point values. IEEE single-precision (32-bit) is not available on UNICOS systems.

An integral number that is converted to a floating-point number that cannot exactly represent the original value is rounded according to the current rounding mode. A floating-point number that is converted to a floating-point number with fewer significant digits also is rounded according to the current rounding mode on UNICOS/mk systems; on UNICOS systems, the number is rounded to closest, but not in an IEEE round-to-nearest fashion.

Floating-point arithmetic depends on implementation-defined ranges for types of data. The values of the minimums and maximums for these ranges are defined by macros in the standard header file, float.h. All floating-point operations on operands that are within the defined range yield results that are also in this range if the true mathematical result is in the range. The results are accurate to within the ability of the hardware to represent the true value.

The maximum positive values are approximately as follows:

Single (32 bits): 3.4 × 10^38

Double (64 bits): 1.8 × 10^308

Extended double (128 bits): 1.2 × 10^4932

The minimum positive values are approximately as follows:

Single (32 bits): 1.2 × 10^-38

Double (64 bits): 2.2 × 10^-308

Extended double (128 bits): 3.4 × 10^-4932

For exact values, use the macros defined in the header file, float.h.

Rounding of 32-bit and 64-bit floating-point arithmetic is determined by the current rounding mode. The 128-bit floating-point arithmetic is rounded to the closest value, without regard to the rounding mode. A floating-point value, when rounded off, can be accurately represented to approximately 7 decimal places for single-precision types, approximately 16 decimal places for double-precision types, and approximately 34 decimal places for extended double-precision types as determined by the following equation:

\[ \text{number of decimal digits} = \frac{\text{number of bits}}{\log_2 10.0} \]

Digits beyond these precisions may not be accurate.

Epsilon, the difference between 1.0 and the smallest value greater than 1.0 that is representable in the given floating-point type, is approximately as follows:

Single (32 bits): 1.2 × 10^-7

Double (64 bits): 2.2 × 10^-16

Extended double (128 bits): 1.9 × 10^-34

Upon entering the main function at the beginning of the program execution, the rounding mode is set to round to nearest, all floating-point exception status flags are cleared, and traps are enabled for overflow, invalid operation, and division-by-zero exceptions. Traps are disabled for all other exceptions. On CRAY T90 systems with IEEE floating-point hardware the default rounding mode and the trap modes can be specified at program startup by using the cpu command (see the cpu man page for more information).

9.1.2.7. Arrays and Pointers

An unsigned int value can hold the maximum size of an array. The type size_t is defined to be a typedef name for unsigned int in the headers: malloc.h, stddef.h, stdio.h, stdlib.h, string.h, and time.h. If more than one of these headers is included, only the first defines size_t.

A type int can hold the difference between two pointers to elements of the same array. The type ptrdiff_t is defined to be a typedef name for int in the header stddef.h.

On all Cray Research systems, if a pointer type's value is cast to a signed or unsigned int or long int, and then cast back to the original type's value, the two pointer values will compare equal.

Pointers on UNICOS systems differ from pointers on UNICOS/mk systems. The sections that follow describe pointer implementation on each type of system.

9.1.2.7.1. Pointers on UNICOS Systems

Although a pointer value can be stored in an object of integer type, an operation may give different results when performed on the same value treated as an integer or as a pointer. An integer result should not be used as a pointer. For example, do not assume that adding 5 to an integer is the same as adding 5 to a pointer, because the result is affected by the kind of pointer used in the operation. In particular, results may differ from those on a system using a simpler representation of pointers, such as UNICOS/mk systems.

Pointers other than character pointers are internally represented just like integers: as a single 64-bit field. Character pointers use one of the formats shown in Figure 9-1, depending on the size of A registers.

Figure 9-1. Character pointer format

Converting a 64-bit integer to a character pointer type results in a pointer to the byte specified by the value in the offset field of the word specified in the address field.

9.1.2.7.2. Pointers on UNICOS/mk systems

Pointers on UNICOS/mk systems are byte pointers. Byte pointers use the same internal representation as integers; a byte pointer counts the number of bytes from the first address.

A pointer can be explicitly converted to any integral type large enough to hold it. The result will have the same bit pattern as the original pointer. Similarly, any value of integral type can be explicitly converted to a pointer. The resulting pointer will have the same bit pattern as the original integral type.

9.1.2.8. Registers

Use of the register storage class in the declaration of an object has no effect on whether the object is placed in a register. The compiler performs register assignment aggressively; that is, it automatically attempts to place as many variables as possible into registers.

9.1.2.9. Classes, Structures, Unions, Enumerations, and Bit Fields

Accessing a member of a union by using a member of a different type results in an attempt to interpret, without conversion, the representation of the value of the member as the representation of a value in the different type.

Members of a class or structure are packed into Cray words from left to right. Padding is appended to a member to correctly align the following member, if necessary. Member alignment is based on the size of the member:

  • For a member bitfield of any size, alignment is any bit position that allows the member to fit entirely within a 64-bit word.

  • For a member with a size less than 64 bits, alignment is the same as the size. For example, a char has a size and alignment of 8 bits; a float or short on UNICOS/mk systems has a size and alignment of 32 bits.

  • For a member with a size equal to or greater than 64 bits, alignment is 64 bits.

  • For a member with array type, alignment is equal to the alignment of the element type.

A plain int type bit field is treated as an unsigned int bit field.

The values of an enumeration type are represented in the type signed int for C; they are a separate type in C++.

9.1.2.10. Qualifiers

An access to an object that has volatile-qualified type is simply a reference to the value of the object. If the value is not used, the reference need not result in a load of the value from memory.

9.1.2.11. Declarators

A maximum of 12 pointer, array, and/or function declarators are allowed to modify an arithmetic, structure, or union type.

9.1.2.12. Statements

The compiler has no fixed limit on the maximum number of case values allowed in a switch statement.

The C++ compiler parses asm statements for correct syntax, but otherwise ignores them.

9.1.2.13. Exceptions

In C++, when an exception is thrown, the memory for the temporary copy of the exception being thrown is allocated on the stack and a pointer to the allocated space is returned.

9.1.2.14. System Function Calls

See the exit man page for a description of the form of the unsuccessful termination status that is returned from a call to exit.

9.1.3. Preprocessing

The value of a single-character constant in a constant expression that controls conditional inclusion matches the value of the same character in the execution character set. No such character constant has a negative value. For example, 'a' has the same value in the following two contexts:

#if 'a' == 97
if ('a' == 97)

The -I option and the method for locating included source files are described in Section 2.19.1.

The source file character sequence in a #include directive must be a valid UNICOS file name or path name. A #include directive may specify a file name by means of a macro, provided the macro expands into a source file character sequence delimited by double quotes or < and > delimiters, as follows:

#define myheader "./myheader.h"
#include myheader

#define STDIO <stdio.h>
#include STDIO

The macros __DATE__ and __TIME__ contain the date and time of the beginning of translation. For more information, see the description of the predefined macros in Chapter 6.

The #pragma directives are described in Chapter 4.