Chapter 12. Implementation-defined Behavior

This chapter describes compiler behavior that is defined by the implementation according to the C and/or C++ standards. The standards require that the behavior of each particular implementation be documented.

12.1. Implementation-defined Behavior

The C and C++ standards define implementation-defined behavior as behavior, for a correct program construct and correct data, that depends on the characteristics of the implementation. The behavior of the Cray C and C++ compilers for these cases is summarized in this section.

12.1.1. Messages

All diagnostic messages issued by the compilers are reported through the UNICOS/mp message system. For information on messages issued by the compilers and for information about the UNICOS/mp message system, see Appendix E.

12.1.2. Environment

When argc and argv are used as parameters to the main function, the array members argv[0] through argv[argc-1] contain pointers to strings that are set by the command shell. The shell sets these arguments to the list of words on the command line used to invoke the program (the argument list). For further information on how the words in the argument list are formed, refer to the documentation for the shell that you are running. For information on UNICOS/mp shells, see the sh(1) or csh(1) man page.

A third parameter, char **envp, provides access to environment variables. The value of the parameter is a pointer to the first element of an array of pointers to null-terminated strings that match the output of the env(1) command. The array of pointers is terminated by a null pointer.
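
As a minimal sketch of how the envp array might be walked, the helper below counts entries until it reaches the terminating null pointer. The name count_env is hypothetical and chosen for illustration only; it is not part of any Cray library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the strings in an envp-style array,
 * stopping at the null pointer that terminates the array. */
size_t count_env(char *const envp[])
{
    size_t n = 0;

    assert(envp != NULL);
    while (envp[n] != NULL) {
        n++;
    }
    return n;
}
```

A program whose entry point is declared as int main(int argc, char **argv, char **envp) could pass envp directly to count_env.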

The compiler does not distinguish between interactive devices and other, noninteractive devices. The library, however, may determine that stdin, stdout, and stderr (cin, cout, and cerr in Cray C++) refer to interactive devices and buffer them accordingly.

12.1.2.1. Identifiers

The identifier (as defined by the standards) is merely a sequence of letters and digits. Specific uses of identifiers are called names.

The Cray C compiler treats the first 255 characters of a name as significant, regardless of whether it is an internal or external name. The case of names, including external names, is significant. In Cray C++, all characters of a name are significant.

12.1.2.2. Types

Table 12-1 summarizes Cray C and C++ types and the characteristics of each type. Representation is the number of bits used to represent an object of that type. Memory is the number of storage bits that an object of that type occupies.

In the Cray C and C++ compilers, size, in the context of the sizeof operator, refers to the size allocated to store the operand in memory; it does not refer to representation, as specified in Table 12-1. Thus, the sizeof operator will return a size that is equal to the value in the Memory column of Table 12-1 divided by 8 (the number of bits in a byte).

Table 12-1. Data Type Mapping

                         UNICOS/mp
Type                     Representation (bits)          Memory (bits)

bool (C++), _Bool (C)    8                              8
char                     8                              8
wchar_t                  32                             32
short[1]                 16                             16
int                      32                             32
long                     64                             64
long long                64                             64
float                    32                             32
double                   64                             64
long double              128                            128
float complex            64 (each part is 32 bits)      64
double complex           128 (each part is 64 bits)     128
long double complex      256 (each part is 128 bits)    256
Pointers                 64                             64
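
The relationship between the sizeof operator and the Memory column of Table 12-1 can be sketched as follows. The helper name memory_bits is an assumption made for this illustration, and the values it returns depend on the implementation that compiles it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a sizeof result (in bytes) to storage
 * bits, the quantity listed in the Memory column of Table 12-1. */
size_t memory_bits(size_t size_in_bytes)
{
    assert(size_in_bytes <= SIZE_MAX / CHAR_BIT);  /* guard against overflow */
    return size_in_bytes * CHAR_BIT;
}
```

On UNICOS/mp, memory_bits(sizeof(long double)) would be expected to yield 128, per Table 12-1; other implementations may give different results.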

12.1.2.3. Characters

The full 8-bit ASCII code set can be used in source files. Characters not in the character set defined in the standard are permitted only within character constants, string literals, and comments. The -h [no]calchars option allows the use of the @ sign and $ sign in identifier names. For more information on the -h [no]calchars option, see Section 2.9.3.

A character consists of 8 bits. Up to 8 characters can be packed into a 64-bit word. A plain char type, one that is declared without a signed or unsigned keyword, is treated as an unsigned type.
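
Whether plain char is signed or unsigned differs between implementations, so code that depends on it may want to check. The sketch below uses the standard limits.h macro CHAR_MIN; the function name is hypothetical.

```c
#include <limits.h>

/* Returns 1 if plain char is an unsigned type on the implementation
 * compiling this code, and 0 if it is a signed type. CHAR_MIN is 0
 * exactly when plain char cannot hold negative values. */
int plain_char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}
```

With the Cray compilers this function would be expected to return 1, based on the description above.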

Character constants and string literals can contain any characters defined in the 8-bit ASCII code set. The characters are represented in their full 8-bit form. A character constant can contain up to 8 characters. The integer value of a character constant is the value of the characters packed into a word from left to right, with the result right-justified, as shown in the following table:

Table 12-2. Packed Characters

Character constant    Integer value

'a'                   0x61
'ab'                  0x6162
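
The packing rule shown in Table 12-2 can be sketched in portable C as follows. The helper pack_chars is hypothetical; it computes the value arithmetically rather than relying on how any particular compiler evaluates a multicharacter constant, and it assumes an ASCII execution character set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: computes the value described in Table 12-2 for
 * the characters in s, packed left to right and right-justified. */
uint64_t pack_chars(const char *s)
{
    size_t len = strlen(s);
    uint64_t value = 0;
    size_t i;

    assert(len <= 8);  /* at most 8 characters fit in a 64-bit word */
    for (i = 0; i < len; i++) {
        /* Shift earlier characters left and append the next 8-bit code. */
        value = (value << 8) | (unsigned char)s[i];
    }
    return value;
}
```

For example, pack_chars("ab") yields 0x6162, matching the second row of the table.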

In a character constant or string literal, if an escape sequence is not recognized, the \ character that initiates the escape sequence is ignored, as shown in the following table:

Table 12-3. Unrecognizable Escape Sequences

Character constant    Integer value    Explanation

'\a'                  0x7              Recognized as the ASCII BEL character
'\8'                  0x38             Not recognized; ASCII value for 8
'\['                  0x5b             Not recognized; ASCII value for [
'\c'                  0x63             Not recognized; ASCII value for c
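
The rule illustrated by Table 12-3 can be modeled with a small lookup function. This is a sketch only: the name simple_escape_value is hypothetical, and the function covers only single-character escapes. Octal, hexadecimal, and universal character name escapes take additional digits and are reported as -1.

```c
/* Hypothetical helper: given the character c that follows a backslash,
 * returns the value of the escape sequence under the rule above.
 * Returns -1 for escapes that take further digits (\0 to \7, \x, \u, \U),
 * which are outside the scope of this sketch. */
int simple_escape_value(char c)
{
    switch (c) {
    case 'a':  return '\a';
    case 'b':  return '\b';
    case 'f':  return '\f';
    case 'n':  return '\n';
    case 'r':  return '\r';
    case 't':  return '\t';
    case 'v':  return '\v';
    case '\\': return '\\';
    case '\'': return '\'';
    case '"':  return '"';
    case '?':  return '?';
    case 'x':
    case 'u':
    case 'U':
        return -1;
    default:
        if (c >= '0' && c <= '7') {
            return -1;
        }
        /* Not recognized: the backslash is ignored. */
        return (unsigned char)c;
    }
}
```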

12.1.2.4. Wide Characters

Wide characters are treated as signed 32-bit integer types (see the wchar_t entry in Table 12-1). Wide character constants cannot contain more than one multibyte character. Multibyte characters in wide character constants and wide string literals are converted to wide characters in the compiler by calling the mbtowc(3) function. The locale in effect at the time of compilation determines the method by which mbtowc(3) converts multibyte characters to wide characters, and the shift states required for the encoding of multibyte characters in the source code. If a wide character, as converted from a multibyte character or as specified by an escape sequence, cannot be represented in the extended execution character set, it is truncated.
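
The same standard library function can be called from user code at run time. The sketch below shows one way to convert the first multibyte character of a string; the helper name first_wide_char is hypothetical, and the result depends on the locale in effect when it is called.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts the first multibyte character of s to a
 * wide character using the current locale. Returns the number of bytes
 * consumed, 0 for an empty string, or -1 if the sequence is invalid. */
int first_wide_char(const char *s, wchar_t *out)
{
    assert(s != NULL && out != NULL);
    (void)mbtowc(NULL, NULL, 0);        /* reset any shift state */
    return mbtowc(out, s, MB_CUR_MAX);
}
```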

12.1.2.5. Integers

All integral values are represented in a twos complement format. For representation and memory storage requirements for integral types, see Table 12-1.

When an integer is converted to a shorter signed integer, and the value cannot be represented, the result is the truncated representation treated as a signed quantity. When an unsigned integer is converted to a signed integer of equal length, and the value cannot be represented, the result is the original representation treated as a signed quantity.
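
The narrowing rule can be modeled without relying on any implementation-defined conversion. The sketch below reproduces it for a 16-bit target; the helper name truncate_to_int16 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: narrows v to 16 bits by keeping the low-order
 * bits and reinterpreting them as a twos complement signed value. */
int16_t truncate_to_int16(long v)
{
    /* Conversion to an unsigned type is well defined (modulo 2^16). */
    uint16_t bits = (uint16_t)((unsigned long)v & 0xFFFFUL);

    if (bits >= 0x8000U) {
        /* Sign bit set: the value is bits - 65536. */
        return (int16_t)((long)bits - 65536L);
    }
    return (int16_t)bits;
}
```

For example, 65535 has all 16 low-order bits set, so it becomes -1 when treated as a signed quantity.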

The bitwise operators (unary operator ~ and binary operators <<, >>, &, ^, and |) operate on signed integers in the same manner in which they operate on unsigned integers. The result of E1 >> E2, where E1 is a negative-valued signed integral value, is E1 right-shifted E2 bit positions; vacated bits are filled with 1s. This behavior can be modified by using the -h nosignedshifts option (see Section 2.9.4). Bits higher than the sixth bit are not ignored. Values higher than 31 cause the result to be 0 or all 1s for right shifts.
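
The default sign-filling right shift can be modeled with unsigned arithmetic, which behaves the same on every conforming implementation. This is a sketch for 32-bit values and shift counts below 32 only; the helper name shift_right_sign_fill is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: right-shifts a 32-bit signed value by n bits,
 * filling vacated bits with copies of the sign bit. */
int32_t shift_right_sign_fill(int32_t value, unsigned n)
{
    uint32_t bits = (uint32_t)value;   /* twos complement bit pattern */
    uint32_t result;

    assert(n < 32);
    result = bits >> n;                /* logical shift: vacated bits are 0 */
    if (value < 0 && n > 0) {
        result |= ~(UINT32_MAX >> n);  /* set the n vacated high bits to 1 */
    }
    /* Convert the bit pattern back without an out-of-range conversion. */
    if (result > (uint32_t)INT32_MAX) {
        return (int32_t)(-(int64_t)(UINT32_MAX - result) - 1);
    }
    return (int32_t)result;
}
```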

The result of the / operator is the largest integer less than or equal to the algebraic quotient when either operand is negative and the result is a nonnegative value. If the result is a negative value, it is the smallest integer greater than or equal to the algebraic quotient. In other words, the quotient is truncated toward zero. The / operator behaves the same way in C and C++ as in Fortran.

The sign of the result of the percent (%) operator is the sign of the first operand.
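
The division and remainder rules can be checked with a short function. The sketch assumes a compiler that conforms to C99 or later, where the same truncating behavior is required by the language standard; the helper name divide is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: stores the quotient and remainder of a / b. */
void divide(int a, int b, int *quotient, int *remainder)
{
    assert(b != 0);
    *quotient = a / b;    /* truncated toward zero */
    *remainder = a % b;   /* takes the sign of the first operand */
}
```

For example, dividing -7 by 2 gives a quotient of -3 and a remainder of -1.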

Integer overflow is ignored. Because some integer arithmetic uses the floating-point instructions, floating-point overflow can occur during integer operations. Division by 0 and all floating-point exceptions, if not detected as an error by the compiler, can cause a run-time abort.

12.1.2.6. Arrays and Pointers

An unsigned int value can hold the maximum size of an array. The type size_t is defined to be a typedef name for unsigned long in the following headers: malloc.h, stddef.h, stdio.h, stdlib.h, string.h, and time.h. If more than one of these headers is included, only the first one included defines size_t.

A type int can hold the difference between two pointers to elements of the same array. The type ptrdiff_t is defined to be a typedef name for long in the header stddef.h.
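
A minimal sketch of pointer subtraction using ptrdiff_t follows. The helper name element_distance and the choice of double as the element type are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements between two pointers into the
 * same array of double. The result has type ptrdiff_t and is negative
 * when last precedes first. */
ptrdiff_t element_distance(const double *first, const double *last)
{
    return last - first;
}
```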

If a pointer type's value is cast to a signed or unsigned long int, and then cast back to the original type's value, the two pointer values will compare equal.

Pointers on UNICOS/mp systems are byte pointers. Byte pointers use the same internal representation as integers; a byte pointer counts the number of bytes from the first address.

A pointer can be explicitly converted to any integral type large enough to hold it. The result will have the same bit pattern as the original pointer. Similarly, any value of integral type can be explicitly converted to a pointer. The resulting pointer will have the same bit pattern as the original integral type.
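
The round trip through an integer type can be sketched as follows. The function name is hypothetical. The sketch assumes that unsigned long is at least as wide as a pointer, which holds on UNICOS/mp per Table 12-1 but not on every platform, so the function reports -1 where the assumption fails.

```c
#include <stddef.h>

/* Returns 1 if p survives a round trip through unsigned long,
 * 0 if it does not, and -1 if unsigned long is too narrow to try. */
int survives_long_round_trip(void *p)
{
    unsigned long as_integer;
    void *back;

    if (sizeof(unsigned long) < sizeof(void *)) {
        return -1;  /* the assumption of this sketch does not hold here */
    }
    as_integer = (unsigned long)p;   /* pointer to integer */
    back = (void *)as_integer;       /* integer back to pointer */
    return back == p;
}
```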

12.1.2.7. Registers

Use of the register storage class in the declaration of an object has no effect on whether the object is placed in a register. The compiler performs register assignment aggressively; that is, it automatically attempts to place as many variables as possible into registers.

12.1.2.8. Classes, Structures, Unions, Enumerations, and Bit Fields

Accessing a member of a union by using a member of a different type results in an attempt to interpret, without conversion, the representation of the value of the member as the representation of a value in the different type.
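
As an illustration of reinterpreting a value through a union, the sketch below reads the bits of a float through an integer member. The names float_bits and float_representation are hypothetical, and the example assumes that float is a 32-bit IEEE 754 value.

```c
#include <stdint.h>

/* Assumes float is a 32-bit IEEE 754 value. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Returns the representation of f reinterpreted as a 32-bit integer. */
uint32_t float_representation(float f)
{
    union float_bits b;

    b.f = f;      /* store through the float member */
    return b.u;   /* read the same bits through the integer member */
}
```

Under the stated assumption, float_representation(1.0f) yields 0x3F800000.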

Members of a class or structure are packed into words from left to right. Padding is appended to a member to correctly align the following member, if necessary. Member alignment is based on the size of the member:

  • For a member bit field of any size, alignment is any bit position that allows the member to fit entirely within a 64-bit word.

  • For a member with a size less than 64 bits, alignment is the same as the size. For example, a char has a size and alignment of 8 bits; a float has a size and alignment of 32 bits.

  • For a member with a size equal to or greater than 64 bits, alignment is 64 bits.

  • For a member with array type, alignment is equal to the alignment of the element type.
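
The size-based alignment rules listed above can be expressed as simple arithmetic. The sketch below models only members that are not bit fields; for an array member, the caller would pass the size of the element type. Both function names are hypothetical, and the code models the rules rather than querying the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Alignment, in bits, for a non-bit-field member of the given size,
 * following the size-based rules listed above. */
size_t member_alignment_bits(size_t size_bits)
{
    assert(size_bits > 0);
    return size_bits < 64 ? size_bits : 64;
}

/* Rounds offset_bits up to the next position at which a member of the
 * given size may start. */
size_t next_member_offset_bits(size_t offset_bits, size_t size_bits)
{
    size_t align = member_alignment_bits(size_bits);

    return (offset_bits + align - 1) / align * align;
}
```

For a structure that contains a char, then a float, then a double, this model places the members at bit offsets 0, 32, and 64.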

A plain int type bit field is treated as a signed int bit field.

The values of an enumeration type are represented in the type signed int in C; they are a separate type in C++.

12.1.2.9. Qualifiers

An access to an object that has volatile-qualified type is simply a reference to the value of the object. If the value is not used, the reference need not result in a load of the value from memory.

12.1.2.10. Declarators

A maximum of 12 pointer, array, and/or function declarators are allowed to modify an arithmetic, structure, or union type.

12.1.2.11. Statements

The compiler has no fixed limit on the maximum number of case values allowed in a switch statement.

The Cray C++ compiler parses asm statements for correct syntax, but otherwise ignores them.

12.1.2.12. Exceptions

In Cray C++, when an exception is thrown, the memory for the temporary copy of the exception being thrown is allocated on the stack and a pointer to the allocated space is returned.

12.1.2.13. System Function Calls

See the exit(3) man page for a description of the form of the unsuccessful termination status that is returned from a call to exit(3).

12.1.3. Preprocessing

The value of a single-character constant in a constant expression that controls conditional inclusion matches the value of the same character in the execution character set. No such character constant has a negative value. For example, 'a' has the same value in the following two contexts:

#if 'a' == 97
if ('a' == 97)
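
The two contexts can be compared in a compilable form. In the sketch below, the macro PP_A_IS_97 and the function char_contexts_agree are hypothetical names introduced for illustration.

```c
/* Record what the preprocessor sees for 'a' == 97. */
#if 'a' == 97
#define PP_A_IS_97 1
#else
#define PP_A_IS_97 0
#endif

/* Returns 1 if the preprocessor and the compiled code agree on
 * whether 'a' equals 97, and 0 otherwise. */
int char_contexts_agree(void)
{
    return PP_A_IS_97 == ('a' == 97);
}
```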

The -I option and the method for locating included source files are described in Section 2.19.4.

The source file character sequence in a #include directive must be a valid UNICOS/mp file name or path name. A #include directive may specify a file name by means of a macro, provided the macro expands into a source file character sequence delimited by double quotes or < and > delimiters, as follows:

#define myheader "./myheader.h"
#include myheader

#define STDIO <stdio.h>
#include STDIO

The macros __DATE__ and __TIME__ contain the date and time of the beginning of translation. For more information, see the description of the predefined macros in Chapter 9.
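
As a brief sketch of how these macros might be used, the function below formats them into a build stamp. The name format_build_stamp and the output format are assumptions made for illustration; snprintf requires C99 or later.

```c
#include <stdio.h>

/* Hypothetical helper: writes "built <date> <time>" into buf and returns
 * the snprintf result (characters needed, excluding the terminator). */
int format_build_stamp(char *buf, size_t size)
{
    return snprintf(buf, size, "built %s %s", __DATE__, __TIME__);
}
```

Because the C standard fixes the forms of __DATE__ ("Mmm dd yyyy") and __TIME__ ("hh:mm:ss"), the formatted string is always 26 characters long.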

The #pragma directives are described in Chapter 3.

Footnotes

[1] We do not recommend using shorts because of performance penalties.