Cray C and C++ Reference Manual - S-2179-51

Chapter 4. OpenMP C API Directives
This section presents a directive and several clauses for controlling the data environment during the execution of parallel regions, as follows:
A threadprivate directive (see Section 4.7.1) is provided to make file-scope, namespace-scope, or static block-scope variables local to a thread.
Clauses that may be specified on the directives to control the sharing attributes of variables for the duration of the parallel or work-sharing constructs are described in Section 4.7.2.
The threadprivate directive makes the named file-scope, namespace-scope, or static block-scope variables specified in the variable-list private to a thread. variable-list is a comma-separated list of variables that do not have an incomplete type. The syntax of the threadprivate directive is as follows:
#pragma omp threadprivate(variable-list) new-line
Each copy of a threadprivate variable is initialized once, at an unspecified point in the program prior to the first reference to that copy, and in the usual manner (that is, as the master copy would be initialized in a serial execution of the program). Note that if an object is referenced in an explicit initializer of a threadprivate variable, and the value of the object is modified prior to the first reference to a copy of the variable, then the behavior is unspecified.
As with any private variable, a thread must not reference another thread's copy of a threadprivate object. During serial regions and master regions of the program, references will be to the master thread's copy of the object.
After the first parallel region executes, the data in the threadprivate objects is guaranteed to persist only if the dynamic threads mechanism has been disabled and if the number of threads remains unchanged for all parallel regions.
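The per-thread behavior described above can be sketched as follows. This is a minimal illustration, not taken from the manual: the names `calls`, `count_call`, and `total_calls` are hypothetical. Each thread resets and increments its own copy of the threadprivate variable, and the per-thread counts are then combined into a shared total.

```c
/* Hypothetical per-thread call counter; the name is illustrative only. */
static int calls = 0;
#pragma omp threadprivate(calls)

/* Increments the calling thread's own copy of calls and returns it. */
int count_call(void)
{
    return ++calls;
}

/* Runs n iterations across the team and returns the total number of
 * calls made, summed over the per-thread copies. */
int total_calls(int n)
{
    int total = 0;                  /* shared by default */
    #pragma omp parallel
    {
        calls = 0;                  /* each thread resets its own copy */
        #pragma omp for
        for (int i = 0; i < n; i++)
            count_call();
        #pragma omp atomic
        total += calls;             /* combine the per-thread counts */
    }
    return total;
}
```

Because no thread ever touches another thread's copy of `calls`, only the final update of `total` needs synchronization.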
The restrictions to the threadprivate directive are as follows:
A threadprivate directive for file-scope or namespace-scope variables must appear outside any definition or declaration, and must lexically precede all references to any of the variables in its list.
Each variable in the variable-list of a threadprivate directive at file or namespace scope must refer to a variable declaration at file or namespace scope that lexically precedes the directive.
A threadprivate directive for static block-scope variables must appear in the scope of the variable and not in a nested scope. The directive must lexically precede all references to any of the variables in its list.
Each variable in the variable-list of a threadprivate directive in block scope must refer to a variable declaration in the same scope that lexically precedes the directive. The variable declaration must use the static storage-class specifier.
If a variable is specified in a threadprivate directive in one translation unit, it must be specified in a threadprivate directive in every translation unit in which it is declared.
A threadprivate variable must not appear in any clause except the copyin, schedule, num_threads, or the if clause.
The address of a threadprivate variable is not an address constant.
A threadprivate variable must not have an incomplete type or a reference type.
A threadprivate variable with non-POD class type must have an accessible, unambiguous copy constructor if it is declared with an explicit initializer.
The following example illustrates how modifying a variable that appears in an initializer can cause unspecified behavior, and also how to avoid this problem by using an auxiliary object and a copy-constructor.
int x = 1;
T a(x);
const T b_aux(x); /* Capture value of x = 1 */
T b(b_aux);
#pragma omp threadprivate(a, b)

void f(int n) {
    x++;
    #pragma omp parallel for
    /* In each thread:
     * Object a is constructed from x (with value 1 or 2?)
     * Object b is copy-constructed from b_aux
     */
    for (int i=0; i<n; i++) {
        g(a, b); /* Value of a is unspecified. */
    }
}
Several directives accept clauses that allow a user to control the sharing attributes of variables for the duration of the region. Sharing attribute clauses apply only to variables in the lexical extent of the directive on which the clause appears. Not all of the following clauses are allowed on all directives. The list of clauses that are valid on a particular directive is described with the directive.
If a variable is visible when a parallel or work-sharing construct is encountered, and the variable is not specified in a sharing attribute clause or threadprivate directive, then the variable is shared. Static variables declared within the dynamic extent of a parallel region are shared. Heap allocated memory (for example, using malloc() in C) is shared. (The pointer to this memory, however, can be either private or shared.) Variables with automatic storage duration declared within the dynamic extent of a parallel region are private.
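As a rough illustration of these default rules, the sketch below uses no data-sharing clauses at all. The function name `sum_of_squares` and its variables are hypothetical. The heap buffer and the variable `total` are shared because they are visible when the construct is encountered, while `local` is private because it has automatic storage duration and is declared inside the parallel region.

```c
#include <stdlib.h>

/* Fills a heap buffer with squares and returns their sum.
 * Returns 0 for n <= 0 and -1 if the allocation fails. */
long sum_of_squares(int n)
{
    long total = 0;                 /* visible at the construct: shared */
    long *buf;

    if (n <= 0)
        return 0;
    buf = malloc((size_t)n * sizeof *buf);  /* heap memory: shared */
    if (buf == NULL)
        return -1;

    #pragma omp parallel
    {
        long local = 0;             /* automatic, declared in region: private */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            buf[i] = (long)i * i;   /* each iteration writes its own element */
            local += buf[i];
        }
        #pragma omp critical
        total += local;             /* shared update needs synchronization */
    }
    free(buf);
    return total;
}
```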
Most of the clauses accept a variable-list argument, which is a comma-separated list of variables that are visible. If a variable referenced in a data-sharing attribute clause has a type derived from a template, and there are no other references to that variable in the program, the behavior is undefined.
All variables that appear within directive clauses must be visible. Clauses may be repeated as needed, but no variable may be specified in more than one clause, except that a variable can be specified in both a firstprivate and a lastprivate clause.
The following sections describe the data-sharing attribute clauses:
private (Section 4.7.2.1)
firstprivate (Section 4.7.2.2)
lastprivate (Section 4.7.2.3)
shared (Section 4.7.2.4)
default (Section 4.7.2.5)
reduction (Section 4.7.2.6)
copyin (Section 4.7.2.7)
The private clause declares the variables in variable-list to be private to each thread in a team. The syntax of the private clause is as follows:
private(variable-list)
The behavior of a variable specified in a private clause is as follows. A new object with automatic storage duration is allocated for the construct. The size and alignment of the new object are determined by the type of the variable. This allocation occurs once for each thread in the team, and a default constructor is invoked for a class object if necessary; otherwise the initial value is indeterminate. The original object referenced by the variable has an indeterminate value upon entry to the construct, must not be modified within the dynamic extent of the construct, and has an indeterminate value upon exit from the construct.
In the lexical extent of the directive construct, the variable references the new private object allocated by the thread.
The restrictions to the private clause are as follows:
A variable with a class type that is specified in a private clause must have an accessible, unambiguous default constructor.
A variable specified in a private clause must not have a const-qualified type unless it has a class type with a mutable member.
A variable specified in a private clause must not have an incomplete type or a reference type.
Variables that appear in the reduction clause of a parallel directive cannot be specified in a private clause on a work-sharing directive that binds to the parallel construct.
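A minimal sketch of the private clause follows, assuming a hypothetical helper named `square_all`. The private copy of `tmp` has an indeterminate initial value, so each iteration assigns it before reading it, and the original `tmp` is not read after the construct.

```c
/* Writes in[i] * in[i] to out[i] for i in [0, n). */
void square_all(const int *in, int *out, int n)
{
    int tmp = 0;    /* original object; not relied upon after the loop */

    #pragma omp parallel for private(tmp)
    for (int i = 0; i < n; i++) {
        tmp = in[i];            /* assign first: the private copy starts
                                   with an indeterminate value */
        out[i] = tmp * tmp;
    }
    /* tmp is indeterminate here and must not be read. */
}
```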
The firstprivate clause provides a superset of the functionality provided by the private clause. The syntax of the firstprivate clause is as follows:
firstprivate(variable-list)
Variables specified in variable-list have private clause semantics, as described in Section 4.7.2.1. The initialization or construction happens as if it were done once per thread, prior to the thread's execution of the construct. For a firstprivate clause on a parallel construct, the initial value of the new private object is the value of the original object that exists immediately prior to the parallel construct for the thread that encounters it. For a firstprivate clause on a work-sharing construct, the initial value of the new private object for each thread that executes the work-sharing construct is the value of the original object that exists prior to the point in time that the same thread encounters the work-sharing construct.
The restrictions to the firstprivate clause are as follows:
A variable specified in a firstprivate clause must not have an incomplete type or a reference type.
A variable with a class type that is specified as firstprivate must have an accessible, unambiguous copy constructor.
Variables that are private within a parallel region or that appear in the reduction clause of a parallel directive cannot be specified in a firstprivate clause on a work-sharing directive that binds to the parallel construct.
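The following sketch shows one case where firstprivate is useful; the function name `add_weights` and the array `scratch` are hypothetical. Each thread receives its own copy of `scratch`, initialized from the original, so the elements that are never overwritten keep their initial values while element 0 can be updated without a race.

```c
/* Stores i + 2 + 3 in out[i] for i in [0, n). */
void add_weights(int *out, int n)
{
    int scratch[3] = {0, 2, 3};

    #pragma omp parallel for firstprivate(scratch)
    for (int i = 0; i < n; i++) {
        scratch[0] = i;     /* per-thread copy: no race on this write */
        /* scratch[1] and scratch[2] still hold the initial values 2 and 3 */
        out[i] = scratch[0] + scratch[1] + scratch[2];
    }
}
```

With a plain private clause, `scratch[1]` and `scratch[2]` would be indeterminate inside the loop; with a shared array, the write to `scratch[0]` would be a data race.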
The lastprivate clause provides a superset of the functionality provided by the private clause. The syntax of the lastprivate clause is as follows:
lastprivate(variable-list)
Variables specified in the variable-list have private clause semantics. When a lastprivate clause appears on the directive that identifies a work-sharing construct, the value of each lastprivate variable from the sequentially last iteration of the associated loop, or the lexically last section directive, is assigned to the variable's original object. Variables that are not assigned a value by the last iteration of the for or parallel for, or by the lexically last section of the sections or parallel sections directive, have indeterminate values after the construct. Unassigned subobjects also have an indeterminate value after the construct.
The restrictions to the lastprivate clause are as follows:
All restrictions for private apply.
A variable with a class type that is specified as lastprivate must have an accessible, unambiguous copy assignment operator.
Variables that are private within a parallel region or that appear in the reduction clause of a parallel directive cannot be specified in a lastprivate clause on a work-sharing directive that binds to the parallel construct.
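As a small illustration of lastprivate, the sketch below uses a hypothetical function `last_square`. The value assigned in the sequentially last iteration (i equal to n - 1) is copied back to the original variable, whichever thread happens to execute that iteration. The function requires n > 0, because with no iterations the variable would be indeterminate after the construct.

```c
#include <assert.h>

/* Returns (n - 1) * (n - 1), computed by the sequentially last iteration. */
int last_square(int n)
{
    int last = -1;

    assert(n > 0);  /* with zero iterations, last would be indeterminate */

    #pragma omp parallel for lastprivate(last)
    for (int i = 0; i < n; i++)
        last = i * i;

    return last;    /* value assigned when i == n - 1 */
}
```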
This clause shares variables that appear in the variable-list among all the threads in a team. All threads within a team access the same storage area for shared variables. The syntax of the shared clause is as follows:
shared(variable-list)
The default clause allows the user to affect the data-sharing attributes of variables. The syntax of the default clause is as follows:
default(shared | none)
Specifying default(shared) is equivalent to explicitly listing each currently visible variable in a shared clause, unless it is threadprivate or const-qualified. In the absence of an explicit default clause, the default behavior is the same as if default(shared) were specified.
Specifying default(none) requires that at least one of the following be true for every reference to a variable in the lexical extent of the parallel construct:
The variable is explicitly listed in a data-sharing attribute clause of a construct that contains the reference.
The variable is declared within the parallel construct.
The variable is threadprivate.
The variable has a const-qualified type.
The variable is the loop control variable for a for loop that immediately follows a for or parallel for directive, and the variable reference appears inside the loop.
Specifying a variable on a firstprivate, lastprivate, or reduction clause of an enclosed directive causes an implicit reference to the variable in the enclosing context. Such implicit references are also subject to the requirements listed above.
Only a single default clause may be specified on a parallel directive.
A variable's default data-sharing attribute can be overridden by using the private, firstprivate, lastprivate, reduction, and shared clauses, as demonstrated by the following example:
#pragma omp parallel for default(shared) firstprivate(i) private(x) private(r) lastprivate(i)
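The default(none) requirements can be sketched with a dot product; the function name `dot` is hypothetical. Every variable referenced in the construct is either listed in a clause or is the loop control variable declared in the loop itself, so the compiler can reject any variable whose sharing was overlooked.

```c
/* Returns the dot product of a and b over n elements. */
double dot(const double *a, const double *b, int n)
{
    double sum = 0.0;

    /* a, b, and n must be listed: the pointers themselves are not
     * const-qualified, and n is an ordinary int. The loop variable i
     * is declared inside the construct and needs no clause. */
    #pragma omp parallel for default(none) shared(a, b, n) reduction(+: sum)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Removing `n` from the shared clause should make an OpenMP compiler report an error rather than silently apply the default.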
This clause performs a reduction on the scalar variables that appear in variable-list, with the operator op. The syntax of the reduction clause is as follows:
reduction(op:variable-list)
A reduction is typically specified for a statement with one of the following forms:
x = x op expr
x binop= expr
x = expr op x (except for subtraction)
x++
++x
x--
--x
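where:
x is one of the reduction variables specified in variable-list.
variable-list is a comma-separated list of scalar reduction variables.
expr is an expression with scalar type that does not reference x.
op is not an overloaded operator but one of +, *, -, &, ^, |, &&, or ||.
binop is not an overloaded operator but one of +, *, -, &, ^, or |.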
The following is an example of the reduction clause:
#pragma omp parallel for reduction(+: a, y) reduction(||: am)
for (i=0; i<n; i++) {
    a += b[i];
    y = sum(y, c[i]);
    am = am || b[i] == c[i];
}
As shown in the example, an operator may be hidden inside a function call. The user should be careful that the operator specified in the reduction clause matches the reduction operation.
Although the right operand of the || operator has no side effects in this example, they are permitted, but should be used with care. In this context, a side effect that is guaranteed not to occur during sequential execution of the loop may occur during parallel execution. This difference can occur because the order of execution of the iterations is indeterminate.
The operator is used to determine the initial value of any private variables used by the compiler for the reduction and to determine the finalization operator. Specifying the operator explicitly allows the reduction statement to be outside the lexical extent of the construct. Any number of reduction clauses may be specified on the directive, but a variable may appear in at most one reduction clause for that directive.
A private copy of each variable in variable-list is created, one for each thread, as if the private clause had been used. The private copy is initialized according to the operator (see Table 4-2).
At the end of the region for which the reduction clause was specified, the original object is updated to reflect the result of combining its original value with the final value of each of the private copies using the operator specified. The reduction operators are all associative (except for subtraction), and the compiler may freely reassociate the computation of the final value. (The partial results of a subtraction reduction are added to form the final value.)
The value of the original object becomes indeterminate when the first thread reaches the containing clause and remains so until the reduction computation is complete. Normally, the computation will be complete at the end of the construct; however, if the reduction clause is used on a construct to which nowait is also applied, the value of the original object remains indeterminate until a barrier synchronization has been performed to ensure that all threads have completed the reduction clause.
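The combining step described above can be sketched as follows, assuming a hypothetical function `sum_from`. The private copies start at the operator's initialization value (0 for +, 1 for &&), and at the end of the region they are combined with the original values of `sum` and `ok`, so the starting value passed in `start` is preserved in the result.

```c
/* Returns start plus the sum of v[0..n-1]; sets *all_positive to 1 if
 * every element is greater than zero, and to 0 otherwise. */
int sum_from(int start, const int *v, int n, int *all_positive)
{
    int sum = start;    /* original value is combined with the partial sums */
    int ok = 1;

    #pragma omp parallel for reduction(+: sum) reduction(&&: ok)
    for (int i = 0; i < n; i++) {
        sum += v[i];
        ok = ok && (v[i] > 0);
    }

    *all_positive = ok;
    return sum;
}
```

Integer operands are used here so that reassociation by the compiler cannot change the result.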
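The following table (Table 4-2) lists the operators that are valid and their canonical initialization values. The actual initialization value will be consistent with the data type of the reduction variable.

| Operator | Initialization value |
|---|---|
| + | 0 |
| * | 1 |
| - | 0 |
| & | ~0 |
| \| | 0 |
| ^ | 0 |
| && | 1 |
| \|\| | 0 |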
The restrictions to the reduction clause are as follows:
The type of the variables in the reduction clause must be valid for the reduction operator except that pointer types and reference types are never permitted.
A variable that is specified in the reduction clause must not be const-qualified.
Variables that are private within a parallel region or that appear in the reduction clause of a parallel directive cannot be specified in a reduction clause on a work-sharing directive that binds to the parallel construct.
#pragma omp parallel private(y)
{ /* ERROR - private variable y cannot be specified
     in a reduction clause */
    #pragma omp for reduction(+: y)
    for (i=0; i<n; i++)
        y += b[i];
}

/* ERROR - variable x cannot be specified in both
   a shared and a reduction clause */
#pragma omp parallel for shared(x) reduction(+: x)
The copyin clause provides a mechanism to assign the same value to threadprivate variables for each thread in the team executing the parallel region. For each variable specified in a copyin clause, the value of the variable in the master thread of the team is copied, as if by assignment, to the thread-private copies at the beginning of the parallel region. The syntax of the copyin clause is as follows:
copyin(variable-list)
The restrictions to the copyin clause are as follows:
A variable that is specified in the copyin clause must have an accessible, unambiguous copy assignment operator.
A variable that is specified in the copyin clause must be a threadprivate variable.
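A minimal sketch of copyin follows; the names `seed` and `sum_seeds` are hypothetical. The master thread sets its copy of the threadprivate variable in the serial region, and copyin assigns that value to every thread's copy at the start of the parallel region, so each iteration sees the same value regardless of which thread runs it.

```c
/* Hypothetical threadprivate setting; the name is illustrative only. */
static int seed = 1;
#pragma omp threadprivate(seed)

/* Returns n * master_seed, read through the per-thread copies of seed. */
int sum_seeds(int master_seed, int n)
{
    int total = 0;

    seed = master_seed;     /* serial region: the master thread's copy */

    #pragma omp parallel for copyin(seed) reduction(+: total)
    for (int i = 0; i < n; i++)
        total += seed;      /* every thread's copy equals master_seed */

    return total;
}
```

Without the copyin clause, threads other than the master would read their own copies of `seed`, which would not reflect the assignment made in the serial region.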
The copyprivate clause provides a mechanism to use a private variable to broadcast a value from one member of a team to the other members. It is an alternative to using a shared variable for the value when providing such a shared variable would be difficult (for example, in a recursion requiring a different variable at each level). The copyprivate clause can appear only on the single directive.
The syntax of the copyprivate clause is as follows:
copyprivate(variable-list)
The effect of the copyprivate clause on the variables in its variable-list occurs after the execution of the structured block associated with the single construct, and before any of the threads in the team have left the barrier at the end of the construct. Then, in all other threads in the team, for each variable in the variable-list, that variable becomes defined (as if by assignment) with the value of the corresponding variable in the thread that executed the construct's structured block.
Restrictions to the copyprivate clause are as follows:
A variable that is specified in the copyprivate clause must not appear in a private or firstprivate clause for the same single directive.
If a single directive with a copyprivate clause is encountered in the dynamic extent of a parallel region, all variables specified in the copyprivate clause must be private in the enclosing context.
A variable that is specified in the copyprivate clause must have an accessible, unambiguous copy assignment operator.
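The broadcast behavior can be sketched as follows, using a hypothetical function `broadcast_sum`. The variable `token` is private because it is declared inside the parallel region; one thread assigns it inside the single construct, and copyprivate copies that value to the `token` of every other thread before any thread leaves the barrier at the end of the construct.

```c
/* Returns n * value, where value reaches each thread via copyprivate. */
int broadcast_sum(int value, int n)
{
    int total = 0;

    #pragma omp parallel reduction(+: total)
    {
        int token = 0;          /* private: declared inside the region */

        #pragma omp single copyprivate(token)
        token = value;          /* executed by one thread, then broadcast */

        #pragma omp for
        for (int i = 0; i < n; i++)
            total += token;     /* every thread now holds the same token */
    }
    return total;
}
```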