4.4. Work-sharing Constructs

A work-sharing construct divides the execution of the enclosed code region among the members of the team that encounter it. A work-sharing construct must be enclosed within a parallel region in order for the directive to execute in parallel. The work-sharing directives do not launch new threads, and there is no implied barrier on entry to a work-sharing construct.

The following restrictions apply to the work-sharing directives:

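The following sections describe the work-sharing directives:

  • DO and END DO directives (Section 4.4.1)

  • SECTION, SECTIONS, and END SECTIONS directives (Section 4.4.2)

  • SINGLE and END SINGLE directives (Section 4.4.3)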

4.4.1. Specify Parallel Execution: DO and END DO Directives

The DO directive specifies that the iterations of the immediately following DO loop must be divided among the threads in the parallel region. If there is no enclosing parallel region, the DO loop is executed serially.

The loop that follows a DO directive cannot be a DO WHILE or a DO loop without loop control.
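The difference between serial and parallel execution of a DO directive can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names ORPHAN and FILL are hypothetical, and the sketch assumes that the DO directive may appear in a routine that is called from a parallel region, as the OpenMP Fortran API allows.

```fortran
      PROGRAM ORPHAN
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N)
! Called outside any parallel region: the loop runs serially.
      CALL FILL(A, N, 1.0)
      PRINT *, 'serial call:  ', A
! Called inside a parallel region: the iterations are divided
! among the threads of the team.
!$OMP PARALLEL SHARED(A)
      CALL FILL(A, N, 2.0)
!$OMP END PARALLEL
      PRINT *, 'parallel call:', A
      END

! FILL is a hypothetical helper that contains the DO directive.
      SUBROUTINE FILL(A, N, SCALE)
      INTEGER N, I
      REAL A(N), SCALE
!$OMP DO
      DO I = 1, N
        A(I) = SCALE * REAL(I)
      ENDDO
!$OMP END DO
      END
```

Both calls store the same kind of result in A; only the way the iterations are executed differs.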

The format of this directive is as follows:

!$OMP DO [clause[[,] clause]...]

do_loop

[!$OMP END DO [NOWAIT]]

clause 

clause can be one of the following:

  • PRIVATE(var[, var] ...)

  • FIRSTPRIVATE(var[, var] ...)

  • LASTPRIVATE(var[, var] ...)

  • REDUCTION({operator|intrinsic}:var[, var] ...)

  • SCHEDULE(type[,chunk])

  • ORDERED

The SCHEDULE and ORDERED clauses are described in this section. The PRIVATE, FIRSTPRIVATE, LASTPRIVATE, and REDUCTION clauses are described in Section 4.7.2.

do_loop 

A DO loop.

If ordered sections are contained in the dynamic extent of the DO directive, the ORDERED clause must be present. The code enclosed within an ordered section is executed in the order in which it would be executed in a sequential execution of the loop. For more information on ordered sections, see the ORDERED directive in Section 4.6.6.
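The interaction between the ORDERED clause and an ordered section can be sketched as follows. The program name ORDDEMO and the variable SQ are hypothetical, chosen only for this illustration.

```fortran
      PROGRAM ORDDEMO
      INTEGER N
      PARAMETER (N = 6)
      INTEGER I
      REAL SQ
!$OMP PARALLEL PRIVATE(SQ)
!$OMP DO ORDERED
      DO I = 1, N
! This statement may run concurrently and in any order.
        SQ = REAL(I) * REAL(I)
!$OMP ORDERED
! This statement runs in iteration order: I = 1, 2, ..., N.
        PRINT *, I, SQ
!$OMP END ORDERED
      ENDDO
!$OMP END DO
!$OMP END PARALLEL
      END
```

Only the code inside the ordered section is serialized; the computation of SQ can still overlap across threads.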

The SCHEDULE clause specifies how iterations of the DO loop are divided among the threads of the team. Within the SCHEDULE(type[,chunk]) clause syntax, type can be one of the following:

type 

Effect

STATIC 

When SCHEDULE(STATIC,chunk) is specified, iterations are divided into pieces of a size specified by chunk. The pieces are statically assigned to threads in the team in a round-robin fashion in the order of the thread number. chunk must be a scalar integer expression.

When no chunk is specified, the iterations are divided among threads in contiguous pieces, and one piece is assigned to each thread.

Deferred Implementation.

DYNAMIC 

When SCHEDULE(DYNAMIC,chunk) is specified, the iterations are broken into pieces of a size specified by chunk. As each thread finishes its iterations, it dynamically obtains the next set of iterations.

When no chunk is specified, it defaults to 1. Performance, however, may be better when chunk is set to a small multiple of the vector length of your machine. This is particularly true when the loop body is small. The vector length of CRAY SV1, CRAY J90, CRAY Y-MP E, CRAY Y-MP M90, and CRAY EL systems is 64. The vector length of CRAY C90 and CRAY T90 systems is 128.

This is the default SCHEDULE type.

GUIDED 

When SCHEDULE(GUIDED,chunk) is specified, the iterations are handed out in pieces of exponentially decreasing size. chunk specifies the minimum number of iterations to dispatch each time, except when fewer than chunk iterations remain, at which point the rest are dispatched.

When no chunk is specified, it defaults to 1.

RUNTIME 

When SCHEDULE(RUNTIME) is specified, the scheduling decision is deferred until run time. You cannot specify a chunk with this type.

The schedule type and chunk size can be chosen at run time by setting the OMP_SCHEDULE environment variable. If this environment variable is not set, the resulting schedule is DYNAMIC.

For more information on the OMP_SCHEDULE environment variable, see Section 2.3.
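A loop that leaves the scheduling decision to run time might look like the following sketch. The program name RTSCHED and the variable ISUM are hypothetical; the REDUCTION clause is used here only so that the result does not depend on how the iterations are distributed.

```fortran
      PROGRAM RTSCHED
      INTEGER N
      PARAMETER (N = 1000)
      INTEGER I, ISUM
      ISUM = 0
!$OMP PARALLEL SHARED(ISUM)
! The schedule type and chunk size are taken from OMP_SCHEDULE.
!$OMP DO SCHEDULE(RUNTIME) REDUCTION(+:ISUM)
      DO I = 1, N
        ISUM = ISUM + I
      ENDDO
!$OMP END DO
!$OMP END PARALLEL
! The sum of 1 through 1000 is 500500 under any schedule.
      PRINT *, 'sum =', ISUM
      END
```

Setting OMP_SCHEDULE to a value such as GUIDED,4 before running the program changes how the iterations are handed out without recompiling; the printed sum should stay the same.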

Note: The OpenMP Fortran API does not define a default scheduling mechanism. You should not rely on a particular implementation of a schedule type for correct execution because it is possible to have variations in the implementations of the same schedule type across different compilers.
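One way to observe why a program should not depend on a particular assignment of iterations is to record which thread executes each iteration. The following is a sketch under stated assumptions: the names OWNERS, OWNER, and IAM are hypothetical, and OMP_GET_THREAD_NUM is the OpenMP library function that returns the number of the calling thread. Lines that begin with the !$ sentinel are compiled only when OpenMP is enabled.

```fortran
      PROGRAM OWNERS
      INTEGER N
      PARAMETER (N = 12)
      INTEGER I, IAM, OWNER(N)
      INTEGER OMP_GET_THREAD_NUM
!$OMP PARALLEL PRIVATE(IAM) SHARED(OWNER)
! Without OpenMP, the single thread is treated as thread 0.
      IAM = 0
!$    IAM = OMP_GET_THREAD_NUM()
!$OMP DO SCHEDULE(DYNAMIC, 2)
      DO I = 1, N
! Record the thread that executed iteration I.
        OWNER(I) = IAM
      ENDDO
!$OMP END DO
!$OMP END PARALLEL
      PRINT *, 'iteration owners:', OWNER
      END
```

With a DYNAMIC schedule, the printed owners can differ from run to run, so correct results must not depend on which thread executes which iteration.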

If an END DO directive is not specified, it is assumed at the end of the DO loop. If NOWAIT is specified on the END DO directive, threads do not synchronize at the end of the parallel loop. Threads that finish early proceed straight to the instructions following the loop without waiting for the other members of the team to finish the DO directive.

Example. If there are multiple independent loops within a parallel region, you can use the NOWAIT clause to avoid the implied BARRIER at the end of the DO directive, as follows:

!$OMP PARALLEL
!$OMP DO
      DO I=2,N
        B(I) = (A(I) + A(I-1)) / 2.0
      ENDDO
!$OMP END DO NOWAIT
!$OMP DO
      DO I=1,M
        Y(I) = SQRT(Z(I))
      ENDDO
!$OMP END DO NOWAIT
!$OMP END PARALLEL

Parallel DO loop control variables are block-level entities within the DO loop. If the loop control variable also appears in the LASTPRIVATE variable list of the parallel DO, it is copied out to a variable of the same name in the enclosing PARALLEL region. The variable in the enclosing PARALLEL region must be SHARED if it is specified on the LASTPRIVATE variable list of a DO directive.
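The copy-out of a loop control variable can be sketched as follows. The program name LASTP is hypothetical. The sketch does not assume a specific final value for I; check the value that your compiler produces rather than relying on one.

```fortran
      PROGRAM LASTP
      INTEGER N
      PARAMETER (N = 10)
      INTEGER I
      REAL A(N)
      I = 0
! I must be SHARED in the enclosing PARALLEL region because it
! appears in the LASTPRIVATE list of the DO directive.
!$OMP PARALLEL SHARED(A, I)
!$OMP DO LASTPRIVATE(I)
      DO I = 1, N
        A(I) = REAL(I)
      ENDDO
!$OMP END DO
!$OMP END PARALLEL
! I has been copied out of the DO loop and is defined here.
      PRINT *, 'I after the loop =', I
      END
```

Without the LASTPRIVATE clause, the value of I after the loop should not be relied on.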

The following restrictions apply to the DO directives:

4.4.2. Mark Code for Specific Threads: SECTION, SECTIONS, and END SECTIONS Directives

The SECTIONS directive specifies that the enclosed sections of code are to be divided among threads in the team. It is a noniterative work-sharing construct. Each section is executed once by a thread in the team.

The format of this directive is as follows:

!$OMP SECTIONS [clause[[,] clause]...]

[!$OMP SECTION]

block

[!$OMP SECTION

block]

. . .

!$OMP END SECTIONS [NOWAIT]

clause 

The clause can be one of the following:

  • PRIVATE(var[, var] ...)

  • FIRSTPRIVATE(var[, var] ...)

  • LASTPRIVATE(var[, var] ...)

  • REDUCTION({operator|intrinsic}:var[, var] ...)

The PRIVATE, FIRSTPRIVATE, LASTPRIVATE, and REDUCTION clauses are described in Section 4.7.2.

block 

Denotes a structured block of Fortran statements. You cannot branch into or out of the block.

Each section must be preceded by a SECTION directive, though the SECTION directive is optional for the first section. The SECTION directives must appear within the lexical extent of the SECTIONS/END SECTIONS directive pair. The last section ends at the END SECTIONS directive. Threads that complete execution of their sections wait at a barrier at the END SECTIONS directive unless a NOWAIT is specified.
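The rules above can be illustrated with two independent blocks of work. This is a minimal sketch; the program name SECDEMO and the arrays X and Y are hypothetical. The first SECTION directive is optional and is written out here for clarity.

```fortran
      PROGRAM SECDEMO
      INTEGER N
      PARAMETER (N = 100)
      INTEGER I, J
      REAL X(N), Y(N)
!$OMP PARALLEL SHARED(X, Y) PRIVATE(I, J)
!$OMP SECTIONS
!$OMP SECTION
! First section: executed once by one thread of the team.
      DO I = 1, N
        X(I) = REAL(I)
      ENDDO
!$OMP SECTION
! Second section: may be executed by a different thread.
      DO J = 1, N
        Y(J) = REAL(N - J)
      ENDDO
!$OMP END SECTIONS
!$OMP END PARALLEL
      PRINT *, X(1), X(N), Y(1), Y(N)
      END
```

Because NOWAIT is not specified, all threads wait at the END SECTIONS directive until both sections are complete.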

The following restrictions apply to the SECTIONS directive:

4.4.3. Request Single-thread Execution: SINGLE and END SINGLE Directives

The SINGLE directive specifies that the enclosed code is to be executed by only one thread in the team. Threads in the team that are not executing the SINGLE directive wait at the END SINGLE directive unless NOWAIT is specified.

The format of this directive is as follows:

!$OMP SINGLE [clause[[,] clause]...]

 block

!$OMP END SINGLE [NOWAIT]

clause 

The clause can be one of the following:

  • PRIVATE(var[, var] ...)

  • FIRSTPRIVATE(var[, var] ...)

The PRIVATE and FIRSTPRIVATE clauses are described in Section 4.7.2.

block 

Denotes a structured block of Fortran statements. You cannot branch into or out of the block.

Example. In the following code fragment, the first thread that encounters the SINGLE directive executes subroutines OUTPUT and INPUT. Do not make any assumptions about which thread executes the SINGLE section. All other threads skip the SINGLE section and wait at the implied barrier at the END SINGLE directive. If the other threads can proceed without waiting for the thread that executes the SINGLE section, specify NOWAIT on the END SINGLE directive.

!$OMP PARALLEL DEFAULT(SHARED)
      CALL WORK(X)
!$OMP BARRIER
!$OMP SINGLE
      CALL OUTPUT(X)
      CALL INPUT(Y)
!$OMP END SINGLE
      CALL WORK(Y)
!$OMP END PARALLEL
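A variation with NOWAIT can be sketched as follows. The program name SNGL is hypothetical. The sketch assumes that the work after the SINGLE section does not depend on anything done inside it, which is what makes NOWAIT safe here.

```fortran
      PROGRAM SNGL
      INTEGER N
      PARAMETER (N = 1000)
      INTEGER I
      REAL A(N)
!$OMP PARALLEL SHARED(A)
!$OMP SINGLE
! Executed by exactly one thread of the team.
      PRINT *, 'starting work'
!$OMP END SINGLE NOWAIT
! The other threads do not wait; they start the loop at once.
!$OMP DO
      DO I = 1, N
        A(I) = SQRT(REAL(I))
      ENDDO
!$OMP END DO
!$OMP END PARALLEL
      PRINT *, 'A(N) =', A(N)
      END
```

If the loop read data produced inside the SINGLE section, the NOWAIT would have to be removed so that the implied barrier protects that data.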