Cray T3E™ Fortran Optimization Guide - 004-2518-002

Chapter 2. Parallel Virtual Machine (PVM)

The PVMFBCAST(3) and PVMFMCAST(3) routines offer two methods of sending messages to multiple PEs in a single call. The broadcast routine, PVMFBCAST, sends to all PEs in a group, whether that group consists of all PEs involved in the job or a predefined subset of all PEs. The multicast routine, PVMFMCAST, sends to all PEs with PE numbers that appear in an array that you define.
Although PVMFMCAST provides more flexibility concerning which PEs will receive the message, PVMFBCAST is usually faster. If the group name you give to PVMFBCAST is the global name, PVMALL, PVM uses an optimized method to transfer the data. Instead of the broadcasting PE sending directly to all other PEs, it sends to half the PEs. When these PEs receive the message, they each forward it to half the remaining PEs, and so on. (For an illustration, see Figure 2-1.) This provides better and more scalable performance in the following situations:
- If the number of PEs is approximately 32 or larger. There is usually extra time involved in forwarding such messages, meaning the forwarding method may not be as efficient with a smaller number of PEs.
- If the data packets are small (less than or equal to PVM_DATA_MAX). If they are larger, the forwarding method is abandoned, and all the receiving PEs try to do remote loads from the sending PE at more or less the same time.
If you use a group name representing a subset of the PEs (a name other than PVMALL), there is no special optimization. PVM simply goes through the list of PEs in the group and sends to each PE.
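As a rough illustration of broadcasting to a subset group, the following sketch has the even-numbered PEs join a group and PE 0 broadcast to it. It assumes the standard PVM 3 group routines PVMFJOINGROUP and PVMFBARRIER are available; the group name EVENS and the program name GBCAST are hypothetical, and the barrier is included on the assumption that all members must have joined before the broadcast is issued. Treat it as a sketch under those assumptions, not as a tested T3E program.

```fortran
      PROGRAM GBCAST
      INCLUDE 'fpvm3.h'
      PARAMETER(LEN=10)
      INTEGER MYTID, ME, NPES, INUM, NEVEN, ISTAT
      DIMENSION ARR(LEN)
C     Obtain task id, PE number, and number of PEs as in the
C     other examples in this section
      CALL PVMFMYTID(MYTID)
      CALL PVMFGETPE(MYTID, ME)
      CALL PVMFGSIZE(PVMALL, NPES)
C     Number of even-numbered PEs among PEs 0 .. NPES-1
      NEVEN = (NPES + 1) / 2
      IF (MOD(ME, 2) .EQ. 0) THEN
C        'EVENS' is a hypothetical group name chosen for this sketch
         CALL PVMFJOINGROUP('EVENS', INUM)
C        Assumption: wait until every even PE has joined the group
         CALL PVMFBARRIER('EVENS', NEVEN, ISTAT)
         IF (ME .EQ. 0) THEN
            DO I = 1, LEN
               ARR(I) = I / 2.0
            ENDDO
            CALL PVMFINITSEND(PvmDataRaw, ISTAT)
            CALL PVMFPACK(REAL8, ARR, LEN, 1, ISTAT)
C           No PVMALL optimization applies to a subset group
            CALL PVMFBCAST('EVENS', LEN, ISTAT)
         ELSE
            CALL PVMFRECV(0, LEN, ISTAT)
            CALL PVMFUNPACK(REAL8, ARR, LEN, 1, ISTAT)
         ENDIF
      ENDIF
      END
```

Odd-numbered PEs take no part in the transfer here. Because the group is not PVMALL, the sender loops over the group members one at a time, as described above.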
The PVMFMCAST routine does not offer special optimizations. PVM goes through the specified array of PE numbers and sends to each PE.
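To get a feel for why the forwarding scheme scales better than a sequential send loop, the following stand-alone sketch counts communication steps under a simplified model. It does not call PVM and is not the PVM implementation: it assumes that in each forwarding round every PE already holding the message passes it to one PE that does not, so the number of holders doubles, and it ignores the forwarding overhead that makes the scheme less attractive for small PE counts. The program name ROUNDS and its variable names are made up for this illustration.

```fortran
      PROGRAM ROUNDS
C     Simplified model only; no PVM calls are made
      INTEGER NPES, NHAVE, NTREE, NSEQ
      WRITE(*,*) 'PEs, forwarding rounds, sequential sends'
      NPES = 8
      DO WHILE (NPES .LE. 512)
C        Forwarding model: the set of PEs holding the message
C        doubles in each round until all PEs have it
         NHAVE = 1
         NTREE = 0
         DO WHILE (NHAVE .LT. NPES)
            NHAVE = MIN(2 * NHAVE, NPES)
            NTREE = NTREE + 1
         ENDDO
C        Sequential model: the sender loops over all other PEs
         NSEQ = NPES - 1
         WRITE(*,*) NPES, NTREE, NSEQ
         NPES = NPES * 4
      ENDDO
      END
```

Under this model, 8, 32, 128, and 512 PEs need 3, 5, 7, and 9 forwarding rounds, compared with 7, 31, 127, and 511 sends issued one after another by a single PE. The gap widens as the PE count grows, which is consistent with the broadcast optimization paying off mainly on larger jobs.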
The following two examples use PVMFBCAST and PVMFMCAST, respectively, to transfer an array of 10 elements to all other PEs attached to the job:
Example 2-3. PVMFBCAST
      PROGRAM BCAST
      INCLUDE 'fpvm3.h'
      PARAMETER(LEN=10)
      INTEGER MYTID, ME, NPES
      DIMENSION ARR(LEN)
C     Use PVM method of obtaining task id, PE number, number of PEs
      CALL PVMFMYTID(MYTID)
      CALL PVMFGETPE(MYTID, ME)
      CALL PVMFGSIZE(PVMALL, NPES)
C     PE 0 initializes, packs, and sends the array of 10 elements
      IF (ME .EQ. 0) THEN
         DO I = 1, LEN
            ARR(I) = I / 2.0
         ENDDO
         CALL PVMFINITSEND(PvmDataRaw, ISTAT)
         CALL PVMFPACK(REAL8, ARR, LEN, 1, ISTAT)
         CALL PVMFBCAST(PVMALL, LEN, ISTAT)
C     All other PEs receive it
      ELSE
         CALL PVMFRECV(0, LEN, ISTAT)
         CALL PVMFUNPACK(REAL8, ARR, LEN, 1, ISTAT)
      ENDIF
C     A representative PE prints the array
      IF (ME .EQ. 2) THEN
         WRITE(*,*) 'The array values are: ', ARR
      ENDIF
      END
Example 2-4. PVMFMCAST
      PROGRAM MCAST
      INCLUDE 'fpvm3.h'
C     NUM_PES is an assumed compile-time upper bound on the PE
C     count (value not given in the original); it must be at
C     least NPES-1
      PARAMETER(LEN=10, NUM_PES=2048)
      INTEGER MYTID, ME, NPES
      DIMENSION ARR(LEN)
      INTEGER PE_ARR(NUM_PES)
C     Use PVM method of obtaining task id, PE number, number of PEs
      CALL PVMFMYTID(MYTID)
      CALL PVMFGETPE(MYTID, ME)
      CALL PVMFGSIZE(PVMALL, NPES)
C     Set up array of PE numbers (every PE except PE 0)
      DO I = 1, NPES-1
         PE_ARR(I) = I
      ENDDO
C     PE 0 initializes, packs, and sends the array of 10 elements
      IF (ME .EQ. 0) THEN
         DO I = 1, LEN
            ARR(I) = I / 2.0
         ENDDO
         CALL PVMFINITSEND(PvmDataRaw, ISTAT)
         CALL PVMFPACK(REAL8, ARR, LEN, 1, IPACK)
C        PE_ARR holds NPES-1 entries, so send to NPES-1 PEs
         CALL PVMFMCAST(NPES-1, PE_ARR, LEN, ICAST)
C     All other PEs receive it
      ELSE
         CALL PVMFRECV(0, LEN, IRECV)
         CALL PVMFUNPACK(REAL8, ARR, LEN, 1, IUPK)
      ENDIF
C     A representative PE prints the array
      IF (ME .EQ. 1) THEN
         WRITE(*,*) 'The array values are: ', ARR
      ENDIF
      END
The output from both programs is as follows:
The array values are: 0.5, 1., 1.5, 2., 2.5, 3., 3.5, 4., 4.5, 5.
The forwarding scheme used by PVMFBCAST matters more as the number of PEs increases, so its advantage is most apparent on larger jobs. Even with as few as eight PEs, however, PVMFBCAST still performs better than PVMFMCAST.