This chapter describes aspects of PVM that are specific to UNICOS/mk systems. On a UNICOS/mk system, which contains up to 2048 processing elements (PEs), some subset of this number of PEs is assigned to a job running on the system. Those PEs are collectively known as a partition.
The UNICOS/mk implementation of PVM can be used in either or both of the following modes:
Stand-alone mode, in which PVM is used for communication (PE-to-PE) within the partition.
Distributed mode, in which PVM is used to communicate outside the partition.
There is one PVM library that is part of the MPT product environments for the UNICOS/mk system. When the UNICOS/mk executable file is initiated, it determines whether it is being used in distributed mode and performs the proper setup. If it determines that it is not being used in distributed mode, certain PVM functions are not available and return errors if called.
This section summarizes special features that can be found in the UNICOS/mk version of PVM and notes differences between it and the other versions. These features are also documented in the applicable PVM man pages.
Most existing UNICOS/mk applications and algorithms are written to use PE numbers for communication. Standard PVM notation uses only the concept of a PVM task identifier (pvm_tid, whose internal representation is subject to change). To simplify programming, the UNICOS/mk version lets you use PE numbers in place of pvm_tids in many of the PVM functions. An extra function, pvm_get_PE(3), returns the PE number associated with a pvm_tid.
PVM supports the concept of dynamic groups, in which tasks can join and leave groups at any time. The barrier and broadcast functions use these groups for collective synchronization and communications. On a UNICOS/mk system, a static, well-defined group consisting of all the tasks (or PEs) in the partition is referred to as the global group. To simplify programming, PVM has essentially predefined this group by permitting a null name (or, in C, a null char pointer) to be used to refer to this global group. (The Fortran PVM include file, fpvm3.h, contains a declaration of a null character variable, PVMALL.) PVM uses some key optimizations to carry out barriers and broadcasts for the global group.
UNICOS/mk applications can use PVM calls to obtain their own PE number. From C these calls are as follows:
    my_pe = pvm_get_PE (pvm_mytid());
From Fortran the calls are as follows:
      CALL PVMFMYTID (MYTID)
      CALL PVMFGETPE (MYTID, MYPE)
UNICOS/mk applications can use PVM calls to obtain the number of PEs in the partition. From C this is as follows:
    n_pes = pvm_gsize(0);
From Fortran the call is as follows:
      CALL PVMFGSIZE (PVMALL, NPES)
The variable PVMALL is declared in fpvm3.h.
The UNICOS/mk version of PVM treats data buffers packed using PvmDataInPlace encoding differently than the network version does. In the UNICOS/mk version, such data must not be reused until the data has been unpacked by the receiving PE. You are responsible for any additional synchronization or communication required to ensure this coordination.
You can control a number of features and settings in PVM. The default behavior and settings of PVM may not be suitable for all or part of some applications, and you may wish to change them. In general, you can set options in two ways:
Many options can be set by using the pvm_setopt(3) function. This function allows an option to be set for a specific PE or to be changed dynamically during execution of an application. For example, if the pvm_parent(3) function is called to see if the application is being used in distributed mode, the following code sequence ensures that a return code of PvmNoParent, which is considered an error, does not cause the program to abort or print out an error message:
    oldvalue = pvm_setopt (PvmAutoErr, 0);
    parent_id = pvm_parent ();
    (void) pvm_setopt (PvmAutoErr, oldvalue);
Many options can be set by using the UNICOS/mk environment variables without changing source code. These take effect with PVM initialization and apply to the application as a whole.
While many options can be set by using either mechanism, some can be set by only one mechanism or the other. This section describes those that you can set by using UNICOS/mk environment variables. Table 3-1 lists the UNICOS/mk environment variables. For more information about the pvm_setopt(3) function, use the man(1) command to view the man page online.
When setting an environment variable, you must ensure that it is available for the UNICOS/mk executable file. If you are using the UNICOS/mk version in stand-alone mode, this means that the environment variable must be set before the executable file is run:
    % setenv PVM_TRACE 7
    % ./t3e.out
If you are using PVM in distributed mode, the PVM daemon starts the UNICOS/mk executable file. Therefore, you must set the environment variable before the daemon is started, as follows:
    % setenv PVM_TRACE 7
    % pvmd3 hostfile
Remember, it is the UNICOS/mk daemon, not the task that calls pvm_spawn(3), that starts the UNICOS/mk executable file.
The PVM_ROOT environment variable specifies the path at which PVM libraries and system programs are installed. For PVM to function, this variable must be set on each PVM system. On UNICOS/mk systems, $PVM_ROOT is set for you automatically when you load the mpt module to access the MPT software.
Table 3-1. UNICOS/mk Environment Variables
Variable | Description | Default |
|---|---|---|
PVM_AUTO_ERR | Sets the PVM error-handling value, which is equivalent to the PvmAutoErr option in pvm_setopt(3). Setting this value with PVM_AUTO_ERR lets you do so without changing your source. | 1 (error reporting on) |
PVM_CHECKING | Certain common PVM operations run the risk of losing data. By default, PVM performs a check to avoid this problem. While the cost of this check is not prohibitive, it can affect performance and might be unnecessary for your application. The PVM_CHECKING environment variable lets you control whether the check is performed. This control is coarse: either the check is performed throughout the entire program or it is not performed at all. When PvmDataDefault encoding is used for packing 64-bit integer data, only the low-order 32 bits are packed. By default, PVM checks whether any of the truncated high-order bits contained significant data and generates an error (PvmLostPrecision) if they did. If you set PVM_CHECKING to 0, this check is not performed. If you set PVM_CHECKING to 1 (the default setting), the check is performed. | 1 (check is performed) |
PVM_DATA_BUFFERS | Sets the initial and incremental number of send buffers. For more information on send buffers, see “Buffer Memory Management”. | Initial: 0 blocks; incremental: 1 block |
PVM_DATA_MAX | Sets the maximum number of bytes in an initial message, as an integer. The specified value must be a multiple of 8. When a message is sent with PVM, the library sends a header and a relatively small amount of data in an initial message. The default size for this data is 4096 bytes. Messages that contain more than this amount of data must transfer the data later in a second, slower transfer. By increasing the amount of data that can be transferred with the initial message, you can reduce communications overhead. The value of PVM_DATA_MAX represents memory that is taken up by internal message pools and allocated for each message structure active in the system (whether or not the memory is actually used for a given message). The larger the value, the more memory is used by PVM and unavailable to the application. The smaller the value, the more messages require a second transfer. PVM_DATA_MAX has a particularly significant impact on the performance of messages broadcast to multiple tasks, because of the way these are implemented on the UNICOS/mk system. If a broadcast is used in a time-critical portion of code, you may want to verify that PVM_DATA_MAX is at least as large as the message being broadcast. | 4096 |
PVM_MAXGTIDS | Changes the maximum number of tasks that can join a group. For information about the out-of-resources error, PvmOutOfResGmems, see “PvmOutOfResGmems”. | sysconf(_SC_CRAY_NPES) (number of PEs in application) |
PVM_MAX_PACK | Sets the initial and incremental data block sizes. For information about setting this variable, see “Send Buffer Initial Size” and “Send Buffer Increment Size”. | Initial: 4096 bytes; incremental: 4096 bytes |
PVM_PE_LIST | Lists the virtual PE numbers within a partition that can communicate with the daemon. You can specify either a comma-separated list of virtual PE numbers or the keyword all. If all is used, all PEs in a partition can communicate with the daemon. | Only PE 0 communicates with the daemon. |
PVM_RETRY_COUNT | Sets the number of times that PVM retries sending a message to another PE before giving up and returning a PvmOutOfResSMP error. For more information, see “PvmOutOfResSMP”. | 500 |
PVM_SM_POOL | When PVM is started up, it allocates a pool of shared memory for use in message passing. This pool represents space used to buffer message headers and small messages while the receiving PE is doing computations or I/O. Each entry or message uses PVM_DATA_MAX plus 32 bytes of memory. The PVM_SM_POOL environment variable sets the number of messages in the pool for each PE, as an integer. For information about the out-of-resources error, PvmOutOfResSMP, see “PvmOutOfResSMP”. | The larger of twice the number of PEs or 10 |
PVM_TOTAL_PACK | Establishes the upper limit on memory allocated for send buffer data blocks. For information about setting this variable, see “Total Memory Use”. | 999,999,999 |
PVM_TRACE | Sets a mask of trace options, equivalent to the PvmTraceOpts options in pvm_setopt(3). Using PVM_TRACE to set these options lets you do so without changing the source. This environment variable controls only the collection of trace data, not its output. The pvm_disptrace(3) function is used to display trace data. | All tracing is off. |
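Table 3-1 requires PVM_DATA_MAX to be a multiple of 8 and suggests making it at least as large as any message broadcast in time-critical code. The following C sketch shows one way to derive such a value from a message size. The helper name data_max_for is hypothetical and is not part of PVM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PVM call): returns the smallest multiple
 * of 8 that is greater than or equal to msg_bytes, which is a candidate
 * value for PVM_DATA_MAX. Overflow near SIZE_MAX is not handled. */
size_t data_max_for(size_t msg_bytes)
{
    size_t rounded = (msg_bytes + 7u) & ~(size_t)7u;
    assert(rounded % 8 == 0);
    return rounded;
}
```

For a 5001-byte broadcast message, for instance, the helper yields 5008. Whether a larger PVM_DATA_MAX is worthwhile still depends on the memory trade-off described in the table.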
When PvmDataDefault or PvmDataRaw encoding is used, PVM allocates and uses blocks of memory on the sending PE. (These blocks are referred to as send buffers.) By default, this allocation and usage is transparent to your application; that is, you should not have to do anything special. However, if your application is trying to optimize its use of memory, you may need to understand how PVM uses memory, and you may want to control PVM memory usage. This section discusses these topics.
The design of buffer memory is based on the following:
By default, all send buffer space is dynamically allocated in the following manner:
Memory is allocated only if needed.
Only the amount of memory needed is allocated.
Portions of memory are freed once they are no longer needed.
By using environment variables, you can control initial allocation of send buffers.
By using environment variables or pvm_setopt(3) calls, you can change the amount of additional memory allocated for each send buffer, and control or prohibit incremental memory units when even more memory is required.
You can specify a total limit on the amount of memory allocated at any one time for the send buffers.
Send buffers are never freed by PVM. Once allocated and used, they are kept for later use. However, any incremental memory allocated for a send buffer is freed as soon as it is no longer needed.
The scenario in Table 3-2 shows PVM memory use, using the default settings.
Table 3-2. Default Settings for Buffer Memory Management
User call | PVM action | Memory use in bytes (sending PE) |
|---|---|---|
pvm_initsend (PvmDataRaw); | Allocates send buffer. | 4096 |
pvm_pkbyte (...32...); | Copies data. (4064 bytes are free.) | 4096 |
pvm_pkbyte (...32000...); | Copies 4064 bytes. Allocates 32000 - 4064 = 27936 bytes. Copies remaining data. | 32,032 |
pvm_pkbyte (...32...); | Allocates 4096 bytes. Copies data. (4064 bytes are free.) | 36,128 |
pvm_pkbyte (...40...); | Copies data. (4024 bytes are free.) | 36,128 |
pvm_send (...); | Sends message. | 36,128 |
pvm_recv (...); | Receives message. | 36,128 |
Final pvm_upkbyte by receiving PE for message or pvm_recv call for next message | Frees incremental data blocks. Returns buffer to free list. | 4096 |
This scenario shows how PVM allocates memory for send buffers. Although 36,128 bytes were allocated, only 32,104 were actually used. The 4096 bytes allocated in the second incremental allocation were used for only 72 bytes.
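The allocation rules behind Table 3-2 can be modeled in a few lines of C. The sketch below is an illustration only: the sendbuf_model type and the model_initsend and model_pack functions are hypothetical names, not PVM calls. It assumes, as the table suggests, that a pack first fills the free space in the current block and then allocates one new block that is at least the minimum increment size.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model only; none of these names belong to the PVM API. */
typedef struct {
    size_t initial;    /* size of the initial block */
    size_t min_incr;   /* minimum size of an incremental block */
    size_t allocated;  /* total bytes allocated so far */
    size_t used;       /* total bytes of packed data */
    size_t free_bytes; /* free bytes left in the newest block */
} sendbuf_model;

/* Models pvm_initsend: one initial block is allocated. */
void model_initsend(sendbuf_model *m, size_t initial, size_t min_incr)
{
    assert(m != NULL);
    m->initial = initial;
    m->min_incr = min_incr;
    m->allocated = initial;
    m->used = 0;
    m->free_bytes = initial;
}

/* Models one pack call of nbytes bytes. */
void model_pack(sendbuf_model *m, size_t nbytes)
{
    assert(m != NULL);
    m->used += nbytes;
    if (nbytes <= m->free_bytes) {
        m->free_bytes -= nbytes;      /* data fits in the current block */
        return;
    }
    /* Fill the current block, then allocate one block for the rest. */
    size_t remaining = nbytes - m->free_bytes;
    size_t block = remaining > m->min_incr ? remaining : m->min_incr;
    m->allocated += block;
    m->free_bytes = block - remaining;
}
```

Initializing the model with 4096 and 4096 and then packing 32, 32000, 32, and 40 bytes in that order reproduces the figures in the table: 36,128 bytes allocated for 32,104 bytes of data.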
The following parameters are available for controlling send buffer memory use:
Initial number of send buffers
Send buffer increment
Send buffer initial size
Send buffer increment size
Total memory use
You can set all five parameters by using environment variables, which take effect at PVM initialization time. Four of the five can also be set by calling pvm_setopt(3), which changes the settings dynamically at run time. (The fifth parameter, the initial number of send buffers, affects only initialization, so a run-time change would have no effect.) You can call the pvm_getopt(3) function to obtain the current settings for all five parameters.
Only three environment variables are needed to set the five parameters because two of these variables let you set either one or two parameters at once.
By using an environment variable, you set the value for all PEs at once. By calling pvm_setopt(3), you can set different values for different PEs, or you can change a value during the execution of the program. You can, of course, combine the two mechanisms by using the environment variables to set the default values and pvm_setopt(3) to change specific cases.
The following sections discuss how you can use and set these parameters.
During initialization time, PVM allocates an initial number of send buffers. The default is 0; that is, no send buffers are allocated initially. In this case, as soon as you call the pvm_initsend(3) function with the PvmDataDefault or PvmDataRaw option, a new send buffer is dynamically allocated. This requires library calls and possibly an operating system call, and thus is expensive in time. Alternatively, you can initially allocate some send buffers, perhaps enough to avoid having to dynamically allocate any additional buffers.
To set the initial number of send buffers, enter the PVM_DATA_BUFFERS environment variable as follows:
    setenv PVM_DATA_BUFFERS number
In the following example, the PVM_DATA_BUFFERS setting tells PVM to initially allocate 10 send buffers:
    setenv PVM_DATA_BUFFERS 10
The pvm_setopt(3) function does not support this parameter. You can call pvm_getopt(3) with the PvmDataBuffers option to find out the value of PVM_DATA_BUFFERS.
Whenever PVM dynamically allocates a new send buffer, it makes library calls to allocate memory for a specified number of send buffers. The default is 1; that is, PVM allocates enough memory for a single new send buffer. This process is expensive in time because PVM must make another set of library calls to allocate more memory each time a new buffer is needed.
You can amortize the cost of the library calls by using the send buffer increment parameter. This parameter setting tells PVM to allocate enough memory for a specified number of additional buffers each time it needs to allocate memory for a single one.
This parameter can also tell PVM not to allocate additional memory for send buffers. By setting the initial number of send buffers to some number and setting the increment to 0, you can fix the number of send buffers allocated by PVM. In this case, if PVM runs out of send buffers, your application receives a PvmOutOfResBuf error.
The send buffer increment is the second option on the PVM_DATA_BUFFERS environment variable. To set this parameter, enter PVM_DATA_BUFFERS as follows:
    setenv PVM_DATA_BUFFERS number+increment
In the following example, the PVM_DATA_BUFFERS setting tells PVM to initially allocate 10 send buffers and to allocate 4 more at a time if more buffers are needed, up to a total of PVM_TOTAL_PACK:
    setenv PVM_DATA_BUFFERS 10+4
You can use the PvmDataBuffersIncr option with pvm_setopt(3) to change the setting dynamically. You can also use this option with pvm_getopt(3) to see the send buffer's increment setting.
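The number+increment form used by PVM_DATA_BUFFERS is easy to split in application or tooling code, for example to validate a setting before a job is launched. The following C sketch is a hypothetical parser written for illustration; it is not the code that PVM itself uses, and the name parse_pair is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical parser (not PVM's own): accepts "number" or
 * "number+increment". Returns 0 on success and -1 on malformed input.
 * *incr is left unchanged when no "+increment" part is present, so the
 * caller can preload it with the default. */
int parse_pair(const char *text, long *first, long *incr)
{
    char *end;
    assert(text != NULL && first != NULL && incr != NULL);

    if (!isdigit((unsigned char)*text))
        return -1;
    long a = strtol(text, &end, 10);
    if (*end == '\0') {
        *first = a;               /* only the first value was given */
        return 0;
    }
    if (*end != '+')
        return -1;

    const char *rest = end + 1;
    if (!isdigit((unsigned char)*rest))
        return -1;
    long b = strtol(rest, &end, 10);
    if (*end != '\0')
        return -1;
    *first = a;
    *incr = b;
    return 0;
}
```

With incr preloaded to 1 (the default increment), the string 10 leaves the increment at 1, while 10+4 sets the initial count to 10 and the increment to 4.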
Each send buffer contains an initial block of memory for use in packing data. The default is 4096 bytes. If more is needed, PVM makes library calls to allocate an additional block. If less is needed, the difference is wasted. If you know that most messages in your code are of a specific size, you can set this parameter to that size to avoid wasting memory or allocating additional blocks.
To set the send buffer initial size, enter the PVM_MAX_PACK environment variable as follows:
    setenv PVM_MAX_PACK initial
In the following example, the PVM_MAX_PACK setting tells PVM to initially allocate 16,384 bytes of memory for each send buffer:
    setenv PVM_MAX_PACK 16384
You can use the PvmMaxPack option with pvm_setopt(3) to change the setting dynamically. You can also use this option with pvm_getopt(3) to see the send buffer initial size.
When PVM dynamically allocates an additional block of memory, it uses a minimum allocation size. The default is 4096 bytes. If PVM needs less than this amount of memory, it allocates the minimum size. If PVM needs more than this minimum size, it allocates what it needs.
The send buffer increment size parameter enables you to avoid multiple allocations of blocks that are only a few words in length. For example, if most of your messages fit within 4096 bytes, but you have one large message that requires a total of 164,096 bytes, you could set this parameter to 160,000 bytes.
This parameter can also be set to 0 to tell PVM that it must not allocate additional memory blocks. In this case, if the data fails to fit into the initial block, PVM returns a PvmTooMuchData error to your application.
The send buffer increment size is the second option on the PVM_MAX_PACK environment variable. To set this parameter, enter PVM_MAX_PACK as follows:
    setenv PVM_MAX_PACK initial+increment
In the following example, the PVM_MAX_PACK setting tells PVM to initially allocate 4096 bytes of memory for each send buffer, but, if more is needed, to allocate a block no smaller than 160,000 bytes:
    setenv PVM_MAX_PACK 4096+160000
You can use the PvmMaxPackIncr option with pvm_setopt(3) to change the setting dynamically. You can also use this option with pvm_getopt(3) to see the send buffer increment size.
PVM tracks the amount of memory allocated for data blocks, both initial and incremental. The default limit (999,999,999; see Table 3-1) is set so high that, in practice, you are limited only by the available memory in the PE.
The total memory use parameter establishes a limit for the amount of memory allocated. If PVM exceeds this limit, it returns a PvmMemLimit error to your application.
This parameter does not reflect total memory usage by PVM, but only the data block allocation associated with send buffers. For many applications, this is the predominant source for PVM memory usage.
To set the total memory use parameter, enter the PVM_TOTAL_PACK environment variable as follows:
    setenv PVM_TOTAL_PACK limit
In the following example, the PVM_TOTAL_PACK setting tells PVM to use no more than 1,048,576 bytes of memory at any time for send buffer data blocks:
    setenv PVM_TOTAL_PACK 1048576
You can use the PvmTotalPack option with pvm_setopt(3) to change the setting dynamically. You can also use this option with pvm_getopt(3) to see total memory use. To see how much memory is remaining from the current limit, use the PvmTotalPackLeft option with pvm_getopt(3).
In the original scenario (“Simple Scenario, Part 1”), 36,128 bytes of buffer memory were allocated, but only 32,104 were actually used. Memory use could be made most efficient by using PvmDataInPlace encoding, which avoids PVM buffer allocation altogether. But this change may require some additional synchronization within the program, and thus it may not be desirable.
Next in order of simplicity, you could move the large pvm_pkbyte(3) call (with 32,000 bytes) to the end. Consequently, the three small packs would go into the initial 4096 bytes, and just enough bytes would be allocated for the large pvm_pkbyte(3) call.
Instead (or in addition), the following PVM_MAX_PACK settings could be considered to more efficiently manage memory:
    setenv PVM_MAX_PACK 32104
This setting ensures that all the memory needed is allocated with the send buffer. If all message traffic looked like this, this would be most efficient. By setting PVM_MAX_PACK to 32104+0, you could verify that no message exceeded this limit.
    setenv PVM_MAX_PACK 4096+28008
This setting ensures that the first incremental memory allocation is sufficient for the remaining packs. If most messages fit into 4096 bytes, and the rest fit into 32,104 bytes, this setting limits normal memory use while avoiding unnecessary malloc(3) or free(3) calls for the large messages.
This scenario shows only the memory allocation for a single message. A real application has many messages of different sizes; therefore, while PVM_MAX_PACK settings might help this one message, they might have adverse effects on others.
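To compare the alternatives discussed for this scenario, the arithmetic can be expressed as one small C function. This is a sketch under stated assumptions: the function name bytes_allocated is hypothetical, the model follows the allocation behavior described for send buffers (fill the current block, then allocate one block of at least the increment size), and an increment of 0 is modeled as a failure, which is the case that the text says produces a PvmTooMuchData error.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model (not PVM code). Computes the bytes allocated for
 * one message that is packed with the given sequence of pack sizes.
 * Returns 0 and stores the total in *allocated, or returns -1 when
 * min_incr is 0 and the data does not fit in the initial block. */
int bytes_allocated(const size_t *packs, size_t npacks,
                    size_t initial, size_t min_incr, size_t *allocated)
{
    assert(allocated != NULL);
    assert(packs != NULL || npacks == 0);

    size_t total = initial;
    size_t free_bytes = initial;

    for (size_t i = 0; i < npacks; i++) {
        if (packs[i] <= free_bytes) {
            free_bytes -= packs[i];
            continue;
        }
        if (min_incr == 0)
            return -1;            /* no incremental blocks allowed */
        size_t remaining = packs[i] - free_bytes;
        size_t block = remaining > min_incr ? remaining : min_incr;
        total += block;
        free_bytes = block - remaining;
    }
    *allocated = total;
    return 0;
}
```

Under this model, the original pack order with the default 4096+4096 setting allocates 36,128 bytes, while moving the 32,000-byte pack to the end, setting 32104, or setting 4096+28008 each allocates exactly 32,104 bytes.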
If only one PE is sending a large message, another approach is to change the source code so that this PE calls pvm_setopt(3) with PvmMaxPack, and perhaps again with PvmMaxPackIncr, before packing and sending the large message, using the values from the previous setenv commands. For example, you could call pvm_setopt(3) with PvmMaxPack set to 32,104, or you could call it with PvmMaxPack set to 4096 and then with PvmMaxPackIncr set to 28,008.
When running a PVM application on UNICOS/mk systems, you may receive out-of-resource errors. Receiving one of these errors, shown in Table 3-3, means that you have encountered a fixed limit within the PVM implementation.
Table 3-3. Out-of-resource Errors
Error | Fixed limit |
|---|---|
PvmOutOfResSMP | A shared memory pool of messages used in sends |
PvmOutOfResBuf | A preallocated set of data buffers used by pvm_initsend(3), pvm_recv(3), and related functions |
PvmOutOfResGmems | A maximum number of tasks that can join a group |
These limits are fixed for various reasons, but you can raise each of them. However, you should be careful about doing so for two reasons:
Raising a limit causes PVM to allocate more memory, and this memory is not available for your application to use.
Your application may not be using PVM efficiently. Making a simple code change may eliminate the error and also give you better performance.
Two of the out-of-resource conditions (PvmOutOfResSMP and PvmOutOfResBuf) might occur only occasionally, due to unusual timing circumstances. Instead of wasting memory to handle these unlikely situations, consider writing your application to accept these errors if they occur and to retry the action that caused the error until the action succeeds. For example, the following code fragment retries a send until it succeeds:
   10 CONTINUE
      CALL PVMFSEND (OTHERPE, TAG, INFO)
      IF (INFO .EQ. PVMOUTOFRESSMP) GOTO 10
      IF (INFO .LT. 0) CALL ABORT()
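The same retry idea can be written in C. The sketch below is a variation rather than a translation: unlike the Fortran fragment, it bounds the number of attempts so that a persistent failure cannot loop forever. The send operation is passed in as a callback so that the sketch stays self-contained; in a real program the callback would wrap pvm_send(3). The names send_with_retry, send_fn, flaky_send, and EXAMPLE_OUT_OF_RES_SMP are hypothetical, and the numeric value of the constant is a placeholder, not the real value of PvmOutOfResSMP.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value (an assumption): the real PvmOutOfResSMP constant
 * is defined by the PVM header and is not reproduced here. */
#define EXAMPLE_OUT_OF_RES_SMP (-100)

/* Hypothetical callback that stands in for pvm_send(); it returns 0 on
 * success or a negative error code. */
typedef int (*send_fn)(void *ctx);

/* Retries the send while it reports the out-of-resource code, up to
 * max_attempts times. Stores the number of attempts made in *attempts
 * (if non-NULL) and returns the last result code. */
int send_with_retry(send_fn send, void *ctx, int max_attempts, int *attempts)
{
    assert(send != NULL && max_attempts > 0);
    int info = EXAMPLE_OUT_OF_RES_SMP;
    int tries = 0;
    while (tries < max_attempts) {
        info = send(ctx);
        tries++;
        if (info != EXAMPLE_OUT_OF_RES_SMP)
            break;                /* success, or a different error */
    }
    if (attempts != NULL)
        *attempts = tries;
    return info;
}

/* Stand-in sender for experiments: fails with the out-of-resource code
 * until the counter that ctx points to reaches 0, then succeeds. */
int flaky_send(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return EXAMPLE_OUT_OF_RES_SMP;
    }
    return 0;
}
```

Any other negative result is returned to the caller immediately, which corresponds to the ABORT branch in the Fortran fragment.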
Out-of-resource errors often appear when you are increasing the number of processors being used or the size of the problem being solved. Several options are available for dealing with the limits you encounter. The following sections briefly discuss each limit, describe how to raise it, and identify ways to use PVM more efficiently.
A pool of memory is allocated in each PE to receive messages from other PEs. When a message is sent, the sending PE uses part of the pool on the receiving PE for the message. At the beginning of various PVM functions, a receiving PE checks for any messages in this pool and clears them out. If too many PEs try to send messages before a PE can clear out the pool, the pool becomes exhausted, and subsequent sends may fail with the PvmOutOfResSMP error.
By default, sends that detect this condition enter a retry loop, in which they delay briefly and then recheck the pool. This loop is performed PVM_RETRY_COUNT times (default is 500), and the PvmOutOfResSMP error is issued at the end of this count. You can adjust this limit up or down as described in Table 3-1. Many applications will find that increasing this count is sufficient to get by the error.
You can also adjust the number of entries in the pool. The default limit is twice the number of PEs or 10, whichever is larger. You can raise or lower this limit by using the PVM_SM_POOL environment variable, described in Table 3-1.
If you are hitting the pool entry limit, you may want to see if the receiving PE can be changed to call pvm_recv(3) or pvm_nrecv(3) sooner. This problem can occur if all PEs are broadcasting to each other and then trying to receive the results. By interspersing the broadcasts with the receives, you may avoid having to raise the limit.
You may also hit the pool entry limit if many messages are being sent to a PE that is busy doing some computation, waiting for I/O, or doing something else that keeps it from entering PVM. Increasing the limit allows such operations to proceed asynchronously; changing the code to operate more synchronously is another option.
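Before raising PVM_SM_POOL, it can help to estimate the memory cost. The following C sketch combines two facts stated in this chapter: the default number of pool entries is the larger of twice the number of PEs or 10, and each entry uses PVM_DATA_MAX plus 32 bytes. The function names are hypothetical, and the result is an estimate of the pool size only, not of total PVM memory use.

```c
#include <assert.h>
#include <stddef.h>

/* Default number of pool entries per PE, as described in the text:
 * the larger of twice the number of PEs or 10. */
size_t default_pool_entries(size_t npes)
{
    size_t twice = 2 * npes;
    return twice > 10 ? twice : 10;
}

/* Estimated pool size in bytes on each PE: every entry uses
 * data_max + 32 bytes, where data_max is the PVM_DATA_MAX setting. */
size_t pool_bytes(size_t entries, size_t data_max)
{
    assert(data_max % 8 == 0);    /* PVM_DATA_MAX must be a multiple of 8 */
    return entries * (data_max + 32);
}
```

For a 64-PE partition with the default PVM_DATA_MAX of 4096, this gives 128 entries and 528,384 bytes of pool memory on each PE.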
The PvmOutOfResBuf error occurs only if you have set the send buffer increment parameter to 0 (see “Send Buffer Increment”, for information on setting this parameter), indicating that you want a fixed number of send buffers. Getting the error indicates that you underestimated the number of buffers that you needed.
A send buffer cannot be reused until the data in it has been copied to the receiving PE. If the data is smaller than the size of a short message (PVM_DATA_MAX, which has a default of 4096 bytes), this copy occurs on the pvm_send(3) call. For larger amounts of data, this copy does not occur until the receiving PE has unpacked that data.
Make sure you are using buffers efficiently. Sometimes users convert code to use PVM, and the code appears as follows:
    for (... several PEs ...) {
        pvm_initsend (PvmDataRaw);
        pvm_pkbyte (addr, size, ...);
        pvm_send (...);
    }
Here, the same data is being sent to each PE. However, a single packed buffer can be used by multiple sends:
    pvm_initsend (PvmDataRaw);
    pvm_pkbyte (addr, size, ...);
    for (... several PEs ...) {
        pvm_send (...);
    }
Or the single packed buffer can be used by a more efficient broadcast or multicast such as the following example:
    pvm_initsend (PvmDataRaw);
    pvm_pkbyte (addr, size, ...);
    pvm_mcast (...);
In both cases, a single send buffer is used. The data it contains is not freed until all of the receiving PEs have responded, which may take a while; however, your use of buffers and memory will be reduced. Also, your program will run faster due to the reduced number of function calls.
PVM allows groups to consist of as many PEs as you specify, up to the total number of PEs in the partition. This is a general feature, but for large numbers of PEs it can waste memory. This is especially true if your groups are small relative to the number of PEs.
You can reduce the limit, and thus save memory, in either of two ways:
Set the environment variable PVM_MAXGTIDS.
Call pvm_setopt(3) with the PvmMaxgtids option (if this is done, the function must be called on each PE before any groups are formed).
Remember that the UNICOS/mk version of PVM defines a global group, consisting of all PEs in the partition. If you have code in which each PE joins a global group with your own name (perhaps code ported from a network version of PVM), you should consider using the predefined global group on the UNICOS/mk system. This will simplify your code, and you will get better performance when using barriers across the group or broadcasts to the group.
The following sections discuss several issues specific to the distributed mode of the UNICOS/mk version. Using this mode requires that you use the PVM daemon. If you are not familiar with the use of the PVM daemon, you may want to read “PVM Program Development” in Chapter 2, before reading this section.
The following discussion assumes that the application you are running is using two partitions in the UNICOS/mk system. This assumption is made only for the sake of simplicity; your application can use other Silicon Graphics systems, or other systems connected to your network. Most of the same issues still apply.
The following sections discuss several key issues related to the distributed mode. The issues are as follows:
PE communication
UNICOS/mk executable files
UNICOS/mk tasks
Cross-system dynamic groups
The PVM daemon runs on the UNICOS/mk system. A PE on the UNICOS/mk system communicates with the daemon and with PVM tasks outside its own partition. In theory, any PE can do so. However, UNICOS limits the number of open files per application and the number of open sockets in the system. If a UNICOS/mk application running on a large number of PEs were to set up communications for each PE, it might hit either or both of these limits.
Socket communications are very slow, especially compared to the speed of communications between PEs. Because much of socket communication is single-threaded in the PVM daemon, the performance cost goes up as more PEs try to communicate at the same time.
For these reasons, by default, only PE 0 establishes communications with the daemon, and you should consider using PVM in this manner. However, you can specify additional PEs by setting the PVM_PE_LIST environment variable, as follows:
    setenv PVM_PE_LIST 0,4,8,12
    setenv PVM_PE_LIST all
This environment variable must be set for both the PVM daemon pvmd3(1) and the application to read, and both must read the same value.
Note: At present, PE 0 always establishes communications with the daemon, even if PE 0 is not specified in PVM_PE_LIST. It is suggested that PVM_PE_LIST specify PE 0, if it is being used, to ensure future compatibility. It is possible that future releases may introduce other mechanisms for controlling access to the daemon.
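The PVM_PE_LIST rules can be summarized as a membership test. The following C sketch is a hypothetical helper written for illustration; it is not how PVM parses the variable, and the name pe_can_reach_daemon is an assumption. It treats a NULL list as the unset variable (only PE 0 communicates) and, following the note above, always reports that PE 0 can communicate.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not PVM's parser). Returns 1 if PE number pe may
 * communicate with the daemon under the given PVM_PE_LIST value, 0 if
 * it may not, and -1 if the list is malformed. A NULL list models the
 * unset variable. */
int pe_can_reach_daemon(const char *list, long pe)
{
    assert(pe >= 0);
    if (pe == 0)
        return 1;                 /* PE 0 always connects at present */
    if (list == NULL)
        return 0;                 /* default: only PE 0 */
    if (strcmp(list, "all") == 0)
        return 1;

    int found = 0;
    const char *p = list;
    while (*p != '\0') {
        if (!isdigit((unsigned char)*p))
            return -1;            /* each item must start with a digit */
        char *end;
        long value = strtol(p, &end, 10);
        if (value == pe)
            found = 1;
        if (*end == ',' && end[1] != '\0')
            p = end + 1;          /* move on to the next item */
        else if (*end == '\0')
            p = end;              /* reached the end of the list */
        else
            return -1;            /* stray character or trailing comma */
    }
    return found;
}
```

With the list 0,4,8,12, the helper reports that PE 8 can communicate with the daemon and that PE 5 cannot.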
When you build your UNICOS/mk executable file, you can optionally fix the number of PEs at load time. For such executable files, the pvm_spawn(3) count parameter simply specifies the size of the tids array, and must be at least as large as the PE count.
If you do not fix the number of PEs (for example, by using the -Xm option with cld(1)), you have a malleable executable file. For these, the pvm_spawn count parameter specifies the number of PEs that you want for the executable file.
When pvm_spawn returns successfully, it returns a count value that specifies the number of PEs that were started. The tids array is set with either of two values in each entry:
For PEs that can communicate with the daemon, the associated entry contains a pvm_tid value.
For PEs that cannot communicate with the daemon, the associated entry contains the integer value 1, which is not a valid pvm_tid value.
During startup, the UNICOS/mk program checks to determine if the PVM daemon is running. If it is not, the program assumes it is in stand-alone mode.
You cannot form a dynamic group consisting of tasks from the UNICOS/mk system and another system. You cannot form a dynamic group consisting of tasks from more than one partition within a UNICOS/mk system. You must view group handling on each system and partition as being completely independent. If the UNICOS/mk tasks form a group called MYGROUP, and the tasks in the network also join a group called MYGROUP, the two groups are completely independent. A broadcast from a UNICOS/mk task to MYGROUP sends messages only within that partition; no messages will go outside the partition.
Note: In future releases, this limitation might be removed. Therefore, you should not build your application assuming that the two groups are independent; in a later release, they might form a single, combined group.
You can use programs and commands in a number of different ways to run a distributed application involving the UNICOS/mk system. The following example shows one way.
Example 3-1. Parent task spawning a child task
Assume that the parent task runs on a single PE in the UNICOS/mk system and uses PVM. The key line of interest is the call to pvm_spawn(3). There are several options for making this call. The following is a typical call:
count = pvm_spawn("mpp.a.out", 0, PvmTaskArch, "CRAY", nproc, tids);
In the example, a variable, nproc, specifies the size of the tids array. If the executable file (mpp.a.out) is built with a fixed PE count, nproc must be larger than or equal to the PE count, and count returns the PE count. If the executable file is built as a malleable executable file (that is, the number of PEs is not fixed), nproc is the number of PEs to request, and count returns the same number.
By specifying that the task should run on a Cray system, the code assumes that any Cray system in the virtual machine is acceptable. If not, PvmTaskHost should be specified instead of PvmTaskArch.
There is little out of the ordinary in the parent task. It must be careful not to use entries in the pvm_tid array that are set to a value of 1. It can communicate with any other PE assigned to the executable file.
The child task on the UNICOS/mk system does not look very different from one written to run in stand-alone mode. You must be careful to use pvm_tid, instead of PE numbers, when referring to the parent task. You must also be careful that only those PEs that can communicate with the daemon try to do so. You can deal with both of these constraints by calling pvm_parent(3). If this function returns a pvm_tid, that identifier can be used for communication. If the function returns the PvmNoParent error, that PE cannot communicate with the outside world. “PVM Program Development” in Chapter 2 describes how to start the PVM daemon and your parent task.
In distributed mode, PVM uses sockets for communication. Read and write system calls actually transmit control and data across the sockets. Further, a given PVM task may have several sockets open at once: one to its local daemon and, optionally, one or more to specific tasks with which it is communicating.
The following facts have important implications regarding performance:
System calls perform the I/O.
There is a maximum size applied to data in a socket when it is transmitted or received; the system divides up requests larger than this maximum.
With multiple open sockets, it is necessary to use yet another call, select(2), to look for incoming data or to determine if data can be output.
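To make the last point concrete, the following sketch shows the extra select(2) call that is needed before any data can be read when several descriptors are open. It is an illustration in plain POSIX C, not PVM's internal code; the helper name first_readable is hypothetical, and any readable descriptor (a socket or a pipe) can be passed to it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for data on any of the
 * descriptors fds[0..n-1]. Returns the index of the first readable
 * descriptor, or -1 on timeout or error. Each call costs one
 * select(2) system call before any read(2) can be issued. */
int first_readable(const int *fds, int n, long timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int maxfd = -1;

    assert(fds != NULL && n > 0);

    FD_ZERO(&readable);
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd) {
            maxfd = fds[i];
        }
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(maxfd + 1, &readable, NULL, NULL, &tv) <= 0) {
        return -1;              /* timeout or error */
    }
    for (int i = 0; i < n; i++) {
        if (FD_ISSET(fds[i], &readable)) {
            return i;
        }
    }
    return -1;
}
```

Every polling pass therefore adds at least one system call on top of the read or write calls that move the data, which is part of why the number of open connections matters for performance.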
By default, in distributed mode, only PE 0 communicates with the PVM daemon, but additional PEs can also be permitted to communicate (for more information, see “PE Communication”). If you are interested in performance, think very carefully before using more than one PE to make PVM calls outside the UNICOS/mk partition. This guideline applies regardless of the other options discussed in this chapter.
Because distributed mode is so dependent upon system calls, you should not use it for sending small, frequent messages.
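One common way to reduce the per-message system call cost is to collect small messages and transmit them together. The sketch below illustrates the idea with plain POSIX calls rather than PVM; struct batch, batch_append, batch_flush, and BATCH_CAPACITY are hypothetical names, and the 4096-byte capacity is an arbitrary assumption. In PVM terms, the analogous approach would be to pack many values into one send buffer before a single send.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BATCH_CAPACITY 4096     /* arbitrary size chosen for the sketch */

struct batch {
    char   data[BATCH_CAPACITY];
    size_t used;                /* bytes currently buffered */
    int    flushes;             /* write(2) calls issued so far */
};

/* Send all buffered bytes with one write(2).
 * Returns 0 on success, -1 on error. For simplicity, a short
 * write is treated as an error rather than retried. */
int batch_flush(struct batch *b, int fd)
{
    ssize_t n;

    assert(b != NULL);
    if (b->used == 0) {
        return 0;
    }
    n = write(fd, b->data, b->used);
    if (n != (ssize_t)b->used) {
        return -1;
    }
    b->used = 0;
    b->flushes++;
    return 0;
}

/* Append one small message, flushing first if it would not fit.
 * Returns 0 on success, -1 on error. */
int batch_append(struct batch *b, int fd, const void *msg, size_t len)
{
    assert(b != NULL && msg != NULL);
    if (len > BATCH_CAPACITY) {
        return -1;
    }
    if (b->used + len > BATCH_CAPACITY && batch_flush(b, fd) != 0) {
        return -1;
    }
    memcpy(b->data + b->used, msg, len);
    b->used += len;
    return 0;
}
```

With this arrangement, appending 100 four-byte values and flushing once issues a single write call instead of 100.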
If you are using PVM to communicate between a UNICOS/mk system and a UNICOS system with Cray floating-point hardware, and you specify PvmDataDefault when calling pvm_initsend(3), PVM converts the data between IEEE and Cray formats for all forms of typed data. This is not done very efficiently on the UNICOS end.
You can perform data conversion efficiently on the UNICOS system, however, by using the data conversion functions available in the UNICOS Fortran libraries (see the Application Programmer's Library Reference Manual). If you are using PVM to transfer the data, pack and unpack it with the byte options (pvm_pkbyte(3), pvm_upkbyte(3), or the Fortran BYTE1 option) and then call CRAY2IEG(3) or IEG2CRAY(3), as appropriate. If you are using file I/O, call CRAY2IEG or IEG2CRAY, as appropriate, on the data you are about to write from the UNICOS system or have just read from the UNICOS/mk system.
If you are using file I/O, an easier option is to use a Fortran I/O feature that automatically converts data as it is read or written. These techniques are described in the Application Programmer's I/O Guide.