Using the Tape Management Facility (TMF) with SGI Cluster Manager requires the value-add SGI product on the SGI Cluster Manager 4.2 for Linux -- Storage Software Plug-ins CD and the supported level of TMF (see “Software Requirements” in Chapter 1). Only Storage Technology Corporation (STK) hardware controlled by the Automated Cartridge System Library Software (ACSLS) is supported.
For more information about TMF, see the TMF Administrator's Guide.
This chapter discusses the following:
If your application requires tape support via TMF, your user application script should call the /usr/lib/clumanager/services/helper_tmf TMF failover script, passing the appropriate parameters. See “Using the TMF Failover Script from the User Application Script”.
The DMF plug-in will automatically call the helper_tmf script if a Library Server Drive Group uses TMF as a mounting service.
The helper_tmf script lets you manage one or more TMF device groups, which are sets of tape devices defined in the /etc/tmf/tmf.config TMF configuration file.
The following example is part of a /etc/tmf/tmf.config that defines a TMF device group named EGLF:
DEVICE_GROUP
    name = EGLF
AUTOCONFIG
{
    DEVICE
        NAME = f9840f1 ,
        device_group_name = EGLF ,
        FILE = /hw/tape/500104f000417a18/lun0/c4p1 ,
        status = down ,
        access = EXCLUSIVE ,
        vendor_address = (1,0,0,2),
        LOADER = l180
}
The helper_tmf script performs the following functions for the calling user service or userapp script:
Starts TMF if it is not already running.
Configures the associated loader up if it is not already up.
Allows the monitoring of multiple TMF device groups and their associated tape devices.
Monitors the number of tape devices that are available within each TMF device group. If the number of devices currently available is less than the minimum threshold level, a monitoring failure will occur.
Releases previous reservations that are held by another member (if the tape device firmware supports this option).
Lets you assign different TMF device groups to each instance of an SGI Cluster Manager service or userapp script.
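The monitoring function in the list above can be sketched as follows. This is a minimal illustration, not the helper_tmf implementation. It assumes a hypothetical input of one "device state" pair per line (this is not actual tmstat output) and treats the tmstat states assn, free, conn, and idle as available. The function names count_available and check_minimum are invented for this sketch.

```shell
#!/bin/sh
# Sketch only: the "device state" input format is an assumption made for
# illustration; it is not real tmstat output.

# Count the devices on stdin whose state is assn, free, conn, or idle.
count_available() {
    awk '$2 ~ /^(assn|free|conn|idle)$/ { n++ } END { print n + 0 }'
}

# Return 0 if at least $1 of the devices read from stdin are available;
# otherwise return 1, which corresponds to a monitoring failure.
check_minimum() {
    minimum=$1
    available=$(count_available)
    [ "$available" -ge "$minimum" ]
}
```

For example, `printf 'f9840f1 idle\nf9840f2 down\n' | check_minimum 1` succeeds because one device is available, while the same input with a minimum of 2 fails.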
For the order in which TMF is started/stopped, see Chapter 6, “Creating a New Highly Available Application”.
The helper_tmf script lets you specify device groups to stop, start, and monitor. Each of these managed device groups must be defined in the following files:
/etc/tmf/sgicm_tmf.config (SGI Cluster Manager configuration file for TMF)
/etc/tmf/tmf.config (standard TMF configuration file)
The resource directive in the /etc/tmf/sgicm_tmf.config file specifies a TMF device group. This directive is required for each TMF device group that you plan to use within SGI Cluster Manager. See “The resource Directive”.
There are other optional configuration specifications associated with a TMF device group. These specifications provide information to the helper_tmf script that lets it communicate with the tape library. They also identify the tape devices within the library on which helper_tmf will force dismounts.
The helper_tmf script can force a dismount of tapes from devices within the library, which can be useful when a failover occurs. With DMF, for example, you want any DMF tapes that were in use on the previous member to be available to DMF on the new member after a failover. If these tapes were in tape devices assigned to the previous member, they must be ejected and returned to the library so that they are again accessible to DMF on the new member. You can have the helper_tmf script dismount only the tape devices associated with a particular TMF device group, or you can have it dismount no tapes at all.
Some of the functions of the helper_tmf script are performed through TMF; the script issues commands to the TMF daemon to use these functions. However, the helper_tmf script forces a dismount of a tape from a device by issuing a command to the library software controlling the loader/library. The helper_tmf script communicates its request to the ACSLS software that controls the loader. To do so, it uses an expect script that logs in to the loader and issues a dismount request for a tape device.
The /etc/tmf/sgicm_tmf.config file lets you configure other information required by the helper_tmf script. The sgicm_tmf.config file exists on all members in the cluster and should be edited as necessary on each member.
The contents of the sgicm_tmf.config file are dependent on the tape devices assigned to each member in the cluster. If all members in the failover domain are configured through TMF to use exactly the same tape devices, this file would be the same on each member in the failover domain.
Note: You must maintain the sgicm_tmf.config file on each member; a change on one member is unknown to the other members.
You can specify the following directives in the sgicm_tmf.config file:
The resource directive defines the TMF device groups that can be managed by the helper_tmf script:
resource device-group devices-minimum devices-loaned email-addresses
where:
| device-group | The TMF device group that is to be monitored. This is a device group that is defined in /etc/tmf/tmf.config. |
| devices-minimum | The minimum number of devices in the specified device-group that must be available on a member. If fewer devices are available, a monitoring failure occurs. |
| devices-loaned | Currently unused; should be set to 0. |
| email-addresses | List of addresses to which email is sent when the monitor script detects that tape devices in the device-group have become unavailable. Corrective action can then be taken to repair the tape devices before the devices-minimum threshold is crossed. This may be a comma- or white-space-separated list of names. |
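As an illustration of the field order, a resource directive for the EGLF device group might read `resource EGLF 1 0 admin@example.com`, where the minimum of 1 and the email address are hypothetical values. The following sketch reads sgicm_tmf.config-style text and lists each device group with its minimum. The function name list_resources is invented for this example, and the sketch makes no claim about how helper_tmf itself parses the file.

```shell
#!/bin/sh
# Sketch only: prints "device-group devices-minimum" for each resource
# directive read from stdin. Field order follows the format shown above.
list_resources() {
    while read -r keyword group minimum loaned emails; do
        # Skip every line that is not a resource directive.
        [ "$keyword" = "resource" ] || continue
        # devices-loaned is currently unused and should be 0.
        [ "$loaned" = "0" ] ||
            echo "warning: devices-loaned for $group should be 0" >&2
        echo "$group $minimum"
    done
}
```

It could be run as `list_resources < /etc/tmf/sgicm_tmf.config`. Because email-addresses is the last variable passed to read, it receives the remainder of the line, so a white-space-separated address list does not disturb the earlier fields.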
The loader directive provides information about a TMF loader, which controls one or more tape devices that are members of TMF device groups being managed by SGI Cluster Manager. There may be multiple loader directives in the sgicm_tmf.config file.
The loader information is used by the helper_tmf script to force a dismount of tapes from tape devices that cannot be made available (that is, that have tmstat states other than assn, free, conn, or idle) so that those tapes can be used via other tape devices in the same device group. The information is also used to force a dismount of tapes from devices that are only connected to the other member, not to this member (as described in “The remote_devices Directive”).
If the file does not contain a loader directive, the helper_tmf script will make no attempt to force a dismount of tapes from any devices.
The directive has the following format:
loader lname ltype lhost luser lpassword
where:
| lname | Name of the loader as defined in /etc/tmf/tmf.config |
| ltype | Type of the loader as defined in /etc/tmf/tmf.config, which must be STKACS |
| lhost | Server name of the loader as defined in /etc/tmf/tmf.config |
| luser | User name of the loader's administrator account, which must be acssa |
| lpassword | Password for the loader's administrator account |
The tmmls command shows the name of the loader and the server associated with it:
# /usr/sbin/tmmls
loader    type      status  m  server    old  m_pnd  d_pnd  r_qd  comp  avg
operator  OPERATOR  UP      A  IRIX      0    0      0      0     0     0(sec)
wolfy     STKACS    DOWN    A  wolfcree  0    0      0      0     0     0(sec)
panther   STKACS    DOWN    A  stk9710   0    0      0      0     0     0(sec)
l180      STKACS    UP      A  stk9710   0    0      0      0     0     0(sec)
For example, suppose you want to have the helper_tmf script dismount tape devices that are in the l180 loader/library listed above. That library has the stk9710 server associated with it. The loader directive in the sgicm_tmf.config file would look like the following:
loader l180 STKACS stk9710 acssa acssapassword
If the initial attempt to configure the device up fails, the helper_tmf script would force a dismount for each tape device that is specified in the tmf.config file to be in the l180 loader/library and in the TMF device group. If you do not want the script to dismount any tape devices associated with a particular TMF device group, you would not place a loader directive in the sgicm_tmf.config file.
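The constraints on the loader directive (five fields after the keyword, an ltype of STKACS, and an luser of acssa) can be checked before the file is deployed. The following is a minimal sketch of such a check. The function name check_loaders is hypothetical, and the sketch assumes that the password contains no white space.

```shell
#!/bin/sh
# Sketch only: checks the loader directives read from stdin against the
# constraints described above. Prints one message per problem and returns
# non-zero if any problem was found.
check_loaders() {
    awk '
        $1 == "loader" {
            # The keyword plus lname, ltype, lhost, luser, lpassword.
            if (NF != 6) {
                print "line " NR ": expected 5 fields after loader"
                bad = 1
                next
            }
            if ($3 != "STKACS") { print "line " NR ": ltype must be STKACS"; bad = 1 }
            if ($5 != "acssa")  { print "line " NR ": luser must be acssa";  bad = 1 }
        }
        END { exit bad + 0 }
    '
}
```

It could be run as `check_loaders < /etc/tmf/sgicm_tmf.config`. A file with no loader directive passes the check, which matches the case in which helper_tmf forces no dismounts.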
The remote_devices directive provides information about one or more tape devices that are part of a TMF device group, but which are not visible on this member.
For example, suppose you have a library with four SCSI tape devices where two tape devices are connected to each of two cluster members. If member A should crash, member B must be able to force a dismount of any tapes in A's tape devices so that they can then be used from member B. Because the tape devices are not visible on member B, the remote_devices directive provides the information needed to force a dismount of unseen tape devices.
The directive has the following format:
remote_devices device-group lname tape-device-ID ...
where:
| device-group | Name of the TMF device group with which the tape-device-IDs are associated. |
| lname | Name of the loader as defined in /etc/tmf/tmf.config. There must be a loader directive for lname elsewhere in this file, or the remote_devices directive will be ignored. |
| tape-device-ID | The vendor ID of the drive on which to force a dismount. This is the unique name by which the loader identifies the tape device. For STKACS, this will be a comma-separated four-digit string listing the ACS, LSM, drive panel, and drive (for example, 0,0,1,3). |
You can specify multiple vendor IDs in the same remote_devices directive as long as they all pertain to the same loader. If all the vendor IDs will not fit on a single line, add additional remote_devices directives for the same loader. For example, to enable the helper_tmf script to force a dismount of the remote tape devices 0,0,1,0, 0,0,1,1, 0,0,1,2, and 0,0,1,3 in the l180 loader/library for TMF device group tmf_eglf, the directive would be:
remote_devices tmf_eglf l180 0,0,1,0 0,0,1,1 0,0,1,2 0,0,1,3
If multiple TMF device groups are defined, only the TMF device group named tmf_eglf will force a dismount of these tape devices.
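Because a remote_devices directive is ignored when no loader directive exists for its lname, it can help to check the file for that case and for malformed vendor IDs. The following is a minimal sketch, assuming the STKACS ID form of four comma-separated numbers. The function name check_remote_devices is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads sgicm_tmf.config-style text from stdin and reports
# remote_devices directives that name a loader with no loader directive or
# that contain a malformed STKACS vendor ID (ACS,LSM,panel,drive).
check_remote_devices() {
    awk '
        $1 == "loader"         { loaders[$2] = 1 }
        $1 == "remote_devices" { lines[++n] = $0 }
        END {
            # Checked at the end because the loader directive may appear
            # anywhere in the file, before or after remote_devices.
            for (i = 1; i <= n; i++) {
                count = split(lines[i], f, " ")
                if (!(f[3] in loaders)) {
                    print "no loader directive for " f[3]
                    bad = 1
                }
                for (j = 4; j <= count; j++) {
                    if (f[j] !~ /^[0-9]+,[0-9]+,[0-9]+,[0-9]+$/) {
                        print "malformed vendor ID: " f[j]
                        bad = 1
                    }
                }
            }
            exit bad + 0
        }
    '
}
```

A missing space between the device group and the loader name shows up as a "no loader directive" message, because the third field is then no longer a defined loader.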
If tape devices that are managed by the helper_tmf script are configured on more than one member in the cluster, they should be configured consistently. The same tape driver (for example, ts) should be used on each member where the tape device is configured.
When configuring the helper_tmf script, you should be aware of several parameters in the /etc/tmf/tmf.config file. The helper_tmf script will try to start the loader associated with its device-group if it is not up. However, if the configuration file specifies status=UP for the loader, this step may not be necessary and the devices may become available sooner.
A tape device that is managed by the helper_tmf script will be configured in /etc/tmf/tmf.config on one or more members within the cluster. It should be configured with status=down.
If the tape devices being used do not support persistent reserve, then they should each be configured in /etc/tmf/tmf.config with access=shared. If the tape devices do support persistent reserve, it is recommended that you use this feature when using the helper_tmf script. To use persistent reserve, you should set access=exclusive in /etc/tmf/tmf.config for each tape device. The access option should be consistent across all members in the cluster where the tape devices are configured.
The -g option of the tmconfig command reassigns a device to a different device group name. The helper_tmf script does not support reassigning a device into a device group because, after a failover, the helper_tmf script on the new member would have no knowledge of the reassigned tape device and could not dismount tapes that are in it. If you use tmconfig -g to move devices out of a device group, the number of available tape devices that the monitor function of the helper_tmf script can detect decreases. Also, in the case of failover or stop, the tape device will be configured down.
You must write a user application script in order to use the helper_tmf script. For information on how to write a user application script, see Chapter 6, “Creating a New Highly Available Application”.
In order to manage TMF device groups in an SGI Cluster Manager environment, the user application script must pass the appropriate parameters to the TMF failover script. This script is called via the following command line:
/usr/lib/clumanager/services/helper_tmf action device-groups
where:
| action | One of start, stop, or status |
| device-groups | One or more TMF device groups upon which the action should be taken |
Note: It is more efficient to invoke the helper_tmf script once with several device-group arguments than to invoke it several times, each with a single device-group argument.
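The difference between the two invocation styles can be sketched as follows. The device group names 9840 and 9940 are hypothetical, as are the function names. HELPER defaults to the script path used in the examples in this section and can be overridden for testing.

```shell
#!/bin/sh
# Sketch only: the device group names are hypothetical.
HELPER=${HELPER:-/usr/lib/clumanager/services/helper_tmf}
DEVICE_GROUPS="9840 9940"

# Preferred: a single invocation that covers every device group.
start_all_at_once() {
    # DEVICE_GROUPS is unquoted on purpose so that each group becomes
    # its own argument.
    "$HELPER" start $DEVICE_GROUPS
}

# Less efficient: one invocation per device group.
start_one_by_one() {
    for group in $DEVICE_GROUPS; do
        "$HELPER" start "$group" || return 1
    done
}
```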
For example, to start the 9840 device group:
/usr/lib/clumanager/services/helper_tmf start 9840
if [ $? -ne 0 ]; then
    logAndPrint $LOG_ERROR "start of 9840 device group failed"
    return 1
fi
To stop the 9840 device group:
/usr/lib/clumanager/services/helper_tmf stop 9840
if [ $? -ne 0 ]; then
    logAndPrint $LOG_ERROR "unable to stop 9840 device group"
    return 1
fi
To check the status of the 9840 device group:
/usr/lib/clumanager/services/helper_tmf status 9840
if [ $? -ne 0 ]; then
    logAndPrint $LOG_ERROR "device group 9840 not running"
    return 1
fi
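The three fragments above share the same shape, so a user application script could route them through one function. The following is a minimal sketch under stated assumptions, not a required structure. The function name tmf_action is hypothetical. The fallback logAndPrint is a stand-in that is defined only when the calling environment does not already provide one.

```shell
#!/bin/sh
# Sketch only. HELPER defaults to the path used in the examples above.
HELPER=${HELPER:-/usr/lib/clumanager/services/helper_tmf}

# Stand-in for logAndPrint, defined only if the environment lacks one.
command -v logAndPrint >/dev/null 2>&1 || logAndPrint() {
    shift                       # drop the log level argument
    echo "$*" >&2
}

# tmf_action ACTION GROUP...: run one helper_tmf action for the groups.
tmf_action() {
    action=$1
    shift
    case "$action" in
        start|stop|status) ;;
        *)
            logAndPrint "${LOG_ERROR:-}" "unknown action: $action"
            return 1
            ;;
    esac
    if ! "$HELPER" "$action" "$@"; then
        logAndPrint "${LOG_ERROR:-}" "helper_tmf $action failed for: $*"
        return 1
    fi
}
```

A call such as `tmf_action start 9840 || return 1` then replaces the explicit `$?` test, and passing several device groups in one call keeps the script to a single helper_tmf invocation.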