You can perform IRIS FailSafe administration tasks using either the FailSafe Manager graphical user interface (GUI) or the cmgr(1M) command. Although these tools use the same underlying software command line interface (CLI) to configure and monitor a FailSafe system, the GUI provides additional features that are particularly important in a production system; see “GUI Overview”.
The FailSafe Manager GUI lets you configure, administer, and monitor a FailSafe cluster and nodes.
When FailSafe daemons have been started, you must be sure to connect to a node that is running all of the FailSafe cluster daemons in order to obtain the correct cluster status (see “Verify that the Cluster Daemons are Running” in Chapter 5). When FailSafe cluster daemons have not yet been started in a cluster, you can connect to any node in the pool.
| Note: The node from which you run the GUI affects your view of the
cluster. You should wait for a change to appear in the view area before
making another change; the change is not guaranteed to be propagated across
the cluster until it appears in the view area. The entire cluster status
information is sent each time a change is made to the cluster database;
therefore, the larger the configuration, the longer it will take.
You should only make changes from one instance of the GUI running at any given time. (Changes made by a second GUI instance -- a second invocation of fsmgr -- may overwrite changes made by the first instance, because different GUI instances are updated independently at different times. In time, however, independent GUI instances will provide the same information.) However, multiple windows accessed via the File menu are all part of a single GUI instance; you can make changes from any of these windows. |
To ensure that the required privileges are available for performing all of the tasks, you should log in to the GUI as root. However, some or all privileges can be granted to any user by the system administrator using the Privilege Manager, part of the IRIX Interactive Desktop System Administration (sysadmdesktop) product. For more information, see the Personal System Administration Guide.
To start the GUI, use one of these methods:
Enter the following command line:
# /usr/sbin/fsmgr |
| Note: If you invoke fsmgr(1M) before
you have defined a cluster, you will get an error message. If you are
in the process of defining a cluster, you can ignore it.
The fstask and fsdetail commands perform the same function as fsmgr; both commands are kept for historical purposes. |
Choose FailSafe Manager from the Toolchest.
You must restart the toolchest after installing FailSafe to see the FailSafe entry on the toolchest display. Enter the following commands to restart the toolchest:
% killall toolchest
% /usr/bin/X11/toolchest & |
In order for this to take effect, sysadm_failsafe2.sw.desktop must be installed on the client system, as described in the IRIS FailSafe Installation and Maintenance Instructions.
In your Web browser, enter http://server/FailSafeManager/ (where server is the name of the node in the pool or cluster that you want to administer) and press Enter. At the resulting Web page, click on the icon.
This method of launching FailSafe Manager works only if you have installed the Java Plug-in, exited all Java processes, restarted your browser, and enabled Java. If there is a long delay before the icon appears, you can click on the “non plug-in” link, but running in the browser-specific Java may cause operational glitches.
You can use this method of launching the GUI if you want to run it from a non-IRIX system. If you are running the GUI on an IRIX system, the preferred method is to use toolchest or the /usr/sbin/fsmgr command.
Table 4-1 describes the platforms where the GUI may be started, connected to, and displayed.
GUI Mode | Where You Start the GUI | Where You Connect the GUI | Where the GUI Displays |
|---|---|---|---|
fsmgr(1M) or Toolchest | Any IRIX system (such as an SGI 2000 series, SGI O2 workstation, or Silicon Graphics Fuel visual workstation) with sysadm_failsafe2.sw.client and sysadm_failsafe2.sw.desktop software installed | The node in the pool that you want to use for cluster administration | The system where the GUI was invoked |
Web | Any system with a web browser and Java 1.1 plug-in installed and enabled with sysadm_failsafe2.sw.web installed on the server | The node in the pool that you want to use for cluster administration | The same system with the web browser |
The FailSafe Manager GUI allows you to administer the entire cluster from a single point. It provides access to the tasks that help you set up and administer your cluster. Guided Configuration tasks consist of a group of tasks collected together to accomplish a larger goal. For example, Set Up a New Cluster steps you through the process for creating a new cluster and allows you to launch the necessary individual tasks by simply clicking their titles.
Online help is provided with the Help button. You can also click any blue text to get more information about that concept or input field.
The File menu lets you do the following:
Invoke multiple windows for this instance of the GUI
Display the /var/adm/SYSLOG system log file and the /var/sysadm/salog system administration log file (which shows the commands accessed by the GUI)
Launch the Performance Co-Pilot (PCP) tools to perform resource monitoring (rmvis(1)) and heartbeat monitoring (hbvis(1))
Close the current window
Exit the GUI completely
The Edit menu lets you expand and collapse the contents of the view area. You can also choose to automatically expand the display to reflect new nodes added to the pool or cluster.
The Tasks menu contains the following:
Find Tasks, which lets you use keywords to search for a specific task
Guided Configuration, which contains the task sets to set up your cluster, define filesystems, modify an existing cluster, and check status
Nodes, which contains tasks to define and manage the nodes
Cluster, which contains tasks to define and manage the cluster
Resource Types, which contains tasks to set up and configure highly available resource types such as volume
Resources, which contains tasks to set up and configure individual resources
Failover Policies, which contains tasks to determine how FailSafe should keep resource groups highly available
Resource Groups, which contains tasks to define resource groups and manage them
FailSafe HA Services, which allows you to start and stop highly available (HA) services, set the FailSafe tie-breaker node, and set the log configuration
Diagnostics, which contains the tasks to test connectivity, resources, and failover policies
By default, the window is divided into two sections: the view area and the details area. You can use the arrows in the middle of the window to shift the display. To find a component in the current view area, enter the name of the component and click the Find button.
To deselect an item in the view area, click anywhere in the view area except on the name of an item.
Choose what you want to view from the View selection: the nodes in the cluster, the nodes in the pool (all defined nodes), or filesystems in the cluster.
To view the details on any component, click on its name in the view area. The configuration and status details will appear in the details area to the right. To see details about an item in the details area, select its name (which will appear in blue); details will appear in a new window. Terms with glossary definitions also appear in blue.
You can view details about any of the following components by selecting the icon:
Cluster
Nodes
Resource types
Resources
Resource groups
Failover policies
To perform an individual task, do the following:
Select the task name from the Tasks menu or click the right mouse button within the view area. For example:
Tasks -> Guided Configuration -> Set Up a New Cluster
The task window appears.
| Note: You can click any blue text to get more information about that concept or input field. |
Enter information in the appropriate fields and click OK to complete the task. (Some tasks consist of more than one page; in these cases, click Next to go to the next page, complete the information there, and then click OK.) In every task, the cluster configuration will not update until you click OK.
A dialog box appears confirming the successful completion of the task.
Continue launching tasks as needed.
The cmgr(1M) command is more limited in its functions. It lets you configure and administer a FailSafe system through a command-line interface, and only on an IRIX system. It provides minimal help and formatted output and does not provide dynamic status except when queried. However, an experienced FailSafe administrator may find cmgr to be convenient when performing basic FailSafe configuration tasks, executing isolated single tasks in a production environment, or running scripts to automate some cluster administration tasks.
This section documents how to perform FailSafe administrative tasks by means of the cmgr command. You must be logged in as root.
The cmgr command uses the same underlying FailSafe commands as the GUI.
To use cmgr, enter either of the following:
# /usr/cluster/bin/cmgr
# /usr/cluster/bin/cluster_mgr |
For more assistance, you can use the -p option on the command line; see “Using Prompt Mode”.
After you have entered this command, you will see the following:
Welcome to SGI Cluster Manager Command-Line Interface

cmgr> |
Once the command prompt displays, you can enter the cluster manager commands.
At any time, you can enter ? or help to bring up the help display.
After the command prompt displays, you can enter subcommands. At any time, you can enter ? or help to bring up the cmgr help display.
The cmgr(1M) command provides an option that displays detailed prompts for the required inputs of the commands that define and modify FailSafe components. You can run in prompt mode in either of the following ways:
Specify a -p option when you enter the cmgr command, as in the following example:
# cmgr -p |
Execute a set prompting on command while in normal interactive mode, as in the following example:
cmgr> set prompting on |
This method of entering prompt mode allows you to toggle in and out of prompt mode as you execute individual cmgr commands.
To get out of prompt mode, enter the following command:
cmgr> set prompting off |
For example, if you are not in the prompt mode and you enter the following command to define a node, you will see a single prompt, as indicated:
cmgr> define node cm1a
Enter commands, when finished enter either "done" or "cancel"

cm1a? |
At the cm1a? prompt, enter the individual node definition commands in the following format (for full information on defining nodes, see “Define a Node with cmgr” in Chapter 5). For example:
cm1a? set hostname to hostname |
A series of commands is required to define a node. If you are running cmgr in prompt mode, however, you are prompted for each required command, as shown in the following example:
cmgr> define node cm1a
Enter commands, you may enter "done" or "cancel" at any time to exit

Node Name [cm1a]? cm1a
Hostname[optional]? cm1a
Is this a FailSafe node <true|false> ? true
Is this a CXFS node <true|false> ? false
Node ID ? 1
Partition ID[optional] ? (0)
Reset type <powerCycle> ? (powerCycle)
Do you wish to define system controller info[y/n]:y
Sysctrl Type <msc|mmsc|l2>? (msc) msc
Sysctrl Password [optional]? ( )
Sysctrl Status <enabled|disabled>? enabled
Sysctrl Owner? cm2
Sysctrl Device? /dev/ttyd2
Sysctrl Owner Type <tty> [tty]?
Number of Network interfaces [2]? 2
NIC 1 - IP Address? cm1
NIC 1 - Heartbeat HB (use network for heartbeats) <true|false>? true
NIC 1 - (use network for control messages) <true|false>? true
NIC 1 - Priority <1,2,...>? 1
NIC 2 - IP Address? cm2
NIC 2 Heartbeat HB (use network for heartbeats) <true|false>? true
NIC 2 - (use network for control messages) <true|false>? false
NIC 2 - Priority <1,2,...>? 2 |
When you are creating or modifying a component of a cluster, you can enter either of the following commands:
cancel, which aborts the current mode and discards any changes you have made
done, which commits the current definitions or modifications and returns to the cmgr> prompt
The cmgr command supports the following command-line editing commands:
The tasks to define the cluster and to stop HA services are long-running tasks that might take a few minutes to complete. The cmgr command will provide intermediate task status for such tasks. For example:
cmgr> stop ha_services in cluster nfs-cluster
Making resource groups offline
Stopping HA services on node node1
Stopping HA services on node node2 |
You can set the environment variable CMGR_STARTUP_FILE to point to a startup cmgr script. The startup script that this variable specifies is executed when cmgr is started (with or without the -p option). Only the cmgr set and show commands are allowed in the startup file.
The following is an example of a cmgr startup script file called cmgr_rc:
set cluster test-cluster
show status of resource_group oracle_rg |
To specify this file as the startup script, execute the following command at the IRIX prompt:
# setenv CMGR_STARTUP_FILE /cmgr_rc |
Whenever cmgr is started, the cmgr_rc script is executed. The default cluster is set to test-cluster and the status of resource group oracle_rg in cluster test-cluster is displayed.
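The restriction on startup files can be checked before you point CMGR_STARTUP_FILE at a file. The following is a minimal sketch in POSIX shell, not part of FailSafe: the function name check_cmgr_startup is hypothetical, and treating blank lines and # comment lines as acceptable is an assumption, because the rule above mentions only the set and show commands.

```shell
#!/bin/sh
# Sketch only: check_cmgr_startup is a hypothetical helper, not a FailSafe tool.
# Assumption: blank lines and "#" comment lines are harmless in a startup file.
check_cmgr_startup() {
    # Usage: check_cmgr_startup file
    # Prints every line that is not a set or show command.
    # Returns 0 if the file is clean, 1 if offending lines exist,
    # and 2 if the file cannot be read.
    [ -r "$1" ] || return 2
    if grep -v -E '^[[:space:]]*((set|show)([[:space:]]|$)|#|$)' "$1"; then
        return 1
    fi
    return 0
}
```

For example, running check_cmgr_startup /cmgr_rc on the file shown above prints nothing and returns 0. In a Bourne-type shell, the equivalent of the setenv command above is CMGR_STARTUP_FILE=/cmgr_rc; export CMGR_STARTUP_FILE.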
You can enter some cmgr subcommands directly from the command line using the following format:
cmgr -c "subcommand" |
where subcommand can be any of the following with the appropriate operands:
admin, which allows you to perform certain actions such as resetting a node
delete, which deletes a cluster or a node
help, which displays help information
show, which displays information about the cluster or nodes
start, which starts HA services and sets the configuration so that HA services will be automatically restarted upon reboot
stop, which stops HA services and sets the configuration so that HA services are not restarted upon reboot
test, which tests connectivity
For example, to display information about the cluster, enter the following:
# cmgr -c "show clusters"
1 Cluster(s) defined
eagan |
See Chapter 5, “Configuration”, and the cmgr(1M) man page for more information.
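If you call cmgr -c from your own scripts, a small guard can reject subcommands that are not in the list above before cmgr is invoked. The following POSIX shell sketch is illustrative only: cmgr_c_allowed and run_cmgr_c are hypothetical helper names, and the check looks only at the first word of the subcommand, not at its operands.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not FailSafe commands.
cmgr_c_allowed() {
    # Usage: cmgr_c_allowed "subcommand with operands"
    # Succeeds when the first word is one of the subcommands that the
    # text above lists as usable with cmgr -c.
    first=${1%% *}
    case "$first" in
        admin|delete|help|show|start|stop|test) return 0 ;;
        *) return 1 ;;
    esac
}

run_cmgr_c() {
    # Usage: run_cmgr_c "subcommand with operands"
    # Runs the subcommand through cmgr -c only if it passes the guard.
    if cmgr_c_allowed "$1"; then
        /usr/cluster/bin/cmgr -c "$1"
    else
        echo "not usable with cmgr -c: $1" >&2
        return 1
    fi
}
```

With these definitions, run_cmgr_c "show clusters" would invoke cmgr, while run_cmgr_c "define node cm1a" would be refused without starting cmgr.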
You can execute a series of cmgr commands by using the -f option and specifying an input file, as follows:
cmgr -f input_file |
Or you could include the following as the first line of the file and then execute it as a script:
#!/usr/cluster/bin/cmgr -f |
Each line of the file must be a valid cmgr command line, comment line (starting with #), or a blank line. (You must include a done command line to finish a multilevel command and end the file with a quit command line.)
If any line of the input file fails, cmgr will exit. You can choose to ignore the failure and continue the process by using the -i option with the -f option, as follows:
cmgr -if input_file |
Or include it in the first line for a script:
#!/usr/cluster/bin/cmgr -if |
| Note: If you include -i when using a cmgr command line as the first line of the script, you must use this exact syntax (that is, -if). |
For example, suppose the file /tmp/showme contains the following:
fs6# more /tmp/showme
show clusters
show nodes in cluster fs6-8
quit |
You can execute the following command, which will yield the indicated output:
fs6# /usr/cluster/bin/cmgr -if /tmp/showme
1 Cluster(s) defined
fs6-8
Cluster fs6-8 has following 3 machine(s)
fs6
fs7
fs8 |
Or you could include the cmgr command line as the first line of the script, give it execute permission, and execute showme itself:
fs6# more /tmp/showme
#!/usr/cluster/bin/cmgr -if
#
show clusters
show nodes in cluster fs6-8
quit
fs6# /tmp/showme
1 Cluster(s) defined
fs6-8
Cluster fs6-8 has following 3 machine(s)
fs6
fs7
fs8 |
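Because an input file must end with a quit command line, it can be worth checking a script before passing it to cmgr -f. The following POSIX shell sketch assumes a strict reading of that rule (the last line that is neither blank nor a comment must be exactly quit); the function name cmgr_script_ends_with_quit is hypothetical, and the check does not validate the other command lines.

```shell
#!/bin/sh
# Sketch only: cmgr_script_ends_with_quit is a hypothetical helper.
cmgr_script_ends_with_quit() {
    # Usage: cmgr_script_ends_with_quit file
    # Succeeds when the last line that is neither blank nor a "#" comment
    # is exactly "quit" (no leading or trailing whitespace).
    # Returns 2 if the file cannot be read.
    [ -r "$1" ] || return 2
    last=$(grep -v -E '^[[:space:]]*(#|$)' "$1" | tail -n 1)
    [ "$last" = "quit" ]
}
```

For the /tmp/showme file shown above, cmgr_script_ends_with_quit /tmp/showme succeeds; if the quit line were removed, it would fail.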
After you have configured the cluster database, you can use the build_cmgr_script(1M) command to automatically create a cmgr(1M) script based on the contents of the cluster database. The generated script will contain the following:
Node definitions
Cluster definition
Resource definitions
Resource type definitions
Resource group definitions
Failover policy definitions
HA parameter settings
Any changes made using either the cmgr(1M) command or the GUI
CXFS information (only in a coexecution cluster)
When you use the -s option, the command also generates create_resource_type scripts for resource types.
As needed, you can then use the generated script to recreate the cluster database after performing a cdbreinit.
By default, the generated script is placed in the following location:
/tmp/cmgr_create_cluster_clustername_processID |
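Because the default file name ends in a process ID, the exact name is not known in advance. The following POSIX shell sketch locates the most recently modified generated script for a given cluster. The function name latest_cmgr_script is hypothetical, the optional directory argument exists only to make the sketch easy to try out, and the file-name pattern is taken from the default location shown above.

```shell
#!/bin/sh
# Sketch only: latest_cmgr_script is a hypothetical helper, not a FailSafe tool.
latest_cmgr_script() {
    # Usage: latest_cmgr_script cluster_name [directory]
    # Prints the newest file named cmgr_create_cluster_<cluster_name>_<something>
    # in the directory (default /tmp), or nothing if no such file exists.
    # Caveat: a cluster name that is a prefix of another name followed by "_"
    # would also match the other cluster's files.
    dir=${2:-/tmp}
    ls -t "$dir"/cmgr_create_cluster_"$1"_* 2>/dev/null | head -n 1
}
```

For example, latest_cmgr_script test-cluster would print the path of the newest script generated for the cluster test-cluster in /tmp.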
You can specify an alternative path name by using the -o option:
build_cmgr_script [-o script_pathname] |
For more details, see the build_cmgr_script(1M) man page.
For example:
# /var/cluster/cmgr-scripts/build_cmgr_script -o /tmp/newcdb
Building cmgr script for cluster test-cluster ...
build_cmgr_script: Generated cmgr script is /tmp/newcdb |
The example script file contents are as follows:
#!/usr/cluster/bin/cmgr -f
# Node node1 definition
define node node1
set hostname to node1.dept.company.com
set is_failsafe to true
set nodeid to 32065
set hierarchy to Reset,Shutdown
set reset_type to powerCycle
set sysctrl_type to msc
set sysctrl_status to enabled
set sysctrl_owner to node2
set sysctrl_device to /dev/ttyd2
set sysctrl_owner_type to tty
add nic 192.0.2.58
set heartbeat to true
set ctrl_msgs to true
set priority to 1
done
add nic 160.0.2.15
set heartbeat to true
set ctrl_msgs to true
set priority to 2
done
done
# Node node2 definition
define node node2
set hostname to node2.dept.company.com
set is_failsafe to true
set nodeid to 24140
set hierarchy to Reset,Shutdown
set reset_type to powerCycle
set sysctrl_type to msc
set sysctrl_status to enabled
set sysctrl_owner to node1
set sysctrl_device to /dev/ttyd2
set sysctrl_owner_type to tty
add nic 192.0.2.59
set heartbeat to true
set ctrl_msgs to true
set priority to 1
done
add nic 160.0.2.16
set heartbeat to true
set ctrl_msgs to true
set priority to 2
done
done
# Define cluster and add nodes to the cluster
define cluster test-cluster
set is_failsafe to true
set ha_mode to normal
done
modify cluster test-cluster
add node node1
add node node2
done
set cluster test-cluster
quit |
Template files of scripts that you can modify to configure the different components of your system are located in the /var/cluster/cmgr-templates directory.
Each template file contains a list of cmgr commands to create a particular object, as well as comments describing each field. The template also provides default values for optional fields.
Table 4-2 shows the template scripts for cmgr that are found in the /var/cluster/cmgr-templates directory.
Table 4-2. Template Scripts for cmgr
File name | Description |
|---|---|
cmgr-create-cluster | Creation of a cluster |
cmgr-create-failover_policy | Creation of failover policy |
cmgr-create-node | Creation of node |
cmgr-create-resource_group | Creation of resource group |
cmgr-create-resource_type | Creation of resource type |
cmgr-create-resource-resource_type | Creation of a resource of the specified resource type |
To create a FailSafe configuration, you can concatenate multiple templates into one file and execute the resulting script. If you concatenate information from multiple template scripts to prepare your cluster configuration, you must remove the quit at the end of each template script, except for the final quit. A cmgr script must have only one quit line.
For example, for a three-node configuration with an NFS resource group containing one volume, one filesystem, one IP_address, and one NFS resource, you would concatenate the following files, removing the quit at the end of each template script except the last one:
Three copies of the cmgr-create-node file
One copy of each of the following files:
cmgr-create-cluster
cmgr-create-failover_policy
cmgr-create-resource_group
cmgr-create-resource-volume
cmgr-create-resource-filesystem
cmgr-create-resource-IP_address
cmgr-create-resource-NFS |
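The concatenation rule described above (keep only one quit, at the very end) can be automated. The following is a minimal POSIX shell sketch, not a FailSafe utility: the function name concat_cmgr_scripts is hypothetical, and it assumes that a quit command always appears alone on its line. You would still need to edit the field values in each template copy before or after combining them.

```shell
#!/bin/sh
# Sketch only: concat_cmgr_scripts is a hypothetical helper.
# Assumption: each "quit" command sits alone on its own line.
concat_cmgr_scripts() {
    # Usage: concat_cmgr_scripts file... > combined_script
    # Copies the files in order, drops every line that consists only of
    # "quit" (optionally surrounded by spaces or tabs), and then writes a
    # single "quit" as the final line.
    cat "$@" | awk '
        /^[ \t]*quit[ \t]*$/ { next }
        { print }
        END { print "quit" }
    '
}
```

For the three-node example, you could pass three edited copies of cmgr-create-node followed by the other template files, redirect the output to a new file, and run that file with cmgr -f. Any repeated first lines that begin with # are kept, since comment lines are valid anywhere in a cmgr script.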