Appendix A. Software Overview

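This appendix contains the following sections:

  • Software Layers

  • Communication Paths

  • Communication Paths in a Coexecution Cluster

  • Execution of FailSafe Action and Failover Scripts

  • Components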

Software Layers

A FailSafe system has the following software layers:

  • Plug-ins, which create highly available services. The following table shows the provided and optional FailSafe plug-ins and their associated resource types.

    Table A-1. Provided and Optional Plug-Ins

    Provided Plug-In     Resource Type  Optional Plug-In         Resource Type
    -------------------  -------------  -----------------------  -----------------------
    CXFS file system     CXFS           FailSafe/DMF             DMF
    IP addresses         IP_address     FailSafe/NFS             NFS and statd_unlimited
    MAC addresses        MAC_address    FailSafe/Informix        INFORMIX_DB
    XFS file systems     filesystem     FailSafe/Oracle          Oracle_DB
    XLV logical volumes  volume         FailSafe/Samba           Samba
                                        FailSafe/TMF             TMF
                                        FailSafe/Web (Netscape)  Netscape_web

    See the release notes for information about the specific releases of these products that are supported.


    Note: The Samba interfaces parameter allows Samba to support multiple IP interfaces. It takes the following format, where IP must be a dotted decimal IP address and netmask must be a dotted decimal netmask such as 255.255.255.0:
    interfaces = IP1/netmask1 IP2/netmask2


    If the application you want is not available, you can hire the SGI Professional Services group to develop the required software, or you can use the IRIS FailSafe Version 2 Programmer's Guide to write the software yourself.

  • FailSafe base, which includes the ability to define resource groups and failover policies.

  • Cluster services, which let you define clusters, resources, and resource types. This layer consists of the cluster_services installation package.

  • Cluster software infrastructure, which lets you do the following:

    • Perform node logging

    • Administer the cluster

    • Define nodes

    The cluster software infrastructure consists of the cluster_admin and cluster_control subsystems.
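The Samba interfaces format described in the note under the plug-ins layer (IP1/netmask1 IP2/netmask2, both parts in dotted decimal) can be checked before it is placed in a configuration file. The following is a minimal sketch using Python's standard ipaddress module; the function name parse_interfaces is hypothetical and is not part of FailSafe or Samba.

```python
import ipaddress


def parse_interfaces(value):
    """Parse 'IP1/netmask1 IP2/netmask2' into a list of (ip, netmask) pairs.

    Hypothetical helper: raises ValueError if an entry lacks a netmask or
    if either part is not in dotted decimal form.
    """
    pairs = []
    for entry in value.split():
        ip_text, sep, mask_text = entry.partition("/")
        if not sep:
            raise ValueError(f"missing netmask in {entry!r}")
        # IPv4Address accepts only dotted decimal strings such as 10.0.0.5,
        # so a prefix length such as "24" is rejected here.
        ip = ipaddress.IPv4Address(ip_text)
        ipaddress.IPv4Address(mask_text)
        # IPv4Network rejects masks with non-contiguous bits.
        ipaddress.IPv4Network(f"0.0.0.0/{mask_text}")
        pairs.append((str(ip), mask_text))
    return pairs
```

For example, parse_interfaces("192.168.1.10/255.255.255.0") returns a single pair, while "192.168.1.10/24" raises ValueError because the netmask is not dotted decimal.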

Figure A-1 shows a graphic representation of these layers. The cluster services and cluster software infrastructure layers are shared with CXFS. Table A-2 describes the contents of the /usr/cluster/bin directory. For more information about CXFS, see the CXFS Version 2 Software Installation and Administration Guide.

Figure A-1. Software Layers


Table A-2. Contents of /usr/cluster/bin 

Layer                       Subsystem           Process    Description
--------------------------  ------------------  ---------  ------------------------------------------------
Plug-ins                    failsafe_informix   ha_ifmx2   IRIS FailSafe database agents. Each database
                            failsafe2_oracle               agent monitors all instances of one type of
                                                           database.

IRIS FailSafe Base          failsafe2           ha_fsd     IRIS FailSafe daemon. Provides basic component
                                                           of the IRIS FailSafe software.

Cluster services            cluster_services    ha_cmsd    The FailSafe membership daemon. Provides the
(high-availability                                         list of nodes, called FailSafe membership,
processes)                                                 available to the cluster.

                                                ha_gcd     Group membership daemon. Provides group
                                                           membership and reliable communication services
                                                           in the presence of failures to IRIS FailSafe
                                                           processes.

                                                ha_srmd    System resource manager daemon. Manages
                                                           resources, resource groups, and resource types.
                                                           Executes action scripts for resources.

                                                ha_ifd     Interface agent daemon. Monitors the local
                                                           node's network interfaces. This daemon is
                                                           described in detail in “Interface Agent Daemon
                                                           (IFD)”.

Cluster software            cluster_admin       cad        Cluster administration daemon. Provides
infrastructure (cluster                                    administration services.
administrative processes)

                            cluster_control     crsd       Node control daemon. Monitors the serial
                                                           connection to other nodes. Has the ability to
                                                           reset other nodes.

                                                cmond      Daemon that manages all other daemons. This
                                                           process starts other processes in all nodes in
                                                           the cluster and restarts them on failures.

                                                fs2d       Manages the cluster database and keeps each copy
                                                           in sync on all nodes in the pool.


Interface Agent Daemon (IFD)

The IFD is an agent that monitors network interfaces and IP addresses. The IFD monitors all network interfaces and IP addresses configured in the node even when there are no highly available IP addresses in the node.

The IFD checks the number of input packets for each interface. If the number of input packets does not increase for a 10-second period, the IFD contacts the broadcast address of the interface by using the ping(1M) command. If the input packet count does not increase in the next 10-second period, the network interface and all IP addresses on the interface are marked as bad.
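The two-period check described above can be sketched as a small state machine. This is an illustrative model only, not the ha_ifd implementation: the names InterfaceState and check_interface are hypothetical, the ping is passed in as a callback, and the reset to a healthy state when traffic resumes is an assumption that the text does not state.

```python
from dataclasses import dataclass

CHECK_PERIOD_SECONDS = 10  # check period stated in the text


@dataclass
class InterfaceState:
    last_count: int       # input packet count seen at the previous check
    pinged: bool = False  # True once the broadcast address has been pinged
    bad: bool = False     # True once the interface is marked as bad


def check_interface(state, current_count, ping_broadcast):
    """Apply one periodic check and return the new state (hypothetical model)."""
    if current_count > state.last_count:
        # Input packets arrived during the period: treat the interface as healthy.
        # Assumption: any earlier ping or bad marking is cleared.
        return InterfaceState(last_count=current_count)
    if not state.pinged:
        # First quiet period: provoke traffic by pinging the broadcast address.
        ping_broadcast()
        return InterfaceState(last_count=current_count, pinged=True)
    # Second quiet period in a row: mark the interface and its IP addresses bad.
    return InterfaceState(last_count=current_count, pinged=True, bad=True)
```

A caller would invoke check_interface once every CHECK_PERIOD_SECONDS for each interface, passing the latest input packet count.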

The IFD reads the configuration of IP addresses from the cluster database.

IP_address resource type action scripts use the ha_ifdadmin command to communicate with the IFD. Action scripts obtain IP address status and configuration information from the IFD.

IFD logging can be controlled with the GUI and the cmgr command.

Communication Paths

The following figures show communication paths in FailSafe.


Note: The following figures do not represent the cmond cluster manager daemon. The purpose of this daemon is to keep the other daemons running.


Figure A-2. Administrative Communication within One Node


Figure A-3. Daemon Communication within One Node


Figure A-4. Communication between Nodes in the Pool


Figure A-5. Communication for a Node Not in the Cluster


Communication Paths in a Coexecution Cluster

The following figures show the communication paths within one node in a coexecution cluster.

Figure A-6. Administrative Communication within One Node under Coexecution


Figure A-7. Daemon Communication within One Node under Coexecution


Execution of FailSafe Action and Failover Scripts

The order of execution is as follows:

  1. FailSafe starts up by using the start ha_services command in cmgr or as part of the node bootup procedure. It then reads the resource group information from the cluster database.

  2. FailSafe tells the system resource manager (SRM) to run exclusive scripts for all resource groups that are in the Online ready state.

  3. SRM returns one of the following states for each resource group:

    • running

    • partially running

    • not running

  4. If a resource group has a state of not running in a node where HA services have been started, the following occurs:

    1. FailSafe runs the failover policy script associated with the resource group. The failover policy script takes the list of nodes that are capable of running the resource group (the failover domain) as a parameter.

    2. The failover policy script returns an ordered list of nodes in descending order of priority (the run-time failover domain) where the resource group can be placed.

    3. FailSafe sends a request to SRM to move the resource group to the first node in the run-time failover domain.

    4. SRM executes the start action script for all resources in the resource group:

      • If the start script fails, the resource group is marked online on that node with the following error:

        srmd executable error

      • If the start script is successful, SRM automatically starts monitoring those resources. After the specified start monitoring time passes, SRM executes the monitor action script for the resources in the resource group.

  5. If the resource group has a state of running or partially running on only one node in the cluster, FailSafe runs the associated failover policy script:

    • If the highest priority node is the same node where the resource group is partially running or running, the resource group is made online on the same node. In the partially running case, FailSafe tells SRM to execute start scripts for all resources in the resource group.

    • If the highest priority node is another node in the cluster, FailSafe tells SRM to execute stop action scripts for resources in the resource group on other nodes. FailSafe then makes the resource group online in the highest priority node in the cluster.

  6. If the state of the resource group is running or partially running in multiple nodes in the cluster, the resource group is marked with an exclusivity error. These resource groups will require operator intervention to become online in the cluster.
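The decisions in steps 4 through 6 can be summarized in a short sketch. It is a simplified model rather than FailSafe code: the function name place_resource_group and the action labels it returns are hypothetical, and the run-time failover domain is assumed to be a non-empty list already returned by the failover policy script.

```python
def place_resource_group(states, run_time_failover_domain):
    """Decide where one resource group should run (hypothetical model).

    states: dict mapping node name to the state reported by the exclusive
        scripts: "running", "partially running", or "not running".
    run_time_failover_domain: node names in descending order of priority.
    Returns an (action, node) pair; the action labels are illustrative only.
    """
    active = [node for node, state in states.items()
              if state in ("running", "partially running")]
    if len(active) > 1:
        # Step 6: active on several nodes, so operator intervention is needed.
        return ("exclusivity error", None)
    target = run_time_failover_domain[0]
    if not active:
        # Step 4: not running anywhere, so start on the first node in the list.
        return ("start", target)
    current = active[0]
    if current == target:
        # Step 5, first case: the group stays on the node where it already is.
        if states[current] == "partially running":
            return ("start remaining resources", current)
        return ("keep online", current)
    # Step 5, second case: stop on the current node, then bring the group
    # online on the highest priority node.
    return ("stop then start", target)
```

For instance, a group that is partially running on the highest priority node yields ("start remaining resources", node), while a group reported as running on two nodes yields ("exclusivity error", None).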

Figure A-8 shows the message paths for action scripts and failover policy scripts.

Figure A-8. Message Paths for Action Scripts and Failover Policy Scripts


When a start Script Fails

When the start action script fails, the order of execution is as follows:

  1. SRM notifies FailSafe of the start action script failure as a resource group failure.

  2. FailSafe runs the failover policy script to determine the next node for the resource group.

  3. FailSafe sends a request to SRM to release the resource group and allocate the resource group in the next node in the cluster.

When a stop Script Fails

When the stop action script fails, the order of execution is as follows:

  1. SRM notifies FailSafe of the stop action script failure as a resource group failure.

  2. FailSafe marks the resource group with the following error:

    srmd executable error

  3. The system administrator must use the offline force command to clear the error state after stopping the resource group in the node.
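The difference between the two failure cases can be sketched as follows. This is a hedged illustration, not FailSafe behavior as implemented: the function handle_script_failure and the dictionary keys are hypothetical, and the rule for picking the next node (the highest priority node other than the one where the start script failed) is an assumption, because the text says only that the failover policy script determines it.

```python
SRMD_ERROR = "srmd executable error"  # error text quoted in the steps above


def handle_script_failure(script, failed_node, run_time_failover_domain):
    """Return a plan describing the reaction to a failed action script."""
    if script == "start":
        # Assumption: skip the node where the start script just failed.
        candidates = [n for n in run_time_failover_domain if n != failed_node]
        return {
            "release_on": failed_node,
            "allocate_on": candidates[0] if candidates else None,
            "error": None,
            "operator_action": None,
        }
    if script == "stop":
        # A stop failure is not retried elsewhere; the group keeps the error
        # until the administrator clears it with the offline force command.
        return {
            "release_on": None,
            "allocate_on": None,
            "error": SRMD_ERROR,
            "operator_action": "offline force",
        }
    raise ValueError(f"unexpected script name: {script!r}")
```

The sketch makes the asymmetry explicit: a start failure leads to an automatic move to another node, while a stop failure leaves the resource group in an error state that requires operator intervention.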

Components

The cluster database is a key component of FailSafe software. It contains all information about the following:

  • Resources

  • Resource types

  • Resource groups

  • Failover policies

  • Nodes

  • Clusters

The cluster database daemon (fs2d) maintains identical databases on each node in the cluster.

The following table shows the contents of the /var/cluster/ha directory.

Table A-3. Contents of the /var/cluster/ha directory

Directory or File                  Purpose
---------------------------------  -------------------------------------------------------
comm/                              Directory that contains files that communicate between
                                   various daemons. FailSafe processes create temporary
                                   files in this directory. FailSafe interprocess
                                   communication will fail if there is not sufficient disk
                                   space for this directory (approximately 2-3 MB) in the
                                   root filesystem on every node in a FailSafe cluster.

common_scripts/                    Directory that contains the script library (the common
                                   functions that may be used in action scripts).

log/                               Directory that contains the logs of all scripts and
                                   daemons executed by IRIS FailSafe. The outputs and
                                   errors from the commands within the scripts are logged
                                   in the script_Nodename file.

policies/                          Directory that contains the failover scripts used for
                                   resource groups.

resource_types/template            Directory that contains the template action scripts.

resource_types/RTname              Directory that contains the action scripts for the
                                   RTname resource type. For example,
                                   /var/cluster/ha/resource_types/filesystem.

resource_types/RTname/exclusive    Script that verifies that a resource of this resource
                                   type is not already running.

resource_types/RTname/monitor      Script that monitors a resource of this resource type.

resource_types/RTname/restart      Script that restarts a resource of this resource type
                                   on the same node after a monitoring failure.

resource_types/RTname/start        Script that starts a resource of this resource type.

resource_types/RTname/stop         Script that stops a resource of this resource type.
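When writing a new resource type, it can be useful to confirm that its directory contains all five action scripts listed in Table A-3. The following is a minimal sketch under that assumption; the helper name missing_action_scripts is hypothetical and is not a FailSafe tool, and the default base path is taken from the example in the table.

```python
from pathlib import Path

# Action script names listed in Table A-3.
ACTION_SCRIPTS = ("exclusive", "monitor", "restart", "start", "stop")


def missing_action_scripts(resource_type, base="/var/cluster/ha/resource_types"):
    """Return the action script names that are absent for a resource type."""
    rt_dir = Path(base) / resource_type
    return [name for name in ACTION_SCRIPTS if not (rt_dir / name).is_file()]
```

Calling missing_action_scripts("filesystem") on a node returns an empty list when all five scripts are present. The sketch checks only that the files exist; it does not check permissions or script contents.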