Appendix A. FailSafe and SGI Cluster Manager

Table A-1 summarizes the differences between IRIX FailSafe and SGI Cluster Manager for Linux for readers who are already familiar with FailSafe.


Note: SGI Cluster Manager for Linux members and FailSafe nodes do not work together and cannot form a high-availability cluster.


Table A-1. Differences Between FailSafe and SGI Cluster Manager

Topic

FailSafe

SGI Cluster Manager

Operating system

IRIX

SLES9 with SGI ProPack 4 for Linux

Terminology

node, resource

member, application

Size of cluster

8 nodes

2 members

Node/member name

Hostname or private network address

Hostname only

NFS lock failover

Supported

Not supported

Network tiebreaker

A node that is participating in the cluster membership. FailSafe tries to include the tiebreaker node in the membership in case of split-brain scenarios.

The IP address of a machine or router that does not participate in the cluster membership. Usually it is the IP address of a network router that connects the SGI Cluster Manager members to the external world (clients). In a split-brain scenario, only those members that can contact the tiebreaker IP address can form a cluster. There is also a disk tiebreaker.

Rolling upgrade

Supported

Not supported

Configuration information storage

Information is stored in the cluster database. The cluster database is replicated on all nodes automatically and kept in synchronization.

Information is stored in the /etc/cluster.xml configuration file and in the shared partitions. For initial configuration, you must copy this file to all members, for example by using scp. After making configuration changes, you must verify that the configuration files are in synchronization. See “Step 15: Verify that Configuration Changes are Synchronized” in Chapter 4.

Making changes while the service is enabled

Device parameter, IP address parameters, and check interval can be changed.

Device parameter, IP address parameters, and check interval cannot be changed.

Script location for resources and resource types

/var/cluster/ha/resource_types

/usr/lib/clumanager/services/service

Heartbeat interval and timeout

You can specify cluster membership heartbeat interval and timeout (in milliseconds).

On the command line, you can specify the heartbeat interval (in microseconds) and the number of heartbeats that can be consecutively missed (tko_count). You can also specify the aggregate failover speed in the GUI.

Heartbeat networks

Allows multiple networks to be designated as heartbeat networks. You can choose a list of networks.

Allows heartbeat on all networks or as a multicast on the hostname network. However, you cannot choose a list of networks.

Action scripts

Separate scripts named start, stop, monitor, restart, and exclusive.

A bash script that contains start, stop, and status parameters (see Chapter 6, “Creating a New Highly Available Application”). The equivalent for restart in SGI Cluster Manager is to perform a stop and then a start; there is no equivalent in SGI Cluster Manager for exclusive.

Resource timeouts

Timeouts can be specified for each action (start, stop, monitor, restart, exclusive) and for each resource type independently.

A single timeout can be specified for each service, irrespective of the action or the number of resources the service contains.

Resource dependencies

Resource and resource type dependencies are supported and can be modified by the user.

Applications have fixed dependencies. The start and stop order of applications cannot be modified.

Failover policies

The ordered and round-robin failover policies are predefined. User-defined failover policies are supported.

Only the predefined ordered policy is supported. No user-defined failover policies are supported.
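
The heartbeat row in Table A-1 notes that FailSafe takes an interval and a timeout in milliseconds, while SGI Cluster Manager takes an interval in microseconds and a count of consecutively missed heartbeats (tko_count). The following Python sketch shows one way to translate between the two when planning a migration. It assumes that a member is declared failed after roughly interval × tko_count, which this appendix does not state explicitly. The function names and the sample values are hypothetical and are not product defaults.

```python
import math


def failsafe_to_cluster_manager(interval_ms, timeout_ms):
    """Translate FailSafe heartbeat settings (milliseconds) into an
    SGI Cluster Manager style pair: (interval in microseconds, tko_count).

    Assumption: failure is detected after about interval * tko_count,
    so tko_count is the timeout divided by the interval, rounded up.
    """
    if interval_ms <= 0 or timeout_ms <= 0:
        raise ValueError("interval and timeout must be positive")
    interval_us = interval_ms * 1000  # milliseconds -> microseconds
    tko_count = math.ceil(timeout_ms / interval_ms)
    return interval_us, tko_count


def detection_time_seconds(interval_us, tko_count):
    """Approximate failure-detection time implied by the two settings,
    under the same interval * tko_count assumption."""
    return interval_us * tko_count / 1_000_000


if __name__ == "__main__":
    # Hypothetical FailSafe values: 1000 ms interval, 5000 ms timeout.
    interval_us, tko_count = failsafe_to_cluster_manager(1000, 5000)
    print(interval_us, tko_count)
    print(detection_time_seconds(interval_us, tko_count))
```

Rounding tko_count up keeps the translated detection time at or above the original FailSafe timeout, so the translated settings are never more aggressive than the originals. Verify the resulting values against the SGI Cluster Manager documentation before applying them.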