Appendix A. Names Used in Template Configuration Files

This appendix describes each of the names—block names, section names, and parameter names—used in the sample configuration files included with the IRIS FailSafe product and the IRIS FailSafe options.

When you are developing scripts for failing over a new highly available service, you can use these names or define new ones. For ease of maintenance, use each existing name only for the purpose described here. Defining new names is described in Chapter 2, “Modifying the Configuration File for a New Highly Available Service.”

This appendix contains three sections: Block Names, Section Names, and Parameter Names.

Block Names

Table A-1 lists the blocks in configuration file templates included in IRIS FailSafe products and summarizes their contents.

Table A-1. Major Blocks in the Configuration File

Name

Description

action appclass

Describes the scripts that are to be executed for the highly available service appclass. An action block exists for each highly available service. For all highly available services the action block specifies the local and monitor scripts if used. For the “main” highly available service, it also specifies the giveaway, giveback, takeback, takeover, and kill scripts.

action-timer appclass

Describes the various timers that are used by the application monitor to decide when to execute and time out a monitoring script. The values can be adjusted based upon the expected response times of instances of the highly available service appclass.

application-class appclass

Describes one highly available service that is failed over by this IRIS FailSafe cluster. These blocks identify the nodes that provide the highly available service appclass in normal state.

filesystem label

Each filesystem block describes a single filesystem on a shared disk. For each filesystem, it specifies the primary and backup nodes and mount information. There should be one filesystem block for each filesystem on a shared disk in the cluster.

informix label

An informix section is present for each INFORMIX database failed over in this cluster.

interface-pair label

Contains IP addresses to be failed over and the primary interface and backup interface for the IP addresses.

internal

Describes the various timeout values that are used by IRIS FailSafe daemons. The values in this block, except for long-timeout, must not be changed.

nfs label

nfs blocks are present if NFS is failed over in this cluster. Each nfs block describes the NFS export information associated with a single exported filesystem.

node label

Each node block describes the network interface configuration and heartbeat and serial information for a node in the cluster.

oracle label

An oracle section is present for each Oracle database failed over in this cluster.

sybase label

A sybase section is present for each Sybase Server failed over in this cluster.

system-configuration

Describes global variables for the IRIS FailSafe cluster as a whole.

volume label

Each volume block describes the ownership and location of one XLV volume.

webserver label

webserver blocks are present if any of the nodes in this cluster are Netscape servers. Each webserver block describes the Netscape configurations of one node.


Section Names

Table A-2 lists the section names in the configuration file templates.

Table A-2. Section Names in Template Configuration Files

Section Name

Description

heartbeat

This section specifies heartbeat monitoring parameters.

interface label

A node network interface that is to be monitored. The interface label is created from the ip-address and name parameters and must be unique in the configuration file. There is one interface section for each public interface in the node that is part of IRIS FailSafe. Not all public interfaces need to be part of IRIS FailSafe.

mount-info

Contains filesystem mounting information.

web-confign

Describes one Web server instance on a node.


Parameter Names

Table A-3 lists the parameters used in the configuration file templates provided with IRIS FailSafe software options. Additional parameters can be defined as needed when failover of other applications is added.

Table A-3. Parameters in Template Configuration Files

Parameter Name

Possible Values

Description

agent

pathname

The pathname of the agent for the highly available service.

backup-node

label

nodename is a name returned by hostname.

For volume blocks: The backup node for this XLV logical volume. The value assigned to backup-node must match the label for a node block.

For webserver blocks: The backup node for the Netscape server. The value must match the label for a node block.

backup-server

string

The name of the Backup Server for this SQL Server.

broadcast-addr

X.X.X.X

For node blocks: The broadcast address for the subnet.

For interface-pair blocks: The subnet broadcast IP address for the IP aliases in X.X.X.X notation.
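As a cross-check when filling in broadcast-addr, the usual subnet broadcast address can be derived from an IP address and its netmask. The sketch below uses Python's standard ipaddress module. The function name and the sample addresses are illustrative only, and the sketch assumes the conventional all-ones broadcast address for the subnet.

```python
import ipaddress

def subnet_broadcast(ip_address: str, netmask: str) -> str:
    """Return the broadcast address of the subnet that contains ip_address."""
    # strict=False accepts a host address instead of the network address.
    network = ipaddress.ip_network(f"{ip_address}/{netmask}", strict=False)
    return str(network.broadcast_address)

# Sample values only; substitute the addresses used in your cluster.
print(subnet_broadcast("192.0.2.17", "255.255.255.0"))  # 192.0.2.255
```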

config-file

string

The INFORMIX configuration file for this node. Its value is the value of the ONCONFIG environment variable.

controlled-failback

true
false
(not set)

Controls whether this node automatically moves to normal state after a failure. If controlled-failback is set to true, the node doesn't move to normal state after failure; it moves to controlled failback state. If set to false or not set, the node moves to normal state.

db-avail

high
low

If the value is high (the default value), a database server (INFORMIX, Oracle, or Sybase) failure forces a failover. If the value is low, a failure of the database server doesn't force a failover, but the failover is reported.

db-probe-time

integer

Specifies the length of time (in seconds) between the completion of one probe of the database by the database agent and the beginning of the next probe.

db-shutdown-timeout

integer

The timeout for the Oracle shutdown script specified by shutdown-script. If the shutdown script doesn't complete in this many seconds, IRIS FailSafe performs an abort shutdown of the Oracle instance.
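The timeout behavior described for db-shutdown-timeout (run the shutdown script, and fall back to an abort if it does not finish in time) can be pictured with the following sketch. It is not the IRIS FailSafe implementation. The function name, the abort callback, and the stand-in commands are all hypothetical, and Python child processes are used in place of a real shutdown script.

```python
import subprocess
import sys

def run_shutdown(command, timeout_seconds, abort_shutdown):
    """Run command; if it has not finished after timeout_seconds, call abort_shutdown().

    Returns "completed" or "aborted".
    """
    try:
        # On timeout, subprocess.run kills the child and raises TimeoutExpired.
        subprocess.run(command, timeout=timeout_seconds, check=False)
        return "completed"
    except subprocess.TimeoutExpired:
        abort_shutdown()
        return "aborted"

# Stand-ins for a shutdown script: one finishes at once, one hangs.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
aborts = []
print(run_shutdown(quick, 20, lambda: aborts.append("abort")))  # completed
print(run_shutdown(slow, 0.5, lambda: aborts.append("abort")))  # aborted
```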

db-retry-count

integer

The number of monitoring retries by the database agent before a failure is declared.

db-timeout

integer

Defines (in seconds) the time the database agent waits for a response to its probe from the database instance.

devname

pathname

The device filename for the XLV logical volume.

devname-group

string

The group of the device name for the XLV logical volume (reported by ls -l). The default value is sys.

devname-mode

fs_mode

The access mode of the device name for the XLV logical volume (reported by ls -l). The default value is 0600.

devname-owner

string

The user ID of the owner of the device name for the XLV logical volume (reported by ls -l). The default value is root.

export-info

export_mode

Filesystem export options (see the exports(4) reference page and the section “Wsync Filesystem Options” in Chapter 4 of the IRIS FailSafe Administrator's Guide).

export-point

pathname

The pathname of an exported filesystem.

filesystem

label

A filesystem to be exported. The value of this parameter must match the label of a filesystem block and the label of the block.

fs-type

xfs

The filesystem type. Only xfs filesystems are supported.

giveaway

pathname

The pathname of the giveaway script in /var/ha/actions.

giveback

pathname

The pathname of the giveback script in /var/ha/actions.

hb-lost-count

integer

Specifies how many heartbeat probe failures must occur to declare a heartbeat failure. The recommended value is 3.

hb-private-ipname

string
X.X.X.X

This node's IP address for the private network used by heartbeat and control messages.

hb-probe-time

integer

Heartbeat messages begin this many seconds after the node controller tells the application monitor to start monitoring. Also, this value specifies how long to wait (in seconds) after completion of the last heartbeat message to begin the next heartbeat message. The recommended value is 5.

hb-public-ipname

hostname
X.X.X.X

This node's IP address for the public network that is used for heartbeat messages if the private network fails. This IP address is a fixed IP address.

hb-timeout

integer

Specifies how long (in seconds) to wait for a heartbeat response before declaring a failure. The recommended value is 5.
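Taken together, hb-lost-count, hb-probe-time, and hb-timeout bound how quickly a lost heartbeat is noticed. The sketch below gives only a rough estimate, and it rests on an assumption that this appendix does not state: probes run strictly one after another, each failed probe consumes hb-timeout seconds, and hb-probe-time seconds separate the end of one probe from the start of the next. The function name is hypothetical.

```python
def estimated_detection_seconds(hb_lost_count: int, hb_probe_time: int, hb_timeout: int) -> int:
    """Rough time from the first unanswered probe until hb_lost_count probes have timed out.

    Assumption (not from the product documentation): probes are sequential,
    each failed probe takes hb_timeout seconds, and hb_probe_time seconds
    pass between the end of one probe and the start of the next.
    """
    if hb_lost_count < 1:
        raise ValueError("hb_lost_count must be at least 1")
    waiting = hb_lost_count * hb_timeout
    gaps = (hb_lost_count - 1) * hb_probe_time
    return waiting + gaps

# Recommended values from this table: hb-lost-count 3, hb-probe-time 5, hb-timeout 5.
print(estimated_detection_seconds(3, 5, 5))  # 3*5 + 2*5 = 25
```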

httpd-dir

pathname-port

The Netscape® server root location.

httpd-options-file

string

The Netscape configuration file that starts multiple Netscape servers. The value is not a full pathname; it is a file in the directory /etc/config. The default value is ns_httpd.options, which is the configuration file for the Netscape Communications server.

httpd-restart

false
true

If the two nodes have identical Netscape server configurations (a dual-active configuration with the same configuration information, log locations, and document root), then the Netscape server doesn't need to be restarted after failover (because an identical server is already running) and httpd-restart should be set to false. Otherwise, the Netscape server needs to be started on the backup node after failover and httpd-restart should be set to true.

httpd-script

pathname

The full pathname of the script used to start and stop the Netscape server. The default value is /etc/init.d/ns_httpd, which is the script for the Netscape Communications server.

instance-id

string

The value of $ORACLE_SID for the Oracle instance that IRIS FailSafe is monitoring.

interface-probe-interval

integer

The length of time (in seconds) between the completion of one probe of the local interfaces and the beginning of the next probe. The value is rounded to the nearest five-second increment.
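Because the value is rounded to the nearest five-second increment, two different settings can produce the same effective interval. The sketch below shows one way to compute that rounding for whole-second values. The function name is hypothetical, and how the software treats values that are not whole seconds is not covered here.

```python
def round_to_five(seconds: int) -> int:
    """Round a whole number of seconds to the nearest multiple of five."""
    # With integer input the remainder is 0..4, so there are no ties:
    # remainders 0, 1, 2 round down and remainders 3, 4 round up.
    return 5 * ((seconds + 2) // 5)

for configured in (12, 13, 20):
    print(configured, "->", round_to_five(configured))
```

With this rule, 12 becomes 10, 13 becomes 15, and 20 is unchanged.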

interface-probe-timeout

integer

The length of time (in seconds) that the interface agent waits after probing the local interfaces without a response before declaring a failure.

ip-address

string
X.X.X.X

For node blocks: The fixed IP address of this network interface. It can be a name (string) or an address (X.X.X.X).

For nfs blocks: One (any one) of the IP addresses or IP aliases on the node that is primary for this filesystem, preferably in the form X.X.X.X. A good choice is the IP address or IP alias used by clients to mount the filesystem. If an IP name is used, the length of time required to resolve the name to an address could require that the lmon-timeout value for NFS be increased.

For webserver blocks: The high availability IP address used by clients to access the Netscape server. An IP address of the form X.X.X.X is recommended.

ip-aliases

( string )
( str1 str2 str3 )

For interface-pair blocks: Specifies the list of IP addresses to be failed over using IP aliasing.

For application-class nfs blocks: Specifies a list of IP aliases. This parameter, with at least one IP alias, is required if NFS file locking is used (listing at least one IP alias is recommended in any case). Each IP alias that is used for NFS should be listed.

kill

/usr/etc/ha_kill

The pathname of the ha_kill command.

lmon-probe-time

integer

The probe interval (in seconds) for local monitoring of this highly available service.

lmon-timeout

integer

Local monitoring of this highly available service times out in integer seconds if no response is received.

local-monitor

pathname

The pathname of the local monitoring script for this highly available service.

long-timeout

integer

The maximum time taken by the takeover, takeback, giveaway, and giveback scripts. If these scripts cannot be executed in this length of time, the value should be increased. This value is also used as the maximum time taken by several types of internal IRIS FailSafe communications.

MAC-address

X:X:X:X:X:X

MAC addresses are required only if the network interfaces have to use MAC address impersonation. See the section “Network Interfaces and IP Addresses” in Chapter 1 of the IRIS FailSafe Administrator's Guide.
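A quick format check can catch typing errors in a MAC-address value before the configuration file is installed. The sketch below assumes that each X in X:X:X:X:X:X stands for one or two hexadecimal digits. That reading is an assumption, and the function name and sample addresses are illustrative only.

```python
import re

# Assumption: each X in X:X:X:X:X:X is one or two hexadecimal digits.
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{1,2}(:[0-9A-Fa-f]{1,2}){5}")

def looks_like_mac(value: str) -> bool:
    """Return True if value has six colon-separated hexadecimal fields."""
    return MAC_PATTERN.fullmatch(value) is not None

print(looks_like_mac("08:00:69:02:04:af"))  # True
print(looks_like_mac("08:00:69:02:04"))     # False, only five fields
```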

mail-dest-addr

user@host
(not set)

Mail is sent to this address when private network failure has been detected, the local node controller process appears to be hung or dead, the cluster is transitioning to degraded state, the cluster is transitioning to standby state, killing of a node fails, the ha_killd daemon has died, the ha_killd daemon could not be started, or the reset device monitor has failed. Do not set if mail is not configured on this node.

master-db-fs

label

label is the filesystem label for the filesystem of the master database.

master-db-vol

label

label is the volume label for the volume of the master database.

mode

fs_mode

Filesystem mount options (see the fstab(4) reference page and the section “Wsync Filesystem Options” in Chapter 4 of the IRIS FailSafe Administrator's Guide).

monitoring-level

1
2

For informix blocks: Defines which test is done to determine if INFORMIX is up. If the value is 1, the database agent executes the onstat command and searches the output for the pattern specified by running-indicator-strings. If there is a match, the database is up. If the value is 2, the database agent uses a database call to determine if INFORMIX is up.

For webserver blocks: Defines which test is done to determine if Netscape is up. If the value is 1, the ps command is used to check if the httpd process is running on the node. If the value is 2, an http request is sent to the httpd process to determine if the Netscape server is running. The value 2 is a stricter check. If the value 1 is specified, the parameter search-string in the webserver block is used.

mount-point

pathname

The pathname of a filesystem mount point. Both nodes use the same mount point.

name

interface

interface is a network interface. Each node has several network interfaces. For example, a CHALLENGE S node has the network interfaces ec0, ec2, and ec3.

netmask

X.X.X.X

For node blocks: The netmask used to identify this node on the subnet.

For interface-pair blocks: The netmask for the IP aliases.

port-num

integer

The Netscape server port number.

primary-interface

label

A name for an interface on which the IP aliases are configured in normal state, typically created by combining the hostname and interface name. The value must match an interface section label in a node block.

pwrfail

true
false
(not set)

This parameter does not apply to CHALLENGE S nodes. When set to true (the default), it allows the surviving node to attempt to go to degraded state after it detects a power failure on the other node (or the private network, public network, and serial connections are broken). If pwrfail is set to false or not set, the node goes to standby state after it detects a power failure on the other node.

re-mac

true
false
(not set)

If the IP address and the physical address of the primary interface are to be transferred to the backup interface when a failover occurs, set re-mac to true. Otherwise set it to false or leave it undefined.

release-dir

pathname

For informix blocks: The INFORMIX release directory specified in the INFORMIX configuration. This value is the value of the environment variable INFORMIXDIR.

For sybase blocks: The Sybase release directory specified in the Sybase configuration, which is also the value of the SYBASE environment variable.

remote-send-probe-interval

integer

The frequency (in seconds) of messages to the interface agent on the other node. This parameter must be less than the value of interface-probe-interval, but should be only slightly less. The value is rounded to the nearest five seconds.

remote-send-timeout

integer

The length of time (in seconds) between the completion of one probe of the other node's highly available interfaces and the beginning of the next probe. This parameter must be less than the value of interface-probe-interval, but should be only slightly less. The value is rounded to the nearest five seconds.

reset-host

hostname

Applies only to nodes running the Silicon Graphics Oracle Parallel Server product. If an IRISconsole is used to provide reset functionality, hostname is the hostname of the Indy running IRISconsole software. reset-host is ignored if reset-tty is set. Do not set reset-host if you are not running OPS in the IRIS FailSafe cluster.

reset-tty

serial_devicename

The device filename of the serial port on this node that is used by the serial cable connected to the system controller port on the other node or to the remote power control unit.

retry-count

integer

Determines the number of retries done by the local monitoring script. The application monitor declares a local monitor failure after lmon-timeout seconds independent of the retry-count value.

running-indicator-strings

string

A string that is used as a search pattern in determining if On-Line is up. The output of the onstat command is searched. The string must begin and end with double-quotes. An OR symbol (|) can be used in the string to separate multiple search patterns. The string cannot contain any blanks and the search is case sensitive.
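The rules for running-indicator-strings (enclosing double quotes, | between alternatives, no blanks, case-sensitive search) can be illustrated with the sketch below. It treats each alternative as a plain substring, which is an assumption; the database agent may interpret patterns differently. The function names, the sample value, and the sample output line are made up for illustration and are not real onstat output.

```python
def indicator_patterns(value: str) -> list:
    """Split a running-indicator-strings value into its search patterns."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("value must begin and end with double quotes")
    body = value[1:-1]
    if any(ch.isspace() for ch in body):
        raise ValueError("value cannot contain blanks")
    patterns = body.split("|")
    if not all(patterns):
        raise ValueError("empty search pattern")
    return patterns

def output_matches(output: str, value: str) -> bool:
    """Case-sensitive search of output for any of the patterns (substring match assumed)."""
    return any(pattern in output for pattern in indicator_patterns(value))

sample_value = '"On-Line|Quiescent"'             # hypothetical setting
sample_output = "server status: On-Line, up 10"  # made-up output line
print(indicator_patterns(sample_value))          # ['On-Line', 'Quiescent']
print(output_matches(sample_output, sample_value))  # True
```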

sa-passwd

string

The unencrypted password of the database system administrator (sa-user). This parameter should be omitted if there is no password.

sa-user

string

For informix blocks: The INFORMIX login name of the INFORMIX database system administrator.

For sybase blocks: The Sybase login name of the Sybase database system administrator.

search-string

string

Specifies the string to be searched for in the output of the ps command to verify that the httpd process is running, for example ns-httpd. This parameter is required if the monitoring-level parameter in a webserver block has the value 1.
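The dependency between monitoring-level and search-string in a webserver block lends itself to a simple consistency check. The sketch below assumes the block's parameters have already been read into a dictionary of strings; that representation and the function name are hypothetical, not part of IRIS FailSafe.

```python
def check_webserver_block(params: dict) -> list:
    """Return a list of problems found in one webserver block's parameters."""
    problems = []
    level = params.get("monitoring-level")
    if level is not None and level not in ("1", "2"):
        problems.append("monitoring-level must be 1 or 2")
    elif level == "1" and not params.get("search-string"):
        problems.append("search-string is required when monitoring-level is 1")
    return problems

print(check_webserver_block({"monitoring-level": "1"}))
print(check_webserver_block({"monitoring-level": "1", "search-string": "ns-httpd"}))
```

The first call reports the missing search-string; the second returns an empty list.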

secondary-interface

label

A name for the backup interface, an interface on the other node that replaces the primary interface on failover. It is typically created by concatenating the hostname and interface name. The value must match an interface section label in a node block.

server-node

label

For application-class blocks: The primary node for instances of the highly available service. server-node is listed twice if each of the nodes in a cluster serves as the primary node for some instances. The values assigned to server-node must match the labels for node blocks.

For volume blocks: The primary node for this XLV logical volume. The value assigned to server-node must match the label for a node block.

For webserver blocks: The primary node for this Netscape server. The value must match the label for a node block.
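Several parameters, including server-node and backup-node, must match the label of a node block. A cross-reference check along the following lines can catch mismatched labels. The sketch assumes the configuration has already been read into plain Python structures; the data layout, function name, and all labels shown are hypothetical.

```python
def unmatched_node_references(node_labels, blocks):
    """Return (block, parameter, value) triples whose value is not a node block label."""
    known = set(node_labels)
    problems = []
    for block_name, params in blocks.items():
        for key in ("server-node", "backup-node"):
            values = params.get(key, [])
            if isinstance(values, str):
                values = [values]  # a parameter listed once
            for value in values:
                if value not in known:
                    problems.append((block_name, key, value))
    return problems

# Hypothetical labels, for illustration only.
nodes = ["node-a", "node-b"]
blocks = {
    "application-class nfs": {"server-node": ["node-a", "node-b"]},
    "volume vol1": {"server-node": "node-a", "backup-node": "node-c"},
}
print(unmatched_node_references(nodes, blocks))
# [('volume vol1', 'backup-node', 'node-c')]
```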

short-timeout

integer

The maximum length of time (in seconds) for certain IRIS FailSafe internal communications tasks to complete. Do not change this value.

shutdown-script

pathname

The pathname of a custom Oracle shutdown script that replaces the standard Oracle shutdown script.

shutdown-options

integer

Defines how the database is shut down on a failover. The possible values are: 0—normal shutdown, 1— shutdown with nowait and no checkpointing, and 2— shutdown with nowait with checkpointing.

start-monitor-time

integer

For application-class blocks: How long (in seconds) the application monitor waits after a node releases this highly available service (giveaway) before it starts monitoring the service. This parameter is required.

For an action-timer block for a highly available service that has an agent (for example interfaces, sybase, informix, or oracle): The length of time (in seconds) that the interface agent waits after it is told by the application monitor to start monitoring the local interfaces before beginning to monitor. This wait ensures that the database instance has had time to start up and must be greater than or equal to the value of long-timeout.

For an action-timer block for a highly available service that doesn't have an agent: The length of time (in seconds) that the monitoring script waits after it is told by the application monitor to start monitoring before it begins monitoring. The value must be greater than or equal to the value of long-timeout and is required.
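The rule that start-monitor-time in an action-timer block must be greater than or equal to long-timeout can be checked mechanically. The sketch below assumes the timer values have been read into a dictionary keyed by appclass; the layout, the function name, and the sample numbers are hypothetical.

```python
def start_monitor_time_problems(long_timeout: int, action_timers: dict) -> list:
    """List action-timer blocks whose start-monitor-time is below long_timeout."""
    problems = []
    for appclass, params in action_timers.items():
        value = params.get("start-monitor-time")
        # Missing values are skipped here; whether the parameter is required
        # depends on the kind of block, as described above.
        if value is not None and value < long_timeout:
            problems.append(
                f"{appclass}: start-monitor-time {value} is less than long-timeout {long_timeout}"
            )
    return problems

# Sample numbers only.
timers = {"nfs": {"start-monitor-time": 60}, "oracle": {"start-monitor-time": 30}}
print(start_monitor_time_problems(60, timers))
```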

startup-script

pathname

The pathname of a custom Oracle startup script that replaces the standard Oracle startup script.

statmon-dir

pathname

Specifies the pathname of a directory on a shared filesystem belonging to its server-node where NFS locking information is stored. The basename of this directory must be statmon.
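The basename requirement for statmon-dir is easy to verify before the configuration file is installed. The sketch below uses Python's standard posixpath module; the function name and sample paths are illustrative only, and the check does not confirm that the directory is on a shared filesystem.

```python
import posixpath

def has_statmon_basename(path: str) -> bool:
    """Return True if the last component of path is 'statmon'."""
    # normpath removes a trailing slash so "/shared1/statmon/" is accepted.
    return posixpath.basename(posixpath.normpath(path)) == "statmon"

print(has_statmon_basename("/shared1/statmon"))   # True
print(has_statmon_basename("/shared1/statmon2"))  # False
```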

takeback

pathname

The pathname of the takeback script in /var/ha/actions.

takeover

pathname

The pathname of the takeover script in /var/ha/actions.

unix-user

login_id

The IRIX login name of the owner of the release directory for the database software.

version-major

1

A version number that, along with version-minor, specifies the configuration file format used in this file. It is 1 for IRIS FailSafe 1.0 and IRIS FailSafe 1.1.

version-minor

0
1

A version number that, along with version-major, specifies the configuration file format used in this file. It is 0 for IRIS FailSafe 1.0 configuration files (which can run on IRIS FailSafe 1.1 as well) and 1 for IRIS FailSafe 1.1 configuration files.

volume-name

label

volname must match a volume block label.

webserver-num

integer

The number of Netscape servers configured on this node (server-node). This is the number of web-config sections for this Netscape server.