This appendix describes each of the names—block names, section names, and parameter names—used in the sample configuration files included with the IRIS FailSafe product and the IRIS FailSafe options.
When you are developing scripts for failing over a new highly available service, you can use these names or define new ones. For ease of maintenance, use an existing name only for the purpose described here. Defining new names is described in Chapter 2, “Modifying the Configuration File for a New Highly Available Service.”
Table A-1 lists the blocks in configuration file templates included in IRIS FailSafe products and summarizes their contents.
Table A-1. Major Blocks in the Configuration File
Name | Description |
|---|---|
action appclass | Describes the scripts that are to be executed for the highly available service appclass. An action block exists for each highly available service. For all highly available services the action block specifies the local and monitor scripts if used. For the “main” highly available service, it also specifies the giveaway, giveback, takeback, takeover, and kill scripts. |
action-timer appclass | Describes the various timers that are used by the application monitor to decide when to execute and time out a monitoring script. The values can be adjusted based upon the expected response times of instances of the highly available service appclass. |
application-class appclass | Describes one highly available service that is failed over by this IRIS FailSafe cluster. These blocks identify the nodes that provide the highly available service appclass in normal state. |
filesystem label | Each filesystem block describes a single filesystem on a shared disk. For each filesystem, it specifies the primary and backup nodes and mount information. There should be one filesystem block for each filesystem on a shared disk in the cluster. |
informix label | An informix section is present for each INFORMIX database failed over in this cluster. |
interface-pair label | Contains IP addresses to be failed over and the primary interface and backup interface for the IP addresses. |
internal | Describes the various timeout values that are used by IRIS FailSafe daemons. The values in this block, except for long-timeout, must not be changed. |
nfs label | nfs blocks are present if NFS is failed over in this cluster. Each nfs block describes the NFS export information associated with a single exported filesystem. |
node label | Each node block describes the network interface configuration and heartbeat and serial information for a node in the cluster. |
oracle label | An oracle section is present for each Oracle database failed over in this cluster. |
sybase label | A sybase section is present for each Sybase Server failed over in this cluster. |
system-configuration | Describes global variables for the IRIS FailSafe cluster as a whole. |
volume label | Each volume block describes the ownership and location of one XLV volume. |
webserver label | webserver blocks are present if any of the nodes in this cluster are Netscape servers. Each webserver block describes the Netscape configurations of one node. |
Table A-2 lists the section names in the configuration file templates.
Table A-2. Section Names in Template Configuration Files
Section Name | Description |
|---|---|
heartbeat | This section specifies heartbeat monitoring parameters. |
interface label | A node network interface that is to be monitored. The interface label is created from the ip-address and name parameters and must be unique in the configuration file. There is one interface section for each public interface in the node that is part of IRIS FailSafe. Not all public interfaces need to be part of IRIS FailSafe. |
mount-info | Contains filesystem mounting information. |
web-confign | Describes one Web server instance on a node. |
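
Because every interface label must be unique in the configuration file, a quick duplicate check can catch copy-and-paste mistakes before the file is installed. The following Python sketch is only an illustration: it assumes the labels have already been extracted from the file into a list, and the function name and sample labels are hypothetical rather than part of IRIS FailSafe.

```python
from collections import Counter


def duplicate_labels(labels):
    """Return the labels that occur more than once, in sorted order."""
    counts = Counter(labels)
    return sorted(label for label, n in counts.items() if n > 1)


# Hypothetical labels for illustration only.
labels = ["node1-ec0", "node1-ec2", "node2-ec0", "node1-ec0"]
print(duplicate_labels(labels))  # prints ['node1-ec0']
```

An empty result means that no label is repeated.
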
Table A-3 lists the parameters used in the configuration file templates provided with IRIS FailSafe software options. Additional parameters can be defined as needed when failover of other applications is added.
Table A-3. Parameters in Template Configuration Files
Parameter Name | Possible Values | Description |
|---|---|---|
agent | pathname | The pathname of the agent for the highly available service. |
backup-node | label | nodename is a name returned by hostname. |
backup-server | string | The name of the Backup Server for this SQL Server. |
broadcast-addr | X.X.X.X | For node blocks: The broadcast address for the subnet. |
config-file | string | The INFORMIX configuration file for this node. Its value is the value of the ONCONFIG environment variable. |
controlled-failback | true | Controls whether this node automatically moves to normal state after a failure. If controlled-failback is set to true, the node doesn't move to normal state after failure; it moves to controlled failback state. If set to false or not set, the node moves to normal state. |
db-avail | high | If the value is high (the default value), a database server (INFORMIX, Oracle, or Sybase) failure forces a failover. If the value is low, a failure of the database server doesn't force a failover, but the failover is reported. |
db-probe-time | integer | Specifies the length of time (in seconds) between the completion of one probe of the database by the database agent and the beginning of the next probe. |
db-shutdown- | integer | The timeout for the Oracle shutdown script specified by shutdown-script. If the shutdown script doesn't complete in this many seconds, IRIS FailSafe performs an abort shutdown of the Oracle instance. |
db-retry-count | integer | The number of monitoring retries by the database agent before a failure is declared. |
db-timeout | integer | Defines (in seconds) the time the database agent waits for a response to its probe from the database instance. |
devname | pathname | The device filename for the XLV logical volume. |
devname-group | string | The group of the device name for the XLV logical volume (reported by ls -l). The default value is sys. |
devname-mode | fs_mode | The access mode of the device name for the XLV logical volume (reported by ls -l). The default value is 0600. |
devname-owner | string | The user ID of the owner of the device name for the XLV logical volume (reported by ls -l). The default value is root. |
export-info | export_mode | Filesystem export options (see the exports(4) reference page and the section “Wsync Filesystem Options” in Chapter 4 of the IRIS FailSafe Administrator's Guide). |
export-point | pathname | The pathname of an exported filesystem. |
filesystem | label | A filesystem to be exported. The value of this parameter must match the label of a filesystem block and the label of the block. |
fs-type | xfs | The filesystem type. Only xfs filesystems are supported. |
giveaway | pathname | The pathname of the giveaway script in /var/ha/actions. |
giveback | pathname | The pathname of the giveback script in /var/ha/actions. |
hb-lost-count | integer | Specifies how many heartbeat probe failures must occur to declare a heartbeat failure. The recommended value is 3. |
hb-private-ipname | string | This node's IP address for the private network used by heartbeat and control messages. |
hb-probe-time | integer | Heartbeat messages begin this many seconds after the node controller tells the application monitor to start monitoring. Also, this value specifies how long to wait (in seconds) after completion of the last heartbeat message to begin the next heartbeat message. The recommended value is 5. |
hb-public-ipname | hostname | This node's IP address for the public network that is used for heartbeat messages if the private network fails. This IP address is a fixed IP address. |
hb-timeout | integer | Specifies how long (in seconds) to wait for a heartbeat response before declaring a failure. The recommended value is 5. |
httpd-dir | pathname-port | The Netscape® server root location. |
httpd-options-file | string | The Netscape configuration file that starts multiple Netscape servers. The value is not a full pathname; it is a file in the directory /etc/config. The default value is ns_httpd.options, which is the configuration file for the Netscape Communications server. |
httpd-restart | false | If the two nodes have identical Netscape server configurations (a dual-active configuration with the same configuration information, log locations, and document root), then the Netscape server doesn't need to be restarted after failover (because an identical server is already running) and httpd-restart should be set to true. Otherwise, the Netscape server needs to be started on the backup node after failover and httpd-restart should be set to false. |
httpd-script | pathname | The full pathname of the script used to start and stop the Netscape server. The default value is /etc/init.d/ns_httpd, which is the script for the Netscape Communications server. |
instance-id | string | The value of $ORACLE_SID for the Oracle instance that IRIS FailSafe is monitoring. |
interface-probe- | integer | The length of time (in seconds) between the completion of one probe of the local interfaces and the beginning of the next probe. The value is rounded to the nearest five-second increment. |
interface-probe- | integer | The length of time (in seconds) that the interface agent waits after probing the local interfaces without a response before declaring a failure. |
ip-address | string | For node blocks: The fixed IP address of this network interface. It can be a name (string) or an address (X.X.X.X). |
ip-aliases | ( string ) | For interface-pair blocks: Specifies the list of IP addresses to be failed over using IP aliasing. |
kill | /usr/etc/ha_kill | The pathname of the ha_kill command. |
lmon-probe-time | integer | The probe interval (in seconds) for local monitoring of this highly available service. |
lmon-timeout | integer | Local monitoring of this highly available service times out in integer seconds if no response is received. |
local-monitor | pathname | The pathname of the local monitoring script for this highly available service. |
long-timeout | integer | The maximum time taken by the takeover, takeback, giveaway, and giveback scripts. If these scripts cannot be executed in this length of time, the value should be increased. This value is also used as the maximum time taken by several types of internal IRIS FailSafe communications. |
MAC-address | X:X:X:X:X:X | MAC addresses are required only if the network interfaces have to use MAC address impersonation. See the section “Network Interfaces and IP Addresses” in Chapter 1 of the IRIS FailSafe Administrator's Guide. |
mail-dest-addr | user@host | Mail is sent to this address when private network failure has been detected, the local node controller process appears to be hung or dead, the cluster is transitioning to degraded state, the cluster is transitioning to standby state, killing of a node fails, the ha_killd daemon has died, the ha_killd daemon could not be started, or the reset device monitor has failed. Do not set if mail is not configured on this node. |
master-db-fs | label | label is the filesystem label for the filesystem of the master database. |
master-db-vol | label | label is the volume label for the volume of the master database. |
mode | fs_mode | Filesystem mount options (see the fstab(4) reference page and the section “Wsync Filesystem Options” in Chapter 4 of the IRIS FailSafe Administrator's Guide). |
monitoring-level | 1 | For informix blocks: Defines which test is done to determine if INFORMIX is up. If the value is 1, the database agent executes the onstat command and searches the output for the pattern specified by running-indicator-strings. If there is a match, the database is up. If the value is 2, the database agent uses a database call to determine if INFORMIX is up. |
mount-point | pathname | The pathname of a filesystem mount point. Both nodes use the same mount point. |
name | interface | interface is a network interface. Each node has several network interfaces. For example, a CHALLENGE S node has the network interfaces ec0, ec2, and ec3. |
netmask | X.X.X.X | For node blocks: The netmask used to identify this node on the subnet. |
port-num | integer | The Netscape server port number. |
primary-interface | label | A name for an interface on which the IP aliases are configured in normal state, typically created by combining the hostname and interface name. The value must match an interface section label in a node block. |
pwrfail | true | This parameter does not apply to CHALLENGE S nodes. When set to true (the default), it allows the surviving node to attempt to go to degraded state after it detects a power failure on the other node (or the private network, public network, and serial connections are broken). If pwrfail is set to false or not set, the node goes to standby state after it detects a power failure on the other node. |
re-mac | true | If the IP address and the physical address of the primary interface are to be transferred to the backup interface when a failover occurs, set re-mac to true. Otherwise set it to false or leave it undefined. |
release-dir | pathname | For informix blocks: The INFORMIX release directory specified in the INFORMIX configuration. This value is the value of the environment variable INFORMIXDIR. |
remote-send-probe- | integer | The frequency (in seconds) of messages to the interface agent on the other node. This parameter must be less than the value of interface-probe-interval, but should be only slightly less. The value is rounded to the nearest five seconds. |
remote-send-timeout | integer | The length of time (in seconds) between the completion of one probe of the other node's highly available interfaces and the beginning of the next probe. This parameter must be less than the value of interface-probe-interval, but should be only slightly less. The value is rounded to the nearest five seconds. |
reset-host | hostname | Applies only to nodes running the Silicon Graphics Oracle Parallel Server product. If an IRISconsole is used to provide reset functionality, hostname is the hostname of the Indy running IRISconsole software. reset-host is ignored if reset-tty is set. Do not set reset-host if you are not running OPS in the IRIS FailSafe cluster. |
reset-tty | serial_devicename | The device filename of the serial port on this node that is used by the serial cable connected to the system controller port on the other node or to the remote power control unit. |
retry-count | integer | Determines the number of retries done by the local monitoring script. The application monitor declares a local monitor failure after lmon-timeout seconds independent of the retry-count value. |
running-indicator- | string | A string that is used as a search pattern in determining if On-Line is up. The output of the onstat command is searched. The string must begin and end with double-quotes. An OR symbol (|) can be used in the string to separate multiple search patterns. The string cannot contain any blanks and the search is case sensitive. |
sa-passwd | string | The unencrypted password of the database system administrator (sa-user). This parameter should be omitted if there is no password. |
sa-user | string | For informix blocks: The INFORMIX login name of the INFORMIX database system administrator. |
search-string | string | Specifies the string to be searched for in the output of the ps command to verify that the httpd process is running, for example ns-httpd. This parameter is required if the monitoring-level parameter in a webserver block has the value 1. |
secondary-interface | label | A name for the backup interface, an interface on the other node that replaces the primary interface on failover. It is typically created by concatenating the hostname and interface name. The value must match an interface section label in a node block. |
server-node | label | For application-class blocks: The primary node for instances of the highly available service. Server-node is listed twice if each of the nodes in a cluster serves as the primary node for some instances. The values assigned to server-node must match the labels for node blocks. |
short-timeout | integer | The maximum length of time (in seconds) for certain IRIS FailSafe internal communications tasks to complete. Do not change this value. |
shutdown-script | pathname | The pathname of a custom Oracle shutdown script that replaces the standard Oracle shutdown script. |
shutdown-options | integer | Defines how the database is shut down on a failover. The possible values are: 0 (normal shutdown), 1 (shutdown with nowait and no checkpointing), and 2 (shutdown with nowait with checkpointing). |
start-monitor-time | integer | For application-class blocks: This parameter specifies how long (in seconds) after a node releases this highly available service (giveaway) that the application monitor waits before starting monitoring of the highly available service. It is required. |
startup-script | pathname | The pathname of a custom Oracle startup script that replaces the standard Oracle startup script. |
statmon-dir | pathname | Specifies the pathname of a directory on a shared filesystem belonging to its server-node where NFS locking information is stored. The basename of this directory must be statmon. |
takeback | pathname | The pathname of the takeback script in /var/ha/actions. |
takeover | pathname | The pathname of the takeover script in /var/ha/actions. |
unix-user | login_id | The IRIX login name of the owner of the release directory for the database software. |
version-major | 1 | A version number that, along with version-minor, specifies the configuration file format used in this file. It is 1 for IRIS FailSafe 1.0 and IRIS FailSafe 1.1. |
version-minor | 0 | A version number that, along with version-major, specifies the configuration file format used in this file. It is 0 for IRIS FailSafe 1.0 configuration files (which can run on IRIS FailSafe 1.1 as well) and 1 for IRIS FailSafe 1.1 configuration files. |
volume-name | label | volname must match a volume block label. |
webserver-num | integer | The number of Netscape servers configured on this node (server-node). This is the number of web-config sections for this Netscape server. |
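
The rules given in Table A-3 for the running-indicator-strings value (surrounding double quotes, | as a separator between search patterns, no blanks, case-sensitive search) can be sketched in Python as follows. This is an illustration, not the database agent's implementation: the function names and the sample text are hypothetical, and plain substring matching is assumed because the table does not say whether the patterns are treated as regular expressions.

```python
def parse_indicator(value):
    """Split a quoted indicator value into its individual search patterns."""
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        raise ValueError("value must begin and end with double quotes")
    inner = value[1:-1]
    if " " in inner or "\t" in inner:
        raise ValueError("value cannot contain blanks")
    patterns = inner.split("|")
    if any(p == "" for p in patterns):
        raise ValueError("empty search pattern")
    return patterns


def database_is_up(onstat_output, value):
    """Return True if any pattern occurs in the output (case-sensitive)."""
    return any(p in onstat_output for p in parse_indicator(value))


# Made-up text standing in for onstat output; real output will differ.
sample_output = "Server status: On-Line, up 00:12:41"
print(database_is_up(sample_output, '"On-Line|Quiescent"'))
print(database_is_up(sample_output.lower(), '"On-Line|Quiescent"'))
```

The first call reports a match on On-Line. The second does not match, because lowercasing the text removes the exact-case pattern.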