Chapter 2. Configuration Planning

This chapter explains how to plan the configuration of highly available (HA) services on your IRIS FailSafe cluster.

Overview

You must first decide how you want to use the cluster. You can then configure the disks and interfaces to meet the needs of the highly available (HA) services you want the cluster to provide.

Questions to be Answered

Questions you must answer during the planning process are as follows:

  • How do you plan to use the nodes?

    Your answers might include uses such as offering home directories for users, running particular applications, supporting an Oracle database, providing Netscape Web service, and providing file service.

  • Which of these uses will be provided as an HA service?

    SGI has developed FailSafe software options for some HA applications; see “Software Layers” in Appendix A. To offer other applications as HA services, you must develop a set of application monitoring shell scripts as described in the IRIS FailSafe Version 2 Programmer's Guide. If you need assistance, contact SGI Professional Services, which offers custom FailSafe agent development and integration services.

  • Which node will be the primary node for each HA service?

    The primary node is the node that provides the service (exports the filesystem, is a Netscape server, provides the database, and so on).

  • For each HA service, how will the software and data be distributed on shared and non-shared disks?

    Each application has requirements and choices for placing its software on disks that are failed over (shared) or not failed over (non-shared).

  • Are the shared disks going to be part of a RAID storage system or are they going to be disks in SCSI/Fibre Channel disk storage that have plexed XLV logical volumes on them?

    Shared disks must be part of a RAID storage system or in SCSI/Fibre Channel disk storage with plexed XLV logical volumes on them.

  • How will shared disks be configured?

    • As raw XLV logical volumes?

    • As XLV logical volumes with XFS filesystems on them?

    • As CXFS filesystems, which use XVM logical volumes? For information on using FailSafe and CXFS, see “Coexecution of CXFS and IRIS FailSafe”.

    The choice of volumes or filesystems depends on the application that is going to use the disk space.

  • Which IP addresses will be used by clients of HA services?

    Multiple interfaces may be required on each node because a node could be connected to more than one network or because there could be more than one interface to a single network.

  • Which resources will be part of a resource group?

    All resources that are dependent on each other must be in the same resource group.

  • What will be the failover domain of the resource group? (For more information about failover domains, see “Failover Domain” in Chapter 1.)

    The failover domain determines the list of nodes in the cluster where the resource group can reside. For example, a volume resource that is part of a resource group can reside only in nodes from which the disks composing the volume can be accessed.

  • How many highly available (HA) IP addresses on each network interface will be available to clients of the HA services?

    At least one HA IP address must be available for each interface on each node that is used by clients of HA services.

  • Which HA IP addresses on nodes in the failover domain are going to be available to clients of the HA services?

  • For each HA IP address that is available on a node in the failover domain to clients of HA services, which interface on the other nodes will be assigned that IP address after a failover?

    Every HA IP address used by an HA service must be mapped to at least one interface in each node that can take over the resource group service. The HA IP addresses are failed over from the interface in the primary node of the resource group to the interface in the replacement node.
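The answers to these questions can be written down in a small planning record and checked for consistency before you configure anything. The following Python sketch is illustrative only and is not part of FailSafe: the ResourceGroup class, its field names, and the check_ha_ip_mapping function are hypothetical. It checks the last rule above, that every HA IP address is mapped to an interface on each node in the failover domain.

```python
from dataclasses import dataclass, field

@dataclass
class ResourceGroup:
    """Planning record for one resource group (hypothetical structure)."""
    name: str
    failover_domain: list  # ordered node names; the first is the primary node
    # HA IP address -> {node name -> interface that carries the address there}
    ha_ip_interfaces: dict = field(default_factory=dict)

def check_ha_ip_mapping(group):
    """Return the (ha_ip, node) pairs that have no interface assigned."""
    missing = []
    for ha_ip, per_node in group.ha_ip_interfaces.items():
        for node in group.failover_domain:
            if node not in per_node:
                missing.append((ha_ip, node))
    return missing

# The plan below deliberately omits the interface on the backup node.
plan = ResourceGroup(
    name="Group1",
    failover_domain=["xfs-ha1", "xfs-ha2"],
    ha_ip_interfaces={"192.26.50.1": {"xfs-ha1": "ef0"}},
)
print(check_ha_ip_mapping(plan))
```

An empty result means that every HA IP address in the plan has a destination interface on every node that can take over the resource group.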

Example

As an example of the configuration planning process, suppose that you have a two-node FailSafe cluster that is a departmental server. You want to make four XFS filesystems available for NFS mounting and have two Netscape FastTrack servers, each serving a different set of documents. These applications will be HA services.

You decide to distribute the services across two nodes, so each node will be the primary node for two filesystems and one Netscape server. The filesystems and the document roots for the Netscape servers (on XFS filesystems) are each on their own plexed XLV logical volume. The logical volumes are created from disks in a Fibre Channel storage system connected to both nodes.

There are four resource groups:

  • NFSgroup1

  • NFSgroup2

  • Webgroup1

  • Webgroup2

NFSgroup1 and NFSgroup2 are the NFS resource groups; Webgroup1 and Webgroup2 are the Web resource groups. NFSgroup1 and Webgroup1 will have one node as the primary node. NFSgroup2 and Webgroup2 will have the other node as the primary node.

Two network interfaces are available on each node, ef0 and ef1. The ef1 interfaces in each node are connected to each other to form a private network.

Figure 2-1 depicts this configuration.

Figure 2-1. Example Configuration with Four Resource Groups


Disk Configuration

Planning Disk Configuration

For each disk in a FailSafe cluster, you must choose whether to make it a shared disk, which enables it to be failed over, or a non-shared disk. Non-shared disks are not failed over.

The nodes in a FailSafe cluster must follow these requirements:

  • The system disk must be a non-shared disk.

  • The FailSafe software must be on a non-shared disk.

  • All system directories (/tmp, /var, /usr, /bin, /dev, ... ) should be on a non-shared disk.

Only HA application data and configuration data can be placed on a shared disk. Choosing to make a disk shared or non-shared depends on the needs of the HA services that use the disk. Each HA service has requirements about the location of data associated with the service:

  • Some data must be placed on non-shared disks

  • Some data must be placed on shared disks

  • Some data can be on shared or non-shared disks

The figures in the remainder of this section show the basic disk configurations on FailSafe clusters before and after failover. A cluster can contain a combination of the following basic disk configurations:

  • A non-shared disk on each node

  • Multiple shared disks containing Web server and NFS file server documents


Note: In each of the before and after failover diagrams, just one or two disks are shown. In fact, many disks could be connected in the same way as each disk shown. Thus each disk shown can represent a set of disks.

Figure 2-2 shows two nodes in a cluster, each of which has a non-shared disk, with two resource groups. When non-shared disks are used by HA applications, the data required by those applications must be duplicated on non-shared disks on both nodes. Clients should access the duplicated data using an HA IP alias. When a failover occurs, the HA IP aliases fail over.


Note: The hostname is bound to a different IP address that never moves.

The data that was originally available on the failed node is still available from the replacement node by using the HA IP alias to access it.

The configuration in Figure 2-2 contains two resource groups:

Resource Group   Resource Type   Resource
Group1           IP_address      192.26.50.1
Group2           IP_address      192.26.50.2


Figure 2-2. Non-Shared Disk Configuration and Failover

Figure 2-3 shows a two-node configuration with one resource group, Group1:

Resource Group   Resource Type   Resource      Failover Domain
Group1           IP_address      192.26.50.1   (xfs-ha1, xfs-ha2)
                 filesystem      /shared
                 volume          shared_vol

In this configuration, the resource group Group1 has a primary node, which is the node that accesses the disk prior to a failover. It is shown by a solid line connection. The backup node, which accesses the disk after a failover, is shown by a dotted line. Thus, the disk is shared between the nodes. In an active/backup configuration, all resource groups have the same primary node. The backup node does not run any HA resource groups until a failover occurs.

Figure 2-3. Shared Disk Configuration for Active/Backup Use

Figure 2-4 shows two shared disks in a two-node cluster with two resource groups, Group1 and Group2:

Resource Group   Resource Type   Resource      Failover Domain
Group1           IP_address      192.26.50.1   (xfs-ha1, xfs-ha2)
                 filesystem      /shared1
                 volume          shared1_vol
Group2           IP_address      192.26.50.2   (xfs-ha2, xfs-ha1)
                 filesystem      /shared2
                 volume          shared2_vol

In this configuration, each node serves as a primary node for one resource group. The solid line connections show the connection to the primary node prior to failover. The dotted lines show the connections to the backup nodes. After a failover, the surviving node has all the resource groups.

Figure 2-4. Shared Disk Configuration for Dual-Active Use
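The before and after placement in the dual-active configuration can be sketched in a few lines of Python. This is an illustration under a simplifying assumption, not a description of how FailSafe decides placement: it assumes that a resource group always runs on the first node in its failover domain that is up. The owner and placement functions are hypothetical names; the failover domains are the ones listed for Group1 and Group2.

```python
def owner(failover_domain, up_nodes):
    """Return the first node in the ordered failover domain that is up, or None."""
    for node in failover_domain:
        if node in up_nodes:
            return node
    return None

# Failover domains for the two resource groups in the dual-active example.
domains = {
    "Group1": ["xfs-ha1", "xfs-ha2"],
    "Group2": ["xfs-ha2", "xfs-ha1"],
}

def placement(up_nodes):
    """Map each resource group to the node that would run it."""
    return {group: owner(domain, up_nodes) for group, domain in domains.items()}

print(placement({"xfs-ha1", "xfs-ha2"}))  # both nodes up
print(placement({"xfs-ha2"}))             # xfs-ha1 has failed
```

With both nodes up, each group runs on its own primary node; with xfs-ha1 down, both groups are placed on xfs-ha2.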

Other sections in this chapter and similar sections in the IRIS FailSafe 2.0 Oracle Administrator's Guide and IRIS FailSafe 2.0 INFORMIX Administrator's Guide provide more specific information about choosing between shared and non-shared disks for various types of data associated with each HA service.

Configuration Parameters for Disks

There are no configuration parameters associated with non-shared disks. They are not specified when you configure a FailSafe system. Only the XLV logical volumes on shared disks are specified at configuration. For more information, see “Configuration Parameters for Logical Volumes”.

For information on using CXFS filesystems (which use XVM logical volumes) in a FailSafe configuration, see “Coexecution of CXFS and IRIS FailSafe”.

Logical Volume Configuration


Note: This section describes logical volume configuration using XLV logical volumes. For information on coexecution of FailSafe and CXFS filesystems (which use XVM logical volumes), see “Coexecution of CXFS and IRIS FailSafe”.

Planning XLV Logical Volumes

All shared disks must contain XLV logical volumes. You can work with XLV logical volumes on shared disks as you would work with other disks. However, you must follow these rules:

  • All data that is used by HA applications on shared disks must be stored in XLV logical volumes.

  • If you create more than one XLV volume on a single physical disk, all of those volumes must be owned by the same node. For example, if a disk has two partitions that are part of two XLV volumes, both XLV volumes must be part of the same resource group. (See “Create XLV Logical Volumes and XFS Filesystems” in Chapter 3 for more information about XLV volume ownership.)

  • Each disk in a Fibre Channel or SCSI Vault, and each RAID logical unit number (LUN), must be part of only one resource group. Therefore, you must divide the Fibre Channel or SCSI Vault disks and RAID LUNs into one set for each resource group. If you create multiple volumes on a Fibre Channel or SCSI Vault disk or RAID LUN, all those volumes must be part of one resource group.

  • Do not simultaneously access a shared XLV volume from more than one node. Doing so causes data corruption.

The FailSafe software relies on the XLV naming scheme to operate correctly. A fully qualified XLV volume name uses one of the following formats:

pathname/volname
pathname/nodename.volname

The components are these:

  • pathname is /dev/xlv

  • nodename by default is the same as the hostname of the node on which the volume was created

  • volname is a name specified when the volume was created; this component is commonly used when a volume is to be operated on by any of the XLV tools

For example, if volume vol1 is created on node ha1 using disk partitions located on a shared disk, the raw character device name for the assembled volume is /dev/rxlv/vol1. On the peer ha2, however, the same raw character volume appears as /dev/rxlv/ha1.vol1, where ha1 is the nodename component and vol1 is the volname component. As can be seen from this example, when the nodename component is the same as the local hostname, it does not appear as part of the device node name.
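The naming rule in this example can be expressed as a short function. The following Python sketch is only an illustration of the rule as described here; the function name raw_device_name is hypothetical, and the sketch does not query XLV.

```python
def raw_device_name(volname, owner_nodename, local_hostname):
    """Return the raw character device name of a volume as seen from a node.

    The nodename component is omitted when it matches the local hostname.
    """
    if owner_nodename == local_hostname:
        return "/dev/rxlv/" + volname
    return "/dev/rxlv/" + owner_nodename + "." + volname

print(raw_device_name("vol1", "ha1", "ha1"))  # as seen on ha1
print(raw_device_name("vol1", "ha1", "ha2"))  # as seen on the peer ha2
```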

One nodename is stored in each disk or LUN volume header. This is why all volumes with volume elements on any single disk must have the same nodename component.


Caution: If this rule is not followed, FailSafe does not operate correctly.

FailSafe modifies the nodename component of the volume header as volumes are transferred between nodes during failover and recovery operations. This is important because xlv_assemble assembles only those volumes whose nodename matches the local hostname. Some of the other XLV utilities allow you to see (and modify) all volumes, regardless of which node owns them.

The resource name for a resource of resource type volume is the XLV volume name.

If you use XLV logical volumes as raw volumes (that is, with no filesystem) for storing database data, the database system may require that the device names in /dev/xlv have specific owners, groups, and modes. See the documentation provided by the database vendor to determine if the XLV logical volume device names must have owners, groups, and modes that are different from the default values (the defaults are root, sys, and 0600, respectively).

Example Logical Volume Configuration

As an example of logical volume configuration, say that you have the following logical volumes on disks that we will call Disk1 through Disk5:

  • /dev/xlv/VolA (volume A) contains Disk1 and a portion of Disk2

  • /dev/xlv/VolB (volume B) contains the remainder of Disk2 and Disk3

  • /dev/xlv/VolC (volume C) contains Disk4 and Disk5

VolA and VolB must be part of the same resource group because they share a disk. VolC could be part of any resource group. Figure 2-5 illustrates this.

Figure 2-5. Example Logical Volume Configuration
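When there are many volumes, the sets of volumes that are forced into the same resource group can be worked out mechanically. The following Python sketch is a planning aid only; the function name volumes_sharing_disks and the input layout are hypothetical. It merges volumes that are linked, directly or through other volumes, by a common disk.

```python
def volumes_sharing_disks(volume_disks):
    """Group volumes that are linked, directly or indirectly, by a common disk.

    volume_disks maps a volume name to the disks it uses. The result is a
    sorted list of sorted volume-name lists; each inner list must be placed
    in a single resource group.
    """
    groups = []  # (set of volumes, set of disks); disk sets stay disjoint
    for vol, disks in volume_disks.items():
        vols, used = {vol}, set(disks)
        remaining = []
        for g_vols, g_disks in groups:
            if g_disks & used:
                # This existing group shares a disk with the new volume.
                vols |= g_vols
                used |= g_disks
            else:
                remaining.append((g_vols, g_disks))
        remaining.append((vols, used))
        groups = remaining
    return sorted(sorted(v) for v, _ in groups)

layout = {
    "VolA": ["Disk1", "Disk2"],
    "VolB": ["Disk2", "Disk3"],
    "VolC": ["Disk4", "Disk5"],
}
print(volumes_sharing_disks(layout))
```

For the example layout, VolA and VolB end up together because both use Disk2, and VolC stands alone.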


Configuration Parameters for Logical Volumes

The configuration parameters for XLV logical volumes are as follows:

  • Owner of the device file (default value: root)

  • Group of the device (default value: sys)

  • Mode of the device (default value: 0600)

Table 2-1 lists a label and parameters for individual logical volumes.

Table 2-1. XLV Logical Volume Configuration Parameters

Resource Attribute   VolA   VolB   VolC   Comments
devname-owner        root   root   root   Owner of the device name
devname-group        sys    sys    root   Group of the device name
devname-mode         0600   0600   0600   Mode of the device name

See the section “Create XLV Logical Volumes and XFS Filesystems” in Chapter 3 for information about creating XLV logical volumes.

Filesystem Configuration

This section describes filesystem configuration for FailSafe using XFS filesystems. For information on coexecution of FailSafe and CXFS filesystems, see “Coexecution of CXFS and IRIS FailSafe”.

Planning Filesystems

FailSafe supports the failover of XFS filesystems on shared disks. Shared disks must be either Fibre Channel or SCSI JBOD or in RAID storage systems that are shared between nodes in the FailSafe cluster. Fibre Channel and SCSI JBOD storage systems must use XLV mirroring.

The following are special issues that you must be aware of when you are working with filesystems on shared disks in a cluster:

  • All filesystems to be failed over must be XFS filesystems.

  • All filesystems to be failed over must be created on XLV logical volumes on shared disks.

  • For availability, filesystems to be failed over in a cluster must be created on either mirrored disks (using the XLV plexing software) or on the Fibre Channel RAID storage system.

  • Create the mount points for the filesystems on all nodes in the failover domain.

  • When you set up the various IRIS FailSafe filesystems on each node, make sure that each filesystem uses a different mount point.

  • Do not simultaneously mount filesystems on shared disks on more than one node. Doing so causes data corruption. Normally, FailSafe performs all mounts of filesystems on shared disks. If you manually mount a filesystem on a shared disk, make sure that it is not being used by another node.

  • Do not place filesystems on shared disks in the /etc/fstab file. FailSafe mounts these filesystems only after making sure that another node does not have these filesystems mounted.
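The last rule is easy to verify on each node. The following Python sketch is a hedged example, not a FailSafe tool: the function name shared_entries_in_fstab is hypothetical, and the sample text is an invented fstab fragment used only to exercise the check. It reports any shared mount point that appears as the second field of an fstab entry.

```python
def shared_entries_in_fstab(fstab_text, shared_mount_points):
    """Return shared mount points that appear in the given fstab text."""
    offenders = []
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        # The second field of an fstab entry is the mount point.
        if len(fields) >= 2 and fields[1] in shared_mount_points:
            offenders.append(fields[1])
    return offenders

# Invented sample content; on a real node, read /etc/fstab instead.
sample = """\
/dev/root / xfs rw 0 0
/dev/xlv/VolA /sharedA xfs rw,noauto 0 0
"""
print(shared_entries_in_fstab(sample, {"/sharedA", "/sharedB"}))
```

Any mount point in the result should be removed from /etc/fstab so that only FailSafe mounts it.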

The name of a resource of the filesystem resource type is the mount point of the filesystem.

When clients are actively writing to a FailSafe NFS filesystem during failover of filesystems, data corruption can occur unless filesystems are exported with the mode wsync. This mode requires that local mounts of the XFS filesystems use the wsync mount mode as well. Using wsync affects performance considerably.


Caution: Do not cross-mount filesystems using NFS in a FailSafe cluster (that is, do not mount a locally mounted filesystem on a different node using NFS). This configuration is not reliable and will not work with FailSafe. Instead, you should use the CXFS (clustered XFS) plug-in, which provides this functionality. For more information, see IRIS FailSafe Version 2 NFS Administrator's Guide.

Use of NFS over TCP is not recommended: if the client loses the TCP connection and does not reconnect, it can cause the client to hang on a failover. You should use UDP rather than TCP. Note that TCP may be the default for your NFS clients, requiring you to reconfigure them to use UDP. One method to accomplish this is to create the /etc/config/nfsd.options file with the content -p UDP, which will prevent the server from accepting TCP mount requests.


Example Filesystem Configuration

Continuing with the example configuration from the section “Example Logical Volume Configuration”, suppose you have the following XFS filesystems:

  • xfsA on VolA is mounted at /sharedA with modes rw and noauto

  • xfsB on VolB is mounted at /sharedB with modes rw, noauto, and wsync

  • xfsC on VolC is mounted at /sharedC with modes rw and noauto

Table 2-2 lists a label and configuration parameters for each filesystem.

Table 2-2. Filesystem Configuration Parameters

Attribute       /sharedA     /sharedB            /sharedC     Comments
monitor-level   2            2                   2            There are two levels of monitoring: 1 checks the /etc/mtab file; 2 checks if the filesystem is mounted using the stat(1M) command
volume-name     VolA         VolB                VolC         The label of the logical volume on which the filesystem was created
mode            rw, noauto   rw, noauto, wsync   rw, noauto   The modes of the filesystem (identical to the modes specified in /etc/fstab)

Figure 2-6 shows the following:

  • Resource group 1 has two XFS filesystems (xfsA and xfsB) and two XLV volumes (VolA and VolB)

  • Resource group 2 has one XFS filesystem (xfsC) and one XLV volume (VolC)

Figure 2-6. Filesystems and Logical Volumes

See “Create XLV Logical Volumes and XFS Filesystems” in Chapter 3 for information about creating XFS filesystems.

HA IP Address Configuration

Planning Network Interface and HA IP Address Configuration

Use the following guidelines when planning interface configuration for the private control network between nodes:

  • Each interface has one IP address.

  • The HA IP addresses used on each node for the interfaces to the private network are on a different subnet from the IP addresses used for public networks.

  • An IP name can be specified for each HA IP address in /etc/hosts.

  • A naming convention that identifies these HA IP addresses with the private network can be helpful. For example, precede the hostname with priv- (for private), as in priv-xfs-ha1 and priv-xfs-ha2.

Use the following guidelines when planning the interface configuration for one or more public networks:

  • If re-MACing is required, each interface to be failed over requires a dedicated backup interface on the other node (an interface that does not have an HA IP address). Thus, for each HA IP address on an interface that requires re-MACing, there should be one interface in each node in the failover domain dedicated for the interface.

  • Each interface has a primary IP address also known as the fixed address. The primary IP address does not fail over.

  • The hostname of a node cannot be an HA IP address.

  • All HA IP addresses used by clients to access HA services must be part of the resource group to which the HA service belongs.

  • If re-MACing is required, all of the HA IP addresses must have the same backup interface.

  • Making good choices for HA IP addresses is important; these are the “hostnames” that will be used by users of the HA services, not the true hostnames of the nodes.

  • Make a plan for publicizing the HA IP addresses to the user community, because users of HA services must use HA IP addresses instead of the output of the hostname command.

  • HA IP addresses should not be configured in the /etc/config/netif.options file. HA IP addresses also should not be defined in the /etc/config/ipaliases.options file.
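Some of these guidelines can be checked against an address plan before the addresses are configured. The following Python sketch uses the standard ipaddress module; the function name check_ha_addresses, the fixed addresses, and the private network 10.0.0.0/24 are hypothetical values chosen for illustration. It flags an HA IP address that duplicates a fixed (primary) address and, on the assumption that client-facing HA IP addresses belong on a public network, one that falls inside the private network.

```python
import ipaddress

def check_ha_addresses(fixed_addresses, ha_addresses, private_network):
    """Return a list of planning problems (empty when the plan passes)."""
    problems = []
    private = ipaddress.ip_network(private_network)
    fixed = {ipaddress.ip_address(a) for a in fixed_addresses.values()}
    for text in ha_addresses:
        addr = ipaddress.ip_address(text)
        if addr in fixed:
            # A fixed address does not fail over, so it cannot be an HA IP address.
            problems.append(f"{text} is already a fixed (primary) address")
        if addr in private:
            problems.append(f"{text} is on the private network {private}")
    return problems

# Hypothetical fixed addresses for the two nodes.
fixed = {"xfs-ha1": "192.26.50.11", "xfs-ha2": "192.26.50.12"}
print(check_ha_addresses(fixed, ["192.26.50.1", "192.26.50.2"], "10.0.0.0/24"))
```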

Use the following procedure to determine whether re-MACing is required (see the section “Network Interfaces and IP Addresses” in Chapter 1 for information about re-MACing). It requires the use of three nodes: node1, node2, and node3. node1 and node2 can be nodes of a FailSafe cluster, but they need not be. They must be on the same subnet. node3 is a third node. If you must verify that a router accepts gratuitous ARP packets (which means that re-MACing is not required), node3 must be on the other side of the router from node1 and node2.

  1. Configure an HA IP address on one of the interfaces of node1:

    # /usr/etc/ifconfig interface inet ip_address netmask netmask up

    interface is the interface to be used to access the node. ip_address is an IP address for node1; this IP address is used throughout this procedure. netmask is the netmask of the IP address.

  2. From node3, contact the HA IP address used in step 1 using the ping(1M) command:

    # ping -c 2 ip_address
    PING 190.0.2.1 (190.0.2.1): 56 data bytes
    64 bytes from 190.0.2.1: icmp_seq=0 ttl=255 time=29 ms
    64 bytes from 190.0.2.1: icmp_seq=1 ttl=255 time=1 ms
    
    ----190.0.2.1 PING Statistics----
    2 packets transmitted, 2 packets received, 0% packet loss
    round-trip min/avg/max = 1/15/29 ms

  3. Enter the following command on node1 to shut down the interface you configured in step 1:

    # /usr/etc/ifconfig interface down

  4. On node2, enter the following command to move the HA IP address to node2:

    # /usr/etc/ifconfig interface inet ip_address netmask netmask up

  5. On node3, contact the HA IP address:

    # ping -c 2 ip_address

    If the ping(1M) command fails, gratuitous ARP packets are not being accepted and re-MACing is needed to fail over the HA IP address.
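If you capture the output of the final ping in step 5, the decision can be scripted. The following Python sketch is an assumption-laden illustration: the function name remac_required is hypothetical, it relies on the statistics line having the format shown in step 2, and it treats the ping as failed only when no packets at all were received.

```python
import re

def remac_required(ping_output):
    """Return True if the captured ping output shows that no replies arrived."""
    match = re.search(r"(\d+) packets transmitted, (\d+) packets received",
                      ping_output)
    if match is None:
        raise ValueError("no ping statistics line found")
    received = int(match.group(2))
    return received == 0

ok = "2 packets transmitted, 2 packets received, 0% packet loss"
failed = "2 packets transmitted, 0 packets received, 100% packet loss"
print(remac_required(ok), remac_required(failed))
```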

Example HA IP Address Configuration

Table 2-3 shows the FailSafe configuration parameters you could specify for these HA IP addresses.

Table 2-3. HA IP Address Configuration Parameters

Resource Attribute   Resource Name: 192.26.50.1   Resource Name: 192.26.50.2
Network mask         0xffffff00                   0xffffff00
Broadcast address    192.26.50.255                192.26.50.255
Interface            ef0                          ef0
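The broadcast address in Table 2-3 follows from the HA IP address and the network mask. As a cross-check when you fill in your own values, the following Python sketch derives it with the standard ipaddress module; the function name broadcast_for is hypothetical.

```python
import ipaddress

def broadcast_for(ip, hex_netmask):
    """Derive the broadcast address from an IP address and a hexadecimal netmask."""
    # Convert a mask such as 0xffffff00 to dotted form (255.255.255.0).
    mask = str(ipaddress.ip_address(int(hex_netmask, 16)))
    # strict=False accepts a host address and reduces it to its network.
    network = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
    return str(network.broadcast_address)

print(broadcast_for("192.26.50.1", "0xffffff00"))
print(broadcast_for("192.26.50.2", "0xffffff00"))
```

Both addresses in the table yield 192.26.50.255, which matches the Broadcast address row.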


Local Failover of HA IP Addresses

You can configure your system so that an HA IP address will fail over to a second interface within the same node, for example from ef0 to ef1. A configuration example that shows the steps you must follow for this configuration is provided in “Example: Local Failover of HA IP Address” in Chapter 6.

Coexecution of CXFS and IRIS FailSafe

CXFS, the clustered XFS filesystem, allows groups of computers to coherently share large amounts of data while maintaining high performance. Users can use FailSafe in a CXFS cluster to provide highly available services (such as NFS or Web) running on a CXFS filesystem. This combination provides high-performance shared data access for highly available applications.

CXFS 6.5.10 or later and IRIS FailSafe 2.1 or later (plus relevant patches) may be installed and run on the same system, which is known as coexecution. This allows you to have application-level high availability with a clustered filesystem.

A subset of nodes in a coexecution cluster can be configured to be used as FailSafe nodes; a coexecution cluster can have up to eight nodes that run FailSafe.

See also “Communication Paths in a Coexecution Cluster” in Appendix A.

Size of the Coexecution Cluster

All nodes in a CXFS cluster will run CXFS, and up to eight of those nodes can also run FailSafe. Even when you are running CXFS and FailSafe, there is still only one pool, one cluster, and one cluster configuration.

It is recommended that a production cluster be configured with a minimum of three weighted nodes (CXFS weight) and a maximum of 16 nodes. (A cluster with reset cables and only two weighted nodes is supported, but there are inherent issues with this configuration; see the CXFS Version 2 Software Installation and Administration Guide.)
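These size limits can be checked against a planned node list. The following Python sketch is illustrative only: the function name check_cluster_size and the input layout are hypothetical, and it assumes that a "weighted node" is one whose CXFS weight is greater than zero.

```python
MAX_FAILSAFE_NODES = 8   # nodes that can also run FailSafe
MAX_CLUSTER_NODES = 16   # recommended maximum for a production cluster
MIN_WEIGHTED_NODES = 3   # recommended minimum of weighted nodes

def check_cluster_size(nodes):
    """nodes maps a node name to {'failsafe': bool, 'cxfs_weight': int}."""
    problems = []
    failsafe = [n for n, v in nodes.items() if v["failsafe"]]
    # Assumption: a weighted node has a CXFS weight greater than zero.
    weighted = [n for n, v in nodes.items() if v["cxfs_weight"] > 0]
    if len(failsafe) > MAX_FAILSAFE_NODES:
        problems.append(f"{len(failsafe)} FailSafe nodes; the limit is {MAX_FAILSAFE_NODES}")
    if len(nodes) > MAX_CLUSTER_NODES:
        problems.append(f"{len(nodes)} nodes; the recommended maximum is {MAX_CLUSTER_NODES}")
    if len(weighted) < MIN_WEIGHTED_NODES:
        problems.append(f"{len(weighted)} weighted nodes; the recommended minimum is {MIN_WEIGHTED_NODES}")
    return problems

planned = {
    "n1": {"failsafe": True, "cxfs_weight": 1},
    "n2": {"failsafe": True, "cxfs_weight": 1},
    "n3": {"failsafe": False, "cxfs_weight": 1},
    "n4": {"failsafe": False, "cxfs_weight": 0},
}
print(check_cluster_size(planned))
```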

Cluster Type

The cluster can be one of three types:

  • FailSafe. In this case, all nodes will also be of type FailSafe.

  • CXFS. In this case, all nodes will be of type CXFS.

  • CXFS and FailSafe (coexecution). In this case, the nodes will be a mix of type CXFS and type CXFS and FailSafe; FailSafe provides application-level high availability and CXFS provides clustered filesystems.


    Note: Although it is possible to configure a coexecution cluster with nodes of type FailSafe only, SGI does not support this configuration.


Node Types for CXFS Metadata Servers

All potential metadata server nodes must be of one of the following types:

  • CXFS

  • CXFS and FailSafe

CXFS Metadata Servers and Failover Domain

The metadata server list must exactly match the failover domain list (the names and the order of names).
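Because both the names and their order must match, a simple list comparison is enough to validate a plan. The following Python sketch is a planning aid with a hypothetical function name, first_mismatch; it reports the position of the first difference so that you know where the two lists diverge.

```python
def first_mismatch(metadata_servers, failover_domain):
    """Return the index of the first differing entry, or None if the lists match."""
    for index, (mds, node) in enumerate(zip(metadata_servers, failover_domain)):
        if mds != node:
            return index
    if len(metadata_servers) != len(failover_domain):
        # One list is a prefix of the other; they diverge where the shorter ends.
        return min(len(metadata_servers), len(failover_domain))
    return None

print(first_mismatch(["n1", "n2"], ["n1", "n2"]))
print(first_mismatch(["n1", "n2"], ["n2", "n1"]))
```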

CXFS Resource Type for FailSafe

FailSafe provides the CXFS resource type, which can be used to fail over applications that use CXFS filesystems. CXFS resources must be added to the resource group that contains the resources that depend on a CXFS filesystem. The name of the CXFS resource is the CXFS filesystem mount point.

The CXFS resource type has the following characteristics:

  • It does not start any resource that depends on the CXFS filesystem until the CXFS filesystem is mounted on the local node.

  • The start and stop action scripts for the CXFS resource type do not mount and unmount CXFS filesystems, respectively. (The start script waits for the CXFS filesystem to become available; the stop script does nothing but its existence is required by FailSafe.) Users should use the CXFS GUI or cmgr(1M) command to mount and unmount CXFS filesystems.

  • It monitors the CXFS filesystem for failures.

  • Optionally, for applications that must run on a CXFS metadata server, the CXFS resource type relocates the CXFS metadata server when there is an application failover. In this case, the application failover domain (AFD) for the resource group should consist of the CXFS metadata server and the metadata server backup nodes.

The CXFS filesystems that an NFS server exports should be mounted on all nodes in the failover domain using the CXFS GUI or the cmgr(1M) command.

For example, following are the commands used to create resources NFS, CXFS and statd_unlimited based on a CXFS filesystem mounted on /FC/lun0_s6. (This example assumes that you have defined a cluster named test-cluster and that you have already created a failover policy named cxfs-fp and a resource group named cxfs-group based on this policy.)

cmgr> define resource /FC/lun0_s6 of resource_type CXFS in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"

Type specific attributes to create with set command:

Type Specific Attributes - 1: relocate-mds


No resource type dependencies to add

resource /FC/lun0_s6 ? set relocate-mds to false
resource /FC/lun0_s6 ? done


============================================

cmgr> define resource /FC/lun0_s6 of resource_type NFS in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"

Type specific attributes to create with set command:

Type Specific Attributes - 1: export-info
Type Specific Attributes - 2: filesystem


No resource type dependencies to add

resource /FC/lun0_s6 ? set export-info to rw
resource /FC/lun0_s6 ? set filesystem to /FC/lun0_s6
resource /FC/lun0_s6 ? done


============================================
cmgr> define resource /FC/lun0_s6/statmon of resource_type statd_unlimited in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"

Type specific attributes to create with set command:

Type Specific Attributes - 1: ExportPoint


Resource type dependencies to add:

Resource Dependency Type - 1: NFS


resource /FC/lun0_s6/statmon ? set ExportPoint to /FC/lun0_s6
resource /FC/lun0_s6/statmon ? add dependency /FC/lun0_s6 of type NFS
resource /FC/lun0_s6/statmon ? done

==============================================
cmgr> define resource_group cxfs-group in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"

resource_group cxfs-group ? set failover_policy to cxfs-fp
resource_group cxfs-group ? add resource /FC/lun0_s6 of resource_type NFS
resource_group cxfs-group ? add resource /FC/lun0_s6 of resource_type CXFS
resource_group cxfs-group ? add resource /FC/lun0_s6/statmon of resource_type statd_unlimited
resource_group cxfs-group ? done

Separate CXFS and FailSafe GUIs

There is one cmgr(1M) command but separate graphical user interfaces (GUIs) for CXFS and for FailSafe. You must manage CXFS configuration with the CXFS GUI and FailSafe configuration with the FailSafe GUI; you can manage both with cmgr.

Conversion Between CXFS and FailSafe

Using the CXFS GUI or cmgr(1M), you can convert an existing FailSafe cluster and nodes to type CXFS or to type CXFS and FailSafe. You can perform a parallel action using the FailSafe GUI. A converted node can be used by FailSafe to provide application-level high availability and by CXFS to provide clustered filesystems.

However:

  • You cannot change the type of a node if the respective high availability (HA) or CXFS services are active. You must first stop the services for the node.

  • The cluster must support all of the functionalities (FailSafe and/or CXFS) that are turned on for its nodes; that is, if your cluster is of type CXFS, then you cannot modify a node that is already part of the cluster so that it is of type FailSafe. However, the nodes do not have to support all the functionalities of the cluster; that is, you can have a CXFS node in a CXFS and FailSafe cluster.
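The second rule amounts to a subset test: the functionalities turned on for a node must all be supported by the cluster, but the reverse is not required. The following Python sketch illustrates that reading; the FEATURES table and the function name node_type_allowed are hypothetical, and the sketch does not model whether services are active. Note that a combination the check allows may still be unsupported by SGI.

```python
# Functionalities implied by each type name used in this chapter.
FEATURES = {
    "FailSafe": {"FailSafe"},
    "CXFS": {"CXFS"},
    "CXFS and FailSafe": {"CXFS", "FailSafe"},
}

def node_type_allowed(cluster_type, node_type):
    """Return True if every functionality of the node is supported by the cluster."""
    return FEATURES[node_type] <= FEATURES[cluster_type]

print(node_type_allowed("CXFS", "FailSafe"))           # node needs FailSafe, cluster lacks it
print(node_type_allowed("CXFS and FailSafe", "CXFS"))  # node uses a subset of the cluster
```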

Network Interfaces

For FailSafe, you must have at least two network interfaces. However, CXFS uses only one interface for both heartbeat and control messages.

When using FailSafe and CXFS on the same node, only the priority 1 network will be used for CXFS and it must be set to allow both heartbeat and control messages.


Note: CXFS will not fail over to the second network. If the priority 1 network fails, CXFS will fail but FailSafe services may move to the second network if the node is CXFS and FailSafe.

If CXFS resets the node due to the loss of the priority 1 network, it will cause FailSafe to remove the node from the FailSafe membership; this in turn will cause resource groups to fail over to other FailSafe nodes in the cluster.