This guide describes the configuration and administration of a FailSafe highly available system.
This guide was prepared in conjunction with IRIS FailSafe 2.1.5, which supports IRIX 6.5.20 and later.
This guide is written for the person who administers the FailSafe system. The FailSafe administrator must be familiar with the operation of Origin servers, as well as optional Origin Vault, Fibre Channel RAID, JBOD, SGI TP9100, or SGI TP9400 storage systems, whichever is used in the FailSafe configuration. Good knowledge of XLV and XFS is also required.
To use Performance Co-Pilot (PCP) for FailSafe, you must have the following licenses:
Two or more PCP Collector licenses (PCPCOL), one for each node in the FailSafe cluster from which you want to collect performance metrics.
One PCP Monitor license (PCPMON) for the workstation that is to run the visualization tools.
FailSafe configuration and administration information is presented in the following chapters and appendices:
Chapter 1, “Overview”, introduces the components of the FailSafe system and explains its hardware and software architecture.
Chapter 2, “Configuration Planning”, describes how to plan the configuration of a FailSafe cluster.
Chapter 3, “Installation and System Preparation”, describes several procedures that must be performed on nodes in a cluster to prepare them for FailSafe.
Chapter 4, “Administration Tools”, provides an overview of the FailSafe Manager GUI and the cmgr command.
Chapter 5, “Configuration”, explains how to configure a FailSafe system.
Chapter 6, “Configuration Examples”, shows an example of a FailSafe three-node configuration and some variations on that configuration.
Chapter 7, “FailSafe System Operation”, explains how to operate and monitor a FailSafe system.
Chapter 8, “Testing the Configuration”, describes how to test the configured FailSafe system.
Chapter 9, “System Recovery and Troubleshooting”, describes the log files used by FailSafe and recovery procedures.
Chapter 10, “Upgrading and Maintaining Active Clusters”, describes some procedures you may need to perform without shutting down a FailSafe cluster.
Chapter 11, “Performance Co-Pilot for FailSafe”, tells you how to use PCP to monitor the availability of a FailSafe cluster.
Appendix C, “Updating from IRIS FailSafe 1.2 to IRIS FailSafe 2.1.x”, describes the upgrade procedure.
Appendix D, “FailSafe Software”, summarizes the systems to install on each component of a cluster.
Appendix E, “Metrics Exported by PCP for FailSafe”, lists the metrics implemented by pmdafsafe.
The following documentation will be useful in a FailSafe environment:
IRIS FailSafe Version 2 Programmer's Guide
Performance Co-Pilot for IRIX Advanced User's and Administrator's Guide
CXFS Version 2 Software Installation and Administration Guide
IRIS FailSafe 2.0 DMF Administrator's Guide
IRIS FailSafe 2.0 INFORMIX Administrator's Guide
IRIS FailSafe 2.0 Netscape Server Administrator's Guide
IRIX FailSafe NFS Administrator's Guide
IRIS FailSafe 2.0 Oracle Administrator's Guide
IRIS FailSafe Version 2 Samba Administrator's Guide
IRIS FailSafe Version 2 TMF Administrator's Guide
Embedded Support Partner User Guide
The FailSafe man pages are as follows:
cdbBackup(1M)
cdbRestore(1M)
cmgr(1M)
crsd(1M)
failsafe(7M)
fs2d(1M)
ha_cilog(1M)
ha_cmsd(1M)
ha_exec2(1M)
ha_fsd(1M)
ha_gcd(1M)
ha_ifd(1M)
ha_ifdadmin(1M)
ha_macconfig2(1M)
ha_srmd(1M)
ha_statd2(1M)
haStatus(1M)
Release notes are included with each FailSafe product. The names of the release notes are as follows:
| Release Note | Product |
|---|---|
| cluster_admin | Cluster administration services |
| cluster_control | Node control services |
| cluster_services | Cluster services |
| failsafe2 | IRIS FailSafe 2.1.x |
| failsafe2_dmf | IRIS FailSafe for DMF |
| failsafe2_informix | IRIS FailSafe for INFORMIX |
| failsafe2_nfs | IRIS FailSafe for NFS |
| failsafe2_oracle | IRIS FailSafe for Oracle |
| failsafe2_samba | IRIS FailSafe for Samba |
| failsafe2_tmf | IRIS FailSafe for TMF |
You can obtain SGI documentation in the following ways:
See the SGI Technical Publications Library at http://docs.sgi.com. Various formats are available. This library contains the most recent and most comprehensive set of online books, release notes, man pages, and other information.
If it is installed on your SGI system, you can use InfoSearch, an online tool that provides a more limited set of online books, release notes, and man pages. With an IRIX system, select Help from the Toolchest, and then select InfoSearch. Or you can type infosearch on a command line.
You can also view release notes by typing either grelnotes or relnotes on a command line.
You can also view man pages by typing man title on a command line.
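As a small illustration of the man title form, the following Python sketch splits a man page reference such as cmgr(1M) into its title and section, and builds the corresponding command line. The function names parse_man_ref and man_command are hypothetical helpers written for this example; they are not part of FailSafe or IRIX.

```python
import re

# Matches references of the form title(section), for example cmgr(1M).
# The accepted character set is an assumption made for this sketch.
MAN_REF = re.compile(r"(?P<title>[A-Za-z0-9_.\-]+)\((?P<section>[0-9][A-Za-z]*)\)")


def parse_man_ref(ref):
    """Split a reference such as 'cmgr(1M)' into ('cmgr', '1M')."""
    match = MAN_REF.fullmatch(ref.strip())
    if match is None:
        raise ValueError(f"not a man page reference: {ref!r}")
    return match.group("title"), match.group("section")


def man_command(ref):
    """Build the 'man title' command line for a reference."""
    title, _section = parse_man_ref(ref)
    return f"man {title}"


if __name__ == "__main__":
    for ref in ("cmgr(1M)", "failsafe(7M)", "haStatus(1M)"):
        print(ref, "->", man_command(ref))
```

The section identifier is parsed but not passed to man, because the text above gives only the man title form.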
The following conventions are used throughout this document:
| Convention | Meaning |
|---|---|
| command | This fixed-space font denotes literal items such as commands, files, routines, path names, signals, messages, and programming language structures. |
| manpage(x) | Man page section identifiers appear in parentheses after man page names. (1) indicates a user command; (1M) and (8) indicate an administrator command. |
| variable | Italic typeface denotes variable entries and words or concepts being defined. |
| GUI | This font denotes the names of graphical user interface (GUI) elements such as windows, screens, dialog boxes, menus, toolbars, icons, buttons, boxes, fields, and lists. |
| user input | This bold, fixed-space font denotes literal items that the user enters in interactive sessions. (Output is shown in nonbold, fixed-space font.) |
| [ ] | Brackets enclose optional portions of a command or directive line. |
| ... | Ellipses indicate that a preceding element can be repeated. |
If you have comments about the technical accuracy, content, or organization of this publication, contact SGI. Be sure to include the title and document number of the publication with your comments. (Online, the document number is located in the front matter of the publication. In printed publications, the document number is located at the bottom of each page.)
You can contact SGI in any of the following ways:
Send e-mail to the following address:
techpubs@sgi.com
Use the Feedback option on the Technical Publications Library Web page:
http://docs.sgi.com
Contact your customer service representative and ask that an incident be filed in the SGI incident tracking system.
Send mail to the following address:
Technical Publications
SGI
1600 Amphitheatre Parkway, M/S 535
Mountain View, California 94043-1351
Send a fax to the attention of “Technical Publications” at +1 650 932 0801.
SGI values your comments and will respond to them promptly.