Chapter 5. System Maintenance, Monitoring, and Debugging

This chapter describes system maintenance, monitoring, and debugging. It covers the following topics:

Maintenance Procedures

This section describes some common maintenance procedures, as follows:

Temporarily Take a Node Offline for Maintenance

This section describes how to temporarily take a node offline for maintenance.

Procedure 5-1. Temporarily Take a Node Offline for Maintenance

    To temporarily take a node offline for maintenance, perform the following steps:

    1. Disable the node in the batch scheduler (depends on your batch scheduler).

    2. Power off the node, as follows:

      # cpower --down r1i0n0

    3. Mark the node offline, as follows:

      # cadmin --set-admin-status --node r1i0n0 offline

    4. Perform the required maintenance on the blade.

    5. Mark the node online, as follows:

      # cadmin --set-admin-status --node r1i0n0 online

    6. Power up the node, as follows:

      # cpower --boot r1i0n0

    7. Enable the node in the batch scheduler (depends on your batch scheduler).
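
The power and status steps above (2, 3, 5, and 6) can be collected into a small shell sketch. It uses only the cpower and cadmin invocations shown in this procedure. The function names, the DRY_RUN variable, and the choice to default to dry-run mode are assumptions made for illustration; the batch scheduler steps are omitted because they depend on your scheduler.

```shell
#!/bin/sh
# Sketch only: wraps the cpower/cadmin commands from Procedure 5-1.
# DRY_RUN, run, node_offline, and node_online are illustrative names,
# not part of SGI Tempo.

DRY_RUN=${DRY_RUN:-1}   # default: print commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Steps 2 and 3: power off the node, then mark it offline.
node_offline() {
    run cpower --down "$1" &&
    run cadmin --set-admin-status --node "$1" offline
}

# Steps 5 and 6: mark the node online, then power it up.
node_online() {
    run cadmin --set-admin-status --node "$1" online &&
    run cpower --boot "$1"
}

# Demonstration in dry-run mode with the node name used in this chapter.
node_offline r1i0n0
node_online r1i0n0
```

Setting DRY_RUN=0 on the admin node would execute the commands instead of printing them. Review the printed sequence first, and perform the maintenance itself between the two calls.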

    Permanently Replace a Failed Blade


    Note: See your SGI field support person for the physical removal and replacement of SGI Altix ICE compute nodes (blades).


    This section describes how to permanently replace a failed blade.

    Procedure 5-2. Permanently Replace a Failed Blade

      To permanently replace a failed blade (compute node), perform the following steps:

      1. Disable the node in the batch scheduler (depends on your batch scheduler).

      2. Power off the node, as follows:

        # cpower --down r1i0n0

      3. Mark the node offline, as follows:

        # cadmin --set-admin-status --node r1i0n0 offline

      4. Physically remove and replace the failed blade.

      5. In the Tempo 1.3 release, it is not necessary to run discover-rack when a blade is replaced. This is handled by the blademond daemon. See “Discovering Compute Nodes” in Chapter 2 for more information.

      6. Set the node to boot your desired compute image (see cimage --list-images and “cimage Command” in Chapter 3 for your options), as follows:

        # cimage --set mycomputeimage mykernel r1i0n0

      7. Power up the node, as follows:

        # cpower --boot r1i0n0

      8. Enable the node in the batch scheduler (depends on your batch scheduler).

      Permanently Remove a Blade

      This section describes how to permanently remove a blade from your Altix ICE system.

      Procedure 5-3. Permanently Remove a Blade

        To permanently remove a blade from your system, perform the following steps:

        1. Disable the node in the batch scheduler (depends on your batch scheduler).

        2. Power off the node, as follows:

          # cpower --down r1i0n0

        3. Mark the node offline, as follows:

          # cadmin --set-admin-status --node r1i0n0 offline

        4. Physically remove the failed blade.

        5. In the Tempo 1.3 release, it is not necessary to run discover-rack when a blade is removed. This is handled by the blademond daemon. See “Discovering Compute Nodes” in Chapter 2 for more information.

        Add a New Blade

        This section describes how to add a new blade to an Altix ICE system.

        Procedure 5-4. Add a New Blade

          To add a new blade to your system, perform the following steps:

          1. Physically insert the new blade.

          2. In the Tempo 1.3 release, it is not necessary to run discover-rack when a blade is added. This is handled by the blademond daemon. See “Discovering Compute Nodes” in Chapter 2 for more information.

          3. Set the node to boot your desired compute image (see cimage --list-images and “cimage Command” in Chapter 3 for your options), as follows:

            # cimage --set mycomputeimage mykernel r1i0n0

          4. Power up the node, as follows:

            # cpower --boot r1i0n0

          5. Enable the node in the batch scheduler (depends on your batch scheduler).

          Node Replacement Procedure for Admin, Leader, and Service Nodes

          This section describes how to install and configure a spare admin, leader, or managed service node. It covers the following topics:


          Note: When ordering shelf spare systems from SGI, it is important to order spare nodes appropriate to or in conjunction with your SGI Altix ICE system. This is because the Altix ICE serial number is programmed into the admin node itself. If you try to migrate to a shelf spare system that does not have the correct Altix ICE system serial number programmed into it, parts of Tempo software may not work correctly. In particular, the Embedded Support Partner (ESP) software will fail to start if the system serial number does not match the number that was previously in use.


          Shelf Spare Admin or Leader Node Availability

          A shelf spare node is like an existing admin or leader node, but it sits on a shelf to be used in an emergency.

          If the admin or leader node fails, the shelf spare can be swapped into position to take over the duties of the failed node.

          If you wish to make use of shelf spare nodes, SGI suggests that you have both an admin node and a leader node on the shelf as available spares. Some of the reasons to have two separate nodes instead of one are (not an exhaustive list), as follows:

          • The BIOS settings of an admin node and a leader node are different. For example, an admin node does not PXE boot by default, whereas a leader node must PXE boot on every boot. This means the boot order is different for each type.

          • The BMC of a leader node is set up to use DHCP by default. An admin node may not be set up this way.

          • Given the examples cited above, if you try to use a shelf-spare admin node as a leader, the leader will not be properly discovered.

          Shelf Spare Hardware Limitations

          Currently, the hardware replacement procedure described in this section only supports Altix ice-csn nodes, that is, admin controller and rack leader controller nodes and managed service nodes.

          Tools Required

          You will need a Video Graphics Array (VGA) screen and a keyboard to perform this procedure. This is because you need to interact with the LSI BIOS tool to import the root volumes. You cannot do this from an Intelligent Platform Management Interface (IPMI) serial console session because of the following:

          • For leader nodes, the cluster does not know the MAC addresses of the replacement BMC so there is no way for the cluster to connect to it until the migration script is run.

          • The LSI BIOS tool requires the use of Alt key combinations, which often do not transfer through the serial console properly.

          Migrating to a Shelf Spare: Installing the Hardware

          If you find that an admin node or leader node has failed and you need to replace it with a shelf spare system, this section describes what to do in terms of the physical hardware.

          The admin node is the only node type that stores the system-wide serial number. Therefore, if you use a shelf spare leader node as an admin node, ESP will fail to start properly because of the system serial number mismatch, and much of the logging and monitoring infrastructure will fail to function. Admin node shelf spares must be ordered from the factory as admin node shelf spares so that the proper serial number can be stored within them.

          Procedure 5-5. Migrating to a Shelf Spare: Installing the Hardware

            To replace an admin node or leader node that has failed, perform the following steps:

            1. Power down the failed node (if possible).

            2. Disconnect the failed node from AC power.

            3. Remove the two system disks from the failed node and set them aside for later reinstallation.

            4. Remove the Ethernet cables. Label the cables to avoid confusing them. It is important that they stay in the same jacks in the new node.

            5. Remove the system from the rack.

            6. Install the shelf spare system into the rack.

            7. Install the system disks you set aside in step 3 (from the system you are replacing).

            8. Connect the Ethernet cables in the same way they were connected to the replaced node.

            9. Connect AC power.

            10. Connect a keyboard and VGA monitor (and mouse if you like).

            11. Do NOT power up the system just yet. Proceed to “Migrating to a Shelf Spare: Importing the Disk Volumes”.

            Migrating to a Shelf Spare: Importing the Disk Volumes

            This section describes how to import the disk volumes into the new node installed in “Migrating to a Shelf Spare: Installing the Hardware”.

            Procedure 5-6. Migrating to a Shelf Spare: Importing the Disk Volumes

              To import the disk volumes into the new node, perform the following steps:

              1. Power up the system using the power button.

              2. Watch the VGA screen output.

              3. When you see the LSI BIOS tool come up, enter Ctrl-C. This instructs the LSI BIOS tool to enter the configuration utility.

              4. A screen appears listing the LSI controllers in the system. Normally, there is just one. Hit the Enter key to proceed.

              5. Choose RAID Properties.

              6. Note that the controller supports only two RAID volumes at a time. Therefore, if the system previously had two volumes, one or more volumes may now appear empty. Use the utility to delete these empty volumes, which represent disks that are no longer installed, before proceeding. Otherwise, if the tool sees more than one volume, activating volumes will not work.

              7. Enter Alt-N to browse the list of volumes. Delete the empty ones as described in the previous step. Eventually, you will encounter an inactive volume. This inactive volume represents the disks you migrated from the failed node to this node.

              8. With the inactive volume selected, choose Manage Array.

              9. Choose Activate, answer y to confirm the activation, and exit this menu.

              10. At this point, especially if the node has more than one volume, it is important to select the migrated system disk volume as the boot volume. To select the boot volume, choose SAS Topology.

              11. In SAS Topology, you can expand the volumes to see the disks within them if you choose by hitting Enter on volumes.

              12. Choose the volume that represents your newly imported volume. Highlight it, then enter Alt-B.

              13. You should see that the volume now has a Boot flag associated with it.


                Note: If, after you exit the tool, the system does not appear to boot from the disk, you may have selected the wrong volume from which to boot. In that case, reset the system, re-enter the LSI BIOS tool, and choose a different volume to be the boot volume.


              14. Escape out of the LSI tool and exit.

              15. Keep watching the VGA screen! You will have to hit a key at the correct moment in the next section. Go to “Migrating to a Shelf Spare: Booting for the First Time on the Migrated Node”.

              Migrating to a Shelf Spare: Booting for the First Time on the Migrated Node

              This section provides details on booting the system for the first time on the replacement node. These instructions include some special steps with the GRUB boot loader to ensure, for this boot only, that the console output goes to the VGA screen. This is important because on leader nodes, there is no way to connect to the BMC with IPMI at this moment to use the IPMI serial console. The console command will not work for the leader node until the system is configured as described in this section. The network will not be properly configured until the end of this procedure either.

              Procedure 5-7. Migrating to a Shelf Spare: Booting for the First Time on the Migrated Node

                To boot for the first time on a migrated node, perform the following steps:

                1. At this moment, the node is in the process of resetting because you exited the LSI BIOS tool at the end of the procedure, above (see “Migrating to a Shelf Spare: Importing the Disk Volumes”).

                  On leader nodes, the node will attempt to PXE boot as it comes up. This is normal. The PXE boot will fail and this is normal. On the admin node, no PXE will be attempted. In either case, the system will eventually try to boot from disk.


                  Note: If it is not booting from disk, the wrong volume may be selected as the boot disk in the LSI BIOS tool. See “Migrating to a Shelf Spare: Importing the Disk Volumes”.


                  When you see the GRUB boot menu come up, the first boot option is highlighted by default. This should NOT be the choice starting with Failsafe. As an example, in SGI Tempo 1.4, the highlighted choice should be SUSE Linux Enterprise Server 10 SP2. Enter e to edit the boot parameters for this boot only.

                2. Arrow down once so that the line starting kernel is highlighted.

                3. Enter e to edit the kernel parameters.

                4. Now you need to add console=tty0 as the final parameter in the list. This ensures that console output goes to the VGA screen for this boot. Enter the space character followed by console=tty0. The line should look similar to the following after adding the console parameter (characters wrapped in the front):

                  <hkernel=128M@16M rootflags=prjquota,logbsize=256k console=tty0 

                5. Press the Enter key.

                6. Enter b to boot the system.

                  The system will now boot with console output going to the VGA screen.

                  Networking will fail to start and some error messages will appear.

                  It is normal to see that the Ethernet devices were renumbered. This will be fixed below.

                  Eventually the login prompt will appear.

                7. Log in as root.

                8. Run the following script, which fixes the network settings and updates the SGI Tempo database for the new network interfaces:

                  # migrate-to-shelf-spare-node


                  Note: If you have Ethernet cards installed, in addition to the ones that come with the system itself, the script could possibly guess the integrated Ethernet devices incorrectly. This may mean you have to manually configure networking including the ifcfg-eth-id-* files in /etc/sysconfig/network and the /etc/udev/rules.d/30-net_persistent_names.rules file (to number them how you want and ensure integrated Ethernet is eth0 and eth1).


                  At this time, networking should be operational.

                9. Reboot the node and let it boot normally.

                Inventory Verification Tool

                You can use the SGI Tempo inventory verification tool to query, take snapshots, analyze and compare the node and network inventory of a cluster. Various hardware, network and operating system configuration properties are available and are presented in user-specified formats.


                Note: If you are reinstalling the system admin controller (admin node), you may want to make a backup of the cluster configuration snapshot that comes with your system so that you can recover it later. You can find it in the /opt/sgi/var/ivt directory on the admin node; it is the earliest snapshot taken. You can use this information with the inventory verification tool (IVT) to verify that the current system shows the same hardware configuration as when it was shipped. For more information, see “Installing Software on the System Admin Controller” in Chapter 2.


                To make an inventory snapshot of an Altix ICE system, use the following command from the system admin controller (admin node).

                system-admin:~ # ivt -M
                Making a cluster inventory snapshot.  Takes a couple of minutes...  

                Each snapshot is assigned a unique number and marked with the date and time it was taken. Use the ivt -L command to list active snapshot information, as follows:

                system-admin:~ # ivt -L
                    1   2007-07-13.11:42:47

                You can query (-Q option), compare (-C option), and analyze (-S option) existing snapshots. A variety of system hardware and configuration properties can be displayed. You can compare two snapshots to see what has changed, or analyze a system snapshot to find failed nodes or to see network fabric links.

                You can use the ivt -S command to show general information about your system (note that only a portion of the output of this command is shown below), as follows:

                system-admin:~ # ivt -S
                
                Your system has 6 compute blades.
                
                All 6 blades have the following characteristics:
                    bios_date: 05/29/2007
                    cpu_core_count: 8
                    cpu_model: Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
                    kernel: 2.6.16.46-0.12-smp
                    memsize: 2059264
                    os_product: SLES
                    os_vendor: SUSE
                    os_version: 10.1
                
                The following characteristics have different values for some blades.
                
                  ib0_phys_state (State of InfiniBand ib0 physical link):
                          4 blades have ib0_phys_state == LinkUp (r1i0n0, r1i1n0, r1i0n8, ...)
                          2 blades have ib0_phys_state == unknown (r1i0n1, r1i1n1)
                      Query the  value for all blades with:
                        ivt -Q -w blades -f 'blade $blade has ib0_phys_state $ib0_phys_state'
                
                  ib0_rate (Rate of InfiniBand ib0 link - Gb/sec):
                          2 blades have ib0_rate == unknown (r1i0n1, r1i1n1)
                          4 blades have ib0_rate == 20 (r1i0n0, r1i1n0, r1i0n8, ...)
                      Query the  value for all blades with:
                        ivt -Q -w blades -f 'blade $blade has ib0_rate $ib0_rate'
                ...
                
                  ib_bios_rev (Revision of InfiniBand BIOS on blade):
                          2 blades have ib_bios_rev == unknown (r1i0n1, r1i1n1)
                          4 blades have ib_bios_rev == 1.2.0 (r1i0n0, r1i1n0, r1i0n8, ...)
                      Query the  value for all blades with:
                        ivt -Q -w blades -f 'blade $blade has ib_bios_rev $ib_bios_rev'
                
                  image (image provisioned on blade):
                          5 blades have image == compute-sles10sp1 (r1i0n1, r1i1n1, r1i1n0, ...)
                          1 blades have image == erikj-blade-mksiimage (r1i0n0)
                      Query the  value for all blades with:
                        ivt -Q -w blades -f 'blade $blade has image $image'
                
                  rack_blade_count (number of booted blades in this blades rack):
                          2 blades have rack_blade_count == 5 (r1i0n1, r1i1n1)
                          4 blades have rack_blade_count == 4 (r1i0n0, r1i1n0, r1i0n8, ...)
                      Query the  value for all blades with:
                        ivt -Q -w blades -f 'blade $blade has rack_blade_count $rack_blade_count'
                
                InfiniBand GUID check:
                  Do fabric (ibnetdiscover) and blades (ib stat) have same GUIDs?
                    ib0 plane: unmatched GUIDs
                    GUIDs seen on blade ports, missing on fabric: unknown 0030487aa7940000
                    GUIDs see on fabric, missing on blade ports: 0030487aa7840000 0030487aa7980000
                    ib1 plane: unmatched GUIDs
                    GUIDs seen on blade ports, missing on fabric: unknown 0030487aa7950000
                    GUIDs see on fabric, missing on blade ports: 0030487aa7850000 0030487aa7990000
                
                InfiniBand Link state check:
                  Are any IB ports not ACTIVE, not 20 Gb/sec rate or not Up?
                ...
                

                You can use the ivt -c cpu command to show an inventory of the system compute blades and the number of CPUs each blade contains, as follows:

                system-admin:~ # ivt -c cpu
                r1i0n0 has 8 CPUs
                r1i0n1 has 8 CPUs
                r1i0n8 has 8 CPUs
                r1i1n0 has 8 CPUs
                r1i1n1 has 8 CPUs
                r1i1n8 has 8 CPUs

                You can use the ivt tool to determine which compute nodes (blades) are up or down, as follows:

                system-admin:~ #  ivt -Q -w blades -f '$blade $sshstate'
                r1i0n0 up
                r1i0n1 down
                r1i0n8 up
                r1i1n0 up
                r1i1n1 down
                r1i1n8 up
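
The two-column output of the query above lends itself to simple filtering. The following sketch prints only the blades reported as down; it assumes the output has exactly the "blade state" layout shown in the sample, and the function name down_blades is illustrative, not part of SGI Tempo.

```shell
#!/bin/sh
# Sketch only: filters the two-column "<blade> <state>" output of
#   ivt -Q -w blades -f '$blade $sshstate'
# down_blades is an illustrative name, not part of SGI Tempo.

down_blades() {
    awk '$2 == "down" { print $1 }'
}

# Demonstration using the sample output shown above.
down_blades <<'EOF'
r1i0n0 up
r1i0n1 down
r1i0n8 up
r1i1n0 up
r1i1n1 down
r1i1n8 up
EOF
```

On the admin node, the same filter could be fed directly from the tool by piping the ivt query shown above into down_blades.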

                You can use the ivt tool to determine the GigE Ethernet address for each compute node (blade), as follows:

                system-admin:~ # ivt -Q -w blades -f '$blade $gige_ip_addr'
                r1i0n0 192.168.159.10
                r1i0n1 192.168.159.11
                r1i0n8 192.168.159.18
                r1i1n0 192.168.159.26
                r1i1n1 192.168.159.27
                r1i1n8 192.168.159.34

                For detailed information on how to use the ivt tool, see the ivt(8) man page or the ivt -h (--help) usage statement.

                System Monitoring Overview

                Ganglia is a scalable, distributed monitoring system for high-performance computing systems, such as the SGI Altix ICE 8200 system. It displays real-time (on demand) histograms of system metrics in a web browser, as shown in Figure 5-1.

                Figure 5-1. Ganglia System Monitor

                Detailed information about the Ganglia monitoring system is available at: http://ganglia.info/.

                SGI Tempo has devised a Ganglia model for the Altix ICE system that makes maximum use of Ganglia's highly scalable architecture: each compute node (blade) presents a single monitoring source sending its statistics to the rack leader controller. Therefore, the rack leader controller receives, at most, data from 64 blades. After collecting the data, the rack leader controller forwards aggregated rack statistics to the system admin controller (admin node). The rack leader controller also sends its own statistics to the system admin controller. The system admin controller presents the meta-aggregator for the entire Altix ICE system. It collects data from all rack leaders and presents the cluster-wide metrics. This model enables SGI to scale-out Ganglia to very large cluster deployments.

                The Node View, shown in Figure 5-2, can aid in system troubleshooting. For every blade in the system, the Location field of the Node View shows the exact physical location of the blade. This is extremely useful when trying to locate a blade that is down.

                Figure 5-2. Ganglia System Monitoring Node View

                System Monitoring Operation

                This section describes the operation of the Ganglia system monitor and covers the following topics:

                Accessing the Ganglia System Monitor

                To access the Ganglia system monitor, point your browser to the following location: http://admin_pub_name/ganglia

                Monitoring System Metrics

                By default, Ganglia monitors standard operating system metrics such as CPU load and memory usage. The Grid Report view shows an overview of your system, such as the number of CPUs, the number of hosts (compute nodes) that are up or down, service node information, memory usage information, and so on.

                The Last pull-down menu allows you to view performance data on an hourly, daily, weekly, or yearly basis. The Sorted pull-down menu provides an ascending, descending, or by-host view of performance data. The Grid pull-down menu allows you to see performance data for a particular rack or service node. The Get Fresh Data button allows you to see current performance data.

                SEL/Hardware Event Monitoring

                The system admin controller, rack leader controllers, the service nodes, the chassis management controllers (CMCs), and all the compute nodes (blades) are equipped with a specialized controller, called the Board Management Controller (BMC). This unit provides a broad set of functions as described in the IPMI 2.0 standard. SGI Tempo software uses the BMCs predominantly for remote power management, remote system configuration, and gathering critical hardware events.

                Currently, critical hardware events are gathered for the following nodes: rack leader controllers (leader nodes), CMCs and compute nodes (blades). These events are logged in the following locations:

                • /var/log/messages via syslog

                • /var/log/sel/sel.log

                • Embedded Support Partner (ESP)

                Whenever a critical hardware event occurs, information about the event is forwarded to all three locations. You can observe a critical hardware event via syslog, via sel.log, or using ESP. Furthermore, administrator-defined actions can be triggered via ESP, for instance sending an e-mail notification to the system administrator. For more information on ESP, see the esp(5) man page and the SGI Embedded Support Partner User Guide.

                All critical hardware events are summarized under the BMC_CMC event type. Each event holds the following useful information:

                MSG ::=  <syslog-prefix> TEMPO:<node> EVENT:<event> APP:<app> Date:<date> VERSION:<version> TEXT <text> 

                The following fields are all of the type string:

                <node>     Node name, for example, r1i0n5
                <event>    BMC_CMC
                <app>      SEL-LOGGER
                <date>     Date and time of the event
                <version>  1.0
                <text>     Exact copy of the hardware event description from the BMC
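
Because every event follows the MSG template above, individual fields can be pulled out of a log line with standard text tools. The sketch below extracts the node, event type, and text. The function name event_summary is illustrative, the demonstration line is a placeholder built from the template rather than real log output, and the pattern assumes that the node and event fields contain no spaces.

```shell
#!/bin/sh
# Sketch only: pulls the node, event type, and text out of a line that
# follows the MSG template above. event_summary is an illustrative name.
# Assumes <node> and <event> contain no spaces.

event_summary() {
    sed -n 's/.* TEMPO:\([^ ]*\) EVENT:\([^ ]*\) .* TEXT \(.*\)$/\1 \2 \3/p'
}

# Placeholder line built from the template; PREFIX, DATE, and the text
# are stand-ins, not real log output.
echo 'PREFIX TEMPO:r1i0n5 EVENT:BMC_CMC APP:SEL-LOGGER Date:DATE VERSION:1.0 TEXT placeholder event text' |
    event_summary
```

In practice, the input would come from one of the log locations listed above, for example by selecting lines that contain EVENT:BMC_CMC from /var/log/messages with grep and piping them into event_summary. Lines that do not match the template are silently skipped.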

                After the events are read from the BMCs, the BMC event logs are cleared on the controllers to avoid duplicate events.

                Node Availability Monitoring

                The availability of each node in the SGI Altix ICE system is monitored via Ganglia. A node is declared down if it does not send a heartbeat for approximately 80 seconds. In this event, a NODE_DOWN Embedded Support Partner (ESP) event is generated. You can observe this event via syslog or using ESP. Furthermore, administrator-defined actions can be triggered, for instance sending an e-mail notification to the system administrator. For more information on ESP, see the esp(5) man page and the SGI Embedded Support Partner User Guide.

                The NODE_DOWN event contains the following useful information:

                MSG ::=  <syslog-prefix> TEMPO:<node> EVENT:<event> APP:<app> Date:<date> VERSION:<version> TEXT <text> 

                The NODE_DOWN event is created only once for a failed node.

                The following fields are all of the type string:

                <node>     Node name, for example, r1i0n5
                <event>    NODE_DOWN
                <app>      MIA
                <date>     Date and time of the event
                <version>  1.0
                <text>     Ganglia Web link to failed node

                Monitoring System Metrics with Performance Co-Pilot

                A wealth of system metrics are also available through the Performance Co-Pilot (see Performance Co-Pilot Linux User's and Administrator's Guide). The Performance Co-Pilot collection daemon (PMCD) runs on the admin node, managed service nodes, and rack leader nodes. As of the SGI ProPack 5 Service Pack 5 release, a performance metrics domain agent (PMDA) is running on the rack leader nodes, which collects metrics from the compute nodes.

                The new cluster metrics domain contains metrics that were previously available in other PMDAs. The method by which they are collected is different in a Tempo system, in order to minimize load on the compute nodes. The following metrics are available for each compute node in a system by querying the PMCD on its rack leader node:

                sys-admin:~ # pminfo -h r1lead cluster
                cluster.control.suspend_monitoring
                cluster.kernel.percpu.cpu.user
                cluster.kernel.percpu.cpu.sys
                cluster.kernel.percpu.cpu.idle
                cluster.kernel.percpu.cpu.intr
                cluster.kernel.percpu.cpu.wait.total
                cluster.mem.util.free
                cluster.mem.util.bufmem
                cluster.mem.util.dirty
                cluster.mem.util.writeback
                cluster.mem.util.mapped
                cluster.mem.util.slab
                cluster.mem.util.cache_clean
                cluster.mem.util.anonpages
                cluster.network.interface.in.bytes
                cluster.network.interface.in.errors
                cluster.network.interface.in.drops
                cluster.network.interface.out.bytes
                cluster.network.interface.out.errors
                cluster.network.interface.out.drops
                cluster.network.ib.in.bytes
                cluster.network.ib.in.errors.drop
                cluster.network.ib.in.errors.filter
                cluster.network.ib.in.errors.local
                cluster.network.ib.in.errors.remote
                cluster.network.ib.out.bytes
                cluster.network.ib.out.errors.drop
                cluster.network.ib.out.errors.filter
                cluster.network.ib.total.errors.link
                cluster.network.ib.total.errors.recover
                cluster.network.ib.total.errors.integrity
                cluster.network.ib.total.errors.vl15
                cluster.network.ib.total.errors.overrun
                cluster.network.ib.total.errors.symbol

                Monitoring SDR Metrics

                In SGI ProPack 5 SP5, the sensor data repository (SDR) metrics are available through Performance Co-Pilot (see Performance Co-Pilot Linux User's and Administrator's Guide). The SDR provides temperature, voltage, and fan speed information for all service nodes, leader nodes, compute nodes, and CMCs. This information is collected from service and compute nodes through their BMC interface, so it is out-of-band and does not impact the performance of the node.

                The following metrics are available through the PMCD:

                sys-admin:~ # pminfo -h r1lead sensor
                sensor.value.fan
                sensor.value.voltage
                sensor.value.temperature

                Each sensor has a separate instance within the domain, with the instance name of the form:
                <nodeName>:<nodeType>:<metricName>
                
                nodeName ::= Tempo node names (rXlead, rXiYc, rXiYnZ)
                nodeType ::= "service", "cmc", "blade", "leader"

                For example, to view voltages for the rack leader node, run the following command:

                sys-admin:~ # pminfo -h r1lead -f sensor.value.voltage | grep -E '(^$|^sensor|r1lead)'
                
                sensor.value.voltage
                    inst [0 or "r1lead:leader:CPU1_Vcore"] value 1.3
                    inst [1 or "r1lead:leader:CPU2_Vcore"] value 1.3
                    inst [2 or "r1lead:leader:3.3V"] value 3.26
                    inst [3 or "r1lead:leader:5V"] value 4.9
                    inst [4 or "r1lead:leader:12V"] value 11.71
                    inst [5 or "r1lead:leader:-12V"] value -12.3
                    inst [6 or "r1lead:leader:1.5V"] value 1.47
                    inst [7 or "r1lead:leader:5VSB"] value 4.9
                    inst [8 or "r1lead:leader:VBAT"] value 3.31

                For additional examples on how to retrieve values using pmval(1) and for using this data in trend analysis using pmie(1), see the appropriate man page and the Performance Co-Pilot Linux User's and Administrator's Guide.
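
The instance naming scheme described above lends itself to simple post-processing. The following is a minimal sketch in Python (it is not part of SGI Tempo or Performance Co-Pilot) that splits each instance name in saved pminfo -f output into its node name, node type, and metric name. The function name parse_sensor_instances and the SAMPLE text are illustrative assumptions; the regular expression assumes only the line layout shown in the example output above.

```python
import re

# A few lines copied from the pminfo example output above.
SAMPLE = '''sensor.value.voltage
    inst [0 or "r1lead:leader:CPU1_Vcore"] value 1.3
    inst [2 or "r1lead:leader:3.3V"] value 3.26
    inst [5 or "r1lead:leader:-12V"] value -12.3
'''

# Matches: inst [<number> or "<instance name>"] value <reading>
INST_RE = re.compile(r'inst \[(\d+) or "([^"]+)"\] value (\S+)')


def parse_sensor_instances(text):
    """Return one dict per sensor instance found in pminfo -f output."""
    readings = []
    for line in text.splitlines():
        match = INST_RE.search(line)
        if match is None:
            continue  # metric name lines and blank lines carry no reading
        # Instance names have the form <nodeName>:<nodeType>:<metricName>.
        node, node_type, metric = match.group(2).split(":", 2)
        readings.append({
            "node": node,
            "type": node_type,
            "metric": metric,
            "value": float(match.group(3)),
        })
    return readings


if __name__ == "__main__":
    for reading in parse_sensor_instances(SAMPLE):
        print(reading["node"], reading["type"], reading["metric"], reading["value"])
```

In practice the input text would come from a file or pipe holding real pminfo output rather than from the SAMPLE constant.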

                Setting up the Embedded Support Partner

                The Embedded Support Partner (ESP) is a software suite to monitor events, set up proactive notification, and generate reports on SGI Altix systems. This section describes how to set it up on an SGI Altix ICE system. For detailed information about ESP, see the Embedded Support Partner User Guide.

                Procedure 5-8. Setting up the Embedded Support Partner

                  To set up ESP on an SGI Altix ICE system, perform the following steps:

                  1. From the admin node, use the chkconfig command to make sure that the state of ESP is on, as follows:

                    sys-admin:~ # chkconfig --list | grep esp
                    esp                       0:on  1:on  2:on   3:on   4:on   5:on   6:on
                            sgi-esphttp:        on
                            sgi_espd:           on

                    ESP should already be running if its chkconfig flag is on. You can interact with ESP using a web interface or the command line (see Chapter 4, “Setting Up the ESP Environment”, in the Embedded Support Partner User Guide).

                  2. From the admin node, create the default ESP user account, as follows:

                    system-admin:~ # espconfig -createadmin 

                  3. Enable the hosts that will be allowed to access ESP with the following commands:

                    system-admin:~ # espconfig -enable ipaddr 127.0.0.0 
                    system-admin:~ # espconfig -enable ipaddr 127.0.0.1 
                    system-admin:~ # espconfig -enable ipaddr IP_address_of_client 

                  4. From your laptop or PC system, point your browser to http://mymachine__-admin:5554 and log into ESP.

                  5. When the ESP login screen appears, log in as administrator with the password partner. After you log in, the System Information screen appears (see Chapter 2, “Accessing ESP”, in the Embedded Support Partner User Guide).

                  6. Now enter the Customer Profile information, as follows:

                    1. Select ESP Administration from the menu.

                    2. Click on Customer Profile (if not selected by default).

                    3. Fill in the form and then click Add.

                    4. Click Commit, or click Update if the profile was already filled out.

                  7. Use ESP to Examine Inventory, as follows:

                    1. Select Reports > Hardware > Generate Report.

                    2. Select Reports > Software > Generate Report.

                    3. You can search for individual packages by entering the name in the search box (below the system host name) and then selecting GO on the right-hand side of the screen. You can also use the down arrow to select a package in this search box.

                  8. Use ESP to enable or disable Performance Monitoring, as follows:

                    1. Select Configuration (from the top level menu) and then select Performance Monitoring.

                    2. Enable PMIE.

                    3. Disable the PMIE rule cpu.util.

                    4. Select Commit.

                    5. Select Configuration > System Monitoring and enable the service pmcd.

                    6. Select Update and Commit (this may take a few minutes).

                  9. Use ESP to examine error logs, as follows:

                    1. From the top level menus, select Report > Events.

                    2. Then select Last 30 days and All Classes before clicking on Generate Report.

                  10. Use ESP to enable or disable Notification.

                    Notification of events is handled by espnotify. A notification can be sent by e-mail, to the system console, or to the graphics console. Notifications are enabled or disabled per action, so after configuring a notification action you can enable or disable it, as follows:

                    1. Select Configuration > Actions and click Continue.

                    2. Decide on the notification format and then check and select Continue and Commit.

                    3. Select Enable/Disable from the third level menu, and click to enable the notification you set up.

                    4. Click Commit.

                  Troubleshooting

                  This section describes some troubleshooting tools and covers these topics:

                  dbdump Command

                  You can run the dbdump script to see an inventory of the Altix ICE database.

                  The syntax of the dbdump command is as follows:

                  /opt/sgi/sbin/dbdump --admin
                  /opt/sgi/sbin/dbdump --leader
                  /opt/sgi/sbin/dbdump --rack rack_number [--rack rack_number]
                  /opt/sgi/sbin/dbdump

                  • Use the --admin argument to dump the system admin controller (admin node).

                  • Use the --leader argument to dump all rack leader controllers (leader nodes).

                  • Use the --rack argument to dump a specific rack.

                  • Use the dbdump command without any argument to dump the entire Altix ICE system.

                  EXAMPLES

                  Example 5-1. dbdump Command Examples

                  To dump the entire database, perform the following:

                  system-admin:~ # dbdump
                  0 is { cluster=oscar ifname=service0-bmc dev=bmc0 ip=172.24.0.3 net=head-bmc node=service0
                    nodetype=oscar_service mac=00:30:48:8e:
                  1 is { cluster=oscar ifname=service0 dev=eth0 ip=172.23.0.3 net=head node=service0
                    nodetype=oscar_service mac=00:30:48:33:53:2e }
                  2 is { cluster=oscar ifname=service0-ib0 dev=ib0 ip=10.148.0.2 net=ib-0 node=service0
                    nodetype=oscar_service }
                  3 is { cluster=oscar ifname=service0-ib1 dev=ib1 ip=10.149.0.2 net=ib-1 node=service0
                    nodetype=oscar_service }
                  4 is { cluster=oscar dev=eth0 ip=128.162.244.86 net=public node=oscar_server
                    nodetype=oscar_server mac=00:30:48:34:2B:E0 }
                  ...


                  Note: Some of the sample output in this section has been modified to fit the format of this manual.


                  To dump just the rack leader controller, perform the following:

                  sys-admin:~ # /opt/sgi/sbin/dbdump --leader
                  0 is { cluster=rack1 ifname=r1lead-bmc dev=bmc0 ip=172.24.0.2 net=head-bmc node=r1lead
                    nodetype=oscar_leader mac=00:30:48:8a:a4:c2 }
                  1 is { cluster=rack1 ifname=lead-bmc dev=eth0 ip=192.168.160.1 net=bmc node=r1lead
                    nodetype=oscar_leader mac=00:30:48:33:54:9e }
                  2 is { cluster=rack1 ifname=lead-eth dev=eth0 ip=192.168.159.1 net=gbe node=r1lead
                    nodetype=oscar_leader mac=00:30:48:33:54:9e }
                  3 is { cluster=rack1 ifname=r1lead dev=eth0 ip=172.23.0.2 net=head node=r1lead
                    nodetype=oscar_leader mac=00:30:48:33:54:9e }
                  4 is { cluster=rack1 ifname=r1lead-ib0 dev=ib0 ip=10.148.0.1 net=ib-0 node=r1lead
                    nodetype=oscar_leader }
                  5 is { cluster=rack1 ifname=r1lead-ib1 dev=ib1 ip=10.149.0.1 net=ib-1 node=r1lead
                    nodetype=oscar_leader }

                  To dump just one rack, perform the following:
                  sys-admin:~ # /opt/sgi/sbin/dbdump --rack 1
                  0 is { cluster=rack1 ifname=i0n0-bmc dev=bmc0 ip=192.168.160.10 net=bmc node=r1i0n0
                    nodetype=oscar_clients mac=00:30:48:7a:a7:96 }
                  1 is { cluster=rack1 ifname=i0n0-eth dev=eth0 ip=192.168.159.10 net=gbe node=r1i0n0
                    nodetype=oscar_clients mac=00:30:48:7a:a7:94 }
                  2 is { cluster=rack1 ifname=r1i0n0-ib0 dev=ib0 ip=10.148.0.3 net=ib-0 node=r1i0n0
                    nodetype=oscar_clients }
                  3 is { cluster=rack1 ifname=r1i0n0-ib1 dev=ib1 ip=10.149.0.3 net=ib-1 node=r1i0n0
                    nodetype=oscar_clients }
                  4 is { cluster=rack1 ifname=i0n1-bmc dev=bmc0 ip=192.168.160.11 net=bmc node=r1i0n1
                    nodetype=oscar_clients mac=00:30:48:7a:a7:86 slot=1 }
                  5 is { cluster=rack1 ifname=i0n1-eth dev=eth0 ip=192.168.159.11 net=gbe node=r1i0n1
                    nodetype=oscar_clients mac=00:30:48:7a:a7:84 slot=1 }
                  6 is { cluster=rack1 ifname=r1i0n1-ib0 dev=ib0 ip=10.148.0.4 net=ib-0 node=r1i0n1
                    nodetype=oscar_clients slot=1 }
                  7 is { cluster=rack1 ifname=r1i0n1-ib1 dev=ib1 ip=10.149.0.4 net=ib-1 node=r1i0n1
                    nodetype=oscar_clients slot=1 }
                  8 is { cluster=rack1 ifname=i0n10-bmc dev=bmc0 ip=192.168.160.20 net=bmc node=r1i0n10
                    nodetype=oscar_clients slot=10 }
                  9 is { cluster=rack1 ifname=i0n10-eth dev=eth0 ip=192.168.159.20 net=gbe node=r1i0n10
                    nodetype=oscar_clients slot=10 }
                  10 is { cluster=rack1 ifname=r1i0n10-ib0 dev=ib0 ip=10.148.0.13 net=ib-0 node=r1i0n10
                    nodetype=oscar_clients slot=10 }
                  ...
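
Because each dbdump record is a brace-delimited list of key=value pairs, saved output can be turned into structured data with little effort. The following is a minimal sketch in Python (it is not an SGI Tempo tool) that assumes only the record layout visible in the examples above, including records wrapped over two lines. The names parse_dbdump, RECORD_RE, and the added "index" key are illustrative assumptions.

```python
import re

# Records copied from the "dbdump --rack 1" example above.
SAMPLE = '''0 is { cluster=rack1 ifname=i0n0-bmc dev=bmc0 ip=192.168.160.10 net=bmc node=r1i0n0
  nodetype=oscar_clients mac=00:30:48:7a:a7:96 }
1 is { cluster=rack1 ifname=i0n0-eth dev=eth0 ip=192.168.159.10 net=gbe node=r1i0n0
  nodetype=oscar_clients mac=00:30:48:7a:a7:94 }
4 is { cluster=rack1 ifname=i0n1-bmc dev=bmc0 ip=192.168.160.11 net=bmc node=r1i0n1
  nodetype=oscar_clients mac=00:30:48:7a:a7:86 slot=1 }
'''

# "<number> is { ... }"; [^}]* also spans the line break inside a record.
RECORD_RE = re.compile(r'(\d+) is \{([^}]*)\}')


def parse_dbdump(text):
    """Return a list of dicts, one per dbdump record."""
    records = []
    for match in RECORD_RE.finditer(text):
        # Keep only key=value tokens so stray words are ignored.
        fields = dict(
            token.split("=", 1)
            for token in match.group(2).split()
            if "=" in token
        )
        fields["index"] = int(match.group(1))  # record number, hypothetical key
        records.append(fields)
    return records


if __name__ == "__main__":
    for record in parse_dbdump(SAMPLE):
        print(record["node"], record["net"], record["ip"])
```

A record whose closing brace is missing from the captured text cannot be delimited reliably by this pattern, so such output should be checked by hand.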


                  tempo-info-gather Command

                  The tempo-info-gather command enables you to collect vital system data, which is especially useful when troubleshooting problems. It collects information about the following:

                  • Digital media dminfo files, syslogs, Dynamic Host Configuration Protocol (DHCP), network file system (NFS)

                  • MySQL cluster database dump

                  • Network service configuration files, for example, C3, Ganglia, DHCP, domain name service (DNS) configuration files

                  • A list of installed system images

                  • Log files in /var/log/messages

                  • Chassis management control (CMC) slot table for each rack

                  • Basic input-output system (BIOS), baseboard management controller (BMC), CMC, and InfiniBand fabric software versions from all Altix ICE nodes

                  To see a usage statement for the tempo-info-gather command, perform the following:

                  sys-admin:/opt/sgi/sbin # tempo-info-gather -h
                   usage: tempo-info-gather [-h] [-P path] [-o file]
                          tempo-info-gather -h            # Print this usage page
                          tempo-info-gather -o file       # Tar and gzip the directories into file (imply -n)
                          tempo-info-gather -p path       # Directory to write the data (default /var/tmp/tempo)
                  

                  cminfo Command

                  The cminfo command is used internally by many of the SGI Tempo scripts that are used to discover, configure, and manage an SGI Altix ICE system.

                  In a troubleshooting situation, you can use it to gather information about your system. To see a usage statement from a rack leader controller, perform the following:

                  r1lead:~ # cminfo --help
                  Usage: cminfo [--bmc_base_ip|--bmc_ifname|--bmc_iftype|--bmc_ip|--bmc_mac|--bmc_netmask|--bmc_nic|
                  --dns_domain|--gbe_base_ip|--gbe_ifname|--gbe_iftype|--gbe_ip|--gbe_mac|--gbe_netmask|--gbe_nic|
                  --head_base_ip|--head_bmc_base_ip|--head_bmc_ifname|--head_bmc_iftype|--head_bmc_ip|--head_bmc_mac|
                  --head_bmc_netmask|--head_bmc_nic|--head_ifname|--head_iftype|--head_ip|--head_mac|--head_netmask|
                  --head_nic|--ib_0_base_ip|--ib_0_ifname|--ib_0_iftype|--ib_0_ip|--ib_0_mac|--ib_0_netmask|--ib_0_nic|
                  --ib_1_base_ip|--ib_1_ifname|--ib_1_iftype|--ib_1_ip|--ib_1_mac|--ib_1_netmask|--ib_1_nic|--name|--rack]
                  r1lead:~ # cminfo --bmc_base_ip

                  EXAMPLES

                  Example 5-2. cminfo Command Examples

                  To see the base IP address of the rack leader node BMC network, perform the following:

                  r1lead:~ # cminfo --bmc_base_ip
                  192.168.160.0

                  To see the rack leader DNS domain, perform the following:

                  r1lead:~ # cminfo --dns_domain
                  ice.domain_name.mycompany.com

                  To see the BMC NIC, perform the following:

                  r1lead:~ # cminfo --bmc_nic
                  eth0

                  To see the base IP address of the ib1 InfiniBand fabric, perform the following:

                  r1lead:~ # cminfo --ib_1_base_ip
                  10.149.0.0
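
When several values are needed at once, the individual cminfo queries shown above can be collected in one pass. The following is a minimal sketch in Python, assuming a host where Python 3.7 or later is available and cminfo is on the PATH (this guide states neither). The function name query_cminfo and its run parameter are illustrative assumptions; the option names come from the usage statement above.

```python
import subprocess


def query_cminfo(fields, run=subprocess.run):
    """Run "cminfo --<field>" for each field and return {field: output}.

    The run parameter exists only so the command runner can be replaced
    when cminfo itself is not available, for example in a dry run.
    """
    results = {}
    for field in fields:
        completed = run(
            ["cminfo", "--" + field],
            capture_output=True,  # collect stdout instead of printing it
            text=True,            # decode output as text
            check=True,           # raise CalledProcessError on failure
        )
        results[field] = completed.stdout.strip()
    return results


if __name__ == "__main__":
    # Requires the cminfo command; run this on a rack leader node.
    wanted = ["bmc_base_ip", "dns_domain", "bmc_nic", "ib_1_base_ip"]
    for name, value in query_cminfo(wanted).items():
        print(name, "=", value)
```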


                  kdump Utility

                  The kdump utility is a kexec-based crash dumping mechanism for the Linux operating system. You can download debuginfo kernel RPMs for use with the crash utility and any kernel dumps from the following location: http://support.novell.com/linux/psdb/byproduct.html.

                  To get a traceback or system dump, perform the following from the system console:

                  console r1i0n0
                  ^e c l 1 8
                  ^e c l 1 t       #traceback
                  ^e c l 1 c       #dump


                  Note: All three key sequences in this example contain the letter “c”, a lowercase L (“l”), and the number one (“1”).


                  On the admin node, go to /net/r1lead/var/log/consoles for the traceback and /net/r1lead/var/log/dumps/r1i0n0 for the system dump.

                  You can dump a compute node, a rack leader node such as r1lead, or a service node such as service0.

                  System Firmware


                  Note: Your SGI Altix ICE system comes preinstalled with the appropriate firmware. See your SGI field support person for any BMC, BIOS, and CMC firmware updates.


                  The SGI Altix ICE system firmware software consists of the following components:

                  sgi-ice-blade-bmc-1.43.5-1.x86_64.rpm
                    Blade BMC firmware and update tool

                  sgi-ice-blade-bios-2007.08.10-1.x86_64.rpm
                    Blade BIOS image and update tool

                  sgi-ice-cmc-0.0.11-2.x86_64.rpm
                    CMC firmware and update tool

                  BIOS Version Interrogation

                  To identify the BIOS, you need both the version and the release date. You can get these using the dmidecode command. Log on to the node whose BIOS level you want to check and perform the following:

                  # dmidecode -s bios-version; dmidecode -s bios-release-date

                  BMC Revision Interrogation

                  The BMC firmware revision can be retrieved using the ipmitool command. For example, if you are logged on to the r1lead rack leader controller, the following command gets the BMC firmware revision:

                  # ipmitool -U ADMIN -P ADMIN -I lanplus -H r1i0n0-bmc bmc info | grep 'Firmware Revision' 

                  CMC Version Interrogation

                  The CMC firmware version can be retrieved by issuing the version command to the CMC. For example, if you are logged on to the r1lead rack leader controller, the following command gets the CMC firmware version:

                  # ssh root@r1i0-cmc version 

                  InfiniBand Version Interrogation

                  The ibstat command retrieves information for the InfiniBand links including the firmware version. The following command gets the InfiniBand firmware version:

                  # ibstat | grep Firmware 

                  Getting Firmware Information for All System Nodes

                  The firmware_revs script on the system admin controller (admin node) collects the firmware information for all nodes in the SGI Altix ICE system, as follows:

                  system-admin:~ # firmware_revs 
                  BIOS versions:
                  --------------
                  admin: 6.00
                  r1lead: 6.00
                  service0: 6.00
                  r1i0n0: 6.00
                  r1i0n1: 6.00
                  r1i0n8: 6.00
                  r1i1n0: 6.00
                  r1i1n1: 6.00
                  r1i1n8: 6.00
                  
                  
                  BIOS release dates:
                  -------------------
                  admin: 05/10/2007
                  r1lead: 05/10/2007
                  service0: 05/10/2007
                  r1i0n0: 05/29/2007
                  r1i0n1: 05/29/2007
                  r1i0n8: 05/29/2007
                  r1i1n0: 05/29/2007
                  r1i1n1: 05/29/2007
                  r1i1n8: 05/29/2007
                  
                  
                  BMC versions:
                  -------------
                  admin: 1.31
                  r1lead: 1.31
                  service0: 1.31
                  r1i0n0: 1.29
                  r1i0n1: 1.29
                  r1i0n8: 1.29
                  r1i1n0: 1.29
                  r1i1n1: 1.29
                  r1i1n8: 1.29
                  
                  
                  CMC versions:
                  -------------
                  r1i0c: 0.0.9pre10
                  r1i1c: 0.0.9pre10
                  
                  
                  Infiniband versions:
                  --------------------
                  r1lead: 4.7.600
                  service0: 4.7.600
                  r1i0n0: 1.2.0
                  r1i0n0: 1.2.0
                  r1i0n1: 1.2.0
                  r1i0n1: 1.2.0
                  r1i0n8: 1.2.0
                  r1i0n8: 1.2.0
                  r1i1n0: 1.2.0
                  r1i1n0: 1.2.0
                  r1i1n1: 1.2.0
                  r1i1n1: 1.2.0
                  r1i1n8: 1.2.0
                  r1i1n8: 1.2.0
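
On larger systems it can be easier to see which nodes share a firmware level than to read the list line by line. The following is a minimal sketch in Python (it is not part of the firmware_revs script) that groups saved firmware_revs output by section and version. It assumes only the layout shown above: a header line ending in a colon, a dashed underline, and node: version entries. The name parse_firmware_revs and the SAMPLE text are illustrative assumptions.

```python
from collections import defaultdict

# Excerpt in the same layout as the firmware_revs output above.
SAMPLE = '''BMC versions:
-------------
admin: 1.31
r1lead: 1.31
r1i0n0: 1.29
r1i0n1: 1.29


CMC versions:
-------------
r1i0c: 0.0.9pre10
r1i1c: 0.0.9pre10
'''


def parse_firmware_revs(text):
    """Return {section: {version: set of node names}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"-"}:
            continue  # skip blank lines and dashed underlines
        name, _, value = line.partition(":")
        value = value.strip()
        if not value:
            # A line such as "BMC versions:" starts a new section.
            current = sections.setdefault(name, defaultdict(set))
        elif current is not None:
            # Using a set means a node listed twice is counted once.
            current[value].add(name)
    return {section: dict(groups) for section, groups in sections.items()}


if __name__ == "__main__":
    for section, groups in parse_firmware_revs(SAMPLE).items():
        for version, nodes in sorted(groups.items()):
            print(section, version, ", ".join(sorted(nodes)))
```

A section that reports more than one version, as the BMC section does here, is a quick indication of which nodes to compare before a firmware update.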