Chapter 5. System Monitoring and Debugging

This chapter describes system monitoring and debugging. It covers the inventory verification tool, Ganglia-based system monitoring, and troubleshooting commands.

Inventory Verification Tool

You can use the SGI Tempo inventory verification tool (ivt) to take snapshots of the node and network inventory of a cluster and then query, analyze, and compare them. Various hardware, network, and operating system configuration properties are available and can be presented in user-specified formats.

To make an inventory snapshot of an Altix ICE system, use the following command from the system admin controller (admin node):

system-admin:~ # ivt -M
Making a cluster inventory snapshot.  Takes a couple of minutes...  

Each snapshot is assigned a unique number and marked with the date and time it was taken. Use the ivt -L command to list active snapshot information, as follows:

system-admin:~ # ivt -L
    1   2007-07-13.11:42:47
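
If you want to process the snapshot list in a script, a minimal Python sketch such as the following can parse it. It assumes that every snapshot row has exactly the two fields shown in the sample output above (a number and a timestamp); the function name parse_snapshot_list is hypothetical and is not part of SGI Tempo.

```python
from datetime import datetime

def parse_snapshot_list(text):
    """Parse `ivt -L` style output into (snapshot_number, datetime) tuples.

    Assumes each row looks like the sample: "<number>   YYYY-MM-DD.HH:MM:SS".
    """
    snapshots = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # skip blank lines and anything that is not a snapshot row
        taken = datetime.strptime(fields[1], "%Y-%m-%d.%H:%M:%S")
        snapshots.append((int(fields[0]), taken))
    return snapshots

# Sample row copied from the listing above.
sample = "    1   2007-07-13.11:42:47\n"
snapshots = parse_snapshot_list(sample)
for number, taken in snapshots:
    print(number, taken.isoformat())
```

On a live system, the text passed to the function would be the captured output of ivt -L rather than the sample string.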

You can query (-Q option), compare (-C option), and analyze (-S option) existing snapshots. A variety of system hardware and configuration properties can be displayed. You can compare two snapshots to see what has changed, or analyze a system snapshot to find failed nodes or to check network fabric links.

You can use the ivt -S command to show general information about your system, as follows (only a portion of the output is shown):

system-admin:~ # ivt -S

Your system has 6 compute blades.

All 6 blades have the following characteristics:
    bios_date: 05/29/2007
    cpu_core_count: 8
    cpu_model: Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
    kernel: 2.6.16.46-0.12-smp
    memsize: 2059264
    os_product: SLES
    os_vendor: SUSE
    os_version: 10.1

The following characteristics have different values for some blades.

  ib0_phys_state (State of InfiniBand ib0 physical link):
          4 blades have ib0_phys_state == LinkUp (r1i0n0, r1i1n0, r1i0n8, ...)
          2 blades have ib0_phys_state == unknown (r1i0n1, r1i1n1)
      Query the  value for all blades with:
        ivt -Q -w blades -f 'blade $blade has ib0_phys_state $ib0_phys_state'

  ib0_rate (Rate of InfiniBand ib0 link - Gb/sec):
          2 blades have ib0_rate == unknown (r1i0n1, r1i1n1)
          4 blades have ib0_rate == 20 (r1i0n0, r1i1n0, r1i0n8, ...)
      Query the  value for all blades with:
        ivt -Q -w blades -f 'blade $blade has ib0_rate $ib0_rate'
...

  ib_bios_rev (Revision of InfiniBand BIOS on blade):
          2 blades have ib_bios_rev == unknown (r1i0n1, r1i1n1)
          4 blades have ib_bios_rev == 1.2.0 (r1i0n0, r1i1n0, r1i0n8, ...)
      Query the  value for all blades with:
        ivt -Q -w blades -f 'blade $blade has ib_bios_rev $ib_bios_rev'

  image (image provisioned on blade):
          5 blades have image == compute-sles10sp1 (r1i0n1, r1i1n1, r1i1n0, ...)
          1 blades have image == erikj-blade-mksiimage (r1i0n0)
      Query the  value for all blades with:
        ivt -Q -w blades -f 'blade $blade has image $image'

  rack_blade_count (number of booted blades in this blades rack):
          2 blades have rack_blade_count == 5 (r1i0n1, r1i1n1)
          4 blades have rack_blade_count == 4 (r1i0n0, r1i1n0, r1i0n8, ...)
      Query the  value for all blades with:
        ivt -Q -w blades -f 'blade $blade has rack_blade_count $rack_blade_count'

InfiniBand GUID check:
  Do fabric (ibnetdiscover) and blades (ib stat) have same GUIDs?
    ib0 plane: unmatched GUIDs
    GUIDs seen on blade ports, missing on fabric: unknown 0030487aa7940000
    GUIDs see on fabric, missing on blade ports: 0030487aa7840000 0030487aa7980000
    ib1 plane: unmatched GUIDs
    GUIDs seen on blade ports, missing on fabric: unknown 0030487aa7950000
    GUIDs see on fabric, missing on blade ports: 0030487aa7850000 0030487aa7990000

InfiniBand Link state check:
  Are any IB ports not ACTIVE, not 20 Gb/sec rate or not Up?
...
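
The "N blades have property == value" lines in the analysis output above lend themselves to a quick tally. The following Python sketch is one way to do that, assuming the line layout matches the sample output exactly. The regular expression and the function name summarize_differences are illustrative and are not part of ivt.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   "4 blades have ib0_phys_state == LinkUp (r1i0n0, r1i1n0, r1i0n8, ...)"
LINE_RE = re.compile(r"^\s*(\d+) blades have (\w+) == (\S+) \((.*)\)\s*$")

def summarize_differences(text):
    """Return {property: {value: blade_count}} for every matching line."""
    summary = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            count, prop, value, _blades = match.groups()
            # The blade list may be truncated with "...", so only the count is kept.
            summary[prop][value] = int(count)
    return dict(summary)

# Lines copied from the sample output above.
sample = """\
  ib0_phys_state (State of InfiniBand ib0 physical link):
          4 blades have ib0_phys_state == LinkUp (r1i0n0, r1i1n0, r1i0n8, ...)
          2 blades have ib0_phys_state == unknown (r1i0n1, r1i1n1)
  image (image provisioned on blade):
          5 blades have image == compute-sles10sp1 (r1i0n1, r1i1n1, r1i1n0, ...)
          1 blades have image == erikj-blade-mksiimage (r1i0n0)
"""
summary = summarize_differences(sample)
for prop, values in summary.items():
    print(prop, values)
```

Because the blade names in parentheses can be abbreviated, use the ivt -Q query suggested in the output when you need the value for every blade.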

You can use the ivt -c cpu command to show an inventory of the system compute blades and the number of CPUs each blade contains, as follows:

system-admin:~ # ivt -c cpu
r1i0n0 has 8 CPUs
r1i0n1 has 8 CPUs
r1i0n8 has 8 CPUs
r1i1n0 has 8 CPUs
r1i1n1 has 8 CPUs
r1i1n8 has 8 CPUs

You can use the ivt tool to determine which compute nodes (blades) are up or down, as follows:

system-admin:~ #  ivt -Q -w blades -f '$blade $sshstate'
r1i0n0 up
r1i0n1 down
r1i0n8 up
r1i1n0 up
r1i1n1 down
r1i1n8 up
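
To pick out the blades that are down from this kind of listing, you could group the rows by state. The following is a minimal Python sketch, assuming the two-column "blade state" format produced by the query above; the helper name blades_by_state is hypothetical.

```python
def blades_by_state(text):
    """Group blade names by the state reported in "<blade> <state>" rows."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # ignore blank or malformed rows
            blade, state = fields
            states.setdefault(state, []).append(blade)
    return states

# Rows copied from the sample output above.
sample = """\
r1i0n0 up
r1i0n1 down
r1i0n8 up
r1i1n0 up
r1i1n1 down
r1i1n8 up
"""
states = blades_by_state(sample)
print("down:", ", ".join(states.get("down", [])))
```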

You can use the ivt tool to determine the IP address on the Gigabit Ethernet (GigE) network for each compute node (blade), as follows:

system-admin:~ # ivt -Q -w blades -f '$blade $gige_ip_addr'
r1i0n0 192.168.159.10
r1i0n1 192.168.159.11
r1i0n8 192.168.159.18
r1i1n0 192.168.159.26
r1i1n1 192.168.159.27
r1i1n8 192.168.159.34
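
A listing like this can be sanity-checked with Python's standard ipaddress module, for example to catch duplicate or out-of-range addresses. The sketch below makes one loud assumption: the /24 prefix length of the GigE network is a guess for illustration only and is not stated in this manual. The function names are hypothetical.

```python
import ipaddress

# ASSUMPTION: the prefix length (/24) is illustrative; check your site configuration.
ASSUMED_GIGE_NET = ipaddress.ip_network("192.168.159.0/24")

def parse_blade_addresses(text):
    """Return {blade: IPv4Address} from "<blade> <ip>" rows."""
    addresses = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            blade, addr = fields
            # ip_address() raises ValueError if the address is malformed.
            addresses[blade] = ipaddress.ip_address(addr)
    return addresses

def find_duplicates(addresses):
    """Return {address: [blades]} for addresses used by more than one blade."""
    seen = {}
    for blade, addr in addresses.items():
        seen.setdefault(addr, []).append(blade)
    return {addr: blades for addr, blades in seen.items() if len(blades) > 1}

# Rows copied from the sample output above.
sample = """\
r1i0n0 192.168.159.10
r1i0n1 192.168.159.11
r1i0n8 192.168.159.18
r1i1n0 192.168.159.26
r1i1n1 192.168.159.27
r1i1n8 192.168.159.34
"""
addresses = parse_blade_addresses(sample)
duplicates = find_duplicates(addresses)
outside = [b for b, a in addresses.items() if a not in ASSUMED_GIGE_NET]
print("duplicates:", duplicates)
print("outside assumed network:", outside)
```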

For detailed information on how to use the ivt tool, see the ivt(8) man page or the usage statement displayed by ivt -h (or ivt --help).

System Monitoring Overview

Ganglia is a scalable, distributed monitoring system for high-performance computing systems, such as the SGI Altix ICE 8000 system. It displays real-time (on-demand) histograms of system metrics in a web browser, as shown in Figure 5-1.

Figure 5-1. Ganglia System Monitor

Detailed information about the Ganglia monitoring system is available at: http://ganglia.info/.

SGI Tempo uses a Ganglia model for the Altix ICE system that makes maximum use of Ganglia's highly scalable architecture. Each compute node (blade) acts as a single monitoring source that sends its statistics to the rack leader controller, so a rack leader controller receives data from at most 64 blades. After collecting the data, the rack leader controller forwards aggregated rack statistics, along with its own statistics, to the system admin controller (admin node). The system admin controller acts as the meta-aggregator for the entire Altix ICE system: it collects data from all rack leaders and presents the cluster-wide metrics. This model enables Ganglia to scale out to very large cluster deployments.
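
The two-level aggregation described above can be illustrated with a toy Python model. This is only a sketch of the idea, not Ganglia code or its API: the function names, the choice of a "load" metric, and the sample values are all hypothetical. Only the limit of 64 blades per rack leader comes from the text.

```python
MAX_BLADES_PER_RACK = 64  # from the text: a rack leader receives data from at most 64 blades

def aggregate_rack(blade_loads):
    """Rack leader role: reduce per-blade values to one rack summary."""
    if len(blade_loads) > MAX_BLADES_PER_RACK:
        raise ValueError("a rack leader handles at most 64 blades")
    return {"blades": len(blade_loads), "load_sum": sum(blade_loads.values())}

def aggregate_cluster(rack_summaries):
    """Admin node role: combine rack summaries into cluster-wide metrics."""
    total_blades = sum(r["blades"] for r in rack_summaries.values())
    total_load = sum(r["load_sum"] for r in rack_summaries.values())
    mean_load = total_load / total_blades if total_blades else 0.0
    return {"racks": len(rack_summaries), "blades": total_blades, "mean_load": mean_load}

# Hypothetical load values for two racks.
racks = {
    "rack1": {"r1i0n0": 1.0, "r1i0n1": 3.0},
    "rack2": {"r2i0n0": 2.0},
}
rack_summaries = {name: aggregate_rack(loads) for name, loads in racks.items()}
cluster = aggregate_cluster(rack_summaries)
print(cluster)
```

The point of the design is that the admin node handles one summary per rack rather than one data stream per blade, which keeps its workload proportional to the number of racks.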

The Node View, shown in Figure 5-2, can aid in system troubleshooting. For every blade in the system, the Location field of the Node View shows the exact physical location of the blade. This is extremely useful when trying to locate a blade that is down.

Figure 5-2. Ganglia System Monitoring Node View

System Monitoring Operation

To access the Ganglia system monitor, point your browser to the following location: http://admin_pub_name/ganglia

By default, Ganglia monitors standard operating system metrics, such as CPU load and memory usage. The Grid Report view shows an overview of your system, such as the number of CPUs, the number of hosts (compute nodes) that are up or down, service node information, memory usage information, and so on.

The Last pull-down menu allows you to view performance data on an hourly, daily, weekly, or yearly basis. The Sorted pull-down menu provides an ascending, descending, or by-host view of performance data. The Grid pull-down menu allows you to see performance data for a particular rack or service node. The Get Fresh Data button allows you to see current performance data.

Troubleshooting

This section describes the following troubleshooting tools: the dbdump, tempo-info-gather, and cminfo commands.

dbdump Command

You can run the dbdump script to see an inventory of the Altix ICE database.

The dbdump command syntax is as follows:

/opt/sgi/sbin/dbdump --admin
/opt/sgi/sbin/dbdump --leader
/opt/sgi/sbin/dbdump --rack  [--rack ]
/opt/sgi/sbin/dbdump

  • Use the --admin argument to dump the system admin controller (admin node).

  • Use the --leader argument to dump all rack leader controllers (leader nodes).

  • Use the --rack argument to dump a specific rack.

  • Use the dbdump command without any argument to dump the entire Altix ICE system.

EXAMPLES

Example 5-1. dbdump Command Examples

To dump the entire database, perform the following:

system-admin:~ # dbdump
0 is { cluster=oscar ifname=service0-bmc dev=bmc0 ip=172.24.0.3 net=head-bmc node=service0
  nodetype=oscar_service mac=00:30:48:8e:
1 is { cluster=oscar ifname=service0 dev=eth0 ip=172.23.0.3 net=head node=service0
  nodetype=oscar_service mac=00:30:48:33:53:2e }
2 is { cluster=oscar ifname=service0-ib0 dev=ib0 ip=10.148.0.2 net=ib-0 node=service0
  nodetype=oscar_service }
3 is { cluster=oscar ifname=service0-ib1 dev=ib1 ip=10.149.0.2 net=ib-1 node=service0
  nodetype=oscar_service }
4 is { cluster=oscar dev=eth0 ip=128.162.244.86 net=public node=oscar_server
  nodetype=oscar_server mac=00:30:48:34:2B:E0 }
...


Note: Some of the sample output in this section has been modified to fit the format of this manual.


To dump just the rack leader controller, perform the following:

system-admin:~ # /opt/sgi/sbin/dbdump --leader
0 is { cluster=rack1 ifname=r1lead-bmc dev=bmc0 ip=172.24.0.2 net=head-bmc node=r1lead
  nodetype=oscar_leader mac=00:30:48:8a:a4:c2 }
1 is { cluster=rack1 ifname=lead-bmc dev=eth0 ip=192.168.160.1 net=bmc node=r1lead
  nodetype=oscar_leader mac=00:30:48:33:54:9e }
2 is { cluster=rack1 ifname=lead-eth dev=eth0 ip=192.168.159.1 net=gbe node=r1lead
  nodetype=oscar_leader mac=00:30:48:33:54:9e }
3 is { cluster=rack1 ifname=r1lead dev=eth0 ip=172.23.0.2 net=head node=r1lead
  nodetype=oscar_leader mac=00:30:48:33:54:9e }
4 is { cluster=rack1 ifname=r1lead-ib0 dev=ib0 ip=10.148.0.1 net=ib-0 node=r1lead
  nodetype=oscar_leader }
5 is { cluster=rack1 ifname=r1lead-ib1 dev=ib1 ip=10.149.0.1 net=ib-1 node=r1lead
  nodetype=oscar_leader }

To dump just one rack, perform the following:

system-admin:~ # /opt/sgi/sbin/dbdump --rack 1
0 is { cluster=rack1 ifname=i0n0-bmc dev=bmc0 ip=192.168.160.10 net=bmc node=r1i0n0
  nodetype=oscar_clients mac=00:30:48:7a:a7:96 }
1 is { cluster=rack1 ifname=i0n0-eth dev=eth0 ip=192.168.159.10 net=gbe node=r1i0n0
  nodetype=oscar_clients mac=00:30:48:7a:a7:94 }
2 is { cluster=rack1 ifname=r1i0n0-ib0 dev=ib0 ip=10.148.0.3 net=ib-0 node=r1i0n0
  nodetype=oscar_clients }
3 is { cluster=rack1 ifname=r1i0n0-ib1 dev=ib1 ip=10.149.0.3 net=ib-1 node=r1i0n0
  nodetype=oscar_clients }
4 is { cluster=rack1 ifname=i0n1-bmc dev=bmc0 ip=192.168.160.11 net=bmc node=r1i0n1
  nodetype=oscar_clients mac=00:30:48:7a:a7:86 slot=1 }
5 is { cluster=rack1 ifname=i0n1-eth dev=eth0 ip=192.168.159.11 net=gbe node=r1i0n1
  nodetype=oscar_clients mac=00:30:48:7a:a7:84 slot=1 }
6 is { cluster=rack1 ifname=r1i0n1-ib0 dev=ib0 ip=10.148.0.4 net=ib-0 node=r1i0n1
  nodetype=oscar_clients slot=1 }
7 is { cluster=rack1 ifname=r1i0n1-ib1 dev=ib1 ip=10.149.0.4 net=ib-1 node=r1i0n1
  nodetype=oscar_clients slot=1 }
8 is { cluster=rack1 ifname=i0n10-bmc dev=bmc0 ip=192.168.160.20 net=bmc node=r1i0n10
  nodetype=oscar_clients slot=10 }
9 is { cluster=rack1 ifname=i0n10-eth dev=eth0 ip=192.168.159.20 net=gbe node=r1i0n10
  nodetype=oscar_clients slot=10 }
10 is { cluster=rack1 ifname=r1i0n10-ib0 dev=ib0 ip=10.148.0.13 net=ib-0 node=r1i0n10
  nodetype=oscar_clients slot=10 }
...
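
When you need to search dbdump output, it can help to turn each numbered entry into a dictionary. The following Python sketch assumes the layout shown in the examples above: each entry starts with "N is {", may continue on following lines, and consists of whitespace-separated key=value pairs whose values contain no spaces. The function name parse_dbdump and the added "index" key are hypothetical.

```python
import re

RECORD_START = re.compile(r"^(\d+) is \{")

def parse_dbdump(text):
    """Parse dbdump-style text into a list of dicts, one per numbered entry."""
    records = []
    current = None
    for line in text.splitlines():
        start = RECORD_START.match(line)
        if start:
            # A new entry begins; remember its number under a separate key.
            current = {"index": int(start.group(1))}
            records.append(current)
            line = line[start.end():]
        elif current is None:
            continue  # text before the first entry
        for token in line.split():
            if "=" in token:  # skips "}" and "..." tokens
                key, value = token.split("=", 1)
                current[key] = value
    return records

# Entries copied from the "--rack 1" sample output above.
sample = """\
0 is { cluster=rack1 ifname=i0n0-bmc dev=bmc0 ip=192.168.160.10 net=bmc node=r1i0n0
  nodetype=oscar_clients mac=00:30:48:7a:a7:96 }
1 is { cluster=rack1 ifname=i0n0-eth dev=eth0 ip=192.168.159.10 net=gbe node=r1i0n0
  nodetype=oscar_clients mac=00:30:48:7a:a7:94 }
"""
records = parse_dbdump(sample)
for rec in records:
    print(rec["node"], rec["net"], rec["ip"])
```

Starting a new record at each "N is {" line, rather than searching for a closing brace, keeps the parser usable on output that has been truncated to fit the page.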


tempo-info-gather Command

The tempo-info-gather command enables you to collect vital system data, which is especially useful when troubleshooting problems. It collects information about the following:

  • Digital media dminfo files, syslogs, Dynamic Host Configuration Protocol (DHCP), network file system (NFS)

  • MySQL cluster database dump

  • Network service configuration files, for example, C3, Ganglia, DHCP, domain name service (DNS) configuration files

  • A list of installed system images

  • Log files in /var/log/messages

  • Chassis management controller (CMC) slot table for each rack

  • Basic input/output system (BIOS), baseboard management controller (BMC), CMC, and InfiniBand fabric software versions from all Altix ICE nodes

To see a usage statement for the tempo-info-gather command, perform the following:

system-admin:/opt/sgi/sbin # tempo-info-gather  -h
 usage: tempo-info-gather [-h] [-P path] [-o file]
        tempo-info-gather -h            # Print this usage page
        tempo-info-gather -o file       # Tar and gzip the directories into file (imply -n)
        tempo-info-gather -p path       # Directory to write the data (default /var/tmp/tempo)

cminfo Command

The cminfo command is used internally by many of the SGI Tempo scripts that are used to discover, configure, and manage an SGI Altix ICE system.

In a troubleshooting situation, you can use it to gather information about your system. To see a usage statement from a rack leader controller, perform the following:

r1lead:~ # cminfo --help
Usage: cminfo [--bmc_base_ip|--bmc_ifname|--bmc_iftype|--bmc_ip|--bmc_mac|--bmc_netmask|--bmc_nic|
--dns_domain|--gbe_base_ip|--gbe_ifname|--gbe_iftype|--gbe_ip|--gbe_mac|--gbe_netmask|--gbe_nic|
--head_base_ip|--head_bmc_base_ip|--head_bmc_ifname|--head_bmc_iftype|--head_bmc_ip|--head_bmc_mac|
--head_bmc_netmask|--head_bmc_nic|--head_ifname|--head_iftype|--head_ip|--head_mac|--head_netmask|
--head_nic|--ib_0_base_ip|--ib_0_ifname|--ib_0_iftype|--ib_0_ip|--ib_0_mac|--ib_0_netmask|--ib_0_nic|
--ib_1_base_ip|--ib_1_ifname|--ib_1_iftype|--ib_1_ip|--ib_1_mac|--ib_1_netmask|--ib_1_nic|--name|--rack]

EXAMPLES

Example 5-2. cminfo Command Examples

To see the base IP address of the BMC network, perform the following:

r1lead:~ # cminfo --bmc_base_ip
192.168.160.0

To see the rack leader DNS domain, perform the following:

r1lead:~ # cminfo --dns_domain
ice.domain_name.mycompany.com

To see the network interface card (NIC) used for the BMC network, perform the following:

r1lead:~ #  cminfo --bmc_nic
eth0

To see the base IP address of the ib1 InfiniBand fabric, perform the following:

r1lead:~ # cminfo --ib_1_base_ip
10.149.0.0
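
If you want to call cminfo from a script, a thin wrapper is one option. The Python sketch below assumes only what the examples above show: cminfo takes a single option and prints a one-line answer. The wrapper name, the run parameter, and the fake_run stand-in (which replays the sample outputs so that the sketch runs without an Altix ICE system) are hypothetical.

```python
import subprocess
from types import SimpleNamespace

def cminfo(option, run=subprocess.run):
    """Return the one-line answer of `cminfo --<option>` as a string."""
    completed = run(["cminfo", "--" + option],
                    capture_output=True, text=True, check=True)
    return completed.stdout.strip()

def fake_run(argv, **kwargs):
    """Stand-in runner that replays the sample outputs shown above."""
    canned = {
        "--bmc_base_ip": "192.168.160.0\n",
        "--bmc_nic": "eth0\n",
        "--ib_1_base_ip": "10.149.0.0\n",
    }
    return SimpleNamespace(stdout=canned[argv[1]])

# On a rack leader you would omit run=fake_run to invoke the real command.
print(cminfo("bmc_base_ip", run=fake_run))
print(cminfo("bmc_nic", run=fake_run))
```

Passing the runner as a parameter keeps the wrapper easy to exercise off-cluster; with the default runner, check=True causes a nonzero cminfo exit status to raise subprocess.CalledProcessError.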