Chapter 3. System Operation

This chapter describes how to use the SGI Tempo systems management software to operate your Altix ICE system and covers the following topics:

SGI Altix ICE Software

This section describes SLES10 services turned off on compute nodes by default, how you can customize the software running on compute nodes or service nodes, create a simple clone image of compute node or service node software, how to use the cimage command, how to use yum to install packages into software images, and how to use the mksiimage command to create compute and service node images. It covers these topics:

Compute Node Services Turned Off by Default

Currently, the compute nodes run the SUSE Linux Enterprise Server 10 (SLES10) Service Pack 1 (SP1) Linux distribution. To improve the performance of applications running MPI jobs on compute nodes, the following SLES10 services are turned off:

  • acpid

  • auditd

  • boot.crypto

  • boot.device-mapper

  • boot.lvm

  • boot.md

  • cron

  • earlykbd

  • earlysyslog

  • fbset

  • irq_balancer

  • kbd

  • novell-zmd

  • nscd

  • postfix

  • powersaved

  • resmgr

  • slpd

  • splash

  • splash_early

  • suseRegister

  • xdm

Customizing Software On Your SGI Altix ICE System

This section discusses how to manage various nodes on your SGI Altix ICE system. It describes how to configure the various nodes, including the compute and service nodes, and how to augment software packages. Many package management tasks can be performed in more than one valid way.

For information on installing patches and updates, see “Installing SGI Tempo Patches and Updating SGI Altix ICE Systems” in Chapter 2.

Compute Node Per-Host Customizations

You can add per-host compute node customization to the compute node images. You do this by adding scripts either to the /opt/sgi/share/per-host-customization/global/ directory or the /opt/sgi/share/per-host-customization/mynewimage/ directory on the system admin controller.


Note: When creating custom images for compute nodes, make sure you clone the original SGI images. This keeps the original images intact so that you can fall back to them if necessary.


Scripts in the global directory apply to all compute node images. Scripts in a directory named for an image apply only to that image. The scripts are cycled through once per host when the image is installed on the rack leader controllers. Each script receives one input argument, the full path (on the rack leader controller) to the per-host base directory, for example, /var/lib/sgi/mynewimage/i2n11. A README file at /opt/sgi/share/per-host-customization/README on the system admin controller reads as follows:

This directory contains compute node image customization scripts which are
  executed as part of the install-image operations on the leader nodes when
  pulling over a new compute node image.

  After the image has been pulled over, and the per-host-customization dir has
  been rsynced, the per-host /etc and /var directories are populated, then the
  scripts in this directory are cycled through once per-host.  This allows the
  scripts to source the node specific network and cluster management settings,
  and set node specific settings.

  Scripts in the global directory are iterated through first, then if a
  directory exists that matches the image name, those scripts are iterated
  through next.

  You can use the scripts in the global directory as examples.

An example global script, /opt/sgi/share/per-host-customization/global/sgi-hostname, is as follows:

#!/bin/sh
  #
  # Copyright (c) 2007 Silicon Graphics, Inc.
  # All rights reserved.
  #
  # Set the compute node's hostname to the cluster unique name
  #
  # This script is executed once per-host as part of the install-image operation
  # run on the leader nodes, which is called from cimage on the admin node.
  # The full path to the per-host iru+slot directory is passed in as $1,
  # e.g. /var/lib/sgi/per-host//i2n11.
  #

  # sanity checks
  . /opt/sgi/share/per-host-customization/global/sanity.sh

  iruslot=$1

  # source cluster management information
  . ${iruslot}/etc/opt/sgi/cminfo

  # set hostname of blade to cluster unique name
  echo ${NAME} > ${iruslot}/etc/HOSTNAME
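A site-specific script can follow the same pattern. The sketch below is hypothetical: the script name sgi-motd, the function name, and the message text are assumptions, and only the cminfo path and the NAME variable come from the SGI example above. For brevity it omits the sanity.sh checks and simply returns when the cminfo file is missing.

```shell
#!/bin/sh
# Hypothetical per-host customization script (the name sgi-motd is an assumption).
# Writes a node-specific message of the day into the per-host directory
# that install-image passes in as $1.

write_motd() {
    iruslot=$1
    cminfo="${iruslot}/etc/opt/sgi/cminfo"

    # Do nothing if the cluster management settings are not there.
    [ -r "$cminfo" ] || return 0

    # cminfo defines NAME, the cluster unique name of this node.
    . "$cminfo"
    echo "Welcome to ${NAME}" > "${iruslot}/etc/motd"
}

# Act only when a per-host directory was supplied.
if [ -n "${1:-}" ]; then
    write_motd "$1"
fi
```

Placed in the global directory, a script like this would run for every compute node image; placed in a directory named for one image, it would run for that image only.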

SGI Altix ICE System Configuration Framework

All node types that are part of an SGI Altix ICE system can have configuration settings adjusted by the configuration framework. There is some overlap between the per-host customization instructions and the configuration framework instructions. Each approach plays a role in configuring your system. The major differences between the two methods are as follows:

  • Per-host customization runs at the time an image is pushed to the rack leader controllers.

  • Per-host customization only applies to compute node images.

  • The Altix ICE system configuration framework can be used with all node types.

  • The system configuration framework is run when a new root is created, when the SuSEconfig command is run for some other reason, or as part of a yum operation.

This framework exists to make it easy to adjust configuration items. SGI-supplied scripts are already present, and you can add more scripts as you wish. If you decide that a certain script should not be run, you can exclude it without removing it. The following questions and bulleted answers describe how to use the system configuration framework.

How does the system configuration framework operate?

Framework scripts can be added, for example, to a running service node or to an already created service or compute image. Remember that images destined for compute nodes must be pushed with the cimage command after they are altered. For more information, see “cimage Command”.

  • The /opt/sgi/lib/cluster-configuration script is called. Where it is called from is described below.

  • That script iterates through scripts residing in /etc/opt/sgi/conf.d.

  • Any scripts listed in /etc/opt/sgi/conf.d/exclude are skipped, as are scripts that are not executable.

  • Scripts in the system configuration framework must be tolerant of files that do not exist yet, as described below. For example, check that a syslog configuration file exists before trying to adjust it.
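The iteration described in the list above can be pictured with the following sketch. It is not the SGI implementation of /opt/sgi/lib/cluster-configuration. The function name run_conf_d is hypothetical, and the sketch assumes that the exclude file lists one script name per line.

```shell
#!/bin/sh
# Illustrative re-creation of the behaviour described above; not SGI's code.
# Assumption: the exclude file holds one script name per line.

run_conf_d() {
    confdir=$1
    exclude="${confdir}/exclude"

    for script in "$confdir"/*; do
        name=${script##*/}

        # The exclude list itself is not a script.
        [ "$name" = "exclude" ] && continue

        # Skip anything that is not an executable regular file.
        [ -f "$script" ] && [ -x "$script" ] || continue

        # Skip scripts named in the exclude file.
        if [ -f "$exclude" ] && grep -qxF -- "$name" "$exclude"; then
            continue
        fi

        "$script"
    done
}
```

Under these assumptions, adding a script name to the exclude file or clearing its execute bit both stop the script from running without deleting it.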

Where is the framework called from?

  • The callout for /opt/sgi/lib/cluster-configuration is implemented as a yum plugin that executes after packages have been installed and cleaned.

  • There is also a SUSE configuration script in /sbin/conf.d, called SuSEconfig.00cluster-configuration, that calls the framework. This covers the case in which you are using YaST to install or upgrade packages.

  • One of the scripts called by the framework calls SuSEconfig. A check is made to avoid a callout loop.

How do I adjust my system configuration?

  • Create a small script in /etc/opt/sgi/conf.d to do the adjustment.

    Be sure that you test for existence of files and do not assume they are there (see "Why do scripts need to tolerate files that do not exist but should?" below).
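A minimal sketch of such a script follows. The helper name ensure_line and the file path and setting shown in the closing comment are hypothetical. The point is only the existence test that precedes the adjustment.

```shell
#!/bin/sh
# Hypothetical conf.d helper: append a line to a configuration file,
# but only if that file has already been installed.

ensure_line() {
    file=$1
    line=$2

    # The file may not have been installed yet: do nothing in that case.
    [ -f "$file" ] || return 0

    # Append the line only if it is not already present (safe to re-run).
    grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
}

# A real script would end with a call such as this one (illustrative values):
#   ensure_line /etc/sysconfig/example 'EXAMPLE_SETTING="yes"'
```

Because the framework can run more than once, the helper is written so that repeated runs leave the file unchanged after the first successful adjustment.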

Why do scripts need to tolerate files that do not exist but should?

  • The mksiimage command runs yume and yum in two steps. The first step installs only 40 or so RPMs, but the framework is called then too. The second step installs the remaining hundreds of RPMs. The framework is therefore called once before many packages are installed and again after everything is in place, so some files you expect might not be available the first time your script is called.

How does the yum plugin work?

  • For the yum plugin to work, the /etc/yum.conf file must have plugins=1 set. SGI Tempo software ensures that the setting is in place by way of a trigger in the sgi-cluster package. Any time yum is installed or updated, the trigger verifies that plugins=1 is set.
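To confirm the setting by hand, a check along the following lines could be used. The function name is hypothetical, and the pattern assumes the setting appears on a line of its own, optionally with spaces around the equals sign.

```shell
#!/bin/sh
# Report whether a yum configuration file enables plugins.
# The function name is hypothetical; plugins=1 is the setting named above.

yum_plugins_enabled() {
    conf=${1:-/etc/yum.conf}

    # An unreadable or missing file counts as "not enabled".
    [ -r "$conf" ] || return 1

    # Match "plugins=1", allowing optional spaces around the equals sign.
    grep -Eq '^[[:space:]]*plugins[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf"
}
```

For example, `yum_plugins_enabled && echo "plugins enabled"` prints a message only when the default /etc/yum.conf has the setting.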

How does yume work?

  • yume, an OSCAR wrapper for yum, works by creating a temporary yum configuration file in /tmp and then pointing yum at it. This temporary configuration file must have plugins enabled, and a small patch to yume makes this happen. The patch covers both yume and mksiimage, which calls yume as part of its operation.

Customizing Software Images


Note: Procedures in this section describe how to work with service node and compute node images. Always use a cloned image. If you are adjusting an RPM list, use your own copy of the RPM list.


The service and compute node images are created during the configure-cluster operation (or during your upgrade to SGI ProPack 5 SP3 if you were running SGI ProPack 5 SP2 previously). This process uses an RPM list to generate a root on the fly.

You can clone a compute node image and work on the copy, or you can make a new compute node image from the SGI-supplied default RPM list. For service nodes, SGI does not support a clone operation.

Procedure 3-1. Clone a Compute Node Image

    To clone a compute node image, perform the following steps:

    1. From the system admin controller, create a clone of the compute node image, as follows:

      # cimage --clone-image compute-sles10sp1 new

      After that command is complete, you will have a new image located in /var/lib/systemimager/images/new on the system admin controller.

    2. To see that the image is now available, perform the following command:

       # cimage --list-images
      image: compute-sles10sp1
             kernel: 2.6.16.46-0.12-carlsbad
             kernel: 2.6.16.46-0.12-smp
      
      image: new
             kernel: 2.6.16.46-0.12-carlsbad
             kernel: 2.6.16.46-0.12-smp
      

    The default RPM lists are located in /etc/opt/sgi/rpmlists on the system admin controller. SGI suggests that you never change these files, but instead create your own versions using the ones supplied by SGI as a base.

    Please note, it is important that certain packages be in the rpmlist for a given node. For example, an rpmlist used for compute nodes should have packages sgi-compute-node and sgi-cluster. Service nodes must have sgi-service-node and sgi-cluster.

    Procedure 3-2. Manually Adding a Package to a Compute Node Image

      To manually add a package to a compute node image, perform the following steps:

      1. Make a clone of the compute node image, as described in “Customizing Software Images”.

      2. Determine what images and kernels you have available now, as follows:

        # cimage --list-images
          image: compute-sles10sp1
                 kernel: 2.6.16.46-0.12-carlsbad
                 kernel: 2.6.16.46-0.12-smp
        
          image: compute-sles10sp1-clone
                 kernel: 2.6.16.46-0.12-carlsbad
                 kernel: 2.6.16.46-0.12-smp
        

      3. From the system admin controller, change directory to the images directory, as follows:

        # cd /var/lib/systemimager/images/

      4. From the system admin controller, copy the RPMs you wish to add, as follows:

        # cp /newrpm.rpm new/tmp

      5. The new RPMs now reside in the /tmp directory in the image named new. To install them into your new compute node image, perform the following commands:

        # chroot new bash

        And then perform the following:

        # rpm -Uvh /tmp/newrpm.rpm

      6. The image on the system admin controller is updated. However, you still need to push the changes out. Ensure there are no nodes currently using the image and then run this command:

        # cimage --push-rack new r\*

        This pushes the updates to the rack leader controllers, and the compute nodes see the changes the next time they start up. For information on how to ensure the image is associated with a given node, see the cimage --set command and the example in Procedure 3-3.
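Steps 3 through 6 of the procedure above can be gathered into one function, as in the following sketch. The function name and the IMAGES_DIR variable are hypothetical. The sketch runs rpm directly through chroot rather than starting an interactive bash, and it does not check whether any nodes are currently using the image, so that check remains a manual step.

```shell
#!/bin/sh
# Sketch of Procedure 3-2 as one function. The function name and the
# IMAGES_DIR variable are hypothetical; the commands are those shown above.

IMAGES_DIR=${IMAGES_DIR:-/var/lib/systemimager/images}

add_rpm_to_image() {
    image=$1
    rpmfile=$2
    root="${IMAGES_DIR}/${image}"

    [ -d "$root" ] || { echo "no such image: $image" >&2; return 1; }
    [ -f "$rpmfile" ] || { echo "no such file: $rpmfile" >&2; return 1; }

    # Copy the package into the image, then install it from inside the image.
    cp "$rpmfile" "$root/tmp/" || return 1
    chroot "$root" rpm -Uvh "/tmp/$(basename "$rpmfile")" || return 1

    # Push the updated image to the rack leader controllers in all racks.
    cimage --push-rack "$image" 'r*'
}
```

A call such as `add_rpm_to_image new /newrpm.rpm` would then correspond to the example values used in the procedure.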

      Procedure 3-3. Creating a Simple Compute Node Image Clone


        Note: Always work from a cloned image. See “Customizing Software Images”.


        To create a simple compute node image clone from the system admin controller, perform the following steps:

        1. To clone the compute node image, perform the following:

          # cimage --clone-image compute-sles10sp1 compute-sles10sp1-clone

        2. To see the images and kernels in the list, perform the following:

          # cimage --list-images
          image: compute-sles10sp1
                 kernel: 2.6.16.46-0.12-carlsbad
                 kernel: 2.6.16.46-0.12-smp
          
          image: compute-sles10sp1-clone
                 kernel: 2.6.16.46-0.12-carlsbad
                 kernel: 2.6.16.46-0.12-smp

        3. To change the compute nodes to use the cloned image/kernel pair, perform the following:

          # cimage --set compute-sles10sp1-clone 2.6.16.46-0.12-smp "r*i*n*"

        Procedure 3-4. Manually Adding a Package to the Service Node Image

          To manually add a package to the service node image, perform the following steps:

          1. Use the mksiimage command to create your own version of the service node image. See “Creating Compute and Service Images Using the mksiimage Command”.

          2. Change directory to the images directory, as follows:

            # cd /var/lib/systemimager/images/

          3. From the system admin controller, copy the RPMs you wish to add, as follows, where my-service-image is your own service node image:

            # cp /newrpm.rpm my-service-image/tmp

          4. The new RPMs now reside in the /tmp directory in the image named my-service-image. To install them into your new service node image, perform the following commands:

            # chroot my-service-image bash

            And then perform the following:
            # rpm -Uvh /tmp/newrpm.rpm

            At this point, the image has been updated with the RPM. Note that, unlike compute node images, changes made to a service node image are not seen by service nodes until they are re-installed with the image. If you wish to install the package on running systems, you can copy the RPM to the running system and use the rpm command there.

          cimage Command

          The cimage command allows you to list, modify, and set software images on the compute nodes in your system.

          The cimage command accepts the following options:

          Option                          Description

          --help                          Usage and help text
          --list-images                   Lists images present in the database
          --list-nodes RACK ...           Lists what compute nodes are set to
          --set IMAGE KERNEL NODE ...     Sets the compute nodes to a certain boot image and kernel combination
          --add-db IMAGE                  Adds an image to the database
          --del-db IMAGE                  Deletes an image from the database
          --push-rack IMAGE RACK ...      Pushes an image to specified rack(s)
          --del-rack IMAGE RACK           Deletes an image from specified rack(s)
          --clone-image OIMAGE NIMAGE     Clones an existing image to a new image
          --del-image IMAGE               Deletes an existing image entirely

          RACK arguments take the format rX.

          NODE arguments take the format rXiYnZ.

          X, Y, Z can be single digits, a [start-end] range, or * for all matches.

          ... indicates more than one RACK or NODE argument can be passed in.
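A wrapper script can check a NODE argument against this format before passing it to cimage. The helper below is a hypothetical sketch, not part of cimage. It assumes that each of X, Y, and Z is a number of one or more digits, a [start-end] range, or *.

```shell
#!/bin/sh
# Check whether a string has the rXiYnZ shape accepted for NODE arguments.
# Hypothetical helper; assumes each of X, Y, Z is a number, a [start-end]
# range, or *.

is_node_spec() {
    part='(\*|[0-9]+|\[[0-9]+-[0-9]+])'
    printf '%s\n' "$1" | grep -Eq "^r${part}i${part}n${part}\$"
}
```

For example, `is_node_spec 'r1i[0-2]n[5-6]'` succeeds, while `is_node_spec service0` fails because service0 is not a compute node name.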

          EXAMPLES

          Example 3-1. cimage Command Examples

          The following examples walk you through some typical cimage command operations.

          To list the available images and their associated kernels, perform the following:

          # cimage --list-images
          
          image: compute-sles10sp1
                  kernel: 2.6.16.46-0.12-carlsbad
                  kernel: 2.6.16.46-0.12-smp

          To list the compute nodes in rack 1 and the image and kernel they are set to boot, perform the following:

          # cimage --list-nodes r1
          r1i0n0: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n3: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n4: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n5: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n6: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n7: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n8: compute-sles10sp1 2.6.16.46-0.7-smp
          [...snip...]

          To set the r1i0n0 compute node to boot the 2.6.16.46-0.12-carlsbad kernel from the compute-sles10sp1 image, perform the following:

          # cimage --set compute-sles10sp1 2.6.16.46-0.12-carlsbad r1i0n0

          To list the nodes in rack 1 to see the changes set in the example above, perform the following:

          # cimage --list-nodes r1
          r1i0n0: compute-sles10sp1 2.6.16.46-0.7-carlsbad
          r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
          r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
          [...snip...]

          To set all nodes in all racks to boot the 2.6.16.46-0.7-carlsbad kernel from the compute-sles10sp1 image, perform the following:

          # cimage --set compute-sles10sp1 2.6.16.46-0.7-carlsbad r*i*n*
          
          
          

          To set two ranges of nodes to boot the 2.6.16.46-0.7-smp kernel, perform the following:

          # cimage --set compute-sles10sp1 2.6.16.46-0.7-smp r1i[0-2]n[5-6] r1i[2-3]n[0-4] 
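Because *, [, and ] are also shell file name patterns, the shell can rewrite an unquoted node argument before cimage sees it if a matching file name exists in the current directory. The following sketch demonstrates the effect with an ordinary file. The helper name and the file are contrived for illustration.

```shell
#!/bin/sh
# Demonstrate shell pattern expansion on a node specification.

# Print the first argument exactly as the shell passes it to a command.
show_arg() {
    printf '%s\n' "$1"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/r1i0n0"          # contrived file whose name matches the pattern

# Unquoted: the shell replaces the pattern with the matching file name.
unquoted=$(cd "$demo_dir" && show_arg r*i*n*)

# Quoted: the pattern reaches the command unchanged.
quoted=$(cd "$demo_dir" && show_arg "r*i*n*")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Quoting the argument, as in `"r*i*n*"`, or escaping the wildcard, as in `r\*`, avoids the problem.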

          To clone the compute-sles10sp1 image to a new image (so that you can modify it), perform the following:

          # cimage --clone-image compute-sles10sp1 mynewimage
          Cloning compute-sles10sp1 to mynewimage ... done

          The clone process adds the image and its kernels to the database.

          To change the cloned image created in the example above, copy the needed RPMs into the /var/lib/systemimager/images/tmp directory, use the chroot command to enter the directory, and then install the RPMs, as follows:

          # cp .rpm /var/lib/systemimager/images//tmp
          # chroot /var/lib/systemimager/images// bash
          # rpm -Uvh /tmp/.rpm

          If you make changes to the kernels in the image, you need to refresh the kernel database entries for your image. To do this, perform the following:

          # cimage --del-db mynewimage
          # cimage --add-db mynewimage

          If you did not make changes to the kernels in the cloned image created in the example above, you can omit this step.

          To push new software images out to the compute blades in a rack or set of racks, perform the following:

          # cimage --push-rack mynewimage r*
          r1lead: install-image: mynewimage
          r1lead: install-image: mynewimage done.

          To list the images in the database and the kernels they contain, perform the following:

          # cimage --list-images
          
          image: compute-sles10sp1
                  kernel: 2.6.16.46-0.7-carlsbad
                  kernel: 2.6.16.46-0.7-smp
          
          image: mynewimage
                  kernel: 2.6.16.46-0.7-carlsbad
                  kernel: 2.6.16.46-0.7-smp

          To set some compute nodes to boot an image, perform the following:

          # cimage --set mynewimage 2.6.16.46-0.7-smp r1i3n*

          You need to reboot the compute nodes to run the new images.

          To completely remove an image you no longer use, both from the system admin controller and from all compute nodes in all racks, perform the following:

          # cimage --del-image mynewimage
          r1lead: delete-image: mynewimage
          r1lead: delete-image: mynewimage done.


          Using yum to Install Packages into Software Images

          The packages that make up SGI Tempo and SLES are available in repositories that yum is configured to use.


          Note: Always work with copies of software images.


          SGI provides a wrapper around yum that makes it simple to install a package into an image.

          However, yum only looks at packages that are part of a yum repository. If you are installing your own RPM, you need to configure yum to look at your own repositories in addition to the others. See the appropriate yum documentation.


          Note: The yum software maintains a cache of the repository metadata, and it does not update this cache until a certain number of minutes have passed. This time limit is defined by the metadata_expire option of the yum.conf file. See the yum.conf(5) man page. If you synchronize your repository to SUSE or Novell shortly after a yum operation and notice that yum reports no updates when there should be some, you can run this command to clear the caches and force yum to try again: yum clean all.


          This example shows how to install the zlib-devel package into the service node image so that the next time you image or install a service node, it has this new package.

          You can run the following command:

          # yum-image-wrapper /var/lib/systemimager/images/my-service-sles10sp1 install zlib-devel

          Perform a similar command for compute nodes. Note the following:

          • If you update a compute node image on the system admin controller, you have to use the cimage command to push the changes.

          • If you update a service node image on the system admin controller, that service node needs to be re-installed/re-imaged to get the change. The discover command can be given an alternate image.

          For more information on using yum, see “Installing SGI Tempo Patches and Updating SGI Altix ICE Systems” in Chapter 2.

          Using yum to Install Packages on Running Service Nodes

          These instructions only apply to service nodes.

          You can use the yum command to install a package on a service node. From the system admin controller, you can issue a command like the following. Please note that SGI suggests using the -y option. This prevents yum from asking for input.

          # ssh service0 yum install zlib-devel
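When the same package is needed on several service nodes, the command can be wrapped in a loop that includes the -y option. The function name below is hypothetical, and the node names in the usage line are only examples.

```shell
#!/bin/sh
# Install one package on several running service nodes.
# The function name and node names are illustrative.

install_on_service_nodes() {
    pkg=$1
    shift
    for node in "$@"; do
        echo "== $node"
        # -y answers yum's prompts so the command can run unattended.
        ssh "$node" yum -y install "$pkg" || echo "install failed on $node" >&2
    done
}
```

For example: `install_on_service_nodes zlib-devel service0 service1`.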

          For more information on using yum, see “Installing SGI Tempo Patches and Updating SGI Altix ICE Systems” in Chapter 2.

          Creating Compute and Service Images Using the mksiimage Command

          You can create service node and compute node images using the mksiimage(1) command. This will generate a root on the fly.

          Fresh installations of SGI ProPack 5 SP3 create these images during the configure-cluster installation step.

          The RPM lists that determine which packages are installed in the images are located in /etc/opt/sgi/rpmlists, for example, /etc/opt/sgi/rpmlists/compute-sles10sp1.rpmlist. Do not edit the default SGI RPM lists; instead, make copies and work on the copies.

          Procedure 3-5. Use mksiimage to Create a Service Node Image

            To use the mksiimage command to create a service node image, perform the following:

            1. Make a copy of the example service node image RPM list and work on the copy, as follows:

              # cp /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist /etc/opt/sgi/rpmlists/my-service-node.rpmlist

            2. Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.

            3. Run the mksiimage command to create the root. This example uses /var/lib/systemimager/images/my-service-node-image as the home for this image.

              This command may take a long time and has lots of output. You could consider redirecting output to a /tmp file. There is so much output that running this from a serial console more than doubles the time it takes to complete if the output is not redirected.

            4. Execute the following command:

              # mksiimage -A --name my-service-node-image \
                  --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64 \
                  --filename /etc/opt/sgi/rpmlists/my-service-node.rpmlist

            5. Post-process the image. This runs several commands that integrate and update the image for use in your Altix ICE system:

              # post-process-sgi-image /var/lib/systemimager/images/my-service-node-image/ eth1

            6. The image is now ready to be used with service nodes. To associate the new image with a service node, use the discover command. See “Installing a Service Node with a Non-default Image”.
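The suggestion in step 3 to redirect the mksiimage output can be combined with the post-processing step, as in the following sketch. The function name, the argument order, and the default log location are assumptions. The repository locations and the eth1 argument are copied from the procedure above.

```shell
#!/bin/sh
# Sketch of Procedure 3-5 with mksiimage output sent to a log file.
# Function name, argument order, and log location are assumptions.

build_service_image() {
    name=$1
    rpmlist=$2
    log=${3:-/tmp/mksiimage-$name.log}
    repos=/tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64

    echo "Building $name; output goes to $log"
    mksiimage -A --name "$name" --location "$repos" --filename "$rpmlist" \
        > "$log" 2>&1 || { echo "mksiimage failed, see $log" >&2; return 1; }

    # Integrate the new image for use in the system (step 5 above).
    post-process-sgi-image "/var/lib/systemimager/images/$name/" eth1
}
```

For example: `build_service_image my-service-node-image /etc/opt/sgi/rpmlists/my-service-node.rpmlist`.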

            Procedure 3-6. Use mksiimage to Create a Compute Node Image

              To use the mksiimage(1) command to create a compute node image, perform the following:

              1. Make a copy of the example compute node image RPM list and work on the copy, as follows:

                # cp /etc/opt/sgi/rpmlists/compute-sles10sp1.rpmlist /etc/opt/sgi/rpmlists/my-compute-node.rpmlist

              2. Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.

              3. Run the mksiimage command to create the root. This example uses /var/lib/systemimager/images/my-compute-node-image as the home for this image. This command may take a long time and has lots of output. You could consider redirecting output to a /tmp file. There is so much output that running this from a serial console more than doubles the time it takes to complete if the output is not redirected.

                # mksiimage -A --name my-compute-node-image \
                    --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64 \
                    --filename /etc/opt/sgi/rpmlists/my-compute-node.rpmlist

              4. Add the new image to the list of images cimage knows is available, as follows:

                # cimage --add-db my-compute-node-image


                Note: The order in which you execute the commands is important. Make sure that you run the cimage --add-db command before the post-process-sgi-image command.


              5. Post-process the image. This runs several commands that integrate and update the image for use in your Altix ICE system:

                # post-process-sgi-image /var/lib/systemimager/images/my-compute-node-image/ eth1

              6. For information on how to use the cimage command to push this new image to the rack leader controllers, see “cimage Command”.

              Installing a Service Node with a Non-default Image

              After you have updated or created a service node image, you can install that image on to a managed service node, such as a login node.


              Note: Re-installing the service node using the discover process will destroy everything previously on the root drive.


              By default, discover uses the SGI default service-sles10sp1 image. To install a different image, name it with the image= parameter, for example:

              # discover --service 2,image=my-service-node-image

              The image= parameter in the command above directs discover to install the named operating system image.

              For more information on the discover command, see “discover Command” in Chapter 2.

              Using a Custom Repository for Site Packages

              This section describes how to maintain packages specific to your site and have them available to yum and mksiimage.

              SGI suggests putting site-specific packages in a separate location. They should not reside in the same location as SGI or Novell supplied packages.

              Procedure 3-7. Setting Up a Custom Repository for Site Packages

                To set up a custom repository for your site packages, perform the following steps:

                1. Create a directory for your site-specific packages on the system admin controller, as follows:

                  # mkdir -p /tftpboot/site-local/sles-10-x86_64

                2. Copy your site packages to the new directory, as follows:

                  # cp my-package-1.0.x86_64.rpm /tftpboot/site-local/sles-10-x86_64

                3. Create the metadata so yum sees this as a repository, as follows:

                  # yume --prepare --repo /tftpboot/site-local/sles-10-x86_64

                4. If you wish to use yum (as opposed to only mksiimage), you can create a yum repository configuration file in /etc/yum.repos.d. Use tempo.repo as an example. You may need to turn off the gpgcheck option if your RPMs are not signed. Remember to deploy the repository to all images from which you wish to use this repository.

                5. If you wish your new package to be installed by default into an image created by mksiimage, you need to add it to an RPM list. Example RPM lists are in /etc/opt/sgi/rpmlists. Always work on your own copy; do not modify the SGI-supplied RPM lists. For more information, see “Creating Compute and Service Images Using the mksiimage Command”.

                  The following mksiimage command is similar to the one shown in Procedure 3-5, except that it adds your new repository to the list:

                  # mksiimage -A --name my-service-node-image \
                      --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64,/tftpboot/site-local/sles-10-x86_64 \
                      --filename /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist

                  This command creates a new root image, using your newly created repository as one of the sources, and adds the new package, assuming it is listed in /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist.

                6. Run post-process-sgi-image command as described in Procedure 3-5.
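For step 4 of the procedure above, a repository configuration file might be generated as in the following sketch. The repository id, the file name site-local.repo, and the gpgcheck=0 setting are assumptions. Compare the result with tempo.repo on your system before using it.

```shell
#!/bin/sh
# Write a yum repository definition for the site-local packages.
# The repository id, the file name, and gpgcheck=0 are assumptions.

write_site_repo() {
    dest=${1:-/etc/yum.repos.d/site-local.repo}
    cat > "$dest" <<'EOF'
[site-local]
name=Site-local packages
baseurl=file:///tftpboot/site-local/sles-10-x86_64
enabled=1
gpgcheck=0
EOF
}
```

Passing a path inside an image, for example a path under /var/lib/systemimager/images, would place the same file in that image instead of on the running system.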

                Power Management Commands

                The cpower command allows you to power up, power down, reset, and show the power status of system components.

                cpower Command

                The cpower command syntax is as follows:

                cpower [<option> ...] [<target_type>] [<action>] <target>

                The <option> argument can be one or more of the following:

                Option 

                Description

                --noleader 

                Do not include leader nodes (valid with rack and system domains only).

                --noservice 

                Do not include service nodes (valid with system domain only).

                --ipmi 

                Uses ipmitool to communicate. [default]

                --ssh 

                Uses ssh to communicate.

                --intelplus 

                Uses the -o intelplus option for ipmitool. [default] Note that you do not usually need to specify this.

                --force 

                Disables all “safety” checks when wildcards are used in the target. Make sure you really want to use this option.

                -n, --noexec 

                Displays, but does not execute, commands that affect power.

                -v, --verbose 

                Prints additional information on command progress.


                Note: The command will fail if the target contains any wildcards, unless the --all option is specified.


                The <target_type> argument is one of the following:

                --node 

                Applies the action to nodes. Nodes are compute nodes, rack leader controllers (leader nodes), system admin controller (admin node), and service nodes. [default]

                --iru 

                Applies the action at the IRU level.

                --rack 

                Applies the action at the rack level.

                --system 

                Applies the action to the system. You must not specify a target with this type.

                The <action> argument is one of the following:

                --status 

                Show the power status of the target, including whether it is booted or not. [default]

                --up | --on 

                Powers up the target.

                --down | --off 

                Powers down the target.

                --reset 

                Performs a hard reset on the target.

                --cycle 

                Power cycles the target.

                --boot 

                Boots up the target, unless it is already booted. Waits for all targets to boot.

                --reboot 

                Reboots the target, even if already booted. Waits for all targets to boot.

                --shutdown 

                Shuts down the target, but does not power it off. Waits for targets to shut down.

                --identify <interval> 

                Turns on the identifying LED for the specified interval in seconds. Use an interval of 0 to turn the LED off immediately.

                -h, --help 

                Shows help usage statement.

                The target must always be specified except when the --system option is used. Wildcards may be used, but be careful not to accidentally power off or reboot the leader nodes. If wildcard use affects any leader node, the command fails with an error.
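
Before running a wildcard command, it can help to check whether the pattern could also match a leader node name. The following is a minimal sketch, assuming that cpower wildcards behave like shell glob patterns (an assumption, not something this guide states). The pattern_hits_leader helper and the list of leader names passed to it are hypothetical.

```shell
# pattern_hits_leader is a hypothetical helper, not part of SGI Tempo.
# Usage: pattern_hits_leader <pattern> <leader-name>...
# Prints the first leader name that the pattern matches and returns 0,
# or returns 1 if the pattern matches none of the names.
pattern_hits_leader() {
    pattern=$1
    shift
    for leader in "$@"; do
        # An unquoted variable in a case label is treated as a glob pattern.
        case $leader in
            $pattern) echo "$leader"; return 0 ;;
        esac
    done
    return 1
}
```

For example, pattern_hits_leader 'r1*' r1lead r2lead reports r1lead, while the narrower pattern 'r1i*' matches no leader name.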

                Operations on Nodes

                The default for the cpower command is to operate on system nodes, such as compute nodes, leader nodes, or service nodes. If you do not specify --iru, --rack, or --system, the command defaults to operating as if you had specified --node.

                Here are examples of node target names:

                • r1i3n10

                  Compute node at rack 1, IRU 3, slot 10

                • service0

                  Service node 0

                • r3lead

                  Rack leader controller (leader node) for rack 3

                • r1i*n*

                  Wildcards let you specify ranges of nodes. For example, r1i*n* specifies all compute nodes in all IRUs on rack 1.
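
The compute node naming scheme above (rack, IRU, slot) can be taken apart in a script. The following is a minimal sketch using only POSIX shell parameter expansion. The parse_node_name helper is hypothetical and handles only compute node names of the form r<rack>i<iru>n<slot>.

```shell
# parse_node_name is a hypothetical helper, not part of SGI Tempo.
# It splits a compute node name such as r1i3n10 into rack, IRU, and slot.
parse_node_name() {
    name=$1
    rest=${name#r}        # strip the leading "r":   1i3n10
    rack=${rest%%i*}      # text before the "i":     1
    rest=${rest#*i}       # text after the "i":      3n10
    iru=${rest%%n*}       # text before the "n":     3
    slot=${rest#*n}       # text after the "n":      10
    # Each part must be a non-empty string of digits.
    for part in "$rack" "$iru" "$slot"; do
        case $part in
            ''|*[!0-9]*)
                echo "not a compute node name: $name" >&2
                return 1 ;;
        esac
    done
    # Rebuilding the name catches inputs that lack the leading "r".
    if [ "$name" != "r${rack}i${iru}n${slot}" ]; then
        echo "not a compute node name: $name" >&2
        return 1
    fi
    echo "rack=$rack iru=$iru slot=$slot"
}
```

Names such as r1lead or service0 are rejected, because they do not follow the compute node pattern.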

                IPMI-style Commands

                The default operation for the cpower command is to operate on nodes and to provide you the status of these nodes, as follows:

                # cpower r1i*

                This example gives you the power status and boot status of all the compute blades in rack 1. This command is equivalent to cpower --node --status r1i*.

                The following command issues an ipmitool power off command to all of the nodes specified by the wildcard:

                # cpower --off r2i*  

                The default is to apply to a node.
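
Because a wildcard power-off can affect many nodes, one cautious pattern is to preview the command with the --noexec option first. The following sketch wraps that step. The cpower_preview name is hypothetical, and the wrapper relies only on the documented --noexec option.

```shell
# cpower_preview is a hypothetical wrapper, not part of SGI Tempo.
# It passes its arguments to cpower with --noexec added, so the power
# commands are displayed but not executed.
cpower_preview() {
    cpower --noexec "$@"
}
```

For example, cpower_preview --off 'r2i*' displays what cpower --off r2i* would do. Quoting the wildcard keeps the shell from expanding it against file names in the current directory.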

                The following commands behave exactly as they would if you were using ipmitool directly, and have no special extra logic for ordering:

                • # cpower --up r1i*

                • # cpower --reset r1i*

                • # cpower --cycle r1i*

                • # cpower --identify 5 r1i*


                Note: --up is a synonym for --on and --down is a synonym for --off.


                IRU, Rack, and System Domains

                The cpower command contains more logic when you go up to higher levels of abstraction, for example, using --iru, --rack, and --system. These higher level domain specifiers tell the command to be smart about how to order the actions that you give on the command line.

                The --iru option tells the command to use correct ordering with IRU power commands. For a command such as cpower --iru --up r1*, it first connects to the CMC on each IRU in rack 1 to issue the power on command, which turns on power to the IRU chassis (this is not the equivalent ipmitool command). Then it powers up the compute nodes in the IRU. Powering down works in the opposite order: power to the IRU is turned off after power to the blades. IRU targets are specified as follows: r3i2 for rack 3, IRU 2. Here is an example:

                # cpower --iru --up r1* 

                The --rack option ensures that power commands to the leader node are done in the correct order relative to compute nodes within a rack. First, it powers up the leader node and waits for it to boot up (if it is not already up). Then it does the functional equivalent of a cpower --iru --up r4i* on each of the IRUs contained in the rack, including applying power to each IRU chassis. Using the --down option does the opposite, and also turns off the leader node (after doing a shutdown) after all the IRUs are powered down. To avoid including leader nodes in a power command for a rack, use the --noleader option. Rack targets are specified as follows: r4 for rack 4. Here is an example:

                # cpower --rack --up r4

                Commands with the --system option ensure that power up commands are applied first to service nodes, then to leader nodes, then to IRUs and compute blades, following the same ordering logic as the --iru and --rack options. Likewise, compute blades are powered down before IRUs, leader nodes, and service nodes, in that order. To avoid including service nodes in a system-domain command, use the --noservice option. Note that you must not specify a target with the --system option, since it applies to the whole Altix ICE system.

                Shutting Down and Booting

                In most cases, it is useful to be able to shut down a machine before turning off the power. The following cpower options enable you to do this: --shutdown, --boot, and --reboot. The --shutdown option is self-explanatory. The --reboot option ensures that a system is always rebooted, whereas --boot boots up a system only if it is not already booted. Thus, --boot is useful for booting up compute blades that have failed to start.


                Note: The IPMI power commands necessary to enable a system to boot (either a power reset or a power on) may be sent to a node, but a node that has been shut down with the --shutdown option does not have its power automatically turned off.


                The --shutdown option works on node, IRU, or rack domain levels. It shuts down nodes (in the correct order if you use the --iru or --rack options), and then leaves them as they are, with power still applied. Usually you may specify only one action per command; however, with the --shutdown option, you may also specify --off. Using both actions results in nodes being shut down and then powered off. This is particularly useful when powering off a rack, since otherwise the leaders may be shut down before there is a chance to power off the compute blades. Here is an example:

                # cpower --shutdown --rack r1
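
The rule that only one action is allowed per command, except for --shutdown combined with --off, can be checked in a wrapper script before cpower is called. The following is a minimal sketch of such a check. The actions_ok helper is hypothetical, and it treats only the literal --off spelling as combinable with --shutdown, because this guide does not say whether the --down synonym is accepted in that combination.

```shell
# actions_ok is a hypothetical pre-flight check, not part of cpower.
# It succeeds when the arguments contain at most one action, or exactly
# the two actions --shutdown and --off. Non-action arguments are ignored.
actions_ok() {
    count=0
    shutdown=0
    off=0
    for arg in "$@"; do
        case $arg in
            --status|--up|--on|--down|--reset|--cycle|--boot|--reboot|--identify)
                count=$((count + 1)) ;;
            --shutdown)
                count=$((count + 1)); shutdown=1 ;;
            --off)
                count=$((count + 1)); off=1 ;;
        esac
    done
    if [ "$count" -le 1 ]; then
        return 0
    fi
    # More than one action: only --shutdown together with --off is allowed.
    [ "$count" -eq 2 ] && [ "$shutdown" -eq 1 ] && [ "$off" -eq 1 ]
}
```

A script might call actions_ok "$@" && cpower "$@" so that an invalid combination such as --reset --cycle is rejected before any power command is sent.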

                To boot up systems that have not already been booted, perform the following:

                # cpower --boot  r1i2n*

                Again, the command boots up nodes in the right order if you specify the --iru or --rack options and the appropriate target. Otherwise, there is no guarantee that, for example, the command will attempt to power on the leader node before compute nodes in the same rack.

                To reboot all of the nodes specified, or boot them if they are already shut down, perform the following:

                # cpower --reboot --iru r3i3

                The --iru or --rack options ensure proper ordering if you use them. In this case, the command makes sure that power is supplied to the chassis for rack 3, IRU 3, and then all the compute nodes in that IRU are rebooted.

                EXAMPLES

                Example 3-2. cpower Command Examples

                To boot compute blade r1i0n8, perform the following:

                # cpower --boot r1i0n8
                

                To boot a number of compute blades at the same time, perform the following:

                # cpower --boot --rack r1


                Note: The --boot option will only boot those nodes that have not already booted.


                To shut down service node 0, perform the following:

                # cpower --shutdown service0

                To shut down and switch off everything in rack 3, perform the following:

                # cpower --shutdown --off --rack r3


                Note: Using the --shutdown and the --off options together is the only time you can use more than one action on the cpower command line. This combination shuts down and then powers off all of the compute nodes in parallel, then shuts down and powers off the leader node. Use the --noleader option if you want the leader node to remain booted up.


                To shut down the entire system, including all service nodes and all leader nodes, but not the admin node, without turning off the power to anything, perform the following:

                # cpower --shutdown --system

                To shut down all the compute nodes, but not the service nodes or leader nodes, perform the following:

                # cpower --shutdown --system --noleader --noservice


                Note: The only way to shut down the system admin controller (admin node) is to perform the operation manually.



                C3 Commands

                This section describes the cluster command and control (C3) tool suite for cluster administration and application support.


                Note: The SGI Tempo version of C3 does not include the cshutdown and cpushimage commands.


                The C3 commands used on the SGI Altix ICE 8200 system are as follows:

                C3 Utilities 

                Description

                cexec(s) 

                Executes a given command string on each node of a cluster

                cget 

                Retrieves a specified file from each node of a cluster and places it into the specified target directory

                ckill 

                Runs kill on each node of a cluster for a specified process name

                clist 

                Lists the names and types of clusters in the cluster configuration file

                cnum 

                Returns the node positions specified by the node name given on the command line

                cname 

                Returns the node names specified by the range given on the command line

                cpush 

                Pushes files from the local machine to the nodes in your cluster

                cexec is the most useful C3 utility. Use the cpower, power-iru, power-rack, and power-system commands rather than cshutdown (see “Power Management Commands”).

                EXAMPLES

                Example 3-3. C3 Command General Examples

                The following examples walk you through some typical C3 command operations.

                You can use the cname and cnum commands to map names to locations and vice versa, as follows:

                # cname rack_1:0-2
                local name for cluster:  rack_1
                nodes from cluster:  rack_1
                cluster:  rack_1 ; node name:  r1i0n0
                cluster:  rack_1 ; node name:  r1i0n1
                cluster:  rack_1 ; node name:  r1i0n10
                
                # cnum rack_1: r1i0n0
                local name for cluster:  rack_1
                nodes from cluster:  rack_1
                r1i0n0 is at index 0 in cluster rack_1
                
                # cnum rack_1: r1i0n1
                local name for cluster:  rack_1
                nodes from cluster:  rack_1
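
When a script knows a node name but a C3 command needs a position, the index can be pulled out of cnum output. The following is a minimal sketch that assumes the "is at index" line format shown in the r1i0n0 example above. The node_index helper name is hypothetical.

```shell
# node_index is a hypothetical helper, not part of the C3 tool suite.
# It reads cnum output on standard input and prints the index from a line
# of the form "<node> is at index <n> in cluster <cluster>".
node_index() {
    sed -n 's/^.* is at index \([0-9][0-9]*\) in cluster .*$/\1/p'
}
```

For example, idx=$(cnum rack_1: r1i0n0 | node_index) stores the position, which can then be used in a range such as rack_1:$idx.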

                You can use the clist command to retrieve the number of racks, as follows:

                # clist
                cluster  rack_1  is an indirect remote cluster
                cluster  rack_2  is an indirect remote cluster
                cluster  rack_3  is an indirect remote cluster
                cluster  rack_4  is an indirect remote cluster
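
To turn this listing into a number, a script can count the cluster lines. The following sketch assumes the clist output format shown above, with one line per rack starting with the word cluster. The count_clusters helper name is hypothetical.

```shell
# count_clusters is a hypothetical helper, not part of the C3 tool suite.
# It reads clist output on standard input and prints the number of lines
# that begin with the word "cluster".
count_clusters() {
    grep -c '^[[:space:]]*cluster[[:space:]]'
}
```

For example, clist | count_clusters prints the number of racks that C3 knows about.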

                You can use the cexec command to view the addressing scheme of the C3 utility, as follows:

                # cexec rack_1:1 hostname
                ************************* rack_1 *************************
                ************************* rack_1 *************************
                --------- r1i0n1---------
                r1i0n1
                
                # cexec rack_1:2-3 rack_4:0-3,10 hostname
                ************************* rack_1 *************************
                ************************* rack_1 *************************
                --------- r1i0n10---------
                r1i0n10
                --------- r1i0n11---------
                r1i0n11
                ************************* rack_4 *************************
                ************************* rack_4 *************************
                --------- r4i0n0---------
                r4i0n0
                --------- r4i0n1---------
                r4i0n1
                --------- r4i0n10---------
                r4i0n10
                --------- r4i0n11---------
                r4i0n11
                --------- r4i0n4---------
                r4i0n4
                

                The following set of commands shows how to use the C3 commands to traverse the different levels of hierarchy in your Altix ICE system (for information on the hierarchical design of your Altix ICE system, see “Basic System Building Blocks” in Chapter 1).

                To execute a C3 command on all blades within the default Altix ICE system, for example, rack 1, perform the following:

                # cexec hostname
                ************************* rack_1 *************************
                ************************* rack_1 *************************
                --------- r1i0n0---------
                r1i0n0
                --------- r1i0n1---------
                r1i0n1
                --------- r1i0n10---------
                r1i0n10
                --------- r1i0n11---------
                r1i0n11
                
                ...

                To run a C3 command on all compute nodes across an Altix ICE system, perform the following:

                # cexec --all hostname
                ************************* rack_1 *************************
                ************************* rack_1 *************************
                --------- r1i0n0---------
                r1i0n0
                --------- r1i0n1---------
                r1i0n1
                ...
                --------- r2i0n10---------
                r2i0n10
                ...
                --------- r3i0n11---------
                r3i0n11
                ...

                To run a C3 command against the first rack leader controller, in the first rack, perform the following:

                # cexec --head hostname
                ************************* rack_1 *************************
                --------- rack_1---------
                r1lead
                

                To run a C3 command against all rack leader controllers across all racks, perform the following:

                # cexec --head --all hostname
                ************************* rack_1 *************************
                --------- rack_1---------
                r1lead
                ************************* rack_2 *************************
                --------- rack_2---------
                r2lead
                ************************* rack_3 *************************
                --------- rack_3---------
                r3lead
                ************************* rack_4 *************************
                --------- rack_4---------
                r4lead

                The following set of examples shows some specific use cases for the C3 commands that you are likely to employ.

                Example 3-4. C3 Command Specific Use Examples

                From the system admin controller, run a command on rack 1 without including the rack leader controller, as follows:

                # cexec rack_1: <cmd>

                Run a command on all service nodes only, as follows:

                # cexec -f /etc/c3svc.conf <cmd>

                Run a command on all compute nodes in the system, as follows:

                # cexec --all <cmd>

                Run a command on all rack leader controllers, as follows:

                # cexec --all --head <cmd>

                Run a command on blade 42 (compute node 42) in rack 2, as follows:

                # cexec rack_2:42 <cmd>

                From a service node over the InfiniBand Fabric, run a command on all blades (compute nodes) in the system, as follows:

                # cexec --all <cmd>

                Run a command on blade 42 (compute node 42), as follows:

                # cexec blades:42 <cmd> 


                Console Management

                SGI Tempo management systems software uses the open-source console management package called conserver. For detailed information on conserver, see http://www.conserver.com/

                An overview of the conserver package is as follows:

                • Manages the console devices of all managed nodes in an Altix ICE system

                • A conserver daemon runs on the system admin controller (admin node) and the rack leader controllers (leader nodes). The system admin controller manages leader and service node consoles. The rack leader controllers manage blade consoles.

                • The conserver daemon connects to the consoles using ipmitool. Users connect to the daemon to access them. Multiple users can connect but non-primary users are read-only.

                • The conserver package is configured to allow all consoles to be accessed from the system admin controller.

                • All consoles are logged. These logs can be found at /var/log/consoles on the system admin controller and rack leader controllers. An autofs configuration file is created to allow you to access rack leader controller managed console logs from the system admin controller, as follows:

                  system-admin # /net/r1lead/var/log/consoles/ 

                The /etc/conserver.cf file is the configuration file for the conserver daemon. This file is generated for both the system admin controller and rack leader controllers from the /opt/sgi/sbin/generate-conserver-files script on the system admin controller. This script is called from the discover-rack command as part of rack discovery or rediscovery. It generates the conserver.cf file for the rack in question and regenerates the conserver.cf file for the system admin controller.


                Note: The conserver package replaces cconsole for access to all consoles (blades, leader nodes, managed service nodes).


                You may find the following conserver man pages useful:

                Man Page  

                Description

                console(1) 

                Console server client program

                conserver(8) 

                Console server daemon

                conserver.cf(5) 

                Console configuration file for conserver(8)

                conserver.passwd(5) 

                User access information for conserver(8)

                Procedure 3-8. Using conserver Console Manager

                  To use the conserver console manager, perform the following steps:

                  1. To see the list of available consoles, perform the following:

                    system-admin:~ # console -x
                     service0                 on /dev/pts/2                       at  Local 
                     r2lead                   on /dev/pts/1                       at  Local 
                     r1lead                   on /dev/pts/0                       at  Local 
                     r1i0n8                   on /dev/pts/0                       at  Local 
                     r1i0n0                   on /dev/pts/1                       at  Local                 

                  2. To connect to the service console, perform the following:

                    system-admin:~ # console service0
                    [Enter `^Ec?' for help]
                    
                    
                    Welcome to SUSE Linux Enterprise Server 10 SP1 (x86_64) - Kernel 2.6.16.46-0.12-smp (ttyS1).
                    
                    
                    service0 login: 
                    


                  3. To connect to the rack leader controller console, perform the following:

                    system-admin:~ # console r1lead
                    [Enter `^Ec?' for help]
                    
                    
                    Welcome to SUSE Linux Enterprise Server 10 SP1 (x86_64) 
                    - Kernel 2.6.16.46-0.12-smp (ttyS1).
                    
                    
                    r1lead login:

                  4. To trigger system request (sysrq) commands once connected to a console, perform the following:

                    Ctrl-e c l 1 8               # set log level to 8
                    Ctrl-e c l 1 <sysrq cmd>            # send sysrq command

                  5. To see the list of conserver escape keys, perform the following:

                    Ctrl-e c ?
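
The console list from step 1 can also be used in scripts, for example to loop over every managed console name. The following is a minimal sketch that assumes the "<name> on <device> at <location>" layout shown in the console -x output above. The console_names helper name is hypothetical.

```shell
# console_names is a hypothetical helper, not part of conserver.
# It reads "console -x" output on standard input and prints the console
# name (the first field) of every line whose second field is "on".
console_names() {
    awk '$2 == "on" { print $1 }'
}
```

For example, console -x | console_names prints one console name per line, such as service0 and r1lead.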

                  Keeping System Time Synchronized

                  The SGI Tempo systems management software uses network time protocol (NTP) as the primary mechanism to keep the nodes in your Altix ICE system synchronized. This section describes how this mechanism operates on the various Altix ICE components and covers these topics:

                  System Admin Controller NTP

                  The NTP client on the system admin controller should point to the house network time server. The NTP server on the system admin controller provides NTP service to system components so that nodes can consult it when they are booted. The system admin controller sends NTP broadcasts to some networks to keep the nodes in sync after they have booted.

                  Rack Leader Controller NTP

                  The NTP client on the rack leader controller gets time from the system admin controller when it is booted and then stays in sync by watching NTP broadcasts from the system admin controller. The NTP server on the rack leader controller provides NTP service to Altix ICE components so that compute nodes can sync their time when they are booted. The rack leader controller sends NTP broadcasts to some networks to keep the compute nodes in sync after they have booted.

                  Service Node NTP

                  The NTP client on managed service nodes (for a definition of managed, see “discover Command” in Chapter 2) sets its time at initial booting from the system admin controller. It listens to NTP broadcasts from the system admin controller to stay in sync. It does not provide any NTP service.

                  Compute Node NTP

                  The NTP client on the compute node sets its time at initial booting from the rack leader controller. It listens to NTP broadcasts from the rack leader controller to stay in sync.

                  NTP Workarounds

                  Sometimes, especially during initial deployment of an Altix ICE system when system components are being installed and configured for the first time, NTP is not available to serve time to system components.

                  A non-modified NTP server, running for the first time, takes quite some time before it offers service. This means the leader and service nodes may fail to get time from the system admin controller as they come on-line. Compute nodes may also fail to get time from the leader when they first come up. This situation usually only happens at first deployment. After the NTP servers have a chance to create their drift files, they offer time with far less delay on subsequent reboots.

                  The following workarounds are in place for situations when NTP cannot serve the time:

                  • The admin and rack leader controllers have the time service enabled (xinetd).

                  • All system node types have the netdate command.

                  • A special startup script is on leader, service, and compute nodes that runs before the NTP startup script.

                    This script attempts to get the time using the ntpdate command. If the ntpdate command fails because the NTP server it is using is not yet ready to offer time service, the script uses the netdate command instead to get the clock "close".

                    The NTP startup script then starts the NTP service as normal. Since the clock is known to be "close", NTP will fix the time when the NTP servers start offering time service.
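
The fallback logic in the last workaround can be sketched as follows. This is an illustration of the described behavior, not the startup script that SGI Tempo ships. The set_clock_close function name is hypothetical, and the sketch assumes that both ntpdate and netdate accept a server host name as their only argument.

```shell
# set_clock_close is a hypothetical sketch, not the SGI Tempo startup script.
# Usage: set_clock_close <time-server>
# It tries ntpdate first and falls back to netdate, so that the clock is at
# least approximately right before the NTP service starts.
set_clock_close() {
    server=$1
    if ntpdate "$server" >/dev/null 2>&1; then
        echo "clock set with ntpdate from $server"
    elif netdate "$server" >/dev/null 2>&1; then
        echo "clock set approximately with netdate from $server"
    else
        echo "could not set clock from $server" >&2
        return 1
    fi
}
```

On a compute node the server argument would be the rack leader controller, and on a leader or service node it would be the system admin controller.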

                  Backing up and Restoring the System Database

                  The SGI Tempo systems management software captures the relevant data for the managed objects in an SGI Altix ICE system. Managed objects are the hierarchy of nodes described in “Basic System Building Blocks” in Chapter 1. The system database is critical to the operation of your SGI Altix ICE system and you need to back up the database on a regular basis.

                  Managed objects on an SGI Altix ICE system include the following:

                  • Altix ICE system

                    One ICE system is modeled as a meta-cluster. This meta-cluster contains the racks each modeled as a sub-cluster.

                  • Nodes

                    System admin controller (admin node), rack leader controllers (leader nodes), service nodes, compute nodes (blades) and chassis management control blades (CMCs) are modeled as nodes.

                  • Networks

                    The preconfigured and potentially customized IP networks

                  • NICs

                    The network interfaces for Ethernet and InfiniBand adapters.

                  • Images

                    The node images installed on each particular node.

                  SGI recommends that you keep three backups of your system database at any given time. You should implement a rotating backup procedure following the son-father-grandfather principle.

                  Procedure 3-9. Backing up and Restoring the System Database

                    To back up and restore the system database, perform the following steps:

                    1. To back up the system database, run a command similar to the following from the system admin controller:

                      # mysqldump --opt oscar > backup-file.sql
                      

                    2. To read the dump file back into the system admin controller, perform a command similar to the following:

                      # mysql oscar < backup-file.sql

                    For more information, see the mysqldump(1) man page.