This chapter describes how to use the SGI Tempo systems management software to operate your Altix ICE system and covers the following topics:
This section describes SLES10 services turned off on compute nodes by default, how you can customize the software running on compute nodes or service nodes, create a simple clone image of compute node or service node software, how to use the cimage command, how to use yum to install packages into software images, and how to use the mksiimage command to create compute and service node images. It covers these topics:
Currently, the compute nodes run the SUSE Linux Enterprise Server 10 (SLES10) Service Pack 1 (SP1) Linux distribution. To improve the performance of applications running MPI jobs on compute nodes, the following SLES10 services are turned off:
acpid
auditd
boot.crypto
boot.device-mapper
boot.lvm
boot.md
cron
earlykbd
earlysyslog
fbset
irq_balancer
kbd
novell-zmd
nscd
postfix
powersaved
resmgr
slpd
splash
splash_early
suseRegister
xdm
This section discusses how to manage various nodes on your SGI Altix ICE system. It describes how to configure the various nodes, including the compute and service nodes, and how to augment software packages. Many package management tasks can be done in more than one valid way.
For information on installing patches and updates, see “Installing SGI Tempo Patches and Updating SGI Altix ICE Systems” in Chapter 2.
You can add per-host compute node customization to the compute node images. You do this by adding scripts either to the /opt/sgi/share/per-host-customization/global/ directory or the /opt/sgi/share/per-host-customization/mynewimage/ directory on the system admin controller.
| Note: When creating custom images for compute nodes, make sure you clone the original SGI images. This keeps the original images intact so that you can fall back to them if necessary. |
Scripts in the global directory apply to all compute nodes images. Scripts under the image name apply only to the image in question. The scripts are cycled through once per host when being installed on the rack leader controllers. They receive one input argument, which is the full path (on the rack leader controller) to the per-host base directory, for example, /var/lib/sgi/mynewimage/i2n11. There is a README file at /opt/sgi/share/per-host-customization/README on the system admin controller, as follows:
This directory contains compute node image customization scripts which are executed as part of the install-image operations on the leader nodes when pulling over a new compute node image. After the image has been pulled over, and the per-host-customization dir has been rsynced, the per-host /etc and /var directories are populated, then the scripts in this directory are cycled through once per-host. This allows the scripts to source the node specific network and cluster management settings, and set node specific settings. Scripts in the global directory are iterated through first, then if a directory exists that matches the image name, those scripts are iterated through next. You can use the scripts in the global directory as examples. |
An example global script, /opt/sgi/share/per-host-customization/global/sgi-hostname is, as follows:
#!/bin/sh
#
# Copyright (c) 2007 Silicon Graphics, Inc.
# All rights reserved.
#
# Set the compute node's hostname to the cluster unique name
#
# This script is executed once per-host as part of the install-image operation
# run on the leader nodes, which is called from cimage on the admin node.
# The full path to the per-host iru+slot directory is passed in as $1,
# e.g. /var/lib/sgi/per-host//i2n11.
#
# sanity checks
. /opt/sgi/share/per-host-customization/global/sanity.sh
iruslot=$1
# source cluster management information
. ${iruslot}/etc/opt/sgi/cminfo
# set hostname of blade to cluster unique name
echo ${NAME} > ${iruslot}/etc/HOSTNAME |
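A site-specific script can follow the same pattern as sgi-hostname. The following is a minimal sketch, not an SGI-supplied script: the function name write_motd and the choice of /etc/motd are assumptions made for illustration. It relies only on what the example above shows, namely that the per-host directory is passed as $1 and that the cminfo file defines NAME.

```shell
#!/bin/sh
# Hypothetical per-host customization: write a node-specific /etc/motd.
# $1 is the per-host base directory, e.g. /var/lib/sgi/mynewimage/i2n11.

write_motd() {
    iruslot=$1

    # Tolerate a missing cminfo file instead of failing the whole install.
    if [ ! -r "${iruslot}/etc/opt/sgi/cminfo" ]; then
        echo "cminfo not found under ${iruslot}, skipping" >&2
        return 0
    fi

    # Source cluster management information (defines NAME, as in sgi-hostname).
    . "${iruslot}/etc/opt/sgi/cminfo"

    echo "Welcome to ${NAME}" > "${iruslot}/etc/motd"
}

# Run only when a per-host directory was passed in.
if [ "$#" -gt 0 ]; then
    write_motd "$1"
fi
```

Placed in the directory named after an image, a script like this would apply only to that image; placed in the global directory, it would apply to all compute node images.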
All node types that are part of an SGI Altix ICE system can have configuration settings adjusted by the configuration framework. There is some overlap between the per-host customization instructions and the configuration framework instructions. Each approach plays a role in configuring your system. The major differences between the two methods are, as follows:
Per-host customization runs at the time an image is pushed to the rack leader controllers.
Per-host customization only applies to compute node images.
The Altix ICE system configuration framework can be used with all node types.
The system configuration framework is run when a new root is created, when the SuSEconfig command is run for some other reason, or as part of a yum operation.
This framework exists to make it easy to adjust configuration items. There are SGI-supplied scripts already present. You can add more scripts as you wish. You can also exclude scripts from running without purging the script if you decide a certain script should not be run. The following set of questions in bold and bulleted answers describes how to use the system configuration framework.
How does the system configuration framework operate?
These files could be added, for example, to a running service node, or to an already created service or compute image. Remember that images destined for compute nodes need to be pushed with the cimage command after being altered. For more information, see “cimage Command”.
The /opt/sgi/lib/cluster-configuration script is called. Where it is called from is described below.
That script iterates through scripts residing in /etc/opt/sgi/conf.d.
Any scripts listed in /etc/opt/sgi/conf.d/exclude are skipped, as are scripts that are not executable.
Scripts in the system configuration framework must be tolerant of files that do not exist yet, as described below. For example, check that a syslog configuration file exists before trying to adjust it.
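The iteration described above can be pictured with a short sketch. This is not the SGI cluster-configuration script; it is an illustration written under two assumptions: that the exclude file lists one script name per line, and that scripts run in the order the shell sorts their names. The function name run_conf_scripts is hypothetical.

```shell
# Illustrative only: NOT the SGI cluster-configuration script.
# Assumes the exclude file lists one script name per line.
run_conf_scripts() {
    confdir=$1
    exclude="${confdir}/exclude"

    for script in "${confdir}"/*; do
        name=$(basename "$script")
        # Skip anything that is not a regular, executable file.
        [ -f "$script" ] && [ -x "$script" ] || continue
        # Skip scripts named in the exclude file, if one exists.
        if [ -f "$exclude" ] && grep -qxF -- "$name" "$exclude"; then
            continue
        fi
        "$script"
    done
    return 0
}
```

Under these assumptions, removing the execute bit or adding a name to the exclude file both disable a script without deleting it.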
Where is the framework called from?
The callout for /opt/sgi/lib/cluster-configuration is implemented as a yum plugin that executes after packages have been installed and cleaned.
There is also a SUSE configuration script in /sbin/conf.d, called SuSEconfig.00cluster-configuration, that calls the framework. This covers the case where you use YaST to install or upgrade packages.
One of the scripts called by the framework calls SuSEconfig. A check is made to avoid a callout loop.
How do I adjust my system configuration?
Create a small script in /etc/opt/sgi/conf.d to do the adjustment.
Be sure that you test for existence of files and do not assume they are there (see "Why do scripts need to tolerate files that do not exist but should?" below).
Why do scripts need to tolerate files that do not exist but should?
This is because the mksiimage command runs yume and yum in two steps. The first step only installs 40 or so RPMs but our framework is called then too. The second pass installs the other "hundreds" of RPMs. So the framework is called once before many packages are installed, and again after everything is in place. So not all files you expect might be available when your small script is called.
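A conf.d script that tolerates missing files might look like the following sketch. The helper name ensure_line, the file /etc/example-service.conf, and the setting it adds are all placeholders chosen for illustration, not part of SGI Tempo.

```shell
#!/bin/sh
# Hypothetical conf.d script: add a setting to a configuration file, but only
# if that file exists yet. The file path and the setting are placeholders.

ensure_line() {
    file=$1
    line=$2

    # During the first mksiimage pass the owning package may not be
    # installed yet, so a missing file is not an error.
    [ -f "$file" ] || return 0

    # Append the line only once so repeated framework runs are harmless.
    grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
}

ensure_line /etc/example-service.conf "example_option = yes"
```

Because the framework may run the script several times, the sketch also avoids appending the same line twice.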
How does the yum plugin work?
For the yum plugin to work, the /etc/yum.conf file must have plugins=1 set. SGI Tempo software ensures this by way of a trigger in the sgi-cluster package: any time yum is installed or updated, the trigger verifies that plugins=1 is set.
How does yume work?
yume, an OSCAR wrapper for yum, works by creating a temporary yum configuration file in /tmp and then pointing yum at it. This temporary configuration file needs to have plugins enabled. A tiny patch to yume makes this happen. This fixes it for yume and also for mksiimage, which calls yume as part of its operation.
| Note: Procedures in this section describe how to work with service node and compute node images. Always use a cloned image. If you are adjusting an RPM list, use your own copy of the RPM list. |
The service and compute node images are created during the configure-cluster operation (or during your upgrade to SGI ProPack 5 SP3 if you were running SGI ProPack 5 SP2 previously). This process uses an RPM list to generate a root on the fly.
You can clone a compute node image, or create a new one based on an RPM list. For service nodes, SGI does not support a clone operation. For compute images, you can either clone the image and work on a copy or you can always make a new compute node image from the SGI supplied default RPM list.
To clone a compute node image, perform the following steps:
From the system admin controller, create a clone of the compute node image, as follows:
# cimage --clone-image compute-sles10sp1 new |
After that command is complete, you will have a new image located in /var/lib/systemimager/images/new on the system admin controller.
To see that the image is now available, perform the following command:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: new
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
|
For RPM lists, the default RPM lists are located in /etc/opt/sgi/rpmlists on the system admin controller. SGI suggests you never change these files, but rather, create your own versions using the ones supplied by SGI as a base.
Please note, it is important that certain packages be in the rpmlist for a given node. For example, an rpmlist used for compute nodes should have packages sgi-compute-node and sgi-cluster. Service nodes must have sgi-service-node and sgi-cluster.
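Before building an image from your own copy of an RPM list, you may want to confirm that the required packages are still present. The sketch below assumes the RPM list holds one package name per line; the function name check_rpmlist and the file name in the usage comment are hypothetical.

```shell
# Report any required package names missing from an RPM list.
# Assumes the list holds one package name per line.
check_rpmlist() {
    list=$1
    shift
    missing=0
    for pkg in "$@"; do
        if ! grep -qxF -- "$pkg" "$list"; then
            echo "missing from ${list}: ${pkg}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (compute node list):
#   check_rpmlist /etc/opt/sgi/rpmlists/my-compute-node.rpmlist \
#       sgi-compute-node sgi-cluster
```

For a service node list, pass sgi-service-node and sgi-cluster instead.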
To manually add a package to a compute node image, perform the following steps:
Make a clone of the compute node image, as described in “Customizing Software Images”.
Determine what images and kernels you have available now, as follows:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: compute-sles10sp1-clone
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
|
From the system admin controller, change directory to the images directory, as follows:
# cd /var/lib/systemimager/images/ |
From the system admin controller, copy the RPMs you wish to add, as follows:
# cp /newrpm.rpm new/tmp |
The new RPMs now reside in the /tmp directory in the image named new. To install them into your new compute node image, perform the following commands:
# chroot new bash |
And then perform the following:
# rpm -Uvh /tmp/newrpm.rpm |
The image on the system admin controller is updated. However, you still need to push the changes out. Ensure there are no nodes currently using the image and then run this command:
# cimage --push-rack new r\* |
This pushes the updates to the rack leader controllers, and the compute nodes see the changes the next time they start up. For information on how to ensure the image is associated with a given node, see the cimage --set command and the example in Procedure 3-3.
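The copy, install, and push steps above can be collected into one helper. This is a sketch rather than an SGI tool: the function names run and add_rpm_to_image and the DRY_RUN variable are assumptions. It also runs rpm directly through chroot instead of starting an interactive bash inside the image, which is a small variation on the procedure. By default it only prints the commands.

```shell
# Set DRY_RUN=0 to execute; by default the commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

add_rpm_to_image() {
    image=$1
    rpm=$2
    root="/var/lib/systemimager/images/${image}"

    # Copy the package into the image, then install it inside the image root.
    run cp "$rpm" "${root}/tmp/"
    run chroot "$root" rpm -Uvh "/tmp/$(basename "$rpm")"
    # Push the updated image to the rack leader controllers in all racks.
    run cimage --push-rack "$image" 'r*'
}
```

For example, `add_rpm_to_image new /newrpm.rpm` prints the three commands for the image named new. As in the procedure, make sure no nodes are using the image before running it with DRY_RUN=0.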
To clone the compute node image, perform the following:
# cimage --clone-image compute-sles10sp1 compute-sles10sp1-clone |
To see the images and kernels in the list, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: compute-sles10sp1-clone
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp |
To change the compute nodes to use the cloned image/kernel pair, perform the following:
# cimage --set compute-sles10sp1-clone 2.6.16.46-0.12-smp "r*i*n*" |
To manually add a package to the service node image, perform the following steps:
Use the mksiimage command to create your own version of the service node image. See “Creating Compute and Service Images Using the mksiimage Command”.
Change directory to the images directory, as follows:
# cd /var/lib/systemimager/images/ |
From the system admin controller, copy the RPMs you wish to add, as follows, where my-service-image is your own service node image:
# cp /newrpm.rpm my-service-image/tmp |
The new RPMs now reside in the /tmp directory in the image named my-service-image. To install them into your new service node image, perform the following commands:
# chroot my-service-image bash |
# rpm -Uvh /tmp/newrpm.rpm |
At this point, the image has been updated with the RPM. Note that, unlike compute node images, changes made to a service node image are not seen by service nodes until they are re-installed with the image. If you wish to install the package on running systems, you can copy the RPM to the running system and use rpm from there.
The cimage command allows you to list, modify, and set software images on the compute nodes in your system.
The cimage command accepts the following options:
| Option | Description | |
| --help | Usage and help text | |
| --list-images | Lists images present in the database | |
| --list-nodes RACK ... | Lists what compute nodes are set to | |
| --set IMAGE KERNEL NODE ... | Sets the compute nodes to a certain boot image and kernel combination | |
| --add-db IMAGE | Adds an image to the database | |
| --del-db IMAGE | Deletes an image from the database | |
| --push-rack IMAGE RACK ... | Pushes an image to specified rack(s) | |
| --del-rack IMAGE RACK | Deletes an image from specified rack(s) | |
| --clone-image OIMAGE NIMAGE | Clones an existing image to a new image | |
| --del-image IMAGE | Deletes an existing image entirely |
RACK arguments take the format rX.
NODE arguments take the format rXiYnZ .
X, Y, Z can be single digits, a [start-end] range, or * for all matches.
... indicates more than one RACK or NODE argument can be passed in.
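The RACK and NODE formats above can be checked with a regular expression before a command is run. The sketch below is an illustration only; the function names are hypothetical, and it assumes that an index may be a multi-digit number (as in r1i3n10), a [start-end] range, or *.

```shell
# One index: a number, a [start-end] range, or * for all matches.
idx='([0-9]+|\[[0-9]+-[0-9]+]|\*)'

# Succeeds if the argument has the form rX.
is_rack_arg() {
    printf '%s\n' "$1" | grep -Eq "^r${idx}\$"
}

# Succeeds if the argument has the form rXiYnZ.
is_node_arg() {
    printf '%s\n' "$1" | grep -Eq "^r${idx}i${idx}n${idx}\$"
}
```

When passing such arguments on a command line, quote them (for example, 'r1i[0-2]n*') so that the shell does not try to expand the wildcards against file names.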
EXAMPLES
Example 3-1. cimage Command Examples
The following examples walk you through some typical cimage command operations.
To list the available images and their associated kernels, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp |
To list the compute nodes in rack 1 and the image and kernel they are set to boot, perform the following:
# cimage --list-nodes r1
r1i0n0: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n3: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n4: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n5: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n6: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n7: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n8: compute-sles10sp1 2.6.16.46-0.7-smp
[...snip...] |
To set the r1i0n0 compute node to boot the 2.6.16.46-0.12-carlsbad kernel from the compute-sles10sp1 image, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.12-carlsbad r1i0n0 |
To list the nodes in rack 1 to see the changes set in the example above, perform the following:
# cimage --list-nodes r1
r1i0n0: compute-sles10sp1 2.6.16.46-0.7-carlsbad
r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
[...snip...] |
To set all nodes in all racks to boot the 2.6.16.46-0.7-carlsbad kernel from the compute-sles10sp1 image, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.7-carlsbad r*i*n* |
To set two ranges of nodes to boot the 2.6.16.46-0.7-smp kernel, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.7-smp r1i[0-2]n[5-6] r1i[2-3]n[0-4] |
To clone the compute-sles10sp1 image to a new image (so that you can modify it), perform the following:
# cimage --clone-image compute-sles10sp1 mynewimage
Cloning compute-sles10sp1 to mynewimage ... done |
To change the cloned image created in the example above, copy the needed RPMs into the /var/lib/systemimager/images/tmp directory, use the chroot command to enter the image, and then install the RPMs, as follows:
# cp .rpm /var/lib/systemimager/images//tmp
# chroot /var/lib/systemimager/images// bash
# rpm -Uvh /tmp/.rpm |
If you make changes to the kernels in the image, you need to refresh the kernel database entries for your image. To do this, perform the following:
# cimage --del-db mynewimage
# cimage --add-db mynewimage |
To push new software images out to the compute blades in a rack or set of racks, perform the following:
# cimage --push-rack mynewimage r*
r1lead: install-image: mynewimage
r1lead: install-image: mynewimage done. |
To list the images in the database and the kernels they contain, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.7-carlsbad
kernel: 2.6.16.46-0.7-smp
image: mynewimage
kernel: 2.6.16.46-0.7-carlsbad
kernel: 2.6.16.46-0.7-smp |
To set some compute nodes to boot an image, perform the following:
# cimage --set mynewimage 2.6.16.46-0.7-smp r1i3n* |
You need to reboot the compute nodes to run the new images.
To completely remove an image you no longer use, both from the system admin controller and from all compute nodes in all racks, perform the following:
# cimage --del-image mynewimage
r1lead: delete-image: mynewimage
r1lead: delete-image: mynewimage done. |
The packages that make up SGI Tempo and SLES are available in repositories at which yum is configured to look.
| Note: Always work with copies of software images. |
SGI provides a wrapper around yum that makes it simple to install a package into an image.
However, yum only looks at packages that are part of a yum repository. If you are installing your own RPM, you need to configure yum to look at your own repositories in addition to the others. See the appropriate yum documentation.
| Note: The yum software maintains a cache of the repository metadata and does not update this cache until a certain number of minutes have passed. This time limit is defined by the metadata_expire option of the yum.conf file. See the yum.conf(5) man page. If you synchronize your repository to SUSE or Novell shortly after doing a yum operation, and you notice that yum now says there are no updates when there should be, you can run this command to clear the caches and force yum to try again: yum clean all. |
This example shows how to install the zlib-devel package into the service node image so that the next time you image or install a service node, it has this new package.
You can run the following command:
# yum-image-wrapper /var/lib/systemimager/images/my-service-sles10sp1 install zlib-devel |
If you update a compute node image on the system admin controller, you have to use the cimage command to push the changes.
If you update a service node image on the system admin controller, that service node needs to be re-installed/re-imaged to get the change. The discover command can be given an alternate image.
These instructions only apply to service nodes.
You can use the yum command to install a package on a service node. From the system admin controller, you can issue a command like the following. Please note that SGI suggests using the -y option. This prevents yum from asking for input.
# ssh service0 yum -y install zlib-devel |
You can create service node and compute node images using the mksiimage(1) command. This will generate a root on the fly.
Fresh installations of SGI ProPack 5 SP3 create these images during the configure-cluster installation step.
The rpm lists that drive which packages get installed in the images are listed in files located in /etc/opt/sgi/rpmlists . For example, /etc/opt/sgi/rpmlists/compute-sles10sp1.rpmlist . You should not edit the default SGI RPM lists but instead make copies and work on the copy.
To use the mksiimage command to create a service node image, perform the following:
Make a copy of the example service node image RPM list and work on the copy, as follows:
# cp /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist /etc/opt/sgi/rpmlists/my-service-node.rpmlist |
Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.
Run the mksiimage command to create the root. This example uses /var/lib/systemimager/images/my-service-node-image as the home for this image.
This command may take a long time and has lots of output. You could consider redirecting output to a /tmp file. There is so much output that running this from a serial console more than doubles the time it takes to complete if the output is not redirected.
Execute the following command:
# mksiimage -A --name my-service-node-image --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64 --filename /etc/opt/sgi/rpmlists/my-service-node.rpmlist |
Post-process the image. This runs several commands to properly integrate and update the image for use in your Altix ICE system:
# post-process-sgi-image /var/lib/systemimager/images/my-service-node-image/ eth1 |
The image is now ready to be used with service nodes. To associate the new image with a service node, use the discover command. See “Installing a Service Node with a Non-default Image”.
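Because mksiimage produces a large amount of output, the procedure above suggests redirecting it to a file. One way to do that is a small wrapper such as the following sketch; the function name run_logged and the log file name in the usage comment are assumptions.

```shell
# Run a command with stdout and stderr redirected to a log file,
# then report the command's exit status.
run_logged() {
    log=$1
    shift
    if "$@" > "$log" 2>&1; then
        status=0
    else
        status=$?
    fi
    echo "exit status ${status}, output in ${log}"
    return "$status"
}

# Usage:
#   run_logged /tmp/mksiimage.log mksiimage -A --name my-service-node-image ...
```

Keeping the output out of a serial console in this way avoids the slowdown described above, and the log remains available if the build fails.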
To use the mksiimage(1) command to create a compute node image, perform the following:
Make a copy of the example compute node image RPM list and work on the copy, as follows:
# cp /etc/opt/sgi/rpmlists/compute-sles10sp1.rpmlist /etc/opt/sgi/rpmlists/my-compute-node.rpmlist |
Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.
Run the mksiimage command to create the root. This example uses /var/lib/systemimager/images/my-compute-node-image as the home for this image. This command may take a long time and has lots of output. You could consider redirecting output to a /tmp file. There is so much output that running this from a serial console more than doubles the time it takes to complete if the output is not redirected.
# mksiimage -A --name my-compute-node-image --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64 --filename /etc/opt/sgi/rpmlists/my-compute-node.rpmlist |
Add the new image to the list of images cimage knows is available, as follows:
# cimage --add-db my-compute-node-image |
| Note: The order in which you execute commands is important. Make sure you run the cimage --add-db command before the post-process-sgi-image command. |
Post-process the image. This just runs several commands to properly integrate and update the image for use in your Altix ICE system:
# post-process-sgi-image /var/lib/systemimager/images/my-compute-node-image/ eth1 |
For information on how to use the cimage command to push this new image to the rack leader controllers, see “cimage Command”.
After you have updated or created a service node image, you can install that image on to a managed service node, such as a login node.
| Note: Re-installing the service node using the discover process will destroy everything previously on the root drive. |
By default, discover uses the SGI default service-sles10sp1 image. To install a different image, name it with image=, as in the following example:
# discover --service 2,image=my-service-node-image |
The image= setting directs discover to install the named operating system image.
For more information on the discover command, see “discover Command” in Chapter 2.
This section describes how to maintain packages specific to your site and have them available to yum and mksiimage .
SGI suggests putting site-specific packages in a separate location. They should not reside in the same location as SGI or Novell supplied packages.
To set up a custom repository for your site packages, perform the following steps:
Create a directory for your site-specific packages on the system admin controller, as follows:
# mkdir -p /tftpboot/site-local/sles-10-x86_64 |
Copy your site packages to the new directory, as follows:
# cp my-package-1.0.x86_64.rpm /tftpboot/site-local/sles-10-x86_64 |
Create the metadata so yum sees this as a repository, as follows:
# yume --prepare --repo /tftpboot/site-local/sles-10-x86_64 |
If you wish to use yum (as opposed to only mksiimage), you can create a yum repository configuration file in /etc/yum.repos.d . Use tempo.repo as an example. You may need to turn off the gpgcheck option if your RPMs are not signed. Remember to deploy the repository to all images from which you wish to use this repository.
If you wish your new package to be installed into an image created by mksiimage by default, you need to add it to an RPM list. Example RPM lists are in /etc/opt/sgi/rpmlists. Always work on your own copy; do not modify the SGI supplied RPM lists. For more information, see “Creating Compute and Service Images Using the mksiimage Command”.
The following mksiimage command is the same as the one shown in Procedure 3-5, except that it adds your new repository to the list:
# mksiimage -A --name my-service-node-image --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64,/tftpboot/site-local/sles-10-x86_64 --filename /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist |
This command creates a new root image, using your newly created repository as one of the sources, and adds the new package assuming it is listed in /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist .
Run the post-process-sgi-image command as described in Procedure 3-5.
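For the step above that creates a yum repository configuration file, a definition along the following lines is one possibility. It is a sketch, not a copy of tempo.repo: the repository id site-local, the descriptive name, and the function name write_site_repo are placeholders. The baseurl, enabled, and gpgcheck keys are standard yum repository options.

```shell
# Write a yum repository definition for a local package directory.
# The repository id "site-local" and the output file name are placeholders.
write_site_repo() {
    repodir=$1      # e.g. /tftpboot/site-local/sles-10-x86_64
    outfile=$2      # e.g. /etc/yum.repos.d/site-local.repo

    # gpgcheck=0 is only appropriate if your RPMs are not signed.
    cat > "$outfile" <<EOF
[site-local]
name=Site-local packages
baseurl=file://${repodir}
enabled=1
gpgcheck=0
EOF
}
```

A file written this way would need to be deployed into each image from which you wish to use the repository.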
The cpower command allows you to power up, power down, reset, and show the power status of system components.
The cpower command is, as follows:
cpower [<option> ...] [<target_type>] [<action>] <target> |
The <option> argument can be one or more of the following:
| Option | Description | |
| --noleader | Do not include leader nodes (valid with rack and system domains only). | |
| --noservice | Do not include service nodes (valid with system domain only). | |
| --ipmi | Uses ipmitool to communicate. [default] | |
| --ssh | Uses ssh to communicate. | |
| --intelplus | Uses the -o intelplus option for ipmitool [default] Note that you do not usually need to specify this. | |
| --force | When using wildcards in the target, disable all “safety” checks. Make sure you really want to use this command. | |
| -n, --noexec | Displays, but does not execute, commands that affect power. | |
| -v, --verbose | Print additional information on command progress |
| Note: The command will fail if the target contains any wild cards, unless the --all option is specified. |
The <target_type> argument is one of the following:
| --node | Applies the action to nodes. Nodes are compute nodes, rack leader controllers (leader nodes), system admin controller (admin node), and service nodes. [default] | |
| --iru | Applies the action at the IRU level. | |
| --rack | Applies the action at the rack level. | |
| --system | Applies the action to the system. You must not specify a target with this type. |
The <action> argument is one of the following:
| --status | Show the power status of the target, including whether it is booted or not. [default] | |
| --up, --on | Powers up the target. | |
| --down, --off | Powers down the target. | |
| --reset | Performs a hard reset on the target. | |
| --cycle | Power cycles the target. | |
| --boot | Boots up the target, unless it is already booted. Waits for all targets to boot. | |
| --reboot | Reboots the target, even if already booted. Wait for all targets to boot. | |
| --shutdown | Shuts down the target, but does not power it off. Waits for targets to shut down. | |
| --identify <interval> | Turns on the identifying LED for the specified interval in seconds. Uses an interval of 0 to turn off immediately. | |
| -h, --help | Shows help usage statement. |
The target must always be specified except when the --system option is used. Wildcards may be used, but be careful not to accidentally power off or reboot the leader nodes. If wildcard use affects any leader node, the command fails with an error.
The default for the cpower command is to operate on system nodes, such as compute nodes, leader nodes, or service nodes. If you do not specify --iru, --rack, or --system, the command defaults to operating as if you had specified --node.
Here are examples of node target names:
r1i3n10
Compute node at rack 1, IRU 3, slot 10
service0
Service node 0
r3lead
Rack leader controller (leader node) for rack 3
r1i*n*
Wildcards let you specify ranges of nodes. For example, r1i*n* specifies all compute nodes in all IRUs on rack 1.
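The naming conventions above can be told apart with simple shell patterns. The sketch below is illustrative only: the function name classify_target is hypothetical, and the patterns are deliberately loose, so they recognize the shapes shown above rather than validate them.

```shell
# Classify a cpower node target by its naming convention.
classify_target() {
    case $1 in
        r*i*n*)   echo "compute node" ;;
        r*lead)   echo "rack leader controller" ;;
        service*) echo "service node" ;;
        *)        echo "unknown" ;;
    esac
}
```

A wrapper script could use such a check to warn before a power action is applied to a rack leader controller.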
The default operation for the cpower command is to operate on nodes and to provide you the status of these nodes, as follows:
# cpower r1i* |
This example gives you the power status and boot status of all the compute blades in rack 1. This command is equivalent to cpower --node --status r1i*.
This command issues an ipmitool power off command to all of the nodes specified by the wildcard, as follows:
# cpower --off r2i* |
The default is to apply to a node.
The following commands behave exactly as you would expect as if you were using ipmitool, and have no special extra logic for ordering:
# cpower --up r1i*
# cpower --reset r1i*
# cpower --cycle r1i*
# cpower --identify 5 r1i*
| Note: --up is a synonym for --on and --down is a synonym for --off. |
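Since a wildcard target can reach many nodes, it may help to preview a power command with the --noexec option listed earlier before running it for real. The wrapper below is a sketch; the function name cpower_preview and the CPOWER variable are assumptions introduced for illustration.

```shell
# Command to run; override CPOWER only when testing the wrapper itself.
CPOWER=${CPOWER:-cpower}

# Show what cpower would do, without executing any power commands.
cpower_preview() {
    "$CPOWER" --noexec "$@"
}

# Usage:
#   cpower_preview --off 'r2i*'
```

Quoting the target keeps the shell from expanding the wildcard before cpower sees it.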
The cpower command contains more logic when you go up to higher levels of abstraction, for example, using --iru, --rack, and --system. These higher level domain specifiers tell the command to be smart about how to order the various actions that you give on the command line.
The --iru option tells the command to use correct ordering with IRU power commands. In the example below, it first connects to the CMC on each IRU in rack 1 to issue the power on command, which turns on power to the IRU chassis (this is not the equivalent ipmitool command). Then it powers up the compute nodes in the IRU. Powering down happens in the opposite order, with power to the IRU turned off after power to the blades. IRU targets are specified as follows: r3i2 for rack 3, IRU 2.
# cpower --iru --up r1* |
The --rack option ensures power commands to the leader node are done in the correct order relative to compute nodes within a rack. First, it powers up the leader node and waits for it to boot up (if it is not already up). Then it does the functional equivalent of a cpower --iru --up r4i* on each of the IRUs contained in the rack, including applying power to each IRU chassis. Using the --down option does the opposite, and also turns off the leader node (after doing a shutdown) after all the IRUs are powered down. To avoid including leader nodes in a power command for a rack, use the --noleader option. Rack targets are specified as follows: r4 for rack 4. Here is an example:
# cpower --rack --up r4 |
Commands with the --system option ensure that power up commands are applied first to service nodes, then to leader nodes, then to IRUs and compute blades, in just the same way. Likewise, compute blades are powered down before IRUs, leader nodes, and service nodes, in that order. To avoid including service nodes in a system-domain command, use the --noservice option. Note that you must not specify a target with the --system option, since it applies to the whole Altix ICE system.
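The ordering that --system applies can be summarized in a few lines. The following sketch only prints the order described above; it does not call cpower, and the function name system_order is hypothetical.

```shell
# Print the order in which --system applies an action to each component class.
system_order() {
    case $1 in
        up)   printf '%s\n' "service nodes" "leader nodes" "IRUs" "compute blades" ;;
        down) printf '%s\n' "compute blades" "IRUs" "leader nodes" "service nodes" ;;
        *)    echo "usage: system_order up|down" >&2; return 1 ;;
    esac
}
```

Power down is simply the reverse of power up, which is why the components that others depend on are the last to lose power.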
In most cases, it is useful to be able to shut down a machine before turning off the power. The following cpower options enable you to do this: --shutdown, --boot, and --reboot. The --shutdown option is self-explanatory. The --reboot option ensures that a system is always rebooted, whereas --boot only boots a system if it is not already booted. Thus, --boot is useful for booting up compute blades that have failed to start.
| Note: The IPMI power commands necessary to enable a system to boot (either with a power reset, or a power on) may be sent to a node, but a node that has been shutdown with the --shutdown option does not have its power automatically turned off. |
The --shutdown option works on node, IRU, or rack domain levels. It will shut down nodes (in the correct order if you use the --iru or --rack options), and then just leave them as they are, power still applied. Usually you may only specify one action per command, however, with the --shutdown option, you may also specify --off. Using both these actions results in nodes being shutdown, then powered off. This is particularly useful when powering off a rack, since otherwise, the leaders may be shutdown before there is a chance to power off the compute blades. Here is an example:
# cpower --shutdown --rack r1 |
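The one-action rule can be checked before a command line is run. The sketch below is an illustration under stated assumptions: check_cpower_actions is a hypothetical helper, and its list of actions covers only the options named in this section (--up, --down, --shutdown, --boot, --reboot, and --off). The real cpower command may accept other actions that this check does not know about.

```shell
#!/bin/sh
# Hypothetical pre-flight check: accept at most one action, except that
# --shutdown and --off may appear together. Only actions named in this
# section are recognized; everything else is treated as a domain option
# or a target and ignored.
check_cpower_actions() {
    count=0
    has_shutdown=no
    has_off=no
    for arg in "$@"; do
        case $arg in
            --shutdown) has_shutdown=yes; count=$((count + 1)) ;;
            --off)      has_off=yes;      count=$((count + 1)) ;;
            --up|--down|--boot|--reboot)  count=$((count + 1)) ;;
        esac
    done
    if [ "$count" -le 1 ]; then
        return 0
    fi
    if [ "$count" -eq 2 ] && [ "$has_shutdown" = yes ] && [ "$has_off" = yes ]; then
        return 0
    fi
    echo "only --shutdown and --off may be combined" >&2
    return 1
}
```

For example, check_cpower_actions --shutdown --off --rack r3 succeeds, while check_cpower_actions --boot --reboot r1i0n8 returns a non-zero status.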
To boot up systems that have not already been booted, perform the following:
# cpower --boot r1i2n* |
Again, the command boots up nodes in the right order if you specify the --iru or --rack option and the appropriate target. Otherwise, there is no guarantee that, for example, the command will attempt to power on the leader node before compute nodes in the same rack.
To reboot all of the nodes specified, or boot them if they are already shut down, perform the following:
# cpower --reboot --iru r3i3 |
The --iru and --rack options ensure proper ordering. In this case, the command makes sure that power is supplied to the chassis for rack 3, IRU 3, and then reboots all the compute nodes in that IRU.
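The targets used in this section (r4, r3i3, r1i0n8) appear to follow a rack, IRU, node pattern. The following sketch spells that pattern out. It is an inference from the examples in this section rather than a documented grammar, describe_target is a hypothetical helper, and wildcard targets such as r1i2n* are deliberately not handled.

```shell
#!/bin/sh
# Hypothetical helper: describe a literal cpower target such as r4, r3i3,
# or r1i0n8. The r<rack>i<IRU>n<node> reading is inferred from the
# examples in the text.
describe_target() {
    t=$1
    if printf '%s\n' "$t" | grep -Eq '^r[0-9]+i[0-9]+n[0-9]+$'; then
        printf '%s\n' "$t" |
            sed -E 's/^r([0-9]+)i([0-9]+)n([0-9]+)$/rack \1, IRU \2, node \3/'
    elif printf '%s\n' "$t" | grep -Eq '^r[0-9]+i[0-9]+$'; then
        printf '%s\n' "$t" | sed -E 's/^r([0-9]+)i([0-9]+)$/rack \1, IRU \2/'
    elif printf '%s\n' "$t" | grep -Eq '^r[0-9]+$'; then
        printf '%s\n' "$t" | sed -E 's/^r([0-9]+)$/rack \1/'
    else
        echo "unrecognized target: $t" >&2
        return 1
    fi
}

describe_target r4        # prints: rack 4
describe_target r3i3      # prints: rack 3, IRU 3
describe_target r1i0n8    # prints: rack 1, IRU 0, node 8
```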
EXAMPLES
Example 3-2. cpower Command Examples
To boot compute blade r1i0n8, perform the following:
# cpower --boot r1i0n8 |
To boot a number of compute blades at the same time, perform the following:
# cpower --boot --rack r1 |
| Note: The --boot option will only boot those nodes that have not already booted. |
To shut down service node 0, perform the following:
# cpower --shutdown service0 |
To shut down and switch off everything in rack 3, perform the following:
# cpower --shutdown --off --rack r3 |
| Note: Using the --shutdown and the --off options together is the only time you can use more than one action on the cpower command line. This combination shuts down and then powers off all of the compute nodes in parallel, then shuts down and powers off the leader node. Use the --noleader option if you want the leader node to remain booted up. |
To shut down the entire system, including all service nodes and all leader nodes but not the admin node, without turning off the power to anything, perform the following:
# cpower --shutdown --system |
To shut down all the compute nodes, but not the service nodes or leader nodes, perform the following:
# cpower --shutdown --system --noleader --noservice |
| Note: The only way to shut down the system admin controller (admin node) is to perform the operation manually. |
This section describes the cluster command and control (C3) tool suite for cluster administration and application support.
| Note: The SGI Tempo version of C3 does not include the cshutdown and cpushimage commands. |
The C3 commands used on the SGI Altix ICE 8200 system are as follows:
| C3 Utilities | Description | |
| cexec(s) | Executes a given command string on each node of a cluster | |
| cget | Retrieves a specified file from each node of a cluster and places it into the specified target directory | |
| ckill | Runs kill on each node of a cluster for a specified process name | |
| clist | Lists the names and types of clusters in the cluster configuration file | |
| cnum | Returns the node positions specified by the node name given on the command line | |
| cname | Returns the node names specified by the range specified on the command line | |
| cpush | Pushes files from the local machine to the nodes in your cluster |
cexec is the most useful C3 utility. Use the cpower, power-iru, power-rack, and power-system commands rather than cshutdown (see “Power Management Commands”).
EXAMPLES
Example 3-3. C3 Command General Examples
The following examples walk you through some typical C3 command operations.
You can use the cname and cnum commands to map names to locations and vice versa, as follows:
# cname rack_1:0-2
local name for cluster: rack_1
nodes from cluster: rack_1
cluster: rack_1 ; node name: r1i0n0
cluster: rack_1 ; node name: r1i0n1
cluster: rack_1 ; node name: r1i0n10
# cnum rack_1: r1i0n0
local name for cluster: rack_1
nodes from cluster: rack_1
r1i0n0 is at index 0 in cluster rack_1
# cnum rack_1: r1i0n1
local name for cluster: rack_1
nodes from cluster: rack_1
You can use the clist command to retrieve the number of racks, as follows:
# clist
cluster rack_1 is an indirect remote cluster
cluster rack_2 is an indirect remote cluster
cluster rack_3 is an indirect remote cluster
cluster rack_4 is an indirect remote cluster
You can use the cexec command to view the addressing scheme of the C3 utility, as follows:
# cexec rack_1:1 hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n1---------
r1i0n1
# cexec rack_1:2-3 rack_4:0-3,10 hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n10---------
r1i0n10
--------- r1i0n11---------
r1i0n11
************************* rack_4 *************************
************************* rack_4 *************************
--------- r4i0n0---------
r4i0n0
--------- r4i0n1---------
r4i0n1
--------- r4i0n10---------
r4i0n10
--------- r4i0n11---------
r4i0n11
--------- r4i0n4---------
r4i0n4
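In the sample output above, position 2 in rack_1 is r1i0n10 and position 10 in rack_4 is r4i0n4. That is consistent with positions following the lexical (character-by-character) order of the node names rather than their numeric order. The sketch below reproduces the mapping under that assumption. It is an inference from the output shown, not documented C3 behavior; index_to_name is a hypothetical helper, and the count of 16 blades in the IRU is likewise an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map a C3 position to a node name, ASSUMING positions
# follow the lexical order of names r<rack>i<iru>n0 .. n<count-1>.
# usage: index_to_name RACK IRU COUNT INDEX
index_to_name() {
    rack=$1
    iru=$2
    count=$3
    index=$4
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "r${rack}i${iru}n${n}"
        n=$((n + 1))
    done | LC_ALL=C sort | sed -n "$((index + 1))p"
}

index_to_name 1 0 16 2     # prints: r1i0n10
index_to_name 4 0 16 10    # prints: r4i0n4
```

If the assumption holds, a position such as rack_2:42 does not necessarily refer to a node whose name ends in n42, so it is worth confirming the mapping with cname before running a command against a single position.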
The following set of commands shows how to use the C3 commands to traverse the different levels of hierarchy in your Altix ICE system (for information on the hierarchical design of your Altix ICE system, see “Basic System Building Blocks” in Chapter 1).
To execute a C3 command on all blades within the default Altix ICE system, for example, rack 1, perform the following:
# cexec hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n0---------
r1i0n0
--------- r1i0n1---------
r1i0n1
--------- r1i0n10---------
r1i0n10
--------- r1i0n11---------
r1i0n11
...
To run a C3 command on all compute nodes across an Altix ICE system, perform the following:
# cexec --all hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n0---------
r1i0n0
--------- r1i0n1---------
r1i0n1
...
--------- r2i0n10---------
r2i0n10
...
--------- r3i0n11---------
r3i0n11
...
To run a C3 command against the first rack leader controller, in the first rack, perform the following:
# cexec --head hostname
************************* rack_1 *************************
--------- rack_1---------
r1lead
To run a C3 command against all rack leader controllers across all racks, perform the following:
# cexec --head --all hostname
************************* rack_1 *************************
--------- rack_1---------
r1lead
************************* rack_2 *************************
--------- rack_2---------
r2lead
************************* rack_3 *************************
--------- rack_3---------
r3lead
************************* rack_4 *************************
--------- rack_4---------
r4lead
The following set of examples shows some specific use cases for the C3 commands that you are likely to employ.
Example 3-4. C3 Command Specific Use Examples
From the system admin controller, run a command on rack 1 without including the rack leader controller, as follows:
# cexec rack_1: <cmd> |
Run a command on all service nodes only, as follows:
# cexec -f /etc/c3svc.conf <cmd> |
Run a command on all compute nodes in the system, as follows:
# cexec --all <cmd> |
Run a command on all rack leader controllers, as follows:
# cexec --all --head <cmd> |
Run a command on blade 42 (compute node 42) in rack 2, as follows:
# cexec rack_2:42 <cmd> |
From a service node over the InfiniBand Fabric, run a command on all blades (compute nodes) in the system, as follows:
# cexec --all <cmd> |
Run a command on blade 42 (compute node 42), as follows:
# cexec blades:42 <cmd> |
SGI Tempo management systems software uses the open-source console management package called conserver. For detailed information on conserver, see http://www.conserver.com/
An overview of the conserver package is, as follows:
Manages the console devices of all managed nodes in an Altix ICE system
A conserver daemon runs on the system admin controller (admin node) and the rack leader controllers (leader nodes). The system admin controller manages leader and service node consoles. The rack leader controllers manage blade consoles.
The conserver daemon connects to the consoles using ipmitool. Users connect to the daemon to access them. Multiple users can connect but non-primary users are read-only.
The conserver package is configured to allow all consoles to be accessed from the system admin controller.
All consoles are logged. These logs can be found at /var/log/consoles on the system admin controller and rack leader controllers. An autofs configuration file is created to allow you to access rack leader controller managed console logs from the system admin controller, as follows:
system-admin # /net/r1lead/var/log/consoles/ |
The /etc/conserver.cf file is the configuration file for the conserver daemon. This file is generated for both the system admin controller and the rack leader controllers by the /opt/sgi/sbin/generate-conserver-files script on the system admin controller. This script is called from the discover-rack command as part of rack discovery or rediscovery. It generates the conserver.cf file for the rack in question and regenerates the conserver.cf file for the system admin controller.
| Note: The conserver package replaces cconsole for access to all consoles (blades, leader nodes, managed service nodes) |
You may find the following conserver man pages useful:
| Man Page | Description | |
| console(1) | Console server client program | |
| conserver(8) | Console server daemon | |
| conserver.cf(5) | Console configuration file for conserver(8) | |
| conserver.passwd(5) | User access information for conserver(8) |
To use the conserver console manager, perform the following steps:
To see the list of available consoles, perform the following:
system-admin:~ # console -x
service0 on /dev/pts/2 at Local
r2lead on /dev/pts/1 at Local
r1lead on /dev/pts/0 at Local
r1i0n8 on /dev/pts/0 at Local
r1i0n0 on /dev/pts/1 at Local
To connect to the service console, perform the following:
system-admin:~ # console service0
[Enter `^Ec?' for help]
Welcome to SUSE Linux Enterprise Server 10 SP1 (x86_64) - Kernel 2.6.16.46-0.12-smp (ttyS1).
service0 login:
To connect to the rack leader controller console, perform the following:
system-admin:~ # console r1lead
[Enter `^Ec?' for help]
Welcome to SUSE Linux Enterprise Server 10 SP1 (x86_64) - Kernel 2.6.16.46-0.12-smp (ttyS1).
r1lead login:
To trigger system request (sysrq) commands once you are connected to a console, perform the following:
Ctrl-e c l 1 8 # set log level to 8
Ctrl-e c l 1 <sysrq cmd> # send sysrq command
To see the list of conserver escape keys, perform the following:
Ctrl-e c ? |
The SGI Tempo systems management software uses network time protocol (NTP) as the primary mechanism to keep the nodes in your Altix ICE system synchronized. This section describes how this mechanism operates on the various Altix ICE components and covers these topics:
The NTP client on the system admin controller should point to the house network time server. The NTP server provides NTP service to system components so that nodes can consult it when they are booted. The system admin controller sends NTP broadcasts to some networks to keep the nodes in sync after they have booted.
The NTP client on the rack leader controller gets time from the system admin controller when it is booted and then stays in sync by watching NTP broadcasts from the system admin controller. The NTP server on the rack leader controller provides NTP service to Altix ICE components so that compute nodes can sync their time when they are booted. The rack leader controller sends NTP broadcasts to some networks to keep the compute nodes in sync after they have booted.
The NTP client on managed service nodes (for a definition of managed, see “discover Command” in Chapter 2) sets its time at initial booting from the system admin controller. It listens to NTP broadcasts from the system admin controller to stay in sync. It does not provide any NTP service.
The NTP client on the compute node sets its time at initial booting from the rack leader controller. It listens to NTP broadcasts from the rack leader controller to stay in sync.
Sometimes, especially during initial deployment of an Altix ICE system when system components are being installed and configured for the first time, NTP is not available to serve time to system components.
An unmodified NTP server, running for the first time, takes quite some time before it offers service. This means the leader and service nodes may fail to get time from the system admin controller as they come online. Compute nodes may also fail to get time from the leader when they first come up. This situation usually happens only at first deployment. After the NTP servers have had a chance to create their drift files, they offer time with far less delay on subsequent reboots.
The following workarounds are in place for situations when NTP cannot serve the time:
The admin and rack leader controllers have the time service enabled (xinetd).
All system node types have the netdate command.
A special startup script is on leader, service, and compute nodes that runs before the NTP startup script.
This script attempts to get the time using the ntpdate command. If the ntpdate command fails because the NTP server it is using is not yet ready to offer time service, the script uses the netdate command instead to get the clock "close".
The NTP startup script then starts the NTP service as normal. Since the clock is known to be "close", NTP will fix the time when the NTP servers start offering time service.
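The fallback logic described above can be sketched as follows. This is not the script that ships with SGI Tempo. The function name set_clock_close, the NTPDATE_CMD and NETDATE_CMD override variables, and the single-hostname invocation of netdate are all assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative sketch of the ntpdate-then-netdate fallback.
# set_clock_close, NTPDATE_CMD, and NETDATE_CMD are hypothetical names;
# the overrides exist so the logic can be exercised without touching the clock.
# usage: set_clock_close TIME_SERVER
set_clock_close() {
    server=$1
    if ${NTPDATE_CMD:-ntpdate} "$server"; then
        echo "clock set with ntpdate from $server"
    elif ${NETDATE_CMD:-netdate} "$server"; then
        # The NTP server is not ready yet; the xinetd time service gets
        # the clock close enough for NTP to correct it later.
        echo "ntpdate failed; clock set close with netdate from $server"
    else
        echo "could not set clock from $server" >&2
        return 1
    fi
}
```

On a leader or service node the server argument would be the system admin controller, and on a compute node it would be the rack leader controller, matching the roles described earlier in this section.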
The SGI Tempo systems management software captures the relevant data for the managed objects in an SGI Altix ICE system. Managed objects are the hierarchy of nodes described in “Basic System Building Blocks” in Chapter 1. The system database is critical to the operation of your SGI Altix ICE system and you need to back up the database on a regular basis.
Managed objects on an SGI Altix ICE system include the following:
Altix ICE system
One ICE system is modeled as a meta-cluster. This meta-cluster contains the racks each modeled as a sub-cluster.
Nodes
System admin controller (admin node), rack leader controllers (leader nodes), service nodes, compute nodes (blades) and chassis management control blades (CMCs) are modeled as nodes.
Networks
The preconfigured and potentially customized IP networks
NICs
The network interfaces for Ethernet and InfiniBand adapters.
Images
The node images installed on each particular node.
SGI recommends that you keep three backups of your system database at any given time. You should implement a rotating backup procedure following the son-father-grandfather principle.
To back up and restore the system database, perform the following steps:
From the system admin controller, to back up the system database perform a command similar to the following:
# mysqldump --opt oscar > backup-file.sql |
To read the dump file back into the system admin controller, perform a command similar to the following:
# mysql oscar < backup-file.sql |
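The three-backup recommendation above can be automated with a small rotation script. The following is a minimal sketch under stated assumptions: backup_rotate, the DUMP_CMD override, and the oscar.sql.son, oscar.sql.father, and oscar.sql.grandfather file names are hypothetical, and the scheme is a simple three-generation rotation rather than a full daily, weekly, and monthly schedule. Only the mysqldump --opt oscar command comes from this section.

```shell
#!/bin/sh
# Hypothetical three-generation rotation for the system database dump.
# DUMP_CMD defaults to the mysqldump command shown above; override it to
# try the rotation without a database.
# usage: backup_rotate BACKUP_DIR
backup_rotate() {
    dir=$1
    tmp="$dir/oscar.sql.tmp"
    # Write the new dump first so a failed dump never displaces a good backup.
    if ! ${DUMP_CMD:-mysqldump --opt oscar} > "$tmp"; then
        rm -f "$tmp"
        echo "dump failed; existing backups left untouched" >&2
        return 1
    fi
    # Age the existing generations; the old grandfather is overwritten.
    if [ -f "$dir/oscar.sql.father" ]; then
        mv -f "$dir/oscar.sql.father" "$dir/oscar.sql.grandfather"
    fi
    if [ -f "$dir/oscar.sql.son" ]; then
        mv -f "$dir/oscar.sql.son" "$dir/oscar.sql.father"
    fi
    mv -f "$tmp" "$dir/oscar.sql.son"
}
```

Run on the system admin controller, for example from a scheduled job, a call such as backup_rotate /var/backups/tempo (a hypothetical directory) keeps the three most recent dumps. To restore, read the chosen generation back with the mysql command shown above.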