This chapter describes how to use the SGI Tempo systems management software to operate your Altix ICE system and covers the following topics:
This section describes image management operations.
This section describes SLES10 services turned off on compute nodes by default, how you can customize the software running on compute nodes or service nodes, create a simple clone image of compute node or service node software, how to use the cimage command, how to use yum to install packages into software images, and how to use the mksiimage command to create compute and service node images. It covers these topics:
Currently, the compute nodes run the SUSE Linux Enterprise Server 10 (SLES10) Service Pack 1 (SP1) Linux distribution. To improve the performance of applications running MPI jobs on compute nodes, the following SLES10 services are turned off:
acpid
auditd
boot.crypto
boot.device-mapper
boot.lvm
boot.md
cron
earlykbd
earlysyslog
fbset
irq_balancer
kbd
novell-zmd
nscd
postfix
powersaved
resmgr
slpd
splash
splash_early
suseRegister
xdm
This section discusses how to manage various nodes on your SGI Altix ICE system. It describes how to configure the various nodes, including the compute and service nodes, and how to augment software packages. Many package management tasks can be done in more than one valid way.
For information on installing patches and updates, see “Installing SGI Tempo Patches and Updating SGI Altix ICE Systems” in Chapter 2.
You can add per-host compute node customization to the compute node images. You do this by adding scripts either to the /opt/sgi/share/per-host-customization/global/ directory or the /opt/sgi/share/per-host-customization/mynewimage/ directory on the system admin controller.
| Note: When creating custom images for compute nodes, make sure you clone the original SGI images. This provides the original images intact that you can fall back to if necessary. |
Scripts in the global directory apply to all compute node images. Scripts in a directory named for an image apply only to that image. The scripts are cycled through once per host when the image is installed on the rack leader controllers. Each script receives one input argument, which is the full path (on the rack leader controller) to the per-host base directory, for example, /var/lib/sgi/mynewimage/i2n11. The README file at /opt/sgi/share/per-host-customization/README on the system admin controller reads as follows:
This directory contains compute node image customization scripts which are executed as part of the install-image operations on the leader nodes when pulling over a new compute node image. After the image has been pulled over, and the per-host-customization dir has been rsynced, the per-host /etc and /var directories are populated, then the scripts in this directory are cycled through once per-host. This allows the scripts to source the node specific network and cluster management settings, and set node specific settings. Scripts in the global directory are iterated through first, then if a directory exists that matches the image name, those scripts are iterated through next. You can use the scripts in the global directory as examples. |
An example global script, /opt/sgi/share/per-host-customization/global/sgi-hostname is, as follows:
#!/bin/sh
#
# Copyright (c) 2007 Silicon Graphics, Inc.
# All rights reserved.
#
# Set the compute node's hostname to the cluster unique name
#
# This script is executed once per-host as part of the install-image operation
# run on the leader nodes, which is called from cimage on the admin node.
# The full path to the per-host iru+slot directory is passed in as $1,
# e.g. /var/lib/sgi/per-host//i2n11.
#
# sanity checks
. /opt/sgi/share/per-host-customization/global/sanity.sh
iruslot=$1
# source cluster management information
. ${iruslot}/etc/opt/sgi/cminfo
# set hostname of blade to cluster unique name
echo ${NAME} > ${iruslot}/etc/HOSTNAME |
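A site-specific script can follow the same pattern as sgi-hostname. The following is a minimal sketch, not an SGI-supplied script: the function name write_motd and the motd text are hypothetical, and it assumes only what the example above shows, namely that the per-host directory arrives as the first argument and that cminfo sets NAME. It tests for cminfo itself rather than relying on sanity.sh.

```shell
#!/bin/sh
# Hypothetical per-host customization script (illustration only).
# It would live in /opt/sgi/share/per-host-customization/mynewimage/.

# write_motd DIR: DIR is the per-host base directory passed in as $1.
write_motd() {
    iruslot=$1
    cminfo="${iruslot}/etc/opt/sgi/cminfo"

    # Skip quietly if the cluster management information is missing.
    [ -r "${cminfo}" ] || return 0

    # Source cluster management information (sets NAME).
    . "${cminfo}"

    echo "Compute node ${NAME}" > "${iruslot}/etc/motd"
}

# Run only when a per-host directory was passed on the command line.
if [ $# -ge 1 ]; then
    write_motd "$1"
fi
```

Because the script writes only beneath the directory it is given, you can try it against a scratch directory before placing it on the system admin controller.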
| Note: Procedures in this section describe how to work with service node and compute node images. Always use a cloned image. If you are adjusting an RPM list, use your own copy of the RPM list. |
The service and compute node images are created during the configure-cluster operation (or during your upgrade to SGI ProPack 5 SP3 if you were running SGI ProPack 5 SP2 previously). This process uses an RPM list to generate a root on the fly.
You can clone a compute node image, or create a new one based on an RPM list. For service nodes, SGI does not support a clone operation. For compute images, you can either clone the image and work on a copy or you can always make a new compute node image from the SGI supplied default RPM list.
To clone a compute node image, perform the following steps:
From the system admin controller, create a clone of the compute node image, as follows:
# cimage --clone-image compute-sles10sp1 new |
After that command is complete, you will have a new image located in /var/lib/systemimager/images/new on the system admin controller.
To see that the image is now available, perform the following command:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: new
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
|
For RPM lists, the default RPM lists are located in /etc/opt/sgi/rpmlists on the system admin controller. SGI suggests you never change these files, but rather, create your own versions using the ones supplied by SGI as a base.
Please note, it is important that certain packages be in the rpmlist for a given node. For example, an rpmlist used for compute nodes should have packages sgi-compute-node and sgi-cluster. Service nodes must have sgi-service-node and sgi-cluster.
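Before building an image from a copied RPM list, it can help to confirm that the required packages are still listed. The sketch below is one way to do that. It assumes the RPM list holds one package name per line, which this guide does not state, and check_rpmlist is a hypothetical helper name, not an SGI tool.

```shell
#!/bin/sh
# check_rpmlist FILE PKG...: report any PKG that is not listed in FILE.
# Assumes one package name per line; adjust the match if your lists differ.
check_rpmlist() {
    list=$1
    shift
    missing=0
    for pkg in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string.
        if ! grep -qxF -- "${pkg}" "${list}"; then
            echo "missing: ${pkg}"
            missing=1
        fi
    done
    return "${missing}"
}

# Example: verify a copied compute node list before running mksiimage.
# check_rpmlist /etc/opt/sgi/rpmlists/my-compute-node.rpmlist \
#     sgi-compute-node sgi-cluster
```

The function returns a nonzero status when any package is missing, so it can gate a longer build script.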
To manually add a package to a compute node image, perform the steps:
Make a clone of the compute node image, as described in “Customizing Software Images”.
Determine what images and kernels you have available now, as follows:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: compute-sles10sp1-new
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
|
From the system admin controller, change directory to the images directory, as follows:
# cd /var/lib/systemimager/images/ |
From the system admin controller, copy the RPMs you wish to add, as follows:
# cp /newrpm.rpm new/tmp |
The new RPMs now reside in the /tmp directory of the image named new. To install them into your new compute node image, perform the following commands:
# chroot new bash |
And then perform the following:
# rpm -Uvh /tmp/newrpm.rpm |
The image on the system admin controller is updated. However, you still need to push the changes out. Ensure there are no nodes currently using the image and then run this command:
# cimage --push-rack new r\* |
This will push the updates to the rack lead controllers and the changes will be seen by the compute nodes the next time they start up. For information on how to ensure the image is associated with a given node, see the cimage --set command and the example in Procedure 3-3.
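The copy, install, and push steps above can be collected into one helper. The sketch below is illustrative only: run and add_rpm_to_image are hypothetical names, and by default the helper only prints the commands it would issue. It invokes rpm directly through chroot instead of starting an interactive bash. Set DRY_RUN=0 to execute the commands, and only after confirming that no nodes are using the image.

```shell
#!/bin/sh
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# add_rpm_to_image IMAGE RPM RACKS: install RPM into IMAGE and push it.
add_rpm_to_image() {
    image=$1
    rpm_file=$2
    racks=$3
    root="/var/lib/systemimager/images/${image}"

    run cp "${rpm_file}" "${root}/tmp"
    run chroot "${root}" rpm -Uvh "/tmp/$(basename "${rpm_file}")"
    run cimage --push-rack "${image}" "${racks}"
}

add_rpm_to_image new /newrpm.rpm 'r*'
```

Quoting the rack wildcard keeps the shell from expanding it against file names in the current directory.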
To clone the compute node image, perform the following:
# cimage --clone-image compute-sles10sp1 compute-sles10sp1-new |
To see the images and kernels in the list, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: compute-sles10sp1-new
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp |
To change the compute nodes to use the cloned image/kernel pair, perform the following:
# cimage --set compute-sles10sp1-new 2.6.16.46-0.12-smp "r*i*n*" |
To manually add a package to the service node image, perform the following steps:
Use the mksiimage command to create your own version of the service node image. See “Creating Compute and Service Images Using the mksiimage Command”.
Change directory to the images directory, as follows:
# cd /var/lib/systemimager/images/ |
From the system admin controller, copy the RPMs you wish to add, as follows, where my-service-image is your own service node image:
# cp /newrpm.rpm my-service-image/tmp |
The new RPMs now reside in the /tmp directory of the image named my-service-image. To install them into your new service node image, perform the following commands:
# chroot my-service-image bash |
# rpm -Uvh /tmp/newrpm.rpm |
At this point, the image has been updated with the RPM. Note that, unlike compute node images, changes made to a service node image are not seen by service nodes until they are reinstalled with the image. To install the package on a running system, copy the RPM to that system and run rpm there.
The cimage command allows you to list, modify, and set software images on the compute nodes in your system.
The cimage command accepts the following options:
| Option | Description | |
| --help | Usage and help text | |
| --list-images | Lists images present in the database | |
| --list-nodes RACK ... | Lists the image and kernel that the compute nodes in the specified rack(s) are set to boot | |
| --set IMAGE KERNEL NODE ... | Sets the compute nodes to a certain boot image and kernel combination | |
| --add-db IMAGE | Adds an image to the database | |
| --del-db IMAGE | Deletes an image from the database | |
| --push-rack IMAGE RACK ... | Pushes an image to specified rack(s) | |
| --del-rack IMAGE RACK | Deletes an image from specified rack(s) | |
| --clone-image OIMAGE NIMAGE | Clones an existing image to a new image | |
| --del-image IMAGE | Deletes an existing image entirely |
RACK arguments take the format rX.
NODE arguments take the format rXiYnZ .
X, Y, Z can be single digits, a [start-end] range, or * for all matches.
... indicates more than one RACK or NODE argument can be passed in.
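The argument formats above can be checked before they reach cimage. The following sketch validates syntax only and does not reproduce the parser that cimage actually uses. The function names are hypothetical, and it assumes that each of X, Y, and Z may be more than one digit, as in the node name r1i3n10 used later in this chapter.

```shell
#!/bin/sh
# One component: digits, a [start-end] range, or * for all matches.
comp='([0-9]+|\[[0-9]+-[0-9]+]|\*)'

# is_rack_spec ARG: succeed if ARG looks like rX.
is_rack_spec() {
    printf '%s\n' "$1" | grep -Eq "^r${comp}\$"
}

# is_node_spec ARG: succeed if ARG looks like rXiYnZ.
is_node_spec() {
    printf '%s\n' "$1" | grep -Eq "^r${comp}i${comp}n${comp}\$"
}

for arg in 'r1i0n0' 'r1i[0-2]n[5-6]' 'r*i*n*' 'r1' 'r1i0'; do
    if is_node_spec "${arg}"; then
        echo "${arg}: node"
    elif is_rack_spec "${arg}"; then
        echo "${arg}: rack"
    else
        echo "${arg}: not recognized"
    fi
done
```

Arguments that contain * or [ ] are quoted in the loop so that the shell passes them through unchanged.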
EXAMPLES
Example 3-1. cimage Command Examples
The following examples walk you through some typical cimage command operations.
To list the available images and their associated kernels, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp |
To list the compute nodes in rack 1 and the image and kernel they are set to boot, perform the following:
# cimage --list-nodes r1
r1i0n0: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n3: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n4: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n5: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n6: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n7: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n8: compute-sles10sp1 2.6.16.46-0.7-smp
[...snip...] |
To set the r1i0n0 compute node to boot the 2.6.16.46-0.12-carlsbad kernel from the compute-sles10sp1 image, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.12-carlsbad r1i0n0 |
To list the nodes in rack 1 to see the changes set in the example above, perform the following:
# cimage --list-nodes r1
r1i0n0: compute-sles10sp1 2.6.16.46-0.7-carlsbad
r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
[...snip...] |
To set all nodes in all racks to boot the 2.6.16.46-0.7-carlsbad kernel from the compute-sles10sp1 image, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.7-carlsbad r*i*n* |
To set two ranges of nodes to boot the 2.6.16.46-0.7-smp kernel, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.7-smp r1i[0-2]n[5-6] r1i[2-3]n[0-4] |
To clone the compute-sles10sp1 image to a new image (so that you can modify it), perform the following:
# cimage --clone-image compute-sles10sp1 mynewimage
Cloning compute-sles10sp1 to mynewimage ... done |
To modify the cloned image created in the example above, copy the needed RPMs into the /var/lib/systemimager/images/mynewimage/tmp directory, use the chroot command to enter the image, and then install the RPMs, as follows:
# cp *.rpm /var/lib/systemimager/images//tmp
# chroot /var/lib/systemimager/images/mynewimage/ bash
# rpm -Uvh /tmp/*.rpm |
If you make changes to the kernels in the image, you need to refresh the kernel database entries for your image. To do this, perform the following:
# cimage --del-db mynewimage
# cimage --add-db mynewimage |
To push new software images out to the compute blades in a rack or set of racks, perform the following:
# cimage --push-rack mynewimage r*
r1lead: install-image: mynewimage
r1lead: install-image: mynewimage done. |
To list the images in the database and the kernels they contain, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.7-carlsbad
kernel: 2.6.16.46-0.7-smp
image: mynewimage
kernel: 2.6.16.46-0.7-carlsbad
kernel: 2.6.16.46-0.7-smp |
To set some compute nodes to boot an image, perform the following:
# cimage --set mynewimage 2.6.16.46-0.7-smp r1i3n* |
You need to reboot the compute nodes to run the new images.
To completely remove an image you no longer use, both from the system admin controller and from all compute nodes in all racks, perform the following:
# cimage --del-image mynewimage
r1lead: delete-image: mynewimage
r1lead: delete-image: mynewimage done. |
The packages that make up SGI Tempo and SLES are available in repositories at which yum is configured to look.
| Note: Always work with copies of software images. |
SGI provides a wrapper around yum that makes it simple to install a package into an image.
However, yum only looks at packages that are part of a yum repository. If you are installing your own RPM, you need to configure yum to look at your own repositories in addition to the others. See the appropriate yum documentation.
| Note: The yum software maintains a cache of the repository metadata and does not update that cache until a certain number of minutes have passed. This time limit is defined by the metadata_expire option in the yum.conf file. See the yum.conf(5) man page. If you synchronize your repository with SUSE or Novell shortly after a yum operation and yum then reports no updates when there should be some, run the following command to clear the caches and force yum to try again: yum clean all. |
This example shows how to install the zlib-devel package into the service node image so that the next time you image or install a service node, it has this new package.
You can run the following command:
# yum-image-wrapper /var/lib/systemimager/images/my-service-sles10sp1 install zlib-devel |
If you update a compute node image on the system admin controller, you have to use the cimage command to push the changes.
If you update a service node image on the system admin controller, that service node needs to be re-installed/re-imaged to get the change. The discover command can be given an alternate image.
These instructions only apply to service nodes.
You can use the yum command to install a package on a service node. From the system admin controller, you can issue a command like the following. Please note that SGI suggests using the -y option. This prevents yum from asking for input.
# ssh service0 yum install zlib-devel |
You can create service node and compute node images using the mksiimage(1) command. This will generate a root on the fly.
Fresh installations of SGI ProPack 5 SP3 create these images during the configure-cluster installation step.
The RPM lists that determine which packages are installed in the images are located in /etc/opt/sgi/rpmlists, for example, /etc/opt/sgi/rpmlists/compute-sles10sp1.rpmlist. Do not edit the default SGI RPM lists; instead, make copies and work on the copies.
To use the mksiimage command to create a service node image, perform the following:
Make a copy of the example service node image RPM list and work on the copy, as follows:
# cp /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist /etc/opt/sgi/rpmlists/my-service-node.rpmlist |
Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.
Run the mksiimage command to create the root. This example uses /var/lib/systemimager/images/my-service-node-image as the home for this image.
This command may take a long time and has lots of output. You could consider redirecting output to a /tmp file. There is so much output that running this from a serial console more than doubles the time it takes to complete if the output is not redirected.
| Note: When using the mksiimage command, make sure that there are NO spaces in the comma-separated list, otherwise, the command will fail. |
# mksiimage -A --name my-service-node-image --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64 --filename /etc/opt/sgi/rpmlists/my-service-node.rpmlist |
Post-process the image. This just runs several commands to properly integrate and update the image for use in your Altix ICE system:
# post-process-sgi-image /var/lib/systemimager/images/my-service-node-image/ eth1 |
Now the image is ready to be used with service nodes. To associate the new image with a service node, use the discover command. See “Installing a Service Node with a Non-default Image”.
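Because mksiimage fails when the comma-separated location list contains spaces, one option is to build the list programmatically instead of typing it. The sketch below is a hypothetical helper, not part of SGI Tempo. It joins the repository paths with commas and refuses any path that contains whitespace.

```shell
#!/bin/sh
# join_locations PATH...: print the paths joined by commas, with no spaces.
# Fails if any path contains whitespace, since that would split the list.
join_locations() {
    joined=""
    for loc in "$@"; do
        case "${loc}" in
            *[[:space:]]*)
                echo "whitespace in path: '${loc}'" >&2
                return 1
                ;;
        esac
        joined="${joined:+${joined},}${loc}"
    done
    echo "${joined}"
}

locations=$(join_locations \
    /tftpboot/distro/sles-10-x86_64 \
    /tftpboot/oscar/common-rpms \
    /tftpboot/oscar/sles-10-x86_64)
echo "--location ${locations}"
```

The resulting string can then be passed to mksiimage as the single argument that follows --location.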
To use the mksiimage(1) command to create a compute node image, perform the following:
Make a copy of the example compute node image RPM list and work on the copy, as follows:
# cp /etc/opt/sgi/rpmlists/compute-sles10sp1.rpmlist /etc/opt/sgi/rpmlists/my-compute-node.rpmlist |
Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.
Run the mksiimage command to create the root. This example uses /var/lib/systemimager/images/my-compute-node-image as the home for this image. This command may take a long time and has lots of output. You could consider redirecting output to a /tmp file. There is so much output that running this from a serial console more than doubles the time it takes to complete if the output is not redirected.
# mksiimage -A --name my-compute-node-image --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64 --filename /etc/opt/sgi/rpmlists/my-compute-node.rpmlist |
Add the new image to the list of images cimage knows is available, as follows:
# cimage --add-db my-compute-node-image |
| Note: The order in which you execute commands is important. Make sure you run the cimage --add-db command before the post-process-sgi-image command. |
Post-process the image. This just runs several commands to properly integrate and update the image for use in your Altix ICE system:
# post-process-sgi-image /var/lib/systemimager/images/my-compute-node-image/ eth1 |
For information on how to use the cimage command to push this new image to the rack leader controllers, see “cimage Command”.
This section describes how to copy a software image from an existing service node so you can use the image when adding a new service node.
To copy a software image from an existing service node for use on a new service node, perform the following steps:
As root user, log into the service node from which you want to copy the software image.
The si_prepareclient program allows you to copy an image from the running service node. To start it, perform the following:
service0:~ # si_prepareclient --server admin |
There will be a couple questions to answer. It will take a few moments, then it will return you to the command prompt. Exit the service node and return to the admin node. Now you are ready to copy the image from the admin node.
As root user on the admin node, run the following command replacing the image name and service node name as appropriate for your site.
system-admin:~ # mksiimage --Get --client service0 --name service-custom |
The mksiimage command copies the existing service node image. No progress information is provided. This takes several minutes, depending on the size of the image on the service node.
Now run the post-processing command, as follows:
system-admin:~# post-process-sgi-image /var/lib/systemimager/images/service-custom eth1 |
Discover the new service node with the copied image using the image specifier, as follows:
system-admin:~# discover --service 0,image=service-custom |
After you have updated or created a service node image, you can install that image on to a managed service node, such as a login node.
| Note: Re-installing the service node using the discover process will destroy everything previously on the root drive. |
By default, discover uses the SGI default service-sles10sp1 image. To install a different image, add the image specifier, for example:
# discover --service 2,image=my-service-node-image |
The image specifier above directs discover to install the named operating system image.
For more information on the discover command, see “discover Command” in Chapter 2.
This section describes how to maintain packages specific to your site and have them available to yum and mksiimage .
SGI suggests putting site-specific packages in a separate location. They should not reside in the same location as SGI or Novell supplied packages.
To set up a custom repository for your site packages, perform the following steps:
Create a directory for your site-specific packages on the system admin controller, as follows:
# mkdir -p /tftpboot/site-local/sles-10-x86_64 |
Copy your site packages to the new directory, as follows:
# cp my-package-1.0.x86_64.rpm /tftpboot/site-local/sles-10-x86_64 |
Create the metadata so yum sees this as a repository, as follows:
# yume --prepare --repo /tftpboot/site-local/sles-10-x86_64 |
If you wish to use yum (as opposed to only mksiimage), create a yum repository configuration file in /etc/yum.repos.d. Use tempo.repo as an example. You may need to turn off the gpgcheck option if your RPMs are not signed. Remember to deploy the repository to all images from which you wish to use this repository.
If you wish your new package to be installed in to an image created by mksiimage by default, you will need to add it in to an RPM list. Example rpmlists are in /etc/opt/sgi/rpmlists. Always work on your own copy; do not modify the SGI supplied RPM lists. For more information, see “Creating Compute and Service Images Using the mksiimage Command”.
The following mksiimage command is the same as the one shown in Procedure 3-5, except that it adds your new repository to the list:
# mksiimage -A --name my-service-node-image --location /tftpboot/distro/sles-10-x86_64,/tftpboot/oscar/common-rpms,/tftpboot/oscar/sles-10-x86_64,/tftpboot/site-local/sles-10-x86_64 --filename /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist |
This command creates a new root image, using your newly created repository as one of the sources, and adds the new package, assuming it is listed in /etc/opt/sgi/rpmlists/service-sles10sp1.rpmlist.
Run post-process-sgi-image command as described in Procedure 3-5.
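For the yum repository configuration file mentioned in the procedure above, a minimal sketch follows. The repository id site-local, the file name site-local.repo, and the function name are assumptions made for illustration; compare the result with tempo.repo on your system before using it. The sketch writes to a scratch directory so that it can be tried without touching /etc/yum.repos.d.

```shell
#!/bin/sh
# write_site_repo DIR: create DIR/site-local.repo pointing at the site
# package directory. The repository id "site-local" is a made-up name.
write_site_repo() {
    repo_dir=$1
    cat > "${repo_dir}/site-local.repo" <<'EOF'
[site-local]
name=Site-local packages
baseurl=file:///tftpboot/site-local/sles-10-x86_64
enabled=1
# Set gpgcheck=1 if your RPMs are signed.
gpgcheck=0
EOF
}

# On a real system DIR would be /etc/yum.repos.d; a scratch directory
# is used here so the sketch can be tried safely.
scratch=$(mktemp -d)
write_site_repo "${scratch}"
cat "${scratch}/site-local.repo"
```

The file:// baseurl is valid only where /tftpboot/site-local is present, so an image or node that does not have that directory needs a different baseurl.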
All node types that are part of an SGI Altix ICE system can have configuration settings adjusted by the configuration framework. There is some overlap between the per-host customization instructions and the configuration framework instructions. Each approach plays a role in configuring your system. The major differences between the two methods are, as follows:
Per-host customization runs at the time an image is pushed to the rack leader controllers.
Per-host customization only applies to compute node images.
The Altix ICE system configuration framework can be used with all node types.
The system configuration framework is run when a new root is created, when SuSEconfig command is run for some other reason, as part of a yum operation, or when new compute images are pushed with the cimage command.
This framework exists to make it easy to adjust configuration items. There are SGI-supplied scripts already present. You can add more scripts as you wish. You can also exclude scripts from running without purging the script if you decide a certain script should not be run. The following set of questions in bold and bulleted answers describes how to use the system configuration framework.
How does the system configuration framework operate?
These files could be added, for example, to a running service node, or to an already created service or compute image. Remember that images destined for compute nodes need to be pushed with the cimage command after being altered. For more information, see “cimage Command”.
The /opt/sgi/lib/cluster-configuration script is called. The places from which it is called are described below.
That script iterates through scripts residing in /etc/opt/sgi/conf.d.
Any scripts listed in /etc/opt/sgi/conf.d/exclude are skipped, as are scripts that are not executable.
Scripts in the system configuration framework must tolerate files that do not exist yet, as described below. For example, check that a syslog configuration file exists before trying to adjust it.
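The selection rules above can be pictured with a short loop. This is an illustrative sketch of the described behavior, not the actual cluster-configuration script. The function name is hypothetical, and it assumes that the exclude file lists one script name per line.

```shell
#!/bin/sh
# run_conf_d DIR: run each executable script in DIR, skipping names listed
# in DIR/exclude. Assumes the exclude file holds one script name per line.
run_conf_d() {
    conf_dir=$1
    for script in "${conf_dir}"/*; do
        name=$(basename "${script}")
        [ "${name}" = "exclude" ] && continue
        # Skip anything that is not an executable regular file.
        [ -f "${script}" ] && [ -x "${script}" ] || continue
        # Skip scripts named in the exclude file, if it exists.
        if [ -f "${conf_dir}/exclude" ] &&
           grep -qxF -- "${name}" "${conf_dir}/exclude"; then
            continue
        fi
        "${script}"
    done
}
```

Under these rules, adding a script name to the exclude file or clearing its execute bit both keep the script from running without deleting it.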
From where is the framework called?
The callout for /opt/sgi/lib/cluster-configuration is implemented as a yum plugin that executes after packages have been installed and cleaned.
There is also a SUSE configuration script in /sbin/conf.d, called SuSEconfig.00cluster-configuration, that calls the framework. This covers the case in which you use YaST to install or upgrade packages.
One of the scripts called by the framework calls SuSEconfig. A check is made to avoid a callout loop.
The framework is also called when the admin, leader, or service nodes start up. The call is made just after networking is configured. As a site administrator, you could create custom scripts here that check on or perform certain configuration operations.
When using the cimage command to push a compute node root image to rack leaders, the configuration framework executes within the chroot of the compute node image after it is pulled from the admin node to the rack leader node.
How do I adjust my system configuration?
Create a small script in /etc/opt/sgi/conf.d to do the adjustment.
Be sure that you test for existence of files and do not assume they are there (see "Why do scripts need to tolerate files that do not exist but should?" below).
Why do scripts need to tolerate files that do not exist but should?
This is because the mksiimage command runs yume and yum in two steps. The first step only installs 40 or so RPMs but our framework is called then too. The second pass installs the other "hundreds" of RPMs. So the framework is called once before many packages are installed, and again after everything is in place. So not all files you expect might be available when your small script is called.
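A script for /etc/opt/sgi/conf.d that tolerates missing files might look like the following sketch. The function name, the file /etc/example.conf, and the setting are placeholders for illustration, not real Tempo defaults. The script does nothing when the file has not been installed yet, and it does not add the same line twice.

```shell
#!/bin/sh
# Hypothetical /etc/opt/sgi/conf.d script: append one setting to a
# configuration file, but only if that file has been installed already.
add_setting() {
    conf_file=$1
    setting=$2

    # The file may not exist during the first mksiimage pass; do nothing.
    [ -f "${conf_file}" ] || return 0

    # Stay idempotent: the framework can call this script many times.
    grep -qxF -- "${setting}" "${conf_file}" && return 0

    echo "${setting}" >> "${conf_file}"
}

# The file name and setting below are placeholders, not real defaults.
add_setting /etc/example.conf 'example_option = yes'
```

Remember that a script must be executable and must not be listed in /etc/opt/sgi/conf.d/exclude for the framework to run it.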
How does the yum plugin work?
For the yum plugin to work, the /etc/yum.conf file must have plugins=1 set. SGI Tempo software ensures this by way of a trigger in the sgi-cluster package. Any time yum is installed or updated, the trigger verifies that plugins=1 is set.
How does yume work?
yume, an oscar wrapper for yum, works by creating a temporary yum configuration file in /tmp and then points yum at it. This temporary configuration file needs to have plugins enabled. A tiny patch to yume makes this happen. This fixes it for yume and also mksiimage, which calls yume as part of its operation.
The SGI Tempo 1.3 release includes a new cluster configuration repository/update framework. This framework generates and distributes configuration updates to admin, service, and leader nodes in the cluster. Some of the configuration files managed by this framework include C3 conserver, DNS, Ganglia, hosts files, and NTP.
When an event occurs that requires these files to be updated, the framework executes on the admin node. The admin node stores the updated configuration framework in a special cached location and updates the appropriate nodes with their new configuration files.
In addition to the updates happening as required, the configuration file repository is consulted when an admin, service, or leader node boots. This happens shortly after networking is started. Any configuration files that are new or updated are transferred at this early stage so that the node is fully configured by the time it is fully operational.
There are no hooks for customer configuration in the configuration repository at this time.
This update framework is tied in with the /etc/opt/sgi/conf.d configuration framework to provide a full configuration solution. As mentioned earlier, customers are encouraged to create /etc/opt/sgi/conf.d scripts to do cluster configuration.
The cpower command allows you to power up, power down, reset, and show the power status of system components.
The cpower command is, as follows:
cpower [<option> ...] [<target_type>] [<action>] <target> |
The <option> argument can be one or more of the following:
| Option | Description | |
| --noleader | Do not include leader nodes (valid with rack and system domains only). | |
| --noservice | Do not include service nodes (valid with system domain only). | |
| --ipmi | Uses ipmitool to communicate. [default] | |
| --ssh | Uses ssh to communicate. | |
| --intelplus | Uses the -o intelplus option for ipmitool [default] Note that you do not usually need to specify this. | |
| --force | When using wildcards in the target, disable all “safety” checks. Make sure you really want to use this command. | |
| -n, --noexec | Displays, but does not execute, commands that affect power. | |
| -v, --verbose | Print additional information on command progress |
| Note: The command will fail if the target contains any wild cards, unless the --all option is specified. |
The <target_type> argument is one of the following:
| --node | Applies the action to nodes. Nodes are compute nodes, rack leader controllers (leader nodes), system admin controller (admin node), and service nodes. [default] | |
| --iru | Applies the action at the IRU level. | |
| --rack | Applies the action at the rack level. | |
| --system | Applies the action to the system. You must not specify a target with this type. |
The <action> argument is one of the following:
| --status | Show the power status of the target, including whether it is booted or not. [default] | |
| --up, --on | Powers up the target. | |
| --down, --off | Powers down the target. | |
| --reset | Performs a hard reset on the target. | |
| --cycle | Power cycles the target. | |
| --boot | Boots up the target, unless it is already booted. Waits for all targets to boot. | |
| --reboot | Reboots the target, even if already booted. Waits for all targets to boot. | |
| --halt | Halts and then powers off the target. | |
| --shutdown | Shuts down the target, but does not power it off. Waits for targets to shut down. | |
| --identify <interval> | Turns on the identifying LED for the specified interval in seconds. Use an interval of 0 to turn it off immediately. | |
| -h, --help | Shows help usage statement. |
The target must always be specified except when the --system option is used. Wildcards may be used, but be careful not to accidentally power off or reboot the leader nodes. If wildcard use affects any leader node, the command fails with an error.
The default for the cpower command is to operate on system nodes, such as compute nodes, leader nodes, or service nodes. If you do not specify --iru, --rack, or --system, the command defaults to operating as if you had specified --node.
Here are examples of node target names:
r1i3n10
Compute node at rack 1, IRU 3, slot 10
service0
Service node 0
r3lead
Rack leader controller (leader node) for rack 3
r1i*n*
Wildcards let you specify ranges of nodes. For example, r1i*n* specifies all compute nodes in all IRUs on rack 1.
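When scripting around cpower, it can be useful to tell which kind of node a target name refers to, for example, to avoid passing a leader node by accident. The sketch below classifies names using only the naming patterns listed above. The function name target_kind is hypothetical, and the function does not consult the cluster database.

```shell
#!/bin/sh
# target_kind NAME: print the kind of node a target name refers to,
# based only on the naming patterns shown above.
target_kind() {
    case "$1" in
        r*lead)      echo "leader" ;;
        service*)    echo "service" ;;
        r*i*n*)      echo "compute" ;;
        *)           echo "unknown" ;;
    esac
}

for name in r1i3n10 service0 r3lead 'r1i*n*'; do
    echo "${name}: $(target_kind "${name}")"
done
```

The check is a pattern match on the name only, so a name that fits one of the patterns is classified whether or not such a node exists.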
The default operation for the cpower command is to operate on nodes and to provide you the status of these nodes, as follows:
# cpower r1i* |
This example gives you the power status and boot status of all the compute blades in rack 1. This command is equivalent to cpower --node --status r1i*.
This command issues an ipmitool power off command to all of the nodes specified by the wildcard, as follows:
# cpower --off r2i* |
The default is to apply to a node.
The following commands behave exactly as they would if you were using ipmitool directly, and have no special extra logic for ordering:
# cpower --up r1i*
# cpower --reset r1i*
# cpower --cycle r1i*
# cpower --identify 5 r1i*
| Note: --up is a synonym for --on and --down is a synonym for --off. |
The cpower command contains more logic when you go up to higher levels of abstraction, for example, using --iru, --rack, and --system. These higher-level domain specifiers tell the command to be smart about how to order the various actions that you give on the command line.
The --iru option tells the command to use correct ordering with IRU power commands. In this case, it first connects to the CMC on each IRU in rack 1 to issue the power on command, which turns on power to the IRU chassis (this is not the equivalent ipmitool command). Then it powers up the compute nodes in the IRU. Powering down works in the opposite order: power to the IRU is turned off after power to the blades. IRU targets are specified as follows: r3i2 for rack 3, IRU 2.
# cpower --iru --up r1* |
The --rack option ensures that power commands to the leader node are done in the correct order relative to compute nodes within a rack. First, it powers up the leader node and waits for it to boot up (if it is not already up). Then it does the functional equivalent of a cpower --iru --up r4i* on each of the IRUs contained in the rack, including applying power to each IRU chassis. Using the --down option does the opposite, and also turns off the leader node (after doing a shutdown) after all the IRUs are powered down. To avoid including leader nodes in a power command for a rack, use the --noleader option. Rack targets are specified as follows: r4 for rack 4. Here is an example:
# cpower --rack --up r4 |
Commands with the --system option ensure that power-up commands are applied first to service nodes, then to leader nodes, and then to IRUs and compute blades. Likewise, compute blades are powered down before IRUs, leader nodes, and service nodes, in that order. To avoid including service nodes in a system-domain command, use the --noservice option. Note that you must not specify a target with the --system option, since it applies to the entire Altix ICE system.
| Note: The --shutdown --off combination of actions was deprecated in the SGI Tempo v1.2 release. Use the --halt option in its place. |
You need to configure the order in which service nodes are booted up and shut down as part of the overall system power management process. This is done by setting a boot_order for each service node. Use the cadmin command to set the boot order for a service node, for example:
# cadmin --set-boot-order --node service0 2 |
The cpower --system --boot command boots up service nodes with a lower boot order first, and then boots up service nodes with a higher boot order. The reverse is true when shutting down the system with cpower. For example, if service1 has a boot order of 3 and service2 has a boot order of 5, service1 is booted completely, and then service2 is booted. During shutdown, service2 is shut down completely before service1 is shut down.
There is a special meaning to a service node having a boot order of zero. This value causes the cpower --system command to skip that service node completely for both start up and shutdown (although not for status queries). Negative values for the service node boot order setting are not permitted.
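The boot order rules above can be sketched as a small filter. This is only an illustration of the ordering described in the text, not how cpower is implemented; boot_sequence and its "node boot_order" input format are hypothetical.

```shell
# boot_sequence: hypothetical helper that prints service nodes in the
# order described above. Reads "<node> <boot_order>" lines on stdin.
boot_sequence() {
    case $1 in
        boot)     key=1,1n ;;     # lowest boot order first
        shutdown) key=1,1nr ;;    # highest boot order first
        *) echo "usage: boot_sequence boot|shutdown" >&2; return 1 ;;
    esac
    # Skip nodes whose boot order is 0, then sort numerically by boot order.
    awk '$2 > 0 { print $2, $1 }' | sort -k "$key" | awk '{ print $2 }'
}

# service0 has boot order 0, so it is skipped in both directions.
printf '%s\n' 'service0 0' 'service1 3' 'service2 5' | boot_sequence boot
# prints:
#   service1
#   service2
```

Calling boot_sequence shutdown on the same input prints service2 before service1.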
| Note: The IPMI power commands necessary to enable a system to boot (either with a power reset or a power on) may be sent to a node. The --halt option halts the target node and then powers it off. |
The --halt option works at the node, IRU, or rack domain level. It shuts down the nodes (in the correct order if you use the --iru or --rack options) and then powers them off. This is particularly useful when powering off a rack, since otherwise the leader nodes may be shut down before there is a chance to power off the compute blades. Here is an example:
# cpower --halt --rack r1 |
To boot up systems that have not already been booted, perform the following:
# cpower --boot r1i2n* |
Again, the command boots up nodes in the right order if you specify the --iru or --rack options and the appropriate target. Otherwise, there is no guarantee that, for example, the command will attempt to power on the leader node before compute nodes in the same rack.
To reboot all of the nodes specified, or boot them if they are already shut down, perform the following:
# cpower --reboot --iru r3i3 |
The --iru or --rack options ensure proper ordering if you use them. In this case, the command makes sure that power is supplied to the chassis for rack 3, IRU 3, and then all the compute nodes in that IRU are rebooted.
EXAMPLES
Example 3-2. cpower Command Examples
To boot compute blade r1i0n8, perform the following:
# cpower --boot r1i0n8 |
To boot a number of compute blades at the same time, perform the following:
# cpower --boot --rack r1 |
| Note: The --boot option will only boot those nodes that have not already booted. |
To shut down service node 0, perform the following:
# cpower --halt service0 |
To shut down and switch off everything in rack 3, perform the following:
# cpower --halt --rack r3 |
| Note: This command shuts down and then powers off all of the compute nodes in parallel, then shuts down and powers off the leader node. Use the --noleader option if you want the leader node to remain booted up. |
To shut down the entire system, including all service nodes and all leader nodes, but not the admin node, and not turn the power off to anything, perform the following:
# cpower --halt --system |
To shut down all the compute nodes, but not the service nodes or leader nodes, perform the following:
# cpower --halt --system --noleader --noservice |
| Note: The only way to shut down the system admin controller (admin node) is to perform the operation manually. |
This section describes the cluster command and control (C3) tool suite for cluster administration and application support.
| Note: The SGI Tempo version of C3 does not include the cshutdown and cpushimage commands. |
The C3 commands used on the SGI Altix ICE 8200 system are as follows:
| C3 Utilities | Description | |
| cexec(s) | Executes a given command string on each node of a cluster | |
| cget | Retrieves a specified file from each node of a cluster and places it into the specified target directory | |
| ckill | Runs kill on each node of a cluster for a specified process name | |
| clist | Lists the names and types of clusters in the cluster configuration file | |
| cnum | Returns the node positions specified by the node name given on the command line | |
| cname | Returns the node names specified by the range specified on the command line | |
| cpush | Pushes files from the local machine to the nodes in your cluster |
cexec is the most useful C3 utility. Use the cpower, power-iru, power-rack, and power-system commands rather than cshutdown (see “Power Management Commands”).
EXAMPLES
Example 3-3. C3 Command General Examples
The following examples walk you through some typical C3 command operations.
You can use the cname and cnum commands to map names to locations and vice versa, as follows:
# cname rack_1:0-2
local name for cluster: rack_1
nodes from cluster: rack_1
cluster: rack_1 ; node name: r1i0n0
cluster: rack_1 ; node name: r1i0n1
cluster: rack_1 ; node name: r1i0n10
# cnum rack_1: r1i0n0
local name for cluster: rack_1
nodes from cluster: rack_1
r1i0n0 is at index 0 in cluster rack_1
# cnum rack_1: r1i0n1
local name for cluster: rack_1
nodes from cluster: rack_1 |
You can use the clist command to retrieve the number of racks, as follows:
# clist
cluster rack_1 is an indirect remote cluster
cluster rack_2 is an indirect remote cluster
cluster rack_3 is an indirect remote cluster
cluster rack_4 is an indirect remote cluster |
You can use the cexec command to view the addressing scheme of the C3 utility, as follows:
# cexec rack_1:1 hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n1---------
r1i0n1
# cexec rack_1:2-3 rack_4:0-3,10 hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n10---------
r1i0n10
--------- r1i0n11---------
r1i0n11
************************* rack_4 *************************
************************* rack_4 *************************
--------- r4i0n0---------
r4i0n0
--------- r4i0n1---------
r4i0n1
--------- r4i0n10---------
r4i0n10
--------- r4i0n11---------
r4i0n11
--------- r4i0n4---------
r4i0n4 |
The following set of commands shows how to use the C3 commands to traverse the different levels of hierarchy in your Altix ICE system (for information on the hierarchical design of your Altix ICE system, see “Basic System Building Blocks” in Chapter 1).
To execute a C3 command on all blades within the default Altix ICE system, for example, rack 1, perform the following:
# cexec hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n0---------
r1i0n0
--------- r1i0n1---------
r1i0n1
--------- r1i0n10---------
r1i0n10
--------- r1i0n11---------
r1i0n11
... |
To run a C3 command on all compute nodes across an Altix ICE system, perform the following:
# cexec --all hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n0---------
r1i0n0
--------- r1i0n1---------
r1i0n1
...
--------- r2i0n10---------
r2i0n10
...
--------- r3i0n11---------
r3i0n11
... |
To run a C3 command against the first rack leader controller, in the first rack, perform the following:
# cexec --head hostname
************************* rack_1 *************************
--------- rack_1---------
r1lead |
To run a C3 command against all rack leader controllers across all racks, perform the following:
# cexec --head --all hostname
************************* rack_1 *************************
--------- rack_1---------
r1lead
************************* rack_2 *************************
--------- rack_2---------
r2lead
************************* rack_3 *************************
--------- rack_3---------
r3lead
************************* rack_4 *************************
--------- rack_4---------
r4lead |
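In the cexec output shown earlier, rack_1:2-3 selects r1i0n10 and r1i0n11, and index 10 in rack_4 selects r4i0n4. That is consistent with C3 indexing a list of node names kept in lexicographic order, although the source does not state this explicitly. The following sketch reproduces that mapping under two assumptions: the rack contains a single IRU with slots 0 through 15, and names are sorted as plain strings. All three functions are hypothetical helpers, not part of the C3 suite.

```shell
# Hypothetical helpers for illustration; not part of the C3 suite.

# Print the compute node names for IRU 0 of a rack (slots 0-15 assumed).
rack_nodes() {
    slot=0
    while [ "$slot" -le 15 ]; do
        echo "r${1}i0n${slot}"
        slot=$((slot + 1))
    done
}

# Expand an index list such as "0-3,10" into one index per line.
expand_range() {
    case $1 in ''|*[!0-9,-]*) echo "bad range: $1" >&2; return 1 ;; esac
    old_ifs=$IFS; IFS=,; set -- $1; IFS=$old_ifs
    for part in "$@"; do
        case $part in
            *-*) lo=${part%%-*}; hi=${part#*-} ;;
            *)   lo=$part; hi=$part ;;
        esac
        case $lo in ''|*[!0-9]*) echo "bad range: $part" >&2; return 1 ;; esac
        case $hi in ''|*[!0-9]*) echo "bad range: $part" >&2; return 1 ;; esac
        i=$lo
        while [ "$i" -le "$hi" ]; do echo "$i"; i=$((i + 1)); done
    done
}

# Print the node names at the given zero-based positions of the sorted list.
select_nodes() {
    indices=$(expand_range "$2") || return 1
    indices=$(echo $indices)     # join the indices with single spaces
    rack_nodes "$1" | LC_ALL=C sort |
        awk -v want="$indices" '
            BEGIN { n = split(want, w, " "); for (k = 1; k <= n; k++) pick[w[k]] = 1 }
            { idx = NR - 1; if (idx in pick) print }'
}

select_nodes 4 0-3,10
```

Under these assumptions the last command prints r4i0n0, r4i0n1, r4i0n10, r4i0n11, and r4i0n4, the same nodes that appear in the rack_4 portion of the cexec example.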
The following set of examples shows some specific uses of the C3 commands that you are likely to employ.
Example 3-4. C3 Command Specific Use Examples
From the system admin controller, run a command on rack 1 without including the rack leader controller, as follows:
# cexec rack_1: <cmd> |
Run a command on all service nodes only, as follows:
# cexec -f /etc/c3svc.conf <cmd> |
Run a command on all compute nodes in the system, as follows:
# cexec --all <cmd> |
Run a command on all rack leader controllers, as follows:
# cexec --all --head <cmd> |
Run a command on blade 42 (compute node 42) in rack 2, as follows:
# cexec rack_2:42 <cmd> |
From a service node over the InfiniBand Fabric, run a command on all blades (compute nodes) in the system, as follows:
# cexec --all <cmd> |
Run a command on blade 42 (compute node 42), as follows:
# cexec blades:42 <cmd> |
The cadmin command allows you to change certain administrative parameters in the cluster such as the boot order of service nodes, the administrative status of nodes, and the adding, changing, and removal of IP addresses associated with service nodes.
| Note: The Tempo 1.3 version of cadmin uses a different syntax than previous releases. |
To get the cadmin usage statement, perform the following:
# cadmin --help
cadmin: SGI Tempo Administrative Interface
Help:
In general, these commands operate on {node}. {node} is the Tempo style
node name. For example, service0, r1lead, r1i0n0. Even when the host name
for a service node is changed, the Tempo name for that node may still be used
for {node} below. The node name can either be the tempo unique node name
or a customer-supplied host name associated with a tempo unique node name.
--version : Display current release information
--set-admin-status --node {node} {value} : Set Administrative Status
--show-admin-status --node {node} : Show Administrative Status
--set-boot-order --node {node} [value] : Set boot order [*]
--show-boot-order --node {node} : Show boot order [*]
--show-ips --node {node} : Show all allocated IPs associated with node
--show-hostname --node {node} : show the current host name for ice node {node}
--set-ip --node {node} --net {net} {hostname=ip} : Change an allocated ip [*]
--del-ip --node {node} --net {net} {hostname=ip} : Delete an ip [*]
--add-ip --node {node} --net {net} {hostname=ip} : allocate a new ip [*]
Not yet implemented:
--set-hostname --node {node} {new-hostname} : change the host name [*]
Descriptions of Selected Values:
{hostname=ip} means specify the host name associated with the specified
ip address.
{net} is the tempo network to change such as ib-0, ib-1, head, gbe, bmc, etc
{node} is a tempo-style node name such as r1lead, service0, or r1i0n0.
[*] Only applies to service nodes |
EXAMPLES
Example 3-5. SGI Tempo Administrative Interface (cadmin) Command
Set a node offline, as follows:
# cadmin --set-admin-status --node r1i0n0 offline |
Set a node online, as follows:
# cadmin --set-admin-status --node r1i0n0 online |
Set the boot order for a service node, as follows:
# cadmin --set-boot-order --node service0 2 |
Add an IP to an existing service node, as follows:
# cadmin --add-ip --node service0 --net ib-0 my-new-ib0-ip=10.148.0.200 |
Change the Tempo needed service0-ib0 IP address, as follows:
# cadmin --set-ip --node service0 --net head service0=172.23.0.199 |
Show currently allocated IP addresses for service0, as follows:
# cadmin --show-ips --node service0
IP Address Information for Tempo node: service0
ifname          ip             Network
myservice-bmc   172.24.0.3     head-bmc
myservice       172.23.0.3     head
myservice-ib0   10.148.0.254   ib-0
myservice-ib1   10.149.0.67    ib-1
myhost          172.24.0.55    head-bmc
myhost2         172.24.0.56    head-bmc
myhost3         172.24.0.57    head-bmc |
Delete a site-added IP address (you cannot delete Tempo needed IP addresses), as follows:
# cadmin --del-ip --node service0 --net ib-0 my-new-ib0-2-ip=10.148.0.201 |
Change the hostname associated with service0 to be myservice, as follows:
# cadmin --set-hostname --node service0 myservice |
Set and show the cluster subdomain, as follows:
# cadmin --set-subdomain biteme2.americas.sgi.com
# cadmin --show-subdomain |
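The --add-ip, --set-ip, and --del-ip examples above all take a {hostname=ip} argument. A site script that generates such arguments may want to check their shape before calling cadmin. The following is a minimal sketch; valid_host_ip is a hypothetical helper, it checks only for a non-empty name and a dotted-quad IPv4 address, and it makes no claim about what validation cadmin itself performs.

```shell
# valid_host_ip: hypothetical pre-check for a "hostname=ip" argument.
# Returns 0 when the argument has a non-empty host name and an IPv4
# address made of four decimal fields in the range 0-255.
valid_host_ip() {
    case $1 in
        ?*=?*) ;;
        *) return 1 ;;
    esac
    ip=${1#*=}
    # Only digits and dots, with no leading, trailing, or doubled dots.
    case $ip in
        *[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS; IFS=.; set -- $ip; IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

if valid_host_ip my-new-ib0-ip=10.148.0.200; then
    echo "argument looks well formed"
fi
```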
SGI Tempo management systems software uses the open-source console management package called conserver. For detailed information on conserver, see http://www.conserver.com/.
An overview of the conserver package is as follows:
Manages the console devices of all managed nodes in an Altix ICE system
A conserver daemon runs on the system admin controller (admin node) and the rack leader controllers (leader nodes). The system admin controller manages leader and service node consoles. The rack leader controllers manage blade consoles.
The conserver daemon connects to the consoles using ipmitool. Users connect to the daemon to access them. Multiple users can connect but non-primary users are read-only.
The conserver package is configured to allow all consoles to be accessed from the system admin controller.
All consoles are logged. These logs can be found at /var/log/consoles on the system admin controller and rack leader controllers. An autofs configuration file is created to allow you to access rack leader controller managed console logs from the system admin controller, as follows:
system-admin # /net/r1lead/var/log/consoles/ |
The /etc/conserver.cf file is the configuration file for the conserver daemon. This file is generated for both the system admin controller and rack leader controllers from the /opt/sgi/sbin/generate-conserver-files script on the system admin controller. This script is called from the discover-rack command as part of rack discovery or rediscovery. It generates the conserver.cf file for the rack in question and regenerates the conserver.cf file for the system admin controller.
| Note: The conserver package replaces cconsole for access to all consoles (blades, leader nodes, managed service nodes). |
You may find the following conserver man pages useful:
| Man Page | Description | |
| console(1) | Console server client program | |
| conserver(8) | Console server daemon | |
| conserver.cf(5) | Console configuration file for conserver(8) | |
| conserver.passwd(5) | User access information for conserver(8) |
To use the conserver console manager, perform the following steps:
To see the list of available consoles, perform the following:
system-admin:~ # console -x
service0 on /dev/pts/2 at Local
r2lead on /dev/pts/1 at Local
r1lead on /dev/pts/0 at Local
r1i0n8 on /dev/pts/0 at Local
r1i0n0 on /dev/pts/1 at Local |
To connect to the service console, perform the following:
system-admin:~ # console service0
[Enter `^Ec?' for help]
Welcome to SUSE Linux Enterprise Server 10 SP1 (x86_64) - Kernel 2.6.16.46-0.12-smp (ttyS1).
service0 login: |
To connect to the rack leader controller console, perform the following:
system-admin:~ # console r1lead
[Enter `^Ec?' for help]
Welcome to SUSE Linux Enterprise Server 10 SP1 (x86_64) - Kernel 2.6.16.46-0.12-smp (ttyS1).
r1lead login: |
To trigger system request commands sysrq (once connected to a console), perform the following:
Ctrl-e c l 1 8            # set log level to 8
Ctrl-e c l 1 <sysrq cmd>  # send sysrq command |
To see the list of conserver escape keys, perform the following:
Ctrl-e c ? |
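Because blade consoles are managed by the rack leader controllers while leader and service node consoles are managed by the system admin controller, the console log for a given node lives in one of two places when viewed from the admin node. The following sketch maps a node name to the matching log directory. console_log_dir is a hypothetical helper, and the names of the files inside the directory are not assumed here.

```shell
# console_log_dir: hypothetical helper, run on the system admin controller.
# Prints the directory that holds the console log for the named node.
console_log_dir() {
    case $1 in
        r[0-9]*i[0-9]*n[0-9]*)
            # Blade console: logged on the rack leader, reached through autofs.
            rest=${1#r}
            rack=${rest%%i*}
            echo "/net/r${rack}lead/var/log/consoles"
            ;;
        *)
            # Leader or service node console: logged on the admin node itself.
            echo "/var/log/consoles"
            ;;
    esac
}

console_log_dir r1i0n8    # prints: /net/r1lead/var/log/consoles
console_log_dir service0  # prints: /var/log/consoles
```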
The SGI Tempo systems management software uses network time protocol (NTP) as the primary mechanism to keep the nodes in your Altix ICE system synchronized. This section describes how this mechanism operates on the various Altix ICE components and covers these topics:
When you used the configure-cluster command, it guided you through setting up NTP on the admin node. The NTP client on the system admin controller should point to the house network time server. The NTP server provides NTP service to system components so that nodes can consult it when they are booted. The system admin controller sends NTP broadcasts to some networks to keep the nodes in sync after they have booted.
The NTP client on the rack leader controller gets time from the system admin controller when it is booted and then stays in sync by connecting to the admin node for time. The NTP server on the leader node provides NTP service to Altix ICE components so that compute nodes can sync their time when they are booted. The rack leader controller sends NTP broadcasts to some networks to keep the compute nodes in sync after they have booted.
The BMC controllers on managed service nodes, compute nodes, and leader nodes are also kept in sync with NTP. Note that you may need the latest BMC firmware for the BMCs to sync with NTP properly. The NTP server information for BMCs is provided by special options stored in the DHCP server configuration file.
The NTP client on managed service nodes (for a definition of managed, see “discover Command” in Chapter 2) sets its time at initial booting from the system admin controller. It listens to NTP broadcasts from the system admin controller to stay in sync. It does not provide any NTP service.
The NTP client on the compute node sets its time at initial booting from the rack leader controller. It listens to NTP broadcasts from the rack leader controller to stay in sync.
Sometimes, especially during initial deployment of an Altix ICE system when system components are being installed and configured for the first time, NTP is not available to serve time to system components.
An unmodified NTP server, running for the first time, takes quite some time before it offers service. This means the leader and service nodes may fail to get time from the system admin controller as they come online. Compute nodes may also fail to get time from the leader when they first come up. This situation usually happens only at first deployment. After the NTP servers have a chance to create their drift files, they offer time with far less delay on subsequent reboots.
The following workarounds are in place for situations when NTP cannot serve the time:
The admin and rack leader controllers have the time service enabled (xinetd).
All system node types have the netdate command.
A special startup script is on leader, service, and compute nodes that runs before the NTP startup script.
This script attempts to get the time using the ntpdate command. If the ntpdate command fails because the NTP server it is using is not ready yet to offer time service, it uses the netdate command to get the clock "close".
The NTP startup script starts the NTP service as normal. Since the clock is known to be "close", NTP will fix the time when the NTP servers start offering time service.
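The fallback described above (try ntpdate first, then fall back to netdate) can be sketched as follows. This is not the SGI startup script, whose contents are not shown in this guide. sync_clock_once and the NTPDATE and NETDATE override variables are hypothetical and exist only so the logic can be tried without touching the system clock.

```shell
# sync_clock_once: hypothetical sketch of the ntpdate/netdate fallback.
# NTPDATE and NETDATE may be set to other commands for a dry run.
sync_clock_once() {
    server=$1
    if ${NTPDATE:-ntpdate} "$server" >/dev/null 2>&1; then
        echo "clock set with ntpdate"
    elif ${NETDATE:-netdate} "$server" >/dev/null 2>&1; then
        # The NTP server is not ready yet; the time service gets the clock "close".
        echo "clock set close with netdate"
    else
        echo "could not set the clock from $server" >&2
        return 1
    fi
}

# Dry run with stand-in commands: ntpdate "fails" and netdate "succeeds".
NTPDATE=false NETDATE=true
sync_clock_once admin     # prints: clock set close with netdate
unset NTPDATE NETDATE
```

The server name admin in the dry run is a placeholder; on a real node the time source would be the system admin controller or the rack leader controller, as described above.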
This section describes how to change the size of /tmp on Altix ICE compute nodes.
To change the size of /tmp on your system compute nodes, perform the following steps:
From the admin node, change directory (cd) to /opt/sgi/share/per-host-customization/global.
Open the sgi-fstab file and change the size= parameter for the /tmp mount, as shown in the example below:
#!/bin/sh
#
# Copyright (c) 2007 Silicon Graphics, Inc.
# All rights reserved.
#
# Set up the compute node's /etc/fstab file.
#
# Modify per your sites requirements.
#
# This script is excecuted once per-host as part of the install-image operation
# run on the leader nodes, which is called from cimage on the admin node.
# The full path to the per-host iru+slot directory is passed in as $1,
# e.g. /var/lib/sgi/per-host//i2n11.
#
# sanity checks
. /opt/sgi/share/per-host-customization/global/sanity.sh
iruslot=$1
cat <<EOF >${iruslot}/etc/fstab
tmpfs /tmp tmpfs size=48m 0 0
EOF |
Push the image out to the racks to pick up the change, as follows:
# cimage --push-rack mynewimage r\* |
For more information on using the cimage command, see “cimage Command”.
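Editing the size= parameter by hand is straightforward, but the change can also be previewed non-interactively. The following sketch rewrites the size= value on any line that mentions /tmp and prints the result to standard output, leaving the input file untouched. set_tmp_size is a hypothetical helper, and 150m is only an example value.

```shell
# set_tmp_size: hypothetical helper that previews a new /tmp size.
# $1 = path to the file to read, $2 = new size such as 150m.
set_tmp_size() {
    # Loose sanity check: the size must start with a digit and contain
    # only digits and the unit letters k, m, or g.
    case $2 in
        ''|[!0-9]*|*[!0-9kKmMgG]*) echo "bad size: $2" >&2; return 1 ;;
    esac
    # Replace the size= value only on lines that mention /tmp.
    sed "/\/tmp/ s/size=[0-9][0-9]*[kKmMgG]*/size=$2/" "$1"
}
```

For example, running set_tmp_size sgi-fstab 150m from the global per-host customization directory shows the script as it would read with a 150 MB /tmp. Review the output before replacing the original file and pushing the image.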
This section describes how to disable swap space on your Altix ICE system.
To disable swap space, from the admin node, perform the following steps:
Turn off swapping, as follows:
# chroot /var/lib/systemimager/images/compute-sles10sp1 chkconfig iscsiswap off |
Push the new image out to the compute nodes, as follows:
# cimage --push-rack compute-sles10sp1 r\* |
Power on or reboot the compute nodes (see “Shutting Down and Booting”).
This section describes how to change per-node swap space on your SGI Altix ICE system.
To increase the default size of the per-blade swap space on your system, perform the following:
Shut down all blades in the affected rack (see “Shutting Down and Booting”).
Log into the leader node for the rack in question. (Note that you need to do this on each rack leader).
Change directory (cd) to the /var/lib/sgi/swapfiles directory.
To adjust the swap space size appropriate for your site, run a script similar to the following:
#!/bin/bash
size=262144 # size in KB
for i in $(seq 0 3); do
    for n in $(seq 0 15); do
        dd if=/dev/zero of=i${i}n${n} bs=1k count=${size}
        mkswap i${i}n${n}
    done
done |
Reboot all blades in the affected rack (see “Shutting Down and Booting”).
From the rack leader node, use the cexec --all command to run the free(1) command on the compute blades to view the new swap sizes, as follows:
r1lead:~ # cexec --all free
************************* rack_1 *************************
--------- r1i0n0---------
total used free shared buffers cached
Mem: 2060140 206768 1853372 0 4 46256
-/+ buffers/cache: 160508 1899632
Swap: 49144 0 49144
--------- r1i0n1---------
total used free shared buffers cached
Mem: 2060140 137848 1922292 0 4 44200
-/+ buffers/cache: 93644 1966496
Swap: 49144 0 49144
--------- r1i0n8---------
total used free shared buffers cached
Mem: 2060140 138076 1922064 0 4 43172
-/+ buffers/cache: 94900 1965240
Swap: 49144 0 49144 |
If you want to change the per-node swap space across your entire system, so that all (new) leader nodes pick up the change as part of discovery, you can edit the /etc/opt/sgi/conf.d/35-compute-swapfiles file inside the lead-sles10sp1 image on the admin node. The images are in the /var/lib/systemimager/images directory. For more information on customizing these images, see “Customizing Software Images”.
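Before running the swap file script shown earlier in this procedure, it is worth checking how much disk space the files will consume on the rack leader controller. That script loops over 4 IRUs (0 through 3) and 16 slots (0 through 15), so it creates 64 files of the chosen size. The following sketch does the arithmetic; swap_total_kb is a hypothetical helper.

```shell
# swap_total_kb: hypothetical helper.
# $1 = per-blade swap size in KB, $2 = IRUs per rack, $3 = slots per IRU.
swap_total_kb() {
    echo $(( $1 * $2 * $3 ))
}

size_kb=262144                            # value used in the example script
total_kb=$(swap_total_kb "$size_kb" 4 16)
echo "per blade: $((size_kb / 1024)) MiB"           # prints: per blade: 256 MiB
echo "per rack:  $((total_kb / 1024 / 1024)) GiB"   # prints: per rack:  16 GiB
```

With the example value of 262144 KB, the /var/lib/sgi/swapfiles directory on each leader node therefore needs about 16 GiB of free space.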
This section describes how to view the per compute node read and write quota.
To view the per compute node read and write quota, log onto the leader node and perform the following:
r1lead:~ # xfs_quota -x -c 'quota -ph 1'
Disk quotas for Project #1 (1)
Filesystem Blocks Quota Limit Warn/Time Mounted on
/dev/disk/by-label/sgiroot
64.6M 0 1G 00 [------] /
|
Map the XFS project ID to the quota you are interested in by looking it up in the /etc/projects file.
If you decide to change the xfs_quota values, log back onto the admin node and edit the /etc/opt/sgi/cminfo file inside the compute image where you want to change the value, for example, /var/lib/systemimager/images/image_name. Change the value of the PER_BLADE_QUOTA variable and then repush the image with the following command:
# cimage --push-rack image_name racks |
For help information, perform the following:
xfs_quota> help
df [-bir] [-hn] [-f file] -- show free and used counts for blocks and inodes
help [command] -- help for one or all commands
print -- list known mount points and projects
quit -- exit the program
quota [-bir] [-gpu] [-hnv] [-f file] [id|name]... -- show usage and limits
Use 'help commandname' for extended help |
Use help commandname for extended help, such as the following:
xfs_quota> help quota
quota [-bir] [-gpu] [-hnv] [-f file] [id|name]... -- show usage and limits
display usage and quota information
-g -- display group quota information
-p -- display project quota information
-u -- display user quota information
-b -- display number of blocks used
-i -- display number of inodes used
-r -- display number of realtime blocks used
-h -- report in a human-readable format
-n -- skip identifier-to-name translations, just report IDs
-N -- suppress the initial header
-v -- increase verbosity in reporting (also dumps zero values)
-f -- send output to a file
The (optional) user/group/project can be specified either by name or by
number (i.e. uid/gid/projid).
xfs_quota> |
The SGI Tempo systems management software captures the relevant data for the managed objects in an SGI Altix ICE system. Managed objects are the hierarchy of nodes described in “Basic System Building Blocks” in Chapter 1. The system database is critical to the operation of your SGI Altix ICE system and you need to back up the database on a regular basis.
Managed objects on an SGI Altix ICE system include the following:
Altix ICE system
One ICE system is modeled as a meta-cluster. This meta-cluster contains the racks each modeled as a sub-cluster.
Nodes
System admin controller (admin node), rack leader controllers (leader nodes), service nodes, compute nodes (blades) and chassis management control blades (CMCs) are modeled as nodes.
Networks
The preconfigured and potentially customized IP networks
NICs
The network interfaces for Ethernet and InfiniBand adapters.
Images
The node images installed on each particular node.
SGI recommends that you keep three backups of your system database at any given time. You should implement a rotating backup procedure following the son-father-grandfather principle.
To back up and restore the system database, perform the following steps:
From the system admin controller, to back up the system database perform a command similar to the following:
# mysqldump --opt oscar > backup-file.sql |
To read the dump file back into the system admin controller, perform a command similar to the following:
# mysql oscar < backup-file.sql |
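One simple way to keep three backups at any given time is to shift the existing dump files by one generation before writing a new one. The following sketch shows only the rotation step. The rotate_backups function and the backup-0.sql, backup-1.sql, and backup-2.sql file names are hypothetical, and this is a plain three-generation rotation rather than a full schedule with separate daily, weekly, and monthly copies.

```shell
# rotate_backups: hypothetical helper that keeps three generations.
# $1 = directory holding backup-0.sql (newest) through backup-2.sql (oldest).
rotate_backups() {
    dir=$1
    # Drop the oldest generation, then shift the other two down by one.
    rm -f "$dir/backup-2.sql"
    if [ -f "$dir/backup-1.sql" ]; then
        mv "$dir/backup-1.sql" "$dir/backup-2.sql"
    fi
    if [ -f "$dir/backup-0.sql" ]; then
        mv "$dir/backup-0.sql" "$dir/backup-1.sql"
    fi
}

# Intended use on the system admin controller (not executed here):
#   rotate_backups /path/to/backups &&
#       mysqldump --opt oscar > /path/to/backups/backup-0.sql
```

Storing the backup directory on a file system other than the one holding the database protects the copies from a single disk failure.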