This chapter describes how to use the SGI Tempo systems management software to operate your Altix ICE system and covers the following topics:
This section describes the SLES10 services that are turned off on compute nodes by default, how to customize the software running on compute nodes, how to create a simple clone of a compute node image, and how to use the cimage command. It covers these topics:
Currently, the compute nodes run the SUSE Linux Enterprise Server 10 (SLES10) Service Pack 1 (SP1) Linux distribution. To improve the performance of applications running MPI jobs on compute nodes, the following SLES10 services are turned off:
acpid
auditd
boot.crypto
boot.device-mapper
boot.lvm
boot.md
cron
earlykbd
earlysyslog
fbset
irq_balancer
kbd
novell-zmd
nscd
postfix
powersaved
resmgr
slpd
splash
splash_early
suseRegister
xdm
You can add per-host compute node customization to the compute node images. You do this by adding scripts either to the /opt/sgi/share/per-host-customization/global/ directory or to the /opt/sgi/share/per-host-customization/mynewimage/ directory on the system admin controller.
| Note: When creating custom images for compute blades, make sure you clone the original SGI images. This keeps the original images intact so that you can fall back to them if necessary. |
Scripts in the global directory apply to all compute node images. Scripts under the image name apply only to the image in question. The scripts are cycled through once per host when the image is installed on the rack leaders. They receive one input argument, which is the full path (on the rack leader controller) to the per-host base directory, for example /var/lib/sgi/mynewimage/i2n11. There is a README file at /opt/sgi/share/per-host-customization/README on the system admin controller, as follows:
This directory contains compute node image customization scripts which are executed as part of the install-image operations on the leader nodes when pulling over a new compute node image. After the image has been pulled over, and the per-host-customization dir has been rsynced, the per-host /etc and /var directories are populated, then the scripts in this directory are cycled through once per-host. This allows the scripts to source the node specific network and cluster management settings, and set node specific settings. Scripts in the global directory are iterated through first, then if a directory exists that matches the image name, those scripts are iterated through next. You can use the scripts in the global directory as examples. |
An example global script, /opt/sgi/share/per-host-customization/global/sgi-hostname is, as follows:
#!/bin/sh
#
# Set the compute node's hostname to the cluster unique name
#
# This script is executed once per-host as part of the install-image operation
# run on the leader nodes. The full path to the per-host iru+slot directory is
# passed in as $1, e.g. /var/lib/sgi/per-host//i2n11.
#
iruslot=$1
# source cluster management information
. ${iruslot}/etc/opt/sgi/cminfo
# set hostname of blade to cluster unique name
echo ${NAME} > ${iruslot}/etc/HOSTNAME |
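As a further illustration, the following sketch applies the same pattern as sgi-hostname to a different setting. The function name write_motd, the choice of /etc/motd, and the message text are assumptions made for this example and are not part of the SGI distribution. Only the $1 argument, the cminfo file, and its NAME variable come from the script above.

```sh
#!/bin/sh
# Hypothetical per-host customization script (not shipped by SGI).
# Writes a node-specific message of the day into the per-host tree.

# write_motd DIR: DIR is the per-host base directory passed in as $1.
write_motd() {
    iruslot=$1
    cminfo="${iruslot}/etc/opt/sgi/cminfo"

    # Stop if the cluster management settings are missing.
    if [ ! -r "${cminfo}" ]; then
        echo "cannot read ${cminfo}" >&2
        return 1
    fi

    # Source the node-specific settings; NAME is the cluster unique name.
    . "${cminfo}"

    echo "Compute node ${NAME}" > "${iruslot}/etc/motd"
}

# Run only when a per-host directory was passed on the command line.
if [ $# -ge 1 ]; then
    write_motd "$1"
fi
```

Placed in the global directory, a script like this would apply to every image. Placed in a directory named after one image, it would apply to that image only.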
To customize the compute node operating system image, perform the following steps:
| Note: How to create a clone of the compute node image is also described in “Creating a Simple Compute Node Image Clone”. |
# cimage --clone-image compute-sles10sp1 new |
To see the images and kernels in the list, perform the following command:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: compute-sles10sp1-clone
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
|
From the system admin controller, change directory to the images directory, as follows:
# cd /var/lib/systemimager/images/ |
From the system admin controller, copy the RPMs you wish to add, as follows:
# cp /newrpm.rpm new/tmp |
The new RPMs now reside in the tmp directory of the image named new. To install them into your new compute node image, perform the following commands:
# chroot new bash |
# rpm -Uvh /tmp/newrpm.rpm |
This section describes how to make a copy of a compute node image.
To create a simple compute node image clone from the system admin controller, perform the following steps:
To clone the compute node image, perform the following:
# cimage --clone-image compute-sles10sp1 compute-sles10sp1-clone |
To see the images and kernels in the list, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp
image: compute-sles10sp1-clone
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp |
To change the compute nodes to use the cloned image/kernel pair, perform the following:
# cimage --set compute-sles10sp1-clone 2.6.16.46-0.12-smp "r*i*n*" |
The cimage command allows you to list, modify, and set software images on the compute nodes in your system.
The cimage command accepts the following options:
| Option | Description | |
| --help | Usage and help text | |
| --list-images | Lists images present in the database | |
| --list-nodes RACK ... | Lists the image and kernel that the compute nodes in the specified racks are set to boot | |
| --set IMAGE KERNEL NODE ... | Sets the compute nodes to a certain boot image and kernel combination | |
| --add-db IMAGE | Adds an image to the database | |
| --del-db IMAGE | Deletes an image from the database | |
| --add-rack IMAGE RACK ... | Pushes an image to specified rack(s) | |
| --del-rack IMAGE RACK | Deletes an image from specified rack(s) | |
| --clone-image OIMAGE NIMAGE | Clones an existing image to a new image | |
| --del-image IMAGE | Deletes an existing image entirely |
RACK arguments take the format rX.
NODE arguments take the format rXiYnZ.
X, Y, Z can be single digits, a [start-end] range, or * for all matches.
... indicates more than one RACK or NODE argument can be passed in.
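The argument formats above can be checked before a command is run. The following is a minimal sketch, assuming each of X, Y, and Z is a decimal number, a [start-end] range, or *. The helper names is_rack_spec and is_node_spec are hypothetical, and the sketch accepts multi-digit numbers because node names such as r1i0n10 appear in this chapter. It only checks the shape of the argument and does not confirm that the rack or node exists.

```sh
#!/bin/sh
# Sketch: check whether a string looks like a cimage RACK or NODE argument.
# The helper names and the regular expressions are illustrative assumptions.

# One field: a number, a [start-end] range, or * for all matches.
field='([0-9]+|\[[0-9]+-[0-9]+\]|\*)'

# is_rack_spec ARG: succeeds if ARG has the form rX.
is_rack_spec() {
    printf '%s\n' "$1" | grep -Eq "^r${field}\$"
}

# is_node_spec ARG: succeeds if ARG has the form rXiYnZ.
is_node_spec() {
    printf '%s\n' "$1" | grep -Eq "^r${field}i${field}n${field}\$"
}
```

When you pass a wildcard or range on a real command line, quote it (for example 'r1i[0-2]n*') so that the shell does not try to expand it as a file name pattern.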
EXAMPLES
Example 3-1. cimage Command Examples
The following examples walk you through some typical cimage command operations.
To list the available images and their associated kernels, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.12-carlsbad
kernel: 2.6.16.46-0.12-smp |
To list the compute nodes in rack 1 and the image and kernel they are set to boot, perform the following:
# cimage --list-nodes r1
r1i0n0: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n3: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n4: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n5: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n6: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n7: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n8: compute-sles10sp1 2.6.16.46-0.7-smp
[...snip...] |
To set the r1i0n0 compute node to boot the 2.6.16.46-0.12-carlsbad kernel from the compute-sles10sp1 image, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.12-carlsbad r1i0n0 |
To list the nodes in rack 1 to see the changes set in the example above, perform the following:
# cimage --list-nodes r1
r1i0n0: compute-sles10sp1 2.6.16.46-0.7-carlsbad
r1i0n1: compute-sles10sp1 2.6.16.46-0.7-smp
r1i0n2: compute-sles10sp1 2.6.16.46-0.7-smp
[...snip...] |
To set all nodes in all racks to boot the 2.6.16.46-0.7-carlsbad kernel from the compute-sles10sp1 image, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.7-carlsbad r*i*n* |
To set two ranges of nodes to boot the 2.6.16.46-0.7-smp kernel, perform the following:
# cimage --set compute-sles10sp1 2.6.16.46-0.7-smp r1i[0-2]n[5-6] r1i[2-3]n[0-4] |
To clone the compute-sles10sp1 image to a new image (so that you can modify it), perform the following:
# cimage --clone-image compute-sles10sp1 mynewimage
Cloning compute-sles10sp1 to mynewimage ... done |
To change the cloned image created in the example above, copy the needed RPMs into the tmp directory of the image under /var/lib/systemimager/images, use the chroot command to enter the image directory, and then install the RPMs, as follows:
# cp .rpm /var/lib/systemimager/images//tmp
# chroot /var/lib/systemimager/images// bash
# rpm -Uvh /tmp/.rpm |
If you make changes to the kernels in the image, you need to refresh the kernel database entries for your image. To do this, perform the following:
# cimage --del-db mynewimage
# cimage --add-db mynewimage |
To push new software images out to the compute blades in a rack or set of racks, perform the following:
# cimage --add-rack mynewimage r*
r1lead: install-image: mynewimage
r1lead: install-image: mynewimage done. |
To list the images in the database and the kernels they contain, perform the following:
# cimage --list-images
image: compute-sles10sp1
kernel: 2.6.16.46-0.7-carlsbad
kernel: 2.6.16.46-0.7-smp
image: mynewimage
kernel: 2.6.16.46-0.7-carlsbad
kernel: 2.6.16.46-0.7-smp |
To set some compute nodes to boot an image, perform the following:
# cimage --set mynewimage 2.6.16.46-0.7-smp r1i3n* |
You need to reboot the compute nodes to run the new images.
To completely remove an image you no longer use, both from the system admin controller and from all compute nodes in all racks, perform the following:
# cimage --del-image mynewimage
r1lead: delete-image: mynewimage
r1lead: delete-image: mynewimage done. |
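The clone, refresh, push, and set steps from the examples above can be collected into one helper. The following is a sketch only. The function name deploy_image and the RUN variable are assumptions made for this example, and the order of the steps simply follows Example 3-1. By default the commands are printed rather than executed, so the sequence can be reviewed first.

```sh
#!/bin/sh
# Sketch: clone, refresh, push, and assign a compute node image.
# RUN defaults to "echo", so the commands are only printed. Set RUN to an
# empty string on the system admin controller to execute them.
RUN=${RUN-echo}

# deploy_image BASE NEW KERNEL NODES RACKS
deploy_image() {
    base=$1 new=$2 kernel=$3 nodes=$4 racks=$5

    $RUN cimage --clone-image "$base" "$new" || return 1
    # Install any RPMs into the clone at this point (cp, chroot, rpm -Uvh).
    # Refresh the kernel database entries in case the kernels changed.
    $RUN cimage --del-db "$new" || return 1
    $RUN cimage --add-db "$new" || return 1
    # Push the image to the racks, then point the nodes at it.
    $RUN cimage --add-rack "$new" "$racks" || return 1
    $RUN cimage --set "$new" "$kernel" "$nodes" || return 1
}
```

For example, deploy_image compute-sles10sp1 mynewimage 2.6.16.46-0.7-smp 'r1i3n*' 'r*' prints the five cimage commands in order. The compute nodes still need to be rebooted afterward to run the new image.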
The cpower command allows you to power up, power down, reset, and show the power status of system components.
The cpower command syntax is as follows:
cpower [<option> ...] [<target_type>] [<action>] <target> |
The <option> argument can be one or more of the following:
| Option | Description | |
| --noleader | Do not include leader nodes (valid with rack and system domains only). | |
| --noservice | Do not include service nodes (valid with system domain only). | |
| --ipmi | Uses ipmitool to communicate. [default] | |
| --ssh | Uses ssh to communicate. | |
| --intelplus | Uses the -o intelplus option for ipmitool [default] Note that you do not usually need to specify this. | |
| --force | When using wildcards in the target, disable all “safety” checks. Make sure you really want to use this command. | |
| -n, --noexec | Displays, but does not execute, commands that affect power. | |
| -v, --verbose | Print additional information on command progress |
| Note: The command will fail if the target contains any wildcards, unless the --all option is specified. |
The <target_type> argument is one of the following:
| --node | Applies the action to nodes. Nodes are compute nodes, rack leader controllers (leader nodes), system admin controller (admin node), and service nodes. [default] | |
| --iru | Applies the action at the IRU level. | |
| --rack | Applies the action at the rack level. | |
| --system | Applies the action to the system. You must not specify a target with this type. |
The <action> argument is one of the following:
| --status | Show the power status of the target, including whether it is booted or not. [default] | |
| --up | --on | Powers up the target. | |
| --down | --off | Powers down the target. | |
| --reset | Performs a hard reset on the target. | |
| --cycle | Power cycles the target. | |
| --boot | Boots up the target, unless it is already booted. Waits for all targets to boot. | |
| --reboot | Reboots the target, even if already booted. Waits for all targets to boot. | |
| --shutdown | Shuts down the target, but does not power it off. Waits for targets to shut down. | |
| --identify <interval> | Turns on the identifying LED for the specified interval in seconds. Use an interval of 0 to turn the LED off immediately. | |
| -h, --help | Shows help usage statement. |
The target must always be specified except when the --system option is used. Wildcards may be used, but be careful not to accidentally power off or reboot the leader nodes. If wildcard use affects any leader node, the command fails with an error.
The default for the cpower command is to operate on system nodes, such as compute nodes, leader nodes, or service nodes. If you do not specify --iru, --rack, or --system, the command defaults to operating as if you had specified --node.
Here are examples of node target names:
r1i3n10
Compute node at rack 1, IRU 3, slot 10
service0
Service node 0
r3lead
Rack leader controller (leader node) for rack 3
r1i*n*
Wildcards let you specify ranges of nodes. For example, r1i*n* specifies all compute nodes in all IRUs on rack 1.
The default operation for the cpower command is to operate on nodes and to provide you the status of these nodes, as follows:
# cpower r1i* |
This example gives you the power status and boot status of all the compute blades in rack 1. This command is equivalent to cpower --node --status r1i*.
The following command issues an ipmitool power off command to all of the nodes specified by the wildcard:
# cpower --off r2i* |
The default is to apply to a node.
The following commands behave just as the corresponding ipmitool operations would, and have no special extra logic for ordering:
# cpower --up r1i*
# cpower --reset r1i*
# cpower --cycle r1i*
# cpower --identify 5 r1i*
| Note: --up is a synonym for --on and --down is a synonym for --off. |
The cpower command contains more logic when you go up to higher levels of abstraction, for example, using --iru, --rack, and --system. These higher-level domain specifiers tell the command to be smart about how to order the actions that you give on the command line.
The --iru option tells the command to use correct ordering with IRU power commands. In the following example, it first connects to the CMC on each IRU in rack 1 to issue the power on command, which turns on power to the IRU chassis (this is not the equivalent ipmitool command). Then it powers up the compute nodes in the IRU. Powering down happens in the opposite order: power to the IRU is turned off after power to the blades. IRU targets are specified as follows: r3i2 for rack 3, IRU 2.
# cpower --iru --up r1* |
The --rack option ensures that power commands to the leader node are done in the correct order relative to compute nodes within a rack. First, it powers up the leader node and waits for it to boot up (if it is not already up). Then it does the functional equivalent of a cpower --iru --up r4i* on each of the IRUs contained in the rack, including applying power to each IRU chassis. Using the --down option does the opposite, and also turns off the leader node (after doing a shutdown) after all the IRUs are powered down. To avoid including leader nodes in a power command for a rack, use the --noleader option. Rack targets are specified as follows: r4 for rack 4. Here is an example:
# cpower --rack --up r4 |
Commands with the --system option ensure that power up commands are applied first to service nodes, then to leader nodes, then to IRUs and compute blades, in just the same way. Likewise, compute blades are powered down before IRUs, leader nodes, and service nodes, in that order. To avoid including service nodes in a system-domain command, use the --noservice option. Note that you must not specify a target with the --system option, since it applies to the whole Altix ICE system.
In most cases, it is useful to be able to shut down a machine before turning off the power. The following cpower options enable you to do this: --shutdown, --boot, and --reboot. The --shutdown option is self-explanatory. The --reboot option always reboots a system, whereas --boot boots a system only if it is not already booted. Thus, --boot is useful for booting up compute blades that have failed to start.
| Note: The IPMI power commands necessary to enable a system to boot (either with a power reset, or a power on) may be sent to a node, but a node that has been shutdown with the --shutdown option does not have its power automatically turned off. |
The --shutdown option works on node, IRU, or rack domain levels. It will shut down nodes (in the correct order if you use the --iru or --rack options), and then just leave them as they are, power still applied. Usually you may only specify one action per command, however, with the --shutdown option, you may also specify --off. Using both these actions results in nodes being shutdown, then powered off. This is particularly useful when powering off a rack, since otherwise, the leaders may be shutdown before there is a chance to power off the compute blades. Here is an example:
# cpower --shutdown --rack r1 |
To boot up systems that have not already been booted, perform the following:
# cpower --boot r1i2n* |
Again, the command boots up nodes in the right order if you specify the --iru or --rack options and the appropriate target. Otherwise, there is no guarantee that, for example, the command will attempt to power on the leader node before compute nodes in the same rack.
To reboot all of the nodes specified, or boot them if they are already shut down, perform the following:
# cpower --reboot --iru r3i3 |
The --iru or --rack options ensure proper ordering if you use them. In this case, the command makes sure that power is supplied to the chassis for rack 3, IRU 3, and then all the compute nodes in that IRU are rebooted.
EXAMPLES
Example 3-2. cpower Command Examples
To boot compute blade r1i0n8, perform the following:
# cpower --boot r1i0n8 |
To boot a number of compute blades at the same time, perform the following:
# cpower --boot --rack r1 |
| Note: The --boot option will only boot those nodes that have not already booted. |
To shut down service node 0, perform the following:
# cpower --shutdown service0 |
To shutdown and switch off everything in rack 3, perform the following:
# cpower --shutdown --off --rack r3 |
| Note: Using the --shutdown and the --off options together is the only time you can use more than one action on the cpower command line. This combination shuts down and then powers off all of the compute nodes in parallel, then shuts down and powers off the leader node. Use the --noleader option if you want the leader node to remain booted up. |
To shutdown the entire system, including all service nodes and all leader nodes, but not the admin node, and not turn the power off to anything, perform the following:
# cpower --shutdown --system |
To shut down all the compute nodes, but not the service nodes or leader nodes, perform the following:
# cpower --shutdown --system --noleader --noservice |
| Note: The only way to shut down the system admin controller (admin node) is to perform the operation manually. |
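Because a rack-level power-down affects many nodes at once, it can help to preview it with --noexec first. The following is a minimal sketch, assuming the options can be given in the order shown in the cpower syntax line. The function name rack_power_down and the CPOWER variable are hypothetical. By default the sketch only prints the command it would run.

```sh
#!/bin/sh
# Sketch: preview or perform a full rack shutdown and power-off.
# CPOWER defaults to "echo cpower", so nothing is executed here. Set
# CPOWER=cpower on the system admin controller to run the command.
CPOWER=${CPOWER-echo cpower}

# rack_power_down MODE RACK [--noleader]
#   MODE is "preview" (adds --noexec) or "run".
rack_power_down() {
    mode=$1 rack=$2 leader_opt=${3-}
    case $mode in
        preview) noexec=--noexec ;;
        run)     noexec= ;;
        *)       echo "usage: rack_power_down preview|run RACK [--noleader]" >&2
                 return 2 ;;
    esac
    # --shutdown with --off shuts the nodes down, then powers them off.
    $CPOWER $noexec $leader_opt --rack --shutdown --off "$rack"
}
```

For example, rack_power_down preview r3 builds a command with --noexec so that cpower displays the power commands without executing them, and rack_power_down run r3 --noleader leaves the leader node booted.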
This section describes the cluster command and control (C3) tool suite for cluster administration and application support.
| Note: The SGI Tempo version of C3 does not include the cshutdown and cpushimage commands. |
The C3 commands used on the SGI Altix ICE 8200 system are as follows:
| C3 Utilities | Description | |
| cexec(s) | Executes a given command string on each node of a cluster | |
| cget | Retrieves a specified file from each node of a cluster and places it into the specified target directory | |
| ckill | Runs kill on each node of a cluster for a specified process name | |
| clist | Lists the names and types of clusters in the cluster configuration file | |
| cnum | Returns the node positions specified by the node name given on the command line | |
| cname | Returns the node names specified by the range given on the command line | |
| cpush | Pushes files from the local machine to the nodes in your cluster |
cexec is the most useful C3 utility. Use the cpower, power-iru, power-rack, and power-system commands rather than cshutdown (see “Power Management Commands”).
EXAMPLES
Example 3-3. C3 Command General Examples
The following examples walk you through some typical C3 command operations.
You can use the cname and cnum commands to map names to locations and vice versa, as follows:
# cname rack_1:0-2
local name for cluster: rack_1
nodes from cluster: rack_1
cluster: rack_1 ; node name: r1i0n0
cluster: rack_1 ; node name: r1i0n1
cluster: rack_1 ; node name: r1i0n10
# cnum rack_1: r1i0n0
local name for cluster: rack_1
nodes from cluster: rack_1
r1i0n0 is at index 0 in cluster rack_1
# cnum rack_1: r1i0n1
local name for cluster: rack_1
nodes from cluster: rack_1 |
You can use the clist command to retrieve the number of racks, as follows:
# clist
cluster rack_1 is an indirect remote cluster
cluster rack_2 is an indirect remote cluster
cluster rack_3 is an indirect remote cluster
cluster rack_4 is an indirect remote cluster |
You can use the cexec command to view the addressing scheme of the C3 utility, as follows:
# cexec rack_1:1 hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n1---------
r1i0n1
# cexec rack_1:2-3 rack_4:0-3,10 hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n10---------
r1i0n10
--------- r1i0n11---------
r1i0n11
************************* rack_4 *************************
************************* rack_4 *************************
--------- r4i0n0---------
r4i0n0
--------- r4i0n1---------
r4i0n1
--------- r4i0n10---------
r4i0n10
--------- r4i0n11---------
r4i0n11
--------- r4i0n4---------
r4i0n4 |
The following set of commands shows how to use the C3 commands to traverse the different levels of hierarchy in your Altix ICE system (for information on the hierarchical design of your Altix ICE system see “Basic System Building Blocks” in Chapter 1).
To execute a C3 command on all blades within the default Altix ICE system, for example, rack 1, perform the following:
# cexec hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n0---------
r1i0n0
--------- r1i0n1---------
r1i0n1
--------- r1i0n10---------
r1i0n10
--------- r1i0n11---------
r1i0n11
... |
To run a C3 command on all compute nodes across an Altix ICE system, perform the following:
# cexec --all hostname
************************* rack_1 *************************
************************* rack_1 *************************
--------- r1i0n0---------
r1i0n0
--------- r1i0n1---------
r1i0n1
...
--------- r2i0n10---------
r2i0n10
...
--------- r3i0n11---------
r3i0n11
... |
To run a C3 command against the first rack leader controller, in the first rack, perform the following:
# cexec --head hostname
************************* rack_1 *************************
--------- rack_1---------
r1lead |
To run a C3 command against all rack leader controllers across all racks, perform the following:
# cexec --head --all hostname
************************* rack_1 *************************
--------- rack_1---------
r1lead
************************* rack_2 *************************
--------- rack_2---------
r2lead
************************* rack_3 *************************
--------- rack_3---------
r3lead
************************* rack_4 *************************
--------- rack_4---------
r4lead |
The following set of examples shows some specific use cases for the C3 commands that you are likely to employ.
Example 3-4. C3 Command Specific Use Examples
From the system admin controller, run a command on rack 1 without including the rack leader controller, as follows:
# cexec rack_1: <cmd> |
Run a command on all service nodes only, as follows:
# cexec -f /etc/c3svc.conf <cmd> |
Run a command on all compute nodes in the system, as follows:
# cexec --all <cmd> |
Run a command on all rack leader controllers, as follows:
# cexec --all --head <cmd> |
Run a command on blade 42 (compute node 42) in rack 2, as follows:
# cexec rack_2:42 <cmd> |
From a service node over the InfiniBand Fabric, run a command on all blades (compute nodes) in the system, as follows:
# cexec --all <cmd> |
Run a command on blade 42 (compute node 42), as follows:
# cexec blades:42 <cmd> |
SGI Tempo systems management software uses the open-source console management package called conserver. For detailed information on conserver, see http://www.conserver.com/
An overview of the conserver package is, as follows:
Manages the console devices of all managed nodes in an Altix ICE system
A conserver daemon runs on the system admin controller (admin node) and the rack leader controllers (leader nodes). The system admin controller manages leader and service node consoles. The rack leader controllers manage blade consoles.
The conserver daemon connects to the consoles using ipmitool. Users connect to the daemon to access them. Multiple users can connect but non-primary users are read-only.
The conserver package is configured to allow all consoles to be accessed from the system admin controller.
All consoles are logged. These logs can be found at /var/log/consoles on the system admin controller and rack leader controllers. An autofs configuration file is created to allow you to access rack leader controller managed console logs from the system admin controller, as follows:
system-admin # /net/r1lead/var/log/consoles/ |
The /etc/conserver.cf file is the configuration file for the conserver daemon. This file is generated for both the system admin controller and the rack leader controllers by the /opt/sgi/sbin/generate-conserver-files script on the system admin controller. This script is called from the discover-rack command as part of rack discovery or rediscovery. It generates the conserver.cf file for the rack in question and regenerates the conserver.cf file for the system admin controller.
| Note: The conserver package replaces cconsole for access to all consoles (blades, leader nodes, managed service nodes) |
You may find the following conserver man pages useful:
| Man Page | Description | |
| console(1) | Console server client program | |
| conserver(8) | Console server daemon | |
| conserver.cf(5) | Console configuration file for conserver(8) | |
| conserver.passwd(5) | User access information for conserver(8) |
To use the conserver console manager, perform the following steps:
To see the list of available consoles, perform the following:
system-admin:~ # console -x
service0 on /dev/pts/1 at Local
r1lead on /dev/pts/0 at Local |
To connect to a console, perform the following:
system-admin:~ # console service0
service0 login: root |
The SGI Tempo systems management software uses network time protocol (NTP) as the primary mechanism to keep the nodes in your Altix ICE system synchronized. This section describes how this mechanism operates on the various Altix ICE components and covers these topics:
The NTP client on the system admin controller should point to the house network time server. The NTP server provides NTP service to system components so that nodes can consult it when they are booted. The system admin controller sends NTP broadcasts to some networks to keep the nodes in sync after they have booted.
The NTP client on the rack leader controller gets time from the system admin controller when it is booted and then stays in sync by watching NTP broadcasts from the system admin controller. The NTP server on the rack leader controller provides NTP service to Altix ICE components so that compute nodes can sync their time when they are booted. The rack leader controller sends NTP broadcasts to some networks to keep the compute nodes in sync after they have booted.
The NTP client on managed service nodes (for a definition of managed, see “discover Command” in Chapter 2) sets its time at initial booting from the system admin controller. It listens to NTP broadcasts from the system admin controller to stay in sync. It does not provide any NTP service.
The NTP client on the compute node sets its time at initial booting from the rack leader controller. It listens to NTP broadcasts from the rack leader controller to stay in sync.
Sometimes, especially during initial deployment of an Altix ICE system when system components are being installed and configured for the first time, NTP is not available to serve time to system components.
An unmodified NTP server, running for the first time, takes quite some time before it offers service. This means the leader and service nodes may fail to get time from the system admin controller as they come online. Compute nodes may also fail to get time from the leader when they first come up. This situation usually only happens at first deployment. After the NTP servers have a chance to create their drift files, they offer time with far less delay on subsequent reboots.
The following workarounds are in place for situations when NTP cannot serve the time:
The admin and rack leader controllers have the time service enabled (xinetd).
All system node types have the netdate command.
A special startup script is on leader, service, and compute nodes that runs before the NTP startup script.
This script attempts to get the time using the ntpdate command. If the ntpdate command fails because the NTP server it is using is not yet ready to offer time service, the script uses the netdate command instead to get the clock "close".
The ntp startup script starts the NTP service as normal. Since the clock is known to be "close", NTP will fix the time when the NTP servers start offering time service.
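The fallback logic described above can be sketched as follows. This is an illustration of the sequence, not the actual SGI startup script. The function name set_clock_close and the NTPDATE and NETDATE variables are assumptions made for this example, and the exact arguments that netdate expects on your system may differ from the single host name used here.

```sh
#!/bin/sh
# Sketch of the time fallback; not the script shipped with SGI Tempo.
# NTPDATE and NETDATE can be overridden. The netdate arguments are an assumption.
NTPDATE=${NTPDATE-ntpdate}
NETDATE=${NETDATE-netdate}

# set_clock_close SERVER: try NTP first, then the time service.
set_clock_close() {
    server=$1
    if $NTPDATE "$server" >/dev/null 2>&1; then
        echo "clock set from ${server} using ntpdate"
        return 0
    fi
    # The NTP server is not ready yet. Use the time service instead so that
    # the clock is at least "close" before the NTP daemon starts.
    if $NETDATE "$server" >/dev/null 2>&1; then
        echo "clock set from ${server} using netdate"
        return 0
    fi
    echo "could not set clock from ${server}" >&2
    return 1
}
```

On a compute node, the server argument would be the rack leader controller. On a leader or service node, it would be the system admin controller.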
The SGI Tempo systems management software captures the relevant data for the managed objects in an SGI Altix ICE system. Managed objects are the hierarchy of nodes described in “Basic System Building Blocks” in Chapter 1. The system database is critical to the operation of your SGI Altix ICE system and you need to back up the database on a regular basis.
Managed objects on an SGI Altix ICE system include the following:
Altix ICE system
One ICE system is modeled as a meta-cluster. This meta-cluster contains the racks each modeled as a sub-cluster.
Nodes
System admin controller (admin node), rack leader controllers (leader nodes), service nodes, compute nodes (blades) and chassis management control blades (CMCs) are modeled as nodes.
Networks
The preconfigured and potentially customized IP networks
NICs
The network interfaces for Ethernet and InfiniBand adapters.
Images
The node images installed on each particular node.
SGI recommends that you keep three backups of your system database at any given time. You should implement a rotating backup procedure following the son-father-grandfather principle.
To back up and restore the system database, perform the following steps:
From the system admin controller, to back up the system database perform a command similar to the following:
# mysqldump --opt oscar > backup-file.sql |
To read the dump file back into the system admin controller, perform a command similar to the following:
# mysql oscar < backup-file.sql |
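A rotating backup can be built around the mysqldump command shown above. The following sketch takes the simplest reading of the son-father-grandfather principle, keeping the three most recent dumps. The function names, the file names, and the backup directory argument are assumptions made for this example. Adapt the schedule and storage location to your site.

```sh
#!/bin/sh
# Sketch: keep three generations of the system database dump.
# The file names and the backup directory are illustrative assumptions.

# rotate_backups DIR: shift son -> father -> grandfather.
rotate_backups() {
    dir=$1
    if [ -f "$dir/oscar-father.sql" ]; then
        mv -f "$dir/oscar-father.sql" "$dir/oscar-grandfather.sql"
    fi
    if [ -f "$dir/oscar-son.sql" ]; then
        mv -f "$dir/oscar-son.sql" "$dir/oscar-father.sql"
    fi
    return 0
}

# backup_database DIR: dump, rotate, then store the dump as the newest copy.
backup_database() {
    dir=$1
    # Dump to a temporary file first so that a failed dump does not
    # push an older generation out of the rotation.
    if ! mysqldump --opt oscar > "$dir/oscar-new.sql"; then
        rm -f "$dir/oscar-new.sql"
        return 1
    fi
    rotate_backups "$dir"
    mv -f "$dir/oscar-new.sql" "$dir/oscar-son.sql"
}
```

To restore, read the chosen generation back with the mysql command, for example mysql oscar < oscar-son.sql from the backup directory.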