This chapter describes the function and physical components of the administrative and rack leader controller servers (also referred to as nodes).
For purposes of this chapter, “administration/controller server” is used as a catch-all phrase for the stand-alone servers that act as management infrastructure controllers. The specialized functions these servers perform within the ICE system primarily include:
Administration and management
Rack leader controller (RLC) functions
Other servers can be configured to provide additional services, such as:
Fabric management (usually used with 8-rack or larger systems)
Login
Batch
I/O gateway (storage)
Note that these functions are usually performed by the system's “service nodes”, which are additional individual servers set up for single or multiple service tasks.
User interfaces consist of the Compute Cluster Administrator, the Compute Cluster Job Manager, and a Command Line Interface (CLI). Management services include job scheduling, job and resource management, Remote Installation Services (RIS), and a remote command environment. The 2U administrative controller server is connected to the system via a Gigabit Ethernet link; it is not directly linked to the system's InfiniBand communication fabric.
Note: The system management software runs on the administrative node, RLC, and service nodes as a distributed software function. The system management software performs all of its tasks on the ICE system through an Ethernet network.
The administrative controller server is at the top of the distributed management infrastructure within the ICE system. The overall ICE 8400 series management is hierarchical (see Figure 5-1), with the RLC(s) communicating with the compute nodes via the chassis management controllers (CMCs).
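The hierarchy described above can be pictured as a small tree. The following Python sketch is illustrative only and is not part of the SGI management software: the class names, the `management_path` helper, and all node names are hypothetical, and it assumes each compute node is reached through exactly one CMC under one RLC.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cmc:
    """A CMC and the compute nodes reached through it (hypothetical model)."""
    name: str
    compute_nodes: List[str] = field(default_factory=list)


@dataclass
class RackLeader:
    """A rack leader controller (RLC) and the CMCs it talks to."""
    name: str
    cmcs: List[Cmc] = field(default_factory=list)


@dataclass
class AdminNode:
    """The administrative controller server at the top of the hierarchy."""
    name: str
    rack_leaders: List[RackLeader] = field(default_factory=list)


def management_path(admin: AdminNode, node: str) -> Optional[List[str]]:
    """Return the chain admin -> RLC -> CMC -> node, or None if not found."""
    for rlc in admin.rack_leaders:
        for cmc in rlc.cmcs:
            if node in cmc.compute_nodes:
                return [admin.name, rlc.name, cmc.name, node]
    return None


# Hypothetical two-rack system; every name here is invented for illustration.
admin = AdminNode("admin", [
    RackLeader("rlc1", [Cmc("rlc1-cmc0", ["r1n0", "r1n1"])]),
    RackLeader("rlc2", [Cmc("rlc2-cmc0", ["r2n0"])]),
])
print(management_path(admin, "r2n0"))
```

Walking the tree this way mirrors the layering in the text: a request for a compute node passes from the administrative server through one RLC and one CMC.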
The system administrative controller unit acts as the ICE system's primary interface to the “outside world”, typically a local area network (LAN). The server is used by administrators to provision and manage cluster functions using SGI's cluster manager software.
For systems using a separate login, batch, I/O, fabric management, or other service node, this 2U server is also an available option. Figure 5-2 and Figure 5-3 show front and rear views of the 2U administration/service node. Note that the XE270 uses up to 12 DIMM memory cards.
For more information on the XE270, see the SGI Altix XE270 User's Guide (P/N 007-5535-00x).
The administrative server's control panel features are shown in Figure 5-4.
Table 5-1. System administrative server control panel functions
| Functional feature | Functional description |
|---|---|
| Unit identifier button | Pressing this button lights an LED on both the front and rear of the server for easy system location in large configurations. The LED remains on until the button is pushed a second time. |
| Universal information LED | This multi-color LED blinks red quickly to indicate a fan failure and blinks red slowly to indicate a power failure. A continuous solid red LED indicates that a CPU is overheating. The LED is solid blue or blinking blue when used as the unit identifier (UID). |
| NIC 2 Activity LED | Indicates network activity on LAN 2 when flashing green. |
| NIC 1 Activity LED | Indicates network activity on LAN 1 when flashing green. |
| Disk activity LED | Indicates drive activity when flashing. |
| Power LED | Indicates power is being supplied to the server's power supply units. |
| Reset button | Pressing this button reboots the server. |
| Power button | Pressing this button applies or removes power from the power supply to the server. Turning off power with this button removes main power but keeps standby power supplied to the system. |
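The universal information LED states in Table 5-1 amount to a small lookup table. The Python sketch below transcribes them for quick reference, for example in a site-local troubleshooting script. The function name `decode_info_led` and the pattern labels (`fast_blink`, `slow_blink`, `solid`, `blink`) are assumptions made for illustration; the table does not define exact blink rates, and this is not an SGI-provided tool.

```python
from typing import Dict, Tuple

# (color, pattern) -> meaning, transcribed from Table 5-1.
# The pattern labels are hypothetical names chosen for this sketch.
INFO_LED_STATES: Dict[Tuple[str, str], str] = {
    ("red", "fast_blink"): "fan failure",
    ("red", "slow_blink"): "power failure",
    ("red", "solid"): "CPU overheating",
    ("blue", "solid"): "unit identifier (UID) active",
    ("blue", "blink"): "unit identifier (UID) active",
}


def decode_info_led(color: str, pattern: str) -> str:
    """Map an observed LED color and pattern to the condition it signals."""
    key = (color.strip().lower(), pattern.strip().lower())
    # Any combination not listed in Table 5-1 is reported as unknown.
    return INFO_LED_STATES.get(key, "unknown state")


print(decode_info_led("red", "slow_blink"))
```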
An MPI job is started from the rack leader controller server and the sub-processes are distributed to the system blade compute nodes. The main process on the RLC server will wait for the sub-processes to finish. Note that every Altix ICE 8400 system is required to have at least one RLC. For multi-rack systems or systems that run many MPI jobs, multiple RLC servers are used to distribute the load (one per rack).
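The launch-and-wait pattern described above can be sketched in a few lines. This Python example is an analogy only: it does not use MPI and does not distribute work to compute nodes. It starts local child processes with the standard `subprocess` module, and the main process blocks until each one finishes. The function name `run_job` and the per-rank command are hypothetical.

```python
import subprocess
import sys
from typing import List, Tuple


def run_job(rank_count: int) -> List[Tuple[int, str]]:
    """Start rank_count child processes, then wait for all of them.

    Stand-in for the main process on the RLC: on a real system the
    sub-processes would run on the blade compute nodes, not locally.
    """
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", f"print('rank {rank} done')"],
            stdout=subprocess.PIPE,
            text=True,
        )
        for rank in range(rank_count)
    ]
    results = []
    for proc in procs:
        # communicate() blocks until this child exits and returns its output.
        out, _ = proc.communicate()
        results.append((proc.returncode, out.strip()))
    return results


results = run_job(3)
print(results)
```

All children are started before the first wait, so they run concurrently while the main process waits on each in turn.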
Batch or login functions most often run on separate service nodes, especially when the system is a large-scale multi-rack installation or has a large number of users. This 1U server may also be used as a separate (non-RLC) login, batch, I/O, or fabric management node. See the section “Modularity and Scalability” in Chapter 3 for a list of administration and support server types and additional functional descriptions.
The Altix ICE system also offers a 3U-high service node as a separate login, batch, I/O, fabric management, or graphics support node. Under specific circumstances the 3U server can be configured as a mass storage resource for the ICE system. Figure 5-6 shows an example front view of the Altix XE500 server.
For more information on using the 3U service node, see the SGI Altix XE500 System User's Guide (P/N 007-5572-00x).
Check with your SGI sales or service representative for more information on available graphics card options that can be used with the Altix XE500 in an ICE system.
Figure 5-7 shows an example rear view of the 3U Altix XE500 service node.
The highest-performance optional service node in the Altix ICE 8400 system is a 4U-high server. It can serve as a separate login, batch, I/O, fabric management, or graphics support node, or combine several of these functions. Under specific circumstances the 4U server can be configured as a mass storage resource for the ICE system.
Figure 5-8 shows the front controls and interfaces available on the server. Table 5-2 describes the front panel control and interface functions on the 4U server.
Figure 5-9 calls out the components used on the front of the 4U server. Table 5-3 identifies the components called out in the figure. Rear components used on the 4U server are shown in Figure 3-8.
For more information on using the 4U service node, see the SGI Altix UV 10 System User's Guide (P/N 007-5645-00x).
Table 5-2. 4U Service Node Front Control and Interface Descriptions
| Callout | Item function or description |
|---|---|
| A | Local area network (LAN) status LEDs (1 through 4) |
| B | System ID LED (blue) |
| C | Hard drive status LED (green) |
| D | System status/fault LED (green/amber) |
| E | Fan fault LED (amber) |
| F | System power LED (green) shows system power status |
| G | System reset button |
| H | VGA video connector |
| I | System ID button (toggles the blue identification LED - callout B) |
| J | System power button |
| K | Non-maskable interrupt (NMI) button - asserts NMI |
| L | USB 2.0 connector ports |
Table 5-3. 4U Service Node Front Panel Item Identification
| Front panel item | Functional description |
|---|---|
| A | Optional optical drive bay |
| B | Rear LAN LEDs |
| C | System control panel |
| D | Video connector |
| E | USB 2.0 connectors |
| F | 5.25-inch peripheral bay |
| G | Hard drive bays |