Chapter 5. ICE Administration/Leader Servers

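This chapter describes the function and physical components of the administrative and rack leader controller servers (also referred to as nodes) in the following sections: “Overview”, “Administrative/Controller Server”, “1U Rack Leader Controller Server”, “Optional 3U Service Nodes”, and “Optional 4U Service Nodes”.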

For purposes of this chapter “administration/controller server” is used as a catch-all phrase to describe the stand-alone servers that act as management infrastructure controllers. The specialized functions these servers perform within the ICE system primarily include:

Other servers can be configured to provide additional services, such as:

Note that these functions are usually performed by the system's “service nodes” which are additional individual servers set up for single or multiple service tasks.

Overview

User interfaces consist of the Compute Cluster Administrator, the Compute Cluster Job Manager, and a Command Line Interface (CLI). Management services include job scheduling, job and resource management, Remote Installation Services (RIS), and a remote command environment. The 2U administrative controller server is connected to the system via a Gigabit Ethernet link; it is not directly linked to the system's InfiniBand communication fabric.


Note: The system management software runs on the administrative node, the rack leader controller (RLC), and the service nodes as a distributed software function. The system management software performs all of its tasks on the ICE system through an Ethernet network.

The administrative controller server is at the top of the distributed management infrastructure within the ICE system. The overall ICE 8400 series management is hierarchical (see Figure 5-1), with each RLC communicating with the compute nodes through the chassis management controllers (CMCs).
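The hierarchy can be pictured as a simple tree. The following Python sketch is illustrative only: the class names, node names, and counts are hypothetical and are not part of SGI's cluster manager software. It models only the path a management request would take from the administrative server, through an RLC and a CMC, down to a compute node.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cmc:
    """Chassis management controller fronting a set of compute nodes."""
    name: str
    compute_nodes: list[str] = field(default_factory=list)


@dataclass
class RackLeader:
    """Rack leader controller (RLC) that reaches compute nodes via CMCs."""
    name: str
    cmcs: list[Cmc] = field(default_factory=list)


@dataclass
class AdminServer:
    """Administrative server at the top of the management hierarchy."""
    name: str
    rack_leaders: list[RackLeader] = field(default_factory=list)

    def management_path(self, node: str) -> list[str]:
        """Return the chain of controllers between this server and a node."""
        for rlc in self.rack_leaders:
            for cmc in rlc.cmcs:
                if node in cmc.compute_nodes:
                    return [self.name, rlc.name, cmc.name, node]
        raise KeyError(f"unknown compute node: {node}")


# Hypothetical single-rack layout; every name here is invented for illustration.
admin = AdminServer(
    "admin",
    [RackLeader("rlc-r1", [Cmc("cmc-r1-0", ["blade-0", "blade-1"]),
                           Cmc("cmc-r1-1", ["blade-2", "blade-3"])])],
)
print(" -> ".join(admin.management_path("blade-2")))
```

In a multi-rack system the same structure would simply hold one RackLeader entry per rack.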

Figure 5-1. ICE System Administration Hierarchy Example Block Diagram

Administrative/Controller Server

The system administrative controller unit acts as the ICE system's primary interface to the “outside world”, typically a local area network (LAN). The server is used by administrators to provision and manage cluster functions using SGI's cluster manager software.

This 2U server is also an available option for systems that use a separate login, batch, I/O, fabric management, or other service node. Figure 5-2 and Figure 5-3 show front and rear views of the 2U administration/service node. Note that the XE270 supports up to 12 DIMM memory modules.

Figure 5-2. Front View of 2U Administration/Controller or Service Node

Figure 5-3. Rear View of 2U Administration/Controller or Service Node

For more information on the XE270, see the SGI Altix XE270 User's Guide (P/N 007-5535-00x).

The administrative server's control panel features are shown in Figure 5-4.

Figure 5-4. Administrative/Controller Server Control Panel Diagram

Table 5-1. System administrative server control panel functions

| Functional feature | Functional description |
| --- | --- |
| Unit identifier button | Pressing this button lights an LED on both the front and rear of the server for easy system location in large configurations. The LED remains on until the button is pushed a second time. |
| Universal information LED | This multi-color LED blinks red quickly to indicate a fan failure and blinks red slowly to indicate a power failure. Solid red indicates that a CPU is overheating. The LED is solid blue or blinking blue when used as the unit identifier (UID). |
| NIC 2 Activity LED | Indicates network activity on LAN 2 when flashing green. |
| NIC 1 Activity LED | Indicates network activity on LAN 1 when flashing green. |
| Disk activity LED | Indicates drive activity when flashing. |
| Power LED | Indicates power is being supplied to the server's power supply units. |
| Reset button | Pressing this button reboots the server. |
| Power button | Pressing this button applies or removes power from the power supply to the server. Turning off power with this button removes main power but keeps standby power supplied to the system. |
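The universal information LED states in Table 5-1 amount to a small lookup from color and blink pattern to a condition. The sketch below shows one way a monitoring script might encode that lookup. The key names (such as "fast_blink") and the function name are hypothetical labels chosen for illustration; the source does not give blink rates or any software interface for reading the LED.

```python
# (color, pattern) -> meaning, transcribed from Table 5-1.
# The pattern labels are hypothetical; the manual only says "quickly" and "slowly".
UNIVERSAL_LED_STATES = {
    ("red", "fast_blink"): "fan failure",
    ("red", "slow_blink"): "power failure",
    ("red", "solid"): "CPU overheating",
    ("blue", "solid"): "unit identifier (UID) active",
    ("blue", "blink"): "unit identifier (UID) active",
}


def decode_universal_led(color: str, pattern: str) -> str:
    """Map an observed LED color and pattern to the condition it signals."""
    key = (color.lower(), pattern.lower())
    return UNIVERSAL_LED_STATES.get(key, "unknown state")


print(decode_universal_led("red", "slow_blink"))
print(decode_universal_led("Blue", "solid"))
```

Combinations not listed in the table fall through to "unknown state" rather than being guessed.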


1U Rack Leader Controller Server

An MPI job is started from the rack leader controller server and the sub-processes are distributed to the system blade compute nodes. The main process on the RLC server will wait for the sub-processes to finish. Note that every Altix ICE 8400 system is required to have at least one RLC. For multi-rack systems or systems that run many MPI jobs, multiple RLC servers are used to distribute the load (one per rack).
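The launch-and-wait pattern described above can be illustrated with a small analogy in plain Python. This is not MPI and does not use any SGI software: threads stand in for the sub-processes, and the function and node names are hypothetical. It shows only the control flow, in which a launcher hands out work to each compute node and then blocks until every piece has finished.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def run_rank(rank: int, node: str) -> str:
    """Stand-in for one sub-process of the job running on a compute node."""
    return f"rank {rank} finished on {node}"


def launch_job(nodes: list[str]) -> list[str]:
    """Start one worker per node, then block until all of them finish."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [pool.submit(run_rank, rank, node)
                   for rank, node in enumerate(nodes)]
        # The launcher waits here, as the main process on the RLC server does.
        return [f.result() for f in futures]


# Hypothetical node names, invented for illustration.
results = launch_job(["blade-0", "blade-1", "blade-2"])
for line in results:
    print(line)
```

Because the launcher stays occupied until the last worker returns, a system that runs many jobs benefits from spreading launches across several RLC servers.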

Figure 5-5. 1U Rack Leader Controller (RLC) Server Front and Rear Panels

Batch or login functions most often run on separate service nodes, especially when the system is a large-scale multi-rack installation or has a large number of users. This 1U server may also be used as a separate (non-RLC) login, batch, I/O, or fabric management node. See the section “Modularity and Scalability” in Chapter 3 for a list of administration and support server types and additional functional descriptions.

Optional 3U Service Nodes

The Altix ICE system also offers a 3U-high service node as a separate login, batch, I/O, fabric management, or graphics support node. Under specific circumstances the 3U server can be configured as a mass storage resource for the ICE system. Figure 5-6 shows an example front view of the Altix XE500 server.

For more information on using the 3U service node, see the SGI Altix XE500 System User's Guide (P/N 007-5572-00x).

Check with your SGI sales or service representative for more information on available graphics card options that can be used with the Altix XE500 in an ICE system.

Figure 5-6. SGI Altix XE500 3U Service Node Front View

Figure 5-7 shows an example rear view of the 3U Altix XE500 service node.

Figure 5-7. SGI Altix XE500 3U Service Node Rear View

Optional 4U Service Nodes

The highest performance optional service node in the Altix ICE 8400 system is offered as a 4U-high service node. It can serve as a separate login, batch, I/O, fabric management, or graphics support node, or combine several of these functions. Under specific circumstances the 4U server can be configured as a mass storage resource for the ICE system.

Figure 5-8 shows the front controls and interfaces available on the server. Table 5-2 describes the front panel control and interface functions on the 4U server.

Figure 5-9 calls out the components used on the front of the 4U server. Table 5-3 identifies the components called out in the figure. Rear components used on the 4U server are shown in Figure 3-8.

For more information on using the 4U service node, see the SGI Altix UV 10 System User's Guide (P/N 007-5645-00x).

Figure 5-8. 4U Service Node Front Controls and Interfaces

Table 5-2. 4U Service Node Front Control and Interface Descriptions

| Callout | Item function or description |
| --- | --- |
| A | Local area network (LAN) status LEDs (1 through 4) |
| B | System ID LED (blue) |
| C | Hard drive status LED (green) |
| D | System status/fault LED (green/amber) |
| E | Fan fault LED (amber) |
| F | System power LED (green) shows system power status |
| G | System reset button |
| H | VGA video connector |
| I | System ID button (toggles the blue identification LED - callout B) |
| J | System power button |
| K | Non-maskable interrupt (NMI) button - asserts NMI |
| L | USB 2.0 connector ports |

Figure 5-9. 4U Service Node Front Panel

Table 5-3. 4U Service Node Front Panel Item Identification

| Front panel item | Functional description |
| --- | --- |
| A | Optional optical drive bay |
| B | Rear LAN LEDs |
| C | System control panel |
| D | Video connector |
| E | USB 2.0 connectors |
| F | 5.25-inch peripheral bay |
| G | Hard drive bays |