Chapter 1. Redundant Power Supply Features and Operation

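This chapter explains how the redundant power supply operates, how the Origin2000 system was changed to accommodate it, how to interpret power supply failure messages, how to get a failed component replaced, and how to power-cycle a multirack configuration. It also lists the technical specifications of the redundant power supply.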

Redundant Power Supply Operation

The redundant power supply consists of an enclosure containing four power supply modules. Figure 1-1 shows the redundant power supply in an Origin2000 deskside system.

Figure 1-1. Redundant Power Supply in Origin2000 Deskside, Facade Removed


Each power supply module fulfills 25% of the power needs of the Origin2000 system. If one module fails, the three remaining modules run at a higher rate, each supplying 33.3% of the system's power. This option enables a Silicon Graphics Origin2000 deskside or rackmount system to continue operation after a power supply failure; a Silicon Graphics System Support Engineer can easily replace a failed module while the system remains in normal operation.
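
The load-sharing figures above follow from dividing the full load evenly among the working modules. The following is a minimal sketch of that arithmetic, assuming an even split; the function name is hypothetical and is not part of any Silicon Graphics software.

```python
def share_per_module(working_modules: int) -> float:
    """Percentage of system power each working module supplies,
    assuming the load is split evenly among them."""
    if working_modules < 1:
        raise ValueError("at least one working module is required")
    return 100.0 / working_modules


print(f"{share_per_module(4):.1f}%")  # all four modules working: 25.0%
print(f"{share_per_module(3):.1f}%")  # one module failed: 33.3%
```

The sketch covers only the arithmetic. It does not model the shutdown that occurs when more than one module fails.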


Note: The redundant power supply requires a 220-volt outlet.

If the redundant power supply senses an overtemperature condition on any output, or detects insufficient airflow, it shuts down. To restart it, you must manually power-cycle the AC input power.

Redundant Power Supply LEDs

The LEDs on the redundant power supply function the same way as those on the original power supply; see your system's owner's guide for a description. In addition, each module in the redundant power supply has a red fault LED.

Note the following:

  • The LEDs provide information for service personnel; none are visible unless the system facade is removed.

  • Removing the system facade is necessary only when a fan tray or power supply is being replaced; these are not customer-replaceable units.

  • The facade must remain in place for normal operation.

The module system controller (MSC) provides complete information on system status and any power supply fault conditions, as explained in “MSC Failure Messages”.

Origin2000 Changes for the Redundant Power Supply

The rackmount Origin2000 facade and the deskside Origin2000 facade and chassis have been slightly redesigned to accommodate the redundant power supply. The revised MSC accommodates signals from the redundant power supply that were not produced by the original power supply, resulting in new messages. Appendix A, “MSC Messages,” explains these and, for reference, all other MSC messages. The MSC firmware revision level for the redundant power supply is 4.0 or greater.

The redundant power supply option uses a redesigned fan tray. The three front fans in the tray, which are directly below the power supply, are more powerful to handle the increased airflow requirements of the redundant power supply. These three fans run only at high speed.

The other six fans in the fan tray have two speeds, which the system regulates according to ambient temperature. If one of these six fans fails, the others compensate to enable continuous operation. If two of these six fail, or if one of the three front fans fails, the system shuts down. See “Getting the Failed Component Replaced” for information on replacing the failed component.
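
The fan failure rules can be summarized as a small decision function. This is an illustrative sketch only, not MSC firmware logic: the function and constant names are hypothetical, the fan numbering (1 through 3 for the front fans, 4 through 9 for the others) follows the FAN FAIL entry in Table 1-1, and the temperature flag reflects the condition stated in that entry.

```python
FRONT_FANS = {1, 2, 3}           # single-speed fans below the power supply
OTHER_FANS = {4, 5, 6, 7, 8, 9}  # two-speed fans


def system_shuts_down(failed_fans, temperature_high=False):
    """Return True if the described rules call for a system shutdown."""
    failed = set(failed_fans)
    unknown = failed - FRONT_FANS - OTHER_FANS
    if unknown:
        raise ValueError(f"unknown fan numbers: {sorted(unknown)}")
    if failed & FRONT_FANS:
        return True   # any front fan failure shuts the system down
    other_failed = failed & OTHER_FANS
    if len(other_failed) >= 2:
        return True   # two or more of the six two-speed fans failed
    if len(other_failed) == 1 and temperature_high:
        return True   # one failed fan cannot be compensated when hot
    return False      # no failure, or a single failure that is compensated


print(system_shuts_down([5]))     # False
print(system_shuts_down([5, 6]))  # True
print(system_shuts_down([2]))     # True
```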

Power Supply Failure Messages

If a redundant power supply module fails, the MSC displays a failure message. You can use hwgraph to determine whether a redundant power supply is present, and hinv -v to determine whether redundancy is operational. This section describes the hwgraph and hinv -v output, as well as the MSC messages relating to system failure.

hwgraph Output

You can use hwgraph to determine if a redundant power supply is present in the system. For example:

ls /hw/module/1/slot/n1/node/rps

If a message like the following appears, the rps directory is not present, and a redundant power supply is not present in the system.

Cannot access rps: No such file or directory
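
The same check can be scripted. The sketch below assumes a Python interpreter is available on the system and that the rps entry appears as a directory at the path used in the example above; the module number and slot name vary by configuration, and the function name is hypothetical.

```python
import os

# Path taken from the ls example; adjust the module and slot as needed.
RPS_PATH = "/hw/module/1/slot/n1/node/rps"


def rps_present(path: str = RPS_PATH) -> bool:
    """Return True if the rps directory exists in the hardware graph."""
    return os.path.isdir(path)


if __name__ == "__main__":
    if rps_present():
        print("Redundant power supply is present")
    else:
        print("Redundant power supply is not present")
```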

hinv Output

You can use hinv -v to determine if redundancy is operational. For each Origin2000 module, example output is as follows:

Redundant Power Supply in Module 1 (Enabled)

If redundancy is lost, the output is as follows:

Redundant Power Supply in Module 1 (Lost Redundancy)
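
On a system with several modules, it can help to collect the status lines programmatically. The following is a sketch that parses text in the format shown above; it assumes the hinv -v output has already been captured as a string. The function name is hypothetical, and the Module 2 line in the sample is invented for illustration.

```python
import re

# Matches lines such as "Redundant Power Supply in Module 1 (Enabled)".
RPS_LINE = re.compile(r"Redundant Power Supply in Module (\d+) \(([^)]+)\)")


def redundancy_by_module(hinv_output: str) -> dict:
    """Map each module number to its reported redundancy status."""
    status = {}
    for line in hinv_output.splitlines():
        match = RPS_LINE.search(line)
        if match:
            status[int(match.group(1))] = match.group(2)
    return status


sample = (
    "Redundant Power Supply in Module 1 (Enabled)\n"
    "Redundant Power Supply in Module 2 (Lost Redundancy)\n"  # invented line
)
for module, state in sorted(redundancy_by_module(sample).items()):
    print(module, state)
```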

The syslog file displays a message when redundancy is lost; for example:

ALERT: (MAINT-NEEDED):module 1 MSC: Power Supply Redundancy is Lost

When redundancy is restored, syslog shows a message such as the following:

WARNING: module 1 MSC: Power Supply Redundancy is Restored
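
To track redundancy over time, the two syslog messages can be extracted from log lines. This is a minimal sketch that assumes the messages appear exactly as in the examples above; the function name is hypothetical, and the caller supplies the log lines from wherever syslog is written on the system.

```python
import re

# Matches the "Lost" and "Restored" messages shown in the examples.
EVENT = re.compile(
    r"module (\d+) MSC: Power Supply Redundancy is (Lost|Restored)"
)


def redundancy_events(lines):
    """Yield (module number, "Lost" or "Restored") for each matching line."""
    for line in lines:
        match = EVENT.search(line)
        if match:
            yield int(match.group(1)), match.group(2)


log = [
    "ALERT: (MAINT-NEEDED):module 1 MSC: Power Supply Redundancy is Lost",
    "WARNING: module 1 MSC: Power Supply Redundancy is Restored",
]
for module, event in redundancy_events(log):
    print(module, event)
```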

MSC Failure Messages

If a power supply module fails, the MSC displays this message:

RPWR FL


Note: When the system boots, the PROM gets information from the system controller about redundant power supply status. However, if a module fails while the system is still in PROM mode, the system does not recognize the loss of redundancy.

In the unlikely event that more than one module in the redundant power supply fails, or if the redundant power supply enclosure fails, or for certain fan tray failures, the system shuts down. Table 1-1 lists the MSC error messages informing you of these conditions.

Table 1-1. MSC Failure Messages

Message    Condition
RPWR FL    Power supply has entered a nonredundant mode because a power supply module has failed.
PFW FAIL   Power supplied to the system has failed or dropped below acceptable parameters. The system has shut down.
PS FAIL    The internal power supply has failed and the system has shut down.
OVR TEMP   The system's temperature has exceeded acceptable limits and the system has shut down.
I2C FAIL   The system controller and the node board(s) cannot communicate. This message can appear during power-on of a multirack configuration. See “Power-Cycling a Multirack Configuration”.
M FAN FL   More than one system fan has failed and the system has shut down.
FAN FAIL   A system fan has failed. The system shuts down if fan 1, 2, or 3 (front fans, under the power supply) fails, or if one of fans 4 through 9 fails and the system temperature is high.

For a full listing of MSC messages, see Appendix A, “MSC Messages.” Information here and in Appendix A supersedes MSC message information in the Origin2000 Deskside Owner's Guide and Origin2000 Rackmount Owner's Guide.
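
For scripts that react to MSC messages, the entries in Table 1-1 can be held in a lookup table. The sketch below is illustrative only: the names are hypothetical, the condition text is abbreviated from the table, and the shutdown flag for I2C FAIL is taken from “Power-Cycling a Multirack Configuration” rather than from the table itself.

```python
# Message -> (condition, shuts_down). shuts_down is True, False, or None
# when the outcome depends on which fan failed and on system temperature.
MSC_FAILURE_MESSAGES = {
    "RPWR FL": ("A power supply module failed; redundancy is lost", False),
    "PFW FAIL": ("Input power failed or dropped below limits", True),
    "PS FAIL": ("The internal power supply failed", True),
    "OVR TEMP": ("System temperature exceeded limits", True),
    # The affected module shuts down; other modules continue to boot.
    "I2C FAIL": ("System controller and node board(s) cannot communicate", True),
    "M FAN FL": ("More than one system fan failed", True),
    "FAN FAIL": ("A system fan failed", None),
}


def describe(message: str) -> str:
    """Return a one-line summary for an MSC failure message."""
    condition, shuts_down = MSC_FAILURE_MESSAGES[message.strip().upper()]
    if shuts_down is None:
        outcome = "system may shut down"
    elif shuts_down:
        outcome = "system shuts down"
    else:
        outcome = "system keeps running"
    return f"{condition} ({outcome})"


print(describe("RPWR FL"))
```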

Getting the Failed Component Replaced

If an MSC failure message or a lost redundancy message described in “Power Supply Failure Messages” appears, immediately call your service provider. The power supply cannot be turned on again until the faulty modules, redundant power supply enclosure, or fan tray is replaced. Table 1-2 gives part numbers for failed components.

Table 1-2. Redundant Power Supply Replacement Part Numbers

Failed Component                                           Order Number
Redundant power supply module                              060-0055-00x
Redundant power supply enclosure, including four modules   013-2469-00x
Fan tray for redundant power supply                        013-2329-00x



Warning: The Origin2000 system uses electrical power internally that is hazardous if the equipment is improperly handled. Only Silicon Graphics System Support Engineers (SSEs) or other personnel trained by Silicon Graphics are qualified to replace a failed power supply, power supply module, or fan tray.

The SSE removes the faulty power supply or module, or failed fan tray, only if the replacement component is available. Leaving a component bay empty causes airflow problems and exposes potentially dangerous sources of electrical power.

Power-Cycling a Multirack Configuration

A module displaying the I2C FAIL message automatically shuts down; other modules in the multirack configuration continue to boot. Depending on circumstances, you might want to bring down the other modules in the configuration. To do so, follow these steps:

  1. If the configuration has booted to the IRIX level, verify that filesystems are backed up and make sure all users are off the systems in the configuration.

  2. On each module in the configuration, become superuser and shut down the system software by entering shutdown -y -g0.

  3. Power off the configuration with the MMSC Power Down command. See Chapter 7 of the Origin2000 Rackmount Owner's Guide for more information if necessary. (Alternatively, you can power off each module individually by putting the key switch on the front of each module into the standby position.)

  4. Power off the module that displayed the I2C FAIL message by pushing down the main power switch on the rear of the module; see Figure 1-2.

    Figure 1-2. Module Main Power Switch



    Note: The module that displayed the I2C FAIL message is the only one you need to power off with the power switch. Leave the switch on the Origin Rack's power distribution unit (PDU) and the key switch on the front of the disabled module in the ON position.


  5. Wait 30 seconds. Check that all LEDs have gone dark.

  6. Push up the module power switch on the module that displayed the I2C FAIL message.

  7. Power on the configuration with the MMSC Power Up command.

Technical Specifications

Table 1-3 lists technical data for the redundant power supply.

Table 1-3. Redundant Power Supply Specifications

Parameter                       Specification
Voltage                         187-264 volts, single-phase
Maximum watts (from the wall)   2,300 watts
Minimum power factor            0.98
Maximum inrush current          140 amps
Frequency                       47-63 hertz
Maximum heat output             7,850 Btu/hr (1.63 ton AC load)
Weight                          16 kg (35 lb) complete (enclosure with four modules); 2.8 kg (6.25 lb) per module
Ambient temperature             Minimum 0 °C (32 °F); maximum 50 °C (122 °F)
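
The maximum heat output figure is consistent with the maximum wall power converted to Btu/hr, on the assumption that all power drawn ends up as heat. The following is a minimal sketch of that conversion; the function name is hypothetical, and the conversion factor is the standard 1 watt = 3.412142 Btu/hr.

```python
BTU_PER_HR_PER_WATT = 3.412142  # standard conversion factor


def watts_to_btu_per_hr(watts: float) -> float:
    """Convert electrical power in watts to heat output in Btu/hr."""
    return watts * BTU_PER_HR_PER_WATT


# 2,300 watts from the wall comes to roughly 7,850 Btu/hr.
print(round(watts_to_btu_per_hr(2300)))
```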