Chapter 2. System Discovery, Installation, and Configuration

This chapter describes how to use the SGI Tempo systems management software to discover, install, and configure your Altix ICE system and covers the following topics:


Note: If you are upgrading from a prior release or installing SGI Tempo software patches, see “Installing SGI Tempo Patches and Updating SGI Altix ICE Systems” and “Upgrading from Prior SGI ProPack Releases to SGI ProPack 6 SP3”.


Configuring Factory-installed SGI Altix ICE System

This section describes what you should do if you wish to use the pre-installed software on the system admin controller (admin node).

Procedure 2-1. Configuring Factory-installed SGI Altix ICE System

    To configure the pre-installed software that comes on the admin node, perform the following steps:

    1. Use YaST to configure the first interface of the admin node for your house network. Settings to adjust may include the following:

      • Network settings including IP, default route, and so on

      • Root password

      • Time zone

    2. If you need to adjust SGI Altix ICE settings such as the Altix ICE cluster domain or any internal network ranges, you will need to reset the database and rediscover the leader nodes and service nodes, as follows:

      1. Start the configure-cluster command (see “configure-cluster Command Cluster Configuration Tool”).

      2. Choose the Reset Database operation. Read the on-screen instructions.

      3. After the database has been reset, choose Initial Setup Menu.

      4. Run the options in this menu in order, starting with Perform Initial Admin Node Infrastructure Setup. If you are changing any network ranges or the cluster subdomain, choose Network Settings before you proceed to Perform Initial Admin Node Infrastructure Setup.


        Note: You will get a message about the systemimager images already existing. You may choose to use the existing images instead of re-creating them, which saves about 30 minutes. Either choice is acceptable. However, do not choose to use the existing images if you changed the root password or time zone, because these settings are stored in the image when the image is created.


      5. At this point, you can begin to discover leader and service nodes and continue cluster installation. See “discover Command”.

    Overview of Installing Software and Configuring Your SGI Altix ICE System

    This section provides a high-level overview of installing and configuring your SGI Altix ICE system.

    Procedure 2-2. Overview of Installing Software and Configuring Your SGI Altix ICE System

      To install and configure software on your SGI Altix ICE system, perform the following steps:

      1. Follow “Installing SLES10 on the Admin Node”, “Installing SLES11 on the Admin Node”, or “Installing RHEL on the Admin Node” to install software on your system admin controller (admin node).

      2. Follow “configure-cluster Command Cluster Configuration Tool” to configure the overall cluster.

      3. Follow “Installing Software on the Rack Leader Controllers and Service Nodes” to install software on the leader nodes and service nodes.

      4. Follow “Discovering Compute Nodes” to discover compute nodes.

      5. Follow “Service Node Discovery, Installation and Configuration” to discover, install and configure service nodes.

      6. Ensure that all hardware components of the cluster (admin, leader, service, and compute nodes) have been discovered successfully, and then follow “InfiniBand Configuration” to configure and check the status of the InfiniBand fabric.

      7. Follow “Configuring the Service Node”, “Setting Up an NFS Home Server on a Service Node for Your Altix ICE System”, “Service Node NFS Server Alternate: Re-exporting House NFS Servers”, and “Setting Up a NIS Server for Your Altix ICE System” to complete your system setup.

      Installing Software on the System Admin Controller

      This section describes how to install software on the system admin controller (admin node). The system admin controller contains software for provisioning, administering, and operating the SGI Altix ICE 8200 system. The SGI Admin Node Autoinstallation DVD contains a software image for the system admin controller (admin node). It also contains the SGI Tempo and SGI ProPack for Linux packages, which are used in conjunction with the packages from the SLES10 SP2 DVD or SLES11 DVD to create leader, service, and compute images.

      The root image for the admin node appliance is created by SGI and installed onto the admin node using the admin install DVD.


      Note: If you are reinstalling the admin node, you may want to make a backup of the cluster configuration snapshot that comes with your system so that you can recover it later. You can find it in the /opt/sgi/var/ivt directory on the admin node; it is the earliest snapshot taken. You can use this information with the inventory verification tool (IVT) to verify that the current system shows the same hardware configuration as when it was shipped. For more information on IVT, see “Inventory Verification Tool” in Chapter 5.
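The backup suggested in the note above could be scripted along the following lines. This is a minimal sketch, not an SGI-supplied tool: the function names are hypothetical, and it assumes that the earliest snapshot is the file with the oldest modification time in the directory.

```shell
#!/bin/sh
# Sketch: back up the oldest file in a snapshot directory.
# Assumption: "earliest snapshot" means the oldest modification time.

# Print the name of the oldest regular file in directory $1.
oldest_snapshot() {
    os_dir=$1
    # ls -tr lists entries oldest first; stop at the first regular file.
    ls -tr "$os_dir" | while IFS= read -r os_name; do
        if [ -f "$os_dir/$os_name" ]; then
            printf '%s\n' "$os_name"
            break
        fi
    done
}

# Copy the oldest snapshot from $1 into backup directory $2,
# preserving its timestamps (cp -p).
backup_oldest_snapshot() {
    bs_src=$1
    bs_dest=$2
    bs_name=$(oldest_snapshot "$bs_src")
    if [ -z "$bs_name" ]; then
        echo "no snapshot found in $bs_src" >&2
        return 1
    fi
    mkdir -p "$bs_dest" && cp -p "$bs_src/$bs_name" "$bs_dest/"
}
```

For example, backup_oldest_snapshot /opt/sgi/var/ivt /root/ivt-backup would copy the snapshot before a reinstall; the destination path here is only an illustration, and the backup should be kept somewhere that survives repartitioning.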


      This section covers the following topics:

      Installing SLES10 on the Admin Node

      Procedure 2-3. Installing Software on the System Admin Controller

        To install SLES 10 software images on the system admin controller, perform the following steps:

        1. Turn on, reset, or reboot the system admin controller. The power on button is on the right of the system admin controller, as shown in Figure 2-1.

          Figure 2-1. System Admin Controller Power On Button and DVD Drive


          Prior to the SGI Tempo 1.2 release, the serial console was always used, even if the admin node install itself went to the VGA screen.

          The new method configures the default serial console used by the system to match the console used for installation.

          If you type "serial" at the admin DVD install prompt, the system is also configured for serial console operations after installation, and the yast2-firstboot questions appear on the serial console.

          If you press Enter at the prompt or type vga, the VGA screen is used for installation, as previously, and the system is also configured to use VGA as the default console thereafter.

          If you want to install to the VGA screen but want the serial console to be used for operations after initial installation, add a console= parameter to each kernel line in /boot/grub/menu.lst. Do this when the admin node boots for the first time after installation is completed. An example kernel line is as follows:

          kernel /boot/vmlinuz-2.6.16.46-0.12-smp root=/dev/disk/by-label/sgiroot console=ttyS1,38400n8 splash=silent showopts

          The appropriate entries were added to the inittab and /etc/security. The change above is the only one needed to switch the default console from VGA to serial. Likewise, to move from serial to VGA, remove the console= parameter altogether.

        2. Insert the SGI Admin Node Autoinstallation DVD in the DVD drive on the left of the system admin controller as shown in Figure 2-1.

        3. An autoinstall message appears on your console, as follows:

          SGI Admin Node Autoinstallation DVD
          
          The first time you boot after installation, you will be prompted for
          system setup questions early in the startup process.  These questions will
          appear on the same console you use to install the system.
          
          You may optionally append the "netinst" option with an nfs path to an ISO.
          
          Cascading Dual-Boot Support:
          install_slot=: install to a specific root slot, default 1
          re_partition_with_slots=: re-partition with number of slot positions, up to 5.
          default is 2. Reminder: applies to the whole cluster, not just admin node
          destructive=1: allow slots with existing filesystems to be re-created
          or signify ok to re-partition non-blank disks for re_partition_with_slots=
          
          You may install from the vga screen or from the serial console.
          The default system console will match the console you used for
          installation. Type "vga" for the vga screen or "serial" for serial.
          Append any additional parameters after "vga" or "serial".
          EXAMPLE: vga re_partition_with_slots=3 netinst=server:/mntpoint/admin.iso
          


          Note: If you want to use the serial console, enter serial at the boot: prompt. Otherwise, output for the install procedure goes to the VGA screen.


          It is important to note that the command line arguments you supply to the boot prompt will have implications for your entire cluster including things such as how many root slots are available, and which root slot to install to. Please read “Cascading Dual-Boot” before you install the admin node so you are prepared for these crucial decisions.

          You can press Enter at the boot prompt. The boot initrd.image executes, the hard drive is partitioned to create a swap area and a root file system, the Linux operating system and the cluster manager software are installed, and a repository is set up for the rack leader controller, service node, and compute node software RPMs.


          Note: When you boot the admin install DVD and choose to repartition an existing disk, all data is lost. If you are making use of cascading dual-boot (see “Cascading Dual-Boot”) and are reinstalling a given slot, the data in that slot will be deleted but the other slots will not be modified.


        4. Once installation of software on the system admin controller is complete, remove the DVD from the DVD drive.

        5. Once the system has been installed, enter the reboot command to reboot your system.


          Note: The output will go to the VGA screen unless you used serial for the admin install DVD earlier.


          You will see messages about the system admin controller booting the kernel. You can ignore any messages about a few services that may fail to start.


          Note: If you used the serial console for installation (serial is not the default), the console output and configuration questions from yast2 firstboot will go to the serial port. Pressing Ctrl-L will redraw the yast2 firstboot screen when you are using the serial console.


        6. After the reboot completes, the YaST first boot installation tool starts and a Welcome screen appears, as shown in Figure 2-2. Click on the Next button to proceed.


          Note: The YaST Installation Tool has a main menu with sub-menus. You will be redirected back to the main menu, at various times, as you follow the steps in this procedure.


          Figure 2-2. YaST Welcome Screen


          You will be prompted by YaST firstboot installer to enter your system details including the root password, network configuration, time zone, and so on.

        7. From the Hostname and Name Server Configuration screen, as shown in Figure 2-3, enter the hostname and domain name of your system in the appropriate fields. Make sure that Change Hostname via DHCP is unselected (no x should appear in the box). Click on the Next button to continue.

          Figure 2-3. Hostname and Domain Name Configuration Screen


          Note: You can use Ctrl L to refresh the YaST screen as necessary.


        8. The Network Configuration II screen appears. Select Change and a small window pops up that lets you choose Network Interfaces... or "Reset to Defaults" (as shown in Figure 2-4). Choose Network Interfaces. Click Next to continue.

          Figure 2-4. Network Card Configuration Interfaces Screen


        9. From the Network Card Configuration Overview screen, configure the first card under Name to establish the public network (sometimes called the house network) connection to your SGI Altix ICE 8200 system.

          Figure 2-5. Network Card Configuration Overview Screen



          Note: Do NOT configure the second interface at this time. A script will do this for you in a later step.


          Click on the Next button to continue.

        10. From the Network Address Setup screen, enter the IP address for the system admin controller. SGI recommends static IP configuration (as opposed to DHCP). This is your public/house network information. Click on the Next button to continue.

          Figure 2-6. Network Address Setup Screen


        11. From the Hostname and Name Server Configuration screen, enter the name and DNS domain name as shown in Figure 2-7. Note that the hostname was entered in step 7.

          Figure 2-7. Hostname and Name Server Configuration Screen


        12. From the Routing Configuration screen, enter the appropriate gateway address and netmask. Click on the OK button to continue.

        13. From the Clock and Time Zone screen, select the appropriate region and time zone. Click on the Next button to continue.

        14. From the Password for the System Administrator “root” screen, set the root password. Click on the Next button to continue.

        15. From the User Authentication Method screen, select the authentication method to use for the users on your system. Click on the Accept button to continue.

        16. Enter the user's full name, username, and user password in the New Local User screen. Click on the Next button to continue.

        17. From the Hardware Configuration screen, select Use Following Configuration. Click on the Next button to continue.

        18. An Installation Completed screen appears, as shown in Figure 2-8. Click on the Finish button.

          Figure 2-8. Installation Completed Screen

        19. After you have completed the YaST first boot installation instructions, log in to the system admin controller. You can use YaST to confirm or correct any configuration settings.


          Note: Make sure that your network settings are correct before proceeding with cluster configuration.


        20. You are now ready to run the configure-cluster command. Proceed to the next section, “configure-cluster Command Cluster Configuration Tool”.
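The console= change described in step 1 of this procedure can be sketched as a pair of shell filters. This is an illustrative sketch only: the function names are hypothetical, both functions write the modified file to standard output instead of editing it in place, and the default value ttyS1,38400n8 is simply the one used in the example kernel line. The parameter is appended at the end of each kernel line, which differs from the position shown in the example; the order of kernel parameters does not matter here.

```shell
#!/bin/sh
# Sketch: add or remove a console= parameter on GRUB menu.lst kernel lines.
# Both functions print the result; they do not modify the input file.

# Append console=<value> to every kernel line in file $1 that lacks one.
# $2 is optional and defaults to the value from the example in the text.
add_serial_console() {
    sc_file=$1
    sc_console=${2:-ttyS1,38400n8}
    sed -e '/^[[:space:]]*kernel[[:space:]]/{' \
        -e '/console=/!s/[[:space:]]*$/ console='"$sc_console"'/' \
        -e '}' "$sc_file"
}

# Strip every console= parameter from the kernel lines in file $1,
# which returns the default console to VGA.
remove_serial_console() {
    sed -e '/^[[:space:]]*kernel[[:space:]]/s/[[:space:]]\{1,\}console=[^[:space:]]*//g' "$1"
}
```

One cautious way to use it is to write the output to a new file, for example add_serial_console /boot/grub/menu.lst > /tmp/menu.lst.new, compare the two files, and copy the new file into place only after confirming that each kernel line looks as expected.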

        Installing SLES11 on the Admin Node


        Note: Novell SUSE Linux Enterprise Server 11 (SLES11) specific information in this document applies to SGI software product(s) built for use with SLES11. These software products mention SLES11 on the physical media cover art or in the ISO file name (for example, foundation-1SP3-cd1-media-sles11-ia64.iso). For information on SLES11 availability from SGI, refer to the announcements section on SGI Supportfolio, https://support.sgi.com .


        Procedure 2-4. Installing Software on the System Admin Controller

          To install SLES 11 software images on the system admin controller, perform the following steps:

          1. Perform steps 1 through 5 of the SLES10 procedure described above (see “Installing SLES10 on the Admin Node”). They are the same for SLES11.

          2. After the reboot completes from the first five steps you followed in the SLES10 Admin Node Installation instructions, you will eventually see the YaST2 - firstboot@Linux Welcome screen, as shown in Figure 2-9. Click on the Next button to continue.

            Figure 2-9. YaST2 - firstboot@Linux Welcome Screen



            Note: The YaST Installation Tool has a main menu with sub-menus. You will be redirected back to the main menu, at various times, as you follow the steps in this procedure.


            You will be prompted by YaST firstboot installer to enter your system details including the root password, network configuration, time zone, and so on.

          3. From the Hostname and Domain Name screen, as shown in Figure 2-10, enter the hostname and domain name of your system in the appropriate fields. Make sure that Change Hostname via DHCP is not selected (no x should appear in the box). Click the Next button to continue.

            Figure 2-10. Hostname and Domain Name Screen



            Note: You can use Ctrl L to refresh the YaST screen as necessary.


          4. The Network Configuration II screen appears, as shown in Figure 2-11. Select Change and a small window pops up that lets you choose Network Interfaces... or Reset to Defaults. Choose Network Interfaces.

            Figure 2-11. Network Configuration II Screen


          5. From the Network Settings screen, as shown in Figure 2-12, configure the first card under Name to establish the public network (sometimes called the house network) connection to your SGI Altix ICE 8200 system. To do this, highlight the first card and select Edit.

            Figure 2-12. Network Settings Screen



            Note: In SLES11, you return to this screen later to set up items such as the default route and DNS. You can see those menu choices just to the right of Overview in Figure 2-12.


          6. The Network Card Setup screen appears, as shown in Figure 2-13. SGI suggests using static IP addresses and not DHCP for admin nodes. Select Statically assigned IP Address. Once selected, you can enter the IP Address, Subnet Mask, and Hostname. These are the settings for your admin node's house/public network interface. You will enter the default route, if needed, in a different step. Select Next to continue.

            Figure 2-13. Network Card Setup Screen


          7. You are now back at the Network Settings screen, as shown in Figure 2-14. Select Hostname/DNS. In this screen, enter your house/public network hostname and fully qualified domain name, and supply any name servers for your house/public network. Select Write hostname to /etc/hosts (ensure an x is in the box). Do not select OK yet.

            Figure 2-14. Network Settings Screen


          8. Select Routing, as shown in Figure 2-15, and enter your house/public network default router information there. Now you can select OK.

            Figure 2-15. Network Settings Routing Screen


          9. You are now back at the Network Configuration II screen. Click Next.

          10. In the Clock and Time Zone screen, you can enter the appropriate details. Select Next to continue.

          11. In the Password for the System Administrator "root" screen, enter the password you wish to use. This password will be used throughout the cluster, not just the admin node. Select Next to continue.

          12. In the User Authentication Method screen, most customers will want to stick with the default (Local). Select Next to continue.

          13. In the New Local User screen, you can leave the fields empty and answer Yes to the Empty User Login warning. Select Next to continue.

          14. In Installation Completed, select Finish.

          15. After you have completed the YaST first boot installation instructions, log in to the system admin controller. You can use YaST to confirm or correct any configuration settings.


            Note: Make sure that your network settings are correct before proceeding with cluster configuration.


          16. You are now ready to run the configure-cluster command.

            The configure-cluster command does not always draw the screen properly from the serial console. Therefore, log in to your admin node as root using ssh prior to running the configure-cluster command. For more information on the configure-cluster command, see “configure-cluster Command Cluster Configuration Tool”.
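The advice to avoid the serial console can be turned into a small pre-flight check. The sketch below is an assumption-laden illustration, not part of SGI Tempo: the function names are hypothetical, and it assumes that serial lines appear under the usual Linux device names /dev/ttyS*. It only prints the command to run; it does not start configure-cluster itself.

```shell
#!/bin/sh
# Sketch: refuse to proceed when the current terminal is a serial line.
# Assumption: serial consoles show up as /dev/ttyS* device names.

# Succeed if device name $1 looks like a serial line.
is_serial_tty() {
    case $1 in
        /dev/ttyS*) return 0 ;;
        *) return 1 ;;
    esac
}

# $1 is the terminal device to check; defaults to the output of tty(1).
# Prints the configure-cluster path when the terminal is acceptable.
check_console_for_configure_cluster() {
    cc_term=${1:-$(tty)}
    if is_serial_tty "$cc_term"; then
        echo "serial console detected ($cc_term); use ssh or the VGA screen" >&2
        return 1
    fi
    echo "/opt/sgi/sbin/configure-cluster"
}
```

Run from an ssh session, check_console_for_configure_cluster prints the command path; run from a serial console, it prints a warning on standard error and returns a nonzero status.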

          Installing RHEL on the Admin Node

          For RHEL service and admin nodes, you are strongly encouraged NOT to use DHCP for configuring the house network, because doing so can temporarily clobber the /etc/resolv.conf file. The SGI cluster management tools recreate this file, but even if resolv.conf has house network settings for only a short time, it could disrupt nscd caches and result in DNS lookup failures that are not obvious to track down. If you must use DHCP for the house network, add the RESOLV_MODS=no flag to the respective /etc/sysconfig/network-scripts/ifcfg-ethX file to prevent resolv.conf from being disrupted. SGI has found that certain RHEL operations can strip this line out, however.
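Because the RESOLV_MODS=no line can be stripped out again, it may help to have a check that can be re-run safely. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the interface configuration file is passed as an argument, and the file is assumed to end with a newline.

```shell
#!/bin/sh
# Sketch: make sure an interface configuration file contains RESOLV_MODS=no.
# Safe to run repeatedly; the setting is never duplicated.

ensure_resolv_mods_off() {
    rm_file=$1
    if grep -q '^RESOLV_MODS=' "$rm_file"; then
        # A setting exists: rewrite it through a temporary file so that a
        # failed sed run cannot truncate the original.
        rm_tmp=$(mktemp) || return 1
        sed 's/^RESOLV_MODS=.*/RESOLV_MODS=no/' "$rm_file" > "$rm_tmp" &&
            cat "$rm_tmp" > "$rm_file"
        rm_status=$?
        rm -f "$rm_tmp"
        return $rm_status
    fi
    # No setting yet: append one.
    printf 'RESOLV_MODS=no\n' >> "$rm_file"
}
```

Call it with the path of the ifcfg-ethX file for the house network interface, and call it again after any operation that may have rewritten that file.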

          Use text firstboot or graphical firstboot to install RHEL, and then follow Procedure 2-7 before you run the configure-cluster command.

          Procedure 2-5. Installing RHEL on the Admin Node Using Text Firstboot

            To install RHEL on the admin node perform the following steps:

            1. Follow normal admin install DVD instructions.

              When the system boots up, firstboot questions will start.


              Note: The default root password is sgisgi.


              If you want to use graphical firstboot, see Procedure 2-6.

            2. RHEL firstboot has a timeout after which firstboot will quit. If you wish to restart the firstboot menu, enter the following:

              # /etc/init.d/firstboot start

            3. To force firstboot to run, use the following commands:

              # touch /etc/reconfigSys
              # /etc/init.d/firstboot start

            4. Use Network configuration to configure eth0 for your house network.

            5. Set Firewall configuration appropriately.

            6. Set up the time zone in Timezone configuration.

            7. Currently, the following steps need to be performed before you start the configure-cluster command (see “configure-cluster Command Cluster Configuration Tool”).

              1. Update the root password.

              2. Configure your host name in the /etc/sysconfig/network file with a line similar to the following:

                HOSTNAME=system-admin


                Note: Use the short name of the hostname with no dots.


              3. Configure name servers in the /etc/resolv.conf file.

              4. You can set ONBOOT=no in the /etc/sysconfig/network-scripts/ifcfg-eth1 file. This avoids a multi-minute timeout in the next step.

              5. Perform the following commands:

                # /etc/init.d/network restart
                # /etc/init.d/portmap start


                Note: You can start the graphical system-config-network tool to configure these in a GUI. If you are on a remote system, make sure to use ssh with X11 forwarding (-X) or other method to display graphics from the admin node to your own workstation.
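The manual edits in step 7 of this procedure (the HOSTNAME line and the ONBOOT=no setting) can be sketched as shell helpers. This is an illustrative sketch: the function names are hypothetical, and values are assumed to be simple strings that do not contain the characters | or &.

```shell
#!/bin/sh
# Sketch: helpers for editing KEY=VALUE files such as /etc/sysconfig/network.

# Succeed only if $1 is a short host name: non-empty and without dots.
is_short_hostname() {
    case $1 in
        ''|*.*) return 1 ;;
        *) return 0 ;;
    esac
}

# Set KEY=VALUE in file $1 ($2 = key, $3 = value). An existing KEY= line
# is replaced; otherwise a new line is appended.
set_sysconfig_value() {
    sv_file=$1 sv_key=$2 sv_value=$3
    if grep -q "^${sv_key}=" "$sv_file" 2>/dev/null; then
        sv_tmp=$(mktemp) || return 1
        sed "s|^${sv_key}=.*|${sv_key}=${sv_value}|" "$sv_file" > "$sv_tmp" &&
            cat "$sv_tmp" > "$sv_file"
        sv_status=$?
        rm -f "$sv_tmp"
        return $sv_status
    fi
    printf '%s=%s\n' "$sv_key" "$sv_value" >> "$sv_file"
}
```

With these helpers, the edits might be applied as is_short_hostname system-admin && set_sysconfig_value /etc/sysconfig/network HOSTNAME system-admin, followed by set_sysconfig_value /etc/sysconfig/network-scripts/ifcfg-eth1 ONBOOT no. Checking the host name first enforces the rule that the short name contains no dots.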


            Procedure 2-6. Installing RHEL on the Admin Node Using Graphical Firstboot

              To install RHEL on the admin node perform the following steps:

              1. If text based firstboot is already running, choose exit.

              2. If needed, perform the following command:

                # touch /etc/reconfigSys

              3. If you are connecting from a workstation, you need networking configured on the admin node at least enough to be able to ssh to the node. To do this, run system-config-network from the console and configure eth0 with the proper networking values for your network.

              4. After system-config-network exits, run the following commands:

                # /etc/init.d/network restart
                # /etc/init.d/portmap start

              5. Use the following command to connect to the admin node with X11 forwarding enabled:

                # ssh -X root@admin-node

              6. Perform the following command:

                # /etc/init.d/firstboot start

                Currently, the following steps need to be performed before you start the configure-cluster command (see “configure-cluster Command Cluster Configuration Tool”).

                1. Configure eth0 for your house network.

                2. Set up the host name and name servers.

                3. Set up your time zone correctly. You do not need to set up your NTP servers yet. This is done in a later step. Any NTP server configuration done now would be lost later.

                4. Do not configure or register for updates in Set Up Software Updates (RHN). Doing so will make creating systemimager images take too long. These updates need to be mirrored on the admin node in a later process.
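Graphical firstboot needs a working X display in the ssh session. A quick check before starting firstboot might look like the sketch below. The function names are hypothetical, and the check relies on the assumption that a session with working X11 forwarding has a non-empty DISPLAY variable. The second function only prints the command to run.

```shell
#!/bin/sh
# Sketch: confirm that an X display is available before graphical firstboot.

# Succeed if a display string is set; $1 defaults to $DISPLAY.
have_x_display() {
    [ -n "${1-${DISPLAY-}}" ]
}

# Print the firstboot start command when a display is available;
# otherwise print a hint on standard error and return nonzero.
start_graphical_firstboot_cmd() {
    if have_x_display "$@"; then
        echo "/etc/init.d/firstboot start"
    else
        echo "DISPLAY is not set; reconnect using ssh with X11 forwarding (-X)" >&2
        return 1
    fi
}
```

If the check fails, log out and reconnect with X11 forwarding enabled before starting firstboot.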

              Procedure 2-7. Repair /etc/hosts File

                You must fix up the /etc/hosts file. In default installations, after configuring the IP addresses, one or both localhost entries may disappear from the /etc/hosts file. In addition, there must be a hosts entry for the house network interface of the admin node or configure-cluster will fail. The suggested method for fixing the hosts table is to use the system-config-network command in graphical mode, as follows:

                1. Use the cat command to view the contents of the file and perform the following:

                  • If there is no entry for 127.0.0.1 in the hosts table, take note of this.

                  • If there is no ipv6 entry for ::1 in the hosts table, take note of this.

                  • If there is no entry for your house network, take note of this.

                2. Start system-config-network in graphical mode.

                3. Click New.

                4. A window comes up prompting for Address:, Hostname:, and Aliases.

                5. For entries missing from the hosts table that you noted above, perform the following:

                  • To add 127.0.0.1: enter 127.0.0.1 for Address, localhost.localdomain for Hostname, and localhost for Aliases.

                  • To add ::1: enter ::1 for Address, localhost6.localdomain6 for Hostname, and localhost6 for Aliases.

                  • To add your house network, enter your admin node house IP for Address, the fully qualified hostname with dots for Hostname, and the hostname without dots for Aliases.


                  Note: The system-config-network command will NOT display the localhost entries in the GUI but it will add them. The entry for your house network will show up in the GUI.


                6. Choose File -> Save.

                7. If your hostname were 'foo' and your fully qualified domain name were 'foo.bar.org', then your /etc/hosts would look something like this after the steps above are complete:

                  ::1     localhost6.localdomain6      localhost6
                  127.0.0.1       localhost.localdomain       localhost
                  128.162.244.88  foo.bar.org     foo
                  


                  Note: If localhost is missing from /etc/hosts, various commands will fail to start up properly. If your hostname is missing from /etc/hosts, configure-cluster will error out.


                8. At this point, enter hostname -d to confirm it returns the domain name. If it does not, try nscd -i hosts to invalidate the nscd hosts cache and see if hostname returns what you expect then. If not, reboot the admin node.

                9. Set the root password with the passwd command.

                10. You may now start the configure-cluster command.


                  Note: SGI suggests that you run the configure-cluster command either from the VGA screen or from an ssh session to the admin node. Avoid running the configure-cluster command from a serial console.
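The inspection in step 1 of this procedure can be automated with a short check. This is a minimal sketch, not an SGI tool: the function name is hypothetical, and it treats the house network entry as present when the short host name appears as a name or alias on any line that is not a loopback line.

```shell
#!/bin/sh
# Sketch: report which required entries are missing from a hosts file.
# $1 = hosts file, $2 = short host name of the admin node.

missing_hosts_entries() {
    awk -v host="$2" '
        { sub(/#.*/, "") }                 # ignore comments
        $1 == "127.0.0.1" { v4 = 1 }       # IPv4 localhost entry present
        $1 == "::1"       { v6 = 1 }       # IPv6 localhost entry present
        $1 != "127.0.0.1" && $1 != "::1" {
            # Look for the host name among the names and aliases.
            for (i = 2; i <= NF; i++) if ($i == host) house = 1
        }
        END {
            if (!v4)    print "missing: 127.0.0.1 localhost.localdomain localhost"
            if (!v6)    print "missing: ::1 localhost6.localdomain6 localhost6"
            if (!house) print "missing: house network entry for " host
        }
    ' "$1"
}
```

For the example above, missing_hosts_entries /etc/hosts foo prints nothing when all three entries are present. Each line of output names an entry to add with system-config-network.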


                configure-cluster Command Cluster Configuration Tool


                Note: SGI suggests that you run the configure-cluster command either from the VGA screen or from an ssh session to the admin node. Avoid running the configure-cluster command from a serial console.


                The configure-cluster command launches a cluster configuration tool. The tool does the following:

                • Creates the root images for the service nodes, leader nodes, and compute blades

                • Prompts for installation media including SLES10 SP2, SLES11, and optionally SGI ProPack 6 SP3. The media is used to construct repositories that are used for software installation and updates.

                • Runs a set of commands that allow you to set up the cluster

                • Lets you change the subnet numbers for the various cluster networks

                • Lets you configure the subdomain of the cluster (which is likely different from the domain of eth0 on the system admin controller itself)

                • Lets you configure the InfiniBand network (see “InfiniBand Configuration”)

                Information on using this tool is described in the procedure in the following section. See “Installing Software on the System Admin Controller”.




                This section describes how to use the configure-cluster command to configure the system admin controller (admin node) for your Altix ICE system.

                Procedure 2-8. Using the Cluster Configuration Tool to Configure Your System Admin Controller

                  To use the configure-cluster command to configure the system admin controller (admin node), perform the following steps:

                  1. To start cluster configuration, enter the following command:

                    % /opt/sgi/sbin/configure-cluster

                  2. The Cluster Configuration Tool: Initial Configuration Check screen appears, as shown in Figure 2-16. This tool provides instructions on the steps you need to take to configure your cluster. Click OK to continue.

                    Figure 2-16. Cluster Configuration Tool: Initial Configuration Check Screen


                  3. The Cluster Configuration Tool: Initial Cluster Setup screen appears, as shown in Figure 2-17. Read the notice and then click OK to continue.

                    Figure 2-17. Cluster Configuration Tool: Initial Cluster Setup Screen



                    Note: The Cluster Configuration Tool has a main menu with sub-menus. You will be redirected back to the main menu, at various times, as you follow the steps in this procedure.


                  4. From the Initial Cluster Setup screen, select Repo Manager: Set up Software Repos and click OK.

                    Figure 2-18. Initial Cluster Setup Tasks Screen



                  5. To register ISO images from the admin node with Tempo and make them available to your cluster, click the Yes button.


                    Note: The next four screens use the crepo command to set up software repositories, such as SGI Foundation, SGI Tempo, SGI ProPack, SLES10 SP2, SLES11, and RHEL 5.3. For more information, see “crepo Command” in Chapter 3.

                    Figure 2-19. Cluster Configuration Tool: Repo Manager Screen One


                  6. To add the SLES media and other media, such as, SGI ProPack, click OK.

                    Figure 2-20. Cluster Configuration Tool: Repo Manager Screen Two


                  7. To register additional media with SGI Tempo, click Yes.

                    Figure 2-21. Cluster Configuration Tool: Repo Manager Screen Three


                  8. Enter the full path to the mount point or ISO file, or a URL or NFS path that points to an ISO file. Click OK to continue.

                    Figure 2-22. Cluster Configuration Tool: Repo Manager Screen Four


                  9. From the Repo Manager: Add Media screen, click OK to continue, and eject your DVD if you used physical media.

                    Figure 2-23. Cluster Configuration Tool: Repo Manager: Add Media Screen Four



                    Note: You continue to be prompted to add media until you answer no. Once you answer no, you are directed back to the Initial Cluster Setup Tasks menu.


                  10. After choosing the Network Settings option, the Cluster Network Setup screen appears, as shown in Figure 2-24.

                    Figure 2-24. Cluster Network Setup Screen


                    The Subnet Addresses option allows you to change the cluster internal network addresses. SGI recommends that you do NOT change these. To adjust subnets, click OK; a warning screen appears, as shown in Figure 2-25. Otherwise, select Domain Name: Configure Cluster Domain Name and then skip to step 12.

                    Figure 2-25. Update Subnet Address Warning Screen


                    After you deploy your Altix ICE system, changing the network IP values or domain names requires that you reset the system database and then rediscover the system. You do not need to reinstall the admin node, however. Click OK to continue.

                  11. The Update Subnet Addresses screen appears, as shown in Figure 2-26.

                    Figure 2-26. Update Subnet Addresses Screen


                    The default IP address of the system admin controller, which is the Head Network for the Altix ICE system, is shown. SGI recommends that you do NOT change the IP address of the system admin controller (admin node) or rack leader controllers (leader nodes) if at all possible. You can adjust the IP addresses of the InfiniBand network (ib0 and ib1) to match the IP requirements of the house network. Click OK to continue.

                  12. Enter the domain name for your Altix ICE system, as shown in Figure 2-27 (by default, this is a subdomain of your house network). Click OK to continue.

                    Figure 2-27. Update Cluster Domain Name Screen


                  13. The next operation in the Initial Cluster Setup menu is NTP Time Server/Client Setup. This procedure changes your NTP configuration file. Click OK to continue. This sets the system admin controller to serve time to the Altix ICE system and allows you to add time servers on your house network, which you may optionally use.

                    Figure 2-28. NTP Time Server/Client Setup Screen


                  14. Configure NTP time service as shown in Figure 2-29. The example provided is for SLES10 SP2. SLES11 is similar. For RHEL 5.3, you need to configure NTP by hand after the ntp.conf file is put into place. Click Next to continue.

                    Figure 2-29. Advance NTP Configuration Screen


                  15. From the New Synchronization screen, select a synchronization peer and click Next to continue.

                    Figure 2-30. New Synchronization Screen


                  16. From the NTP Server screen, set the address of the NTP server and click OK to continue.

                    Figure 2-31. NTP Server Screen


                  17. The YaST tool completes. Click OK to continue.

                    Figure 2-32. NTP Time Server/ Client Setup Screen Three


                  18. The next step in the Initial Cluster Setup menu directs you to select Perform Initial Admin Node Infrastructure Setup. This step runs a series of scripts that will configure the system admin controller of the Altix ICE system.

                    The script installs and configures your system and you should see an install-cluster completed line in the output.

                    Figure 2-33. Admin Infrastructure One Time Setup Screen One


                    The root images for the service, rack leader controller, and compute nodes are then created. The output of the mksiimage commands is stored in a log file at the following location:

                    /var/log/cinstallman

                    You can review the output if you so choose.

                    The final output of the script reads, as follows:

                    /opt/sgi/sbin/create-default-sgi-images Done!


                    Note: As noted on the Admin Infrastructure One Time Setup screen, this step takes about 30 minutes.


                    Click OK to continue.

                  19. The next step in the Initial Cluster Setup menu is to configure the house DNS resolvers. It is OK to set these resolvers to the same name servers used on the system admin controller itself. Configuring these servers is what allows service nodes to resolve host names on your network. For a description of how to set up service nodes, see “Service Node Discovery, Installation and Configuration”. The menu shows default values that match your current admin node resolver setup. If these values are correct, select OK to continue. Otherwise, change the resolver listing as needed and select OK. If you do not wish to have any house resolvers, select Disable House DNS.

                    After entering the IP addresses, click OK to enable house DNS resolution, click Disable House DNS to stop using it, or click Back to leave house DNS resolution as it was when you started (disabled at installation).

                    Figure 2-34. Configure House DNS Resolvers Screen


                  20. The Setting DNS Forwarding screen appears. Click Yes to continue.

                    Figure 2-35. Setting DNS Forwarding Screen


                  21. The Initial Cluster Setup complete message appears. Click OK to continue.

                    Figure 2-36. Cluster Configuration Tool: Admin Infrastructure One Time Setup Screen


                  22. Proceed to “Installing Software on the Rack Leader Controllers and Service Nodes”. It describes the discovery process for the rack leader controllers in your system and how to install software on the rack leader controllers.


                    Note: The main menu contains a Reset Database function that allows you to start software installation over without having to reinstall the system admin controller.


                  discover Command

                  The discover command is used to discover rack leader controllers (leader nodes) and service nodes, including their associated BMC controllers, in an entire system or in a set of one or more racks that you select. Rack numbers generally start at one. Service nodes generally start at zero. When you use the discover command to perform the discovery operation on your Altix ICE system, you are prompted with instructions on how to proceed (see “Installing Software on the Rack Leader Controllers and Service Nodes”).


                  Note: For the Tempo 1.5 release (or later), the operation of the discover command --delrack and --delservice options has changed. When you use these options, the node is no longer removed completely from the database; instead, it is marked with the administrative status NOT_EXIST. When you rediscover a node that previously existed, it gets the same IP allocations it had previously and is then marked with the administrative status EXIST. If you have a service node, for example, service0, that has a custom host name of "myhost" and you later delete service0 using the discover --delservice command, the host name associated with it is still present. This can cause conflicts if you wish to reuse the custom host name "myhost" on a node other than service0 in the future. To remove the node entirely from the database, use the cadmin --db-purge --node service0 command (for more information, see “cadmin: SGI Tempo Administrative Interface” in Chapter 3). You can then reuse the "myhost" name.
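The purge step described in the note can be wrapped in a small helper so that the command is reviewed before it is run. The following is a minimal sketch, not part of SGI Tempo: the purge_node function and the DRY_RUN variable are hypothetical names, and only the cadmin --db-purge --node invocation comes from the note. By default the helper only prints the command.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SGI Tempo): purge a node that has
# already been marked NOT_EXIST with discover --delservice or --delrack.
# DRY_RUN defaults to 1, so the cadmin command is printed, not executed.
purge_node() {
    node=$1
    if [ -z "$node" ]; then
        echo "usage: purge_node <node-name>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "cadmin --db-purge --node $node"
    else
        cadmin --db-purge --node "$node"
    fi
}

# Review the command for service0 before running it for real.
purge_node service0
```

Setting DRY_RUN=0 on the admin node runs the command instead of printing it.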


                  For a discover command usage statement, perform the following:

                  sys-admin ~# discover --h
                  
                  Discover allows any number of racks, service nodes, or external switches to be
                  discovered in one command line.
                  /opt/sgi/sbin/discover --rack <#>[,<options>]
                  /opt/sgi/sbin/discover --rackset <start-number>,<count>[,<options>]
                  /opt/sgi/sbin/discover --service <#>[,<options>]
                  /opt/sgi/sbin/discover --switch <name>[,<options>]
                  
                  --rack: discover a specific rack or set of racks (more than one --rack ok)
                  --rackset: discover count racks starting at start-number
                  --service: discover the specified service node
                  --switch: discover the specified external switch
                  --delrack: Mark rack leaders as deleted
                  --delservice: Mark a service node as deleted
                  --delswitch: Mark an external switch as deleted
                  --force: Not normally used.  Avoid sanity checks that require input.
                  --ignoremac <mac>: Not normally used.  See description below.
                  --macfile <file>: Not normally used.  See description below.
                  
                  This script is used to discover lead nodes and service nodes in an entire
                  system or in a set of one or more racks that you select.
                  
                  Rack numbers generally start at one.  Service nodes generally start at zero.
                  Switches are specified by name.
                  
                  <options> is a list of comma separated options that modify how discover
                  proceeds for the associated node and sets it up for installation.  Hardware
                  types (see below) have no variable style naming with equal signs.  All
                  other option types take the form "name=value".
                  
                  Options include:
                  
                   hardware type: A hardware model that affects how discover proceeds.  The list
                   of options are at the end of this help message.  If a hardware type isn't
                   specified, a default value is used.  Use the 'other' hardware type for a
                   service node you supply and manage.  This mode will allocate IPs for you and
                   print them to the screen.  It won't be managed by the cluster management
                   software.  See the examples section for how to use the hardware type.
                  
                   image type: You can specify an alternate image to install on to the target
                   system.  See the examples for how to specify this.  Alternate image names
                   can be supplied for managed service nodes and leaders.
                  
                   net: ib0 or ib1 for external IB switches only.
                  
                  If you wish to re-discover an existing service node or rack, simply run
                  the discover command in the same manner you would normally.  If you wish
                  to purge a rack or service node entirely -- never to be seen again -- use
                  --delservice / --delrack for this.  The same applies to external switches.
                  
                  Expanded descriptions for other arguments:
                  
                  --macfile:
                  Instead of discovering MACs by power cycling when instructed, consult the
                  file for the MAC instead.  This is not normally used.  All MACs to be
                  discovered must be in the file.  File format:
                  <hostname> <bmc-mac> <host-mac>
                  Example file contents:
                  r1lead 00:11:22:33:44:55 66:77:88:99:EE:FF
                  service0 00:00:00:00:00:0A 00:00:00:00:00:0B
                  
                  --ignoremac:
                  A MAC address to ignore during discover operations.  Not normally needed.
                  Multiple --ignoremac options may be specified.
                  
                  Instructions on how to proceed with discover will be provided when you
                  perform a discover.
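Because every MAC address to be discovered must be present in the --macfile file, it can help to check the file format before you start discover. The following is a minimal sketch under stated assumptions: check_macfile is a hypothetical function (not part of SGI Tempo), and it checks only that each line has the three fields shown in the usage statement and that both addresses are six colon-separated hexadecimal pairs. It does not verify that the addresses belong to your hardware.

```shell
#!/bin/sh
# Hypothetical pre-check for a --macfile input file (not part of SGI Tempo).
# Every line must have the form: <hostname> <bmc-mac> <host-mac>
check_macfile() {
    awk '
        # Return 1 if s looks like six colon-separated hex byte pairs.
        function is_mac(s,    n, parts, i) {
            n = split(s, parts, ":")
            if (n != 6) return 0
            for (i = 1; i <= 6; i++)
                if (parts[i] !~ /^[0-9A-Fa-f][0-9A-Fa-f]$/) return 0
            return 1
        }
        BEGIN { bad = 0 }
        NF != 3 {
            printf "line %d: expected 3 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        !is_mac($2) || !is_mac($3) {
            printf "line %d: malformed MAC address\n", NR
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

The function returns zero when every line is well formed, so it can guard a later discover --macfile invocation in a script.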
                  
                  
                  

                  EXAMPLES

                  Example 2-1. discover Command Examples

                  The following examples walk you through some typical discover command operations.

                  To discover rack 1 and service node 0, perform the following:

                  # /opt/sgi/sbin/discover --rack 1 --service 0,xe210

                  In this example, service node 0 is an Altix XE210 system.

                  To discover racks 1-5 and service nodes 0-2, perform the following:

                  # /opt/sgi/sbin/discover --rackset 1,5 --service 0,xe240 --service 1,altix450 --service 2,other

                  In this example, service node 0 is an Altix XE240 system, service node 1 is an Altix 450 system, and service node 2 uses the other hardware type.
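Note that --rackset takes a start number and a count, not a start number and an end number; for racks 1-5 the two happen to look the same. The following minimal sketch shows the arithmetic for an arbitrary inclusive range. The rackset_arg function is a hypothetical helper, not part of SGI Tempo.

```shell
#!/bin/sh
# Hypothetical helper: turn an inclusive rack range (first, last) into the
# <start-number>,<count> form that the --rackset option expects.
rackset_arg() {
    first=$1
    last=$2
    if [ "$last" -lt "$first" ]; then
        echo "invalid rack range: $first-$last" >&2
        return 1
    fi
    count=$((last - first + 1))
    echo "--rackset $first,$count"
}

rackset_arg 1 5    # racks 1-5 -> --rackset 1,5
rackset_arg 3 7    # racks 3-7 -> --rackset 3,5
```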


                  To discover service 0, but use service-myimage instead of service-sles10sp2 (default), perform the following:

                  # /opt/sgi/sbin/discover --service 0,image=service-myimage


                  Note: You may direct a service node to image itself with a custom image later, without re-discovering it. See “cinstallman Command” in Chapter 3.


                  To discover racks 1 and 4 and service node 0, and ignore MAC address 00:04:23:d6:03:1c, perform the following:

                  # /opt/sgi/sbin/discover --ignoremac 00:04:23:d6:03:1c --rack 1 --rack 4 --service 0

                  In the Tempo v1.6 release (and later), the discover command supports external switches in a manner similar to racks and service nodes, except that switches do not have BMCs and there is no software to install. The syntax to add a switch is as follows:

                  discover --switch name,hardware,net=fabric

                  where name can be any alphanumeric string, hardware is any one of the supported switch types (run discover --help to get a list), and fabric is either ib0 or ib1.

                  An example command is, as follows:

                  # discover --switch extsw,voltaire-isr-9024,net=ib0

                  Once discover has assigned an IP address to the switch, it calls the fabric management sgifmcli command to initialize it with the information provided. The /etc/hosts and /etc/dhcpd.conf files should also have entries for the switch as named above. You can use the cnodes --switch command to list all such nodes in the cluster.
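One way to confirm that the switch entries were written is to search both files for the switch name. The following is a minimal sketch: check_switch_entries is a hypothetical function (not part of SGI Tempo), and it only looks for the name as a whole word; it does not validate the entries themselves.

```shell
#!/bin/sh
# Hypothetical check (not part of SGI Tempo): confirm that a discovered
# switch name appears in the hosts and DHCP configuration files.
# The file paths default to the ones named in the text and can be overridden.
check_switch_entries() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    dhcpd_file=${3:-/etc/dhcpd.conf}
    status=0
    for f in "$hosts_file" "$dhcpd_file"; do
        if grep -qwF -- "$name" "$f" 2>/dev/null; then
            echo "$f: entry for $name found"
        else
            echo "$f: no entry for $name"
            status=1
        fi
    done
    return $status
}
```

For the example above, calling check_switch_entries extsw on the admin node reports on both files and returns nonzero if either entry is missing.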

                  To remove a switch, perform the following:

                  discover --delswitch name

                  where name is that of a previously discovered switch.

                  An example command is, as follows:

                  # discover --delswitch extsw

                  Installing Software on the Rack Leader Controllers and Service Nodes

                  The discover command, described in “discover Command”, sets up the leader and managed service nodes for installation and discovery. This section describes the discovery process you use to determine the Media Access Control (MAC) address, that is, the unique hardware address, of each rack leader controller (leader nodes) and then how to install software on the rack leader controllers.

                  Procedure 2-9. Installing Software on the Rack Leader Controllers and Service Nodes

                    To install software on the rack leader controllers, perform the following steps:

                    1. Use the discover command from the command line, as follows:

                      # /opt/sgi/sbin/discover --rack 1
                      


                      Note: You can discover multiple racks at a time using the --rackset option. Service nodes can be discovered with the --service option.


                      The discover script executes. When prompted, turn the power on to the node being discovered and only that node.


                      Note: Power on only the node being discovered and nothing else in the system. Do not power up the system itself.


                      When the node has electrical power, the BMC starts up even though the system is not powered on. The BMC does a network DHCP request that the discover script intercepts and then configures the cluster database and DHCP with the MAC address for the BMC. The BMC then retrieves its IP address. Next, this script instructs the BMC to power up the node. The node performs a DHCP request that the script intercepts and then configures the cluster database and DHCP with the MAC address for the node. The rack leader controller installs itself using the systemimager software and then boots itself.

                      The discover script will turn on the chassis identify light for 2 minutes. Output similar to the following appears on the console:

                      Discover of rack1 / leader node r1lead complete
                      r1lead has been set up to install itself using systemimager
                      The chassis identify light has been turned on for 2 minutes

                    2. The blue chassis identify light is your cue to power on the next rack leader controller and start the process all over.

                      You may watch install progress by using the console command. For example, console r1lead connects you to the console of r1lead so that you can watch installation progress. The sessions are also logged. For more information on the console command, see “Console Management” in Chapter 3.

                    3. Using the identify light, you can configure all the rack leader controllers and service nodes in the cluster without having to go back and forth to and from your workstation between each discovery operation. Just use the identify light on the node that was just discovered as your cue to move to the next node to plug in.

                    4. Shortly after the discover command reports that discovery is complete for a given node, that node installs itself. If you supplied multiple nodes on the discover command line, multiple nodes could be in different stages of the imaging/installation process at the same time. When a rack leader boots up for the first time, one process it starts is the blademond process. This process discovers the IRUs and attached blades and sets them up for use. The blademond process is described in “blademond Command For Automatic Blade Discovery”, including which files to watch for progress.

                      If your discover process does not find the appropriate BMC after a few minutes, the following message appears:

                      ==============================================================================
                      Warning: Trouble discovering the BMC!
                      ==============================================================================
                      3 minutes have passed and we still can't find the BMC we're looking for.
                      We're going to keep looking until/if you hit ctrl-c.
                      
                      Here are some ideas for what might cause this:
                      
                        - Ensure the system is really plugged in and is connected to the network.
                        - This can happen if you start discover AFTER plugging in the system.
                          Discover works by watching for the DHCP request that the BMC on the system
                          makes when power is applied.  Only nodes that have already been discovered
                          should be plugged in.  You should only plug in service and leader nodes
                          when instructed.
                        - Ensure the CMC is operational and passing network traffic.
                        - Ensure the CMC firwmare up to date and that it's configured to do VLANs.
                        - Ensure the BMC is properly configured to use dhcp when plugged in to power.
                        - Ensure the BMC, frusdr, and bios firmware up to date on the node.
                        - Ensure the node is connected to the correct CMC port.
                      
                      Still Waiting.   Hit ctrl-c to abort this process.  That will abort discovery
                      at this problem point -- previously discovered components will not be affected.
                      ==============================================================================

                      If your discover process finds the appropriate BMC, but cannot find the leader or service node that is powered up after a few minutes, the following message appears:

                      ==============================================================================
                      Warning: Trouble discovering the NODE!
                      ==============================================================================
                      4 minutes have passed and we still can't find the node.
                      We're going to keep looking until/if you hit ctrl-c.
                      
                      If you got this far, it means we did detect the BMC earlier,
                      but we never saw the node itself perform a DHCP request.
                      
                      Here are some ideas for what might cause this:
                      
                       - Ensure the BIOS boot order is configured to boot from the network first
                       - Ensure the BIOS / frusdr / bmc firmware are up to date.
                       - Is the node failing to power up properly? (possible hardware problem?)
                         Consider manually pressing the front-panel power button on this node just
                         in case the ipmitool command this script issued failed.
                       - Try connecting a vga screen/keyboard to the node to see where it's at.
                       - Is there a fault on the node?  Record the error state of the 4 LEDs on the
                         back and contact SGI support.  Consider moving to the next rack in the mean
                         time, skippnig this rack (hit ctrl-c and re-run discover for the other
                         racks and service nodes).
                       
                      Still Waiting.   Hit ctrl-c to abort this process.  That will abort discovery
                      at this problem point -- previously discovered components will not be affected.
                      ==============================================================================

                    5. You are now ready to discover and install software on the compute blades in the rack. For instructions, see “Discovering Compute Nodes”.

                    blademond Command For Automatic Blade Discovery

                    You no longer need to explicitly call the discover-rack command to discover a rack and integrate new blades. This is done automatically by the blademond daemon that runs on the leader nodes.

                    The blademond daemon is started up when the leader node boots after imaging and begins to poll the chassis management control (CMC) blade in each IRU to determine if any new blades are present. It polls the CMCs every two minutes to see if anything has changed. If something has changed (a new blade, a blade removed, or a blade swapped), it sends the new slot map to the admin node and calls the discover-rack command to integrate the changes. It then boots new nodes on the default compute image.

                    The blademond daemon maintains its log file at /var/log/blademond on the leader nodes.

                    You can turn on debug mode in the blademond daemon by sending it a SIGUSR1 signal from the leader node, as follows:

                    # kill -USR1 pid

                    To turn debug mode off, send it another SIGUSR1 signal. You should see a message in the blademond log about debug mode being enabled or disabled.
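The kill command above can be wrapped so that it refuses to signal a process that is not running. The following is a minimal sketch: toggle_blademond_debug is a hypothetical function (not part of SGI Tempo), and you still need to look up the PID of the blademond daemon yourself, for example with ps.

```shell
#!/bin/sh
# Hypothetical wrapper around the "kill -USR1 pid" command shown above.
# It checks that the PID can be signaled before sending SIGUSR1.
toggle_blademond_debug() {
    pid=$1
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot signal PID '$pid'" >&2
        return 1
    fi
    kill -USR1 "$pid"
}
```

After calling the function on a leader node, check /var/log/blademond for the message about debug mode being enabled or disabled.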

                    The blademond daemon maintains the slot map at /var/opt/sgi/lib/blademond/slot_map on the leader nodes. This appears as /var/opt/sgi/lib/blademond/slot_map.rack_number on the admin node.

                    Discovering Compute Nodes

                    This section describes how to discover compute nodes in your Altix ICE system.


                    Note: You no longer need to explicitly call the discover-rack command to discover a rack and integrate new compute nodes (blades). This is done automatically by the blademond daemon that runs on the leader nodes (see “blademond Command For Automatic Blade Discovery”).


                    Procedure 2-10. Discovering Compute Nodes

                      To discover compute nodes (blades) in your Altix ICE system, perform the following:

                      1. Complete the steps in “Installing Software on the Rack Leader Controllers and Service Nodes”.

                      2. For instructions on how to configure, start, verify, or stop the InfiniBand Fabric management software on your Altix ICE system, see Chapter 4, “System Fabric Management”.


                      Note: The InfiniBand fabric does not automatically configure itself. For information on how to configure and start up the InfiniBand fabric, see Chapter 4, “System Fabric Management”.


                      Service Node Discovery, Installation and Configuration

                      Service nodes are discovered and deployed similarly to rack leader controllers (leader nodes). The discover command, with the --service option, allows you to discover service nodes in the same discover operation that discovers the leader nodes.

                      Like rack leader controllers, the service node is automatically installed. The service node image associated with the given service node is used for installation.

                      Unlike system admin controllers (admin nodes), eth0 on the service node connects to the Altix ICE network (like rack leader controllers). If you wish to have the service node on your house network, you need to configure the second Ethernet interface (eth1).

                      The firstboot system setup script does not start automatically on the system console after the first boot after installation (unlike the admin node).

                      Use YaST to set up the public/house network on the service node, as follows:

                      • eth1 is the house network that you should configure in firstboot.

                      • If you change the default host name, you need to make sure that the cluster service name is still resolvable as tools depend on that.

                      • Name service configuration is handled by the admin and leader nodes. Therefore, service node resolv.conf files need to always point to the admin and leader nodes in order to resolve cluster names. If you wish to resolve host names on your "house" network, use the configure-cluster command to configure the house name servers. The admin and leader nodes will then be able to resolve your house network addresses, in addition to the internal cluster hostnames. Besides, the cluster configuration update framework may replace your resolv.conf file anyway when cluster configuration adjustments are made.

                        Do not change resolv.conf and do not configure different name servers in YaST.

                      InfiniBand Configuration

                      Before you start configuring the InfiniBand network, you need to ensure that all hardware components of the cluster have been discovered successfully, that is, admin, leader, service and compute nodes. You also need to be finished with the cluster configuration steps in “configure-cluster Command Cluster Configuration Tool”. To configure the InfiniBand network, start the configure-cluster command again on the admin node. Since the Initial Setup has been done already, you can now use the Configure InfiniBand Fabric option to configure the InfiniBand fabric as shown in Figure 2-37.

                      Figure 2-37. Configure InfiniBand Fabric from Cluster Configuration Tool


                      When you select the Configure InfiniBand Fabric option, the InfiniBand Fabric Management tool appears, as shown in Figure 2-38.

                      Figure 2-38. InfiniBand Management Tool Screen


                      Use the online help available with this tool to guide you through the InfiniBand configuration. After configuring and bringing up the InfiniBand network, select the Administer InfiniBand ib0 option or the Administer InfiniBand ib1 option. The Administer InfiniBand screen appears, as shown in Figure 2-39. Verify the status using the Status option.

                      Figure 2-39. Administer InfiniBand GUI


                      Configuring the Service Node

                      This section describes how to configure a service node and covers the following topics:

                      Service Node Configuration for NAT

                      You may want to reach network services outside of your SGI Altix ICE 8200 system. For this type of access, SGI recommends using Network Address Translation (NAT), also known as IP Masquerading or Network Masquerading. Depending on the amount of network traffic and your site needs, you may want to have multiple service nodes providing NAT services.

                      Procedure 2-11. Service Node Configuration for NAT

                        To enable NAT on your service node, perform the following steps:

                        1. Use the configuration tools provided on your service node to turn on IP forwarding and enable NAT/IP MASQUERADE.

                          Specific instructions should be available in the third-party documentation provided for your storage node system. Additional documentation is available at /opt/sgi/docs/setting-up-NAT/README. This document describes how to get NAT working for both IB interfaces.


                          Note: This file is only on the service node. To read it, log in to the service node (# ssh service0) and then change to the documentation directory (# cd /opt/sgi/docs/setting-up-NAT).


                        2. Update all of the compute node images with the default route configured for NAT.

                          SGI recommends using the script on the system admin controller at /opt/sgi/share/per-host-customization/global/sgi-static-routes, which can customize the routes based upon the rack, IRU, and slot of the compute blade. Some examples are available in that script.

                        3. Use the cimage --push-rack command to propagate the changes to the proper location for compute nodes to boot. For more information on using the cimage command, see “cimage Command” in Chapter 3 and “Customizing Software On Your SGI Altix ICE System” in Chapter 3.

                        4. Use the cimage --set command to select the image.

                        5. Reboot/reset the compute nodes using that desired image.

                        6. Once the service node(s) has NAT enabled, is attached to an operational house network, and the compute nodes are booted from an image which sets their routing to point at the service node, test the NAT operation by using the ping(8) command to ping known IP addresses on the house network from an interactive session on the compute blade.

                        7. See the troubleshooting discussion that follows.
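Step 1 leaves the exact commands to the tools and README on the service node. As a rough illustration only, the following sketch prints the commands commonly used on a generic Linux system to turn on IP forwarding and masquerading. It is not taken from the SGI README. The function name print_nat_commands is hypothetical, and eth1 is assumed to be the house-side interface, as in the troubleshooting examples in this section.

```shell
#!/bin/sh
# Print (do not run) the generic Linux commands that turn on IP forwarding
# and masquerade traffic leaving the house-network interface.
# Assumption: $1 is the house-side interface (eth1 in this section's examples).
print_nat_commands() {
    house_if=${1:-eth1}
    echo "sysctl -w net.ipv4.ip_forward=1"
    echo "iptables -t nat -A POSTROUTING -o ${house_if} -j MASQUERADE"
}

print_nat_commands eth1
```

Review the printed commands against the README on your service node before running any of them as root. Settings made this way do not persist across a reboot unless you also save them with your distribution's tools.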

                        Troubleshooting Service Node Configuration for NAT

                        Troubleshooting can become very complex. The first steps are to determine that the service node(s) are correctly configured for the house network and can ping the house IP addresses. Good choices are house name servers possibly found in the /etc/resolv.conf or /etc/name.d.conf files on the admin node. Additionally, the default gateway addresses for the service node may be a good choice. You can use the netstat -rn command for this information, as follows:

                        system-1:/ # netstat -rn
                        Kernel IP routing table
                        Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
                        128.162.244.0   0.0.0.0         255.255.255.0   U         0 0          0 eth0
                        172.16.0.0      0.0.0.0         255.255.0.0     U         0 0          0 eth1
                        169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
                        172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 eth1
                        127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
                        0.0.0.0         128.162.244.1   0.0.0.0         UG        0 0          0 eth0
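To pick the default gateway out of listings like this one without reading them by eye, a small filter can help. The following is a minimal sketch; the function name default_gateways is hypothetical, and it assumes the column layout shown in the netstat -rn output above.

```shell
#!/bin/sh
# Read "netstat -rn" output on stdin and print the gateway of each default
# route (destination 0.0.0.0 with the G flag set).
default_gateways() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2 }'
}

# Typical use on the service node (not run here):
#   netstat -rn | default_gateways | while read -r gw; do ping -c 3 "$gw"; done
```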

                        If the ping command executed from the service node to the selected IP address gets responses, use network monitoring tools such as tcpdump(1) to trace traffic from the compute nodes. On the service node, monitor the eth1 interface and, simultaneously in a separate session, monitor the ib[01] interface. Make the capture filter specific enough to exclude unrelated traffic, and then execute a ping command from the compute node.

                        Example 2-2. tcpdump Command Examples

                        tcpdump -i eth1 ip proto ICMP # Dump ping packets on the public side of service node.
                        tcpdump -i ib1 ip proto ICMP # Dump ping packets on the IB fabric side of service node.
                        tcpdump -i eth1 port nfs # Dump NFS traffic on the eth1 side of service node.
                        tcpdump -i ib1 port nfs # Dump NFS traffic on the IB fabric side of service node.

                        If packets do not reach the service node's respective IB interface, perform the following:

                        • Check the system admin controller's compute image configuration of the default route.

                        • Verify that this image has been pushed to the compute nodes.

                        • Verify that the compute nodes have booted with this image.

                        If the packets reach the service node's IB interface, but do not exit the eth1 interface, verify the NAT configuration on the service node.

                        If the packets exit the eth1 interface, but replies do not return, verify the house network configuration and that IP masquerading is properly configured so that the packets exiting the interface appear to be originating from the service node and not the compute node.
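The three cases above form a simple decision sequence. The following sketch encodes it as a shell function so that the order of checks is explicit. The function name nat_next_check and its yes/no arguments are hypothetical; you supply the observations yourself from what tcpdump showed.

```shell
#!/bin/sh
# nat_next_check SEEN_ON_IB SEEN_ON_ETH1 REPLY_SEEN
# Each argument is "yes" or "no", taken from what tcpdump showed while a
# compute node was pinging a house address.
nat_next_check() {
    if [ "$1" != "yes" ]; then
        echo "check compute image default route, image push, and node boot image"
    elif [ "$2" != "yes" ]; then
        echo "check NAT configuration on the service node"
    elif [ "$3" != "yes" ]; then
        echo "check house network configuration and IP masquerading"
    else
        echo "NAT path looks healthy"
    fi
}

# Packets arrive on the IB interface but never leave eth1.
nat_next_check yes no no
```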

                        Using External DNS for Compute Node Name Resolution

                        You may want to configure service node(s) to act as NAT gateways for your cluster (see “Service Node Configuration for NAT”) and to have the host names for the compute nodes in the cluster resolve through external DNS servers.

                        You need to reserve a large block of IP addresses on your house network. If you configure name resolution through external DNS, you need to do it for both the ib0 and ib1 networks, for all node types. In other words, ALL -ib* addresses need to be provided by external DNS. This includes compute nodes, leader nodes, and service nodes. Careful planning is required to use this feature. Allocation of IP addresses will often require assistance from a network administrator at your site.

                        Once the IP addresses have been allocated on the house network, you need to tell the SGI Tempo software the IP addresses of the DNS servers on the house network that the SGI Tempo software can query for hostname resolution.

                        To do this, use the configure-cluster tool (see “configure-cluster Command Cluster Configuration Tool”). The menu item that handles this operation is Configure External DNS Masters (optional).

                        Some important considerations are, as follows:

                        • If you choose to use external DNS, you need to make this change before discovering anything. The change is not retroactive. If you discover some nodes and then turn on external DNS support, the IP addresses that SGI Tempo assigned to the nodes already discovered will remain.

                        • This is an optional feature that only a small set of customers will need to use. It should not be used by default.

                        • This feature only makes sense if the compute nodes can reach the house network. This is not the default case for SGI Altix ICE systems.

                        • It is assumed that you have already configured a service node to act as a NAT gateway to your house network (see “Service Node Configuration for NAT”) and that the compute nodes have been configured to use that service node as their gateway.
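Because every -ib0 and -ib1 name must be answered by the external DNS servers, it can be useful to check a list of names before relying on the feature. The following is a minimal sketch. The function names ib_names and report_unresolved are hypothetical, and the host names in the usage comment are placeholders for your own node names.

```shell
#!/bin/sh
# Print the -ib0 and -ib1 names for each base host name given.
ib_names() {
    for host in "$@"; do
        echo "${host}-ib0"
        echo "${host}-ib1"
    done
}

# Read names on stdin and report those the resolver cannot answer.
# getent consults the same name service switch that applications use.
report_unresolved() {
    while read -r name; do
        getent hosts "$name" >/dev/null 2>&1 || echo "unresolved: $name"
    done
}

# Example (host names are placeholders, not run here):
#   ib_names service0 r1lead r1i0n0 | report_unresolved
```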

                        Service Node Configuration for DNS

                        For information on setting up DNS, see Figure 2-34.

                        Service Node Configuration for NFS

                        Assuming the installation has either NAT or Gateway operations configured on one or more service nodes, the compute nodes can directly mount the house NFS server's exports (see the exports(5) man page).

                        Procedure 2-12. Service Node Configuration for NFS

                          To allow the compute nodes to directly mount the house NFS server's exports, perform the following steps:

                          1. Edit the /opt/sgi/share/per-host-customization/global/sgi-fstab file on the system admin controller or, alternatively, an image-specific script. An example of the sgi-fstab file is, as follows:

                            #!/bin/sh
                            #
                            # Copyright (c) 2007,2008 Silicon Graphics, Inc.
                            # All rights reserved.
                            #
                            #  This program is free software; you can redistribute it and/or modify
                            #  it under the terms of the GNU General Public License as published by
                            #  the Free Software Foundation; either version 2 of the License, or
                            #  (at your option) any later version.
                            #
                            #  This program is distributed in the hope that it will be useful,
                            #  but WITHOUT ANY WARRANTY; without even the implied warranty of
                            #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
                            #  GNU General Public License for more details.
                            #
                            #  You should have received a copy of the GNU General Public License
                            #  along with this program; if not, write to the Free Software
                            #  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
                            #
                            # Set up the compute node's /etc/fstab file.
                            #
                            # Modify per your site's requirements.
                            #
                            # This script is executed once per-host as part of the install-image operation
                            # run on the leader nodes, which is called from cimage on the admin node.
                            # The full path to the per-host iru+slot directory is passed in as $1,
                            # e.g. /var/lib/sgi/per-host/<imagename>/i2n11.
                            #
                            
                            # sanity checks
                            . /opt/sgi/share/per-host-customization/global/sanity.sh
                            
                            iruslot=$1
                            os=( $(/opt/oscar/scripts/distro-query -i ${iruslot} | sed -n '/^compat /s/^compat.*: //p') )
                            compatdistro=${os[0]}${os[1]}
                            
                            if [ ${compatdistro} = "sles10" -o ${compatdistro} = "sles11" ]; then
                            
                            	#
                            	# SLES 10 compatible
                            	#
                            	cat <<EOF >${iruslot}/etc/fstab
                            # <file system> <mount point>   <type>  <options>       <dump>  <pass>
                            tmpfs           /tmp            tmpfs   size=150m       0       0
                            EOF
                            
                            elif [ ${compatdistro} = "rhel5"  ]; then
                            
                            	#
                            	# RHEL 5 compatible
                            	#
                            
                            	#
                            	# RHEL expects several subsys directories to be present under /var/run
                            	# and /var/lock, hence no tmpfs mounts for them
                            	#
                            	cat <<EOF >${iruslot}/etc/fstab
                            # <file system> <mount point>   <type>  <options>       <dump>  <pass>
                            tmpfs           /tmp            tmpfs   size=150m       0       0
                            devpts          /dev/pts        devpts  gid=5,mode=620  0       0
                            EOF
                            
                            else
                            
                            	echo -e "\t$(basename ${0}): Unhandled OS.  Doing nothing"
                            
                            fi
                            

                          2. Add the mount point, push the image, and reset the node.

                          3. The server's export should get mounted. If it is not, use the technique for troubleshooting outlined in “Troubleshooting Service Node Configuration for NAT”.
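Step 2 does not show what an added mount looks like inside the sgi-fstab here-document. The following sketch mimics the SLES branch of that script with one house NFS entry added. The server name house-nfs, the export /export/projects, and the mount point /projects are placeholders, and a temporary directory stands in for the per-host directory that the real script receives as $1.

```shell
#!/bin/sh
# Mimic the SLES branch of sgi-fstab with one house NFS mount added.
# Assumptions: house-nfs:/export/projects and /projects are placeholders,
# and a temporary directory stands in for the per-host directory ($1).
iruslot=$(mktemp -d)
mkdir -p "${iruslot}/etc"

cat <<EOF >"${iruslot}/etc/fstab"
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
tmpfs           /tmp            tmpfs   size=150m       0       0
house-nfs:/export/projects  /projects  nfs  hard        0       0
EOF

# Show the NFS entries that a compute node would receive.
awk '$3 == "nfs"' "${iruslot}/etc/fstab"
```

In the real script, only the added line inside the here-document is new. The mount point directory must also exist in the compute image before the image is pushed.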

                          Service Node Configuration for NIS for the House Network

                          This section describes two different ways to configure NIS for service nodes and compute blades when you want to use the house network NIS server, as follows:

                          • NIS with the compute nodes directly accessing the house NIS infrastructure

                          • NIS with a service node as a NIS slave server to the house NIS master

                          The first approach would be used in the case where a service node is configured with network address translation (NAT) or gateway operations so that the compute nodes can access the house network directly.

                          The second approach may be used if the compute nodes do not have direct access to the house network.

                          Procedure 2-13. NIS with Compute Nodes Directly Accessing the House NIS Infrastructure

                            To set up NIS with the compute nodes directly accessing the house NIS infrastructure, perform the following steps:

                            1. In this case, you do not have to set up any additional NIS servers. Instead, each service node and compute node should be configured to bind to the existing house network servers. The nodes should already have the ypbind package installed. The following steps should work with most Linux distributions. You may need to vary them slightly to meet your specific needs.

                            2. For service nodes, the instructions are very similar to the following:

                              The only difference is that you should configure yp.conf to look at the IP address of your house network NIS server and not the leader node as is described in the sections listed, above.

                            Procedure 2-14. NIS with a Service Node as a NIS Slave Server to the House NIS Master

                              To set up NIS with a service node as a NIS slave server to the house NIS master, perform the following:

                              1. Any service nodes that are NOT acting as an NIS slave server can be pointed at the existing house network NIS servers as described in Procedure 2-13. This is because they have house interfaces.

                              2. One (or more) service node(s) should then be configured as NIS slave server(s) to the existing house network NIS Master server.

                                SGI cannot anticipate what operating system or release the house network NIS Master server is running, so it cannot offer specific instructions for telling that server about the new NIS slave servers. Some hints can be found in “Setting Up a RHEL Service Node as a NIS Master”.

                              Setting Up an NFS Home Server on a Service Node for Your Altix ICE System

                              This section describes how to make a service node an NFS home directory server for the compute nodes.


                              Note: Having a single, small server provide filesystems to the whole Altix ICE system could create network bottlenecks that the hierarchical design of Altix ICE is meant to avoid, especially if large files are stored there. Consider putting your home filesystems on a NAS file server. For instructions on how to do this, see “Service Node Configuration for NFS”.


                              The instructions in this section assume you are using the service node image provided with the Tempo software. If you are using your own installation procedures or a different operating system, the instructions will not be exact but the approach is still appropriate.


                              Note: The example below specifically avoids using /dev/sdX style device names. This is because /dev/sdX device names are not persistent and may change as you adjust disks and RAID volumes in your system. In some situations, you may assume /dev/sda is the system disk and that /dev/sdb is a data disk; this is not always the case. To avoid accidental destruction of your root disk, follow the instructions given below.


                              When you are choosing a disk, please consider the following:

                              To pick a disk device, first find the device that is being currently used as root. Avoid re-partitioning the installation disk by accident. To find which device is being used for root, use this command:

                              # ls -l /dev/disk/by-label/sgiroot
                              lrwxrwxrwx 1 root root 10 2008-03-18 04:27 /dev/disk/by-label/sgiroot -> ../../sda2

                              At this point, you know the sd name for your root device is sda.

                              SGI suggests you use by-id device names for your data disk. Therefore, you need to find the by-id name that is NOT your root disk. To do that, use ls command to list the contents of /dev/disk/by-id, as follows:

                              # ls -l /dev/disk/by-id
                              total 0
                              lrwxrwxrwx 1 root root  9 2008-03-20 04:57 ata-MATSHITADVD-RAM_UJ-850S_HB08_020520 -> ../../hdb
                              lrwxrwxrwx 1 root root  9 2008-03-20 04:57 scsi-3600508e000000000307921086e156100 -> ../../sda
                              lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part1 -> ../../sda1
                              lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part2 -> ../../sda2
                              lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part5 -> ../../sda5
                              lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part6 -> ../../sda6
                              lrwxrwxrwx 1 root root  9 2008-03-20 04:57 scsi-3600508e0000000008dced2cfc3c1930a -> ../../sdb
                              lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e0000000008dced2cfc3c1930a-part1 -> ../../sdb1
                              lrwxrwxrwx 1 root root  9 2008-03-20 09:57 usb-PepperC_Virtual_Disc_1_0e159d01a04567ab14E72156DB3AC4FA -> ../../sr0

                              In the output, above, you can see that ID scsi-3600508e000000000307921086e156100 is in use by your system disk because it has a symbolic link pointing back to ../../sda. So do not consider that device. The other disk in the listing has ID scsi-3600508e0000000008dced2cfc3c1930a and happens to be linked to /dev/sdb.

                              Therefore, you know the by-id name you should use for your data is /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a because it is not connected with sda, which we found in the first ls example happened to be the root disk.
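The manual comparison above can be sketched as a small function that lists whole-disk by-id names that do not resolve to the root disk. This is an illustration under stated assumptions, not an SGI-supplied tool. The function name list_data_disks is hypothetical, and it assumes sdX-style device names, where removing the trailing digits from a partition name (sda2) gives the disk name (sda).

```shell
#!/bin/sh
# list_data_disks BYID_DIR ROOT_LINK
# Print whole-disk scsi-* links in BYID_DIR that do not resolve to the disk
# holding the partition that ROOT_LINK points to.
# Assumption: sdX-style names, where stripping trailing digits from a
# partition name (sda2) yields the disk name (sda).
list_data_disks() (
    root_disk=$(basename "$(readlink -f "$2")" | sed 's/[0-9]*$//')
    for link in "$1"/scsi-*; do
        [ -e "$link" ] || continue
        case "$link" in
            *-part[0-9]*) continue ;;   # skip partition links
        esac
        disk=$(basename "$(readlink -f "$link")")
        [ "$disk" = "$root_disk" ] || echo "$link"
    done
)

# Typical use on the service node (not run here):
#   list_data_disks /dev/disk/by-id /dev/disk/by-label/sgiroot
```

Treat the output as a list of candidates only. Confirm by hand that the disk you choose is really empty before partitioning it.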

                              Partitioning, Creating, and Mounting Filesystems

                              Procedure 2-15. Partitioning and Creating Filesystems for an NFS Home Server on a Service Node

                                The following example uses /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a as the empty disk on which you will put your data. It is very important that you know this for sure. In “Setting Up an NFS Home Server on a Service Node for Your Altix ICE System”, an example is provided that allows you to determine where your root disk is located so you can avoid accidentally destroying it. Remember, in some cases, /dev/sdb will be the root drive and /dev/sda or /dev/sdc may be the data drive. Please confirm that you have selected the right device, and use the persistent device name to help prevent accidental overwriting of the root disk.


                                Note: Steps 1 through 7 of this procedure are performed on the service node. Steps 8 and 9 are performed from the system admin controller (admin node).


                                To partition and create filesystems for an NFS home server on a service node, perform the following steps:

                                1. Use the parted(8) utility, or some other partition tool, to create a partition on /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a. The following example makes one filesystem out of the disk. You can use the parted utility interactively or in a command-line driven manner.

                                2. Make a new msdos label, as follows:

                                  # parted /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a mklabel msdos
                                  

                                3. Find the size of the disk, as follows:

                                  # parted /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a print
                                  Disk geometry for /dev/sdb: 0kB - 249GB
                                  Disk label type: msdos
                                  Number  Start   End     Size    Type      File system  Flags
                                  Information: Don't forget to update /etc/fstab, if necessary. 
                                  

                                4. Create a partition that spans the disk, as follows:

                                  # parted /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a mkpart primary ext2 0 249GB
                                  Information: Don't forget to update /etc/fstab, if necessary.  

                                5. Issue the following command to ensure that the /dev/disk/by-id partition device file is in place and available for use with the mkfs command that follows:

                                  # udevtrigger

                                6. Create a filesystem on the disk. You can choose the filesystem type.


                                  Note: The mkfs.ext3 command takes more than 10 minutes to create a single 500GB filesystem using default mkfs.ext3 options. If you do not need the number of inodes created by default, use the -N option to mkfs.ext3 or other options that reduce the number of inodes. The following example creates 20 million inodes. XFS filesystems can be created in much shorter time.


                                  An ext3 example is, as follows:
                                  # mkfs.ext3 -N 20000000 /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a-part1

                                  An xfs example is, as follows:
                                  # mkfs.xfs /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a-part1 


                                7. Add the newly created filesystem to the server's fstab file and mount it. Ensure that the new filesystem is exported and that the NFS service is running, as follows:


                                  Note: RHEL-based distributions normally use the LABEL= syntax in /etc/fstab.

                                  1. Append the following line to your /etc/fstab file.

                                    /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a-part1       /home   ext3    defaults        1       2


                                    Note: If you are using XFS, replace ext3 with xfs. This example uses the /dev/disk/by-id path for the device and not a /dev/sd device.


                                  2. Mount the new filesystem (the fstab entry, above, enables it to mount automatically the next time the system is rebooted), as follows:

                                    # mount -a


                                  3. Make sure the NFS server service is enabled, as follows:

                                    # chkconfig nfsserver on
                                    # /etc/init.d/nfsserver restart


                                    Note: In some distributions, the NFS server init script is simply "nfs".


                                  Note: Steps 8 and 9 are performed from the system admin controller (admin node).


                                8. Mount the home filesystem on the compute nodes, as follows:


                                  Note: SGI recommends that you always work on clones of the SGI-supplied compute image so that you always have a base copy to fall back to if necessary. For information on cloning a compute node image, see “Customizing Software Images” in Chapter 3.


                                  1. Make a mount point in the blade image. In the following example, /home already is a mount point. If you used a different mount point, you need to do something similar to the following on the system admin controller. Note that the rest of the examples will resume using /home.

                                    # mkdir /var/lib/systemimager/images/compute-sles10sp2-clone/my-mount-point

                                  2. Add the /home filesystem to the compute nodes. SGI supplies an example script for managing this. You just need to add your new mount point to the sgi-fstab per-host-customization script.

                                  3. Use a text editor to edit the following file:

                                    /opt/sgi/share/per-host-customization/global/sgi-fstab

                                  4. Insert the following line just after the tmpfs and devpts lines in the sgi-fstab file:

                                    service0-ib1:/home  /home           nfs     hard            0       0


                                    Note: In order to maximize performance, SGI advises that the ib0 fabric be used for all MPI traffic. The ib1 fabric is reserved for storage related traffic.


                                  5. Use the cimage command to push the update to the rack leader controllers serving each compute node, as follows:

                                    # cimage --push-rack compute-sles10sp2-clone "r*"

                                    Using --push-rack on an image that is already on the rack leader controllers has the simple effect of updating them with the change you made above. For more information on using the cimage command, see “cimage Command” in Chapter 3.

                                9. When you reboot the compute nodes, they will mount your new home filesystem.
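Step 7 asks that the new filesystem be exported but does not show an /etc/exports entry. The following sketch shows one way to add such a line only once, using a scratch file in place of the real /etc/exports. The helper name append_once is hypothetical. The client pattern reuses the example subdomain that appears later in this chapter, and the rw,sync options are an assumption that you should adjust for your site.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >>"$1"
}

# Demonstration against a scratch file rather than the real /etc/exports.
# Assumption: the client pattern and options are placeholders for your site.
exports_file=$(mktemp)
append_once "$exports_file" '/home *.ice.americas.sgi.com(rw,sync)'
append_once "$exports_file" '/home *.ice.americas.sgi.com(rw,sync)'
cat "$exports_file"
```

The second call adds nothing because the line is already present, so the helper is safe to rerun. After changing the real /etc/exports file on the service node, reload the NFS server so that it picks up the change.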

                                For information on centrally managed user accounts, see “Setting Up a NIS Server for Your Altix ICE System”. It describes NIS master set up. In this design, the master server residing on the service node provides the filesystem and the NIS slaves reside on the rack leader controllers. If you have more than one home server, you need to export all home filesystems on all home servers to the server acting as the NIS master. You also need to export the filesystems to the NIS master using the no_root_squash exports flag.

                                Home Directories on NAS

                                If you want to use a NAS server for scratch storage or make home filesystems available on NAS, you can follow the instructions in “Setting Up an NFS Home Server on a Service Node for Your Altix ICE System”. In this example, you need to replace service0-ib1 with the ib1 InfiniBand host name for the NAS server and you need to know where on the NAS server the home filesystem is mounted to craft the sgi-fstab script properly.

                                Service Node NFS Server Alternate: Re-exporting House NFS Servers


                                Note: The nfs-server package on the download page is currently available only for SLES. This section applies only to SLES.


                                All operations are from the service node acting as the NFS proxy except where noted.

                                The procedure described in this section does not require the NAT/gateway operations and may be more efficient. This method does require that an unsupported package be installed. It is available from the SGI support page as described below.

                                Procedure 2-16. Service Node NFS Server Alternate: Re-exporting House NFS Servers

                                  To set up a service node for re-exporting house NFS servers, perform the following steps:

                                  1. Download the unsupported nfs-server RPM from the SGI support server:

                                    1. Log in to Supportfolio (https://support.sgi.com/).

                                    2. Click on Browse Collections.

                                    3. Click on Download Cool Software.

                                    4. Find the nfs-server package.

                                  2. Remove nfs-utils on the service node, as follows:

                                    # rpm -e nfs-utils

                                  3. Install the newly downloaded nfs-server RPM, as follows:

                                    # rpm -Uvh /usr/src/packages/RPMS/x86_64/nfs-server-2.2beta51-246*.x86_64.rpm

                                  4. Edit the /etc/sysconfig/nfs file and change the REEXPORT_NFS option to "yes"

                                  5. Enable the NFS server at start-up, as follows:

                                    # chkconfig nfsserver on

                                  6. Start it on the service node, as follows:

                                    # rcnfsserver start

                                  7. Add a mount of the house NFS server to the service node acting as the proxy for NFS. An example fstab line is, as follows:

                                    house-server:/mirror /mirror nfs defaults 0 0 
                                    

                                  8. Ensure the filesystem is mounted, as follows:

                                     # mount -a

                                  9. Export the filesystem by adding a line to /etc/exports similar to the example. You also need to change the subdomain to match your site's.

                                    /mirror *.ice.americas.sgi.com(ro,sync)

                                  10. Configure the compute blades to mount this directory from the service node acting as a proxy. In this example, it is assumed that service0 is the node from which the blades will mount /mirror. To do this, add a line similar to the following before the 'EOF' line in the /opt/sgi/share/per-host-customization/global/sgi-fstab file. This file is located on the system admin controller (admin node).

                                    service0-ib1:/mirror  /mirror           nfs     hard            0       0

                                  11. Recall that the mount point for the compute blades needs to exist. Therefore, you might need to create a directory within the systemimager image on the admin node, for example, mkdir /var/lib/systemimager/images/compute-sles10sp2/mirror.

                                  12. Tell NFS about the exports change, as follows:

                                    # rcnfsserver reload

                                  13. Earlier, in this procedure, you changed the sgi-fstab per-host customization script and created a mount point within one or more compute blade systemimager images. From the admin node, you need to push the images so they are available on the leader nodes serving your racks. The compute blades in the rack in question should be shut down prior to running this command. You should do this for all compute images you may have and for all racks.

                                    # cimage --push-rack compute-sles10sp2 r1

                                  14. Now you may boot up your compute blades. The filesystem will now be mounted on each one. When you access /mirror on a compute blade, the service node proxy NFS server then accesses its /mirror, which contacts the actual NFS server on the house network.
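To confirm on a compute blade that /mirror really comes from the proxy service node, you can inspect the kernel's mount table. The following is a minimal sketch. The function name nfs_source is hypothetical, and it assumes the usual /proc/mounts column order of device, mount point, and filesystem type.

```shell
#!/bin/sh
# nfs_source MOUNTPOINT [MOUNTS_FILE]
# Print the NFS source (server:/path) mounted at MOUNTPOINT, if any.
# MOUNTS_FILE defaults to /proc/mounts.
nfs_source() {
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ { print $1 }' "${2:-/proc/mounts}"
}

# On a compute blade (not run here):
#   [ "$(nfs_source /mirror)" = "service0-ib1:/mirror" ] && echo "mounted via proxy"
```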

                                  RHEL Service Node House Network Configuration

                                  If you plan to put your service node on the house network, you need to configure it for networking. For this, you may use the system-config-network command. It is better to use the graphical version of the tool if you are able. Use the ssh -X command from your desktop to connect to the admin node and then again to connect to the service node. This should redirect graphics over to your desktop.

                                  Some helpful hints are, as follows:

                                  • On service nodes, the cluster interface is eth0. Do not configure this interface, as it is already configured for the cluster network.

                                  • Do not make the public interface a dhcp client as this can overwrite the /etc/resolv.conf file.

                                  • Do not configure name servers. Name server requests on a service node are always directed to the admin leader nodes for resolution. If you wish to resolve network addresses on your house network, be sure to enable the House DNS Resolvers using the configure-cluster command on the admin node.

                                  • Do not configure or change the search order, as this again could adjust what cluster management has placed in the /etc/resolv.conf file.

                                  • Do not change the host name using the RHEL tools. You can change the hostname using the cadmin tool on the admin node.

                                  • After configuring your house network interface, you can use the ifup ethX command to bring the interface up. Replace X with your house network interface.

                                  • If you wish this interface to come up by default when the service node reboots, be sure ONBOOT is set to yes in /etc/sysconfig/network-scripts/ifcfg-ethX (again, replace X with the proper value). The graphical tool allows you to adjust this setting while the text tool does not.

                                  • If you happen to wipe out the resolv.conf file by accident and end up replacing it, you may need to issue this command to ensure that DNS queries work again:

                                    # nscd --invalidate hosts
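
To confirm the ONBOOT setting mentioned above without opening the graphical tool, you could use a small check such as the following. This is a sketch only: onboot_enabled is a hypothetical helper name, and the file name you pass to it depends on your house network interface.

```shell
# onboot_enabled FILE  (hypothetical helper)
# Succeeds if FILE contains an uncommented ONBOOT=yes setting,
# with or without quotes around the value.
onboot_enabled() {
    grep -Eiq "^[[:space:]]*ONBOOT=[\"']?yes[\"']?[[:space:]]*\$" "$1"
}

# Example (replace X with your house network interface):
# onboot_enabled /etc/sysconfig/network-scripts/ifcfg-ethX \
#     && echo "interface comes up at boot"
```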

                                  Setting Up a NIS Server for Your Altix ICE System

                                  This section describes how to set up a network information service (NIS) server running SLES10 for your Altix ICE system. If you would like to use an existing house network NIS server, see “Service Node Configuration for NIS for the House Network”. This section covers the following topics:

                                  Setting Up a NIS Server Overview

                                  In the procedures that follow in this section, here are some of the tasks you need to perform and system features you need to consider:

                                  • Make a service node the NIS master

                                  • Make the rack leader controllers (leader nodes) the NIS slave servers

                                  • Do not make the system admin controller the NIS master because it may not be able to mount all of the storage types. Having the storage mounted on the NIS master server makes it far less complicated to add new accounts using NIS.

                                  • If multiple service nodes provide home filesystems, the NIS master should mount all remote home filesystems. They should be exported to the NIS master service node with the no_root_squash export option. The example in the following section assumes a single service node with storage and that same node is the NIS master.

                                  • No NIS traffic goes over the InfiniBand network.

                                  • Compute node NIS traffic goes over Ethernet, not InfiniBand, by using the lead-eth server name in the yp.conf file. This design feature prevents NIS traffic from affecting the InfiniBand traffic between the compute nodes.

                                  Setting Up a SLES Service Node as a NIS Master

                                  This section describes how to set up a service node as a NIS master. This section only applies to service nodes running SLES.

                                  Procedure 2-17. Setting Up a SLES Service Node as a NIS master

                                    To set up a SLES service node as a NIS master, from the service node, perform the following steps:


                                    Note: These instructions use the text-based version of YaST. The graphical version of YaST may be slightly different.


                                    1. Start up YaST, as follows:

                                      # yast nis_server

                                    2. Choose Create NIS Master Server and click on Next to continue.

                                    3. Choose an NIS domain name and place it in the NIS Domain Name window. This example uses ice.

                                      1. Select This host is also a NIS client.

                                      2. Select Active Slave NIS server exists.

                                      3. Select Fast Map distribution.

                                      4. Select Allow changes to passwords.

                                      5. Click on Next to continue.

                                    4. Set up the NIS master server slaves.


                                      Note: You are now in the NIS Master Server Slaves Setup. At this point, you can enter the rack leader controllers (leader nodes) that are already defined. If you add more leader nodes or re-discover leader nodes, you will need to change this list. For more information, see “Tasks You Should Perform After Changing a Rack Leader Controller”.


                                    5. Select Add and enter r1lead in the Edit Slave window. Enter any other rack leader controllers you may have just like above. Click on Next to continue.

                                    6. You are now in NIS Server Maps Setup. The default selected maps are okay. Avoid using the hosts map (not selected by default) because it can interfere with Altix ICE system operations. Click on Next to continue.

                                    7. You are now in NIS Server Query Hosts Setup. Use the default settings here. However, you may want to adjust settings for security purposes. Click on Finish to continue.

                                      At this point, the NIS master is configured. Assuming you checked the This host is also a NIS client box, the service node will be configured as a NIS client to itself and will start ypbind for you.

                                    Setting Up a RHEL Service Node as a NIS Master

                                    This section describes how to set up a service node as a NIS master. This section only applies to service nodes running RHEL 5.3 (or later).

                                    If you have enabled the firewall on the service node, you will need to ensure the firewall allows NIS traffic to pass. That is beyond the scope of this document. By default, service nodes have the firewall disabled.

                                    Procedure 2-18. Setting Up a RHEL Service Node as a NIS master

                                      To set up a RHEL service node as a NIS master, from the service node, perform the following steps:

                                      1. Log in to the service node as root and turn on these services:

                                        # chkconfig ypserv on
                                        # chkconfig yppasswdd on

                                      2. Choose a NIS domain name and place it in the /etc/sysconfig/network file. For example:

                                        # echo "NISDOMAIN=ice" >> /etc/sysconfig/network

                                        The NIS domain will be set using this value when ypbind or ypserv is started for the first time.

                                      3. Change NOPUSH=true to NOPUSH=false in /var/yp/Makefile. This will ensure that slave servers get map updates.

                                      4. Start up the NIS server daemon, as follows:

                                        # /etc/init.d/ypserv start

                                      5. Configure the yp master server, as follows:

                                        # /usr/lib64/yp/ypinit -m

                                      6. The ypinit command prompts you for the hostnames that will act as NIS servers. It automatically includes the service node acting as the master in the list. At this time, enter the hostnames for the leader nodes, starting with r1lead. Enter Ctrl-d when done, as instructed by the tool.

                                      7. Start up yppasswdd, as follows:

                                        # /etc/init.d/yppasswdd start

                                      8. NIS Master Servers are also NIS Clients. However, they are configured to look at themselves for NIS. Therefore, follow these steps to make the NIS Master Server an NIS client to itself:

                                        1. Make ypbind start at bootup, as follows:

                                          # chkconfig ypbind on

                                        2. Make the NIS master service node a client of itself, as follows:

                                          # echo "ypserver localhost" >> /etc/yp.conf

                                        3. Start ypbind, as follows:

                                          # /etc/init.d/ypbind start
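
Note that the echo command in step 2 appends a new NISDOMAIN line each time it is run. If you expect to repeat the procedure or change the domain later, a helper along the following lines keeps a single entry in the file. This is a sketch under stated assumptions: set_nisdomain is a hypothetical name, and the domain name is assumed to contain no slash characters.

```shell
# set_nisdomain FILE DOMAIN  (hypothetical helper)
# Ensures FILE contains exactly one NISDOMAIN= setting with the given value.
set_nisdomain() {
    file=$1
    domain=$2
    if grep -q '^NISDOMAIN=' "$file"; then
        # Replace the existing setting instead of appending a second one.
        tmp=$(mktemp) || return 1
        sed "s/^NISDOMAIN=.*/NISDOMAIN=$domain/" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        echo "NISDOMAIN=$domain" >> "$file"
    fi
}

# Example:
# set_nisdomain /etc/sysconfig/network ice
```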

                                      Setting Up a SLES Service Node as a NIS Client

                                      This section describes how to use YaST to set up your other service nodes to be broadcast binding NIS clients. This section only applies to service nodes running SLES10.


                                      Note: You do not do this on the NIS Master service node that you already configured as a client in “Setting Up a SLES Service Node as a NIS Master”.


                                      Procedure 2-19. Setting Up a SLES Service Node as a NIS Client

                                        To set up a service node as a NIS client, perform the following steps:

                                        1. Enable ypbind, as follows:

                                          # chkconfig ypbind on 

                                        2. Set the default domain (already set on the NIS master). Replace ice with the NIS domain name you chose above for your Altix ICE system, as follows:

                                          # echo "ice" > /etc/defaultdomain

                                        3. To ensure that no NIS traffic goes over the IB network, SGI does not recommend using NIS broadcast binding on service nodes. Instead, you can list a few leader nodes in the /etc/yp.conf file on non-NIS-master service nodes. The following is an example /etc/yp.conf file. Add or remove rack leader nodes as appropriate. Having more entries in the list allows for some redundancy. If r1lead is hit by excessive traffic or goes down, ypbind can use the next server in the list as its NIS server. SGI does not suggest listing other service nodes in the yp.conf file because all resolvable names for service nodes on service nodes use IP addresses that go over the InfiniBand network. For performance reasons, it is better to keep NIS traffic off of the InfiniBand network.

                                          ypserver r1lead
                                          ypserver r2lead

                                        4. Start the ypbind service, as follows:

                                          # rcypbind start

                                          The service node is now bound.

                                        5. Add the NIS include statement to the end of the password, group, and shadow files, as follows:

                                          # echo "+:::" >> /etc/group
                                          # echo "+::::::" >> /etc/passwd
                                          # echo "+" >> /etc/shadow
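
Because the commands in step 5 append unconditionally, running them a second time leaves duplicate NIS include entries in the files. One way to guard against this is sketched below. The append_once name is hypothetical, and the sketch assumes that each file already ends with a newline.

```shell
# append_once LINE FILE  (hypothetical helper)
# Appends LINE to FILE only if FILE does not already contain that exact line.
append_once() {
    line=$1
    file=$2
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, equivalent to the three echo commands in step 5:
# append_once "+:::"    /etc/group
# append_once "+::::::" /etc/passwd
# append_once "+"       /etc/shadow
```

The same helper can be pointed at the files inside a compute node image, for example under /var/lib/systemimager/images/compute-sles10sp2-clone/etc.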

                                        Setting up a SLES Rack Leader Controller as a NIS Slave Server and Client

                                        This section provides two sets of instructions for setting up rack leader controllers (leader nodes) as NIS slave servers. It is possible to make all these adjustments to the leader image in /var/lib/systemimager/images. Currently, SGI does not recommend using this approach.


                                        Note: Be sure the InfiniBand interfaces are up and running before proceeding because the rack leader controller gets its updates from the NIS Master over the InfiniBand network. If you get a "can't enumerate maps from service0" error, check to be sure the InfiniBand network is operational.


                                        Procedure 2-20. Setting up a Rack Leader Controller as a NIS Slave Server and Client

                                          Use the following set of commands from the system admin controller (admin node) to set up a rack leader controller (leader node) as a NIS slave server and client.


                                          Note: Replace ice with your NIS domain name and service0 with the service node you set up as the master server.


                                          admin:~ # cexec --head --all chkconfig ypserv on
                                          admin:~ # cexec --head --all chkconfig ypbind on
                                          admin:~ # cexec --head --all chkconfig portmap on
                                          admin:~ # cexec --head --all chkconfig nscd on
                                          admin:~ # cexec --head --all rcportmap start
                                          admin:~ # cexec --head --all "echo ice > /etc/defaultdomain"
                                          admin:~ # cexec --head --all "ypdomainname ice"
                                          admin:~ # cexec --head --all "echo ypserver service0 > /etc/yp.conf"
                                          admin:~ # cexec --head --all /usr/lib/yp/ypinit -s service0
                                          admin:~ # cexec --head --all rcportmap start
                                          admin:~ # cexec --head --all rcypserv start
                                          admin:~ # cexec --head --all rcypbind start
                                          admin:~ # cexec --head --all rcnscd start

                                          Setting up a RHEL Rack Leader Controller as a NIS Slave Server and Client

                                          This section describes how to set up a RHEL rack leader controller (leader node) as a NIS slave server and client.

                                          Procedure 2-21. Setting Up a RHEL Rack Leader Controller as a NIS Slave Server and Client

                                            To set up a RHEL rack leader controller (leader node) as a NIS slave server and client, perform the following:

                                            1. Log in to the leader node(s) as root and enable ypserv, as follows:

                                              # chkconfig ypserv on

                                            2. Choose a NIS domain name and place it in the /etc/sysconfig/network file. For example:

                                              # echo "NISDOMAIN=ice" >> /etc/sysconfig/network

                                              The NIS domain will be set using this value when ypbind or ypserv is started for the first time.

                                            3. Start the yp server, as follows:

                                              # /etc/init.d/ypserv start

                                            4. For each leader node, log on using the ssh command and run this command replacing service0 with the hostname of the service node acting as the NIS master, as follows:

                                              # /usr/lib64/yp/ypinit -s service0

                                            5. Enable ypbind to make this a NIS client, as follows:

                                              # chkconfig ypbind on

                                            6. Configure the leader node to be a NIS client to itself, as follows:

                                              # echo "ypserver localhost" >> /etc/yp.conf

                                            7. Start up ypbind, as follows:

                                              # /etc/init.d/ypbind start

                                            8. Optionally, set up the password, shadow, and group files with NIS includes, as follows:


                                              Note: SGI does not suggest that users log in to leader nodes; however, it is sometimes helpful for their accounts to show up there. Therefore, this step is optional.


                                              # echo "+:::" >> /etc/group
                                              # echo "+::::::" >> /etc/passwd
                                              # echo "+" >> /etc/shadow

                                            9. If you elected to perform the above step, ensure that NIS is enabled in /etc/nsswitch.conf for passwd, shadow, and group. The lines should look like this:

                                              passwd:     files nis
                                              shadow:     files nis
                                              group:      files nis

                                            Setting Up a RHEL Service Node as a NIS Client

                                            This section describes how to configure service nodes that are not your NIS Master to be NIS clients.

                                            Procedure 2-22. Setting Up a RHEL Service Node as a NIS Client

                                              To configure service nodes that are not your NIS Master to be NIS clients, perform the following:

                                              1. Make ypbind start at bootup, as follows:

                                                # chkconfig ypbind on

                                              2. Set the NIS domain to the same NIS domain that the NIS master server uses, as follows:

                                                # echo "NISDOMAIN=ice" >> /etc/sysconfig/network

                                              3. To ensure that no NIS traffic goes over the IB network, SGI does not recommend using NIS broadcast binding on service nodes. Instead, you can list a few leader nodes in the /etc/yp.conf file on non-NIS-master service nodes. The following is an example /etc/yp.conf file. Add or remove rack leader nodes as appropriate. Having more entries in the list allows for some redundancy. If r1lead experiences excessive traffic or goes down, ypbind can use the next server in the list as its NIS server. SGI does not suggest listing other service nodes in the yp.conf file because all resolvable names for service nodes on service nodes use IP addresses that go over the InfiniBand network. For performance reasons, it is better to keep NIS traffic off of the InfiniBand network.

                                                ypserver r1lead
                                                ypserver r2lead

                                              4. Start ypbind, as follows:

                                                # /etc/init.d/ypbind start

                                              5. Set up the password, shadow, and group files with NIS includes, as follows:

                                                # echo "+:::" >> /etc/group
                                                # echo "+::::::" >> /etc/passwd
                                                # echo "+" >> /etc/shadow

                                              6. Enable NIS lookups in the /etc/nsswitch.conf file by ensuring that NIS is listed for passwd, shadow, and group, as follows:

                                                passwd:     files nis
                                                shadow:     files nis
                                                group:      files nis
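
To confirm that the nsswitch.conf entries are in place, a check along the following lines may help. This is a minimal sketch: nis_enabled_for is a hypothetical helper name, and it only looks for the word nis on the line for the named database, ignoring commented-out lines.

```shell
# nis_enabled_for DATABASE [FILE]  (hypothetical helper)
# Succeeds if the DATABASE line in FILE (default /etc/nsswitch.conf)
# lists "nis" as one of its sources.
nis_enabled_for() {
    db=$1
    file=${2:-/etc/nsswitch.conf}
    awk -v db="$db:" '
        $1 == db {
            for (i = 2; i <= NF; i++) if ($i == "nis") found = 1
        }
        END { exit !found }
    ' "$file"
}

# Example:
# for db in passwd shadow group; do
#     nis_enabled_for "$db" || echo "nis missing for $db"
# done
```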

                                              Setting up RHEL Compute Nodes to be NIS Clients

                                              This section shows how to set up RHEL compute blades as NIS clients. The instructions work on the actual images, then push them out.

                                              Procedure 2-23. Setting up RHEL Compute Nodes to be NIS Clients

                                                To set up RHEL compute blades as NIS clients, perform the following:

                                                1. SGI suggests that you operate on a cloned image, preserving the SGI default images as a reference. See “Customizing Software On Your SGI Altix ICE System” in Chapter 3.

                                                  In RHEL, the NIS domain is set up in /etc/sysconfig/network. However, Tempo normally creates this file on its own during per-host customization. Therefore, to modify this file, you must create a customization script of your own that appends the value. This script can be simple. The script filename should sort alphabetically after "sgi-hostname" to ensure it executes after the sgi-hostname customization script. For example, assuming the NIS domain is 'ice' and the compute node image is compute-rhel52-clone, create the following file on the admin node:

                                                  # vi /opt/sgi/share/per-host-customization/global/yp-setup

                                                  Put the following two lines in the file, then save it:

                                                  iruslot=$1
                                                  echo "NISDOMAIN=ice" >> ${iruslot}/etc/sysconfig/network

                                                2. Change permissions on the file to be executable and readable, as follows:

                                                  # chmod a+rx /opt/sgi/share/per-host-customization/global/yp-setup

                                                3. Set up compute nodes to get their NIS service from their rack leader controller (fix the domain name as appropriate), as follows:

                                                  # echo "ypserver lead-eth" > /var/lib/systemimager/images/compute-rhel52-clone/etc/yp.conf

                                                4. Enable the ypbind service, using the chroot command, as follows:

                                                  # chroot /var/lib/systemimager/images/compute-rhel52-clone chkconfig ypbind on

                                                5. Set up the password, shadow, and group files with NIS includes, as follows:

                                                  # echo "+:::" >> /var/lib/systemimager/images/compute-rhel52-clone/etc/group
                                                  # echo "+::::::" >> /var/lib/systemimager/images/compute-rhel52-clone/etc/passwd
                                                  # echo "+" >> /var/lib/systemimager/images/compute-rhel52-clone/etc/shadow

                                                6. Enable NIS lookups in nsswitch.conf. On a compute node, do this by ensuring the nsswitch.conf file in the image you wish to modify has the lines as they are shown below. Assuming the image name compute-rhel52-clone, the filename would be /var/lib/systemimager/images/compute-rhel52-clone/etc/nsswitch.conf:

                                                  passwd:     files nis
                                                  shadow:     files nis
                                                  group:      files nis

                                                7. Push out the updates using the cimage command, as follows:

                                                  # cimage --push-rack compute-rhel52-clone "r*" 

                                                Setting up the SLES Compute Nodes to be NIS Clients

                                                This section describes how to set up the compute nodes to be NIS clients. You can configure NIS on the clients to use a server list that contains only their rack leader controller (leader node). All operations are performed from the system admin controller (admin node).

                                                Procedure 2-24. Setting up the Compute Nodes to be NIS Clients

                                                  To set up the compute nodes to be NIS clients, perform the following steps:

                                                  1. Create a compute node image clone. SGI recommends that you always work with a clone of the compute node images. For information on how to clone the compute node image, see “Customizing Software Images” in Chapter 3.

                                                  2. Change the compute nodes to use the cloned image/kernel pair, as follows:

                                                    admin:~ # cimage --set compute-sles10sp2-clone 2.6.16.46-0.12-smp "r*i*n*"

                                                  3. Set up the NIS domain (ice in this example), as follows:

                                                    admin:~ # echo "ice" > /var/lib/systemimager/images/compute-sles10sp2-clone/etc/defaultdomain

                                                  4. Set up compute nodes to get their NIS service from their rack leader controller (fix the domain name as appropriate), as follows:

                                                    admin:~ # echo "ypserver lead-eth" > /var/lib/systemimager/images/compute-sles10sp2-clone/etc/yp.conf

                                                  5. Enable the ypbind service, using the chroot command, as follows:

                                                    admin:~ # chroot /var/lib/systemimager/images/compute-sles10sp2-clone chkconfig ypbind on

                                                  6. Set up the password, shadow, and group files with NIS includes, as follows:

                                                    admin:~ # echo "+:::" >> /var/lib/systemimager/images/compute-sles10sp2-clone/etc/group
                                                    admin:~ # echo "+::::::" >> /var/lib/systemimager/images/compute-sles10sp2-clone/etc/passwd
                                                    admin:~ # echo "+" >> /var/lib/systemimager/images/compute-sles10sp2-clone/etc/shadow

                                                  7. Push out the updates using the cimage command, as follows:

                                                    admin:~ # cimage --push-rack compute-sles10sp2-clone "r*"

                                                  NAS Configuration for Multiple IB Interfaces

                                                  The NAS cube must be configured with each InfiniBand fabric interface in a separate subnet. These fabrics will be separated from each other logically, but attached to the same physical network. For simplicity, this guide assumes that the -ib1 fabric for the compute nodes has addresses assigned in the 10.149.0.0/16 network. This guide also assumes the lowest address the cluster management software has used is 10.149.0.1 and the highest is 10.149.1.3 (already assigned to the NAS cube).

                                                  For the NAS cube, you need to divide the large physical network into four smaller subnets, each capable of containing all the nodes and service nodes. The subnets are 10.149.0.0/18, 10.149.64.0/18, 10.149.128.0/18, and 10.149.192.0/18.

                                                  After the storage node has been discovered, SGI personnel need to log on to the NAS box, change the network settings to use the smaller subnets, and then define the other three adapters with the same offset within each subnet. For example, suppose the initial configuration of the storage node set the ib0 fabric IP address to 10.149.1.3 with netmask 255.255.0.0. After the addresses are changed, the settings are ib0=10.149.1.3:255.255.192.0, ib1=10.149.65.3:255.255.192.0, ib2=10.149.129.3:255.255.192.0, and ib3=10.149.193.3:255.255.192.0. The NAS cube should now have all four adapters connected to the fabric with IP addresses that can be pinged from the service node.
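
Each /18 subnet spans 64 values of the third octet, so keeping the same offset within the subnet amounts to adding 64 to the third octet for each successive adapter. The following sketch illustrates that arithmetic for the example addresses above. The nas_adapter_ip name is hypothetical and is not part of the NAS or Tempo software.

```shell
# nas_adapter_ip BASE_IP INDEX  (hypothetical helper)
# Prints the address for adapter ibINDEX, given the ib0 address BASE_IP,
# by adding INDEX * 64 to the third octet.
nas_adapter_ip() {
    base_ip=$1
    index=$2
    echo "$base_ip" | awk -F. -v n="$index" \
        '{ printf "%d.%d.%d.%d\n", $1, $2, $3 + n * 64, $4 }'
}

# Example: print the addresses for ib0 through ib3.
# for i in 0 1 2 3; do nas_adapter_ip 10.149.1.3 "$i"; done
```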


                                                  Note: The service nodes and the rack leads will remain in the 10.149.0.0/16 subnet.


                                                  For the compute blades, log in to the admin node and modify the /opt/sgi/share/per-host-customization/global/sgi-setup-ib-configs file. Following the line iruslot=$1, insert the following:

                                                  # Compute NAS interface to use
                                                  IRU_NODE=`basename ${iruslot}`
                                                  RACK=`cminfo --rack`
                                                  RACK=$(( ${RACK} - 1 ))
                                                  IRU=`echo ${IRU_NODE} | sed -e s/i// -e s/n.*//`
                                                  NODE=`echo ${IRU_NODE} | sed -e s/.*n//`
                                                  POSITION=$(( ${IRU} * 16 + ${NODE} ))
                                                  POSITION=$(( ${RACK} * 64 + ${POSITION} ))
                                                  NAS_IF=$(( ${POSITION} % 4 ))
                                                  NAS_IPS[0]="10.149.1.3"
                                                  NAS_IPS[1]="10.149.65.3"
                                                  NAS_IPS[2]="10.149.129.3"
                                                  NAS_IPS[3]="10.149.193.3"

                                                  Then, following the line $iruslot/etc/opt/sgi/cminfo, add the following:

                                                  IB_1_OCT12=`echo ${IB_1_IP} | awk -F "." '{ print $1 "." $2 }'`
                                                  IB_1_OCT3=`echo ${IB_1_IP} | awk -F "." '{ print $3 }'`
                                                  IB_1_OCT4=`echo ${IB_1_IP} | awk -F "." '{ print $4 }'`
                                                  IB_1_OCT3=$(( ${IB_1_OCT3} + ${NAS_IF} * 64 ))
                                                  IB_1_NAS_IP="${IB_1_OCT12}.${IB_1_OCT3}.${IB_1_OCT4}"

                                                  Then change the IPADDR='${IB_1_IP}' and NETMASK='${IB_1_NETMASK}' lines to the following:

                                                  IPADDR='${IB_1_NAS_IP}'
                                                  NETMASK='255.255.192.0'

                                                  Then add the following to the end of the file:

                                                  # ib-1-vlan config
                                                  cat << EOF >$iruslot/etc/sysconfig/network/ifcfg-vlan1
                                                  # ifcfg config file for vlan ib1
                                                  BOOTPROTO='static'
                                                  BROADCAST=''
                                                  ETHTOOL_OPTIONS=''
                                                  IPADDR='${IB_1_IP}'
                                                  MTU=''
                                                  NETMASK='255.255.192.0'
                                                  NETWORK=''
                                                  REMOTE_IPADDR=''
                                                  STARTMODE='auto'
                                                  USERCONTROL='no'
                                                  ETHERDEVICE='ib1'
                                                  EOF
                                                  if [ $NAS_IF -eq 0 ]; then
                                                      rm $iruslot/etc/sysconfig/network/ifcfg-vlan1
                                                  fi

                                                  To update the fstab for the compute blades, edit the /opt/sgi/share/per-host-customization/global/sgi-fstab file. Perform steps equivalent to those above to add the # Compute NAS interface to use section to this file. Then, to specify mount points, add lines similar to the following example:

                                                  # SGI NAS Server Mounts
                                                  ${NAS_IPS[${NAS_IF}]}:/mnt/data/scratch     /scratch nfs    defaults 0 0

                                                  Creating User Accounts

                                                  The example in this section assumes that the home directory is mounted on the NIS Master service node and that the NIS master is able to create directories and files on it as root. The following example uses the command line. You could also create accounts using YaST.

                                                  Procedure 2-25. Creating User Accounts on a NIS Server

                                                    To create user accounts on the NIS server, perform the following steps:

                                                    1. Log in to the NIS Master service node as root.

                                                    2. Issue a useradd command similar to the following:

                                                      # useradd -c "Joe User" -m -d /home/juser juser

                                                    3. Provide the user a password, as follows:

                                                      # passwd juser

                                                    4. Push the new account to the NIS servers, as follows:

                                                      # cd /var/yp && make
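
The steps in this procedure can be collected into a small script for repeated use. The following is a minimal sketch, not an SGI-supplied tool: the create_nis_user and run function names and the DRY_RUN variable are assumptions made for illustration. With DRY_RUN=1 (the default here), it only prints the commands so that you can review them before running them as root on the NIS Master service node.

```shell
#!/bin/sh
# Sketch only: wraps the useradd, passwd, and make steps shown above.
# create_nis_user, run, and DRY_RUN are hypothetical names.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # print the command instead of running it
    else
        "$@"
    fi
}

create_nis_user() {
    login=$1
    fullname=$2
    run useradd -c "$fullname" -m -d "/home/$login" "$login"
    run passwd "$login"    # interactive when actually run
    # Rebuild the NIS maps in a subshell so the caller's directory is unchanged.
    if [ "$DRY_RUN" = "1" ]; then
        echo "cd /var/yp && make"
    else
        ( cd /var/yp && make )
    fi
}

create_nis_user juser "Joe User"
```

Set DRY_RUN=0 in the environment only after you have checked the printed commands.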

                                                    Tasks You Should Perform After Changing a Rack Leader Controller

                                                    If you add or remove a rack leader controller (leader node), for example, if you use the discover command to discover a new rack of equipment, you need to configure the new rack leader controller to be an NIS slave server as described in “Setting Up a SLES Service Node as a NIS Client”.

                                                    In addition, you need to add or remove the leader from the /var/yp/ypservers file on the NIS Master service node. Remember to use the -ib1 name for the leader, as service nodes cannot resolve r2lead style names. For example, use r2lead-ib1. After editing the file, push the change to the NIS servers, as follows:

                                                    # cd /var/yp && make
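
Adding a leader to the ypservers file can be sketched as follows. This is an illustration only: add_leader_to_ypservers is a hypothetical helper, not a Tempo command, and the demonstration works on a scratch file rather than on /var/yp/ypservers. It appends the -ib1 form of the leader name only if that entry is not already present.

```shell
#!/bin/sh
# Sketch only: add a leader's -ib1 name to a ypservers-style file if it is missing.
# add_leader_to_ypservers is a hypothetical helper, not a Tempo command.
add_leader_to_ypservers() {
    file=$1
    leader=$2                  # for example r2lead
    entry="${leader}-ib1"      # service nodes resolve only the -ib1 name
    if ! grep -qxF "$entry" "$file" 2>/dev/null; then
        echo "$entry" >> "$file"
    fi
}

# Demonstration against a scratch file rather than /var/yp/ypservers.
demo=$(mktemp)
echo "r1lead-ib1" > "$demo"
add_leader_to_ypservers "$demo" r2lead
add_leader_to_ypservers "$demo" r2lead   # second call changes nothing
cat "$demo"
rm -f "$demo"
```

After changing the real file, rebuild the NIS maps with the cd /var/yp && make command shown above.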

                                                    Installing SGI Tempo Patches and Updating SGI Altix ICE Systems

                                                    This section describes how to update the software on an SGI Altix ICE system.

                                                    Overview of Installing SGI Tempo Patches

                                                    SGI supplies updates to SGI Tempo software via the SGI update server at https://update.sgi.com/. Access to this server requires a Supportfolio login and password. Access to SUSE Linux Enterprise Server updates requires a Novell login account and registration.

                                                    The initial installation process for the SGI Altix ICE system set up a number of package repositories in the /tftpboot directory on the admin node. The SGI Tempo related packages are in directories located under the /tftpboot/sgi directory. If the cluster is configured with SUSE Linux Enterprise Server 10 (SLES10), the SLES packages are in the /tftpboot/distro/sles10sp2 directory. For SLES11, they are in /tftpboot/distro/sles11. If the cluster is configured with Red Hat Enterprise Linux (RHEL), the RHEL packages are in /tftpboot/distro/RHEL5.3.

                                                    When SGI releases updates, you may run sync-repo-updates (described later) to download the updated packages that are part of a patch. The sync-repo-updates command automatically positions the files properly under /tftpboot.

                                                    Once the local repositories contain the updated packages, you can update the various SGI Altix ICE admin, leader, and managed service node images using the cinstallman command. The cinstallman command is used for all package updates, both within images and on running nodes, including the admin node itself.

                                                    For additional information on updating your system, see “Upgrading from Prior SGI ProPack Releases to SGI ProPack 6 SP3 ”.

                                                    A small amount of preparation is required to set up an SGI Altix ICE system so that updated packages can be downloaded from the SGI update server and the Linux distro server and then installed with the cinstallman command.

                                                    The following sections describe these steps.

                                                    Update the Local Package Repositories on the Admin Node

                                                    This section explains how to update the local product package repositories needed to share updates on all of the various nodes on an SGI Altix ICE system.

                                                    Update the SGI Package Repositories on the Admin Node

                                                    SGI provides a sync-repo-updates script to help keep your local package repositories on the admin node synchronized with available updates for the SGI Tempo, SGI Foundation, SGI ProPack for Linux and SLES products. The script is located in /opt/sgi/sbin/sync-repo-updates on the admin node.

                                                    The sync-repo-updates script requires your Supportfolio user name and password. You can supply them on the command line; otherwise, the script prompts you for them. With this login information, the script contacts the SGI update server and downloads the updated packages into the appropriate local package repositories.

                                                    For SLES, if you installed and configured the SMT tool as described in “Update the SLES Package Repository”, the sync-repo-updates script will also download any updates to SLES from the Novell update server. When all package downloads are complete, the script updates the repository metadata.

                                                    For RHEL, if you configured the RHN tool as described in “Update the RHEL Package Repository”, the sync-repo-updates script will also download any updates to RHEL from the Red Hat update server. When all package downloads are complete, the script updates the repository metadata.

                                                    Once the script completes, the local package repositories on the admin node should contain the latest available package updates and be ready to use with the cinstallman command.


                                                    Note: You can use the crepo command to set up custom repositories. If you add packages to these custom repositories later, you need to use the yume --prepare --repo command on the custom repository so that the metadata is up to date. If you fail to do this, the yum/yume/cinstallman command may not be able to see your new packages.


                                                    Update the SLES Package Repository

                                                    In Tempo 1.7 (or later), SLES updates are mirrored to the admin node using the SUSE Linux Enterprise Subscription Management Tool. The Subscription Management Tool (SMT) is used to mirror and distribute updates from Novell. SGI Tempo software only uses the mirror abilities of this tool. Mechanisms within SGI Tempo are used to deploy updates to installed nodes and images. SMT is described in detail in the SUSE Linux Enterprise Subscription Management Tool Guide. A copy of this manual is in the SMT_en.pdf file located in the /usr/share/doc/manual/sle-smt_en directory on the admin node of your system. Use the scp(1) command to copy the manual to a location where you can view it, as follows:

                                                    # scp /usr/share/doc/manual/sle-smt_en/SMT_en.pdf user@domain_name.mycompany.com:

                                                    Register with Novell

                                                    Register your system with Novell using Novell Customer Center Configuration. This is in the Software category of YaST. When registering, use the email address that is already on file with Novell. If there is not one on file, use a valid email address that you can associate with your Novell login at a future date.

                                                    The SMT will not be able to subscribe to the necessary update channels unless it is configured to work with a properly authorized Novell login. If you have an activation code or if you have entitlements associated with your Novell login, the SMT should be able to access the necessary update channels.

                                                    More information on how to register, how to find activation codes, and how to contact Novell with questions about registration can be found in the YaST help for Novell Customer Center Configuration.

                                                    Configuring the SMT Using YaST

                                                    At this point, your admin node should be registered with Novell. You should also have a Novell login available that is associated with the admin node. This Novell login will be used when configuring the SMT described in this section. If the Novell login does not have proper authorization, you will not be able to register the appropriate update channels. Contact Novell with any questions on how to obtain or properly authorize your Novell login for use with the SMT.

                                                    Procedure 2-26. Configuring SMT Using YaST

                                                      To configure SMT using YaST, perform the following steps:

                                                      1. Start up the YaST tool, as follows:

                                                        # yast

                                                      2. Under Network Services, find SMT Configuration.

                                                      3. For Enable Subscription Management Tool Service (SMT), uncheck the box. You do not want SMT to be running by default. You want SMT to mirror updates, but you do not want its service or cron jobs running.

                                                      4. For NU User, enter your Novell user name.

                                                      5. For NU Password, enter your Novell password.

                                                      6. For NU E-Mail, use the email with which you registered.

                                                      7. For your SMT Server URL, just leave the default.

                                                        It is a good idea to use the test feature. This will at least confirm basic functionality with your login. However, it does not guarantee that your login has access to all the desired update channels.

                                                        Note that Help is available within this tool regarding the various fields.

                                                      8. When you click Next, a window pops up asking for the Database root password. View the file /etc/odapw. Enter the contents of that file as the password in the blank box.

                                                        A window will likely pop up telling you that you do not have a certificate. You will then be given a chance to create the default certificate. Note that when that tool comes up, you will need to set the password for the certificate by clicking on the certificate settings.

                                                      Setting up SMT to Mirror Updates

                                                      This section describes how to set up SMT to mirror the appropriate SLES updates.

                                                      Procedure 2-27. Setting up SMT to Mirror Updates

                                                        To set up SMT to mirror updates, perform the following steps:

                                                        1. Look at the available catalogs, as follows:

                                                          # smt-catalogs

                                                          In that listing, you should see that the majority of the catalogs matching the admin node distribution (distro) (sles10sp2 or sles11) have "Yes" in the "Can be Mirrored" column.

                                                        2. Use the smt-catalogs -m command to show you just the ones that you are allowed to mirror.

                                                        3. Choose the -Updates channels that match the installed distro. For example, if the base distro is SLES10SP2, you might choose:

                                                          SLE10-SP2-SMT-Updates
                                                          SLE10-SDK-SP2-Updates
                                                          SLES10-SP2-Updates

                                                          For SLES11, you might choose:

                                                          SLE11-SDK-Updates
                                                          SLES11-Updates
                                                          SLE11-SMT-Updates [may not be available until after initial Tempo 1.7 release]

                                                        4. This step shows how you might enable the catalogs. Each time, you are presented with a menu of choices. Be sure to select only the x86_64 version and, if given a choice between sles and sled, choose sles, as follows:

                                                          # smt-catalogs -e SLE10-SP2-SMT-Updates
                                                          # smt-catalogs -e SLE10-SDK-SP2-Updates
                                                          # smt-catalogs -e SLES10-SP2-Updates

                                                          Example output is, as follows:

                                                          quiero-admin:~ # smt-catalogs -e SLE10-SDK-SP2-Updates
                                                          .---------------------------------------------------------------------------------------------------------------------------.
                                                          | Mirror? | ID | Type | Name                  | Target         | Description                              | Can be Mirrored |
                                                          +---------+----+------+-----------------------+----------------+------------------------------------------+-----------------+
                                                          | No      |  1 | nu   | SLE10-SDK-SP2-Updates | sled-10-i586   | SLE10-SDK-SP2-Updates for sled-10-i586   | Yes             |
                                                          | No      |  2 | nu   | SLE10-SDK-SP2-Updates | sled-10-x86_64 | SLE10-SDK-SP2-Updates for sled-10-x86_64 | Yes             |
                                                          | No      |  3 | nu   | SLE10-SDK-SP2-Updates | sles-10-i586   | SLE10-SDK-SP2-Updates for sles-10-i586   | Yes             |
                                                          | No      |  4 | nu   | SLE10-SDK-SP2-Updates | sles-10-ia64   | SLE10-SDK-SP2-Updates for sles-10-ia64   | Yes             |
                                                          | No      |  5 | nu   | SLE10-SDK-SP2-Updates | sles-10-ppc    | SLE10-SDK-SP2-Updates for sles-10-ppc    | Yes             |
                                                          | No      |  6 | nu   | SLE10-SDK-SP2-Updates | sles-10-s390x  | SLE10-SDK-SP2-Updates for sles-10-s390x  | Yes             |
                                                          | No      |  7 | nu   | SLE10-SDK-SP2-Updates | sles-10-x86_64 | SLE10-SDK-SP2-Updates for sles-10-x86_64 | Yes             |
                                                          '---------+----+------+-----------------------+----------------+------------------------------------------+-----------------'
                                                          Select catalog number (or all) to change,  (1-7,a) : 7

                                                          In the example above, select 7 because it is the only entry that is both x86_64 and sles.

                                                        5. Use the smt-catalogs -o command to show only the enabled catalogs. Make sure that it shows the channels you need to be set up for mirroring.


                                                          Warning: SGI Tempo does not map the concept of channels on to its repositories. This means that any channel you subscribe to will have its RPMs placed into the distribution repository. Therefore, only subscribe the Tempo admin node to channels related to your SGI Tempo cluster needs.
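
The row selection in step 4 can be expressed as a simple filter. The following sketch reads table rows in the format of the example output above and prints the ID of the row whose Target column matches a given value. It assumes the column order shown in that example, and pick_catalog_id is a hypothetical helper, not part of SMT. The sample rows are copied from the example output.

```shell
#!/bin/sh
# Sketch only: find the catalog ID for a given target in smt-catalogs-style rows.
# pick_catalog_id is a hypothetical helper; the column order is assumed from
# the example output above.
pick_catalog_id() {
    target=$1
    # Splitting on "|" gives: $2 Mirror?, $3 ID, $4 Type, $5 Name, $6 Target.
    awk -F '|' -v t="$target" '
        {
            id = $3; tg = $6
            gsub(/ /, "", id)      # strip the padding spaces
            gsub(/ /, "", tg)
            if (tg == t) print id
        }'
}

pick_catalog_id sles-10-x86_64 <<'EOF'
| No      |  2 | nu   | SLE10-SDK-SP2-Updates | sled-10-x86_64 | SLE10-SDK-SP2-Updates for sled-10-x86_64 | Yes             |
| No      |  3 | nu   | SLE10-SDK-SP2-Updates | sles-10-i586   | SLE10-SDK-SP2-Updates for sles-10-i586   | Yes             |
| No      |  7 | nu   | SLE10-SDK-SP2-Updates | sles-10-x86_64 | SLE10-SDK-SP2-Updates for sles-10-x86_64 | Yes             |
EOF
```

For these sample rows the filter prints 7, the same choice made by hand in the example.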


                                                        Downloading the Updates from Novell and SGI

                                                        At this time, you should have your update channels registered. From here on, the sync-repo-updates script does the rest of the work. The script uses SMT to download all the updates and positions them in the existing repositories so that the various nodes and images can be upgraded.

                                                        Run the /opt/sgi/sbin/sync-repo-updates script.

                                                        After this completes, you need to update your nodes and images (see “Installing Updates on Running Admin, Leader, and Service Nodes ”).


                                                        Note: Be advised that the first sync with the Novell server will take a very long time.


                                                        Update the RHEL Package Repository

                                                        As described in “Update the SGI Package Repositories on the Admin Node”, it is possible to download package updates from Red Hat and apply those updates across the cluster. To download updates from Red Hat, you need to register the admin node with Red Hat Network (RHN). To do this, run the rhn_register command and enter your Red Hat Network account information when prompted. After you are registered, you can download the Red Hat updates using the SGI sync-repo-updates tool (described later).


                                                        Warning: SGI disables the RHN plugin by default. This is very important because, if the RHN plugin is enabled on the admin node, most distro related packages are downloaded from Red Hat instead of from the admin node itself when creating an image. Image creation would then take a very long time, depending on your connection to the outside world. Therefore, SGI disables RHN and enables it only when a sync is needed (described later).

                                                        After you are registered with RHN, you may use the Red Hat updates feature of the /opt/sgi/sbin/sync-repo-updates script. The script does the following Red Hat specific tasks (after it has downloaded SGI updates from SGI servers):

                                                        • Temporarily enable the RHN plugin.

                                                        • Use the reposync command to download the updates from Red Hat, placing them into a staging area.

                                                        • Disable the RHN plugin.

                                                        • Copy the packages into the Red Hat distro repository, /tftpboot/distro/RHEL5.3.

                                                        • Update the yum metadata so that the newer packages are available to the cluster for installation with the cinstallman tool.

                                                        SGI has observed that the RHN sync process sometimes does not complete the first couple of times it is run because of temporary connection failures. In that case, restart the sync-repo-updates script to retry. The problem is likely to be seen only during the first large sync of all updates, because subsequent runs of the command download only what is new.
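
If you prefer not to restart the script by hand, the retry can be sketched as a small wrapper. The retry function below is a hypothetical helper, not part of SGI Tempo, and the attempt limit of 3 in the commented example is an arbitrary assumption. It reruns any command until the command succeeds or the limit is reached.

```shell
#!/bin/sh
# Sketch only: re-run a command until it succeeds or the attempt limit is reached.
# retry is a hypothetical helper; it is not part of SGI Tempo.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$(( $attempt + 1 ))
    done
    return 1
}

# Example use on the admin node (left commented out here):
# retry 3 /opt/sgi/sbin/sync-repo-updates
```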

                                                        Installing Updates on Running Admin, Leader, and Service Nodes

                                                        This section explains how to update existing nodes and images to the latest packages in the repositories.

                                                        To install updates on the admin node, perform the following command from the admin node:

                                                        admin:~ # cinstallman --update-node --node admin

                                                        To install updates on all online leader nodes, perform the following command from the admin node:

                                                        admin:~ # cinstallman --update-node --node r\*lead

                                                        To install updates on all managed and online service nodes, perform the following from the admin node:

                                                        admin:~ # cinstallman --update-node --node service\*

                                                        To install updates on the admin, all online leader nodes, and all online and managed service nodes with one command, perform the following command from the admin node:

                                                        admin:~ # cinstallman --update-node --node \*

                                                        Please note the following:

                                                        • The cinstallman command does not operate on running compute nodes.

                                                        • When using a node aggregation, for example, the asterisk (*), a node that happens to be unreachable is skipped. Therefore, you should verify that all expected nodes received their updated packages.

                                                        • For more information on the crepo and cinstallman commands, see “crepo Command” in Chapter 3 and “cinstallman Command” in Chapter 3, respectively.
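
Because unreachable nodes are skipped, it can help to list them before or after an update. The following is a minimal sketch under stated assumptions: unreachable_nodes and the PROBE variable are hypothetical names, the default probe uses the Linux ping options -c and -W, and the node names in the commented example are placeholders. It is not a Tempo feature.

```shell
#!/bin/sh
# Sketch only: list the nodes that do not answer a reachability probe.
# unreachable_nodes and PROBE are hypothetical names, not Tempo features.
PROBE=${PROBE:-"ping -c 1 -W 2"}

unreachable_nodes() {
    for node in "$@"; do
        # $PROBE is left unquoted on purpose so that its options split into words.
        if ! $PROBE "$node" >/dev/null 2>&1; then
            echo "$node"
        fi
    done
}

# Example (node names are placeholders):
# unreachable_nodes r1lead r2lead service0
```

Any node that the function prints was probably skipped and needs its update run again once it is back online.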

                                                        Updating Packages Within Systemimager Images

                                                        You can also use the cinstallman command to update systemimager images with the latest software packages.


                                                        Note: Changes to the kernel package inside the compute image require some additional steps before the new kernel can be used on compute nodes (see “Additional Steps for Compute Image Kernel Updates” for more details). This note does not apply to leader or managed service nodes. Replace sles10sp2 with the distro and version you are using.


                                                        The following examples show how to upgrade the packages inside the three node images supplied by SGI:

                                                        admin:~ # cinstallman --update-image --image lead-sles10sp2 
                                                        admin:~ # cinstallman --update-image --image service-sles10sp2 
                                                        admin:~ # cinstallman --update-image --image compute-sles10sp2


                                                        Note: Changes to the compute image on the admin node are not seen by the compute nodes until the updates have been pushed to the leader nodes with the cimage command. Updating leader and managed service node images ensures that the next time you add, rediscover, or reimage a leader or service node, it already contains the updated packages.


                                                        Before pushing the compute image to the leaders using the cimage command, it is a good idea to clean the yum cache.


                                                        Note: The yum cache can grow and is in the writable portion of the compute blade image. This means it is replicated 64 times per compute blade image per rack and the space that may be used by compute blades is limited by design to minimize network and load issues on rack leader nodes.


                                                        To clean the yum cache, from the system admin controller (admin node), perform the following:

                                                        # cinstallman --yum-image --image compute-sles10sp2 clean all
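
The image update and cache-clean commands above differ only in the distro suffix of the image name. The following sketch prints the full set for a given distro so that you can review the commands before running them on the admin node. It assumes your images follow the lead-, service-, and compute- naming shown above; image_update_cmds is a hypothetical helper that prints commands and runs nothing.

```shell
#!/bin/sh
# Sketch only: print the image update and cache-clean commands for one distro.
# image_update_cmds is a hypothetical helper; it prints commands and runs nothing.
image_update_cmds() {
    distro=$1     # for example sles10sp2 or sles11
    for role in lead service compute; do
        echo "cinstallman --update-image --image ${role}-${distro}"
    done
    # Clean the yum cache in the compute image before pushing it with cimage.
    echo "cinstallman --yum-image --image compute-${distro} clean all"
}

image_update_cmds sles11
```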

                                                        Additional Steps for Compute Image Kernel Updates

                                                        Any time a compute image is updated with a new kernel, you will need to run some additional steps in order to make the new kernel available. The following example assumes that the compute node image name is compute-sles10sp2 and that you have already updated the compute node image in the image directory per the instructions in “Creating Compute and Service Node Images Using the cinstallman Command” in Chapter 3. If you have named your compute node image something other than compute-sles10sp2, replace this in the example that follows:

                                                        1. Shut down any compute nodes that are running the compute-sles10sp2 image (see “Power Management Commands” in Chapter 3).

                                                        2. Push out the changes with the cimage --push-rack command, as follows:

                                                          # cimage --push-rack compute-sles10sp2 r\* 

                                                        3. Update the database to reflect the new kernel in the compute-sles10sp2 image, as follows:

                                                          # cimage --update-db compute-sles10sp2

                                                        4. Verify the available kernel versions and select one to associate with the compute-sles10sp2 image, as follows:

                                                          # cimage --list-images

                                                        5. Associate the compute nodes with the new kernel/image pairing, as follows:

                                                          # cimage --set compute-sles10sp2 2.6.16.46-0.12-smp "r*i*n*"


                                                          Note: Replace 2.6.16.46-0.12-smp with the actual kernel version.


                                                        6. Reboot the compute nodes with the new kernel/image.
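
Steps 2 through 5 above can be sketched as a command list generated for a given image name and kernel version. The helper below is hypothetical and only prints the cimage commands for review. Steps 1 and 6 (shutting down and rebooting the compute nodes) are omitted because they depend on the power management commands described in Chapter 3.

```shell
#!/bin/sh
# Sketch only: print the cimage steps from the list above for one image and kernel.
# kernel_update_cmds is a hypothetical helper; it prints commands and runs nothing.
kernel_update_cmds() {
    image=$1      # for example compute-sles10sp2
    kernel=$2     # for example 2.6.16.46-0.12-smp
    echo "cimage --push-rack ${image} 'r*'"
    echo "cimage --update-db ${image}"
    echo "cimage --list-images"
    echo "cimage --set ${image} ${kernel} 'r*i*n*'"
}

kernel_update_cmds compute-sles10sp2 2.6.16.46-0.12-smp
```

Run the printed commands one at a time, and choose the kernel version for the last command from the cimage --list-images output.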

                                                        Upgrading from Prior SGI ProPack Releases to SGI ProPack 6 SP3

                                                        For information on upgrading your system from a prior SGI ProPack release to SGI ProPack 6 for Linux Service Pack 3, see the release notes. The SGI ProPack 6 SP3 release notes can be found in a file named README.TXT that is available in the /docs directory on the SGI ProPack 6 for Linux Service Pack 3 CD.

                                                        The SGI ProPack 6 for Linux Service Pack 3 release notes are installed in the following location on a system running SGI ProPack 6 SP3: /usr/share/doc/packages/sgi-propack-6/README.txt.

                                                        Cascading Dual-Boot

                                                        This section describes cascading dual-root (multiple root) support. This adds the notion of a "root slot" that represents a / (root directory) and /boot directory pair for a certain operating system. The layout and usage are described in the sections that follow.

                                                        Partition Layout for Admin, Leader, and Service Nodes with Multiroot

                                                        For the Tempo v1.7 release, only leader nodes have XFS root filesystems. Partition layout for more than one slot is shown in Table 2-1.

                                                        Table 2-1. Partition Layout for Multiroot

                                                        Partition  Filesystem Type  Filesystem Label  Notes
                                                        ---------  ---------------  ----------------  -----
                                                        1          swap             sgiswap           Partition Layout: Multiroot
                                                        2          ext3             sgidata           SGI Data Partition, MBR boot loader for admin nodes
                                                        3          extended         N/A               Extended partition, making logicals out of the rest of the disk
                                                        5          ext3             sgiboot           /boot partition for slot 1
                                                        6          ext3 or xfs      sgiroot           / partition for slot 1
                                                        7          ext3             sgiboot           /boot partition for slot 2 (optional)
                                                        8          ext3 or xfs      sgiboot           / partition for slot 2


                                                        Table 2-1 shows a partition table with two available slots. Tempo supports up to five slots. Beyond five slots, no partitions are available to support an additional slot.
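The numbering in Table 2-1 follows a regular pattern: each slot takes two consecutive logical partitions, starting at partition 5. The following shell sketch computes the partition numbers for a slot under that assumption. The function name slot_partitions is hypothetical and is not part of Tempo, and the numbers for slots 3 through 5 are extrapolated from the two slots shown in the table.

```shell
# Hypothetical helper, not part of Tempo: print the /boot and / partition
# numbers for a root slot, extrapolating the pattern in Table 2-1
# (slot 1 -> 5 and 6, slot 2 -> 7 and 8).
slot_partitions() {
    local slot="$1"
    # Tempo supports at most five slots.
    case "$slot" in
        [1-5]) ;;
        *) echo "slot must be between 1 and 5" >&2; return 1 ;;
    esac
    echo "$((3 + 2 * slot)) $((4 + 2 * slot))"
}
```

For example, slot_partitions 2 prints the two numbers 7 and 8, which match the slot 2 rows of the table.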

                                                        Partition Layout for a Single Root

                                                        The partition layout for a single root is shown in Table 2-2. It is the same layout that leader and service nodes used in previous releases. This legacy leader/service node layout is kept for a single slot so that the correct pxelinux chainload setup is generated. Previously, the MBR bootloader was used. For multiroot, a chainload to a root slot boot partition is used.

                                                        Table 2-2. Partition Layout for Single Root

                                                        Partition  Filesystem Type  Filesystem Label  Notes
                                                        ---------  ---------------  ----------------  -----
                                                        1          ext3             sgiboot           /boot
                                                        2          extended         n/a               Extended partition, making logicals out of the rest of the disk
                                                        5          swap             sgiswap           Swap partition
                                                        6          ext3 or xfs      sgiroot           /


                                                        Prior to the 1.6 release, admin nodes had a partition layout different from those shown in Table 2-1 and Table 2-2: two partitions, swap and a single root, with no separate /boot. Any newly installed admin node will have one of the two partition layouts described in the tables above. However, because admin nodes can be upgraded rather than re-installed, an admin node may have any one of three different partition layouts.

                                                        Admin Node Installation Choices Related to Cascading Dual-Boot

                                                        When you boot the admin node installation DVD, you are brought to a syslinux boot banner by default with a boot prompt, as in previous releases.

                                                        The multiroot feature adds a few new parameters, as follows:

                                                        • re_partition_with_slots

                                                          When installing, re-partitions the admin node to allow for the specified number of slots. The default is 2.

                                                        • install_slot

                                                          Details which root slot to install to for this session, a number from 1 to the number of slots available, default

                                                        • destructive

                                                          If destructive is set to 1, potentially destructive operations are allowed. Some examples follow.

                                                          If the installer finds an admin node with exactly one blank (never used) disk and no parameters are provided, it configures the admin node with a partition table for two slots and installs an operating system into the first slot.

                                                          If the installer finds an admin node with more than one blank disk, a protection mechanism triggers and the installer errors out because it cannot determine which disk to use.

                                                          If the installer finds an admin node with a disk previously used by Tempo, nothing destructive happens unless the destructive=1 parameter is passed.

                                                          If the slot specified by install_slot appears to have been used previously, it is not reformatted unless destructive=1 is supplied.


                                                          Note: This detection is simply parted detecting what filesystem was there. If you subsequently re-partitioned your disk using the same partition layout, then partitions from the past may appear to parted as having a valid filesystem when you might not expect that. In this case, just use destructive=1 once you confirm it is okay to proceed.

                                                          If re_partition_with_slots is supplied and a previous Tempo configuration is detected, the installer errors out unless destructive=1 is supplied.
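The parameters above are typed at the installation boot prompt. As an illustration of how they combine, the following shell sketch assembles a parameter string and checks the ranges described above. The function name tempo_boot_params is hypothetical. The name=value form appears in this section only for destructive=1 and is assumed here for the other two parameters.

```shell
# Hypothetical helper, not part of Tempo: build the multiroot parameters
# to append at the installation boot prompt.
#   $1 = number of slots to partition for (1 to 5)
#   $2 = slot to install into (1 to $1)
#   $3 = 1 to allow destructive operations; omitted or anything else leaves it out
tempo_boot_params() {
    local slots="$1" install_slot="$2" destructive="${3:-0}"
    case "$slots" in
        [1-5]) ;;
        *) echo "slots must be between 1 and 5" >&2; return 1 ;;
    esac
    case "$install_slot" in
        [1-5]) ;;
        *) echo "install_slot must be between 1 and 5" >&2; return 1 ;;
    esac
    if [ "$install_slot" -gt "$slots" ]; then
        echo "install_slot must not exceed the number of slots" >&2
        return 1
    fi
    # name=value form is an assumption for the first two parameters.
    local params="re_partition_with_slots=$slots install_slot=$install_slot"
    if [ "$destructive" = "1" ]; then
        params="$params destructive=1"
    fi
    echo "$params"
}
```

Calling tempo_boot_params 3 2 1 prints a string requesting three slots, installation into slot 2, and permission for destructive operations. Because re_partition_with_slots errors out on a disk with an existing Tempo configuration unless destructive=1 is supplied, omit the third argument only for blank disks.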


                                                        Leader and Service Node Installation

                                                        Leader and service nodes are installed as in previous releases. However, they mimic the admin node in terms of partition layout and which slot is used for what purpose.

                                                        Therefore, when a discover operation is performed, the slot used for installation is the same slot on which the admin node is currently booted. Currently, you cannot choose a different slot for these nodes; they always match the admin node.

                                                        If a leader or service node is found to have a slot count that does not match the admin node, the node is re-partitioned. It is assumed that if the admin node changes its layout, all partitions on leader and service nodes are re-initialized as well.

                                                        Choosing a Slot to Boot the Admin Node

                                                        After the admin node is installed with Tempo 1.7 (or later), it boots in one of two ways. If only one root slot is configured, the MBR of the admin node is used to boot the root as usual.

                                                        However, if more than one root slot is configured, the grub loader in the MBR directs you to a special grub menu that allows you to choose a root slot.

                                                        For the multi-root admin node, the sgidata partition is used to store some grub files and grub configuration information, including a chainload entry for each slot. Therefore, the first grub to come up on the admin node presents a list of slots. When a slot is selected, a chainload is performed and the grub representing that slot comes up.
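To make the chainload idea concrete, the following shell sketch prints a menu entry of the kind a first-stage grub menu could use to hand off to the grub installed on a slot's /boot partition. It assumes GRUB Legacy syntax and the partition numbering of Table 2-1. The function name grub_slot_entry is hypothetical, and the entries that Tempo actually writes to the sgidata partition may differ.

```shell
# Hypothetical sketch, not Tempo output: print a GRUB Legacy menu entry
# that chainloads the /boot partition of a root slot on the first disk.
#   $1 = slot number (1 to 5), $2 = menu title
grub_slot_entry() {
    local slot="$1" title="$2"
    case "$slot" in
        [1-5]) ;;
        *) echo "slot must be between 1 and 5" >&2; return 1 ;;
    esac
    # GRUB Legacy counts partitions from 0, so partition 5 is (hd0,4).
    local index=$((2 + 2 * slot))
    cat <<EOF
title $title
    rootnoverify (hd0,$index)
    chainloader +1
EOF
}
```

For example, grub_slot_entry 1 "slot 1" produces an entry that chainloads (hd0,4), which is partition 5 in Table 2-1.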

                                                        How to Handle Resets, Power Cycles, and BMC dhcp Leases When Changing Slots

                                                        This section describes how to handle resets, power cycles, and BMC dhcp leases when changing slots, as follows:

                                                        • Prior to rebooting the admin node to a new root slot, shut down the entire cluster, including compute blades, leader nodes, and service nodes. If you use the cpower command with the --shutdown option, the managed leader and service nodes are left in single-user mode. An example cpower command is as follows:

                                                          admin:~ # cpower --shutdown --system

                                                        • After this is complete, reboot the admin node and boot the new slot.

                                                        • After the admin node comes up on its new slot, use the cpower command to reboot all of the leader and service nodes. This ensures that they reboot and become available. An example cpower command is as follows:

                                                          admin:~ # cpower --reboot --system


                                                        Note: In some cases, the IP address setup in one slot may differ from that in another. This can affect leader and service node BMCs. After the admin node is booted into a new slot, the BMCs on the leader and service nodes may hold on to their old IP addresses. They will eventually time out and obtain new leases. The problem may show up as cpower not being able to communicate with the BMCs properly. If you have trouble connecting to leader and service node BMCs after switching slots on the admin node, give the nodes up to 15 minutes to obtain new leases that match the new slot.
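The lease delay described in the note can be handled by polling rather than by waiting a fixed time. The following shell sketch retries a probe command until it succeeds or a time limit passes. The function retry_until is hypothetical and is not part of Tempo; substitute whatever check you normally use to confirm that a BMC is reachable.

```shell
# Hypothetical helper, not part of Tempo: run a command repeatedly until it
# succeeds or the time limit is reached.
#   $1 = time limit in seconds, $2 = pause between attempts in seconds,
#   remaining arguments = the probe command to run
retry_until() {
    local limit="$1" interval="$2"
    shift 2
    local elapsed=0
    while ! "$@"; do
        # Only the pauses are counted, so the limit is approximate.
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

For the 15-minute window mentioned above, a call might look like retry_until 900 30 ping -c 1 bmc-host, where bmc-host is a placeholder for the BMC name or address at your site.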


                                                        Leader and Service Node Booting

                                                        The way leader and service nodes boot is dependent on whether the cascading dual-boot feature is in use or not, as explained in this section.

                                                        Leader and Service Node Booting on a System Configured with One Root Slot

                                                        When a system is configured with only one root slot, it is not using the cascading dual-boot feature. This may be because you want all the disk space on your nodes dedicated to a single installation, or it may be because you have upgraded from previous Tempo releases that did not make use of this feature and you do not want to reinstall at this time.

                                                        When not using the cascading dual-boot feature, the admin node creates PXE configuration files that direct the service and leader nodes to do one of the following:

                                                        Leader and Service Node Booting on a System Configured with Multiple Root Slots

                                                        When a system is configured with two or more root slots, it is using the cascading dual-boot feature.

                                                        In this case, the admin node creates leader and service PXE configuration files that direct the managed service and leader nodes to do one of the following:

                                                        • Boot from the currently configured slot

                                                        • Reinstall the currently configured slot

                                                        The current slot is determined by the slot on which the admin node is booted. Therefore, the admin node and all managed service and leader nodes are always booted on the same slot number.

                                                        In order to configure a managed service or leader node to boot from a given slot, the admin node creates a PXE configuration file that is configured to load a chainloader. This chainloader is used to boot the appropriate boot partition of the managed service or leader node.

                                                        This means that, in a cascading dual-boot situation, the service and leader nodes do not have anything in their master boot record (MBR). However, each /boot has grub configured to match the associated root slot. A syslinux chainload is performed by PXE to start grub on the appropriate boot partition.

                                                        If, for some reason, a PXE boot fails to work properly, there will be no output at all from that node. This means that cascading dual-boot is heavily dependent on PXE boots working properly for its operation.


                                                        Note: Unlike the managed service and leader nodes, the admin node always has an MBR entry. See “Choosing a Slot to Boot the Admin Node”.
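The syslinux chainload described above for leader and service nodes can be illustrated with a small generator. The following shell sketch prints a pxelinux configuration that uses the syslinux chain.c32 module to boot a slot's /boot partition on the first disk. It assumes the partition numbering of Table 2-1. The function name pxe_chainload_config and the label names are hypothetical, and the files that the admin node actually generates may differ.

```shell
# Hypothetical sketch, not Tempo output: print a pxelinux configuration that
# chainloads the /boot partition of the given root slot on the first disk.
pxe_chainload_config() {
    local slot="$1"
    case "$slot" in
        [1-5]) ;;
        *) echo "slot must be between 1 and 5" >&2; return 1 ;;
    esac
    # /boot partition number extrapolated from Table 2-1 (slot 1 -> 5).
    local boot_part=$((3 + 2 * slot))
    cat <<EOF
DEFAULT slot$slot
LABEL slot$slot
    KERNEL chain.c32
    APPEND hd0 $boot_part
EOF
}
```

With this arrangement nothing on the node's own MBR is needed, which is also why a failed PXE boot leaves the node with no output at all.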


                                                        Slot Cloning

                                                        A script named /opt/sgi/sbin/clone-slot is available. This script allows you to clone a source slot to a destination slot. It synchronizes the data and then updates grub and the fstab files so that the cloned slot is a viable boot choice.

                                                        The script sanitizes the input values and then calls a worker script, which does the actual work, in parallel on the admin node and all managed nodes. The clone-slot script waits for all children to complete before exiting.

                                                        Important: If the slot you are using as a source is the mounted/active slot, the script shuts down mysql on the admin node before starting the backup operation and starts it again when the backup is complete. This ensures there is no data loss.
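The fan-out pattern described above, starting one worker per node and waiting for all of them, can be sketched in shell as follows. This is not the clone-slot implementation; run_on_all and the worker command passed to it are hypothetical.

```shell
# Hypothetical sketch, not the clone-slot source: run a worker command once
# per node in the background, then wait for every child before returning.
#   $1 = worker command; remaining arguments = node names
run_on_all() {
    local worker="$1"
    shift
    local pids="" failed=0 node pid
    for node in "$@"; do
        "$worker" "$node" &
        pids="$pids $!"
    done
    for pid in $pids; do
        # wait returns the exit status of that particular child.
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

The function returns nonzero if any worker failed, so a caller can tell a partial clone from a complete one.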

                                                        Admin Node: Managing Which Slot Boots by Default

                                                        Use the cadmin command to control which slot on the admin node boots by default.

                                                        To show the slot that is currently the default, perform the following:

                                                        # cadmin --show-default-root

                                                        To change it so slot 2 boots by default, perform the following:

                                                        # cadmin --set-default-root --slot 2

                                                        Admin Node: Managing Grub Labels

                                                        You can use the cadmin command to control the grub labels the various slots have. When a slot is installed, the label is updated to be in this form:

                                                        slot 1: tempo 1.7 / rhel5.3: (none)

                                                        You can adjust the last part (none in the above example). The following are some example commands.

                                                        Show the currently configured grub root labels, as follows:

                                                        # cadmin --show-root-labels

                                                        Set the customer-adjustable portion of the root label for slot 1 to say "life is good", as follows:

                                                        # cadmin --show-root-labels
                                                        slot 1: tempo 1.7 / rhel5.3: first rhel
                                                        slot 2: tempo 1.7 / rhel5.3: my party
                                                        slot 3: tempo 1.7 / sles10sp2: I can cry if I want to.
                                                        slot 4
                                                        # cadmin --set-root-label --slot 1 --label "life is good"
                                                        # cadmin --show-root-labels
                                                        slot 1: tempo 1.7 / rhel5.3: life is good
                                                        slot 2: tempo 1.7 / rhel5.3: my party
                                                        slot 3: tempo 1.7 / sles10sp2: I can cry if I want to.
                                                        slot 4
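If you need a label in a script, the output format shown above is simple enough to parse with sed. The following shell sketch extracts the customer-adjustable portion for one slot. The function name root_label_for_slot is hypothetical, and it assumes the output keeps the slot N: release / distribution: label form shown in the example.

```shell
# Hypothetical helper, not part of Tempo: read "cadmin --show-root-labels"
# output on standard input and print the customer-adjustable label for the
# given slot. Prints nothing for a slot that has no label yet.
root_label_for_slot() {
    local slot="$1"
    # Drop "slot N: <release / distribution>: " and keep the remainder.
    sed -n "s/^ *slot $slot: [^:]*: //p"
}
```

A typical call would be cadmin --show-root-labels | root_label_for_slot 1.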

                                                        Admin Node: Which root slot is in use?

                                                        You can use the cadmin command to show the root slot you are currently booted into on the admin node, as follows:

                                                        # cadmin --show-current-root
                                                        admin node currently booted on slot: 2
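Scripts that act differently depending on the active slot need only the number from this output. The following shell sketch extracts it. The function name current_root_slot is hypothetical, and it assumes the message text stays exactly as shown above.

```shell
# Hypothetical helper, not part of Tempo: read "cadmin --show-current-root"
# output on standard input and print only the slot number.
current_root_slot() {
    sed -n 's/^ *admin node currently booted on slot: \([0-9][0-9]*\) *$/\1/p'
}
```

For example, slot=$(cadmin --show-current-root | current_root_slot) stores the slot number in a variable, which is left empty if the output does not match the expected form.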