Chapter 2. Using the OPS System

This chapter explains how to configure, start, and halt the OPS system.

Configuring OPS

OPS configuration files that are edited or created when your OPS system is set up are

  • /usr/opscm/conf/opsconf on each server

  • /usr/opscm/conf/sidconf on each server

  • /usr/opsnc/conf/ncconf on the Indy

Each process is explained separately in this section.

Editing opsconf

To check the contents of the opsconf configuration file, follow these steps:

  1. Specify the Indy workstation used for IRISconsole: open /usr/opscm/conf/opsconf on each server. Check to see that the line

    CLUSTER   1    test     2      ichostname    
    

    has the correct system name of the Indy workstation used for IRISconsole.

  2. Define the DLM domain(s) and DLM instances, as explained in “OPS Instances and Domains” in Chapter 1. Check that the lines under

    #NODE dom inst ndname ndaddress      cmsvc   apsvc   wt
    

    have accurate information on your OPS servers. For example:

    NODE  0   0    host1  150.166.42.37  opscm   opsdlm  1
    NODE  0   1    host2  150.166.42.38  opscm   opsdlm  1
    

  3. Save and close /usr/opscm/conf/opsconf.

  4. Edit the file /etc/services; add three services:

    opscm	newnumber1/tcp
    opsdlm	newnumber2/tcp
    opsnc	newnumber3/tcp
    

    where newnumbern is a number not used elsewhere at this site. For example:

    opscm	7018/tcp
    opsdlm	7019/tcp
    opsnc	7020/tcp
    

  5. Save and exit the file.
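
The /etc/services check in steps 4 and 5 can be scripted. The following is a minimal sketch, not part of the OPS software: check_ops_services is a hypothetical helper that confirms each of the three services has exactly one tcp entry and that no port/protocol pair is assigned twice in the file. It only inspects the local file, so it cannot confirm that a number is unused elsewhere at the site.

```shell
#!/bin/sh
# Sketch only: check_ops_services is a hypothetical helper, not an OPS command.
# Usage: check_ops_services [services-file]   (defaults to /etc/services)

check_ops_services() {
    services_file=${1:-/etc/services}
    rc=0

    # Each OPS service must appear exactly once as a tcp entry.
    for svc in opscm opsdlm opsnc; do
        count=$(awk -v s="$svc" \
            '$1 == s && $2 ~ /\/tcp$/ { n++ } END { print n + 0 }' \
            "$services_file")
        if [ "$count" -ne 1 ]; then
            echo "$svc: expected 1 tcp entry, found $count"
            rc=1
        fi
    done

    # Report any port/protocol pair (for example 7018/tcp) listed more than once.
    dups=$(awk '$1 !~ /^#/ && NF >= 2 { seen[$2]++ }
                END { for (p in seen) if (seen[p] > 1) print p }' \
            "$services_file")
    if [ -n "$dups" ]; then
        echo "duplicate port assignments: $dups"
        rc=1
    fi

    return $rc
}
```

Run check_ops_services on each server and on the Indy after editing the file; a nonzero exit status means that at least one entry needs attention.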

Editing sidconf

The sidconf file maps each Oracle instance (sid, or Oracle system ID) to a DLM domain-instance pair. To check the sidconf configuration file, follow these steps:

  1. Determine the sid of the Oracle database for each instance.

  2. In /usr/opscm/conf/sidconf, check that the lines

    MAP sid0 domainnumber instancenumber 
    MAP sid1 domainnumber instancenumber 
    

    contain accurate information for the servers. For example:

    MAP finance1 0 0
    MAP finance2 0 1
    

  3. Save and close /usr/opscm/conf/sidconf.

  4. If necessary, change permissions on this file so that it can be read by all.
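
The checks in steps 2 and 4 can be sketched as a small script. check_sidconf is a hypothetical helper, not part of the OPS software. It assumes that every MAP line has exactly four fields, as in the example above, and that no sid or domain-instance pair should be listed twice; both are assumptions drawn from the examples rather than documented rules.

```shell
#!/bin/sh
# Sketch only: check_sidconf is a hypothetical helper, not an OPS command.
# Usage: check_sidconf [sidconf-file]   (defaults to /usr/opscm/conf/sidconf)

check_sidconf() {
    sidconf=${1:-/usr/opscm/conf/sidconf}
    rc=0

    # Assumed format: MAP <sid> <domainnumber> <instancenumber>
    awk '
        $1 == "MAP" {
            if (NF != 4 || $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/) {
                print "line " NR ": malformed MAP entry"; bad = 1; next
            }
            if (sid[$2]++)         { print "line " NR ": duplicate sid " $2; bad = 1 }
            if (pair[$3 "," $4]++) { print "line " NR ": duplicate pair " $3 " " $4; bad = 1 }
        }
        END { exit bad }
    ' "$sidconf" || rc=1

    # Step 4: the eighth character of the ls -l mode string is the
    # read bit for "other"; it must be r for the file to be readable by all.
    other_read=$(ls -l "$sidconf" | cut -c8)
    if [ "$other_read" != "r" ]; then
        echo "$sidconf is not readable by all; consider chmod a+r"
        rc=1
    fi

    return $rc
}
```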

Editing ncconf

To edit the ncconf configuration file, follow these steps:

  1. Note the ports on the ST-1600 to which the Remote System Control server ports are cabled.

  2. In an Indy node controller IRIX window, open the /usr/opsnc/conf/ncconf file.

  3. Check that the entries under

    #nodename	ttyname
    

    contain accurate information for the servers. For example:

    ops1   /dev/ttyf031
    ops2   /dev/ttyf033
    

    The last digit in each tty entry should correspond to the ST-1600 port into which the Remote System Control port for each server is cabled. Ports 1 through 16 on the ST-1600 correspond to the hexadecimal digits 0 through f in the tty entries. In the example above, the final digits 1 and 3 indicate that the Remote System Control ports are cabled to ports 2 and 4 on the ST-1600.

  4. Save and close /usr/opsnc/conf/ncconf.
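
The mapping between tty entries and ST-1600 ports described in step 3 can be worked out with a short script. This is a sketch under the rule stated above (final hexadecimal digit plus one); tty_to_port and ncconf_ports are hypothetical helper names, not OPS commands.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for cross-checking ncconf against cabling.

# Print the ST-1600 port (1 through 16) for a tty name such as /dev/ttyf031.
tty_to_port() {
    # The last character of the tty name is a hex digit, 0 through f.
    digit=$(printf '%s' "$1" | sed 's/.*\(.\)$/\1/')
    case $digit in
        [0-9a-fA-F]) echo $(( 0x$digit + 1 )) ;;
        *) echo "not a hex digit: $digit" >&2; return 1 ;;
    esac
}

# Print "nodename port" for every entry in an ncconf file,
# skipping blank lines and lines that begin with #.
ncconf_ports() {
    ncconf=${1:-/usr/opsnc/conf/ncconf}
    while read -r node ttyname; do
        case $node in
            ''|'#'*) continue ;;
        esac
        echo "$node $(tty_to_port "$ttyname")"
    done < "$ncconf"
}
```

For the example entries above, ncconf_ports reports port 2 for ops1 and port 4 for ops2, which can then be compared with the cabling noted in step 1.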

Starting OPS

This section explains

  • starting the node controller software (opsnc) on the Indy workstation

  • starting the Connection Manager software (opscm) on the CHALLENGE servers

  • starting OPS automatically

  • starting a node for single-host operation

Starting the Node Controller Software on the Indy Workstation

To restart the OPS software for normal dual-host operation on both hosts and the Indy node controller, follow these steps:

  1. As root on the Indy workstation, start the OPS node controller software by typing

    /usr/opsnc/bin/opsnc
    

  2. Start IRISconsole by typing

    /usr/sbin/ic
    

  3. In the IRISconsole site window, click the Get Console button for each OPS host.

Starting the Connection Manager Software on the CHALLENGE Servers

Follow these steps:

  1. Check for the presence of the CM lock file in /tmp. The filename has the format .nn, where the two digits are the DLM domain and instance numbers, for example, .00. If this file exists, delete it.

  2. Check to see if the CM is already running by typing

    ps -ef | grep opscm
    

    If it is running, type the following to kill it:

    killall -TERM opscm
    

  3. Check to see if the DLM is already running by typing

    ps -ef | grep dlm
    

    If it is running, type the following to kill its processes:

    killall -TERM dlmmon
    killall -TERM dlmd 
    

  4. Run ipcs to determine the shared memory and semaphores used on the host. The following is an example output:

    IPC status from /dev/kmem as of Thu May 18 14:31:22 1995
    T     ID     KEY        MODE       OWNER    GROUP
    Message Queues:
    Shared Memory:
    m      0 0x53637444 --rw-r--r--     root      sys
    m    301 0x000022bb --rw-rw----   oracle      dba
    m   2202 0x0c33b7c9 --rw-r-----   oracle      dba
    Semaphores:
    s   2200 0x00000000 --ra-r-----   oracle      dba
    

  5. If Oracle or DLM is using any shared memory segments or semaphores, save them to another location if you need them for debugging a DLM or Oracle crash; otherwise, delete them with ipcrm. In the example in step 4, you would use

    ipcrm -m 301 -m 2202 -s 2200
    

  6. For each host, create a startup script containing the following lines:

    #!/sbin/sh
    
    ORACLE_HOME=/usr/people/oracle
    ORACLE_SID=sidname
    LKDOM=0
    LKINST=0
    USER=oracle
    GROUP=dba
    
    export ORACLE_HOME ORACLE_SID LKDOM LKINST
    
    /usr/opscm/bin/opscm 
    $ORACLE_HOME/bin/lkmgrd -u $USER -g $GROUP
    

    In each script, make sure that the values for LKDOM= and LKINST= are accurate for the domain and instance on that host. These values must match those in /usr/opscm/conf/sidconf, as explained in “Editing sidconf,” earlier in this chapter.

  7. As root, run the startup script on each host to bring up OPS.

  8. In each host console window, start the Oracle database.
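
The cleanup in steps 1, 4, and 5 can be partly scripted. The following sketch uses two hypothetical helpers that are not part of the OPS software. cm_lock_file builds the lock file name, assuming the domain digit comes first and the instance digit second. ipcrm_command reads ipcs output in the layout shown in step 4 and prints, without running it, an ipcrm command for the entries owned by a given user (oracle by default, an assumption based on the example). Printing the command leaves time to save any segments needed for debugging before they are removed.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for the pre-start cleanup steps.

# Print the CM lock file path for a DLM domain and instance,
# assuming the name is .<domain><instance> in /tmp (for example /tmp/.00).
cm_lock_file() {
    echo "/tmp/.$1$2"
}

# Read ipcs output on standard input and print an ipcrm command for
# the shared memory (m) and semaphore (s) entries owned by $1.
# Expected columns: T ID KEY MODE OWNER GROUP
ipcrm_command() {
    owner=${1:-oracle}
    awk -v owner="$owner" '
        ($1 == "m" || $1 == "s") && $5 == owner { args = args " -" $1 " " $2 }
        END { if (args != "") print "ipcrm" args }
    '
}
```

For example, ipcs | ipcrm_command oracle prints the command to review, and ls -l "$(cm_lock_file 0 0)" shows whether the lock file for domain 0, instance 0 exists.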

Starting OPS Automatically

To enable opscm and the DLM to start automatically at boot time, follow these steps:

  1. Edit the /etc/init.d/opscmgr script. This script is similar to the startup script created in “Starting the Connection Manager Software on the CHALLENGE Servers,” earlier in this section.

  2. Run this command as root:

    chkconfig -f opscm on
    

Starting a Node for Single-Host Operation

To start a node for single-host operation, run opscm with the -F option:

opscm -F


Caution: Do not use the -F option for normal OPS dual-host operation.


Halting the OPS System

To halt the OPS system, follow these steps:

  1. Back up Oracle database information as needed.

  2. On each OPS host, halt Oracle database operation.

  3. Type the following to terminate CM gracefully:

    killall -TERM opscm
    

Log information relevant to OPS failures is stored in the following locations:

  • node controller and CM log information: syslog

  • DLM log: /var/tmp/dlm/
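
When investigating a failure, it can help to pull only the OPS entries out of the system log. The sketch below assumes that node controller and CM messages contain the program names opsnc or opscm, which the source does not confirm, and it takes the syslog file path as an argument because the location is site dependent. ops_log_lines is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: ops_log_lines is a hypothetical helper, not an OPS command.
# Usage: ops_log_lines <syslog-file>

# Print the lines of a syslog file that mention opsnc or opscm.
# Assumption: OPS messages are tagged with those program names.
ops_log_lines() {
    grep -E 'opsnc|opscm' "$1"
}
```

The DLM log files can be listed, most recent first, with ls -lt /var/tmp/dlm/.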