Appendix B. SCSI Controller Error Messages

This appendix lists the error strings printed by the device drivers tpsc, dksc, and, in some cases, devscsi.

Introduction

In early IRIX releases, the differential SCSI dual-channel controller board and the dksc driver printed the information differently; in IRIX 4.0.1, they began to use the same form. In IRIX 5.x and later releases, the differential SCSI dual-channel controller board driver reports the error message in the same format as the integral SCSI controller driver.

The error message format for IRIX 3.x and 4.x was:

sense codes. key%x asc%x asq%x

Arguments

key    The sense key number from Table B-1.

asc    The additional sense code, from Table B-2.

asq    The additional sense qualifier, which sometimes provides further information.


Note: Sometimes, there is only one possible asq for a given asc, and many SCSI devices return nonstandard asq values.

The asq tends to be more vendor-specific, although the ANSI SCSI-2 specification defines the “standard” sense qualifiers.
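As a rough illustration of the IRIX 3.x and 4.x format described above, the sketch below pulls the three hexadecimal fields out of a logged line. The sample line and the function name parse_old_sense_line are made up for this example; only the "sense codes. key%x asc%x asq%x" layout comes from this appendix.

```python
import re

# Pattern for the IRIX 3.x/4.x form "sense codes. key%x asc%x asq%x".
# %x prints bare hexadecimal digits, so no "0x" prefix is expected.
OLD_FORMAT = re.compile(
    r"sense codes\. key([0-9a-f]+) asc([0-9a-f]+) asq([0-9a-f]+)",
    re.IGNORECASE,
)

def parse_old_sense_line(line):
    """Return (key, asc, asq) as integers, or None if the line does not match."""
    match = OLD_FORMAT.search(line)
    if match is None:
        return None
    return tuple(int(field, 16) for field in match.groups())

# Hypothetical log line, invented for illustration only.
sample = "sense codes. key3 asc11 asq0"
print(parse_old_sense_line(sample))
```

The returned key can then be looked up in Table B-1 and the asc in Table B-2.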

For IRIX 5.x and 6.x, the integral SCSI controller on your system normally prints messages in one of the forms below, depending on which of the two Western Digital bus controllers (wd93 or wd95) is in use:

WD93 Bus # target # lun # message

or

wd95_(bus)d(target); sense key {num} ({string}) asc{num}

Arguments

  • The first # (or d for the wd95) is the SCSI adapter involved. It is 0 on all systems except those with the IO3 input/output board, which supports up to four adapters, numbered 0-3.

  • The second and third # (target and lun) are printed only when the device causing the problem is known.

In a number of cases, a phase and, possibly, a state are printed. These error codes come from the files /usr/include/sys/scsidev.h and /usr/include/sys/scsi.h.

The state and phase meanings are listed in Table B-5 and Table B-6. A few comments have been added. Some of the messages are also included.

Sense Key Information

Table B-1 and Table B-2 map error codes to sense keys.

Table B-1. Primary Sense Key Information

Message                 Sense Key  Most Common Cause(s)
No sense                0x0        No error information available
Recovered error         0x1        The device recovered by itself
Device not ready        0x2        No media or not spun up
Media error             0x3        An actual media problem
Device hardware error   0x4        Usually a device hardware error
Illegal request         0x5        Invalid command or data issued
Unit attention          0x6        Device was reset or power-cycled
Data protect error      0x7        Usually device is write protected
Unexpected blank media  0x8        Tried to read at end of a tape
Vendor unique error     0x9        Varies
Copy aborted            0xa        Copy cmd aborted by host (not used)
Aborted command         0xb        Target aborted command
Search data successful  0xc        Search data command OK (not used)
Volume overflow         0xd        Tried to write past EOT on tape
Reserved                0xe        Should not be seen
Reserved                0xf        Should not be seen
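A lookup of this kind is simple to express in code. The following is only a sketch: the dictionary copies the messages from Table B-1, while the names SENSE_KEYS and describe_sense_key are hypothetical and do not come from any IRIX header.

```python
# Sense key messages copied from Table B-1.
SENSE_KEYS = {
    0x0: "No sense",
    0x1: "Recovered error",
    0x2: "Device not ready",
    0x3: "Media error",
    0x4: "Device hardware error",
    0x5: "Illegal request",
    0x6: "Unit attention",
    0x7: "Data protect error",
    0x8: "Unexpected blank media",
    0x9: "Vendor unique error",
    0xA: "Copy aborted",
    0xB: "Aborted command",
    0xC: "Search data successful",
    0xD: "Volume overflow",
}

def describe_sense_key(key):
    """Map a sense key to its Table B-1 message.

    Only the low four bits are used; 0xe and 0xf are reported as
    "Reserved", matching the table.
    """
    return SENSE_KEYS.get(key & 0xF, "Reserved")

print(describe_sense_key(0x6))
```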

While Table B-1 helps to identify an error, Table B-2 provides further information on its cause. In IRIX 4.0, the ASQ (additional sense qualifier) is printed numerically only when its value is not 0; in IRIX 3.3.3, the differential SCSI dual-channel controller always prints it. Numerical values missing from Table B-2 are not listed either because they are not defined or because the drivers treat them specially.

Table B-2 is provided primarily so you can look up the additional sense codes in the device manual. Some are self-explanatory, others quite obscure.

Table B-2. Additional Sense Code

Message                                       Additional Sense Code
No index/sector signal                        0x01
No seek complete                              0x02
Write fault                                   0x03
Not ready to perform command                  0x04
Unit does not respond to selection            0x05
No reference position                         0x06
Multiple drives selected                      0x07
LUN communication error                       0x08
Track error                                   0x09
Error log overflow                            0x0a
Write error                                   0x0c
ID CRC or ECC error                           0x10
Unrecovered data block read error             0x11
No address mark found in ID field             0x12
No address mark found in Data field           0x13
No record found                               0x14
Seek position error                           0x15
Data sync mark error                          0x16
Read data recovered with retries              0x17
Read data recovered with ECC                  0x18
Defect list error                             0x19
Parameter overrun                             0x1a
Synchronous transfer error                    0x1b
Defect list not found                         0x1c
Compare error                                 0x1d
Recovered ID with ECC                         0x1e
Invalid command code                          0x20
Illegal logical block address                 0x21
Illegal function                              0x22
Illegal field in CDB                          0x24
Invalid LUN                                   0x25
Invalid field in parameter list               0x26
Media write protected                         0x27
Media change                                  0x28
Device reset                                  0x29
Log parameters changed                        0x2a
Copy requires disconnect                      0x2b
Command sequence error                        0x2c
Update in place error                         0x2d
Tagged commands cleared                       0x2f
Incompatible media                            0x30
Media format corrupted                        0x31
No defect spare location available            0x32
Media length error                            0x33[a]
Toner/ink error                               0x36
Parameter rounded                             0x37
Saved parameters not supported                0x39
Medium not present                            0x3a
Forms error                                   0x3b
Invalid ID msg                                0x3d
Self config in progress                       0x3e
Device config has changed                     0x3f
RAM failure                                   0x40
Data path diagnostic failure                  0x41
Power on diagnostic failure                   0x42
Message reject error                          0x43
Internal controller error                     0x44
Select/reselect failed                        0x45
Soft reset failure                            0x46
SCSI interface parity error                   0x47
Initiator detected error                      0x48
Inappropriate/illegal message                 0x49
Command phase error                           0x4a
Data phase error                              0x4b
Failed self configuration                     0x4c
Overlapped commands attempted                 0x4e
Media load/unload failure                     0x53
Unable to read table of contents              0x57
Generation (optical device) bad               0x58
Updated block read (optical device)           0x59
Operator request or state change              0x5a
Logging exception                             0x5b
RPL status change                             0x5c
Self diagnostics predict unit will fail soon  0x5d
Lamp failure                                  0x60
Video acquisition error/focus problem         0x61
Scan head positioning error                   0x62
End of user area on track                     0x63
Illegal mode for this track                   0x64
Decompression error                           0x70[b]

[a] Specified as tape only

[b] DAT only; may be in SCSI3
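To show how the values in Table B-1 and Table B-2 relate to raw request sense data, here is a minimal sketch. It assumes the SCSI-2 fixed sense data layout (sense key in the low four bits of byte 2, additional sense code in byte 12, qualifier in byte 13); the sample bytes, the function name decode_fixed_sense, and the partial ASC_MESSAGES dictionary are illustrative and are not taken from the drivers.

```python
# A few entries copied from Table B-2; extend as needed.
ASC_MESSAGES = {
    0x04: "Not ready to perform command",
    0x11: "Unrecovered data block read error",
    0x29: "Device reset",
    0x3A: "Medium not present",
}

def decode_fixed_sense(data):
    """Extract (sense key, asc, asq) from fixed-format sense data.

    Assumes the SCSI-2 fixed layout. asc and asq are returned as None
    when the buffer is too short to contain them.
    """
    if len(data) < 3:
        raise ValueError("sense data too short to hold a sense key")
    key = data[2] & 0x0F
    asc = data[12] if len(data) > 12 else None
    asq = data[13] if len(data) > 13 else None
    return key, asc, asq

# Made-up sense buffer: key 0x3, asc 0x11, asq 0x00.
sample = bytes.fromhex("70 00 03 00 00 00 00 0a 00 00 00 00 11 00 00 00 00 00")
key, asc, asq = decode_fixed_sense(sample)
print(hex(key), hex(asc), ASC_MESSAGES.get(asc, "not in table"))
```

Codes that are not in the dictionary should be looked up in the device manual, as noted above.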


SCSI Driver Error Messages

Table B-3 lists the messages that are printed by the wd93 SCSI driver. (Messages for the wd95 SCSI driver are similar). After the message is printed, the driver resets the SCSI bus. These messages are from IRIX 5.1, but similar ones are printed by earlier releases.

Table B-3. SCSI Driver Error Messages

Error Message

Comments

No memory for wd93 device array
Not enough memory for WD93 data structures
Not enough memory for WD93 DMA maps

These messages occur during boot if something is seriously wrong, and memory can't be allocated.

wd93 SCSI Bus=%d ID=%d LUN=%d: error during abort message, resetting bus

An upper level driver tried to issue an ABORT message, but the expected bus phases were not followed.

wd93 SCSI Bus=%d ID=%d LUN=%d: SYNC negotiation error, resetting bus

An error occurred while trying to negotiate synchronous SCSI rates (usually during an open or mount); the device is left in async mode.

timeout after %d %ssec

Any SCSI command that doesn't terminate within the time limit set by the upper level driver results in this message and a SCSI bus reset. The “%ssec” part is either “msec” or “sec”, depending on whether the timeout is an integral number of seconds (timeouts are passed as HZ values). The cause can be anything from driver errors to hardware errors to SCSI bus problems; SCSI bus problems are the most common cause. This message is always paired with the standard “wd93 SCSI Bus=%d...” message.

wd93 controller %d didn't reset correctly

An attempt to reset the controller chip failed; this is usually a catastrophic hardware error.

wd93 SCSI Bus=%d ID=%d LUN=%d: SCSI cmd=0x%x <MSG>. Resetting SCSI bus
wd93 SCSI Bus=%d: <MSG>. Resetting SCSI bus

Used with a number of other messages when a SCSI bus timeout or other error on the SCSI bus occurs. The SCSI bus is reset. The long form (with target, and the first byte of the SCSI command) is shown when the driver is connected to a known target. The short form is shown when no target is connected (referred to as “cmdabort” in other messages).

Spurious wd93 interrupt, no connected channel

Occurs when the driver is responding to a SCSI bus phase where some device should be connected (active), but in fact, none is.

wd93 SCSI Bus=%d ID=%d LUN=%d: SCSI bus parity error

A SCSI bus parity error was detected during a data transfer. Usually a cabling problem.

wd93 SCSI Bus=%d ID=%d LUN=%d: host memory parity error during DMA

On command completion (normal or error), the DMA hardware tells us that a parity error occurred during data transfer to memory. Usually a system hardware problem.
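The “timeout after %d %ssec” entry in Table B-3 can be illustrated with a short sketch of how a tick count might be turned into that string. This is a reconstruction from the description only, not the driver's code: the tick rate HZ = 100 and the function name format_timeout are assumptions made for the example.

```python
HZ = 100  # assumed clock tick rate in ticks per second; illustrative only

def format_timeout(ticks, hz=HZ):
    """Render a timeout given in clock ticks in the "%d %ssec" style.

    Whole seconds are printed as "sec"; anything else is converted to
    milliseconds and printed as "msec".
    """
    if ticks % hz == 0:
        return "timeout after %d %ssec" % (ticks // hz, "")
    return "timeout after %d %ssec" % (ticks * 1000 // hz, "m")

print(format_timeout(3000))
print(format_timeout(250))
```

With the assumed rate of 100 ticks per second, 3000 ticks is reported in seconds and 250 ticks in milliseconds.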


SCSI Driver Debugging Messages

The messages listed in Table B-4 are sometimes useful for debugging drivers (kernel or devscsi). They show information on the current command and the sense information obtained after a SCSI “check condition” status. They are printed only when the variable wd93_printsense is set non-zero in master.d/wd93.

Table B-4. Error Messages Useful for Debugging

Error Message

Comments

wd93 SCSI Bus=%d ID=%d LUN=%d: check condition start request sense

A check condition was detected, and a request sense is started.

wd93 SCSI Bus=%d ID=%d LUN=%d: sense failed wd93 status %d, scsi status 0x%x

The request sense command failed; the wd93 driver status and the SCSI status are printed. Usually caused by bad device firmware or SCSI bus problems.

wd93 SCSI Bus=%d ID=%d LUN=%d: sense key=0x%x (%s) ASC=0x%x (%s)

The request sense succeeded, and the sense key is printed. The ASC (additional sense code) is printed if valid, and the ASCII strings corresponding to the sense key and the ASC are also printed, if they are known to the driver.

Hex sense data:

If wd93_printsense is > 1, then the raw data returned by request sense are dumped in hex with this header.

reselect without ID

The SCSI bus phase indicates a reselection, but the reselecting device's ID could not be determined. Usually a cabling problem. Printed with the “cmdabort” message.

illegal disconnection interrupt: phase %x

A SCSI bus disconnect was detected at an unexpected point. The wd93 phase register (see sys/wd93.h) is printed. Printed with the “cmdabort” message.

unexpected message in %x, phase %x

A SCSI bus message in phase was found, but the message byte was not expected. Printed with the “cmdabort” message.

Hardware error

The wd93 reported no active phase, but should have. Normally a hardware problem. Printed with the “cmdabort” message.

Too much data %s (probable SCSI bus cabling problem)

Too many REQs were received for the amount of data programmed into the SCSI controller. Usually a SCSI bus cabling problem, but can also occur when the byte count passed to the wd93 driver doesn't match the way that the device interprets the bytes in the SCSI command. The %s is replaced by either “requested” or “sent”, depending on the direction of transfer. Printed with the “cmdabort” message.

Unexpected info phase %x, state %x

Another unexpected bus phase. The wd93 phase and state registers are printed (see sys/wd93.h). Printed with the “cmdabort” message.

wd93 SCSI Bus=%d ID=%d LUN=%d: Unexpected extended msgin type %x, len %x

A special case of an unexpected message. Extended messages occur when the first byte is 0x01; subsequent bytes indicate the length and type. The only extended message currently handled is synchronous negotiation initiated by the target.

unexpected reselection

A reselection of the host was attempted, but the host doesn't think that any target is both disconnected and active. Usually a SCSI bus problem, but might be a firmware bug also. Printed with the “cmdabort” message.

wd93 SCSI Bus=%d ID=%d LUN=%d: I/O address %x not correctly aligned, can't DMA
disconnected on non-word boundary (addr=%x, 0x%x left), can't DMA

Silicon Graphics DMA hardware requires word (32-bit) alignment at the start of any DMA (the low two bits of the address must be 0). If the address is not aligned, one of these two messages is issued, depending on whether this is the start of a command or a data phase continued on a reselection. The latter case can occur if a device disconnects and reselects, even if it doesn't go to data phase, if the DMA count remaining is non-zero. This is because it is too difficult to handle the case where the device disconnects and then reselects just to go to status phase. Such devices are inefficient at best, and, fortunately, are rare. If you must use such a device, your only option is to disable SCSI disconnects altogether (in master.d/wd93).

ID=%d LUN=%d not found in active list

Typically occurs due to SCSI bus problems or to driver bugs. A reselection occurred with valid data and bus phases at the same time as the driver attempted to select a device to initiate a command, but the reselecting device does not appear to have a command active.
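The word-alignment rule described for the “can't DMA” messages in Table B-4 amounts to a test on the low two bits of the address. The sketch below illustrates that check, including the case where a transfer is interrupted part way through by a disconnect; the function names and sample addresses are hypothetical.

```python
def is_word_aligned(addr):
    """True when the low two bits of the address are 0 (32-bit alignment)."""
    return (addr & 0x3) == 0

def can_resume_dma(start_addr, bytes_done):
    """Check the address at which DMA would restart after a reselection.

    If a device disconnects after an odd number of bytes, the restart
    address is no longer word aligned.
    """
    return is_word_aligned(start_addr + bytes_done)

print(is_word_aligned(0x1000))         # aligned start address
print(can_resume_dma(0x1000, 0x101))   # disconnect after 0x101 bytes
```

A device that disconnects only on multiples of four bytes never triggers the second case.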


SCSI States and Phases

Several of the SCSI states and phases are listed in Table B-5. There are other possible states and phases, but they rarely occur. The SCSI states and phases are listed in the files /usr/include/sys/wd93.h and /usr/include/sys/wd95.h and perhaps in scsi.h. The comments below have been extracted from these files and supplemented with additional information.


Note: “Out” is from the CPU to the SCSI device in these descriptions, and “receive” and “send” are also from the SCSI device point of view, since the target controls all the bus phases except for initial selection.


Table B-5. SCSI State Error Messages

State Message     Value  Comments
ST_RESET          0x00   SCSI chip reset by reset command or power-up.
ST_SELECT         0x11   Selection of target complete (after C93SELATN).
ST_SATOK          0x16   Select-And-Transfer completed successfully, that is, all phases have completed in a normal manner.
ST_TR_DATAOUT     0x18   Transfer cmd done, target requesting data.
ST_TR_DATAIN      0x19   Transfer cmd done, target sending data.
ST_TR_STATIN      0x1b   Target is sending status in.
ST_TR_MSGIN       0x1f   Transfer cmd done, target sending msg.
ST_TRANPAUSE      0x20   Transfer cmd has paused with ACK.
ST_SAVEDP         0x21   Save Data Pointers message during Select-And-Transfer (SAT); a normal state when a device is disconnecting from the bus.
ST_A_RESELECT     0x27   Reselected after disconnect (93A).
ST_UNEXPDISC      0x41   Unexpected disconnect: the device disconnected without sending a disconnect message; sometimes happens when devices with removable media have had the media removed during a transfer.
ST_PARITY         0x43   Cmd terminated due to parity error on the SCSI bus.
ST_PARITY_ATN     0x44   Cmd terminated due to parity error (ATN is asserted so that host can send a message to device; the transfer is just aborted).
ST_TIMEOUT        0x42   Time-out during Select or Reselect, that is, the device never responded to an attempt to select it; normally seen only during hardware inventory probing, but sometimes happens after a SCSI bus reset if the device takes a long time to recover from the reset or is powered off.
ST_INCORR_DATA    0x47   Incorrect message or status byte.
ST_UNEX_RDATA     0x48   Unexpected receive data phase: the device tried to send more data than the SCSI chip is programmed to expect. This can be OK, as when a high-level request is made to transfer more data than the DMA hardware can map on a single request. In this case, simply reprogram the DMA hardware for the next chunk of data and restart the transfer (but don't send a new SCSI command to the device). When printed as part of an error message, it can sometimes be caused by a SCSI cabling problem, or (particularly with devscsi user drivers) by a mismatch in the byte count given to the driver and the byte count implied by the SCSI command sent to the device.
ST_UNEX_SDATA     0x49   Unexpected send data phase (same as above, but the device is asking for more data).
ST_UNEX_CMDPH     0x4a   Unexpected cmd phase.
ST_UNEX_SSTATUS   0x4b   Unexpected send status phase. Status phases normally occur at the end of a SCSI command (that is, when the byte count remaining is 0); if they happen at other times, the chip interrupts. This can happen when you ask a device for more data than it can give you, and in this case, you just return a short I/O count to the caller. When printed as part of an error message, it usually implies a cabling or termination problem.
ST_UNEX_RMESGOUT  0x4e   Unexpected request message out phase; usually indicates a SCSI cabling problem.
ST_UNEX_SMESGIN   0x4f   Unexpected send message in phase; usually indicates a SCSI cabling problem. Also happens when a device sends a disconnect message in normal use when preparing to disconnect from the bus.
ST_RESELECT       0x80   WD33C93 has been reselected.
ST_93A_RESEL      0x81   Reselected while idle (93A).
ST_DISCONNECT     0x85   Disconnect has occurred.
ST_NEEDCMD        0x8a   Target is ready for a cmd.
ST_REQ_SMESGOUT   0x8e   REQ signal for send message out.
ST_REQ_SMESGIN    0x8f   REQ signal for send message in. This state and the two above it are usually seen only during sync negotiations.

Table B-6 lists phases during Select-and-Transfer commands.

Table B-6. Phases During a Select-and-Transfer Command

Phase Message  Value  Comments
PH_NOSELECT    0x00   Selection not successful.
PH_SELECT      0x10   Selection successful.
PH_IDENTSEND   0x20   Identify message sent (during selection when sending initial command to a device). Phase 0x30 indicates that none of the cmd bytes have yet been sent; every cmd byte sent increments that by one.
PH_CDB_START   0x30   Start of CDB transfers.
PH_CDB_6       0x36   6th cmd byte sent.
PH_CDB_10      0x3a   10th cmd byte sent.
PH_CDB_12      0x3c   12th cmd byte sent.
PH_SAVEDP      0x41   Save data pointers.
PH_DISCRECV    0x42   Disconnect message received.
PH_DISCONNECT  0x43   Target disconnected.
PH_RESELECT    0x44   Original target reselected.
PH_IDENTRECV   0x45   Correct identify (right LUN) message received (during reselection).
PH_DATA        0x46   Data transfer completed (expect status next).
PH_STATUSRECV  0x50   Status byte received (expect cmd complete next).
PH_COMPLETE    0x60   Command complete message received; SCSI command is finished, and SCSI bus is free.
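Because the phase value counts command bytes upward from 0x30, a phase printed in an error message can be decoded mechanically. The sketch below is one way to do that; the function name describe_phase is hypothetical, and the upper bound of 0x3c assumes that 12 bytes is the longest command, as Table B-6 suggests.

```python
# Named phases copied from Table B-6 (the CDB range is handled separately).
NAMED_PHASES = {
    0x00: "PH_NOSELECT",
    0x10: "PH_SELECT",
    0x20: "PH_IDENTSEND",
    0x41: "PH_SAVEDP",
    0x42: "PH_DISCRECV",
    0x43: "PH_DISCONNECT",
    0x44: "PH_RESELECT",
    0x45: "PH_IDENTRECV",
    0x46: "PH_DATA",
    0x50: "PH_STATUSRECV",
    0x60: "PH_COMPLETE",
}

def describe_phase(phase):
    """Turn a Select-and-Transfer phase value into a readable string."""
    # 0x30 means no command bytes sent yet; each byte sent adds one.
    # The 0x3c limit assumes commands are at most 12 bytes long.
    if 0x30 <= phase <= 0x3C:
        return "%d command byte(s) sent" % (phase - 0x30)
    return NAMED_PHASES.get(phase, "unknown phase 0x%02x" % phase)

for value in (0x30, 0x36, 0x50):
    print(hex(value), describe_phase(value))
```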