Questions about camcontrol, hot-swapping, ciss and Compaq SmartArray

Josh Endries josh at endries.org
Mon Mar 10 23:59:35 UTC 2008


Hello,

Today I saw that one of the disks in my RAID 5 array seems to be dead or dying:

http://pastebin.ca/937249

<snip>
loki.domain.int ciss0: *** Fatal drive error, SCSI port 1 ID 0
loki.domain.int (da1:ciss0:0:1:0): WRITE(10). CDB: 2a 0 c ae 3f d0 0 0 20 0
loki.domain.int (da1:ciss0:0:1:0): CAM Status: SCSI Status Error
loki.domain.int (da1:ciss0:0:1:0): SCSI Status: Check Condition
loki.domain.int (da1:ciss0:0:1:0): MEDIUM ERROR asc:11,0
loki.domain.int (da1:ciss0:0:1:0): Unrecovered read error
loki.domain.int (da1:ciss0:0:1:0): Retrying Command (per Sense Data)
</snip>

I see messages for port 0 only, with the ID varying from 0 to 3, and I'm not
sure what the ID means (a partition?). After a while the error messages
stopped, even though the disks were, and still are, in use. I found
cciss_vol_status online, but it reports both volumes as OK (not degraded),
which doesn't make sense to me:

# cciss_vol_status /dev/ciss0
/dev/ciss0: (Smart Array 642) RAID 0 Volume 0(?) status: OK.
/dev/ciss0: (Smart Array 642) RAID 5 Volume 1(?) status: OK.

Is there a way I can tell which port/disk is bad from these messages?
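
One way to narrow this down is to tally the "Fatal drive error" lines by
port and ID and see which pair dominates. The following is only a minimal
sketch in Python: the regular expression is modelled on the single line
format quoted above, the function name count_fatal_errors is made up for
illustration, and mapping a port/ID pair to a physical drive bay is still
something the controller documentation has to answer.

```python
import re
from collections import Counter

# Pattern modelled on the log line quoted in this message, e.g.:
#   loki.domain.int ciss0: *** Fatal drive error, SCSI port 1 ID 0
# Other ciss(4) message formats are not handled (assumption).
FATAL_RE = re.compile(
    r"(ciss\d+): \*\*\* Fatal drive error, SCSI port (\d+) ID (\d+)"
)


def count_fatal_errors(lines):
    """Count fatal drive errors per (controller, port, id) tuple."""
    counts = Counter()
    for line in lines:
        match = FATAL_RE.search(line)
        if match:
            key = (match.group(1), int(match.group(2)), int(match.group(3)))
            counts[key] += 1
    return counts


# Sample input taken from the excerpt above; in practice the lines would
# come from the system log instead.
sample = [
    "loki.domain.int ciss0: *** Fatal drive error, SCSI port 1 ID 0",
    "loki.domain.int (da1:ciss0:0:1:0): MEDIUM ERROR asc:11,0",
]

for (controller, port, target), n in count_fatal_errors(sample).most_common():
    print(f"{controller} port {port} ID {target}: {n} fatal error(s)")
```

To run it against real data, replace the sample list with the lines of the
log file, for example open("/var/log/messages"), assuming that is where
syslog writes kernel messages on the machine in question.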

Assuming I can determine which disk it is, do I need to do anything in the OS
before or after I swap out the drive? I've seen people mention rescanning the
bus and running other camcontrol commands first...

Any other tips?

Thanks,
Josh

