card state dump during tape backup
Georg Altmann
galtmann at las-cad.com
Thu Jan 20 12:01:09 PST 2005
Hi,
I am having troubles with our tape backup. The system has been running
stable for about 3 month now after a tape drive failure/replacement and
maybe for one year before that. The system (hardware/kernel/software) has
not been touched during this period. A week ago the backup software
(Netvault 7.1) showed the tape drive being offline and I found a scsi card
dump in the logs (see below). The problem recurred almost daily the past
few days during the differential backups (1-3 GB maybe). Still I am able to
perform larger test backups (10 GB) without the problem occurring.
After restarting netvault and doing a camcontrol reset all, the backup is
running fine again for 12h to 48h.
We are using an ADIC Fastor 2 tape library with a LTO1 tape drive attached
to an Adaptec 29160 ultra160 adapter. FreeBSD is 4.10-RELEASE-p4. Machine
details see below.
Although I don't suspect this to be the cause: termination is set to
automatic for LVD in the adapter BIOS (68pin external connector) and an
active terminator is plugged on the library side. I checked cables again 2
days ago.
My first guess was to disable wide negotiation for all devices and set the
tape drive to 40 MB/s in the adapter BIOS, but I doesn't seem to help.
What is this? Is the hardware starting to fail?
And if so is likely to be something on the SCSI bus e.g. the drive or is it
the motherboard or the SCSI card failing?
I mean it is very unlikely for it to be some kind of kernel or software
failure, right?
Any help would be greatly appreciated.
Best regards,
Georg
# pciconf -lv
ahc0 at pci0:12:0: class=0x010000 card=0xe2a09005 chip=0x00809005 rev=0x02
hdr=0x00
vendor = 'Adaptec'
device = '29160/N/LP Ultra160 (AIC-7892A) SCSI Host Adapter'
class = mass storage
subclass = SCSI
# camcontrol devlist
<ADIC FastStor 2 A12r> at scbus0 target 0 lun 0 (pass0)
<HP Ultrium 1-SCSI E33A> at scbus0 target 5 lun 0 (pass1,sa0)
relevant (?) kernel options:
options SCSI_DELAY=8000
options CAMDEBUG
options AHC_ALLOW_MEMIO
/var/run/dmesg.boot:
CPU: AMD Athlon(tm) Processor (1102.51-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x642 Stepping = 2
Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
AMD Features=0xc0440000<RSVD,AMIE,DSP,3DNow!>
real memory = 805240832 (786368K bytes)
avail memory = 780029952 (761748K bytes)
pci0: <VIA 83C572 USB controller> at 7.2 irq 9
pci0: <VIA 83C572 USB controller> at 7.3 irq 9
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xbc00-0xbcff mem
0xdfffb000-0xdfffbfff irq 5 at device 12.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
sa0 at ahc0 bus 0 target 5 lun 0
sa0: <HP Ultrium 1-SCSI E33A> Removable Sequential Access SCSI-3 device
sa0: 40.000MB/s transfers (40.000MHz, offset 15)
pass0 at ahc0 bus 0 target 0 lun 0
pass0: <ADIC FastStor 2 A12r> Removable Changer SCSI-2 device
pass0: 3.300MB/s transfers
Here comes the dump:
Jan 19 22:46:01 asterix /kernel: ahc0: Recovery Initiated
Jan 19 22:46:01 asterix /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins
<<<<<<<<<<<<<<<<<
Jan 19 22:46:01 asterix /kernel: ahc0: Dumping Card State while idle, at
SEQADDR 0x9
Jan 19 22:46:01 asterix /kernel: Card was paused
Jan 19 22:46:01 asterix /kernel: ACCUM = 0x4, SINDEX = 0x7, DINDEX = 0x21,
ARG_2 = 0x3
Jan 19 22:46:01 asterix /kernel: HCNT = 0x0 SCBPTR = 0x1
Jan 19 22:46:01 asterix /kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0]
SCSIBUSL[0x0]
Jan 19 22:46:01 asterix /kernel: LASTPHASE[0x1]:(P_BUSFREE)
SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
Jan 19 22:46:01 asterix /kernel: SBLKCTL[0xa]:(SELWIDE|SELBUSB)
SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE)
Jan 19 22:46:01 asterix /kernel:
SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SSTAT0[0x0]
Jan 19 22:46:01 asterix /kernel: SSTAT1[0x8]:(BUSFREE) SSTAT2[0x0]
SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP)
Jan 19 22:46:01 asterix /kernel:
SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) SXFRCTL0[0x80]:(DFON)
Jan 19 22:46:01 asterix /kernel: DFCNTRL[0x0]
DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
Jan 19 22:46:01 asterix /kernel: STACK: 0x0 0x164 0x179 0x3
Jan 19 22:46:01 asterix /kernel: SCB count = 20
Jan 19 22:46:01 asterix /kernel: Kernel NEXTQSCB = 4
Jan 19 22:46:01 asterix /kernel: Card NEXTQSCB = 4
Jan 19 22:46:01 asterix /kernel: QINFIFO entries:
Jan 19 22:46:01 asterix /kernel: Waiting Queue entries:
Jan 19 22:46:01 asterix /kernel: Disconnected Queue entries: 1:14
Jan 19 22:46:01 asterix /kernel: QOUTFIFO entries:
Jan 19 22:46:01 asterix /kernel: Sequencer Free SCB List: 0 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Jan 19 22:46:01 asterix /kernel: Sequencer SCB Info:
Jan 19 22:46:01 asterix /kernel: 0 SCB_CONTROL[0xc0]:(DISCENB|TARGET_SCB)
SCB_SCSIID[0x7]
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0x0] SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 1 SCB_CONTROL[0x44]:(DISCONNECTED|DISCENB)
SCB_SCSIID[0x7]
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0x0] SCB_TAG[0xe]
Jan 19 22:46:01 asterix /kernel: 2 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 3 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 4 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 5 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 6 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 7 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 8 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 9 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 10 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 11 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 12 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 13 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 14 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 15 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 16 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 17 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 18 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 19 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 20 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 21 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 22 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 23 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 24 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 25 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 26 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 27 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 28 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 29 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 30 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 31 SCB_CONTROL[0x0]
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID)
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: Pending list:
Jan 19 22:46:01 asterix /kernel: 14 SCB_CONTROL[0x40]:(DISCENB)
SCB_SCSIID[0x7] SCB_LUN[0x0]
Jan 19 22:46:01 asterix /kernel: Kernel Free SCB list: 9 15 16 17 18 19 0 1
2 3 5 6 7 8 13 12 11 10
Jan 19 22:46:01 asterix /kernel: Untagged Q(0): 14
Jan 19 22:46:01 asterix /kernel:
Jan 19 22:46:01 asterix /kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends
>>>>>>>>>>>>>>>>>>
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): SCB 0xe - timed out
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): Queuing a BDR SCB
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): Bus Device Reset
Message Sent
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): no longer in timeout,
status = 34b
Jan 19 22:46:01 asterix /kernel: ahc0: Bus Device Reset on A:0. 1 SCBs
aborted
Jan 19 22:46:01 asterix /kernel: ahc0: Timedout SCBs already complete.
Interrupts may not be functioning.
More information about the freebsd-scsi
mailing list