AIC7XXX (2940UW Pro) file system corruption
Matthias Andree
ma at dt.e-technik.uni-dortmund.de
Wed Feb 11 04:02:46 PST 2004
Hi,
I have a 2940 UW Pro running in a FreeBSD 4-STABLE (checked out and
built kernel around Feb 3rd) machine with Yamaha CRW4416S (CD, USCSI),
Plextor PX-20TS (CD, USCSI) and Micropolis 4345WS (HDD). The external
connector is unused, the 50-pin stuff is terminated internally in the
Plextor at the bus end, the 68-pin stuff is terminated internally in the
Micropolis at the other bus end.
Last Friday, the SCSI stuff in the box went haywire, dumped card state
and finally locked the machine up - I had to press the reset button. On
reboot, fsck -p aborted the boot since /var was corrupt. At that time,
the hard disk drive was running with the "WCE" (write cache enable) bit
set to 0 in both the saved and current mode pages. It's a test machine,
so I hadn't bothered to report this until now.
I'd used both a Tekram DC-390 (AMD53C974, amd(4)) and a Tekram DC-390U
(SYM53C975, sym(4)) in the same machine with one of these 50<->68
adaptor plugs without seeing such problems, but at that time, the Yamaha
was not yet installed.
The log entries (logged across the network) are too large to post here;
the gzipped log can be downloaded from:
ftp://ftp.dt.e-technik.uni-dortmund.de/pub/people/ma/aic7xxx-hang.gz
The log is segmented: the first part, from Feb 6, holds the boot-up
messages (around 15:07). I then elided everything until 20:00; the
trouble started at 20:10:20 with:
Feb 6 20:10:20 libertas /kernel: swap_pager: indefinite wait buffer: device: #da/0x20001, blkno: 4296, size: 24576
Feb 6 20:10:41 libertas /kernel: (da0:ahc0:0:0:0): SCB 0x0 - timed out
Feb 6 20:10:42 libertas /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Feb 6 20:10:42 libertas /kernel: ahc0: Dumping Card State in Data-in phase, at SEQADDR 0x64
At that time, the machine was running portupgrade -a and was supposed to
build some big ports, among them gcc and XFree86.
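For anyone who downloads the full log: a minimal sketch, assuming every line follows the syslog layout of the excerpt above, that pulls only the SCB timeouts and card state dumps out of the gzipped file. The function name scan() and the two regular expressions are made up for illustration and are based solely on the four lines quoted here, so other message variants in the full log may not match.

```python
import gzip
import re
import sys

# Assumed layout, taken only from the excerpt in this mail:
#   "Feb 6 20:10:41 libertas /kernel: <message>"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d+)\s+(\d\d:\d\d:\d\d)\s+(\S+)\s+/kernel:\s+(.*)$"
)
# "(da0:ahc0:0:0:0): SCB 0x0 - timed out"
TIMEOUT_RE = re.compile(
    r"\((\w+):(\w+):\d+:\d+:\d+\): SCB (0x[0-9a-fA-F]+) - timed out"
)
# "ahc0: Dumping Card State in Data-in phase, at SEQADDR 0x64"
DUMP_RE = re.compile(
    r"(\w+): Dumping Card State (.+), at SEQADDR (0x[0-9a-fA-F]+)"
)


def scan(lines):
    """Return (timestamp, kind, device, detail) tuples for matching lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # not a kernel syslog line in the assumed layout
        month, day, clock, _host, msg = m.groups()
        stamp = f"{month} {day} {clock}"
        t = TIMEOUT_RE.search(msg)
        if t:
            # device is the peripheral (e.g. da0), detail is the SCB number
            events.append((stamp, "timeout", t.group(1), t.group(3)))
            continue
        d = DUMP_RE.search(msg)
        if d:
            # device is the controller (e.g. ahc0), detail is phase and SEQADDR
            events.append((stamp, "dump", d.group(1), f"{d.group(2)} @ {d.group(3)}"))
    return events


if __name__ == "__main__" and len(sys.argv) > 1:
    # Path to the downloaded, still gzipped log is given on the command line.
    with gzip.open(sys.argv[1], "rt", errors="replace") as fh:
        for event in scan(fh):
            print(*event, sep="\t")
```

Run with the path of the downloaded file as the only argument, it prints one tab-separated line per event, which makes it easier to see how the timeouts cluster before the lockup.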
Is this a driver issue?
After rebooting and manually cleaning up the /var mess, which involved
force-installing some ports, another portupgrade -a run completed
without problems.
I tried reading the defect lists with each of the following commands,
but the only result was an error message and further card state dumps
(not posted in full; only the first and last lines are shown below):
camcontrol defects da0 -G -f block
camcontrol defects da0 -G -f bfi
camcontrol defects da0 -G -f phys
(pass0:ahc0:0:0:0): SCB 0xf - timed out
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
ahc0: Dumping Card State while idle, at SEQADDR 0x7
Card was paused
...
(pass0:ahc0:0:0:0): Queuing a BDR SCB
(pass0:ahc0:0:0:0): Bus Device Reset Message Sent
(pass0:ahc0:0:0:0): no longer in timeout, status = 34b
ahc0: Bus Device Reset on A:0. 11 SCBs aborted
This card dump occurred within half a second after issuing the
camcontrol command.
As an alternative, I then ran "sformat -verify dev=0,0,0", which
reported no defects or weak blocks, so I assume the drive itself is
fine.
--
Matthias Andree
Encrypt your mail: my GnuPG key ID is 0x052E7D95