Reproducible FreeBSD 4.10-STABLE (Jul 7), 3ware 7506-4 lockup.
Jason Thomson
jason.thomson at mintel.com
Fri Jul 30 08:34:24 PDT 2004
Vinod Kashyap wrote:
> After the system locks up, from the DDB prompt, do a
> 'tr, 20'. What does it say?
>
> Please check the drive compatibility list at:
> http://www.3ware.com/products/pdf/Drive_compatibility_list.pdf
>
> If you suspect a problem with any of the 3ware components,
> I strongly encourage you to contact 3ware support.
>
Apologies for taking so long to reply.
I've finally got a serial console connected to this machine.
When the machine locks up (after the controller reports an error),
breaking into the debugger from the console just shows:
twe0: AEN: <twe0: port 3: sector repair occurred>
db> tr, 20
siointr1(c326b000,c04d1cc8,0,ffc08ff4,c039fd70) at siointr1+0xc1
siointr(c326b000) at siointr+0x17
Xfastintr4(0,ffc09000,0,0,ddaac000) at Xfastintr4+0x20
idle_loop() at idle_loop+0x44
Does this mean that the kernel itself is not locked up, and that only
the disk controller / driver is frozen?
I've included the process list at the bottom of this mail. I'm out of
ideas about what else I should look at. I can provide access to the
serial console on this machine from the internet if anyone is able to
help debug this. Please reply by private mail.
(To recap, I can reproduce the problem by dd'ing from the disk to
/dev/null: when it hits a bad sector on the disk, no further twe I/O
takes place. Contrary to a previous report, it doesn't always seem to
hit a bad sector in the same place.)
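The reproduction described above (a sequential read of the disk, discarding the data) can be sketched in Python for illustration. This is only a sketch: the function name `scan_device`, its parameters, and the `progress` callback are hypothetical and not part of any tool mentioned in this mail. It assumes that a failed read surfaces as an `OSError`; on the affected machines the read appears to hang instead, in which case the last offset passed to `progress` before the hang is the useful output.

```python
def scan_device(path, block_size=64 * 1024, max_errors=16, progress=None):
    """Read `path` sequentially, discarding the data.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the byte
    offsets of blocks whose read raised OSError. Stops at end of file or
    after `max_errors` failed reads (a hypothetical safety limit).
    """
    bad_offsets = []
    bytes_read = 0
    offset = 0
    # buffering=0 gives unbuffered reads, so each read maps to one request.
    with open(path, "rb", buffering=0) as dev:
        while len(bad_offsets) < max_errors:
            dev.seek(offset)
            try:
                data = dev.read(block_size)
            except OSError:
                # Record the failing block and skip past it.
                bad_offsets.append(offset)
                offset += block_size
                continue
            if not data:
                break  # end of file or device
            bytes_read += len(data)
            offset += len(data)
            if progress is not None:
                progress(offset)  # last value seen marks how far we got
    return bytes_read, bad_offsets
```

Passing `progress=print` and running it against the raw disk device would print the running offset, which also makes it easy to compare where successive runs stop.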
With respect to the drive compatibility list, the drives we are using
are not on the list, but drives from the same range are. The drives we
have are Maxtor 5A300J0 and 4A320J8; the Maxtor 4A300J0 is on
the list.
I don't suspect a problem with these specific 3ware components - we've
had the same problem occur on 3 different machines (all Dell 1600SCs
with 7506-4LP controllers). I don't know if there is a design fault
with the 3ware hardware or the Maxtor disks that means they don't play
well together. I would guess this is a fairly popular hardware
configuration - and I haven't read any problem reports about operating
systems other than FreeBSD.
BTW I did contact 3ware support, but heard nothing back. This may be
because the problem report I submitted was too vague. I will try again
if you think they might be able to help.
db> ps
pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
229 dffbcc20 dfffc000 0 227 227 004004 3 getblk cfa1a03c atrun
228 dffbcdc0 dffe0000 0 226 226 8000004 3 spread cfa161d4 sh
227 dffbcf60 dffd8000 0 225 227 004084 3 wait dffbcf60 sh
226 dffbd2a0 dffe7000 0 224 226 004084 3 wait dffbd2a0 sh
225 dffbd100 dfff3000 0 92 92 000084 3 piperd dfebe3e0 cron
224 dffbd780 dffaa000 0 92 92 000084 3 piperd dfebe700 cron
218 dffbd440 dffdc000 1003 210 218 004106 3 inode c3503d00 systat
210 dffbd5e0 dffc7000 1003 209 210 2004086 3 pause dffc7260 csh
209 dffbde00 dffbe000 1003 207 94 000184 3 select c04bd588 sshd
207 dffbd920 dffc2000 0 94 94 000184 3 sbwait ddac4268 sshd
196 dffbdc60 dffca000 0 162 196 004086 3 ttyin c1ddb430 csh
181 dffbdac0 dffcf000 0 155 181 004006 3 physstr cfa16088 dd
171 dc059ea0 dffae000 1003 170 171 004086 3 ttyin c3506830 csh
170 dc05a1e0 dff9c000 1003 159 94 000184 3 select c04bd588 sshd
162 dc05a040 dffa1000 1003 161 162 2004086 3 pause dffa1260 csh
161 dc05a520 dff7d000 1003 157 94 000184 3 select c04bd588 sshd
159 dc05a380 dff96000 0 94 94 000184 3 sbwait ddac47a8 sshd
157 dc05a6c0 dff88000 0 94 94 000184 3 sbwait ddac4348 sshd
155 dc05cdc0 dfeb8000 0 151 155 2004086 3 pause dfeb8260 csh
151 dc05a860 dff6b000 0 1 151 004186 3 wait dc05a860 login
150 dc05aa00 dff67000 0 1 150 004086 3 ttyin c3571210 getty
149 dc05aba0 dff63000 0 1 149 004086 3 ttyin c3571410 getty
148 dc05ad40 dff5f000 0 1 148 004086 3 ttyin c3571610 getty
147 dc05aee0 dff5b000 0 1 147 004086 3 ttyin c3571810 getty
146 dc05b3c0 dff45000 0 1 146 004086 3 ttyin c3571a10 getty
145 dc05b560 dff3a000 0 1 145 004086 3 ttyin c3571c10 getty
144 dc05ba40 dff32000 0 1 144 004086 3 ttyin c356be10 getty
143 dc05cf60 dfeb0000 0 1 143 004086 3 ttyin c318d110 getty
140 dc05b080 dff50000 0 1 140 000085 3 select c04bd588 nmbd
138 dc05b220 dff3f000 0 1 138 000085 3 select c04bd588 smbd
132 dc05b8a0 dff36000 0 130 10 000086 3 nanslp c04a3910 3dmd
131 dc05c740 dfef5000 0 130 10 000086 3 accept ddac2ff2 3dmd
130 dc05bf20 dff19000 0 1 10 000086 3 nanslp c04a3910 3dmd
129 dc05b700 dff2b000 0 1 129 000084 3 select c04bd588 rsync
102 dc05bbe0 dff25000 25 1 102 2000184 3 pause dff25260 sendmail
99 dc05bd80 dff21000 0 1 99 000184 3 select c04bd588 sendmail
96 dc05c0c0 dff15000 0 1 96 000084 3 select c04bd588 usbd
94 dc05c260 dff11000 0 1 94 000184 3 select c04bd588 sshd
92 dc05c400 dff0b000 0 1 92 000084 3 nanslp c04a3910 cron
90 dc05c5a0 dfef9000 0 1 90 000084 3 select c04bd588 inetd
83 dc05c8e0 dfec9000 0 1 83 000084 3 select c04bd588 ntpd
79 dc05ca80 dfec4000 0 1 79 000004 3 getblk cfa1ea28 syslogd
31 dc05cc20 dfec0000 0 1 31 2000084 3 pause dfec0260 adjkerntz
9 dc05d100 deb18000 0 0 0 000204 3 getblk cfa1a03c syncer
8 dc05d2a0 deb15000 0 0 0 000204 3 vlruwt dc05d2a0 vnlru
7 dc05d440 deb12000 0 0 0 000204 3 psleep c04a3ae4 bufdaemon
6 dc05d5e0 deb0f000 0 0 0 000204 3 psleep c04b2c20 vmdaemon
5 dc05d780 deb0c000 0 0 0 000204 3 psleep c047cdf8 pagedaemon
4 dc05d920 dda8e000 0 0 0 000204 3 usbtsk c04c2778 usbtask
3 dc05dac0 dda8b000 0 0 0 000204 3 usbevt c318f210 usb0
2 dc05dc60 dda65000 0 0 0 000204 3 tqthr c04bd584 taskqueue
1 dc05de00 dc062000 0 0 1 004284 3 wait dc05de00 init
0 c04bc8a0 c0579000 0 0 0 000204 3 sched c04bc8a0 swapper
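As a rough aid to reading the process list above, the entries can be grouped by wait message (the wmesg column) to see which processes are sleeping on what. The following Python sketch does this for a few lines copied from the listing. The function names and the `DISK_WAITS` set are hypothetical, and treating getblk, physstr, spread and inode as disk-related waits is an assumption, not something confirmed in this thread.

```python
from collections import defaultdict

# Assumption: these wait messages mean the process is blocked on disk I/O.
DISK_WAITS = {"getblk", "physstr", "spread", "inode"}

# A few lines copied from the DDB "ps" output in this mail, including one
# entry whose command name was wrapped onto the following line.
SAMPLE = """\
229 dffbcc20 dfffc000 0 227 227 004004 3 getblk cfa1a03c atrun
228 dffbcdc0 dffe0000 0 226 226 8000004 3 spread cfa161d4 sh
218 dffbd440 dffdc000 1003 210 218 004106 3 inode c3503d00 systat
181 dffbdac0 dffcf000 0 155 181 004006 3 physstr cfa16088 dd
94 dc05c260 dff11000 0 1 94 000184 3 select c04bd588 sshd
79 dc05ca80 dfec4000 0 1 79 000004 3 getblk cfa1ea28 syslogd
9 dc05d100 deb18000 0 0 0 000204 3 getblk cfa1a03c syncer
7 dc05d440 deb12000 0 0 0 000204 3 psleep c04a3ae4
bufdaemon
"""


def group_by_wmesg(ps_text):
    """Map each wmesg to a list of (pid, cmd) in listing order."""
    groups = defaultdict(list)
    pending = None  # a 10-field row still waiting for its wrapped cmd
    for line in ps_text.splitlines():
        fields = line.split()
        if pending is not None and len(fields) == 1:
            # Wrapped command name belonging to the previous row.
            groups[pending[8]].append((int(pending[0]), fields[0]))
            pending = None
            continue
        pending = None
        if not fields or not fields[0].isdigit():
            continue  # header or unrelated line
        if len(fields) == 11:
            groups[fields[8]].append((int(fields[0]), fields[10]))
        elif len(fields) == 10:
            pending = fields
    return dict(groups)


def disk_blocked(groups):
    """Return (pid, cmd, wmesg) for entries whose wmesg is in DISK_WAITS."""
    return [(pid, cmd, w) for w in sorted(DISK_WAITS)
            for pid, cmd in groups.get(w, [])]


if __name__ == "__main__":
    for pid, cmd, wmesg in disk_blocked(group_by_wmesg(SAMPLE)):
        print(pid, cmd, wmesg)
```

Under that assumption, the full listing has six such entries (atrun, sh, systat, dd, syslogd and syncer), while everything else sits in ordinary sleeps such as select, ttyin or pause.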