mps(4) blocks panic-reboot
Harry Schmalzbauer
freebsd at omnilan.de
Fri Jun 2 16:56:45 UTC 2017
Regarding Kenneth D. Merry's message of 02.06.2017 17:37 (localtime):
> On Fri, Jun 02, 2017 at 14:30:44 +0200, Harry Schmalzbauer wrote:
…
>> KDB: stack backtrace:
>> #0 0xffffffff805df4f7 at kdb_backtrace+0x67
>> #1 0xffffffff8059df96 at vpanic+0x186
>> #2 0xffffffff8059de03 at panic+0x43
>> #3 0xffffffff808a1892 at trap_fatal+0x322
>> #4 0xffffffff808a18e9 at trap_pfault+0x49
>> #5 0xffffffff808a1126 at trap+0x286
>> #6 0xffffffff80887401 at calltrap+0x8
>> #7 0xffffffff805800f2 at __mtx_unlock_sleep+0x72
>> #8 0xffffffff8029a7dc at xpt_polled_action+0x31c
>> #9 0xffffffff80416c2b at mpssas_ir_shutdown+0x51b
>> #10 0xffffffff8059db9a at kern_reboot+0x49a
>> #11 0xffffffff8059d6f8 at sys_reboot+0x458
>> #12 0xffffffff808a23f4 at amd64_syscall+0x6c4
>> #13 0xffffffff808876eb at Xfast_syscall+0xfb
>>
>> (kgdb) list *0xffffffff805f43ec
>> 0xffffffff805f43ec is in turnstile_broadcast
>> (/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_turnstile.c:837).
>> 832
>> 833 /*
>> 834 * Transfer the blocked list to the pending list.
>> 835 */
>> 836 mtx_lock_spin(&td_contested_lock);
>> 837 TAILQ_CONCAT(&ts->ts_pending, &ts->ts_blocked[queue],
>> td_lockq);
>> 838 mtx_unlock_spin(&td_contested_lock);
>> 839
>> 840 /*
>> 841 * Give a turnstile to each thread. The last thread gets
>>
>> I haven't looked at the code at all and only very briefly looked at the
>> diff, just out of curiosity, like pigs staring at clockworks ;-)
>>
>> But at least I hope this report does help.
>
> Thanks for testing it!
>
> My guess is that the problem is that xpt_polled_action()
> releases the device mutex, but mpssas_SSU_to_SATA_devices() isn't acquiring
> the mutex.
>
> You could try putting the following around the call to xpt_polled_action():
>
> mtx_lock(xpt_path_mtx(ccb->ccb_h.path));
> xpt_polled_action(ccb);
> mtx_unlock(xpt_path_mtx(ccb->ccb_h.path));
>
> See if that fixes things. One other thing to put in there -- after the
> if (target->stop_at_shutdown) { } statement, but still inside the for
> loop, add these two lines:
>
> xpt_free_path(ccb->ccb_h.path);
> xpt_free_ccb(ccb);
I hope I didn't mess up the text editing; please see the attached hunk and
check whether it corresponds to the (additional) changes to Stephen's diff.
With it applied I get a series of panics?!? (The next one came very quickly
after the dump of the first panic was written.)
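
For readers without the attachment, here is a rough user-space sketch of the
pattern Ken describes: take the path mutex around the polled action, then
free the path and the CCB after the if block but still inside the loop. It is
not the attached hunk and not the mps(4) source. Every mock_* name is a
hypothetical stand-in, and two things are assumptions: that the polled action
drops and retakes the mutex (Ken's guess above) and that one CCB is allocated
per target.

```c
/*
 * User-space mock of the suggested locking pattern. Nothing here is real
 * CAM or mps(4) code; every mock_* name is a hypothetical stand-in.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct mock_mtx  { bool held; };
struct mock_path { struct mock_mtx mtx; };
struct mock_ccb  { struct mock_path *path; bool completed; };

static int mock_live_allocs;	/* outstanding paths + CCBs */

static void mock_mtx_lock(struct mock_mtx *m)   { assert(!m->held); m->held = true; }
static void mock_mtx_unlock(struct mock_mtx *m) { assert(m->held);  m->held = false; }

/*
 * Assumed behaviour: the polled action drops and retakes the path mutex,
 * so the caller must already hold it when calling in.
 */
static void
mock_polled_action(struct mock_ccb *ccb)
{
	mock_mtx_unlock(&ccb->path->mtx);	/* aborts if caller did not lock */
	ccb->completed = true;
	mock_mtx_lock(&ccb->path->mtx);
}

static struct mock_ccb *
mock_alloc_ccb(void)
{
	struct mock_ccb *ccb = calloc(1, sizeof(*ccb));

	assert(ccb != NULL);
	ccb->path = calloc(1, sizeof(*ccb->path));
	assert(ccb->path != NULL);
	mock_live_allocs += 2;
	return (ccb);
}

static void mock_free_path(struct mock_path *p) { free(p); mock_live_allocs--; }
static void mock_free_ccb(struct mock_ccb *c)   { free(c); mock_live_allocs--; }

/* Returns the number of stop-unit requests that completed. */
static int
mock_stop_units(const bool *stop_at_shutdown, int ntargets)
{
	int done = 0;

	for (int i = 0; i < ntargets; i++) {
		struct mock_ccb *ccb = mock_alloc_ccb();

		if (stop_at_shutdown[i]) {
			/* Suggestion 1: hold the path mutex around the call. */
			mock_mtx_lock(&ccb->path->mtx);
			mock_polled_action(ccb);
			mock_mtx_unlock(&ccb->path->mtx);
			if (ccb->completed)
				done++;
		}
		/* Suggestion 2: free after the if block, inside the loop. */
		mock_free_path(ccb->path);	/* path first, it is read from ccb */
		mock_free_ccb(ccb);
	}
	return (done);
}
```

Note that in this mock nothing else ever frees the path or the CCB. If a
completion callback in the real driver already released them, the unlock and
the two free calls would touch freed memory; I have not checked whether that
is the case.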
ums1: detached
mps0: Sending StopUnit: path (xpt0:mps0:0:2:ffffffff): handle 12
mps0: Completing stop unit for (xpt0:mps0:0:2:ffffffff):
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x478
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff80416cca
stack pointer = 0x28:0xfffffe03bc9c37f0
frame pointer = 0x28:0xfffffe03bc9c3880
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 1 (init)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff805df5c7 at kdb_backtrace+0x67
#1 0xffffffff8059e066 at vpanic+0x186
#2 0xffffffff8059ded3 at panic+0x43
#3 0xffffffff808a1962 at trap_fatal+0x322
#4 0xffffffff808a19b9 at trap_pfault+0x49
#5 0xffffffff808a11f6 at trap+0x286
#6 0xffffffff808874d1 at calltrap+0x8
#7 0xffffffff8059dc6a at kern_reboot+0x49a
#8 0xffffffff8059d7c8 at sys_reboot+0x458
#9 0xffffffff808a24c4 at amd64_syscall+0x6c4
#10 0xffffffff808877bb at Xfast_syscall+0xfb
Uptime: 1m15s
(da0:mps0:0:2:0): Synchronize cache failed
Dumping 1277 out of 15734
…
#0 doadump (textdump=<value optimized out>) at pcpu.h:222
222 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) list *0xffffffff80416cca
0xffffffff80416cca is in mpssas_ir_shutdown (atomic.h:188).
183 atomic.h: No such file or directory.
in atomic.h
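
One hedged reading of the trap above: a fault address as small as 0x478 on a
write from an atomic operation is what you typically see when a field is
accessed through a NULL (or near-NULL) base pointer, because the faulting
address is then just the field's offset. The sketch below only illustrates
that arithmetic. The struct is hypothetical, padded so that the field lands at
0x478; it is not a real kernel structure, and which structure is actually
involved here is unknown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, padded so that lock_word sits at offset 0x478. */
struct mock_object {
	char		pad[0x478];
	uintptr_t	lock_word;
};

static_assert(offsetof(struct mock_object, lock_word) == 0x478,
    "lock_word is expected at offset 0x478");

/*
 * Address that a write to obj->lock_word would touch for a given base
 * address, computed without dereferencing anything.
 */
static uintptr_t
mock_field_address(uintptr_t base)
{
	return (base + offsetof(struct mock_object, lock_word));
}
```

With a base of 0 this gives 0x478, which matches the shape of the reported
fault address but proves nothing about the actual cause.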
Should I reduce compiler optimization?
Thanks,
-harry