kern/124127: [msk] watchdog timeout (missed Tx interrupts) -- recovering
Pyun YongHyeon
pyunyh at gmail.com
Thu Oct 29 16:50:03 UTC 2009
The following reply was made to PR kern/124127; it has been noted by GNATS.
From: Pyun YongHyeon <pyunyh at gmail.com>
To: Mark Atkinson <atkin901 at yahoo.com>
Cc: freebsd-net at freebsd.org, bug-followup at FreeBSD.org
Subject: Re: kern/124127: [msk] watchdog timeout (missed Tx interrupts) -- recovering
Date: Thu, 29 Oct 2009 09:49:09 -0700
On Thu, Oct 29, 2009 at 06:52:34AM -0700, Mark Atkinson wrote:
> Wow, not sure what to blame for that charset nightmare. Apologies.
> Here's the original message:
>
> On the unpatched -current kernel, built
>
> FreeBSD hellfire.filament.org 9.0-CURRENT FreeBSD 9.0-CURRENT #14: Mon
> Oct 19 09:12:03 PDT 2009
>
> I received the following panic today related to this:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0xdeadc10a
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0xc0987410
> stack pointer = 0x28:0xd533dac0
> frame pointer = 0x28:0xd533dae8
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 0 (mskc0 taskq)
> Physical memory: 495 MB
> Dumping 132 MB: 117 101 85 69 53 37 21 5
>
> Reading symbols from /boot/kernel/linux.ko...Reading symbols from
> /boot/kernel/linux.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/linux.ko
> #0 0xc08907a9 in doadump () at /usr/src/sys/kern/kern_shutdown.c:254
> 254 }
> (kgdb) bt
> #0 0xc08907a9 in doadump () at /usr/src/sys/kern/kern_shutdown.c:254
> #1 0xc04f7e37 in db_fncall (dummy1=-1067299898, dummy2=0,
> dummy3=-718022488,
> dummy4=0xd533d898 "\200%t?") at /usr/src/sys/ddb/db_command.c:548
> #2 0xc04f8214 in db_command (last_cmdp=0xc0da059c, cmd_table=0x0,
> dopager=1)
> at /usr/src/sys/ddb/db_command.c:445
> #3 0xc04f8352 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> #4 0xc04fa05e in db_trap (type=12, code=0) at
> /usr/src/sys/ddb/db_main.c:229
> #5 0xc08bf2d2 in kdb_reenter () at /usr/src/sys/kern/subr_kdb.c:398
> #6 0xc0ba9b62 in trap_fatal (frame=0x1, eva=3735929098)
> at /usr/src/sys/i386/i386/trap.c:938
> #7 0xc0baa483 in trap (frame=0xd533da80) at
> /usr/src/sys/i386/i386/trap.c:339
> #8 0xc0b8e4ab in Xlcall_syscall () at
> /usr/src/sys/i386/i386/exception.s:241
> #9 0xc0987410 in in_lltable_lookup (llt=0xc39e1000, flags=Variable
> "flags" is not available.
> )
> at /usr/src/sys/netinet/in.c:1380
> #10 0xc0982470 in arpintr (m=0xc3baeb00) at
> /usr/src/sys/netinet/if_ether.c:642
> #11 0xc094227a in netisr_dispatch_src (proto=7, source=0, m=0xc0de)
> at /usr/src/sys/net/netisr.c:932
> #12 0xc09424dd in netisr_unregister (nhp=0xc0de)
> at /usr/src/sys/net/netisr.c:583
> #13 0xc093ac69 in ether_demux (ifp=0x0, m=0xc3baeb00)
> at /usr/src/sys/net/if_ethersubr.c:911
> #14 0xc093b1ce in ether_output (ifp=0xc36ad400, m=0xc3baeb00,
> dst=0xc0c55c27,
> ro=0x301010a) at /usr/src/sys/net/if_ethersubr.c:181
> #15 0xc070b032 in msk_handle_events (sc=0xc3686c00)
> at /usr/src/sys/dev/msk/if_msk.c:3048
> #16 0xc070b828 in msk_int_task (arg=0xc3686c00, pending=1)
> at /usr/src/sys/dev/msk/if_msk.c:3625
> #17 0xc08cac8c in taskqueue_run (queue=0xc36bf380)
> at /usr/src/sys/kern/subr_taskqueue.c:72
> #18 0xc08cadcc in taskqueue_thread_loop (arg=0xc3686c8c)
> at /usr/src/sys/kern/subr_taskqueue.c:90
> #19 0xc0869271 in fork_exit (callout=0xc08cad67 <taskqueue_thread_loop+64>,
> arg=0xc3686c8c, frame=0xd533dd38) at /usr/src/sys/kern/kern_fork.c:854
> #20 0xc0b8e520 in Xatpic_intr0 () at atpic_vector.s:62
> #21 0x00000000 in ?? ()
>
I don't think this is a bug in msk(4). Qin Li fixed the bug in the
ARP code. See r198301.
For the watchdog timeout issues on the 88E8053 controller, did you
ever try disabling MSI? msk(4) has changed a lot since 7.0-RELEASE
to support newer controllers, and several workarounds were added to
address silicon bugs, so don't blindly apply experimental patches to
your controller. The 88E8053 also has a couple of hardware bugs, but
I believe msk(4) already incorporates the required workarounds. If
you can reliably reproduce the watchdog timeouts, please let me know.
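
For reference, a minimal sketch of how MSI could be disabled for a
test, assuming the hw.msk.msi_disable loader tunable described in the
msk(4) manual page is present in the kernel you are running (check
the manual page for your version before relying on it):

```
# /boot/loader.conf
# Assumption: the hw.msk.msi_disable tunable exists in this msk(4) version.
# Setting it to 1 makes the driver fall back to legacy (INTx) interrupts.
hw.msk.msi_disable="1"
```

The setting takes effect at the next boot. If the watchdog timeouts
stop with MSI disabled, that narrows the problem to MSI handling
rather than the Tx path itself.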