coredump when loading cxgb after boot with routing daemon already running (RELENG11)

Navdeep Parhar nparhar at gmail.com
Wed Jan 4 19:26:33 UTC 2017


Please file a bug against the network stack.

Is zebra easy to install/configure?  Send me details of your
configuration offline and I can try it on head if it's something
straightforward.

Regards,
Navdeep


On Wed, Jan 4, 2017 at 11:15 AM, Mike Tancsa <mike at sentex.net> wrote:
> On 1/4/2017 2:07 PM, Navdeep Parhar wrote:
>> What source line in releng-11 does ifioctl+0x6dd correspond to?
>>
>> (kgdb) l *(ifioctl+0x6dd)
>>
>> This might be a race where the ifnet is being created or coming up and
>> zebra pokes it in some way before it's fully ready.  If that's the
>> case it could affect any ifnet.
>
> Hi Navdeep,
>         Thanks for looking. Yes, I just tried it with igb and got a similar panic.
>
>
> igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port
> 0xc000-0xc01f mem 0xf7200000-0xf727ffff,0xf7280000-0xf7283fff irq 17 at
> device 0.0 on pci4
> igb0: Using MSIX interrupts with 5 vectors
> igb0: Ethernet address: 00:25:90:47:b5:d8
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 06
> fault virtual address   = 0x0
> fault code              = supervisor read instruction, page not present
> instruction pointer     = 0x20:0x0
> stack pointer           = 0x28:0xfffffe085d4d1728
> frame pointer           = 0x28:0xfffffe085d4d1750
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> igb0: Bound queue 0 to cpu 0
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 846 (zebra)
> trap number             = 12
> panic: page fault
> cpuid = 3
> KDB: stack backtrace:
> #0 0xffffffff806efae7 at kdb_backtrace+0x67
> #1 0xffffffff806a6006 at vpanic+0x186
> #2 0xffffffff806a5e73 at panic+0x43
> #3 0xffffffff80989622 at trap_fatal+0x322
> #4 0xffffffff809897ec at trap_pfault+0x1bc
> #5 0xffffffff80988ea0 at trap+0x280
> #6 0xffffffff8096dab1 at calltrap+0x8
> #7 0xffffffff807aa79d at ifioctl+0x6dd
> #8 0xffffffff8070d876 at kern_ioctl+0x346
> #9 0xffffffff8070d47f at sys_ioctl+0x13f
> #10 0xffffffff80989fae at amd64_syscall+0x50e
> #11 0xffffffff8096dd9b at Xfast_syscall+0xfb
> Uptime: 1m9s
> Dumping 1267 out of 32675
> MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> Dump complete
>
>
> (kgdb) l *(ifioctl+0x6dd)
> 0xffffffff807b90fd is in ifioctl (/usr/src/sys/net/if.c:2655).
> 2650            case SIOCGIFMEDIA:
> 2651            case SIOCGIFXMEDIA:
> 2652            case SIOCGIFGENERIC:
> 2653                    if (ifp->if_ioctl == NULL)
> 2654                            return (EOPNOTSUPP);
> 2655                    error = (*ifp->if_ioctl)(ifp, cmd, data);
> 2656                    break;
> 2657
> 2658            case SIOCSIFLLADDR:
> 2659                    error = priv_check(td, PRIV_NET_SETLLADDR);
> Current language:  auto; currently minimal
> (kgdb)
>
>
>
>>
>> Regards,
>> Navdeep
>>
>>
>>
>> On Wed, Jan 4, 2017 at 11:00 AM, Mike Tancsa <mike at sentex.net> wrote:
>>> I ran into a strange problem when manually loading a network driver
>>> after a RELENG_11 box starts up with a routing daemon already running.
>>>
>>> If I have zebra running (just a few static routes) and then try and do a
>>> kldload if_cxgb, the box panics.  If I boot the box, load the nic's
>>> driver and then start zebra, all is fine.
>>>
>>> At first, I thought it might be a firmware issue, but I updated the
>>> NIC's firmware and saw the same behaviour.  Not sure if this is specific
>>> to the Chelsio or if any kldload of a NIC driver will do.
>>>
>>>
>>>
>>> cxgbc0: <Chelsio T310, 1 port> mem
>>> 0xf7081000-0xf7081fff,0xf6800000-0xf6ffffff,0xf7080000-0xf7080fff irq 16
>>> at device 0.0 on pci5
>>> cxgbc0: PCIe x4 Link, expect reduced performance
>>> cxgbc0: using MSI-X interrupts (5 vectors)
>>> cxgbc0: firmware needs to be updated to version 7.11.0
>>> Jan  4 13:03:02 cxgbc0: Firmware Version 5.0.0
>>> cxgb0: <Port 0 10GBASE-SR> on cxgbc0
>>> cxgb0: Using defaults for TSO: 65518/35/2048
>>> cxgb0: Ethernet address: 00:07:43:07:9e:14
>>>
>>> found old FW minor version(5.0), driver compiled for version 7.11
>>> offsite2 kernel:Fatal trap 12: page fault while in kernel mode
>>> cpuid = 2; apic id = 04
>>> fault virtual address   = 0x0
>>> fault code              = supervisor read instruction, page not present
>>> instruction pointer     = 0x20:0x0
>>> stack pointer           = 0x28:0xfffffe085d2df728
>>> frame pointer           = 0x28:0xfffffe085d2df750
>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>> processor eflags        = interrupt enabled, resume, IOPL = 0
>>> current process         = 420 (zebra)
>>> trap number             = 12
>>> panic: page fault
>>> cpuid = 0
>>> KDB: stack backtrace:
>>> #0 0xffffffff806fe447 at kdb_backtrace+0x67
>>> #1 0xffffffff806b4966 at vpanic+0x186
>>> #2 0xffffffff806b47d3 at panic+0x43
>>> #3 0xffffffff80997f82 at trap_fatal+0x322
>>> #4 0xffffffff8099814c at trap_pfault+0x1bc
>>> #5 0xffffffff80997800 at trap+0x280
>>> #6 0xffffffff8097c411 at calltrap+0x8
>>> #7 0xffffffff807b90fd at ifioctl+0x6dd
>>> #8 0xffffffff8071c1d6 at kern_ioctl+0x346
>>> #9 0xffffffff8071bddf at sys_ioctl+0x13f
>>> #10 0xffffffff8099890e at amd64_syscall+0x50e
>>> #11 0xffffffff8097c6fb at Xfast_syscall+0xfb
>>> Uptime: 3m9s
>>> Dumping 1635 out of 32675
>>> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>> --
>>> -------------------
>>> Mike Tancsa, tel +1 519 651 3400
>>> Sentex Communications, mike at sentex.net
>>> Providing Internet services since 1994 www.sentex.net
>>> Cambridge, Ontario Canada   http://www.tancsa.com/
>>
>>
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike at sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   http://www.tancsa.com/
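
For reference, the guard shown in the quoted kgdb listing (if.c lines 2653 to 2655) can be sketched in user space roughly as below. This is only an illustration of the check-then-call pattern, not the kernel code and not a claim about the root cause: struct fake_ifnet, echo_ioctl and dispatch_ioctl are hypothetical names made up for the sketch. The one deliberate difference from the listing is that the pointer is loaded once into a local, so the value that is tested is the same value that is called.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only the handler field is modelled. */
struct fake_ifnet {
	int (*if_ioctl)(struct fake_ifnet *, unsigned long, void *);
};

/* Hypothetical driver handler: returns the command number so calls are visible. */
static int
echo_ioctl(struct fake_ifnet *ifp, unsigned long cmd, void *data)
{
	(void)ifp;
	(void)data;
	return ((int)cmd);
}

/*
 * Check-then-call guard, modelled on the listing above.  The handler
 * pointer is read once into a local before the NULL test, so a later
 * change to ifp->if_ioctl cannot slip in between the test and the call.
 * Assumption: nothing here reproduces or fixes the reported panic.
 */
static int
dispatch_ioctl(struct fake_ifnet *ifp, unsigned long cmd, void *data)
{
	int (*handler)(struct fake_ifnet *, unsigned long, void *);

	handler = ifp->if_ioctl;
	if (handler == NULL)
		return (EOPNOTSUPP);
	return (handler(ifp, cmd, data));
}
```

An interface with no handler set gets EOPNOTSUPP back from dispatch_ioctl(); one with echo_ioctl installed gets the command number back.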


More information about the freebsd-stable mailing list