kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type
John Baldwin
jhb at FreeBSD.org
Sun Jan 20 04:30:01 UTC 2013
The following reply was made to PR kern/172113; it has been noted by GNATS.
From: John Baldwin <jhb at FreeBSD.org>
To: bug-followup at FreeBSD.org, egrosbein at rdtc.ru
Cc: jfv at FreeBSD.org, George Neville-Neil <gnn at FreeBSD.org>
Subject: Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in
igb(4): m_getjcl: invalid cluster type
Date: Sat, 19 Jan 2013 23:26:17 -0500
I was finally able to reproduce this panic today. It seems to require
a server that is configured for PXE boot but receives no DHCP reply
(and possibly the requisite SuperMicro X8 board). I was able to
prevent the panic with a subset of the referenced patch by only adding
the 'if_drv_flags & IFF_DRV_RUNNING' check to the start of
igb_msix_que(). The rest of the patch was unnecessary. I also added
some debugging to print out the ICR, EICR, IMS, and EIMS registers in
this case. It does look like the hardware is sending an interrupt that
is not enabled in the interrupt mask (specifically LSC). In fact, the
82576 datasheet specifically mentions masking LSC until initialization
is complete to avoid spurious interrupts during boot. AFAICT igb(4)
does this, since e1000_reset_hw() clears the interrupt mask via writes
to IMC, and interrupts are not re-enabled until igb_init_locked() is
invoked via 'ifconfig up'. Here is my debug output:
SMP: AP CPU #6 Launched!
SMP: AP CPU #4 Launched!
stray irq0
igb0: interrupt on que 0: icr 0x1000004 eicr 0
ims 0 eims 0x80000000
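As a rough illustration of how the values in that output line can be
read, the following standalone userspace sketch checks which causes are
asserted in ICR but not enabled in IMS. The two bit values are
assumptions taken from the e1000 shared-code headers (LSC as 0x00000004,
the EIMS "other" bit as 0x80000000) and should be verified against your
tree; the helper names are hypothetical and are not part of igb(4).

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed bit values from the e1000 shared-code headers; verify locally. */
#define ICR_LSC    0x00000004u  /* link status change cause */
#define EIMS_OTHER 0x80000000u  /* "other causes" bit in the extended mask */

/* Causes that are asserted in ICR but not enabled in IMS. */
static uint32_t
disabled_causes(uint32_t icr, uint32_t ims)
{
	return (icr & ~ims);
}

/* True if LSC fired even though it is masked off in IMS. */
static bool
lsc_is_spurious(uint32_t icr, uint32_t ims)
{
	return ((disabled_causes(icr, ims) & ICR_LSC) != 0);
}

/* True if the "other causes" bit is still enabled in EIMS. */
static bool
other_vector_enabled(uint32_t eims)
{
	return ((eims & EIMS_OTHER) != 0);
}
```

With the values printed above (icr 0x1000004, ims 0, eims 0x80000000),
lsc_is_spurious() and other_vector_enabled() both return true under
these assumptions, which matches the reading that LSC fired while masked
and that the EIMS bit was left on.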
Hmmm. Nothing clears EIMS. After some more debugging, I determined
that e1000_reset_hw() always turns this bit in EIMS on, even if it is
off before e1000_reset_hw() is called(!). I added explicit calls to
igb_disable_intr() to clear EIMS after each call to e1000_reset_hw().
This removes the 'stray irq0', but I still get a spurious interrupt
during boot (albeit with eims 0). I can use the IFF_DRV_RUNNING hack
for now, but I think the real fix is something else.
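The two workarounds described in this message can be modeled in a small
userspace sketch. This is not the driver code and not the actual patch:
the struct, the functions, and the MODEL_* constants are all
hypothetical stand-ins (MODEL_DRV_RUNNING stands in for IFF_DRV_RUNNING,
and model_reset_hw() only imitates the observed behavior of
e1000_reset_hw() leaving an EIMS bit set).

```c
#include <stdint.h>
#include <stdbool.h>

#define MODEL_EIMS_OTHER  0x80000000u /* bit observed set after reset */
#define MODEL_DRV_RUNNING 0x1u        /* stand-in for IFF_DRV_RUNNING */

/* Hypothetical, heavily reduced model of the adapter state. */
struct model_adapter {
	uint32_t eims;       /* extended interrupt mask */
	uint32_t drv_flags;  /* stand-in for if_drv_flags */
	unsigned handled;    /* interrupts actually processed */
};

/* Imitates the observed reset behavior: the EIMS bit ends up set. */
static void
model_reset_hw(struct model_adapter *a)
{
	a->eims |= MODEL_EIMS_OTHER;
}

/* Imitates an explicit interrupt disable: clear the extended mask. */
static void
model_disable_intr(struct model_adapter *a)
{
	a->eims = 0;
}

/* Workaround 1: always disable interrupts right after a reset. */
static void
model_reset(struct model_adapter *a)
{
	model_reset_hw(a);
	model_disable_intr(a);
}

/*
 * Workaround 2: the queue handler ignores interrupts that arrive
 * before the interface is marked running. Returns true if handled.
 */
static bool
model_msix_que(struct model_adapter *a)
{
	if ((a->drv_flags & MODEL_DRV_RUNNING) == 0)
		return (false);
	a->handled++;
	return (true);
}
```

In this model a spurious interrupt delivered before the running flag is
set is dropped by model_msix_que(), and the mask is zero after
model_reset(). As noted above, this only hides the symptom; it does not
explain why the hardware raises the interrupt in the first place.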
--
John Baldwin