igb watchdog timeouts
Charles Owens
cowens at greatbaysoftware.com
Fri Jan 14 21:47:34 UTC 2011
Thanks for all the feedback on polling, Jack and others. Very helpful.
We are working to merge the latest RELENG_8 em/igb driver into our
custom build that's based on RELENG_8_1. I've been able to create a
patch using the following command:
cvs di -N -up -jRELENG_8_1 -jRELENG_8 sys/dev/e1000 sys/dev/ixgb \
    sys/dev/ixgbe sys/conf/files > /tmp/e1000.diff
... and then hand-trimming the sys/conf/files part down to only the relevant bits. It
compiled and seems to be functioning, but I wouldn't mind a sanity
check on my methodology. In particular:
* Some of the patches overlapped with sys/dev/ixgb and sys/dev/ixgbe, so I
included them. Should I have?
* Is there anything else I should have included?
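One way to sanity-check which paths a patch like this touches is to list the files it names, grouped by directory. What follows is only a minimal Python sketch, not part of the procedure described above. It assumes the diff contains the "Index: <path>" header lines that cvs diff normally emits; the function name, the depth parameter, and the sample paths are illustrative.

```python
from collections import defaultdict


def files_by_directory(diff_text, depth=3):
    """Group the paths named on 'Index:' lines by their leading directories.

    Assumes cvs-style 'Index: <path>' headers; 'depth' (hypothetical knob)
    limits how many directory components form the group key.
    """
    groups = defaultdict(list)
    for line in diff_text.splitlines():
        if not line.startswith("Index: "):
            continue
        path = line[len("Index: "):].strip()
        parts = path.split("/")
        # Use at most `depth` directory components, never the file name itself.
        dir_parts = parts[:-1][:depth]
        key = "/".join(dir_parts) or "."
        groups[key].append(path)
    return dict(groups)


# Hypothetical header lines, standing in for the contents of /tmp/e1000.diff.
SAMPLE_DIFF = "\n".join([
    "Index: sys/conf/files",
    "Index: sys/dev/e1000/if_igb.c",
    "Index: sys/dev/e1000/if_em.c",
    "Index: sys/dev/ixgbe/ixgbe.c",
])

if __name__ == "__main__":
    for directory, paths in sorted(files_by_directory(SAMPLE_DIFF).items()):
        print(f"{directory}: {len(paths)} file(s)")
```

Running the same function over the real diff text (for example, the contents of /tmp/e1000.diff read into a string) would show at a glance whether anything outside the intended directories slipped into the patch.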
Thanks very much,
Charles
On 1/13/11 4:49 PM, Jack Vogel wrote:
> Polling has seemed to me to be a way around other problems, problems
> that these days no longer exist. I remember back in the FreeBSD 6 days
> having interrupt problems which of course also led to watchdogs. Polling
> got rid of that. But now there are dedicated MULTIPLE interrupts by using
> MSIX, so that reason for polling is gone.
>
> Of course there can still be advantages, reducing interrupts and hence
> context switches, which is why the Linux approach does what it does.
>
> I have not spent time with that issue; it's good to know that there
> could be problems lurking with it. But if you can simply go with MSIX
> I would do that for now.
>
> Jack
>
>
> On Thu, Jan 13, 2011 at 1:42 PM, Charles Owens
> <cowens at greatbaysoftware.com> wrote:
>
> So we went back to basics (stock 8.1-RELEASE) and found no
> issue! We then added in our kernel mods one by one and
> ultimately discovered that device-polling is the culprit (the
> kernel config was simply GENERIC + PAE + polling).
>
> Immediately upon running "ifconfig igb0 polling" the symptoms appear.
>
> This is very good news overall, in that we can certainly disable
> polling for igb. This raises the question, though, as to whether
> polling is recommended these days at all for em/igb NICs... or
> even in general. From other conversations we've seen there seems
> to be some general debate about this. In testing we've done in
> the past (circa 7.0) there certainly seemed to be benefit to using
> this feature. What are your thoughts about this?
>
> For our product releases we'd like to stay with RELENG_8_1. Would
> you recommend the driver in 8.2 as being preferable?
>
> In case it's of interest:
>
> igb0@pci0:1:0:0: class=0x020000 card=0x34de8086 chip=0x10a78086 rev=0x02 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82575EB Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>
>
>
> Thanks,
> Charles
>
>
>
> On 1/13/11 1:27 PM, Jack Vogel wrote:
>> The 8.2 latest does have the latest igb, so using that should be
>> indicative...
>>
>> Jack
>>
>>
>> On Thu, Jan 13, 2011 at 7:56 AM, Charles Owens
>> <cowens at greatbaysoftware.com> wrote:
>>
>> Ok... I got my wires crossed: our first time testing 8.1 on
>> this particular platform was with a kernel that had ichwd
>> enabled (a new thing for us) and so when igb started
>> complaining about "watchdog" we thought it was related.
>>
>> We've tested again and clearly the real story is that we're
>> simply seeing igb issues, symptoms similar to those described.
>>
>> Does 8.2-RC1 have sufficiently "latest" code, or should I be
>> looking to load up something else? (8-stable, maybe?)
>>
>> Thanks,
>> Charles
>>
>>
>>
>> On 1/13/11 12:07 AM, Jack Vogel wrote:
>>> The problem that Robin saw was due to having MSIX interrupts
>>> disabled on the system; I doubt that is going to be the "issue"
>>> for others.
>>>
>>> Get the latest version of the igb code and see if that helps
>>> you as a first step.
>>>
>>> Jack
>>>
>>>
>>> On Wed, Jan 12, 2011 at 6:43 PM, Charles Owens
>>> <cowens at greatbaysoftware.com> wrote:
>>>
>>> I'd like to report that we're running into this issue
>>> also, in our case on systems that are based on the Intel
>>> S5520UR Server Board, running 8.1-RELEASE. If the ichwd
>>> driver is loaded we see the same messages, and network
>>> communication via the igb nics is non-functional.
>>>
>>> Have you had any luck?
>>>
>>> Thanks,
>>> Charles
>>>
>>> Charles Owens
>>> Great Bay Software, Inc.
>>>
>>>
>>>
>>>
>>> On 1/3/11 4:02 PM, Robin Sommer wrote:
>>>
>>> Hello all,
>>>
>>> quite a while ago I asked about the problem below. Unfortunately, I
>>> haven't found a solution yet and I'm actually still seeing these
>>> timeouts after just upgrading to 8.2-RC1. Any further ideas on what
>>> could be triggering them, or how I could track down the cause?
>>>
>>> Thanks,
>>>
>>> Robin
>>>
>>> On Thu, Jul 29, 2010 at 14:56 -0700, I wrote:
>>>
>>> Since upgrading from 8.0 to 8.1-RELEASE, I'm seeing lots of messages
>>> like those below on all my SuperMicro SBI-7425C-T3 blades. There's
>>> almost no traffic on those interfaces.
>>>
>>> Any idea?
>>>
>>> Thanks,
>>>
>>> Robin
>>>
>>> Jul 29 13:01:18 blade0 kernel: igb1: Watchdog timeout -- resetting
>>> Jul 29 13:01:18 blade0 kernel: igb1: Queue(0) tdh = 256, hw tdt = 266
>>> Jul 29 13:01:18 blade0 kernel: igb1: TX(0) desc avail = 1013,Next TX to Clean = 255
>>> Jul 29 13:01:18 blade0 kernel: igb1: link state changed to DOWN
>>> Jul 29 13:01:18 blade0 kernel: igb1: link state changed to UP
>>> Jul 29 13:01:29 blade0 kernel: igb1: Watchdog timeout -- resetting
>>> Jul 29 13:01:29 blade0 kernel: igb1: Queue(0) tdh = 0, hw tdt = 10
>>> Jul 29 13:01:29 blade0 kernel: igb1: TX(0) desc avail = 1014,Next TX to Clean = 0
>>> Jul 29 13:01:29 blade0 kernel: igb1: link state changed to DOWN
>>> Jul 29 13:01:29 blade0 kernel: igb1: link state changed to UP
>>> Jul 29 13:01:46 blade0 kernel: igb1: Watchdog timeout -- resetting
>>> Jul 29 13:01:46 blade0 kernel: igb1: Queue(0) tdh = 32, hw tdt = 33
>>> Jul 29 13:01:46 blade0 kernel: igb1: TX(0) desc avail = 1022,Next TX to Clean = 31
>>> Jul 29 13:01:46 blade0 kernel: igb1: link state changed to DOWN
>>> Jul 29 13:01:46 blade0 kernel: igb1: link state changed to UP
>>> Jul 29 13:01:57 blade0 kernel: igb1: Watchdog timeout -- resetting
>>> Jul 29 13:01:57 blade0 kernel: igb1: Queue(0) tdh = 0, hw tdt = 10
>>> Jul 29 13:01:57 blade0 kernel: igb1: TX(0) desc avail = 1014,Next TX to Clean = 0
>>> Jul 29 13:01:57 blade0 kernel: igb1: link state changed to DOWN
>>> Jul 29 13:01:58 blade0 kernel: igb1: link state changed to UP
>>> Jul 29 13:02:13 blade0 kernel: igb1: Watchdog timeout -- resetting
>>>
>>> grep igb /var/run/dmesg.boot
>>>
>>> igb0: <Intel(R) PRO/1000 Network Connection version - 1.9.5> port 0x2000-0x201f mem 0xfc940000-0xfc95ffff,0xfc920000-0xfc93ffff,0xfc900000-0xfc903fff irq 16 at device 0.0 on pci4
>>> igb0: [FILTER]
>>> igb0: Ethernet address: 00:30:48:9e:22:00
>>> igb1: <Intel(R) PRO/1000 Network Connection version - 1.9.5> port 0x2020-0x203f mem 0xfc980000-0xfc99ffff,0xfc960000-0xfc97ffff,0xfc904000-0xfc907fff irq 17 at device 0.1 on pci4
>>> igb1: [FILTER]
>>> igb1: Ethernet address: 00:30:48:9e:22:01
>>>
>>> pciconf -lv
>>>
>>> [...]
>>> igb0@pci0:4:0:0: class=0x020000 card=0x10a915d9 chip=0x10a98086 rev=0x02 hdr=0x00
>>>     vendor     = 'Intel Corporation'
>>>     device     = '82575EB Gigabit Backplane Connection'
>>>     class      = network
>>>     subclass   = ethernet
>>> igb1@pci0:4:0:1: class=0x020000 card=0x10a915d9 chip=0x10a98086 rev=0x02 hdr=0x00
>>>     vendor     = 'Intel Corporation'
>>>     device     = '82575EB Gigabit Backplane Connection'
>>>     class      = network
>>>     subclass   = ethernet
>>> [...]
>>>
>>>
>>> _______________________________________________
>>> freebsd-net at freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>>>
>>>
>>
>
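For the watchdog messages quoted above, it can help to see how regularly the resets recur. The following is a rough Python sketch, not something from the thread itself: it assumes each syslog entry sits on a single line (as it would in /var/log/messages, before mail wrapping), and the function name and the year argument are illustrative. Syslog timestamps carry no year, so 2010 is supplied from the date of Robin's original report.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Jul 29 13:01:18 blade0 kernel: igb1: Watchdog timeout -- resetting
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"(?P<nic>igb\d+): Watchdog timeout"
)


def watchdog_intervals(lines, year=2010):
    """Return {interface: [seconds between consecutive watchdog resets]}."""
    last_seen = {}
    intervals = {}
    for line in lines:
        match = WATCHDOG_RE.match(line)
        if not match:
            continue  # ignore link-state and queue-dump lines
        when = datetime.strptime(
            f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
        )
        nic = match["nic"]
        if nic in last_seen:
            gap = int((when - last_seen[nic]).total_seconds())
            intervals.setdefault(nic, []).append(gap)
        last_seen[nic] = when
    return intervals


# Watchdog lines taken from the log excerpt quoted in this thread,
# plus one link-state line to show that non-watchdog entries are skipped.
SAMPLE_LOG = [
    "Jul 29 13:01:18 blade0 kernel: igb1: Watchdog timeout -- resetting",
    "Jul 29 13:01:18 blade0 kernel: igb1: link state changed to DOWN",
    "Jul 29 13:01:29 blade0 kernel: igb1: Watchdog timeout -- resetting",
    "Jul 29 13:01:46 blade0 kernel: igb1: Watchdog timeout -- resetting",
    "Jul 29 13:01:57 blade0 kernel: igb1: Watchdog timeout -- resetting",
    "Jul 29 13:02:13 blade0 kernel: igb1: Watchdog timeout -- resetting",
]

if __name__ == "__main__":
    for nic, gaps in watchdog_intervals(SAMPLE_LOG).items():
        print(nic, gaps)
```

On the quoted excerpt the gaps come out at 11, 17, 11 and 16 seconds. That says only that the resets recur every ten to twenty seconds in this sample; it does not by itself point to a cause.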