cvs commit: src/sys/dev/bge if_bge.c
Scott Long
scottl at samsco.org
Sat Dec 23 21:52:03 PST 2006
Bruce Evans wrote:
> On Sun, 24 Dec 2006, Oleg Bulyzhin wrote:
>
>> On Fri, Dec 22, 2006 at 01:24:45AM +1100, Bruce Evans wrote:
>>> On Wed, 20 Dec 2006, Gleb Smirnoff wrote:
>>>>
>>>> I have a suspicion that this may cause a problem under high load.
>>>> Imagine
>>>> that thread #1 is spinning in bge_start_locked() getting packets out
>>>> of interface queue and putting them into TX ring. Some other threads
>>>> are
>>>> putting the packets into interface queue while its lock is temporarily
>>>> relinquished by thread #1. At the same time interrupts happen, some
>>>> packets are sent, but the TX ring never becomes empty.
>>>>
>>>> The above scenario will cause a fake watchdog event.
>>>
>>> bge_start_locked() starts with the bge (sc) lock held and never releases
>>> it as far as I can see. Thus this problem can't happen (the lock
>>> prevents both txeof and the watchdog from being reached before start
>>> resets the timeout to 5 seconds).
>
>> it's quite unusual) and it is not lock related:
>> 1) bge_start_locked() & bge_encap fills tx ring.
>> 2) during next 5 seconds we do not have packets for transmit (i.e. no
>> bge_start_locked() calls --> no bge_timer refreshing)
>> 3) for any reason (don't ask me how can this happen), chip was unable to
>> send whole tx ring (only part of it).
>> 4) here we have false watchdog - chip is not wedged but bge_watchdog
>> would
>> reset it.
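The scenario above is the classic argument for a progress-aware watchdog: instead of firing as soon as the timer expires, first check whether the hardware has advanced its TX consumer index since the last tick, and rearm if it has. The following is only a minimal sketch of that idea; the struct fields (`timer`, `tx_cons`) and the function name are hypothetical and not taken from the actual if_bge.c code.

```c
#include <assert.h>

/* Hypothetical softc fields, loosely modeled on typical NIC drivers. */
struct softc {
	int timer;	/* seconds until watchdog fires; 0 = no TX pending */
	int tx_cons;	/* TX consumer index seen at the last tick */
};

/*
 * Called once per second with the chip's current TX consumer index.
 * Returns 1 if the chip should be reset.  If the consumer index moved,
 * the chip is making progress (just slowly), so rearm instead of
 * declaring a wedge -- this avoids the false watchdog described above.
 */
static int
watchdog_tick(struct softc *sc, int hw_tx_cons)
{
	if (sc->timer == 0)
		return (0);		/* nothing queued */
	if (hw_tx_cons != sc->tx_cons) {
		sc->tx_cons = hw_tx_cons;
		sc->timer = 5;		/* progress seen: rearm */
		return (0);
	}
	if (--sc->timer > 0)
		return (0);		/* not expired yet */
	return (1);			/* no progress for 5s: reset */
}
```

With this check, step (3) of the scenario (chip slowly draining a partially sent ring) rearms the timer each second, and only a genuinely wedged chip trips the reset.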
>
> Then it is a true watchdog IMO. Something is very wrong if you can't send
> 512 packets in 5 seconds (or even 1 packet in 5/512 seconds).
>
No it's not wrong. You can be under heavy load and be constantly
preempted. Or you could be getting fed a steady stream of traffic
and have a driver that is smart enough to clean the TX-complete ring
in if_start if it runs out of TX slots. These effects have been
observed in at least the if_em driver.
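The if_start behavior Scott describes can be sketched roughly as below: when the transmit path finds the ring full, it opportunistically reclaims completed descriptors itself rather than waiting for txeof, so under steady load the ring may stay busy indefinitely. This is only an illustrative model; the names (`txring`, `tx_clean`, `tx_start`) and the ring bookkeeping are hypothetical, not the actual if_em code.

```c
#include <assert.h>

#define TX_RING_SIZE 512

/* Hypothetical ring bookkeeping for illustration only. */
struct txring {
	int prod;	/* next slot the driver will fill */
	int cons;	/* next slot to reclaim from the hardware */
	int free;	/* free descriptor count */
};

/* Reclaim descriptors the chip has completed (normally txeof's job). */
static void
tx_clean(struct txring *r, int hw_cons)
{
	while (r->cons != hw_cons) {
		r->cons = (r->cons + 1) % TX_RING_SIZE;
		r->free++;
	}
}

/*
 * if_start-style loop: queue up to npkts packets.  When the ring looks
 * full, try cleaning the TX-complete ring before giving up.  Returns
 * the number of packets actually queued.
 */
static int
tx_start(struct txring *r, int npkts, int hw_cons)
{
	int sent = 0;

	while (npkts--) {
		if (r->free == 0) {
			tx_clean(r, hw_cons);	/* opportunistic cleanup */
			if (r->free == 0)
				break;		/* still full: stop */
		}
		r->prod = (r->prod + 1) % TX_RING_SIZE;
		r->free--;
		sent++;
	}
	return (sent);
}
```

Because the transmit path keeps topping the ring up as fast as the chip drains it, the "ring empty" condition that would normally clear the watchdog timer may never occur even though everything is healthy.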
Scott
More information about the cvs-src mailing list