em network issues
Scott Long
scottl at samsco.org
Thu Oct 19 06:30:54 UTC 2006
Bruce Evans wrote:
> On Wed, 18 Oct 2006, Scott Long wrote:
>
> [too much quoted; much deleted]
>
>> Bruce Evans wrote:
>>> On Wed, 18 Oct 2006, Kris Kennaway wrote:
>>>
>>>> I have been working with someone's system that has em shared with fxp,
>>>> and a simple fetch over the em (e.g. of a 10 GB file of zeroes) is
>>>> enough to produce watchdog timeouts after a few seconds.
>>>>
>>>> As previously mentioned, changing the INTR_FAST to INTR_MPSAFE in the
>>>> driver avoids this problem. However, others are seeing sporadic
>>>> watchdog timeouts at higher system load on non-shared em systems too.
>>>
>>> em_intr_fast() has no locking whatsoever. I would be very surprised
>>> if it even seemed to work for SMP. For UP, masking of CPU interrupts
>>> (as is automatic in fast interrupt handlers) might provide sufficient
>>> locking, ...
>
> I barely noticed the point about it being shared. With sharing, and
> probably especially with fast and normal interrupt handlers sharing an
> IRQ, locking is more needed. There are many possibilities for races.
> One likely one is:
> - em interrupt task running. Device interrupts are disabled, so the
> task thinks it won't be interfered with by the em interrupt handler.
What interference are you talking about? em_intr_fast changes no state
in the driver softc (aside from the silly bookkeeping). It only reads
from one register, and writes to no registers or shared memory.
> - shared fxp interrupt. The em interrupt handler is called. Without
> any explicit synchronization, bad things may happen and apparently do.
> In the UP case, there is some implicit synchronization which may help
> but is hard to understand.
Can you be more specific as to the 'bad things'?
Scott