em network issues
Scott Long
scottl at samsco.org
Thu Oct 19 07:10:15 UTC 2006
Bruce Evans wrote:
> On Thu, 19 Oct 2006, Scott Long wrote:
>
>> Bruce Evans wrote:
>
>>>>> On Wed, 18 Oct 2006, Kris Kennaway wrote:
>>>>>> I have been working with someone's system that has em shared with fxp,
>>>>>> and a simple fetch over the em (e.g. of a 10 GB file of zeroes) is
>>>>>> enough to produce watchdog timeouts after a few seconds.
>>>>>
>>>>> em_intr_fast() has no locking whatsoever. I would be very surprised
>>>>> if it even seemed to work for SMP. For UP, masking of CPU interrupts
>>>>> (as is automatic in fast interrupt handlers) might provide sufficient
>>>>> locking, ...
>>>
>>> I barely noticed the point about it being shared. With sharing, and
>>> probably especially with fast and normal interrupt handlers sharing an
>>> IRQ, locking is more needed. There are many possibilities for races.
>>> One likely one is:
>>> - em interrupt task running. Device interrupts are disabled, so the
>>> task thinks it won't be interfered with by the em interrupt handler.
>>
>> What interference are you talking about? em_intr_fast changes no state
>> in the driver softc (aside from the silly bookkeeping). It only reads
>> from one register, and writes to no registers or shared memory.
>
> It disables interrupts. To do that, it calls em_disable_intr(). The
> hardware is simple enough for em_disable_intr() not to have to make
> many state changes, but it certainly has to make at least 1 to work.
> It uses several layers of macros which I think ends up doing a write
> to 1 register in bus space.
>
>>> - shared fxp interrupt. The em interrupt handler is called. Without
>>> any explicit synchronization, bad things may happen and apparently do.
>>> In the UP case, there is some implicit synchronization which may help
>>> but is hard to understand.
>>
>> Can you be more specific as to the 'bad things'?
>
> Not very. Maybe interrupts don't get reenabled as intended. Then the
> symptoms get mutated by watchdog timeouts.
>
> Bruce

Then yes, I'm already thinking of a better way to do the interrupt
enable/disable thing. I am still very surprised that the hardware
cannot be silenced by doing a read and/or write of a status register,
like most other hardware. If that were possible, this would be a very
simple problem.
Scott
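
For readers following the thread, here is a minimal user-space sketch of why
the ordering of the mask/unmask writes matters when nothing serializes the
fast handler against the task. It is a toy model, not em(4) code: the struct
and every model_* function are hypothetical names invented for illustration,
and it does not claim to reproduce the actual failure behind the watchdog
timeouts, which nobody in the thread has pinned down. It only shows that the
final state depends on which side writes the mask last.

```c
#include <stdbool.h>

/* Hypothetical model of the state both contexts touch; not the em(4) softc. */
struct model {
	bool intr_masked;	/* device interrupt mask, as last written */
	bool task_pending;	/* a run of the deferred task is queued */
};

/* Fast handler: mask the device, then queue the task. */
static void
model_fast_handler(struct model *m)
{
	m->intr_masked = true;
	m->task_pending = true;
}

/* Task start: the queue clears the pending flag before the task runs. */
static void
model_task_begin(struct model *m)
{
	m->task_pending = false;
}

/* Last step of the task: unmask the device. */
static void
model_task_end(struct model *m)
{
	m->intr_masked = false;
}

/*
 * Replay one interleaving.  A first interrupt starts the task; a second
 * call of the handler (for example via the shared IRQ) then lands either
 * just before or just after the task's final unmask.
 */
static struct model
model_run(bool handler_before_unmask)
{
	struct model m = { false, false };

	model_fast_handler(&m);		/* first interrupt */
	model_task_begin(&m);		/* task running, device masked */
	if (handler_before_unmask) {
		model_fast_handler(&m);	/* handler masks and queues... */
		model_task_end(&m);	/* ...then the task unmasks anyway */
	} else {
		model_task_end(&m);
		model_fast_handler(&m);
	}
	return (m);
}
```

With model_run(true) the model ends up unmasked while a task run is still
queued, so the next run starts without the "interrupts are disabled"
guarantee it assumes. With model_run(false) the device stays masked until
the queued run finishes. A real fix would need the two contexts to agree on
who owns the mask, which is the part still being worked out above.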