Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

Thu Nov 10 18:27:04 UTC 2011

Wait, but low resources which are stopping _what_?

The whole point behind OACTIVE is to say "oi, give me time to flush my
TX queue. I'll tell you when I'm ready."

So when you clear OACTIVE after a TX completion handler, you would
then call _start (or _start_locked) to reschedule some more frames.

There shouldn't be a resource shortage here - ie, once you run out of
tx descriptor slots, you'd set OACTIVE, then you will call the TX
completion handler (on interrupt) until you've handled the pending
entries in the TX descriptor ring/list. Once you've handled at least
-one- TXed frame, that slot is now available and you can clear OACTIVE
+ call _start/_start_locked.
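
As a rough illustration of that cycle, here's a toy model in C. It is
only a sketch: the names (toy_softc, toy_start_locked, toy_txeof,
toy_hw_complete) and the ring size are made up for the example and are
not taken from em(4) or the ifnet code, and the "hardware" is just a
counter.

```c
#include <stdbool.h>

#define TOY_TX_RING_SIZE 4  /* hypothetical; real rings are much larger */

/* Toy model only: none of these names come from em(4) or ifnet. */
struct toy_softc {
    int  sw_queue;   /* frames waiting in the software send queue */
    int  ring_used;  /* TX descriptors currently occupied */
    int  ring_done;  /* occupied descriptors the "hardware" has finished */
    bool oactive;    /* stands in for the OACTIVE flag */
};

/* _start_locked analogue: push queued frames into free descriptors. */
static void toy_start_locked(struct toy_softc *sc)
{
    if (sc->oactive)
        return;                     /* still waiting for TX completion */
    while (sc->sw_queue > 0) {
        if (sc->ring_used == TOY_TX_RING_SIZE) {
            sc->oactive = true;     /* out of slots: ask for time to flush */
            break;
        }
        sc->sw_queue--;
        sc->ring_used++;
    }
}

/* Pretend the hardware finished transmitting up to n frames. */
static void toy_hw_complete(struct toy_softc *sc, int n)
{
    int pending = sc->ring_used - sc->ring_done;

    sc->ring_done += (n < pending) ? n : pending;
}

/* TX completion handler: reclaim slots, clear OACTIVE, restart TX. */
static void toy_txeof(struct toy_softc *sc)
{
    if (sc->ring_done == 0)
        return;                     /* nothing reclaimed, leave OACTIVE alone */
    sc->ring_used -= sc->ring_done;
    sc->ring_done = 0;
    sc->oactive = false;            /* at least one slot is free again */
    toy_start_locked(sc);           /* the step that is easy to forget */
}
```

With six frames queued and a four-slot ring, the first
toy_start_locked() fills the ring and sets the flag; one completed
frame followed by toy_txeof() frees one slot, which the restart
immediately refills.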

The problem is whether this is working right for drivers that
implement _transmit, do multiqueue, etc. Ie, there's now multiple
places where OACTIVE is being set/cleared, and from my experience with
a few other > 100mbit ethernet/wifi drivers, I've seen situations
where the TX queue stalled because the right mix of "clear OACTIVE to
tell the stack it can start firing off more frames" and "call _start
/ _start_locked once there are actually TX buffers/descriptors
available" didn't happen correctly. At this point, if you never call
_start() and OACTIVE is set, TX stalls. It doesn't matter if you
_clear_ OACTIVE at this point. The queue is full and thus nothing will
happen.
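
The stall described above can be reproduced in a small C model. This
is a sketch under assumptions, not the real stack: toy_txq,
toy_stack_send, toy_start and toy_txeof are hypothetical names, the
queue limits are made up, and the only behaviour modelled is "a full
software queue drops the frame without calling the start routine".

```c
#include <stdbool.h>

#define TOY_RING_SIZE 2   /* hypothetical descriptor ring size */
#define TOY_IFQ_MAX   3   /* hypothetical software queue limit */

/* Toy model only: not the real ifnet or driver structures. */
struct toy_txq {
    int  ifq_len;    /* frames sitting in the software queue */
    int  ring_used;  /* TX descriptors in use */
    bool oactive;    /* stands in for the OACTIVE flag */
};

/* _start analogue: drain the software queue into free descriptors. */
static void toy_start(struct toy_txq *q)
{
    if (q->oactive)
        return;
    while (q->ifq_len > 0 && q->ring_used < TOY_RING_SIZE) {
        q->ifq_len--;
        q->ring_used++;
    }
    if (q->ifq_len > 0)
        q->oactive = true;      /* ring full with frames still waiting */
}

/* Stack side: a full queue drops the frame and never reaches toy_start(). */
static bool toy_stack_send(struct toy_txq *q)
{
    if (q->ifq_len == TOY_IFQ_MAX)
        return false;           /* dropped */
    q->ifq_len++;
    if (!q->oactive)
        toy_start(q);
    return true;
}

/* TX completion; 'restart' selects whether the driver calls toy_start(). */
static void toy_txeof(struct toy_txq *q, int completed, bool restart)
{
    q->ring_used -= completed;
    q->oactive = false;
    if (restart)
        toy_start(q);
}
```

If toy_txeof() is called with restart set to false once the software
queue is full, the flag is clear and the ring is empty, yet every
later toy_stack_send() is dropped and nothing ever drains the queue.
With restart set to true the queue starts moving again.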

Some drivers even work around this silliness by calling the tx start
routine from the end of the RX completion routine. :-) I don't think
that's strictly needed if your hardware is posting interrupt
notifications sensibly and your interrupt handler isn't buggy, but
hey, maybe it isn't (eg, you disable interrupts, you do your TX/RX
work, in the meantime some more TX completion has occurred, then you
restore interrupts and clear the bits you were working on - there's
stuff in the TX completion queue, but now that the queue is full you
won't _get_ another interrupt for it.)

So the magic is:

* is OACTIVE being set/cleared sensibly;
* are there calls to _start or _start_locked at the right spots (ie,
after TX completion has occurred);
* what's going on with the _transmit stuff;
* how's the descriptor aggregation stuff for _transmit working and are
you hitting concurrency issues where that queue gets stalled
because you never call the TX completion function and then _start or
_transmit to drain that queue?
* Check the interrupt handling and see if when you disable/enable
interrupts, you're clearing the status bits wrong (if the hardware
even lets you do the sensible thing.) Eg, don't clear the TX
completion bits _after_ you handle TX completion but before you
re-enable interrupts, as the hardware may have completed some more
frames and filled the TX descriptor queue/ring. At this point if
you're lucky you'll get a "yo, TX queue FULL!" interrupt/error, but if
you're unlucky, you've just handled a TX completion + TX
queue full event and whilst handling that full queue, you fill the
queue up again. Then you won't get a subsequent interrupt for the now
filled queue.
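
The ordering problem in the last point can be sketched in C as
follows. This is a toy model with assumed behaviour: toy_nic and the
toy_* functions are hypothetical, the status bit is a simple latch,
and nothing here reflects how the 82574L's registers actually work.
The 'late' argument stands for frames the hardware completes while
the handler is running.

```c
#include <stdbool.h>

/* Toy model only: a single latched "TX done" status bit. */
struct toy_nic {
    int  completed;    /* frames finished by hardware, not yet reclaimed */
    bool txdone_bit;   /* latched status; an interrupt is pending while set */
};

/* Hardware side: finishing frames latches the status bit. */
static void toy_hw_complete(struct toy_nic *nic, int n)
{
    if (n <= 0)
        return;
    nic->completed += n;
    nic->txdone_bit = true;
}

static bool toy_irq_pending(const struct toy_nic *nic)
{
    return nic->txdone_bit;
}

/* Driver side: reclaim everything the hardware has finished so far. */
static int toy_reclaim(struct toy_nic *nic)
{
    int n = nic->completed;

    nic->completed = 0;
    return n;
}

/* Problem order: handle TX completion, then clear the status bit. */
static int toy_isr_ack_last(struct toy_nic *nic, int late)
{
    int n = toy_reclaim(nic);

    toy_hw_complete(nic, late);  /* more frames finish in the window */
    nic->txdone_bit = false;     /* the ack wipes out that new event too */
    return n;
}

/* Safer order: clear the status bit first, then handle TX completion. */
static int toy_isr_ack_first(struct toy_nic *nic, int late)
{
    int n;

    nic->txdone_bit = false;
    n = toy_reclaim(nic);
    toy_hw_complete(nic, late);  /* a late completion re-latches the bit */
    return n;
}
```

After toy_isr_ack_last() with a late completion, there is a completed
frame waiting but no interrupt pending, so nothing will come along to
reclaim it. After toy_isr_ack_first() the same late completion leaves
the bit set, so the handler gets another go.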

Now, I've not got any em hardware (legacy or shiny) so I'm simply
offering observations from having had to debug this stuff in the
recent past. I just think you've hit the same issue. :)

Adrian