ixgbe & if_igb RX ring locking

Fri Oct 19 15:48:03 UTC 2012

On Fri, Oct 19, 2012 at 8:23 AM, Alexander V. Chernikov <
melifaro at freebsd.org> wrote:

> On 17.10.2012 18:06, John Baldwin wrote:
>
>> On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
>>
>>> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
>>>
>>>> On 13.10.2012 23:24, Jack Vogel wrote:
>>>>
>>>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo at iet.unipi.it>
>>>>>  wrote:
>>>>>
>>>>
>>>>
>>>>>> one option could be (same as it is done in the timer
>>>>>> routine in dummynet) to build a list of all the packets
>>>>>> that need to be sent to if_input(), and then call
>>>>>> if_input with the entire list outside the lock.
>>>>>>
>>>>>> It would be even easier if we modify the various *_input()
>>>>>> routines to handle a list of mbufs instead of just one.
>>>>>>
>>>>>
>>>> Bulk processing is generally a good idea we probably should implement.
>>>> Probably starting from driver queue ending with marked mbufs
>>>> (OURS/forward/legacy processing (appletalk and similar))?
>>>>
>>>> This can minimize an impact for all
>>>> locks on RX side:
>>>> L2
>>>> * rx PFIL hook
>>>> L3 (both IPv4 and IPv6)
>>>> * global IF_ADDR_RLOCK (currently commented out)
>>>> * Per-interface ADDR_RLOCK
>>>> * PFIL hook
>>>>
>>>>   From the first glance, there can be problems with:
>>>> * Increased latency (we should have some kind of rx_process_limit), but
>>>> still
>>>> * reader locks being acquired for much longer amount of time
>>>>
>>>>
>>>>>> cheers
>>>>>> luigi
>>>>>>
>>>>>> Very interesting idea Luigi, will have to get that some thought.
>>>>>>
>>>>>
>>>>> Jack
>>>>>
>>>>
>>>> Returning to original post topic:
>>>>
>>>> Given
>>>> 1) we are currently binding ixgbe ithreads to CPU cores
>>>> 2) RX queue lock is used by (indirectly) in only 2 places:
>>>> a) ISR routine (msix or legacy irq)
>>>> b) taskqueue routine which is scheduled if some packets remains in RX
>>>> queue and rx_process_limit ended OR we need something to TX
>>>>
>>>> 3) in practice taskqueue routine is a nightmare for many people since
>>>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
>>>> some traffic burst happens: once it is called it starts to schedule
>>>> itself more and more replacing original ISR routine. Additionally,
>>>> increasing rx_process_limit does not help since taskqueue is called with
>>>> the same limit. Finally, currently netisr taskq threads are not bound to
>>>> any CPU which makes the process even more uncontrollable.
>>>>
>>>
>>> I think part of the problem here is that the taskqueue in ixgbe(4) is
>>> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
>>> just start transmitting packets directly.
>>>
>>> I fixed this in igb(4) here:
>>>
>>> http://svnweb.freebsd.org/**base?view=revision&revision=**233708<http://svnweb.freebsd.org/base?view=revision&revision=233708>
>>>
>>> You can try this for ixgbe(4).  It also comments out a spurious taskqueue
>>> reschedule from the watchdog handler that might also lower the taskqueue
>>> usage.  You can try changing that #if 0 to an #if 1 to test just the
>>> txeof
>>> changes:
>>>
>>
>> Is anyone able to test this btw to see if it improves things on ixgbe at
>> all?
>> (I don't have any ixgbe hardware.)
>>
> Yes. I'll try to to this next week (since ixgbe driver from at least 9-S
> fails to detect twinax cable which works in 8-S....)).
>
>>
>>
>
If you have a major problem like this you might want to put it in a bug
report or at least an
email with that specific topic rather than bury it in an unrelated thread
in a parenthetic remark :(
This is the first I've heard of this, did you check the code on HEAD to see
if it also has the issue?

Jack