buf_ring in HEAD is racy
Michael Tuexen
Michael.Tuexen at lurchi.franken.de
Tue Dec 17 16:39:50 UTC 2013
On Dec 17, 2013, at 4:43 PM, Adrian Chadd <adrian at freebsd.org> wrote:
> ie:
>
> Index: sys/dev/ixgbe/ixgbe.c
> ===================================================================
> --- sys/dev/ixgbe/ixgbe.c (revision 2995)
> +++ sys/dev/ixgbe/ixgbe.c (working copy)
> @@ -5178,6 +5178,7 @@
> struct ixgbe_hw *hw = &adapter->hw;
> u32 missed_rx = 0, bprc, lxon, lxoff, total;
> u64 total_missed_rx = 0;
> + u64 odrops;
What about
+ u64 odrops = 0;
since odrops otherwise seems to be used uninitialized? (A sketch folding
this in follows the quoted patch below.)
Best regards
Michael
>
> adapter->stats.crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
> adapter->stats.illerrc += IXGBE_READ_REG(hw, IXGBE_ILLERRC);
> @@ -5308,6 +5309,11 @@
> adapter->stats.fcoedwtc += IXGBE_READ_REG(hw, IXGBE_FCOEDWTC);
> }
>
> + /* TX drops */
> + for (int i = 0; i < adapter->num_queues; i++) {
> + odrops += adapter->tx_rings[i].br->br_drops;
> + }
> +
> /* Fill out the OS statistics structure */
> ifp->if_ipackets = adapter->stats.gprc;
> ifp->if_opackets = adapter->stats.gptc;
> @@ -5317,6 +5323,9 @@
> ifp->if_omcasts = adapter->stats.mptc;
> ifp->if_collisions = 0;
>
> + /* TX drops are stored in if_snd for now, not the top level counters */
> + ifp->if_snd.ifq_drops = drops;
> +
> /* Rx Errors */
> ifp->if_iqdrops = total_missed_rx;
> ifp->if_ierrors = adapter->stats.crcerrs + adapter->stats.rlec;
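
As quoted, that last hunk assigns a variable called drops while the loop
above accumulates odrops. A minimal sketch of the apparent intent, with the
initialization folded in (the helper name is made up here, and this is not a
committed change):

	/*
	 * Sketch only: the intent of the quoted hunks, with odrops
	 * initialized and used consistently.  Helper name is hypothetical.
	 */
	static void
	ixgbe_update_tx_drop_stats(struct adapter *adapter, struct ifnet *ifp)
	{
		u64 odrops = 0;

		/* Sum the per-queue buf_ring drop counters. */
		for (int i = 0; i < adapter->num_queues; i++)
			odrops += adapter->tx_rings[i].br->br_drops;

		/* Stored in if_snd for now, not the top-level counters. */
		ifp->if_snd.ifq_drops = odrops;
	}
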
>
>
>
>
> -adrian
>
> On 16 December 2013 12:19, Adrian Chadd <adrian at freebsd.org> wrote:
>> So I've done some digging.
>>
>> The TL;DR - we really should expose the drops count in the ifa_data
>> field. But that would involve an ABI change and .. well, that will be
>> full of drama.
>>
>> The non-TL;DR:
>>
>> * The send queue drops are in the ifmibdata structure, which includes
>> a few other things. But that requires you use the MIB interface to
>> pull things out, rather than getifaddrs().
>> * getifaddrs() doesn't contain the max sendq length or drops, so we
>> can't report those without adding a MIB fetch for the relevant
>> interface.
>> * bsnmp (which we use at work) correctly populates output discards by
>> checking the MIB for ifmd_snd_drops.
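
For reference, pulling ifmd_snd_drops out of the ifmib tree from userland
looks roughly like this; a minimal sketch with the error handling pared
down, not meant as a netstat patch:

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>

#include <net/if.h>
#include <net/if_mib.h>

#include <err.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch: fetch the send-queue drop counter for one interface (by
 * ifindex) via net.link.generic.ifdata.<ifindex>.general, the same
 * data bsnmp reads, instead of going through getifaddrs().
 */
int
main(int argc, char **argv)
{
	struct ifmibdata ifmd;
	size_t len = sizeof(ifmd);
	int name[6];

	if (argc != 2)
		errx(1, "usage: %s <ifindex>", argv[0]);

	name[0] = CTL_NET;
	name[1] = PF_LINK;
	name[2] = NETLINK_GENERIC;
	name[3] = IFMIB_IFDATA;
	name[4] = atoi(argv[1]);	/* e.g. from if_nametoindex(3) */
	name[5] = IFDATA_GENERAL;

	if (sysctl(name, 6, &ifmd, &len, NULL, 0) == -1)
		err(1, "sysctl(net.link.generic.ifdata)");

	printf("%s: sendq drops %d, sendq maxlen %d\n",
	    ifmd.ifmd_name, ifmd.ifmd_snd_drops, ifmd.ifmd_snd_maxlen);
	return (0);
}

getifaddrs(3) can still supply the per-interface if_data, but the sendq
length and drops only show up via this MIB path, which is why netstat would
have to switch over.
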
>>
>> So I'd rather we fix ixgbe and other drivers by updating the send
>> queue drops counter if the frames are dropped. That way it's
>> consistent with other things.
>>
>> We should then do some of these things:
>>
>> * add a send queue drops field to the if_data/ifa_data where appropriate
>> and make sure it gets correctly populated;
>> * teach netstat to use ifmib instead of getifaddrs();
>> * as part of killing the ifsnd queue stuff, we should also migrate
>> that particular drops counter out from there and to a top-level
>> counter in ifnet, like the rest.
>>
>> Comments?
>>
>>
>> -a
>>
>>
>> On 14 December 2013 16:06, Adrian Chadd <adrian at freebsd.org> wrote:
>>> oh cool, you just did the output-drops thing I was about to code up.
>>> We're missing those counters at work and the ops guys poked me about
>>> it.
>>>
>>> I'll also give that a whirl locally and see about working with jack to
>>> get it into -HEAD / MFC'ed to 10.
>>>
>>> Thanks!
>>>
>>>
>>> -a
>>>
>>>
>>> On 13 December 2013 21:04, Ryan Stone <rysto32 at gmail.com> wrote:
>>>> I am seeing spurious output packet drops that appear to be due to
>>>> insufficient memory barriers in buf_ring. I believe that this is the
>>>> scenario that I am seeing:
>>>>
>>>> 1) The buf_ring is empty, br_prod_head = br_cons_head = 0
>>>> 2) Thread 1 attempts to enqueue an mbuf on the buf_ring. It fetches
>>>> br_prod_head (0) into a local variable called prod_head
>>>> 3) Thread 2 enqueues an mbuf on the buf_ring. The sequence of events
>>>> is essentially:
>>>>
>>>> Thread 2 claims an index in the ring and atomically sets br_prod_head (say to 1)
>>>> Thread 2 sets br_ring[1] = mbuf;
>>>> Thread 2 does a full memory barrier
>>>> Thread 2 updates br_prod_tail to 1
>>>>
>>>> 4) Thread 2 dequeues the packet from the buf_ring using the
>>>> single-consumer interface. The sequence of events is essentially:
>>>>
>>>> Thread 2 checks whether the queue is empty (br_cons_head == br_prod_tail);
>>>> this is false
>>>> Thread 2 sets br_cons_head to 1
>>>> Thread 2 grabs the mbuf from br_ring[1]
>>>> Thread 2 sets br_cons_tail to 1
>>>>
>>>> 5) Thread 1, which is still attempting to enqueue an mbuf on the ring,
>>>> fetches br_cons_tail (1) into a local variable called cons_tail. It
>>>> sees cons_tail == 1 but prod_head == 0 and concludes that the ring is
>>>> full and drops the packet (incrementing br_drops non-atomically, I
>>>> might add).
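
For anyone reading along without the header open, the enqueue-side check
being described looks roughly like this (paraphrased and simplified from
sys/sys/buf_ring.h, not verbatim):

	do {
		prod_head = br->br_prod_head;	/* can be stale by the time... */
		cons_tail = br->br_cons_tail;	/* ...this newer value is read */
		prod_next = (prod_head + 1) & br->br_prod_mask;

		if (prod_next == cons_tail) {
			/* A stale prod_head plus a fresh cons_tail makes an
			 * empty ring look full, so the packet is dropped. */
			br->br_drops++;		/* plain ++, not atomic */
			return (ENOBUFS);
		}
	} while (!atomic_cmpset_int(&br->br_prod_head, prod_head, prod_next));

	/* ...store the buffer, memory barrier, then publish br_prod_tail. */
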
>>>>
>>>>
>>>> I can reproduce several drops per minute by configuring the ixgbe
>>>> driver to use only 1 queue and then sending traffic from 8 concurrent
>>>> iperf processes. (You will need this hacky patch to even see the
>>>> drops with netstat, though:
>>>> http://people.freebsd.org/~rstone/patches/ixgbe_br_drops.diff)
>>>>
>>>> I am investigating fixing buf_ring by using acquire/release semantics
>>>> rather than load/store barriers. However, I note that this will
>>>> apparently be the second attempt to fix buf_ring, and I'm seriously
>>>> questioning whether this is worth the effort compared to the
>>>> simplicity of using a mutex. I'm not even convinced that a correct
>>>> lockless implementation will be a performance win, given the
>>>> number of memory barriers that will apparently be necessary.
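
Whichever primitives end up being used, the enqueue path needs to stop
concluding "full" from a stale prod_head paired with a fresh cons_tail. One
possible shape, sketched here only and not a tested patch, is to revalidate
the snapshot before counting a drop:

	do {
		prod_head = br->br_prod_head;
		prod_next = (prod_head + 1) & br->br_prod_mask;
		cons_tail = br->br_cons_tail;

		if (prod_next == cons_tail) {
			/* Re-read and only give up if the ring still looks
			 * full with a consistent snapshot; otherwise retry
			 * with fresh values instead of dropping. */
			rmb();
			if (prod_head == br->br_prod_head &&
			    cons_tail == br->br_cons_tail) {
				br->br_drops++;	/* still non-atomic; separate issue */
				return (ENOBUFS);
			}
			continue;
		}
	} while (!atomic_cmpset_acq_int(&br->br_prod_head, prod_head, prod_next));

Compared to ordering every load, this only pays the extra cost on the
(rare) apparently-full path, which may matter for the performance concern
raised above.
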