Or it could be ZFS memory starvation and 9k packets (was Re: istgt causes massive jumbo nmbclusters loss)

Garrett Wollman wollman at hergotha.csail.mit.edu
Sat Oct 26 05:52:41 UTC 2013


In article <CACpH0MfEy50Y5QOZCdn2co_JmY_QPfVRxYwK-73W0WYsHB-Fqw at mail.gmail.com> you write:

>Now... below the netstat -m shows 1399 9k bufs with 376 available.  When
>the network gets busy, I've seen 4k or even 5k bufs in total... never near
>the 77k max.  After some time of lesser activity, the number of 9k buffers
>returns to this level.

The network interface (driver) almost certainly should not be using 9k
mbufs.  These buffers are physically contiguous, and after not too
much activity, it will be nearly impossible to allocate the three
physically contiguous pages that each one requires.
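
To make the arithmetic concrete, here is a minimal sketch of where the
"three" comes from.  It assumes 4 KiB pages (as on amd64) and a 9 KiB
jumbo cluster; the SKETCH_* names and the helper function are made up
for illustration and are not kernel identifiers.

```c
/* Assumed values: 4 KiB pages and a 9 KiB jumbo cluster. */
#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_MJUM9BYTES (9u * 1024u)

/*
 * Number of pages a buffer of `bytes` spans, rounding up.  For a
 * physically contiguous buffer, all of them must be adjacent in
 * physical memory.
 */
static unsigned
contig_pages_needed(unsigned bytes)
{
        return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A 9216-byte buffer rounds up to three pages, while anything of 4096
bytes or less fits in a single page and needs no contiguity at all.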

>> That has an em0 with jumbo packets enabled:
>>
>> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9014

I don't know for certain about em(4), but it very likely should not be
using 9k mbufs.  Intel network hardware has done scatter-gather since
nearly the year dot.  (Seriously, I wrote a network driver for the
i82586 back at the very beginning of FreeBSD's existence, and *that*
part had scatter-gather.  No jumbo frames, though!)

The entire existence of 9k and 16k mbufs is probably a mistake.  There
should not be any network interfaces that are modern enough to do
jumbo frames but ancient enough to require physically contiguous pages
for each frame.  I don't know whether the em(4) driver is written such
that you can just disable the use of those mbufs, but you could try
the following change.  Look for this code in if_em.c:

        /*
        ** Figure out the desired mbuf
        ** pool for doing jumbos
        */
        if (adapter->max_frame_size <= 2048)
                adapter->rx_mbuf_sz = MCLBYTES;
        else if (adapter->max_frame_size <= 4096)
                adapter->rx_mbuf_sz = MJUMPAGESIZE;
        else
                adapter->rx_mbuf_sz = MJUM9BYTES;

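Comment out the last two lines (the final else and the MJUM9BYTES
assignment) and change the else if (...) to a plain else, so that every
frame larger than 2048 bytes uses page-sized clusters.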
It's not obvious that the rest of the code can cope with this, but it
does work that way on other Intel hardware so it seems like it may be
worth a shot.
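
For reference, the selection logic after that edit would look roughly
like the sketch below.  It is written as a standalone function so it
can be compiled outside the kernel: the struct, the function name and
the SKETCH_* constants are hypothetical stand-ins (in the driver the
sizes come from the kernel headers), and 4096 assumes a 4 KiB page
size.  It is untested against real em(4) hardware.

```c
/* Assumed cluster sizes; the real driver uses the kernel's definitions. */
#define SKETCH_MCLBYTES     2048u
#define SKETCH_MJUMPAGESIZE 4096u   /* one page, assuming 4 KiB pages */

/* Hypothetical stand-in for the two adapter fields the code touches. */
struct sketch_adapter {
        unsigned max_frame_size;
        unsigned rx_mbuf_sz;
};

/*
 * Pick the receive cluster size without ever choosing a 9k cluster.
 * Frames larger than one page would have to be spread across several
 * page-sized clusters by the hardware's scatter-gather support.
 */
static void
sketch_pick_rx_mbuf(struct sketch_adapter *adapter)
{
        if (adapter->max_frame_size <= 2048)
                adapter->rx_mbuf_sz = SKETCH_MCLBYTES;
        else
                adapter->rx_mbuf_sz = SKETCH_MJUMPAGESIZE;
}
```

With the jumbo MTU quoted above, this picks page-sized clusters rather
than 9k ones; whether the receive path then chains them correctly is
exactly the part that needs testing.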

-GAWollman

