9.2 ixgbe tx queue hang
Rick Macklem
rmacklem at uoguelph.ca
Sat Mar 22 02:39:39 UTC 2014
Christopher Forgeron wrote:
> It may be a little early, but I think that's it!
>
> It's been running without error for nearly an hour - it's very rare it
> would go this long under this much load.
>
> I'm going to let it run longer, then abort and install the kernel with
> the extra printfs so I can see what value ifp->if_hw_tsomax is before
> you set it.
>
I think you'll just find it set to 0. Code in if_attach_internal()
(in sys/net/if.c) sets it to IP_MAXPACKET (which is 65535) if it
is 0. In other words, if the driver's if_attach routine doesn't
set it, this code sets it to the maximum possible value.
Here's the snippet:

	/* Initialize to max value. */
	if (ifp->if_hw_tsomax == 0)
		ifp->if_hw_tsomax = IP_MAXPACKET;
Anyhow, this sounds like progress.
As far as NFS is concerned, I'd rather set it to a smaller value
(maybe 56K) so that m_defrag() doesn't need to be called, but I
suspect others wouldn't like this.
Hopefully Jack can decide if this patch is ok?
Thanks yet again for doing this testing, rick
ps: I've attached it again, so Jack (and anyone else who reads this)
can look at it.
pps: Please report if it keeps working for you.
> It still had netstat -m denied entries on boot, but they are not
> climbing like they did before:
>
> $ uptime
> 9:32PM up 25 mins, 4 users, load averages: 2.43, 6.15, 4.65
> $ netstat -m
> 21556/7034/28590 mbufs in use (current/cache/total)
> 4080/3076/7156/6127254 mbuf clusters in use (current/cache/total/max)
> 4080/2281 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/53/53/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> 16444/118/16562/907741 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> 161545K/9184K/170729K bytes allocated to network (current/cache/total)
> 17972/2230/4111 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 35/8909/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
> - Started off bad with the 9k denials, but it's not going up!
>
> uptime
> 10:20PM up 1:13, 6 users, load averages: 2.10, 3.15, 3.67
> root at SAN0:/usr/home/aatech # netstat -m
> 21569/7141/28710 mbufs in use (current/cache/total)
> 4080/3308/7388/6127254 mbuf clusters in use (current/cache/total/max)
> 4080/2281 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/53/53/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> 16447/121/16568/907741 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> 161575K/9702K/171277K bytes allocated to network (current/cache/total)
> 17972/2261/4111 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 35/8913/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
> This is the 9.2 ixgbe that I'm patching into 10.0; I'll move into the
> base 10.0 code tomorrow.
>
>
> On Fri, Mar 21, 2014 at 8:44 PM, Rick Macklem <rmacklem at uoguelph.ca>
> wrote:
>
> > Christopher Forgeron wrote:
> > >
> > > Hello all,
> > >
> > > I ran Jack's ixgbe MJUM9BYTES removal patch and let iometer hammer
> > > away at the NFS store overnight - but the problem is still there.
> > >
> > > From what I read, I think the MJUM9BYTES removal is probably good
> > > cleanup (as long as it doesn't trade performance on a lightly
> > > memory-loaded system for performance on a heavily memory-loaded
> > > system). If I can stabilize my system, I may attempt those
> > > benchmarks.
> > >
> > >
> > > I think the fix will be obvious at boot for me - my 9.2 has a
> > > 'clean' netstat. Until I can boot and see a 'netstat -m' that
> > > looks similar to that, I'm going to have this problem.
> > >
> > > Markus: Do your systems show denied mbufs at boot like mine do?
> > >
> > > Turning off TSO works for me, but at a performance hit.
> > >
> > > I'll compile Rick's patch (and extra debugging) this morning and
> > > let you know soon.
> > >
> > > On Thu, Mar 20, 2014 at 11:47 PM, Christopher Forgeron
> > > <csforgeron at gmail.com> wrote:
> > >
> > > BTW - I think this will end up being a TSO issue, not the patch
> > > that Jack applied.
> > >
> > > When I boot Jack's patch (MJUM9BYTES removal), this is what
> > > netstat -m shows:
> > >
> > > 21489/2886/24375 mbufs in use (current/cache/total)
> > > 4080/626/4706/6127254 mbuf clusters in use (current/cache/total/max)
> > > 4080/587 mbuf+clusters out of packet secondary zone in use (current/cache)
> > > 16384/50/16434/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/907741 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> > > 79068K/2173K/81241K bytes allocated to network (current/cache/total)
> > > 18831/545/4542 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> > > 15626/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > >
> > > Here is an un-patched boot:
> > >
> > > 21550/7400/28950 mbufs in use (current/cache/total)
> > > 4080/3760/7840/6127254 mbuf clusters in use (current/cache/total/max)
> > > 4080/2769 mbuf+clusters out of packet secondary zone in use (current/cache)
> > > 0/42/42/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)
> > > 16439/129/16568/907741 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/510604 16k jumbo clusters in use (current/cache/total/max)
> > > 161498K/10699K/172197K bytes allocated to network (current/cache/total)
> > > 18345/155/4099 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> > > 3/3723/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > >
> > > See how removing the MJUM9BYTES is just pushing the problem from
> > > the 9k jumbo clusters into the 4k jumbo clusters?
> > >
> > > Compare this to my FreeBSD 9.2-STABLE machine from ~Dec 2013:
> > > exact same hardware, revisions, zpool size, etc. - it's just
> > > running an older FreeBSD.
> > >
> > > # uname -a
> > > FreeBSD SAN1.XXXXX 9.2-STABLE FreeBSD 9.2-STABLE #0: Wed Dec 25
> > > 15:12:14 AST 2013 aatech at FreeBSD-Update
> > > Server:/usr/obj/usr/src/sys/GENERIC amd64
> > >
> > > root at SAN1:/san1 # uptime
> > > 7:44AM up 58 days, 38 mins, 4 users, load averages: 0.42, 0.80, 0.91
> > >
> > > root at SAN1:/san1 # netstat -m
> > > 37930/15755/53685 mbufs in use (current/cache/total)
> > > 4080/10996/15076/524288 mbuf clusters in use (current/cache/total/max)
> > > 4080/5775 mbuf+clusters out of packet secondary zone in use (current/cache)
> > > 0/692/692/262144 4k (page size) jumbo clusters in use (current/cache/total/max)
> > > 32773/4257/37030/96000 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/508538 16k jumbo clusters in use (current/cache/total/max)
> > > 312599K/67011K/379611K bytes allocated to network (current/cache/total)
> > > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> > > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0/0/0 sfbufs in use (current/peak/max)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > > 0 calls to protocol drain routines
> > >
> > > Lastly, please note this link:
> > >
> > > http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033660.html
> > >
> > Hmm, this mentioned the ethernet header being in the TSO segment. I
> > think I already mentioned that my TCP/IP is rusty and I know diddly
> > about TSO. However, at a glance it does appear the driver uses
> > ether_output() for TSO segments and, as such, I think an ethernet
> > header is prepended to the TSO segment. (This makes sense; how else
> > would the hardware know what ethernet header to use for the TCP
> > segments it generates?)
> >
> > I think prepending the ethernet header could push the total length
> > over 64K, given the default if_hw_tsomax == IP_MAXPACKET. And over
> > 64K isn't going to fit in 32 * 2K (MCLBYTES) clusters, and so on.
> >
> > Anyhow, I think the attached patch will reduce if_hw_tsomax so that
> > the result should fit in 32 clusters and avoid EFBIG for this case,
> > so it might be worth a try?
> > (I still can't think of why the CSUM_TSO bit isn't set for the
> > printf() case, but it seems TSO segments could generate EFBIG
> > errors.)
> >
> > Maybe worth a try, rick
> >
> > > It's so old that I assume the TSO leak that he speaks of has been
> > > patched, but perhaps not. More things to look into tomorrow.
> > >
> >
> > _______________________________________________
> > freebsd-net at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to
> > "freebsd-net-unsubscribe at freebsd.org"
> >
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ixgbe.patch
Type: text/x-patch
Size: 473 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-net/attachments/20140321/0bde7f72/attachment.bin>