9.2 ixgbe tx queue hang

Rick Macklem rmacklem at uoguelph.ca
Sun Mar 23 23:57:12 UTC 2014


Christopher Forgeron wrote:
> On Sat, Mar 22, 2014 at 6:41 PM, Rick Macklem <rmacklem at uoguelph.ca> wrote:
>
> Christopher Forgeron wrote:
> > #if defined(INET) || defined(INET6)
> > 	/* Initialize to max value. */
> > 	if (ifp->if_hw_tsomax == 0)
> > 		ifp->if_hw_tsomax = IP_MAXPACKET;
> > 	KASSERT(ifp->if_hw_tsomax <= IP_MAXPACKET &&
> > 	    ifp->if_hw_tsomax >= IP_MAXPACKET / 8,
> > 	    ("%s: tsomax outside of range", __func__));
> > #endif
> >
> > Should this be the location where it's being set rather than in
> > ixgbe? I would assume that other drivers could fall prey to this
> > issue.
> > 
> All of this should be prepended with "I'm an NFS guy, not a
> networking
> guy, so I might be wrong".
> 
> Other drivers (and ixgbe for the 82598 chip) can handle a packet that
> is in more than 32 mbufs. (I think the 82598 handles 100, grep for
> SCATTER
> in *.h in sys/dev/ixgbe.)
>
> [...]
>
> Yes, I agree we have to be careful about the limitations of other
> drivers, but I'm thinking setting tsomax to IP_MAXPACKET is a bad idea,
> unless all of the header subtractions are happening elsewhere. Then
> again, perhaps every other driver (and possibly ixgbe... I need to
> look more) does a maxtso - various_headers calculation to set a limit
> for data packets.
>
> I'm not familiar with the FreeBSD network conventions/styles - I'm
> just asking questions, a bad habit of mine, but I'm in charge of
> code stability issues at my work, so it's hard to stop.
> 
Well, IP_MAXPACKET is simply the largest number that fits in the 16-bit
length field of an IP header (65535). This limit is on the TSO segment
(which is really just a TCP/IP packet greater than the MTU) and does not
include a MAC-level (ethernet) header.

Beyond that, it is the specific hardware that limits things. In this
case the hardware is limited to 32 mbufs per packet, which happens to
imply 64K total, including the ethernet header, when 2K mbuf clusters
are used. (The 64K limit is just a quirk caused by the 32-mbuf limit
and the fact that standard mbuf clusters hold 2K of data each.)
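
To make the arithmetic concrete, here is a stand-alone illustration
(the constants mirror the kernel's values, but this is my example, not
driver code):

	#include <stdio.h>

	#define MCLBYTES	2048	/* standard mbuf cluster size */
	#define IP_MAXPACKET	65535	/* 16-bit IP length field */
	#define ETHER_HDR_LEN	14	/* ethernet header */
	#define ETHER_VLAN_LEN	4	/* optional 802.1Q tag */

	int
	main(void)
	{
		/* Buffer space the chip can scatter into: 32 entries
		 * of one cluster each. */
		int hw_space = 32 * MCLBYTES;			/* 65536 */
		/* Worst-case frame: maximal TSO segment plus the
		 * link-level header. */
		int frame = IP_MAXPACKET + ETHER_HDR_LEN +
		    ETHER_VLAN_LEN;				/* 65553 */

		printf("hw_space %d, frame %d, fits: %s\n", hw_space,
		    frame, frame <= hw_space ? "yes" : "no");
		return (0);
	}

So a maximally sized TSO segment can spill into a 33rd mbuf, which is
when the driver replies EFBIG.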

>
> Now, since several drivers do have this 32-mbuf limit, I can see an
> argument for making the default a little smaller to make these work,
> since the driver can override the default. (About now someone usually
> jumps in and says something along the lines of "You can't do that
> until all the drivers that can handle IP_MAXPACKET are fixed to set
> if_hw_tsomax", and since I can't fix drivers I can't test, that pretty
> much puts a stop to it.)
>
> Testing is a problem, isn't it? I once again offer my stack of network
> cards and systems for some sort of testing... I still have coax and
> token ring around. :-)
>
> You see, the problem isn't that IP_MAXPACKET is too big, but that the
> hardware has a limit of 32 non-contiguous chunks (mbufs) per packet,
> and 32 * MCLBYTES = 64K. (Hardware/network drivers that can handle 35
> or more chunks - they like to call them transmit segments, although
> ixgbe uses the term scatter - shouldn't have any problems.)
> 
> I have an untested patch that adds a tsomaxseg count to use along
> with the tsomax bytes, so that a driver could inform tcp_output() it
> can only handle 32 mbufs and then tcp_output() would limit a TSO
> segment using both. But I can't test it, so who knows when/if that
> might happen.
>
> I think you gave that to me in the next email - if not, please send it.
>
> I also have a patch that modifies NFS to use pagesize clusters
> (reducing the mbuf count in the list), but that one causes grief when
> testing on an i386. (It seems to run out of kernel memory to the point
> where it can't allocate something called "boundary tags" and pretty
> well wedges the machine at that point.) Since I don't know how to fix
> this (I thought of making the patch "amd64 only"), I can't really
> commit it to head, either.
>
> Send me that one too. I love NFS patches.
>
> As such, I think it's going to be "fix the drivers one at a time" and
> tell folks to "disable TSO or limit rsize,wsize to 32K" when they run
> into trouble. (As you might have guessed, I'd rather just be "the NFS
> guy", but since NFS "triggers the problem" I'm kinda stuck with it ;-)
>
> I know in some circumstances disabling TSO can be a benefit, but in
> general you'd want it on a modern system with a heavy data load.
>
> > Also, should we not subtract ETHER_VLAN_ENCAP_LEN from tsomax to
> > make sure VLANs fit?
> > 
> No idea. (I wouldn't know a VLAN if it jumped up and tried to
> bite me on the nose. ;-) So, I have no idea what does this, but
> if it means the total ethernet header size can be > 14 bytes, then
> I'd agree.
>
> Yeah, you need another 4 bytes for the VLAN header if you're not
> using hardware that strips it before the TCP stack gets it. I have a
> mix of hardware and software VLANs running on our backbone, mostly
> due to a mixed FreeBSD/OpenBSD/Windows environment.
>
> > Perhaps there is something in the newer network code that is
> > packing the frames completely full - thus a TSO limit of
> > IP_MAXPACKET is just now causing problems.
> > 
> Yea, I have no idea why this didn't bite running 9.1. (Did 9.1 have
> TSO enabled by default?)
>
> I believe 9.0 has TSO on by default... I seem to recall it always
> being there, but I can't easily confirm it now. My last 9.0-STABLE
> doesn't have an ixgbe card in it.
> 
Ok, I've attached 3 patches:
ixgbe.patch - A slightly updated version of the one that sets if_hw_tsomax,
              which subtracts out the additional 4 bytes for the VLAN header.
*** If you can test this, it would be nice to know whether it gets rid of
    all the EFBIG replies, since I think Jack might feel it is ok to commit
    if it does.
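
(For readers without the attachment: the core of such a patch is
presumably a one-line cap in the driver attach path, along these
lines - my sketch of the idea, not the actual diff:)

	/*
	 * Sketch (not the attached patch): cap the TSO segment so that
	 * data plus ethernet and VLAN headers fit in the 32 * MCLBYTES
	 * of buffer space the chip can scatter into.
	 */
	ifp->if_hw_tsomax = 32 * MCLBYTES -
	    (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);	/* 65536 - 18 = 65518 */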

4kmcl.patch - This one modifies NFS to use pagesize mbuf clusters for the
              large RPC messages. It is NOT safe to use on a small i386,
              but might be ok on a large amd64 box. On a small i386, using
              a mix of 2K and 4K mbuf clusters seems to fragment kernel
              memory enough that allocation of "boundary tags" (whatever
              those are?) fails, and this trainwrecks the system.
              Using pagesize (4K) clusters reduces the mbuf count for an
              IP_MAXPACKET sized TSO segment to 19, avoiding the 32-mbuf
              limit and any need to call m_defrag() for NFS.
*** Only use on a test system, at your own risk.
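
(Conceptually, the allocation change amounts to something like the
following - my reconstruction of the technique, with a hypothetical
function name, not the patch itself:)

	#include <sys/param.h>
	#include <sys/mbuf.h>

	/*
	 * Sketch of the idea behind 4kmcl.patch: hand out pagesize (4K)
	 * jumbo clusters instead of standard 2K clusters when building
	 * large RPC messages, roughly halving the number of mbufs in
	 * the resulting TSO segment.
	 */
	static struct mbuf *
	nfsm_getcluster_4k(void)
	{

		/* MJUMPAGESIZE is a PAGE_SIZE (4K here) cluster. */
		return (m_getjcl(M_WAITOK, MT_DATA, 0, MJUMPAGESIZE));
	}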

tsomaxseg.patch - This one adds support for if_hw_tsomaxseg, which is a
          limit on the # of mbufs in an output TSO segment (and defaults
          to 32).
*** This one HAS NOT BEEN TESTED and probably doesn't even work at this point.
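
(The shape of the idea, as an untested user-space sketch - all names
here are hypothetical, not the patch's:)

	#include <stddef.h>

	/*
	 * Untested sketch of what tsomaxseg.patch aims for: given the
	 * driver's byte limit (if_hw_tsomax) and its mbuf-count limit
	 * (if_hw_tsomaxseg), compute the largest TSO segment that
	 * tcp_output() should build, assuming the worst case of one
	 * cluster-sized mbuf per scatter entry.
	 */
	static size_t
	tso_limit(size_t hw_tsomax, size_t hw_tsomaxseg, size_t clbytes,
	    size_t hdrlen)
	{
		/* Bytes the chain may hold if every mbuf carries clbytes. */
		size_t seglimit = hw_tsomaxseg * clbytes - hdrlen;

		return (seglimit < hw_tsomax ? seglimit : hw_tsomax);
	}

For the case above, tso_limit(65535, 32, 2048, 18) gives 65518, matching
the cap that ixgbe.patch would set directly.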

rick


-------------- next part --------------
A non-text attachment was scrubbed...
Name: ixgbe.patch
Type: text/x-patch
Size: 530 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-net/attachments/20140323/a786ac5b/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 4kmcl.patch
Type: text/x-patch
Size: 8152 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-net/attachments/20140323/a786ac5b/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tsomaxseg.patch
Type: text/x-patch
Size: 4179 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-net/attachments/20140323/a786ac5b/attachment-0002.bin>

