igb driver RX (was TX) hangs when out of mbuf clusters

Fri Feb 11 19:53:58 UTC 2011

Hi,

I see a commit was made in current (r218530 | jfv | 2011-02-10 20:00:26
-0500 (Thu, 10 Feb 2011)). Is that commit done to address this issue?

And if so Is there any MFC planned for 7.4 for this?

Thanks,

Karim.

2011/2/9 Michael Tuexen <tuexen at freebsd.org>

> On Feb 9, 2011, at 6:35 PM, Jack Vogel wrote:
>
> > OK, but the question is why does the ring get totally consumed this way,
> the
> > ring has 1024 descriptors, it seems unintuitive that that whole quantity
> can be
> > used without some being recharged. Do you see the system mbuf pool being
> > depleted at the same time?
> That was the test case I created: I set up a server accepting connections
> but not reading anything. So the driver passes the mbufs to the transport
> stack and they are not consumed. Then the problem occurs. Then I kill the
> server. Now there are mbufs available again, but the driver doesn't know.
>
> I had the impression that these were the circumstances in which the problem
> showed up (mbuf allocations failing).
> >
> > Since you can reproduce it, do me a favor, in rxeof,  change the
> processed
> > value from 8 to 4 and then 1, effectively call refresh every descriptor,
> see if
> > that eliminates the issue.
> I will do. Need to see if I can do it remotely, since I'm not in my lab
> right now. Can do it tomorrow for sure.
>
> But I do not think that this solves the problem, since I did the things
> very slowly and you call it at least when you are leaving rxeof.
>
> Best regards
> Michael
> >
> > Thanks for your help,
> >
> > Jack
> >
> >
> > On Wed, Feb 9, 2011 at 2:36 AM, Michael Tuexen <tuexen at freebsd.org>
> wrote:
> > Hi Jack,
> >
> > I could recreate the problem. When the problem occurs, we see
> >
> > rx_nxt_check = n
> > rx_nxt_refresh = n + 1
> >
> > (This was also reported in a mail from Karim)
> >
> > This means that the *whole* receive ring has no buffers anymore. This can
> > occur if, for some amount of time, no clusters are available.
> >
> > Now outside of the driver, at some point of time, clusters are freed.
> > I don't think that igb_refresh_mbufs() gets called, since it only gets
> > called from igb_rxeof(), which gets called when a packet has been
> received,
> > which can not happen since the receive ring is empty. So how can the
> driver
> > know? I have no idea. Maybe we can periodically check for such an event
> > and call igb_refresh_mbufs().
> >
> > Does this make sense to you?
> >
> > Best regards
> > Michael
> >
> >
> > On Feb 9, 2011, at 8:32 AM, Jack Vogel wrote:
> >
> > > Hmmm, well so much for that theory :)
> > >
> > > Jack
> > >
> > >
> > > On Tue, Feb 8, 2011 at 4:06 PM, Karim Fodil-Lemelin <
> fodillemlinkarim at gmail.com> wrote:
> > >
> > >
> > > 2011/2/8 Jack Vogel <jfvogel at gmail.com>
> > >
> > >
> > > I have been following this, and thinking about it. I still am working
> from a theoretical
> > > standpoint, but based on a patch I got quite a long time back and never
> quite groked,
> > > I believe now that I might have a solution.
> > >
> > > The original PR and patch was kern/150516 from Beezar Liu,  I was never
> quite comfortable
> > > with the code changes, nor convinced that it was a real issue and not a
> misunderstanding.
> > > However I think now that this very report might be behind what we are
> seeing today. I have
> > > a slightly different approach to solving it, of course it remains to be
> seen if it handles it
> > > properly.
> > >
> > > Please try the patch I've attached, I'm open to further correction or
> polishing of the
> > > changes. And thanks to Beezar for his original report and changes, this
> is not for em,
> > > but if this eliminates the problem its clearly needed in all drivers.
> > >
> > > Jack
> > >
> > >
> > > Hi Jack,
> > >
> > > Thanks for your help. I tried your patch and it didn't work so I added
> a couple of printf to see if the added code was getting hit:
> > >
> > > --- a/freebsd/sys/dev/e1000/if_igb.c
> > > --More--(byte 1253)+++ b/freebsd/sys/dev/e1000/if_igb.c
> > > @@ -612,7 +612,7 @@ igb_attach(device_t dev)
> > >             device_get_nameunit(dev));
> > >
> > >         INIT_DEBUGOUT("igb_attach: end");
> > > -
> > > +       printf("this driver has a patch from Jack Vogel\n");
> > >         return (0);
> > >
> > >  err_late:
> > > @@ -4131,6 +4131,7 @@ igb_rxeof(struct igb_queue *que, int count, int
> *done)
> > >                 struct mbuf             *sendmp, *mh, *mp;
> > >                 struct igb_rx_buf       *rxbuf;
> > >                 u16                     hlen, plen, hdr, vtag;
> > > +               int                     commit;
> > >                 bool                    eop = FALSE;
> > >
> > >                 cur = &rxr->rx_base[i];
> > > @@ -4255,10 +4256,23 @@ next_desc:
> > >                 bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
> > >                     BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
> > >
> > > +               commit = i;     /* capture the old index */
> > > +
> > >                 /* Advance our pointers to the next descriptor. */
> > >                 if (++i == adapter->num_rx_desc)
> > >                         i = 0;
> > >                 /*
> > > +               ** Sanity test for ring full, if this
> > > +               ** happens we need to refresh immediately
> > > +               ** or refresh may deadlock.
> > > +               */
> > > +               if (i == rxr->next_to_refresh) {
> > > +                       igb_refresh_mbufs(rxr, commit);
> > > +                       printf("igb_refresh_mbufs called with commit
> %d\n", commit);
> > > +                       processed = 0;
> > > +               }
> > > +
> > > +               /*
> > >                 ** Send to the stack or LRO
> > >                 */
> > >                 if (sendmp != NULL) {
> > >
> > > Here is the results:
> > >
> > > # dmesg | grep Vogel
> > > this driver has a patch from Jack Vogel
> > > this driver has a patch from Jack Vogel
> > >
> > > # netstat -m
> > > 60453/52707/113160 mbufs in use (current/cache/total)
> > > 48416/51584/100000/100000 mbuf clusters in use
> (current/cache/total/max)
> > > 2894/690 mbuf+clusters out of packet secondary zone in use
> (current/cache)
> > > 11946/854/12800/12800 4k (page size) jumbo clusters in use
> (current/cache/total/max)
> > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> > > 164834K/119760K/284595K bytes allocated to network
> (current/cache/total)
> > > 0/339/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > > 0/4/6656 sfbufs in use (current/peak/max)
> > > 0 requests for sfbufs denied
> > > 0 requests for sfbufs delayed
> > > 0 requests for I/O initiated by sendfile
> > > 0 calls to protocol drain routines
> > > # dmesg | grep commit
> > >
> > > At this point RX has hung.
> > >
> > > Somehow the check (i == rxr->next_to_refresh) is never true in this
> case. Also, I did read kern/150516 and couldn't wrap my head around the
> patch for the em driver that Beezar Liu suggested.
> > >
> > > Regards,
> > >
> > > Karim.
> > >
> > >
> >
> >
>
>