igb driver RX (was TX) hangs when out of mbuf clusters
Jack Vogel
jfvogel at gmail.com
Wed Feb 9 07:32:27 UTC 2011
Hmmm, well so much for that theory :)
Jack
On Tue, Feb 8, 2011 at 4:06 PM, Karim Fodil-Lemelin <
fodillemlinkarim at gmail.com> wrote:
>
>
> 2011/2/8 Jack Vogel <jfvogel at gmail.com>
>
>
>> I have been following this, and thinking about it. I still am working from
>> a theoretical
>> standpoint, but based on a patch I got quite a long time back and never
>> quite groked,
>> I believe now that I might have a solution.
>>
>> The original PR and patch was kern/150516 from Beezar Liu, I was never
>> quite comfortable
>> with the code changes, nor convinced that it was a real issue and not a
>> misunderstanding.
>> However I think now that this very report might be behind what we are
>> seeing today. I have
>> a slightly different approach to solving it, of course it remains to be
>> seen if it handles it
>> properly.
>>
>> Please try the patch I've attached, I'm open to further correction or
>> polishing of the
>> changes. And thanks to Beezar for his original report and changes, this is
>> not for em,
>> but if this eliminates the problem its clearly needed in all drivers.
>>
>> Jack
>>
>>
>> Hi Jack,
>
> Thanks for your help. I tried your patch and it didn't work so I added a
> couple of printf to see if the added code was getting hit:
>
> --- a/freebsd/sys/dev/e1000/if_igb.c
> --More--(byte 1253)+++ b/freebsd/sys/dev/e1000/if_igb.c
> @@ -612,7 +612,7 @@ igb_attach(device_t dev)
> device_get_nameunit(dev));
>
> INIT_DEBUGOUT("igb_attach: end");
> -
> + printf("this driver has a patch from Jack Vogel\n");
> return (0);
>
> err_late:
> @@ -4131,6 +4131,7 @@ igb_rxeof(struct igb_queue *que, int count, int
> *done)
> struct mbuf *sendmp, *mh, *mp;
> struct igb_rx_buf *rxbuf;
> u16 hlen, plen, hdr, vtag;
> + int commit;
> bool eop = FALSE;
>
> cur = &rxr->rx_base[i];
> @@ -4255,10 +4256,23 @@ next_desc:
> bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
> BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
>
> + commit = i; /* capture the old index */
> +
> /* Advance our pointers to the next descriptor. */
> if (++i == adapter->num_rx_desc)
> i = 0;
> /*
> + ** Sanity test for ring full, if this
> + ** happens we need to refresh immediately
> + ** or refresh may deadlock.
> + */
> + if (i == rxr->next_to_refresh) {
> + igb_refresh_mbufs(rxr, commit);
> + printf("igb_refresh_mbufs called with commit %d\n",
> commit);
> + processed = 0;
> + }
> +
> + /*
> ** Send to the stack or LRO
> */
> if (sendmp != NULL) {
>
> Here is the results:
>
> # dmesg | grep Vogel
> this driver has a patch from Jack Vogel
> this driver has a patch from Jack Vogel
>
> # netstat -m
> 60453/52707/113160 mbufs in use (current/cache/total)
> 48416/51584/100000/100000 mbuf clusters in use (current/cache/total/max)
> 2894/690 mbuf+clusters out of packet secondary zone in use (current/cache)
> 11946/854/12800/12800 4k (page size) jumbo clusters in use
> (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 164834K/119760K/284595K bytes allocated to network (current/cache/total)
> 0/339/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/4/6656 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
> # dmesg | grep commit
>
> At this point RX has hung.
>
> Somehow the check (i == rxr->next_to_refresh) is never true in this case.
> Also, I did read kern/150516 and couldn't wrap my head around the patch for
> the em driver that Beezar Liu suggested.
>
> Regards,
>
> Karim.
>
>
More information about the freebsd-net
mailing list