em net (optical GigE) driver hangs?
Don Bowman
don at sandvine.com
Tue Apr 22 14:47:44 PDT 2003
> From: John Polstra [mailto:jdp at polstra.com]
> Sent: April 22, 2003 16:12
> To: net at freebsd.org
> Subject: Re: em net (optical GigE) driver hangs?
>
>
> In article
> <FE045D4D9F7AED4CBFF1B3B813C8533701918A83 at mail.sandvine.com>,
> Dave Dolson <ddolson at sandvine.com> wrote:
> >
> > Has anyone experienced em interface hangs after approx several days
> > of heavy operation?
> >
> > We are using a system which is mostly RELENG_4_7, using multiple
> > optical em GigE devices.
> >
> > The symptom is that the interface stops transmitting or receiving,
> > reporting drops on output (no tx descriptors) and input errors (MPC
> > stat-->no receive descriptors).
> >
> > It turns out that all but 64 transmit descriptors are in use. The
> > driver is waiting for the "done" flag to be set so it can clean the
> > descriptors.
> > The device is also in the OACTIVE state at this time.
> >
> > After the interface is brought down (or unplugged), the em watchdog
> > timer goes off 5s later.
> >
> > We are trying to figure out two things:
> > 1. why did the driver lock up?
> > 2. why didn't the watchdog timer go off earlier?
> >
> > I think we would be happy to solve #2 given the rarity of the event.
> > Is the RELENG_4 version likely to fix the problem?
>
> I think the RELENG_4 version is likely to eliminate the problem. See
> the comment near the define of EM_RDTR in if_em.h (in the RELENG_4
> version of that file, of course).
We saw that, but we are using DEVICE_POLLING, so we assumed it was not
the issue. We think instead it's another problem, which is also solved
in the RELENG_4 driver, in that em_poll() calls em_start() if the device
is running and there are packets on the queue. em_start() re-arms the
timer, holding off the watchdog forever.
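
To make the suspected mechanism concrete, here is a toy model in C. It is
not the if_em.c code: sim_if, sim_start(), sim_poll(), sim_slowtimo(),
sim_run() and the rearm_when_stuck flag are all made up for this sketch,
and the "re-arm only when a packet is really handed to the hardware"
variant is just one assumed way of avoiding the hold-off, not necessarily
what RELENG_4 does.

```c
#include <assert.h>

#define TX_TIMEOUT 5            /* seconds; the thread mentions a 5s watchdog */

/* Hypothetical stand-in for the few interface fields that matter here. */
struct sim_if {
    int if_timer;               /* seconds until the watchdog fires; 0 = disarmed */
    int snd_qlen;               /* packets waiting on the send queue */
    int tx_avail;               /* free transmit descriptors; 0 = ring is stuck */
    int wdog_fired;             /* set once the watchdog has run */
};

/* Models the start routine. With rearm_when_stuck != 0 it re-arms the
 * timer on every call that finds work queued, even if nothing was sent. */
static void sim_start(struct sim_if *ifp, int rearm_when_stuck)
{
    if (ifp->snd_qlen == 0)
        return;
    if (ifp->tx_avail > 0) {
        ifp->tx_avail--;        /* packet handed to the hardware */
        ifp->snd_qlen--;
        ifp->if_timer = TX_TIMEOUT;
    } else if (rearm_when_stuck) {
        ifp->if_timer = TX_TIMEOUT;     /* the suspected hold-off */
    }
}

/* Models the poll handler: kick the start routine if packets are queued. */
static void sim_poll(struct sim_if *ifp, int rearm_when_stuck)
{
    if (ifp->snd_qlen > 0)
        sim_start(ifp, rearm_when_stuck);
}

/* Models the once-per-second countdown that drives the watchdog. */
static void sim_slowtimo(struct sim_if *ifp)
{
    if (ifp->if_timer == 0 || --ifp->if_timer > 0)
        return;
    ifp->wdog_fired = 1;
}

/* Runs the model with a stuck ring and a non-empty queue. Returns the
 * second at which the watchdog fired, or -1 if it never did. */
static int sim_run(int rearm_when_stuck, int seconds)
{
    struct sim_if ifp = { TX_TIMEOUT, 10, 0, 0 };

    assert(seconds > 0);
    for (int s = 1; s <= seconds; s++) {
        sim_poll(&ifp, rearm_when_stuck);   /* polled at least once a second */
        sim_slowtimo(&ifp);
        if (ifp.wdog_fired)
            return s;
    }
    return -1;
}
```

In this model, sim_run(1, 60) returns -1 (the timer is pushed back to 5
on every poll, so the watchdog never runs), while sim_run(0, 60) returns
5. That would also fit the watchdog going off 5s after the interface is
brought down, if polling stops at that point.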
--don