cvs commit: src/sys/dev/em if_em.c
Ruslan Ermilov
ru at freebsd.org
Tue Aug 22 10:13:23 UTC 2006
On Tue, Aug 22, 2006 at 06:44:52PM +0900, Pyun YongHyeon wrote:
> On Tue, Aug 22, 2006 at 12:22:37PM +0400, Ruslan Ermilov wrote:
> > I agree this is a less painful way to recover, but it's still a
> > watchdog and it slows down the performance when it happens. After
> > this commit, if there's a moderate number of missing Tx completion
> > interrupts (for some reason), even a diagnostic message won't be
> > printed. This is bad -- users will "seem" to have working but
> > slow systems, without any indication of what causes this slowness.
>
> It just reinvokes the txeof handler and checks whether there are pending
> Tx descriptors in the driver queue. If there are no pending Tx
> descriptors, it's a false watchdog timeout and we just return without
> resetting the hardware.
>
This is all clear.
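For reference, here is that logic as I understand it, reduced to a standalone model. The struct and the model_* names are invented for illustration and are not the ones in if_em.c; treat it as a sketch, not the committed code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver softc; not the real struct adapter. */
struct tx_model {
    int num_tx_desc;        /* size of the Tx ring */
    int num_tx_desc_avail;  /* free descriptors as the driver sees them */
    int hw_done;            /* finished by the chip, not yet reclaimed */
    int resets;             /* number of times the chip was reinitialized */
};

/* Model of the txeof handler: reclaim everything the hardware finished. */
static void model_txeof(struct tx_model *sc)
{
    sc->num_tx_desc_avail += sc->hw_done;
    sc->hw_done = 0;
}

/*
 * Model of the watchdog. Returns true when the timeout was false
 * (only the completion interrupt was lost), false when a reset was needed.
 */
static bool model_watchdog(struct tx_model *sc)
{
    model_txeof(sc);
    if (sc->num_tx_desc_avail == sc->num_tx_desc)
        return true;            /* nothing pending: no reset */

    sc->resets++;               /* descriptors really stuck: reinit */
    sc->num_tx_desc_avail = sc->num_tx_desc;
    return false;
}
```

With 56 descriptors completed but unreclaimed, model_watchdog() returns true and leaves resets at zero; if some descriptors never complete, it falls through to the reset path.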
> So there is no performance drop. Of course, if we are out of
> Tx descriptors and have missed Tx completion interrupts, it would
> slow down the Tx process.
>
Yes, that's what I was talking about.
> ATM I don't know what caused this missing Tx completion interrupt.
> (chipset bug/Tx interrupt moderation or other bug)
>
> > I think a diagnostic message should still be printed in this case,
> I have no objection to printing a diagnostic message. But if missing
> Tx completion interrupts are a normal consequence for this hardware,
> it would give a negative impression to users.
>
It would tell the truth, like
em0: watchdog timeout (missed Tx interrupt) -- recovering
(Maybe under bootverbose only.)
> > and adapter->watchdog_events should still be incremented, we just
> > don't need to reinit the chip in this case.
> >
> adapter->watchdog_events is used to count output errors (if_oerrors).
> If we know the watchdog timeout is false we should not increment the
> counter, as we successfully transmitted the packet without errors.
>
It's still a watchdog event. We can make it a separate counter,
like watchdog_tx_event, and not add it to oerrors, but still show
it in em_print_hw_stats(). It'd be useful to have these statistics
available.
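Roughly what I have in mind, as a standalone sketch. The struct, the model_* names and the model_bootverbose flag are made up for this example, and the second message text is only a placeholder; the real change would live in the driver's watchdog and em_print_hw_stats().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counters; only watchdog_events exists in the driver today. */
struct wd_stats {
    unsigned long watchdog_events;    /* real timeouts, counted as oerrors */
    unsigned long watchdog_tx_event;  /* proposed: recovered false timeouts */
};

static int model_bootverbose = 1;     /* stand-in for the kernel's bootverbose */

/*
 * Account for one watchdog firing and format the diagnostic into buf.
 * buf is left empty when nothing should be printed.
 */
static void model_watchdog_account(struct wd_stats *st, bool recovered,
    char *buf, size_t len)
{
    if (recovered)
        st->watchdog_tx_event++;
    else
        st->watchdog_events++;

    if (len == 0)
        return;
    buf[0] = '\0';
    if (!recovered)
        snprintf(buf, len, "em0: watchdog timeout -- resetting");
    else if (model_bootverbose)
        snprintf(buf, len,
            "em0: watchdog timeout (missed Tx interrupt) -- recovering");
}

/* Only real timeouts feed the output error count. */
static unsigned long model_oerrors(const struct wd_stats *st)
{
    return st->watchdog_events;
}
```

This keeps if_oerrors honest while still leaving a trace, in both the log and the statistics, that the driver had to rescue itself.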
> Because it's hard to reproduce, I guess it only happens under
> certain conditions. In addition, we don't know how many Tx completion
> interrupts are lost. If you think it should recover fast from the
> above condition without waiting for a watchdog timeout, we could
> embed an em_txeof() call into em_local_timer() to sweep up Tx
> descriptors that were successfully transmitted.
>
That would make it look more like polling. :-)
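To illustrate why, here is a toy model of that periodic sweep. The ring_* names and the struct are hypothetical, and the real em_local_timer() does other periodic work that is left out here.

```c
/* Hypothetical model of the Tx ring state; not the driver's structures. */
struct ring_model {
    int avail;    /* free Tx descriptors as seen by the driver */
    int hw_done;  /* completed by hardware but not yet reclaimed */
    int sweeps;   /* timer ticks that found something to reclaim */
};

/* Model of em_txeof(): reclaim all completed descriptors. */
static void ring_txeof(struct ring_model *r)
{
    r->avail += r->hw_done;
    r->hw_done = 0;
}

/*
 * Model of the periodic timer with the proposed sweep added.
 * Every tick reclaims descriptors whether or not an interrupt arrived,
 * which is in effect low-rate polling of the Tx ring.
 */
static void ring_local_timer(struct ring_model *r)
{
    if (r->hw_done > 0)
        r->sweeps++;
    ring_txeof(r);
}
```

The sweeps counter would double as the statistic for how often an interrupt was actually missed.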
I'm pretty sure this problem is not unique to em(4). Adding
these quirks to all drivers known to be subject to this issue
and gathering the statistics would be a good thing IMO.
Cheers,
--
Ruslan Ermilov
ru at FreeBSD.org
FreeBSD committer