svn commit: r323516 - in head/sys: dev/bnxt dev/e1000 kern net sys
Bruce Evans
brde at optusnet.com.au
Sat Sep 16 09:41:30 UTC 2017
On Sat, 16 Sep 2017, Alexander Leidinger wrote:
> Quoting Bruce Evans <brde at optusnet.com.au> (from Sat, 16 Sep 2017 13:46:37
> +1000 (EST)):
>
>> It gives lesser breakage here:
>> - with an old PCI em, an error that occurs every few makeworlds over nfs now
>> hangs the hardware. It used to be recovered from after about 10 seconds.
>> This only happened once. I then applied my old fix which ignores the
>> error better so as to recover from it immediately. This seems to work as
>> before.
>
> As I also have an em device which switches into non-working state: what's the
> patch you have for this? I would like to see if your change also helps my
> device to get back into working shape again.
X Index: em_txrx.c
X ===================================================================
X --- em_txrx.c (revision 323636)
X +++ em_txrx.c (working copy)
X @@ -640,9 +640,20 @@
X
X /* Make sure bad packets are discarded */
X if (errors & E1000_RXD_ERR_FRAME_ERR_MASK) {
X +#if 0
X adapter->dropped_pkts++;
X - /* XXX fixup if common */
X return (EBADMSG);
X +#else
X + /*
X + * XXX the above error handling is worse than none.
X + * First it drops 'i' packets before the current
X + * one and doesn't count them. Then it returns an
X + * error. iflib can't really handle this error.
X + * It just resets, and this usually drops many more
X + * packets (without counting them) and much time.
X + */
X + printf("lem: frame error: ignored\n");
X +#endif
X }
X
X ri->iri_frags[i].irf_flid = 0;
This is for old em. nfs doesn't seem to notice the dropped packet(s) after
this.
I think the comment "fixup if common" means "this error should actually
be handled if it occurs enough to matter".
I removed the increment of the dropped packet count because, with the change,
no packets are dropped directly here. I think the error applies only to the
current packet, but the old code might drop more than 1 packet by returning.
Debugging code seems to show no more than 1 packet at a time having an error,
though. I think returning drops good packets after the bad one and also leaves
the state inconsistent, so that it takes almost a reset to recover.
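To illustrate the difference, here is a minimal, self-contained sketch of the two policies. It is a toy model and not the driver code: the types toy_rxd and toy_result and the functions toy_scan_abort() and toy_scan_ignore() are hypothetical names invented for this example, and the model covers only the entries abandoned before the error, not the later reset in iflib.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified receive descriptor; not the real e1000 layout. */
struct toy_rxd {
	int len;		/* bytes in this descriptor */
	int frame_err;		/* nonzero if a frame error was flagged */
};

/* Hypothetical summary of one scan over a batch of descriptors. */
struct toy_result {
	int accepted;		/* entries passed on to the caller */
	int counted;		/* drops recorded in a counter */
	int uncounted;		/* entries abandoned without being counted */
	int ignored_errors;	/* frame errors that were logged and ignored */
};

/*
 * Old policy as described above: on the first frame error, count one
 * drop and return EBADMSG.  The 'i' entries gathered before the error
 * are abandoned and never counted.
 */
int
toy_scan_abort(const struct toy_rxd *rxd, size_t n, struct toy_result *res)
{
	size_t i;

	assert(rxd != NULL && res != NULL);
	res->accepted = res->counted = res->uncounted = 0;
	res->ignored_errors = 0;
	for (i = 0; i < n; i++) {
		if (rxd[i].frame_err) {
			res->counted = 1;
			res->uncounted = res->accepted;
			res->accepted = 0;
			return (EBADMSG);
		}
		res->accepted++;
	}
	return (0);
}

/*
 * Patched policy: note the frame error and keep going, so nothing is
 * dropped here and the caller never sees an error.
 */
int
toy_scan_ignore(const struct toy_rxd *rxd, size_t n, struct toy_result *res)
{
	size_t i;

	assert(rxd != NULL && res != NULL);
	res->accepted = res->counted = res->uncounted = 0;
	res->ignored_errors = 0;
	for (i = 0; i < n; i++) {
		if (rxd[i].frame_err)
			res->ignored_errors++;
		res->accepted++;
	}
	return (0);
}
```

With a batch of 4 entries where the third has a frame error, toy_scan_abort() returns EBADMSG with 2 uncounted losses and 1 counted drop, while toy_scan_ignore() returns 0 and accepts all 4.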
X @@ -703,8 +714,12 @@
X
X /* Make sure bad packets are discarded */
X if (staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) {
X +#if 0
X adapter->dropped_pkts++;
X return EBADMSG;
X +#else
X + printf("em: frame error: ignored\n");
X +#endif
X }
X
X ri->iri_frags[i].irf_flid = 0;
This is for newer em. I haven't noticed any problems with that (except it
has 27 usec higher latency).
Bruce