[Bug 237915] "netstat -i" for ixl/lagg shows idrop as 18446744073709551612 (-4) - incorrectly intialized counters?

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 11 May 2022 15:55:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237915

Brian Poole <brian90013@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |brian90013@gmail.com

--- Comment #2 from Brian Poole <brian90013@gmail.com> ---
Created attachment 233858
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=233858&action=edit
Patch to call ixl_vsi_reset_stats()

Hello,

I am seeing the same issue with the ixl driver on FreeBSD-12.3 and believe I
have a fix.

The hardware counters can start with any value, so the
ixl_stat_update48()/ixl_stat_update32() functions check if this is the first
read. If so, they save the current value as an offset which is subtracted from
all later reads.

On two different servers (one using ixl for 10G, the other for 40G) I used
debugging output to determine that some counters were small but non-zero at
boot time. These values are saved as the offsets. The counters then appear to
be reset to 0, so on the next call newval < offset and the function returns

  newval + (1<<bitsize) - offset = 0 + (1<<bitsize) - offset

which produces the huge values we see. Note the multicast counters are 48 bits
wide (281T) and the discard counter is 32 bits wide (4.3G). If you run netstat
with the -a option you can see the contribution from the multicast registers.
On my machines multicast represents the total count.
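
The arithmetic above can be sketched in a few lines of C. This is only a
simplified model of the behaviour described in this comment, not the driver's
code: the names counter_state and counter_update are hypothetical, and the
register width is passed in as a parameter for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an offset-corrected counter; not the ixl source. */
struct counter_state {
	bool     offset_loaded;	/* false until the first read */
	uint64_t offset;	/* raw register value captured on the first read */
};

/*
 * Return the elapsed count for a raw register value that is `bits` wide.
 * The first read stores the raw value as the offset. A later raw value
 * below the offset is assumed to be a wraparound of the register.
 */
static uint64_t
counter_update(struct counter_state *cs, uint64_t raw, unsigned bits)
{
	assert(bits > 0 && bits < 64);

	if (!cs->offset_loaded) {
		cs->offset = raw;
		cs->offset_loaded = true;
	}
	if (raw >= cs->offset)
		return (raw - cs->offset);

	/* raw < offset: treated as a wrap, so add 2^bits before subtracting */
	return (raw + ((uint64_t)1 << bits) - cs->offset);
}
```

In this model, if the first read sees 4 and the register then reads 0, the
32-bit case returns 2^32 - 4 = 4294967292 and the 48-bit case returns
2^48 - 4 = 281474976710652, even though no traffic was counted.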


There is an ixl_vsi_reset_stats() function that resets the stats, the offsets,
and the flag indicating offsets have been set. This function is not called
anywhere in the driver. Looking at the ice driver, it has a similarly named
function ice_reset_vsi_stats() that is called at the end of
ice_initialize_vsi(). Back in ixl, there is an ixl_initialize_vsi() function,
which seems like a natural place to add the call.
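
The shape of the change can be illustrated with a small stand-alone model. This
is a sketch under stated assumptions, not the attached patch: vsi_model,
vsi_stats, vsi_model_reset_stats() and vsi_model_initialize() are hypothetical
stand-ins for the driver's structures and functions, and the real
initialization work is elided.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's per-VSI statistics. */
struct vsi_stats {
	uint64_t rx_multicast;
	uint64_t rx_discards;
};

struct vsi_model {
	struct vsi_stats stats;		/* elapsed values reported to the stack */
	struct vsi_stats stats_offsets;	/* raw values captured on first read */
	bool stat_offsets_loaded;	/* true once offsets have been captured */
};

/* Clear the stats, the offsets, and the "offsets loaded" flag. */
static void
vsi_model_reset_stats(struct vsi_model *vsi)
{
	memset(&vsi->stats, 0, sizeof(vsi->stats));
	memset(&vsi->stats_offsets, 0, sizeof(vsi->stats_offsets));
	vsi->stat_offsets_loaded = false;
}

/* Model of an initialize routine that ends by resetting the stats. */
static int
vsi_model_initialize(struct vsi_model *vsi)
{
	/* ... hardware and queue setup would happen here (elided) ... */

	/* Force the next stats read to capture fresh offsets. */
	vsi_model_reset_stats(vsi);
	return (0);
}
```

Because the flag is cleared at the end of initialization, the next stats update
captures new offsets from whatever the registers hold at that point, rather
than reusing values read before the hardware reset them.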

I made the addition and tested on both machines. Across multiple reboots the
elapsed stats have always started at zero. I have observed no change in stats
when carrier is lost or the transceiver is pulled. The stats do reset to 0 when
the interface is administratively marked 'DOWN' and then 'UP'.
