cvs commit: src/sys/netinet in_var.h ip_fastfwd.c ip_flow.c
ip_flow.h ip_input.c ip_output.c src/sys/sys mbuf.h src/sys/conf
files src/sys/net if_arcsubr.c if_ef.c if_ethersubr.c if_fddisubr.c
if_iso88025subr.c if_ppp.c
Peter Jeremy
peterjeremy at optushome.com.au
Sat Nov 15 17:48:30 PST 2003
On Sat, Nov 15, 2003 at 11:35:45AM +0100, Andre Oppermann wrote:
> To put this more into perspective wrt counter wrapping, on
>my interfaces I have a byte counter wrap every 40 minutes or so.
>So the true ratio is probably even far less than one percent and
>more in the region of one per mille. The wrapping looks really ugly
>on MRTG and RRtool graphs. Interface counters should be 64bit or
>they become useless with today's traffic levels...
A perennial favourite. Atomically incremented 64-bit counters are
_very_ expensive on i386 and the consensus is that the cost is
unjustified in the general case. Feel free to supply patches to
optionally (at build-time) allow selection of 32-bit or 64-bit
counters. A work-around would be to simulate the top 32-bits by
counting rollovers in the bottom 32 bits (though this requires
co-operation by all consumers that want to see 64-bit values as
well as a background process).
I notice that even DEC/Compaq/HP Tru64 uses 32-bit counters for
network stats.
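The work-around mentioned above could be sketched roughly as follows.
This is only an illustration, not code from the tree: the names
(struct counter64, poll_counter64) are made up, and it assumes a
background process samples the raw 32-bit counter more often than it
can wrap, accumulating deltas into a 64-bit total.

```c
#include <stdint.h>

struct counter64 {
	uint32_t last;		/* last raw 32-bit value sampled */
	uint64_t total;		/* accumulated 64-bit count */
};

/*
 * Fold a fresh 32-bit counter sample into the 64-bit total.  Must be
 * called at least once per counter wrap interval.  Unsigned 32-bit
 * subtraction yields the correct delta across a single wrap.
 */
static uint64_t
poll_counter64(struct counter64 *c, uint32_t cur)
{
	c->total += (uint32_t)(cur - c->last);
	c->last = cur;
	return (c->total);
}
```

As noted, every consumer that wants the 64-bit value has to go through
this accumulated view rather than the raw counter.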
>> i am pretty sure that in any non-trivial case you will end up having
>> both the slow path and the fast path conflicting for the instruction
>> cache. Merging them might help -- i have seen many cases where
>> inlining code as opposed to explicit function calls makes things
>> slower for this precise reason.
>
>I will try to measure that with more precision. You did have
>code which was able to record and timestamp events several
>thousand times per second. Do you still have that code somewhere?
I've done similar things a couple of times using circular buffers
along the following lines:
#define RING_SIZE (1 << some_suitable_value)

int next_entry;
struct entry {
	some_time_t now;
	foo_t event;
} ring[RING_SIZE];

static __inline void
insert_event(foo_t event)
{
	int ix;

	/* following two lines need to be atomic to make this re-entrant */
	ix = next_entry;
	next_entry = (ix + 1) & (RING_SIZE - 1);
	ring[ix].now = read_time();
	ring[ix].event = event;
}
In userland, mmap(2) next_entry and ring to unload the events. Pick
RING_SIZE and the time types to suit requirements. The TSC has the
lowest overhead but worst jitter.
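The userland side could drain the ring along these lines. Again just a
sketch: drain_ring and tail are invented names, `ring' and `next_entry'
stand in for the kernel objects after mmap(2) has made them visible,
and the struct layout and RING_SIZE have to match the producer exactly.

```c
#include <stdint.h>

#define RING_SIZE (1 << 8)	/* must match the kernel side */

struct entry {
	uint64_t now;		/* timestamp */
	int	 event;
};

static int tail;		/* index of the next unread slot */

/*
 * Copy every entry the producer has published since the last call
 * into `out' and return how many were copied.  The consumer keeps
 * its own tail index; the producer only advances *next_entry.
 */
static int
drain_ring(const struct entry *ring, const int *next_entry,
    struct entry *out)
{
	int n = 0;

	while (tail != *next_entry) {
		out[n++] = ring[tail];
		tail = (tail + 1) & (RING_SIZE - 1);
	}
	return (n);
}
```

If the producer can lap the consumer, entries are silently overwritten;
for post-mortem event tracing that is usually acceptable.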
Peter