svn commit: r252032 - head/sys/amd64/include
Bruce Evans
brde at optusnet.com.au
Tue Jun 25 03:24:43 UTC 2013
On Tue, 25 Jun 2013, I wrote:
> My current best design:
> - use ordinary mutexes to protect counter fetches in non-per-CPU contexts.
> - use native-sized or always 32-bit counters. Counter updates are done
> by a single addl on i386. Fix pcpu.h on arches other than amd64 and
> i386 and use the same method as there.
> - counter fetches add the native-sized pcpu counters to 64-bit non-pcpu
> counters, when the native-size counters are in danger of overflowing
> or always, under the mutex. Transferring data uses an ordinary
> atomic_cmpset. To avoid ifdefs, always use u_int pcpu counters.
> The 64-bit non-pcpu counters can easily be changed to pcpu counters
> if the API is extended to support pcpu data.
> - run a daemon every few minutes to fetch all the counters, so that
> the native-sized counters are in no danger of overflowing on systems
> that don't run statistics programs often enough to fetch the counters
> for actual use.
There is at least 1 counter decrement (add -1) in tcp, so the native counters
need to be signed.
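To make the signedness concrete, the update path in this design would be
something like the following sketch (pseudo-code; counter_add() and the
critical section are inventions of this sketch, not committed code, and
the pointer arithmetic just mirrors the fetch code quoted below):

/*
 * Per-CPU update.  The slot is a signed int so that "add -1" on a slot
 * just reset to 0 by a fetch leaves -1, not 0xffffffff.  The add itself
 * is a single addl with a memory operand on i386.
 */
static inline void
counter_add(counter_u64_t c, int inc)
{

	critical_enter();
	*(int *)((char *)c + sizeof(struct pcpu) * curcpu) += inc;
	critical_exit();
}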
> ...
> With my design:
>
> extern struct mtx counter_locks[];
> extern uint64_t counters[];
This is pseudo-code. The extra structure must be dynamically allocated
with each counter. I'm not sure how to do that. uint64_t_pcpu_zone
is specialized for pcpu counters, and a non-pcpu part is needed.
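One possibility (purely illustrative; none of these names exist) is to
allocate a small non-pcpu record with each counter, holding the 64-bit
accumulator and its lock:

struct counter_extra {
	struct mtx	ce_lock;	/* or an index into a small lock array */
	uint64_t	ce_count;	/* accumulated transfers from the pcpu parts */
};

counter_u64_alloc() would then allocate this record alongside the pcpu
part and provide a way to map one to the other; the ctonum()/counters[]
indexing below stands in for that.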
> uint64_t r;
> volatile u_int *p;
> u_int v;
Change to int.
> int cnum;
>
> cnum = ctonum(c);
> mtx_lock(&counter_locks[cnum]); /* might not need 1 per counter */
> r = counters[cnum];
> for (i = 0; i < mp_ncpus; i++) {
> p = (u_int *)((char *)c + sizeof(struct pcpu) * i);
Change to int *.
> v = *p; /* don't care if it is stale */
> if (v >= 0x80000000) {
Change the single critical level to 2 critical levels: 0x40000000 for
positive values and -0x40000000 for negative values.
> /* Transfer only when above critical level. */
> while (atomic_cmpset_rel_int(p, v, 0) == 0)
> v = *p; /* still don't care if it is stale */
> counters[cnum] += v;
Even though full counter values are not expected to become negative,
the native counters can easily become negative when a decrement occurs
after the transfer resets them to 0.
> }
> r += v;
> }
> mtx_unlock(&counter_locks[cnum]);
> return (r);
>
> Mutexes give some slowness in the fetching code, but fetches are expected
> to be relatively rare.
>
> I added the complication to usually avoid atomic ops at the last minute.
> The complication might not be worth it.
The complication is to test v so as to normally avoid doing the transfer
and its atomic ops (and to avoid using atomic ops for loading v). The
complication is larger with 2 thresholds. If we always transferred,
then *p would usually be small and often 0, so that decrementing it
would often make it -1. This -1 must be transferred by adding -1, not
by adding 0xffffffff. Changing the type of the native counter to int
gives exactly this.
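Putting the corrections together, the fetch becomes roughly the following
(still pseudo-code; ctonum(), counter_locks[] and counters[] are the same
placeholders as above, and the cast is only because atomic_cmpset_rel_int()
takes a u_int pointer):

static uint64_t
counter_fetch(counter_u64_t c)
{
	uint64_t r;
	volatile int *p;
	int cnum, i, v;

	cnum = ctonum(c);
	mtx_lock(&counter_locks[cnum]);	/* might not need 1 per counter */
	r = counters[cnum];
	for (i = 0; i < mp_ncpus; i++) {
		p = (int *)((char *)c + sizeof(struct pcpu) * i);
		v = *p;			/* don't care if it is stale */
		if (v >= 0x40000000 || v <= -0x40000000) {
			/* Transfer only when beyond a critical level. */
			while (atomic_cmpset_rel_int((volatile u_int *)p,
			    v, 0) == 0)
				v = *p;	/* still don't care if it is stale */
			counters[cnum] += v;	/* negative v subtracts */
		}
		r += v;
	}
	mtx_unlock(&counter_locks[cnum]);
	return (r);
}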
Bruce