[CFR][CFT] counter(9): new API for faster and raceless counters

Pawel Jakub Dawidek pjd at FreeBSD.org
Wed Apr 3 10:02:11 UTC 2013


On Wed, Apr 03, 2013 at 02:28:46AM +0200, Luigi Rizzo wrote:
> On Wed, Apr 03, 2013 at 01:26:07AM +0200, Pawel Jakub Dawidek wrote:
> > On Mon, Apr 01, 2013 at 03:51:28PM +0400, Gleb Smirnoff wrote:
> > >   Hi!
> > > 
> > >   Together with Konstantin Belousov (kib@) we developed a new API that is
> > > initially purposed for (but not limited to) collecting statistical
> > > data in kernel.
> > 
> > Is there any plan to implement a universal way of exporting those
> > statistics out of the kernel?
> > 
> > Solaris has a framework for in-kernel statistics, which are exported via
> > kstat tool. For ZFS I export them via sysctl. If you have ZFS loaded you
> > can try 'sysctl kstat'.
> > 
> > It would be nice for counter_u64_alloc() to take additional argument
> > 'name' and to create sysctl for the counter automatically. We could then
> > slowly start migrating userland tools to use sysctls (or some wrapper
> > userland API), but we immediately make those statistics available for
> > use in scripts.
> 
> that is an interesting idea but i believe it can be effectively
> built as a wrapper on top of the counter_u64_alloc() routine:
> 
> 	name_counter(counter_t c, const char *fmt, ...);
> 	free_named_counter(counter_t c);
> 
> After all the name->counter mapping is unidirectional,
> and possibly not even necessary on every single counter
> (think of ipfw dynamic rules, created on packet arrivals,  so
> the counter alloc/dealloc needs to be fast).

Right, although I'd optimize API naming and usage for the common case.
Even though we do sometimes want to be able to alloc/free counters quickly,
most of the time we don't care about alloc/free speed and we would like
to have a name. Having a name argument that could be NULL for
short-lived counters would allow calling only one allocation function in
the common case (actually in every case).
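
To make the idea concrete, here is a rough userland sketch of what I
have in mind. Nothing below is the real counter(9) code: struct
mock_counter, mock_counter_alloc() and mock_counter_free() are
hypothetical names, a single int64_t stands in for the per-CPU storage,
and the place where a sysctl node would be created is only marked by a
comment.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Userland mock, NOT the kernel counter(9) implementation: one int64_t
 * stands in for the per-CPU storage.
 */
struct mock_counter {
	int64_t	 value;
	char	*name;	/* NULL for anonymous, short-lived counters */
};

/*
 * Hypothetical single entry point.  A NULL fmt skips all naming work, so
 * the fast path costs one allocation.  Returns NULL on failure.
 */
static struct mock_counter *
mock_counter_alloc(const char *fmt, ...)
{
	struct mock_counter *c;
	va_list ap;
	int len;

	c = calloc(1, sizeof(*c));
	if (c == NULL)
		return (NULL);
	if (fmt == NULL)
		return (c);	/* no name requested, nothing to register */

	/* First pass: measure the formatted name. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0) {
		free(c);
		return (NULL);
	}
	c->name = malloc((size_t)len + 1);
	if (c->name == NULL) {
		free(c);
		return (NULL);
	}
	/* Second pass: build the name. */
	va_start(ap, fmt);
	(void)vsnprintf(c->name, (size_t)len + 1, fmt, ap);
	va_end(ap);
	/* A real version would create the sysctl node for c->name here. */
	return (c);
}

static void
mock_counter_free(struct mock_counter *c)
{
	if (c == NULL)
		return;
	/* A real version would remove the sysctl node here. */
	free(c->name);
	free(c);
}
```

A per-interface counter would then be allocated with something like
mock_counter_alloc("net.if.%s.ipackets", ifname), while a dynamic-rule
counter would just pass NULL.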

> It might be useful for the name_counter() routine to support
> a printf-style argument to make it easy to build names.

Indeed.

> > > o Tiny API for counter(9):
> > > 
> > >      counter_u64_t
> > >      counter_u64_alloc(int wait);
> > > 
> > >      void
> > >      counter_u64_free(counter_u64_t cnt);
> > > 
> > >      void
> > >      counter_u64_add(counter_u64_t cnt, uint64_t inc);
> > > 
> > >      uint64_t
> > >      counter_u64_fetch(counter_u64_t cnt);
> > 
> > Do you really expect other types in the future? If so, could we at least
> > create generic counter_t that internally keeps the type?
> 
> I read the u64 in the name mostly as a reminder to users
> of the counter size. 

Should the users care? As a user of this KPI I'd prefer to have a simpler
name and just assume the counter is big enough.

> It might actually make sense to change the type to s64.
> This way we could have counters that go negative,
> and also use them to accumulate sbintime_t values.

Agreed, int64_t seems better.
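
A small userland illustration of the difference, assuming plain
integers instead of the real per-CPU counters. The helpers u64_add()
and s64_add() are hypothetical and exist only for this example:

```c
#include <stdint.h>

/*
 * Userland illustration only; the kernel KPI keeps per-CPU slots and
 * sums them on fetch.
 */

/* Unsigned counter: adding a negative delta wraps around modulo 2^64. */
static uint64_t
u64_add(uint64_t cnt, int64_t delta)
{
	return (cnt + (uint64_t)delta);
}

/* Signed counter: the same delta simply takes the value below zero. */
static int64_t
s64_add(int64_t cnt, int64_t delta)
{
	return (cnt + delta);
}
```

With an unsigned counter, u64_add(0, -1) reads back as UINT64_MAX,
while s64_add(0, -1) reads back as -1. Since sbintime_t is itself a
signed 64-bit type, a signed counter can accumulate those values
without a cast.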

> But otherwise i am not sure that we want other types.
> 
> u32/s32 might save atomic/critical_enter ops on some archs,
> but they saturate so quickly that probably are a bad idea.
> And 63/64 bits are quite large already.

Right, I don't think 32-bit counters are needed at all and I can't find
any use for 128-bit counters either.
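
For a feeling of the numbers, a back-of-the-envelope helper. The
function seconds_to_wrap() is hypothetical, and the 10 Gbit/s line rate
(1.25e9 bytes per second) used in the note after it is just an assumed
example load, not a measurement:

```c
/*
 * Seconds until a counter of the given width wraps when it is
 * incremented at a constant rate (units per second).
 */
static double
seconds_to_wrap(unsigned bits, double units_per_sec)
{
	double range = 1.0;
	unsigned i;

	/* range = 2^bits, computed without libm. */
	for (i = 0; i < bits; i++)
		range *= 2.0;
	return (range / units_per_sec);
}
```

Under that assumed load, seconds_to_wrap(32, 1.25e9) is about 3.4
seconds for a byte counter, while seconds_to_wrap(64, 1.25e9) works out
to several hundred years.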

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl