Memory allocation performance
Robert Watson
rwatson at FreeBSD.org
Sat Feb 2 01:59:45 PST 2008
On Sat, 2 Feb 2008, Alexander Motin wrote:
> Robert Watson wrote:
>> I guess the question is: where are the cycles going? Are we suffering
>> excessive cache misses in managing the slabs? Are you effectively "cycling
>> through" objects rather than using a smaller set that fits better in the
>> cache?
>
> In my test setup only a few objects from the zone are usually allocated at the
> same time, but they are allocated twice for every packet.
>
> To check the UMA dependency I have made a trivial one-element cache which in my
> test case avoids two of the four allocations per packet.
Avoiding unnecessary allocations is a good general principle, but duplicating
cache logic is a bad idea. If you're able to structure the below without
using locking, it strikes me you'd do much better, especially if it's in a
single processing pass. Can you not use a per-thread/stack/session variable
to avoid that?
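To make the per-thread suggestion concrete, here is a minimal userland sketch
of a lock-free one-element cache. It is an illustration only, not netgraph
code: struct item and its payload are hypothetical, calloc()/free() stand in
for uma_zalloc()/uma_zfree(), and C11 _Thread_local stands in for whatever
per-thread or per-session storage the real code path could offer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the queue item; the real layout is not shown here. */
struct item {
	char payload[64];
};

/* One cached item per thread. No other thread can reach it, so no lock is needed. */
static _Thread_local struct item *item_cache;

/* Userland stand-in for uma_zalloc(ng_qzone, wait | M_ZERO). */
static struct item *
item_alloc(void)
{
	struct item *item = item_cache;

	if (item != NULL) {
		item_cache = NULL;
		memset(item, 0, sizeof(*item));	/* keep the M_ZERO behaviour */
		return (item);
	}
	return (calloc(1, sizeof(*item)));	/* assumption: stands in for UMA */
}

/* Userland stand-in for uma_zfree(ng_qzone, item). */
static void
item_free(struct item *item)
{
	assert(item != NULL);
	if (item_cache == NULL)
		item_cache = item;	/* keep one item for the next allocation */
	else
		free(item);		/* cache already full, give it back */
}
```

Whether this is usable depends on the allocation and the free happening in
the same thread, which is the "single processing pass" condition above.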
> .....alloc.....
> - item = uma_zalloc(ng_qzone, wait | M_ZERO);
> + mtx_lock_spin(&itemcachemtx);
> + item = itemcache;
> + itemcache = NULL;
> + mtx_unlock_spin(&itemcachemtx);
Why are you using spin locks? They are quite a bit more expensive on several
hardware platforms, and any environment it's safe to call uma_zalloc() from
will be equally safe to use regular mutexes from (i.e., mutex-sleepable).
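For comparison, the same one-element cache with a regular (non-spin) lock
might look like the userland sketch below. The assumptions are loud ones: a
pthread mutex stands in for a default kernel mutex (mtx_lock()/mtx_unlock()
on an MTX_DEF mutex), calloc()/free() stand in for uma_zalloc()/uma_zfree(),
and struct qitem is a hypothetical placeholder for the real item.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the queue item. */
struct qitem {
	char payload[64];
};

/* Shared one-element cache, protected by an ordinary (non-spin) mutex. */
static pthread_mutex_t qitem_cache_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct qitem *qitem_cache;

static struct qitem *
qitem_alloc(void)
{
	struct qitem *item;

	/* Take the cached item, if any, while holding the lock. */
	pthread_mutex_lock(&qitem_cache_mtx);
	item = qitem_cache;
	qitem_cache = NULL;
	pthread_mutex_unlock(&qitem_cache_mtx);

	if (item == NULL)
		return (calloc(1, sizeof(*item)));	/* stands in for uma_zalloc() */
	memset(item, 0, sizeof(*item));		/* zero outside the lock */
	return (item);
}

static void
qitem_free(struct qitem *item)
{
	assert(item != NULL);

	/* Park the item in the cache if the slot is empty. */
	pthread_mutex_lock(&qitem_cache_mtx);
	if (qitem_cache == NULL) {
		qitem_cache = item;
		item = NULL;
	}
	pthread_mutex_unlock(&qitem_cache_mtx);

	if (item != NULL)
		free(item);			/* stands in for uma_zfree() */
}
```

The structure is the same as in the quoted patch; only the lock type differs.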
> + if (item == NULL)
> + item = uma_zalloc(ng_qzone, wait | M_ZERO);
> + else
> + bzero(item, sizeof(*item));
> .....free.....
> - uma_zfree(ng_qzone, item);
> + mtx_lock_spin(&itemcachemtx);
> + if (itemcache == NULL) {
> + itemcache = item;
> + item = NULL;
> + }
> + mtx_unlock_spin(&itemcachemtx);
> + if (item)
> + uma_zfree(ng_qzone, item);
> ...............
>
> To be sure that the test system is CPU-bound I have throttled it with sysctl to
> 1044MHz. With this patch the throughput of my test PPPoE-to-PPPoE router has
> grown from 17 to 21Mbytes/s. The profiling results I sent earlier predicted a
> similar gain.
>
>> Is some bit of debugging enabled that shouldn't be, perhaps due to a
>> failure of ifdefs?
>
> I have commented out all INVARIANTS and WITNESS options from the GENERIC kernel
> config. What else should I check?
Hence my request for drilling down a bit on profiling -- the question I'm
asking is whether profiling shows things running or taking time that shouldn't
be.
Robert N M Watson
Computer Laboratory
University of Cambridge