What is the PREEMPTION option good for?

Bruce Evans bde at zeta.org.au
Sat Dec 2 02:25:51 PST 2006


On Fri, 1 Dec 2006, Matthew Dillon wrote:

> :...
> :the client.  The difference is entirely due to dead time somewhere in
> :nfs.  Unfortunately, turning on PREEMPTION and IPI_PREEMPTION didn't
> :recover all the lost performance.  This is despite the ~current kernel
> :having slightly lower latency for flood pings and similar optimizations
> :for nfs that reduce the RPC count by a factor of 4 and the ping latency
> :by a factor of 2.
>
>    The single biggest NFS client performance issue I have encountered
>    in an environment where most of the data can be cached from earlier
>    runs is with negative name lookups.

That is one of my previous optimizations.  I obtained it from NetBSD,
not from you, sorry :-).  It is not quite ready to commit since I
haven't figured out the correct cache timeouts to use with it.  I think
the timeouts for negative cache hits need to be much smaller than
for positive ones, especially for directories, since stale positive
hits tend to cause RPCs that refresh the cache, while stale negative
hits tend to prevent RPCs until the cache times out.
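
To make the asymmetry concrete, here is a minimal sketch in C of
per-entry timeouts where negative entries expire much sooner than
positive ones.  This is illustration only, not the actual namecache
code; the struct, the function and the TTL values are all invented:

#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define	POS_TTL	60	/* stale positive hits self-correct via the RPCs they cause */
#define	NEG_TTL	3	/* stale negative hits block RPCs until they expire */

struct nc_entry {
	time_t	stamp;		/* when the result was cached */
	bool	negative;	/* true: "name does not exist" */
};

/* Return true if the cached result may still be used. */
static bool
nc_entry_valid(const struct nc_entry *e, time_t now)
{
	time_t ttl = e->negative ? NEG_TTL : POS_TTL;

	return (now - e->stamp < ttl);
}

int
main(void)
{
	time_t now = time(NULL);
	struct nc_entry neg = { now - 5, true };
	struct nc_entry pos = { now - 5, false };

	/* A 5-second-old negative entry is already stale; a positive one is not. */
	printf("negative still valid: %d\n", nc_entry_valid(&neg, now));
	printf("positive still valid: %d\n", nc_entry_valid(&pos, now));
	return (0);
}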

>    Due to the large number of -I
>    options used in builds, the include search path is fairly long and
>    this usually results in a large number of negative lookups, all of
>    which introduce synchronous dead times while the stat() or open()
>    waits for the over-the-wire transaction to complete.
>
>    The #1 solution is to cache negative namecache hits for NFS clients.
>    You don't have to cache them for long... just 3 seconds is usually
>    enough to remove most of the dead time.  Also make sure your access
>    cache timeout is something reasonable.
>
>    It is possible to reduce the number of over-the-wire transactions to
>    zero but it requires seriously nerfing the access and negative cache
>    timeouts.
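
The per-lookup cost is easy to see directly.  A rough timing loop like
the following stat()s names that do not exist on an NFS mount; the
/nfs/include* paths and the loop counts are placeholders, not a real
setup:

#include <stdio.h>
#include <sys/stat.h>
#include <sys/time.h>

int
main(void)
{
	struct timeval t0, t1;
	struct stat sb;
	char path[256];
	long us;
	int i;

	gettimeofday(&t0, NULL);
	for (i = 0; i < 1000; i++) {
		/* Probe include dirs that do not contain the header. */
		snprintf(path, sizeof(path), "/nfs/include%d/no_such.h",
		    i % 8);
		(void)stat(path, &sb);	/* expected to fail with ENOENT */
	}
	gettimeofday(&t1, NULL);
	us = (t1.tv_sec - t0.tv_sec) * 1000000L +
	    (t1.tv_usec - t0.tv_usec);
	printf("%ld us per failed lookup\n", us / 1000);
	return (0);
}

With no negative caching, every iteration is one synchronous round
trip; with it, almost every iteration after the first pass is served
from the cache.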

Negative cache hits were only my 3rd or 4th largest optimization.
Avoiding some foot-shooting gave #1 and #2 (or #3).  My normal, fairly
safe configuration requires 36000 RPCs for building a RELENG_4 kernel
(down from 120000 unoptimized).  Turning off close-to-open consistency
reduces this to 14000, but that is in the serious nerfing class, so I
don't normally use it.  Reducing network latency by turning off
interrupt moderation (or not using NICs that have it), and compiling
with -j4 even on UP systems, also helped, but since these use more
CPU they are not as free as reducing RPCs.  The dead time with all
of these except
turning off close-to-open consistency is about 2.5% for "make -j4" of
a RELENG_4 kernel with warm caches under 2-way SMP.  The dead time
for "make -j4 depend" is much larger since "depend" is not parallelized
so the latency for the RPCs can't be hidden.  OTOH, for parallelized
things, -jN works well for hiding the latency so the main cost of the
extra RPCs is just the CPU time to do them.
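
As a back-of-envelope check on why the RPC count matters most in the
serial phases, multiply the counts above by an assumed round trip;
the 200 us here is an assumption, not a measurement.  -jN turns most
of this bound into overlapped waiting, which is why the residual dead
time is only about 2.5% for the parallel build but dominates "make
depend":

#include <stdio.h>

int
main(void)
{
	double rtt_us = 200.0;	/* assumed per-RPC round trip */
	long unopt = 120000;	/* unoptimized build */
	long safe = 36000;	/* fairly safe configuration */
	long nocto = 14000;	/* close-to-open consistency off */

	/* Upper bound: every round trip is dead time if none overlap. */
	printf("unoptimized: %5.1f s serial dead time\n", unopt * rtt_us / 1e6);
	printf("safe config: %5.1f s serial dead time\n", safe * rtt_us / 1e6);
	printf("nocto:       %5.1f s serial dead time\n", nocto * rtt_us / 1e6);
	return (0);
}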

Bruce

