cvs commit: src/sys/kern init_main.c kern_malloc.c md5c.c
subr_autoconf.c subr_mbuf.c subr_prf.c tty_subr.c vfs_cluster.c
vfs_subr.c
Marcel Moolenaar
marcel at xcllnt.net
Tue Jul 22 15:53:08 PDT 2003
On Tue, Jul 22, 2003 at 11:32:58PM +0200, Poul-Henning Kamp wrote:
>
>            text    data     bss     dec     hex filename
> inlined:  17408      76     420   17904    45f0 vm_object.o
> regular:  14944      76     420   15440    3c50 vm_object.o
>           -----
>            2464
>
> At least I find that 2k+ code is a non-trivial amount which is
> likely, through prefetch and cache flushing, to have a negative
> performance impact.
Oh?
vm_object_backing_scan() has 3 call-sites. Each of the call-sites
has numerous calls to other functions that may or may not be
predicted right, prefetched right or mess up the instruction or
unified caches. While the inlined code yields a larger amount of
text, I find it hard to claim that this by itself overshadows the
performance advantages of increased ILP, improved scheduling due
to dead-code elimination, better cache behaviour due to increased
locality, branch misprediction avoidance, call overhead avoidance,
or just plain better PRE (partial redundancy elimination),
GCM (global code motion), GVN (global value numbering) or RA
(register allocation).
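
To illustrate the dead-code elimination point, here is a minimal sketch
in C. It is not the vm_object.c code: the function scan(), the flags
OP_COUNT_ZERO and OP_CLEAR_NEG, and the three wrappers are hypothetical
names made up for this example. The assumption is a routine whose
behaviour is selected by a flag argument and whose call-sites each pass
a compile-time constant. Once the routine is inlined, the compiler can
see the constant and may drop the branches that cannot be taken at that
call-site, at the cost of one copy of the loop per call-site.

```c
#include <stddef.h>

/* Hypothetical operation flags; each call-site passes a constant. */
#define OP_COUNT_ZERO 0x1   /* count entries that are already zero */
#define OP_CLEAR_NEG  0x2   /* reset negative entries to zero */

/*
 * Flag-driven scan. When inlined with a constant 'op', the tests on
 * 'op' fold away and the untaken branch becomes dead code.
 * Returns the number of entries that were counted or cleared.
 */
static inline int
scan(int *v, size_t n, int op)
{
    int hits = 0;

    for (size_t i = 0; i < n; i++) {
        if ((op & OP_COUNT_ZERO) && v[i] == 0) {
            hits++;
        } else if ((op & OP_CLEAR_NEG) && v[i] < 0) {
            v[i] = 0;
            hits++;
        }
    }
    return hits;
}

/* Three call-sites, each with a different constant argument. */
int
count_zeros(int *v, size_t n)
{
    return scan(v, n, OP_COUNT_ZERO);
}

int
clear_negatives(int *v, size_t n)
{
    return scan(v, n, OP_CLEAR_NEG);
}

int
clear_and_count(int *v, size_t n)
{
    return scan(v, n, OP_COUNT_ZERO | OP_CLEAR_NEG);
}
```

Whether the specialised copies are faster than one shared out-of-line
copy depends on the compiler, the flags and the target, which is why
the text size alone does not settle the question.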
Although I do support the removal of the inline keyword to allow
-Werror again and also to provide a sensible (though pessimistic)
starting point for reintroducing some of them, I do not think
there's any ground to use performance gains or losses to defend
the removal of the inline keyword without also providing the
results of measurements performed on all platforms (ok, all tier
1 platforms).
--
Marcel Moolenaar USPA: A-39004 marcel at xcllnt.net