UPDATE Re: making use of userland dtrace on FreeBSD

Konstantin Belousov kostikbel at gmail.com
Thu Dec 27 19:09:15 UTC 2012


On Thu, Dec 27, 2012 at 11:39:44PM +1100, Bruce Evans wrote:
> After working around these bugs by putting the functions in separate files
> (and removing the now-unneeded volatiles):
> 
> main.c:
> % void foo(void);
> % 
> % int
> % main(void)
> % {
> % 	int i;
> % 
> % 	for (i = 0; i < 100000000; i++)
> % 		foo();
> % }
> 
> foo.c:
> % void bar(void);
> % 
> % void
> % foo(void)
> % {
> % 	bar();
> % }
> 
> bar.c:
> % void
> % bar(void)
> % {
> % }
> 
> we can see how much the frame pointer optimization is saving: this
> now takes 0.43 seconds with clang and 0.87 seconds with gcc.  It
> is weird that the gcc time increased from 0.65 seconds to 0.87
> despite doing less.  After adding back the volatiles, the times
> are 0.43 seconds with clang and 0.85 seconds with gcc -- doing
> more gave a small optimization, but didn't recover 0.65 seconds.
> There is apparently some magic alignment or misalignment which
> costs or saves about the same as omitting the frame pointer.
> Finally, with gcc -O -fomit-frame-pointer, the program takes 0.60
> seconds, and with gcc -O2 -fomit-frame-pointer, it takes 0.49
> seconds, and with gcc -O2, it takes 0.49 seconds (this really doesn't
> omit frame pointers, so omitting the frame pointer saves nothing).
> With cc -O -fno-omit-frame-pointer, it takes 0.43 seconds, but this
> case is just broken -- the -fno-omit-frame-pointer is silently ignored :-(.
I do not believe this measurement is indicative. i386 is a
register-starved architecture. Using the frame pointer leaves you
with only 6 registers instead of 7; for PIC code, it is 5 vs. 6.
It is real code, doing something more than incrementing the same
variable, that would take the performance hit from
-fno-omit-frame-pointer on i386. But on i386 use of the frame pointer
is ABI mandated.

On amd64 the pressure on the register file is not as high, but I do
not know of many debugging tools that expect the frame pointer on
amd64, or that can detect and use it if present. Only ddb for our
kernel and dtrace on Solaris and FreeBSD do; gdb definitely does not.
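
To illustrate what "expecting the frame pointer" means for a tool, here is
a minimal sketch of a frame pointer walk. It is not the ddb or dtrace code.
It assumes GCC or clang (for __builtin_frame_address()) and the usual x86
layout where the word at the frame pointer is the caller's saved frame
pointer and the next word is the return address. The names struct frame,
fp_walk(), fp_demo_inner() and fp_demo() are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed frame layout (x86, frame pointers kept): the word at the
 * frame pointer is the caller's saved frame pointer, the next word is
 * the return address.
 */
struct frame {
	struct frame *next;	/* caller's saved frame pointer */
	void *ret;		/* return address into the caller */
};

/*
 * Follow the frame pointer chain from the current frame up to 'stop',
 * the frame address of a known outer caller.  Returns the number of
 * links followed, or -1 if the chain does not look sane, which is what
 * a walker sees when a function in between omitted its frame pointer.
 */
static int
fp_walk(void *stop, void **pcs, int maxpcs)
{
	struct frame *fp = __builtin_frame_address(0);
	int n = 0;

	while ((void *)fp != stop) {
		struct frame *next = fp->next;

		/* Stack grows down: a caller's frame is at a higher address. */
		if ((uintptr_t)next <= (uintptr_t)fp ||
		    (uintptr_t)next > (uintptr_t)stop ||
		    (uintptr_t)next % sizeof(void *) != 0)
			return (-1);
		if (n < maxpcs)
			pcs[n] = fp->ret;
		n++;
		fp = next;
	}
	return (n);
}

/* Hypothetical intermediate caller, only here to add one frame. */
static int
fp_demo_inner(void *stop)
{
	void *pcs[8];

	return (fp_walk(stop, pcs, 8));
}

/* Walk from fp_walk() back to this function's own frame. */
int
fp_demo(void)
{
	return (fp_demo_inner(__builtin_frame_address(0)));
}
```

Built without optimization, fp_demo() should follow two links. With
optimization the result depends on inlining and on whether the intermediate
function kept its frame pointer, and the walk may report -1. That
fragility is exactly why such a walker needs every function to keep the
frame pointer.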

> >>>> need a dwarf2+ unwinder and somebody to instrument the call frame
> >>>> state through the remaining assembler code.
> 
> I wouldn't want it for ddb.  ddb doesn't have access to any debug info
> except the symbol table.
The unwind tables are not debugging information. If requested, they
are put into the loadable segments. DWARF unwinding is required by the
ABI on amd64, and is specified for all other architectures.
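
As a rough userland illustration of the difference, the sketch below
unwinds through the runtime unwinder instead of following frame pointers.
It assumes the toolchain's <unwind.h> and its _Unwind_Backtrace(), as
provided by libgcc or LLVM libunwind, and a platform where the compiler
emits unwind tables by default, such as amd64. The names struct bt_state,
bt_cb() and unwind_depth() are made up for this example.

```c
#include <stdint.h>
#include <unwind.h>

struct bt_state {
	uintptr_t pcs[16];	/* program counters, innermost frame first */
	int n;			/* frames visited so far */
};

/* Called by the unwinder once per frame it can recover. */
static _Unwind_Reason_Code
bt_cb(struct _Unwind_Context *ctx, void *arg)
{
	struct bt_state *st = arg;

	if (st->n < 16)
		st->pcs[st->n] = (uintptr_t)_Unwind_GetIP(ctx);
	st->n++;
	return (_URC_NO_REASON);	/* keep unwinding */
}

/* Count the frames the unwinder can see from here outwards. */
int
unwind_depth(void)
{
	struct bt_state st = { { 0 }, 0 };

	_Unwind_Backtrace(bt_cb, &st);
	return (st.n);
}
```

This does not depend on -g or on frame pointers being kept. It relies only
on the unwind tables being present in the loaded image.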


More information about the freebsd-arch mailing list