cvs commit: src/usr.bin/gprof printgprof.c
Bruce Evans
bde at zeta.org.au
Fri Oct 7 08:49:45 PDT 2005
On Fri, 7 Oct 2005, Bruce Evans wrote:
> (2) If a function has a zero call count but has a nonzero self or child
> time, then print its total self time in the self time per call
> column as a percentage of the total (self + child) time. It is
> not possible to print the times per call in this case because the
> call count is zero. Previously, this was handled by leaving both
> per-call columns blank. The self time is printed in another column
> but there was no way to recover the total time.
> ...
> (2) improves mainly cases like kernel threads. Most kernel threads
> appear to be never called because they are always started before
> userland can run to turn on profiling. As for main(), the fact that
> they are called is not very interesting and their callers are
> uninteresting, but their relative self time is interesting since they
> are long-running.
gprof output for hi-res (should be normal) kernel profiling now looks
like this:
% granularity: each sample hit covers 16 byte(s) for 0.00% of 65.15 seconds
%
% % cumulative self self total
% time seconds seconds calls ns/call ns/call name
% 86.8 56.584 56.584 103196 548312 548412 sleepq_add [4]
% 3.1 58.573 1.990 0 100.00% mcount [10]
% 2.3 60.085 1.512 381 3967867 3967867 cpu_idle_default [15]
% 1.8 61.244 1.159 0 100.00% mexitcount [16]
% 0.8 61.741 0.497 0 100.00% cputime [23]
% 0.5 62.063 0.321 0 100.00% user [29]
% 0.3 62.247 0.184 9 20480664 31839385 vmtotal [34]
% 0.3 62.421 0.174 60381 2876 2876 sse_pagecopy [42]
% 0.2 62.526 0.105 108567 970 970 cpu_switch [48]
% [Only lines with self% shown from here on -- mostly for kernel threads]
% 0.0 65.037 0.002 0 12.21% bintr [120]
% 0.0 65.098 0.001 0 100.00% __mcount [316]
% 0.0 65.107 0.001 0 0.48% ithread_loop [39]
% 0.0 65.111 0.001 0 60.26% scheduler [293]
% 0.0 65.126 0.001 0 100.00% __mexitcount [361]
% 0.0 65.147 0.000 0 0.11% random_kthread [77]
% 0.0 65.150 0.000 0 0.24% sched_sync [156]
% 0.0 65.151 0.000 0 0.00% idle_proc [13]
% 0.0 65.152 0.000 0 0.15% buf_daemon [197]
% 0.0 65.152 0.000 0 0.13% vnlru_proc [196]
% 0.0 65.152 0.000 0 0.01% schedcpu_thread [60]
% 0.0 65.153 0.000 0 0.12% vm_pageout [283]
% 0.0 65.153 0.000 0 0.10% poll_idle [284]
% 0.0 65.153 0.000 0 0.00% btrap [1]
I forget what this was profiling. It was for a kernel long ago and
far away from -current.
All threads that make a function call show up as having a self% (even
if the percentage is 0, since the profiling clock is an i386 TSC so
the unscaled counts are nonzero since a function call takes more than
0 cycles). Leaf functions for profiling support always have a self
usage of 100% because they are not profiled in the normal way so they
appear to have no callers (and being leaves, have no children). "user"
(userland) has a self usage of 100% because its callers are not seen
and its children are handled specially under Xsyscall, btrap and bintr
for historical and efficiency reasons.
Bruce
More information about the cvs-src
mailing list