cvs commit: src/usr.bin/gprof printgprof.c

Bruce Evans bde at zeta.org.au
Fri Oct 7 08:49:45 PDT 2005


On Fri, 7 Oct 2005, Bruce Evans wrote:

>  (2) If a function has a zero call count but has a nonzero self or child
>      time, then print its total self time in the self time per call
>      column as a percentage of the total (self + child) time.  It is
>      not possible to print the times per call in this case because the
>      call count is zero.  Previously, this was handled by leaving both
>      per-call columns blank.  The self time is printed in another column
>      but there was no way to recover the total time.
> ...
>  (2) improves mainly cases like kernel threads.  Most kernel threads
>  appear to be never called because they are always started before
>  userland can run to turn on profiling.  As for main(), the fact that
>  they are called is not very interesting and their callers are
>  uninteresting, but their relative self time is interesting since they
>  are long-running.

gprof output for hi-res (should be normal) kernel profiling now looks
like this:

% granularity: each sample hit covers 16 byte(s) for 0.00% of 65.15 seconds
% 
%   %   cumulative   self              self     total 
%  time   seconds   seconds    calls  ns/call  ns/call  name 
%  86.8     56.584   56.584   103196   548312   548412  sleepq_add [4]
%   3.1     58.573    1.990        0  100.00%           mcount [10]
%   2.3     60.085    1.512      381  3967867  3967867  cpu_idle_default [15]
%   1.8     61.244    1.159        0  100.00%           mexitcount [16]
%   0.8     61.741    0.497        0  100.00%           cputime [23]
%   0.5     62.063    0.321        0  100.00%           user [29]
%   0.3     62.247    0.184        9 20480664 31839385  vmtotal [34]
%   0.3     62.421    0.174    60381     2876     2876  sse_pagecopy [42]
%   0.2     62.526    0.105   108567      970      970  cpu_switch [48]
% [Only lines with self% shown from here on -- mostly for kernel threads]
%   0.0     65.037    0.002        0   12.21%           bintr [120]
%   0.0     65.098    0.001        0  100.00%           __mcount [316]
%   0.0     65.107    0.001        0    0.48%           ithread_loop [39]
%   0.0     65.111    0.001        0   60.26%           scheduler [293]
%   0.0     65.126    0.001        0  100.00%           __mexitcount [361]
%   0.0     65.147    0.000        0    0.11%           random_kthread [77]
%   0.0     65.150    0.000        0    0.24%           sched_sync [156]
%   0.0     65.151    0.000        0    0.00%           idle_proc [13]
%   0.0     65.152    0.000        0    0.15%           buf_daemon [197]
%   0.0     65.152    0.000        0    0.13%           vnlru_proc [196]
%   0.0     65.152    0.000        0    0.01%           schedcpu_thread [60]
%   0.0     65.153    0.000        0    0.12%           vm_pageout [283]
%   0.0     65.153    0.000        0    0.10%           poll_idle [284]
%   0.0     65.153    0.000        0    0.00%           btrap [1]

I forget what this was profiling.  It was for a kernel long ago and
far away from -current.

All threads that make a function call show up as having a self% (even
if the percentage is 0, since the profiling clock is an i386 TSC so
the unscaled counts are nonzero since a function call takes more than
0 cycles).  Leaf functions for profiling support always have a self
usage of 100% because they are not profiled in the normal way so they
appear to have no callers (and being leaves, have no children).  "user"
(userland) has a self usage of 100% because its callers are not seen
and its children are handled specially under Xsyscall, btrap and bintr
for historical and efficiency reasons.

Bruce


More information about the cvs-src mailing list