cvs commit: src/sys/conf files.amd64
Bruce Evans
bde at zeta.org.au
Sun May 23 23:27:20 PDT 2004
On Sun, 23 May 2004, Nate Lawson wrote:
> On Mon, 24 May 2004, Bruce Evans wrote:
> > %%%
> > ...
> > granularity: each sample hit covers 16 byte(s) for 0.00% of 5.40 seconds
> >
> >   %   cumulative     self               self    total
> >  time    seconds  seconds     calls  ns/call  ns/call  name
> >  40.4      2.181    2.181                              mcount [1]
> >  15.3      3.009    0.828    814149     1017     1017  acpi_timer_read [7]
> >  14.7      3.802    0.793                              mexitcount [9]
> >   5.1      4.076    0.275                              cputime [22]
> >   2.3      4.202    0.125                              user [42]
> >   1.9      4.307    0.105    408446      258      258  cpu_switch [44]
> >   0.9      4.358    0.051        76   671400   671400  acpi_cpu_c1 [51]
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This is suspicious since it doesn't show up in your TSC profiling below.
I truncated the output since (for another run) it's 9102 lines altogether,
including 975 lines for the flat profile.
> Something is going on if the cpu is hitting the idle thread in one run but
> not the other.
It's probably just from waiting for the disk to load time(1) or ping(8)
after starting profiling.  I didn't try hard to avoid such extraneous
stuff.
> Can you run both with machdep.cpu_idle_hlt=0 and let me
> know if the results change? In any case, 1us per call seems accurate.
For profiling "sleep 1" instead of ping:
machdep.cpu_idle_hlt=1:
%%%
...
granularity: each sample hit covers 16 byte(s) for 0.00% of 1.03 seconds
  %   cumulative     self               self    total
 time    seconds  seconds     calls  ns/call  ns/call  name
 97.9      1.008    1.008      1192   845410   845410  acpi_cpu_c1 [4]
  0.8      1.015    0.008                              mcount [5]
  0.3      1.019    0.003      3301     1019     1019  acpi_timer_read [8]
  0.3      1.022    0.003                              mexitcount [13]
  0.1      1.023    0.001       262     5331     5331  rtcin [19]
  0.1      1.024    0.001      1201     1017     1017  AcpiOsReadPort [25]
  0.1      1.025    0.001                              cputime [27]
  0.0      1.026    0.000      2194      166      166  cpu_switch [36]
...
%%%
machdep.cpu_idle_hlt=0:
%%%
...
granularity: each sample hit covers 16 byte(s) for 0.00% of 1.00 seconds
  %   cumulative     self               self    total
 time    seconds  seconds     calls  ns/call  ns/call  name
 60.3      0.605    0.605                              mcount [1]
 21.9      0.825    0.220                              mexitcount [2]
  7.6      0.901    0.076                              cputime [4]
  3.3      0.934    0.033   2752980       12       15  sched_runnable [5]
  2.8      0.962    0.028   2751960       10       10  cpu_idle [6]
  2.3      0.985    0.024                              idle_proc [3]
  0.8      0.994    0.008   2752980        3        3  runq_check [7]
  0.3      0.997    0.003      3145     1018     1018  acpi_timer_read [10]
  0.1      0.998    0.001       256     5330     5330  rtcin [32]
  0.1      0.999    0.001      1762      287      437  AcpiUtReleaseToCache [36]
  0.0      0.999    0.000      2106      165      165  cpu_switch [48]
...
%%%
The results aren't very interesting, of course.  They just show that
idling is working right (the call graph would show this even better),
that the functions in the non-halting idle spinloop don't take very
long, and that the functions that do i/o do take very long...
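
A rough sketch of the shape of that non-halting idle path, inferred from
the function names in the profile above (the bodies and the cpu_idle_hlt
handling are simplified assumptions, not the actual kernel code):
%%%
/*
 * Sketch of the idle path exercised with machdep.cpu_idle_hlt=0.
 * It only illustrates why sched_runnable() and runq_check() rack up
 * ~2.75M calls at ~10-15 ns each, while the halting case spends its
 * time in acpi_cpu_c1 instead.
 */
static int cpu_idle_hlt;		/* machdep.cpu_idle_hlt sysctl */

static int
runq_check(void)
{
	/* Check the run-queue status bits; cheap, a few loads. */
	return (0);			/* placeholder: nothing runnable */
}

static int
sched_runnable(void)
{
	return (runq_check());
}

void
cpu_idle(void)
{
	if (sched_runnable())
		return;			/* a thread became runnable */
	if (cpu_idle_hlt)
		__asm __volatile("sti; hlt");	/* halt until an interrupt */
	/* otherwise return and let idle_proc() spin straight back in */
}
%%%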
> > The system is configured with acpi, so it chooses an ACPI timecounter
> > (ACPI-fast) as the highest quality, but in fact the ACPI timecounter
> > has much lower quality than the TSC (it is about 100 times slower and
> > 100 times less precise, and no more stable since this is a desktop
> > system which never suspends or throttles the CPU if I can help it).
>
> I agree and think it may be good to do something about this. Care to
> suggest a way to adjust or at least detect TSC rate changes? On my
> laptop, the switch from 1 GHz to 733 MHz is done via SMM and thus is not
> user-detectable except through profiling. However, on desktop systems,
> the switch is almost certainly never automatic. We can have the control
> system notify the timecounters when the rate changes.
The TSC could be sampled on every clock interrupt or on every timecounter
call.  This should at least tell you whether the average TSC rate for the
sampling period was significantly different, and whether there is any
long-term drift.
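
A rough sketch of what such per-clock-interrupt sampling could look like
(the hook name tsc_sample(), the hz value and the 1% tolerance are made
up for illustration; this is not existing code):
%%%
#include <stdint.h>
#include <stdio.h>

/*
 * tsc_freq would be the frequency calibrated at boot; HZ the clock
 * interrupt rate.  A persistent mismatch between the measured delta
 * and the expected ticks-per-interval indicates that the TSC rate
 * changed (e.g. via SMM throttling) or is drifting.
 */
#define	HZ	100			/* clock interrupt rate (assumed) */

static uint64_t	tsc_freq = 733000000;	/* calibrated at boot (assumed) */
static uint64_t	tsc_prev;

static inline uint64_t
rdtsc64(void)
{
	uint32_t lo, hi;

	__asm __volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32 | lo);
}

/* Would be called from the clock interrupt handler. */
void
tsc_sample(void)
{
	uint64_t now, delta, expected, slop;

	now = rdtsc64();
	if (tsc_prev != 0) {
		delta = now - tsc_prev;
		expected = tsc_freq / HZ;
		slop = expected / 100;	/* tolerate 1% jitter */
		if (delta > expected + slop || delta < expected - slop)
			printf("TSC rate off: %ju ticks, expected %ju\n",
			    (uintmax_t)delta, (uintmax_t)expected);
	}
	tsc_prev = now;
}
%%%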
Bruce