cvs commit: src/sys/conf files.amd64

Nate Lawson nate at root.org
Sun May 23 22:30:42 PDT 2004


On Mon, 24 May 2004, Bruce Evans wrote:
> On Sun, 23 May 2004, Nate Lawson wrote:
>
> > On Sun, 23 May 2004, Bruce Evans wrote:
> > >   The SMP case hasn't been tested.  The high resolution subcase of this uses
> > >   the i8254, and as on i386's, the locking for this is deficient and the
> > >   i8254 is too inefficient.  The acpi timer is also too inefficient.
> >
> > The ACPI timer is significantly better than it used to be.  It is
> > currently just a bus_space_read, which maps directly to inl.  The fact
> > that IO ports are slow is inescapable.
>
> Actually, its speed hasn't changed, since it is the hardware speed that I
> mean.  It is about 200 times as slow as rdtsc() on my amd64 test system.
>
> %%%
> ...
> granularity: each sample hit covers 16 byte(s) for 0.00% of 5.40 seconds
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ns/call  ns/call  name
>  40.4      2.181    2.181                             mcount [1]
>  15.3      3.009    0.828   814149     1017     1017  acpi_timer_read [7]
>  14.7      3.802    0.793                             mexitcount [9]
>   5.1      4.076    0.275                             cputime [22]
>   2.3      4.202    0.125                             user [42]
>   1.9      4.307    0.105   408446      258      258  cpu_switch [44]
>   0.9      4.358    0.051       76   671400   671400  acpi_cpu_c1 [51]
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This acpi_cpu_c1 entry is suspicious since it doesn't show up in your TSC
profiling below.
Something is going on if the cpu is hitting the idle thread in one run but
not the other.  Can you run both with machdep.cpu_idle_hlt=0 and let me
know if the results change?  In any case, 1us per call seems accurate.

> The system is configured with acpi, so it chooses an ACPI timecounter
> (ACPI-fast) as the highest quality, but in fact the ACPI timecounter
> has much lower quality than the TSC (it is about 100 times slower and
> 100 times less precise, and no more stable since this is a desktop
> system which never suspends or throttles the CPU if I can help it).

I agree and think it may be good to do something about this.  Care to
suggest a way to adjust or at least detect TSC rate changes?  On my
laptop, the switch from 1 GHz to 733 MHz is done via SMM and thus is not
user-detectable except through profiling.  However, on desktop systems,
the switch is almost certainly never automatic.  We can have the control
system notify the timecounters when the rate changes.
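
To make the notification idea concrete, here is a rough userland sketch.
Every name in it (rate_listener_register, rate_change_notify, toy_tsc) is
made up for illustration and is not an existing kernel interface.  It
assumes the control code knows the new frequency when it makes the switch,
so it does not help with the SMM case above, where the OS never sees the
change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: invoked with the new CPU clock rate in Hz. */
typedef void (*rate_change_fn)(void *arg, uint64_t new_hz);

struct rate_listener {
	rate_change_fn	 fn;
	void		*arg;
};

#define	MAX_LISTENERS	8
static struct rate_listener listeners[MAX_LISTENERS];
static size_t nlisteners;

/* Register a callback; returns 0 on success, -1 on bad input or full table. */
int
rate_listener_register(rate_change_fn fn, void *arg)
{
	if (fn == NULL || nlisteners == MAX_LISTENERS)
		return (-1);
	listeners[nlisteners].fn = fn;
	listeners[nlisteners].arg = arg;
	nlisteners++;
	return (0);
}

/* Called by the (hypothetical) control code after it changes the clock. */
void
rate_change_notify(uint64_t new_hz)
{
	size_t i;

	for (i = 0; i < nlisteners; i++)
		listeners[i].fn(listeners[i].arg, new_hz);
}

/* Toy stand-in for a TSC timecounter: it only remembers its frequency. */
struct toy_tsc {
	uint64_t freq_hz;
};

/* Listener that updates the toy timecounter when the rate changes. */
void
toy_tsc_rate_changed(void *arg, uint64_t new_hz)
{
	struct toy_tsc *tc = arg;

	tc->freq_hz = new_hz;
}

/*
 * Convert a tick delta to nanoseconds at the recorded frequency.
 * The multiply overflows for deltas above roughly 1.8e10 ticks, which is
 * acceptable for a sketch but not for real code.
 */
uint64_t
toy_tsc_delta_to_ns(const struct toy_tsc *tc, uint64_t delta)
{
	return (delta * UINT64_C(1000000000) / tc->freq_hz);
}
```

A caller would register toy_tsc_rate_changed once, then have the control
code call rate_change_notify(733000000) after stepping down from 1 GHz, so
that later tick-to-time conversions use the new rate.  A real version would
also need locking and a way to handle ticks that straddle the switch.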

> %%%
> ...
> granularity: each sample hit covers 16 byte(s) for 0.00% of 4.49 seconds
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ns/call  ns/call  name
>  47.3      2.123    2.123                             mcount [1]
>  17.2      2.895    0.772                             mexitcount [2]
>   6.0      3.162    0.267                             cputime [13]
>   2.8      3.289    0.126                             user [23]
>   2.4      3.394    0.106   406425      260      260  cpu_switch [27]
>   1.1      3.445    0.051   200006      255     1129  ip_output [15]
>   0.9      3.487    0.041   600153       69     1155  syscall [5]
>   0.9      3.525    0.039  1301899       30       30  bzero [41]
>   0.8      3.561    0.036  1950822       18       18  critical_enter [44]
>   0.7      3.594    0.033  1950822       17       17  critical_exit [46]
>   0.7      3.626    0.032   200003      160      160  fgetsock [47]
>   0.7      3.658    0.032   800100       39       39  copyout [48]
>   0.6      3.687    0.029   600099       48       48  copyin [51]
>   0.6      3.714    0.027   406429       67      474  mi_switch [16]
>   0.5      3.738    0.024   300048       80       80  mb_free [58]
>   0.5      3.761    0.023   200006      114     1353  ip_input [12]
>   0.5      3.783    0.022   100004      221      743  soreceive [36]
>   0.5      3.804    0.021   100018      214      678  kern_select [38]
>   0.4      3.823    0.018   700032       26       26  in_cksumdata [65]
>   0.4      3.841    0.018   200000       91     1196  icmp_input [14]
>   ...
>   0.2      4.184    0.009   811079       11       11  tsc_get_timecount [93]
> ...
> %%%

No c1 idling (HLT) being used above.
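
For anyone rechecking the per-call numbers, the ns/call column is simply
self seconds divided by calls.  A small helper, with a name (ns_per_call)
invented here, shows the arithmetic:

```c
/*
 * Average cost per call in nanoseconds, computed from the gprof
 * "self seconds" and "calls" columns.
 */
double
ns_per_call(double self_seconds, unsigned long calls)
{
	return (self_seconds * 1e9 / (double)calls);
}
```

Feeding it the acpi_timer_read row (0.828 s, 814149 calls) gives about
1017 ns, and the tsc_get_timecount row (0.009 s, 811079 calls) gives about
11 ns, so the two differ by a factor on the order of 100 in these runs.
The TSC figure comes from a self time rounded to 1 ms, so the ratio is
only approximate.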

-Nate