svn commit: r297039 - head/sys/x86/x86
Konstantin Belousov
kostikbel at gmail.com
Fri Mar 25 08:49:10 UTC 2016
On Fri, Mar 25, 2016 at 07:13:54AM +1100, Bruce Evans wrote:
> On Thu, 24 Mar 2016, Konstantin Belousov wrote:
[Skipped lock adaptive spinning text for now].
>
> >> My systems allow speed variations of about 4000:800 = 5:1 for one CPU and
> >> about 50:1 for different CPUs. So the old method gave a variation of up
> >> to 50:1. This can be reduced to only 5:1 using the boot-time calibration.
> > What do you mean by 'for different CPUs' ? I understand that modern ESS
> > can give us CPU frequency between 800-4200MHz, which is what you mean
> > by 'for one CPU'. We definitely do not care if 5usec timeout becomes
> > 25usecs, since we practically never time-out there at all.
>
> Yes, I actually get 4400:800 on i4790K.
>
> The ratio is even larger than that with a hard-coded limit because old
> CPUs are much slower than i4790K. I sometimes run a 367 MHz (P2 class)
> CPU. It is several times slower than a new CPU at the same clock
> frequency, and any throttling would make it even slower.
>
> 50 times slower means that a reasonable emergency timeout of 60 seconds
> becomes 3000 seconds. Local users would get tired of waiting and reset,
> and remote users might have to wait.
But you do not downclock a machine booted at the 4.0Ghz datasheet clock,
down to 367Mhz. For 400Mhz P2 machine, LAPIC would be calibrated at that
400Mhz rate.
> There is another thread about early DELAY() using the i8254 not working
> to calibrate the TSC. That might be just because DELAY() is interrupted.
> DELAY() never bothered to disable interrupts. Its early use for calibrating
> the TSC depends on interrupts mostly not happening then. (My version is
> a bit more careful, but it still doesn't disable interrupts. It
> establishes error bounds provided interrupts are shorter than the i8254
> wrap period.) If the i8254 is virtual, then even disabling interrupts
> on the target wouldn't help, since the disabling would only be virtual.
Yes, the DELAY() calibration is something I wanted to ask about.
Could you, please, take a look at
https://reviews.freebsd.org/D5738
there is a code which would benefit from better (re-)calibration.
Below is the patch to implement calibration of the ipi_wait() busy loop.
On my sandybridge 3.4Ghz, I get the message
LAPIC: ipi_wait() us multiplier 37 (r 128652089678 tsc 3392383992)
diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c
index 7e5087b..0842de5 100644
--- a/sys/x86/x86/local_apic.c
+++ b/sys/x86/x86/local_apic.c
@@ -56,6 +56,7 @@ __FBSDID("$FreeBSD$");
#include <vm/pmap.h>
#include <x86/apicreg.h>
+#include <machine/clock.h>
#include <machine/cpufunc.h>
#include <machine/cputypes.h>
#include <machine/frame.h>
@@ -162,6 +163,7 @@ int x2apic_mode;
int lapic_eoi_suppression;
static u_long lapic_timer_divisor;
static struct eventtimer lapic_et;
+static uint64_t lapic_ipi_wait_mult;
SYSCTL_NODE(_hw, OID_AUTO, apic, CTLFLAG_RD, 0, "APIC options");
SYSCTL_INT(_hw_apic, OID_AUTO, x2apic_mode, CTLFLAG_RD, &x2apic_mode, 0, "");
@@ -391,6 +393,7 @@ lvt_mode(struct lapic *la, u_int pin, uint32_t value)
static void
native_lapic_init(vm_paddr_t addr)
{
+ uint64_t r;
uint32_t ver;
u_int regs[4];
int i, arat;
@@ -484,6 +487,34 @@ native_lapic_init(vm_paddr_t addr)
TUNABLE_INT_FETCH("hw.lapic_eoi_suppression",
&lapic_eoi_suppression);
}
+
+#define LOOPS 1000000
+ /*
+ * Calibrate the busy loop waiting for IPI ack in xAPIC mode.
+ * lapic_ipi_wait_mult contains the number of iterations which
+ * approximately delay execution for 1 microsecond (the
+ * argument to native_lapic_ipi_wait() is in microseconds).
+ *
+ * We assume that TSC is present and already measured.
+ * Possible TSC frequency jumps are irrelevant to the
+ * calibration loop below, the CPU clock management code is
+ * not yet started, and we do not enter sleep states.
+ */
+ KASSERT((cpu_feature & CPUID_TSC) != 0 && tsc_freq != 0,
+ ("TSC not initialized"));
+ r = rdtsc();
+ for (r = 0; r < LOOPS; r++) {
+ (void)lapic_read_icr_lo();
+ ia32_pause();
+ }
+ r = rdtsc() - r;
+ lapic_ipi_wait_mult = (r * 1000000) / tsc_freq / LOOPS;
+ if (bootverbose) {
+ printf("LAPIC: ipi_wait() us multiplier %jd (r %jd tsc %jd)\n",
+ (uintmax_t)lapic_ipi_wait_mult, (uintmax_t)r,
+ (uintmax_t)tsc_freq);
+ }
+#undef LOOPS
}
/*
@@ -1621,31 +1652,26 @@ SYSINIT(apic_setup_io, SI_SUB_INTR, SI_ORDER_THIRD, apic_setup_io, NULL);
* private to the MD code. The public interface for the rest of the
* kernel is defined in mp_machdep.c.
*/
+
+/*
+ * Wait delay microseconds for IPI to be sent. If delay is -1, we
+ * wait forever.
+ */
static int
native_lapic_ipi_wait(int delay)
{
- int x;
+ uint64_t i, counter;
/* LAPIC_ICR.APIC_DELSTAT_MASK is undefined in x2APIC mode */
- if (x2apic_mode)
+ if (x2apic_mode || delay == -1)
return (1);
- /*
- * Wait delay microseconds for IPI to be sent. If delay is
- * -1, we wait forever.
- */
- if (delay == -1) {
- while ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) !=
- APIC_DELSTAT_IDLE)
- ia32_pause();
- return (1);
- }
-
- for (x = 0; x < delay; x++) {
+ counter = lapic_ipi_wait_mult * delay;
+ for (i = 0; i < counter; i++) {
if ((lapic_read_icr_lo() & APIC_DELSTAT_MASK) ==
APIC_DELSTAT_IDLE)
return (1);
- DELAY(1);
+ ia32_pause();
}
return (0);
}
More information about the svn-src-all
mailing list