An experiment in PowerMac G5 multi-socket/multi-core having better matching mftb() values
Mark Millard
marklmi at yahoo.com
Mon May 13 10:23:48 UTC 2019
I've been experimenting with an alternate
technique for boot-time tbr (timebase register)
value synchronization across sockets/cores on
970 family PowerMac G5s. So far it has narrowed
the range of mismatches significantly. I've
reverted my hack for tolerating the mismatches
in order to see how it goes.
I'm not aware of other contexts that have the
threads-get-stuck-sleeping problem from the
scale of tbr mismatch that can happen with the
official code. And, if there are any, I've no
environment to test them.
The technique definitely requires that the
rate at which mftb() values change, relative
to the time it takes to store-release/load-acq
each way between an ap and the bsp, makes the
round trip time reasonably measurable, with a
useful combination of accuracy and precision.
Thus there are limits to its generality if some
other context attempted something analogous.
Each ap does its own instance of the
process. No single delta would work.
It is also based on the expectation that
the store-release/load-acq each way takes
a non-trivial amount of the round trip
time, putting the bsp's activity in the
middle part of the round trip range.
(Interrupts are disabled around the relevant
code.) What I've seen suggests that this
is true.
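As an illustration of why the middle-of-the-
round-trip expectation matters (this is not
code from the patch; the function name and
sample values are hypothetical): if the bsp's
sample could land anywhere between the ap's
start and end mftb() values, the midpoint
estimate can be off by at most about half the
round trip. A minimal sketch, assuming plain
64-bit tick counts with no wraparound:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * ap_start and ap_end are the ap's timebase values at the start and
 * end of one round trip. Returns the largest possible distance, in
 * timebase ticks, between the round-trip midpoint and the (unknown)
 * moment inside the round trip at which the bsp took its sample.
 */
static uint64_t
midpoint_worst_case_error(uint64_t ap_start, uint64_t ap_end)
{
	assert(ap_end >= ap_start);	/* assumes no timebase wraparound */

	uint64_t round_trip = ap_end - ap_start;
	uint64_t midpoint = ap_start + round_trip / 2;
	uint64_t before = midpoint - ap_start;	/* bsp sampled at the very start */
	uint64_t after = ap_end - midpoint;	/* bsp sampled at the very end */

	return (before > after ? before : after);
}
```

For example, a round trip from 1000 to 1200 ticks
bounds the error at 100 ticks. The closer the
bsp's activity really is to the middle, the
smaller the actual error.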
I've included my exploratory code below.
It is based on my head -r345758 context.
The ap sends the bsp a mftb() value (that
the bsp only uses as a flag that it is
time to send back its own mftb() value).
The ap also calculates the approximate
round trip ap-time for this exchange.
From these the ap takes the midpoint of
the round trip, in ap-time, as the moment
the bsp sampled its mftb() value, and the
difference between the two as the needed
adjustment. The ap then applies that
adjustment to its own tbr value (via
mttb()). So far the results seem to be
a sizable improvement.
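The arithmetic of the exchange described above
can be sketched in isolation. This is a
stand-alone illustration, not the patch itself:
the function name and the sample numbers are
hypothetical, and plain integers stand in for
the mftb() readings.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * ap_start:   ap timebase value sent to the bsp (start of round trip).
 * ap_end:     ap timebase value read after the bsp's reply arrived.
 * bsp_sample: timebase value the bsp sent back.
 * Returns the signed delta to add to the ap's timebase so that the
 * ap's round-trip midpoint lines up with the bsp's sample.
 */
static int64_t
tb_delta_to_match_bsp(uint64_t ap_start, uint64_t ap_end, uint64_t bsp_sample)
{
	assert(ap_end >= ap_start);	/* assumes no timebase wraparound */

	uint64_t round_trip = ap_end - ap_start;
	uint64_t ap_midpoint = ap_start + round_trip / 2;

	/* Negative when the ap's timebase is ahead of the bsp's. */
	return ((int64_t)(bsp_sample - ap_midpoint));
}
```

With ap_start 1000, ap_end 1200 and a bsp sample
of 5100, the midpoint is 1100 and the delta is
+4000; the ap would then write its current
timebase value plus 4000.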
[In experiments I've been labeling some variables
volatile, just to indicate that I generally do not
expect loads/stores to be skipped for them. This
does not mean that I'd observed any cases of just
holding a value in a register. This may produce
minor text mismatches with other files not shown
here.]
# svnlite diff /usr/src/sys/powerpc/powermac/platform_powermac.c /usr/src/sys/powerpc/powerpc/mp_machdep.c | more
Index: /usr/src/sys/powerpc/powermac/platform_powermac.c
===================================================================
--- /usr/src/sys/powerpc/powermac/platform_powermac.c (revision 345758)
+++ /usr/src/sys/powerpc/powermac/platform_powermac.c (working copy)
@@ -55,7 +55,7 @@
#include "platform_if.h"
-extern void *ap_pcpu;
+extern void * volatile ap_pcpu;
static int powermac_probe(platform_t);
static int powermac_attach(platform_t);
@@ -333,6 +333,10 @@
return (powermac_smp_fill_cpuref(cpuref, bsp));
}
+#ifdef __powerpc64__
+extern volatile int alternate_timebase_sync_style;
+#endif
+
static int
powermac_smp_start_cpu(platform_t plat, struct pcpu *pc)
{
@@ -366,6 +370,19 @@
}
ap_pcpu = pc;
+#ifdef __powerpc64__
+ switch (mfpvr()>>16)
+ {
+ case IBM970:
+ case IBM970FX:
+ case IBM970MP:
+ alternate_timebase_sync_style= 1;
+ break;
+ default:
+ break;
+ }
+#endif
+ powerpc_sync();
if (rstvec_virtbase == NULL)
rstvec_virtbase = pmap_mapdev(0x80000000, PAGE_SIZE);
Index: /usr/src/sys/powerpc/powerpc/mp_machdep.c
===================================================================
--- /usr/src/sys/powerpc/powerpc/mp_machdep.c (revision 345758)
+++ /usr/src/sys/powerpc/powerpc/mp_machdep.c (working copy)
@@ -70,6 +70,13 @@
static struct mtx ap_boot_mtx;
struct pcb stoppcbs[MAXCPU];
+#if defined(__powerpc64__) && defined(AIM)
+// Part of: Attempt a better-than-historical approximately equal timebase value for ap vs. bsp
+volatile int alternate_timebase_sync_style= 0;
+volatile uint64_t timebase_samples[2]; // 0: from ap; 1: from bsp.
+ // Consider separate cache lines?
+#endif
+
void
machdep_ap_bootstrap(void)
{
@@ -77,19 +84,65 @@
PCPU_SET(awake, 1);
__asm __volatile("msync; isync");
+#if defined(__powerpc64__) && defined(AIM)
+ // Attempt a better-than-historical approximately equal timebase value for ap vs. bsp
+ powerpc_sync();
+ isync();
+ if (alternate_timebase_sync_style) // Requires: timeframe with only one ap at a time
+ {
+ register_t oldmsr= intr_disable();
+
+ while (1u!=timebase_samples[1])
+ ; // spin waiting for bsp to flag that ready to start.
+
+ // Measure a round trip:: to the bsp and back.
+
+ isync(); // Be sure below mftb() result is not from earlier speculative execution.
+ atomic_store_rel_64(&timebase_samples[0], mftb()); // bsp waits for this before its mftb().
+
+ while (1u==timebase_samples[1]) // expect bsp to have: 1u<mftb()
+ ; // spin waiting for bsp's tbr value
+ // Mid-point of ap round trip and the bsp timebase value should be approximately equal
+ // when the tbr's are well matched, absent interruptions on both sides.
+
+ isync(); // Be sure below mftb() result is not from earlier speculative execution.
+ register_t const end_round_trip_time_on_ap= mftb(); // Allows estimate round-trip time.
+
+ int64_t const approx_round_trip_tbr_detla_on_ap= end_round_trip_time_on_ap-timebase_samples[0];
+ int64_t const ap_midpoint_tbr_value= timebase_samples[0] + approx_round_trip_tbr_detla_on_ap/2;
+
+ // Establish delta_to_match_bsp_example such that:
+ // ap_midpoint_tbr_value+delta_to_match_bsp_example==timebase_samples[1] (from bsp)
+ int64_t const delta_to_match_bsp_tbr_example= timebase_samples[1]-ap_midpoint_tbr_value;
+
+ isync(); // Be sure below mftb() result is not from earlier speculative execution.
+ mttb((int64_t)mftb()+delta_to_match_bsp_tbr_example); // Make the ap tbr adjustment.
+
+ atomic_store_rel_64(&timebase_samples[0], 0u); // Get ready for next ap in bsp loop
+ atomic_store_rel_64(&timebase_samples[1], 0u); // also flagging bsp that this ap is done
+
+ mtmsr(oldmsr);
+ }
+#endif
+
while (ap_letgo == 0)
__asm __volatile("or 31,31,31");
__asm __volatile("or 6,6,6");
- /*
- * Set timebase as soon as possible to meet an implicit rendezvous
- * from cpu_mp_unleash(), which sets ap_letgo and then immediately
- * sets timebase.
- *
- * Note that this is instrinsically racy and is only relevant on
- * platforms that do not support better mechanisms.
- */
- platform_smp_timebase_sync(ap_timebase, 1);
+#if defined(__powerpc64__) && defined(AIM)
+ if (!alternate_timebase_sync_style)
+#endif
+ {
+ /*
+ * Set timebase as soon as possible to meet an implicit rendezvous
+ * from cpu_mp_unleash(), which sets ap_letgo and then immediately
+ * sets timebase.
+ *
+ * Note that this is instrinsically racy and is only relevant on
+ * platforms that do not support better mechanisms.
+ */
+ platform_smp_timebase_sync(ap_timebase, 1);
+ }
/* Give platform code a chance to do anything else necessary */
platform_smp_ap_init();
@@ -251,6 +304,34 @@
pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
pc->pc_awake);
smp_cpus++;
+
+#if defined(__powerpc64__) && defined(AIM)
+ // Part of: Attempt a better-than-historical approximately
+ // equal timebase value for ap vs. bsp
+ powerpc_sync();
+ isync();
+ if (alternate_timebase_sync_style)
+ {
+ register_t oldmsr= intr_disable();
+
+ atomic_store_rel_64(&timebase_samples[1], 1u); // flag ap that bsp is ready to start.
+
+ while (0u==timebase_samples[0]) // Expect on ap's: 0u<mftb()
+ ; // spin waiting for ap's tbr value to flag that it is time to send one back.
+
+ isync(); // Be sure below mftb() result is not from earlier speculative execution.
+ atomic_store_rel_64(&timebase_samples[1], mftb()); // Give ap the bsp tbr value
+ // from the time frame.
+
+ // Most of the rest of the usage is in machdep_ap_bootstrap,
+ // other than controlling alternate_timebase_sync_style.
+
+ while (0u!=timebase_samples[1])
+ ; // spin waiting for ap's to be done with the samples.
+
+ mtmsr(oldmsr);
+ }
+#endif
} else
CPU_SET(pc->pc_cpuid, &stopped_cpus);
}
@@ -257,14 +338,22 @@
ap_awake = 1;
- /* Provide our current DEC and TB values for APs */
- ap_timebase = mftb() + 10;
- __asm __volatile("msync; isync");
+#if defined(__powerpc64__) && defined(AIM)
+ if (!alternate_timebase_sync_style)
+#endif
+ {
+ /* Provide our current DEC and TB values for APs */
+ ap_timebase = mftb() + 10;
+ __asm __volatile("msync; isync");
+ }
/* Let APs continue */
atomic_store_rel_int(&ap_letgo, 1);
- platform_smp_timebase_sync(ap_timebase, 0);
+#if defined(__powerpc64__) && defined(AIM)
+ if (!alternate_timebase_sync_style)
+#endif
+ platform_smp_timebase_sync(ap_timebase, 0);
while (ap_awake < smp_cpus)
;
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)