Questions with a powerpc64/powerpc context: relaxed use of smp_cpus in umtx_busy vs. relaxed updates to smp_cpus in machine dependent code?
Eric van Gyzen
eric at vangyzen.net
Wed Feb 13 21:45:24 UTC 2019
On 2/13/19 2:23 PM, Mark Millard via freebsd-hackers wrote:
> Why I ask the questions below (after providing context):
> There are boot issues on old multi-processor PowerMac G5s that
> frequently hang up during cpu_mp_unleash --but not always.
>
>
> /usr/src/sys/kern/kern_umtx.c has the following code
> (note the smp_cpus use in the machine-independent code):
>
>
> static inline void
> umtxq_busy(struct umtx_key *key)
> {
>     struct umtxq_chain *uc;
>
>     uc = umtxq_getchain(key);
>     mtx_assert(&uc->uc_lock, MA_OWNED);
>     if (uc->uc_busy) {
> #ifdef SMP
>         if (smp_cpus > 1) {
>             int count = BUSY_SPINS;
>             if (count > 0) {
>                 umtxq_unlock(key);
>                 while (uc->uc_busy && --count > 0)
>                     cpu_spinwait();
>                 umtxq_lock(key);
>             }
>         }
> #endif
>         while (uc->uc_busy) {
>             uc->uc_waiters++;
>             msleep(uc, &uc->uc_lock, 0, "umtxqb", 0);
>             uc->uc_waiters--;
>         }
>     }
>     uc->uc_busy = 1;
> }
>
> The use of smp_cpus here on powerpc would be what is called
> a std::memory_order_relaxed load in c++ terms. smp_cpus
> does change during the machine-dependent code cpu_mp_unleash
> in /usr/src/sys/powerpc/powerpc/mp_machdep.c :
>
> static void
> cpu_mp_unleash(void *dummy)
> {
> . . .
>     smp_cpus = 0;
> . . .
>     STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) {
> . . .
>         if (pc->pc_awake) {
>             if (bootverbose)
>                 printf("Adding CPU %d, hwref=%jx, awake=%x\n",
>                     pc->pc_cpuid, (uintmax_t)pc->pc_hwref,
>                     pc->pc_awake);
>             smp_cpus++;
>         } else
> . . .
> }
>
> which are relaxed stores.
>
> [This does not appear to be a std::memory_order_consume like
> context (no dependency ordered before usage).]
>
> /usr/src/sys/kern/subr_smp.c does initialize smp_cpus to 1
> in its definition. (But it temporarily reverts to zero in
> the above code.)
>
> So far I've not managed to track down examples of specific
> code (in an objdump of the kernel, say) that matches up
> using some form(s) of the following to control access
> order in the various places umtxq_busy is used:
>
> lwsync (acquire/release/AcqRel fence or store-release [with load-acquire code as well])
> or:
> sync (a.k.a. hwsync and sync 0) (sequentially consistent fence/store/load)
>
> Note: smp_cpus is not even volatile so, potentially, for a time a register
> could be all that holds the sequence of smp_cpus values before memory is
> updated later.
>
> Nor have I yet found the earliest use of the umtxq_busy code. If it is
> late enough after cpu_mp_unleash, that might implicitly provide something
> that is not a local code structure.
>
> Can anyone point me to example(s) of what controls umtxq_busy necessarily
> accessing the intended smp_cpus value?

umtxq_busy() is only called by userland synchronization primitives, such
as mutexes, condition variables, and semaphores. Assuming
cpu_mp_unleash() is called before userland is started, umtxq_busy()
should see the correct value of smp_cpus.

However, even if umtxq_busy() sees a value of 0 or 1 when the correct
value would be greater than 1, I don't see how this could cause a
problem: the "smp_cpus > 1" test would simply fail, so it would take the
safer approach of sleeping in msleep() instead of spinning.
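
To make that argument concrete, here is a minimal userland sketch of the
decision umtxq_busy() makes, based only on the code quoted above. It is
not kernel code: busy_wait_path(), the wait_path enum, and the BUSY_SPINS
value used here are names and a placeholder I made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder spin budget for this sketch; the kernel defines its own. */
#define BUSY_SPINS 200

/* Hypothetical labels for the paths through umtxq_busy(). */
enum wait_path {
    PATH_NO_WAIT,    /* chain was not busy: just mark it busy */
    PATH_SLEEP_ONLY, /* skip the spin loop, go straight to msleep() */
    PATH_SPIN_FIRST  /* spin up to BUSY_SPINS times, then sleep if needed */
};

/*
 * Model which path is taken for a given observed smp_cpus value.
 * A stale value of 0 or 1 can only select PATH_SLEEP_ONLY, never an
 * unwanted spin.
 */
static enum wait_path
busy_wait_path(int smp_cpus_seen, bool chain_busy)
{
    assert(smp_cpus_seen >= 0);

    if (!chain_busy)
        return PATH_NO_WAIT;
    if (smp_cpus_seen > 1 && BUSY_SPINS > 0)
        return PATH_SPIN_FIRST;
    return PATH_SLEEP_ONLY;
}
```

With a busy chain, observed values of 0 and 1 both give PATH_SLEEP_ONLY,
and only a value greater than 1 gives PATH_SPIN_FIRST. So a stale read
costs at most the spin optimization, not correctness.
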
Best of luck,
Eric