ULE steal_idle questions

Don Lewis truckman at FreeBSD.org
Wed Aug 23 15:04:57 UTC 2017


I've been looking at the steal_idle code in tdq_idled() and found some
things that puzzle me.

Consider a machine with three CPUs:
  A, which is idle
  B, which is busy running a thread
  C, which is busy running a thread and has another thread in queue
It would seem to make sense that the tdq_load values for these three
CPUs would be 0, 1, and 2 respectively in order to select the best CPU
to run a new thread.

If so, then why do we pass thresh=1 to sched_highest() in the code that
implements steal_idle?  That value is used to set cs_limit which is used
in this comparison in cpu_search:
                        if (match & CPU_SEARCH_HIGHEST)
                                if (tdq->tdq_load >= hgroup.cs_limit &&
That would seem to make CPU B a candidate for stealing a thread from.
Ignoring CPU C for the moment, that shouldn't happen if the thread is
running, but even if it were possible, it would just make CPU B go idle,
which isn't terribly helpful in terms of load balancing and would just
thrash the caches.  The same comparison is repeated in tdq_idled() after
a candidate CPU has been chosen:
                if (steal->tdq_load < thresh || steal->tdq_transferable == 0) {
                        tdq_unlock_pair(tdq, steal);
                        continue;
                }

It looks to me like there is an off-by-one error here, and there is a
similar problem in the code that implements kern.sched.balance.
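
To make the three-CPU example concrete, here is a minimal userland
sketch of the two checks.  It is not the kernel code: struct toy_tdq,
toy_highest() and toy_can_steal() are hypothetical names made up for
illustration, and toy_highest() only models the load comparison against
the limit, not whatever else cpu_search() tests.  The assumption is that
tdq_load counts the running thread plus queued threads, as described
above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy stand-in for a per-CPU run queue.  The field names mirror the
 * ones quoted above, but this is NOT the kernel's struct tdq.
 */
struct toy_tdq {
	int tdq_load;		/* running thread + queued threads */
	int tdq_transferable;	/* queued threads that could migrate */
};

/*
 * Hypothetical, simplified model of the "highest load" search: return
 * the index of the most loaded CPU whose load is >= thresh, or -1 if
 * no CPU qualifies.  Only the load-vs-limit comparison is modeled.
 */
static int
toy_highest(const struct toy_tdq *q, size_t ncpu, int thresh)
{
	int best = -1;
	int best_load = -1;

	assert(q != NULL);
	for (size_t i = 0; i < ncpu; i++) {
		if (q[i].tdq_load >= thresh && q[i].tdq_load > best_load) {
			best = (int)i;
			best_load = q[i].tdq_load;
		}
	}
	return (best);
}

/*
 * Same condition as the recheck quoted from tdq_idled(): stealing is
 * refused if the load is below thresh or nothing is transferable.
 */
static int
toy_can_steal(const struct toy_tdq *q, int thresh)
{
	return (!(q->tdq_load < thresh || q->tdq_transferable == 0));
}
```

With only CPUs A (load 0) and B (load 1, nothing transferable) in the
array, toy_highest() with thresh=1 returns B's index, while thresh=2
returns -1.  In this model the thresh=1 pick of B is then rejected by
toy_can_steal() because nothing is transferable, so the load test alone
never rules B out.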


The reason I ask is that I've been debugging random segfaults and other
strange errors on my Ryzen machine and the problems mostly go away if I
either disable kern.sched.steal_idle and kern.sched.balance, or if I
leave kern.sched.steal_idle enabled and hack the code to change the
value of thresh from 1 to 2.  See
<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221029> for the gory
details.  I don't know if my CPU has what AMD calls the "performance
marginality issue".



