ULE steal_idle questions
Don Lewis
truckman at FreeBSD.org
Sat Aug 26 17:50:25 UTC 2017
On 25 Aug, To: avg at FreeBSD.org wrote:
> On 24 Aug, To: avg at FreeBSD.org wrote:
>> Aside from the Ryzen problem, I think the steal_idle code should be
>> re-written so that it doesn't block interrupts for so long. In its
>> current state, interrupt latency increases with the number of cores and
>> the complexity of the topology.
>>
>> What I'm thinking is that we should set a flag at the start of the
>> search for a thread to steal. If we are preempted by another,
>> higher-priority thread, that thread will clear the flag. Next we start
>> the loop to search up the hierarchy. Once we find a candidate CPU:
>>
>> 	steal = TDQ_CPU(cpu);
>> 	CPU_CLR(cpu, &mask);
>> 	tdq_lock_pair(tdq, steal);
>> 	if (tdq->tdq_load != 0) {
>> 		goto out; /* to exit loop and switch to the new thread */
>> 	}
>> 	if (flag was cleared) {
>> 		tdq_unlock_pair(tdq, steal);
>> 		goto restart; /* restart the search */
>> 	}
>> 	if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
>> 	    tdq_move(steal, tdq) == 0) {
>> 		tdq_unlock_pair(tdq, steal);
>> 		continue;
>> 	}
>> out:
>> 	TDQ_UNLOCK(steal);
>> 	clear flag;
>> 	mi_switch(SW_VOL | SWT_IDLE, NULL);
>> 	thread_unlock(curthread);
>> 	return (0);
>>
>> And we also have to clear the flag if we did not find a thread to steal.
>
> I've implemented something like this and added a bunch of counters to it
> to get a better understanding of its behavior. Instead of adding a flag
> to detect preemption, I used the same switchcnt test as is used by
> sched_idletd(). These are the results of a ~9 hour poudriere run:
>
> kern.sched.steal.none: 9971668 # no threads were stolen
> kern.sched.steal.fail: 23709 # unable to steal from cpu=sched_highest()
> kern.sched.steal.level2: 191839 # somewhere on this chip
> kern.sched.steal.level1: 557659 # a core on this CCX
> kern.sched.steal.level0: 4555426 # the other SMT thread on this core
> kern.sched.steal.restart: 404 # preemption detected so restart the search
> kern.sched.steal.call: 15276638 # of times tdq_idled() called
>
> There are a few surprises here.
>
> One is the number of failed moves. I don't know if the load on the
> source CPU fell below thresh, tdq_transferable went to zero, or if
> tdq_move() failed. I also wonder if the failures are evenly distributed
> across CPUs. It is possible that these failures are concentrated on CPU
> 0, which handles most interrupts. If interrupts don't affect switchcnt,
> then the data collected by sched_highest() could be a bit stale and we
> would not know it.
Most of the above failed moves were due to either tdq_load dropping
below the threshold or tdq_transferable going to zero. These are evenly
distributed across the CPUs that we want to steal from. I did not bin
the results by which CPU this code was running on. Actual failures of
tdq_move() are bursty and not evenly distributed across CPUs.
I've created this review for my changes:
https://reviews.freebsd.org/D12130
More information about the freebsd-arch
mailing list