ULE steal_idle questions

Don Lewis truckman at FreeBSD.org
Sat Aug 26 17:50:25 UTC 2017


On 25 Aug, To: avg at FreeBSD.org wrote:
> On 24 Aug, To: avg at FreeBSD.org wrote:
>> Aside from the Ryzen problem, I think the steal_idle code should be
>> re-written so that it doesn't block interrupts for so long.  In its
>> current state, interrupt latency increases with the number of cores and
>> the complexity of the topology.
>> 
>> What I'm thinking is that we should set a flag at the start of the
>> search for a thread to steal.  If we are preempted by another,
>> higher-priority thread, that thread will clear the flag.  Next we start the
>> loop to search up the hierarchy.  Once we find a candidate CPU:
>> 
>>         steal = TDQ_CPU(cpu);
>>         CPU_CLR(cpu, &mask);
>>         tdq_lock_pair(tdq, steal);
>>         if (tdq->tdq_load != 0) {
>>                 /* Our own queue has work; exit the loop and run it. */
>>                 goto out;
>>         }
>>         if (flag was cleared) {
>>                 /* We were preempted; the search data may be stale. */
>>                 tdq_unlock_pair(tdq, steal);
>>                 goto restart;
>>         }
>>         if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
>>             tdq_move(steal, tdq) == 0) {
>>                 /* Nothing stealable here (or the move failed); try the
>>                    next candidate CPU. */
>>                 tdq_unlock_pair(tdq, steal);
>>                 continue;
>>         }
>> out:
>>         TDQ_UNLOCK(steal);
>>         clear flag;
>>         mi_switch(SW_VOL | SWT_IDLE, NULL);
>>         thread_unlock(curthread);
>>         return (0);
>> 
>> And we also have to clear the flag if we did not find a thread to steal.
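>> 
>> A minimal sketch of the flag lifecycle, assuming a new per-CPU field in
>> struct tdq (the tdq_searching name and the hook placement are
>> hypothetical, not existing sched_ule.c code):
>> 
>>         struct tdq {
>>                 /* ... existing fields ... */
>>                 volatile int tdq_searching; /* steal search in progress */
>>         };
>> 
>>         /* In tdq_idled(), before starting the search: */
>>         tdq->tdq_searching = 1;
>> 
>>         /* In the preemption path (e.g. sched_preempt()) on this CPU: */
>>         TDQ_SELF()->tdq_searching = 0;
>> 
>>         /* In tdq_idled(), if no thread was found to steal: */
>>         tdq->tdq_searching = 0;
>>         return (1);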
> 
> I've implemented something like this and added a bunch of counters to it
> to get a better understanding of its behavior.  Instead of adding a flag
> to detect preemption, I used the same switchcnt test that sched_idletd()
> uses (sketched below, after the counters).  These are the results of a
> ~9 hour poudriere run:
> 
> kern.sched.steal.none: 9971668   # no threads were stolen
> kern.sched.steal.fail: 23709     # unable to steal from cpu=sched_highest()
> kern.sched.steal.level2: 191839  # somewhere on this chip
> kern.sched.steal.level1: 557659  # a core on this CCX
> kern.sched.steal.level0: 4555426 # the other SMT thread on this core
> kern.sched.steal.restart: 404    # preemption detected so restart the search
> kern.sched.steal.call: 15276638  # times tdq_idled() was called
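> 
> For reference, the switchcnt test looks roughly like this.  It is a
> sketch modeled on sched_idletd(); the exact placement in my patch
> differs:
> 
> restart:
>         /*
>          * Sample the switch counters before gathering topology data.
>          * tdq_switchcnt counts switches this tick and tdq_oldswitchcnt
>          * the previous tick, so any context switch on this CPU (i.e.
>          * preemption of the idle thread) changes the sum.
>          */
>         switchcnt = tdq->tdq_switchcnt + tdq->tdq_oldswitchcnt;
>         /* ... pick a victim CPU and lock the pair, as above ... */
>         if (switchcnt != tdq->tdq_switchcnt + tdq->tdq_oldswitchcnt) {
>                 /* We were preempted; the data may be stale. */
>                 tdq_unlock_pair(tdq, steal);
>                 goto restart;
>         }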
> 
> There are a few surprises here.
> 
> One is the number of failed moves.  I don't know whether the load on the
> source CPU fell below thresh, tdq_transferable went to zero, or
> tdq_move() itself failed.  I also wonder whether the failures are evenly distributed
> across CPUs.  It is possible that these failures are concentrated on CPU
> 0, which handles most interrupts.  If interrupts don't affect switchcnt,
> then the data collected by sched_highest() could be a bit stale and we
> would not know it.

Most of the above failed moves were due to either tdq_load dropping
below the threshold or tdq_transferable going to zero.  These are evenly
distributed across the CPUs that we want to steal from.  I did not bin
the results by which CPU this code was running on (a possible way to do
that is sketched below).  Actual failures of tdq_move() are bursty and
not evenly distributed across CPUs.
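
If I wanted to bin them, per-CPU counters exported through an opaque
sysctl would be enough.  A rough sketch, assuming the kern.sched.steal
sysctl node from the patch and using illustrative names that are not
part of D12130:

        /* Failed steal attempts, binned by the CPU doing the stealing. */
        static unsigned long steal_fail_bin[MAXCPU];

        SYSCTL_OPAQUE(_kern_sched_steal, OID_AUTO, fail_bin, CTLFLAG_RD,
            steal_fail_bin, sizeof(steal_fail_bin), "LU",
            "steal failures binned by stealing CPU");

        /* In tdq_idled(), when the steal attempt fails: */
        steal_fail_bin[curcpu]++;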

I've created this review for my changes:
https://reviews.freebsd.org/D12130

