Re: Periodic rant about SCHED_ULE
- Reply: Andriy Gapon : "Re: Periodic rant about SCHED_ULE"
- In reply to: Dewayne Geraghty : "Re: Periodic rant about SCHED_ULE"
Date: Mon, 19 Jul 2021 00:37:43 UTC
On Thu, 15 Jul 2021 11:03:04 +1000
Dewayne Geraghty wrote:

> On 15/07/2021 1:47 am, RW via freebsd-hackers wrote:
> > kern.sched.preempt_thresh=224
> > ...
> > I think the default only allows preemption by real-time and kernel
> > threads.
> >
> Hi RW, Note the PRI(ority) column when you perform /usr/bin/top.
> Processes with a PRI below the default kern.sched.preempt_thresh=80
> (ie nice -n 8) may pre-empt other processes or send interprocessor
> interrupts to others (CPUs).

I haven't got time to look into this in detail, but from a cursory
examination it looks like there must be some kind of translation
between the PRI values seen in top and the priorities used in the
scheduler.

A threshold of 80 would be sensible in the context of top. I just ran a
test and a CPU-bound process got a PRI of 85. But this seems to be
pure coincidence.

kern.sched.preempt_thresh is compared with the scheduler's internal
priorities and defaults to PRI_MIN_KERN, the highest priority in the
kernel range, one level below realtime.

From sys/sys/priority.h

#define PRI_MIN_REALTIME        (48)
#define PRI_MAX_REALTIME        (PRI_MIN_KERN - 1)

#define PRI_MIN_KERN            (80)
#define PRI_MAX_KERN            (PRI_MIN_TIMESHARE - 1)

#define PRI_MIN_TIMESHARE       (120)
#define PRI_MAX_TIMESHARE       (PRI_MIN_IDLE - 1)

#define PRI_MIN_IDLE            (224)

> idprio 0 top
> is assigned a starting PRI of 124; so on SCHED_ULE, these processes
> will receive cpu time (even at idprio 31) but won't pre-empt others.
>
> If you really want all processes to pre-empt others, enabling
> FULL_PREEMPTION achieves the same goal as 224. I don't have a use
> case for no pre-emption. Anyone?
>
> Why kern.sched.preempt_thresh=224 helps desktop users, I can only
> speculate that with a high threshold, more IPI's are sent to other CPU
> cores so they can be busy (?). Refer to
> /usr/src/sys/kern/sched_ule.c --
> Returning to the topic. Its a very hard choice between schedulers. I
> did a lot of testing between them and tuning to see if one excelled on
> my humble Xeon-E3. I couldn't see a significant difference between
> workloads - though next time (and a hint for others) I'll disable SMT
> and set dev.cpu.0.freq to disable turbo behaviour. For now,
> sched_4bsd appears to be more efficient in terms of code complexity
> and people with high CPU workloads have preferred sched_4bsd in the
> past, while sched_ule has a lot of things to tweak and is recommended
> by the FreeBSD project. Otherwise it wouldn't be the default
>
> Looking at
> https://github.com/freebsd/freebsd-src/tree/main/sys/kern/sched_*.c
> their histories are tweaked a couple of times a year, so I wouldn't
> rule sched_4bsd out of contention just yet.
>
> FWIW, my servers modify only:
> kern.sched.affinity=7
> kern.sched.interact=0
> kern.sched.slice=128
> while firewalls:
> kern.sched.balance=0
> kern.sched.interact=0
>
> A loadable schedule has been discussed here a few times - I vaguely
> recall it being inefficient (complexity) and unnecessary (you'll
> determine one scheduler and unless testing, unlikely to change).
> Further in the past, sched_4bsd was to be removed, but some
> demonstrated it had better performance for their workload.
> Cheerio.
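
For illustration only, here is a minimal C sketch of the threshold
comparison discussed above. It is not the kernel's code: may_preempt()
and top_pri_to_kernel() are hypothetical helpers, and the real function
in sched_ule.c has additional cases that are left out here. The PZERO
offset is an assumption about how top derives its PRI column; it is
consistent with the 124 reported for idprio 0 (124 + 100 = 224 =
PRI_MIN_IDLE), but treat it as unverified.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted from sys/sys/priority.h in the message above. */
#define PRI_MIN_REALTIME   48
#define PRI_MIN_KERN       80
#define PRI_MIN_TIMESHARE  120
#define PRI_MIN_IDLE       224

/*
 * ASSUMPTION: top(1) shows the kernel priority minus PZERO, and
 * PZERO is PRI_MIN_KERN + 20. Not confirmed in this thread.
 */
#define PZERO (PRI_MIN_KERN + 20)

/* Hypothetical helper: map a PRI value seen in top to a kernel priority. */
static inline int
top_pri_to_kernel(int top_pri)
{
	return (top_pri + PZERO);
}

/*
 * Hypothetical, simplified model of the preemption decision.
 * Lower numbers mean higher priority. A newly runnable thread
 * preempts the running one only if it has a better priority AND
 * that priority is at or better than the threshold.
 * A threshold of 0 is modelled as "never preempt".
 */
static inline bool
may_preempt(int new_pri, int cur_pri, int thresh)
{
	assert(new_pri >= 0 && cur_pri >= 0 && thresh >= 0);
	if (new_pri >= cur_pri)
		return (false);	/* not better than what is running */
	if (thresh == 0)
		return (false);	/* preemption disabled in this model */
	return (new_pri <= thresh);
}
```

Under these assumptions, a timeshare thread shown by top at PRI 85
(kernel priority 185) does not preempt with the default threshold of
80, but does with kern.sched.preempt_thresh=224; a realtime thread
preempts in both cases.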