Re: git: 589aed00e36c - main - sched: separate out schedinit_ap()
Date: Wed, 17 Nov 2021 23:25:55 UTC
On Thu, Nov 18, 2021 at 01:10:17AM +0200, Konstantin Belousov wrote:
> On Wed, Nov 17, 2021 at 04:44:29PM -0600, Kyle Evans wrote:
> > On Wed, Nov 3, 2021 at 3:55 PM Kyle Evans <kevans@freebsd.org> wrote:
> > >
> > > The branch main has been updated by kevans:
> > >
> > > URL: https://cgit.FreeBSD.org/src/commit/?id=589aed00e36c22733d3fd9c9016deccf074830b1
> > >
> > > commit 589aed00e36c22733d3fd9c9016deccf074830b1
> > > Author:     Kyle Evans <kevans@FreeBSD.org>
> > > AuthorDate: 2021-11-02 18:06:47 +0000
> > > Commit:     Kyle Evans <kevans@FreeBSD.org>
> > > CommitDate: 2021-11-03 20:54:59 +0000
> > >
> > >     sched: separate out schedinit_ap()
> > >
> > >     schedinit_ap() sets up an AP for a later call to sched_throw(NULL).
> > >
> > >     Currently, ULE sets up some pcpu bits and fixes the idlethread lock with
> > >     a call to sched_throw(NULL); this results in a window where curthread is
> > >     setup in platforms' init_secondary(), but it has the wrong td_lock.
> > >     Typical platform AP startup procedure looks something like:
> > >
> > >     - Setup curthread
> > >     - ... other stuff, including cpu_initclocks_ap()
> > >     - Signal smp_started
> > >     - sched_throw(NULL) to enter the scheduler
> > >
> > >     cpu_initclocks_ap() may have callouts to process (e.g., nvme) and
> > >     attempt to sched_add() for this AP, but this attempt fails because
> > >     of the noted violated assumption leading to locking heartburn in
> > >     sched_setpreempt().
> > >
> > >     Interrupts are still disabled until cpu_throw() so we're not really at
> > >     risk of being preempted -- just let the scheduler in on it a little
> > >     earlier as part of setting up curthread.
> > >
> >
> > What's the general consensus on potential out-of-tree archs maintained
> > on stable branches? I'd like to MFC this at least to stable/13 to
> > avoid it being in the way of the nvme change that spurred it, and I'm
> > trying to decide if it should have something like this added to make
> > it safe:
> I do not believe that we even think of guaranteeing this level of source
> stability.

At first I assumed this was referencing sparc64, but that is not present
in stable/13 either. I believe stable/13 and main support the same set of
platforms, in which case I agree that we shouldn't bother with trying to
provide extra compatibility, and I think it's probably not necessary to
merge this to 12.

> >
> > diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c
> > index 217d685b8587..f07f5e91d8f3 100644
> > --- a/sys/kern/sched_ule.c
> > +++ b/sys/kern/sched_ule.c
> > @@ -2995,6 +2995,11 @@ sched_throw(struct thread *td)
> >
> >         tdq = TDQ_SELF();
> >         if (__predict_false(td == NULL)) {
> > +               if (tdq == NULL || PCPU_GET(idlethread)->td_lock !=
> > +                   TDQ_LOCKPTR(tdq)) {
> > +                       schedinit_ap();
> > +                       tdq = TDQ_SELF();
> > +               }
> >                 TDQ_LOCK(tdq);
> >                 /* Correct spinlock nesting. */
> >                 spinlock_exit();
> >
>
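
For anyone following along, the intent of the guard in the quoted diff can
be modelled in userspace roughly as below. This is only a sketch: every
name in it (struct pcpu_model, model_schedinit_ap(), and so on) is a
hypothetical stand-in, not the kernel's real pcpu, tdq, or thread
structures. It shows only the idea that sched_throw(NULL) falls back to
the per-AP setup when the platform code has not already done it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real structures. */
struct lock { int held; };
struct tdq { struct lock tdq_lock; };
struct thread { struct lock *td_lock; };

/* Models the per-CPU state an AP carries during startup. */
struct pcpu_model {
	struct tdq *tdq;		/* NULL until per-AP scheduler setup runs */
	struct thread idlethread;	/* starts out with a placeholder lock */
	int init_calls;			/* counts per-AP setup calls, for checking */
};

static struct tdq ap_tdq;		/* the AP's run queue in this model */
static struct lock boot_lock;		/* placeholder lock before setup */

/* Models schedinit_ap(): attach the tdq and fix the idle thread's lock. */
static void
model_schedinit_ap(struct pcpu_model *pc)
{
	pc->tdq = &ap_tdq;
	pc->idlethread.td_lock = &pc->tdq->tdq_lock;
	pc->init_calls++;
}

/* Models the condition tested by the proposed guard. */
static int
model_needs_init(const struct pcpu_model *pc)
{
	return (pc->tdq == NULL ||
	    pc->idlethread.td_lock != &pc->tdq->tdq_lock);
}

/* Models the td == NULL path of sched_throw() with the guard applied. */
static void
model_sched_throw_null(struct pcpu_model *pc)
{
	if (model_needs_init(pc))
		model_schedinit_ap(pc);	/* platform never did the setup */
	pc->tdq->tdq_lock.held = 1;	/* stands in for TDQ_LOCK(tdq) */
}
```

In this model, a platform that calls model_schedinit_ap() itself sees
exactly one setup call, and a platform that was never updated gets the
setup lazily on entry to the scheduler.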