EARLY_AP_STARTUP hangs during boot
Gary Jennejohn
gljennjohn at gmail.com
Sun Jul 31 09:29:20 UTC 2016
On Sat, 30 Jul 2016 12:03:59 -0700
John Baldwin <jhb at freebsd.org> wrote:
> On Saturday, July 30, 2016 09:44:22 AM Gary Jennejohn wrote:
> > On Fri, 29 Jul 2016 13:17:42 -0700
> > John Baldwin <jhb at freebsd.org> wrote:
> >
> > > On Thursday, July 28, 2016 12:31:31 AM Gary Jennejohn wrote:
> > > > Well, now I know that ULE is a prerequisite for EARLY_AP_STARTUP! I
> > > > wasn't aware of that. I prefer BSD and that's the scheduler I did
> > > > the first tests with.
> > > >
> > > > But with the ULE scheduler the system comes up all the way.
> > > >
> > > > It would be nice if the BSD scheduler could also be modified to
> > > > work with EARLY_AP_STARTUP.
> > >
> > > I wasn't able to reproduce your hang with 4BSD, but I think I see a
> > > possible problem. Try this:
> > >
> > > diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
> > > index 7de56b6..d53331a 100644
> > > --- a/sys/kern/sched_4bsd.c
> > > +++ b/sys/kern/sched_4bsd.c
> > > @@ -327,7 +327,6 @@ maybe_preempt(struct thread *td)
> > > * - The current thread has a higher (numerically lower) or
> > > * equivalent priority. Note that this prevents curthread from
> > > * trying to preempt to itself.
> > > - * - It is too early in the boot for context switches (cold is set).
> > > * - The current thread has an inhibitor set or is in the process of
> > > * exiting. In this case, the current thread is about to switch
> > > * out anyways, so there's no point in preempting. If we did,
> > > @@ -348,7 +347,7 @@ maybe_preempt(struct thread *td)
> > > ("maybe_preempt: trying to run inhibited thread"));
> > > pri = td->td_priority;
> > > cpri = ctd->td_priority;
> > > - if (panicstr != NULL || pri >= cpri || cold /* || dumping */ ||
> > > + if (panicstr != NULL || pri >= cpri /* || dumping */ ||
> > > TD_IS_INHIBITED(ctd))
> > > return (0);
> > > #ifndef FULL_PREEMPTION
> > > @@ -1127,7 +1126,7 @@ forward_wakeup(int cpunum)
> > > if ((!forward_wakeup_enabled) ||
> > > (forward_wakeup_use_mask == 0 && forward_wakeup_use_loop == 0))
> > > return (0);
> > > - if (!smp_started || cold || panicstr)
> > > + if (!smp_started || panicstr)
> > > return (0);
> > >
> > > forward_wakeups_requested++;
> > >
> >
> > Thanks, but with this patch the kernel hangs in exactly the same
> > place as before - after the HPET output.
> >
> > Maybe I'm missing some kernel option which ULE works around, or
> > something like that.
>
> Hmm, ok. Please add KTR_RUNQ and KTR_SMP to the KTR masks, that is
> 'options KTR_COMPILE=(KTR_PROC|KTR_RUNQ|KTR_SMP)' and
> 'options KTR_MASK=(KTR_PROC|KTR_RUNQ|KTR_SMP)'
>
> Please also add this patch (on top of the previous patch):
>
> diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
> index 2973a23..bab2278 100644
> --- a/sys/kern/sched_4bsd.c
> +++ b/sys/kern/sched_4bsd.c
> @@ -1278,6 +1278,8 @@ sched_add(struct thread *td, int flags)
> KASSERT(td->td_flags & TDF_INMEM,
> ("sched_add: thread swapped out"));
>
> + CTR2(KTR_PROC, "sched_add: thread %d (%s)", td->td_tid,
> + sched_tdname(td));
> KTR_STATE2(KTR_SCHED, "thread", sched_tdname(td), "runq add",
> "prio:%d", td->td_priority, KTR_ATTR_LINKED,
> sched_tdname(curthread));
> diff --git a/sys/x86/x86/cpu_machdep.c b/sys/x86/x86/cpu_machdep.c
> index f07b97e..1f418f1 100644
> --- a/sys/x86/x86/cpu_machdep.c
> +++ b/sys/x86/x86/cpu_machdep.c
> @@ -440,6 +440,7 @@ cpu_idle_wakeup(int cpu)
> return (0);
> if (*state == STATE_MWAIT)
> *state = STATE_RUNNING;
> + CTR1(KTR_PROC, "cpu_idle_wakeup: wokeup CPU %d", cpu);
> return (1);
> }
>
> (I haven't tried compiling it, you might have to add the sys/ktr.h
> header to cpu_machdep.c if it doesn't build.)
>
> Hopefully we will get some better trace messages before it hangs
> with this added info. The root issue seems to be that 4BSD is
> pinning thread0 to some other CPU (due to sched_bind that happens
> inside of bus_bind_intr() when the HPET driver pins IRQs to CPUs)
> and that other CPU isn't waking up to realize it needs to run thread0.
>
It compiled with no changes needed.
Even though I set MAXCPU to a mere 2, the boot still hadn't
completed after 90 minutes and I broke it off. I still have
the kernel, so I can try it another time when I have less need
for my FreeBSD box.
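
To make the suspected failure mode concrete, here is a toy userland
model of it. It is only a sketch of the theory quoted above (a thread
gets bound to a CPU that is asleep in its idle loop and never gets
woken), not kernel code. Every name in it (toy_cpu, toy_sched_add,
toy_tick, toy_boot) is made up for illustration and does not exist in
sched_4bsd.c, and the "kick" flag merely stands in for whatever wakeup
the real scheduler is supposed to send.

```c
#include <stdbool.h>

#define TOY_NCPU 2

/* Hypothetical per-CPU state for the toy model; not a kernel structure. */
struct toy_cpu {
	bool idle;	/* CPU is asleep in its idle loop */
	int queued;	/* threads bound to this CPU, waiting to run */
	int ran;	/* threads this CPU has run so far */
};

/*
 * Queue one thread bound to `cpu`.  If `kick` is false, the model skips
 * the wakeup, so an idle CPU never notices the new work.
 */
static void
toy_sched_add(struct toy_cpu *cpus, int cpu, bool kick)
{
	cpus[cpu].queued++;
	if (kick)
		cpus[cpu].idle = false;
}

/* One pass over all CPUs: awake CPUs drain their queue, then go idle. */
static void
toy_tick(struct toy_cpu *cpus)
{
	for (int i = 0; i < TOY_NCPU; i++) {
		if (cpus[i].idle)
			continue;
		cpus[i].ran += cpus[i].queued;
		cpus[i].queued = 0;
		cpus[i].idle = true;
	}
}

/*
 * Bind one thread to CPU 1 while it is idle, run a few ticks, and return
 * how many threads CPU 1 ended up running (1 = progress, 0 = hang).
 */
static int
toy_boot(bool kick)
{
	struct toy_cpu cpus[TOY_NCPU] = {
		{ .idle = false },	/* CPU 0: the boot CPU, awake */
		{ .idle = true },	/* CPU 1: an AP asleep in idle */
	};

	toy_sched_add(cpus, 1, kick);
	for (int tick = 0; tick < 10; tick++)
		toy_tick(cpus);
	return (cpus[1].ran);
}
```

In this model toy_boot(true) returns 1 (the bound thread runs), while
toy_boot(false) returns 0 no matter how many ticks pass, which is the
shape of hang the extra KTR traces are meant to confirm or rule out.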
--
Gary Jennejohn