The 11.1-RC3 can only boot and attach disks in "Safe mode", otherwise gets stuck attaching
Mark Johnston
markj at freebsd.org
Mon Jul 24 02:15:09 UTC 2017
On Thu, Jul 20, 2017 at 03:45:39PM +0200, Mark Martinec wrote:
> 2017-07-20 02:03, Mark Johnston wrote:
> > One thing to try at this point would be to disable EARLY_AP_STARTUP in
> > the kernel config. That is, take a configuration with which you're able
> > to reproduce the hang during boot, and remove "options
> > EARLY_AP_STARTUP".
>
> Done. And it avoids the problem altogether! Thanks.
> Tried a reboot several times and it succeeds every time.
Thanks. Sorry for the delayed follow-up.
>
> Here is all that I had in a config file for building a kernel,
> i.e. I took away the 'options DDB' which also seemingly avoided
> the problem:
> include GENERIC
> ident NELI
> nooptions EARLY_AP_STARTUP
Could you try re-enabling EARLY_AP_STARTUP, applying the patch at the
end of this email, and checking whether the message "sleeping before
eventtimer init" appears in the boot output? If it does, it'll be
followed by a backtrace that might be useful for tracking down the hang.
The check might produce false positives, but we'll see.
>
> > This feature has a fairly large impact on the bootup process and has
> > had a few problems that manifested as hangs during boot. There was at
> > least one other case where an innocuous change to the kernel
> > configuration "fixed" the problem by introducing some second-order
> > effect (causing kernel threads to be scheduled in a different
> > order, for instance).
>
> > Regardless of whether the suggestion above makes a difference, it would
> > be helpful to see verbose dmesgs from both a clean boot and a boot that
> > hangs. If disabling EARLY_AP_STARTUP helps, then we can try adding some
> > assertions that will cause the system to panic when the hang occurs,
> > making it easier to see what's going on.
>
> Hmmm.
> I have now saved a couple of versions of /var/run/dmesg.boot
> (in boot_verbose mode) when EARLY_AP_STARTUP is disabled and
> the boot is successful. However, I don't know how to capture
> such a log when the boot hangs, as I have no serial interface
> and the boot never completes. All I have is a screen photo
> of the last state when a hang occurs (showing ada disks
> successfully attached, followed immediately by the attempt
> to attach a da disk, which hangs).
Ok, let's not worry about this for now.
Index: sys/kern/kern_clock.c
===================================================================
--- sys/kern/kern_clock.c (revision 321401)
+++ sys/kern/kern_clock.c (working copy)
@@ -385,6 +385,8 @@
static int devpoll_run = 0;
#endif
+bool inited_clocks = false;
+
/*
* Initialize clock frequencies and start both clocks running.
*/
@@ -412,6 +414,8 @@
#ifdef SW_WATCHDOG
EVENTHANDLER_REGISTER(watchdog_list, watchdog_config, NULL, 0);
#endif
+
+ inited_clocks = true;
}
/*
Index: sys/kern/kern_synch.c
===================================================================
--- sys/kern/kern_synch.c (revision 321401)
+++ sys/kern/kern_synch.c (working copy)
@@ -298,6 +298,8 @@
return (rval);
}
+extern bool inited_clocks;
+
/*
* pause() delays the calling thread by the given number of system ticks.
* During cold bootup, pause() uses the DELAY() function instead of
@@ -330,6 +332,10 @@
DELAY(sbt);
return (0);
}
+ if (cold && !inited_clocks) {
+ printf("%s: sleeping before eventtimer init\n", curthread->td_name);
+ kdb_backtrace();
+ }
return (_sleep(&pause_wchan[curcpu], NULL, 0, wmesg, sbt, pr, flags));
}
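
For anyone reading the patch without a kernel tree at hand, here is a
minimal user-space sketch of the check it adds. It only models the logic:
"cold" and "inited_clocks" are plain statics standing in for the kernel
variables, and model_initclocks(), model_pause() and early_sleep_warnings
are hypothetical names that do not exist in the kernel. The real patch
also calls kdb_backtrace(), which is left out here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins for the kernel globals. */
static bool cold = true;		/* models the kernel's "cold" boot flag */
static bool inited_clocks = false;	/* set once clock init has finished */
static int early_sleep_warnings = 0;	/* assumption: counter for this sketch only */

/* Models the line the patch adds at the end of clock initialization. */
static void
model_initclocks(void)
{
	inited_clocks = true;
}

/*
 * Models the check the patch adds to the pause path: warn when a thread
 * is about to sleep while the system is still cold and the clocks have
 * not been initialized. Returns true if a warning was printed.
 */
static bool
model_pause(const char *thread_name)
{
	if (cold && !inited_clocks) {
		printf("%s: sleeping before eventtimer init\n", thread_name);
		early_sleep_warnings++;
		return (true);
	}
	return (false);
}
```

Calling model_pause() before model_initclocks() prints the warning; calling
it afterwards does not. That is the ordering the patch is trying to expose
during boot.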