Re: boot hangs after installworld at FreeBSD 14.0-CURRENT main-n248198-72f7ddb587a

From: Gary Jennejohn <gljennjohn_at_gmail.com>
Date: Mon, 26 Jul 2021 06:19:41 UTC

On Sun, 25 Jul 2021 19:02:29 +0200
Gary Jennejohn <gljennjohn@gmail.com> wrote:

> On Sun, 25 Jul 2021 09:54:35 -0600
> Warner Losh <imp@bsdimp.com> wrote:
> 
> > On Sun, Jul 25, 2021 at 3:30 AM Gary Jennejohn <gljennjohn@gmail.com> wrote:
> >   
> > > I updated my FBSD-14 tree yesterday.
> > >
> > > uname -a shows FreeBSD 14.0-CURRENT #5 main-n248198-72f7ddb587a.
> > >
> > > Did a buildkernel and a clean buildworld yesterday.
> > >
> > > This morning I booted the new kernel, did an installworld and rebooted
> > > the new kernel.
> > >
> > > Or, should I say, I tried to reboot the new kernel.
> > >
> > > During boot I see the following output:
> > >
> > > loading /boot/defaults/loader.conf
> > > /
> > >
> > > and the boot hangs.
> > >
> > > The second line should have contained
> > > /boot/test/kernel (I always install new kernels to /boot/test)
> > >
> > > followed by lines containing the various modules which get loaded.
> > >
> > > Luckily, I had a USB thumb drive with a FreeBSD memstick.img AND a
> > > complete backup of the old /boot, so I could boot from the thumb
> > > drive and restore /boot (but I moved /boot to /boot.bad before I
> > > did that just in case).  With the restored (old) /boot everything
> > > works.
> > >    
> > 
> > Little has changed in the boot loader. Do you know the hash that worked? Or
> > if I misread above, the hash that failed?
> >   
> 
> The /boot code which works was installed at 07:36 UTC July 9th. So,
> every change to the boot code since then is a potential culprit.
> 
> Example: commit 9c1c02093b90ae49745a174eb26ea85dd1990eec, a change to support.4th.
> It just so happens that I had a nextboot.conf in the "bad" /boot at the
> time that the hang occurred.  This is the only potential candidate I
> can see.
> 
> So I'll try overwriting support.4th with the known-good version and
> see what happens.  But probably not until tomorrow my time.
> 

After deleting the nextboot.conf from the "bad" /boot I was able to
boot using the "bad" /boot.  That's the only change between this boot
and the previous boot which hung the computer.  Whether this is a
strong hint that the change to support.4th is the culprit I can't say,
but since the commit message explicitly mentions nextboot.conf as a
reason for the change, that may very well be the case.

I decided that removing nextboot.conf was a better test than using
the old support.4th.
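
For anyone who wants to repeat the test without losing the file, here is
a minimal sketch. The function name and the ".disabled" suffix are made
up for illustration and are not anything in the tree. The demo runs
against a scratch directory, not the real /boot.

```shell
#!/bin/sh
# Hypothetical helper: set nextboot.conf aside instead of deleting it,
# so it can be put back after the test boot.
# $1 is the boot directory to operate on (defaults to /boot).
set_aside_nextboot() {
    bootdir="${1:-/boot}"
    conf="$bootdir/nextboot.conf"
    if [ -f "$conf" ]; then
        mv "$conf" "$conf.disabled" && echo "moved $conf to $conf.disabled"
    else
        echo "no nextboot.conf in $bootdir"
    fi
}

# Dry run against a scratch directory rather than the real /boot.
scratch=$(mktemp -d)
echo 'nextboot_enable="YES"' > "$scratch/nextboot.conf"
set_aside_nextboot "$scratch"
```

To restore the file afterwards, rename nextboot.conf.disabled back to
nextboot.conf in the same directory.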

The change looks very simple and innocent, but my 4th knowledge is
pretty much non-existent, so I don't really understand what it does.

I went back to the "old" /boot because I use nextboot.conf a lot.

-- 
Gary Jennejohn