Re: boot hangs after installworld at FreeBSD 14.0-CURRENT main-n248198-72f7ddb587a

From: Gary Jennejohn <gljennjohn_at_gmail.com>
Date: Sun, 25 Jul 2021 17:15:57 UTC
On Sun, 25 Jul 2021 12:35:58 -0400
Dennis Clarke via freebsd-current <freebsd-current@freebsd.org> wrote:

> On 7/25/21 11:54, Warner Losh wrote:
> > On Sun, Jul 25, 2021 at 3:30 AM Gary Jennejohn <gljennjohn@gmail.com> wrote:
> >   
> >> I updated my FBSD-14 tree yesterday.
> >>
> >> uname -a shows FreeBSD 14.0-CURRENT #5 main-n248198-72f7ddb587a.
> >>
> >> Did a buildkernel and a clean buildworld yesterday.
> >>
> >> This morning I booted the new kernel, did an installworld and rebooted
> >> the new kernel.
> >>
> >> Or, should I say, I tried to reboot the new kernel.
> >>
> >> During boot I see the following output:
> >>
> >> loading /boot/defaults/loader.conf
> >> /
> >>
> >> and the boot hangs.
> >>
> >> The second line should have contained
> >> /boot/test/kernel (I always install new kernels to /boot/test)
> >>
> >> followed by lines containing the various modules which get loaded.
> >>  
> 
> That is interesting. I have uname -apKU here :
> 
> FreeBSD europa 14.0-CURRENT FreeBSD 14.0-CURRENT #3: Sun Jul 25 13:50:33
> GMT 2021     root@europa:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
> amd64 1400026 1400026
> 
> Seems to be running fine with multiple ZFS pools and a lot of snapshots.
> 
> The most recent activity I see in the git log is :
> 
> europa$
> europa$ /opt/bw/bin/git --no-pager log -n 16 --graph
> * commit bbe80bff7c3549128bd19862eea7899b3def1d7f (HEAD -> main, origin/main, origin/HEAD)
> | Author: Peter Grehan <grehan@FreeBSD.org>
> | Date:   Sun Jul 25 19:34:14 2021 +1000
> |
> |     arm64: HWCAP/HWCAP2 aux args support for 32-bit ARM binaries.
> |
> |     This fixes build/run of golang under COMPAT32 emulation.
> |
> |     PR:     256897
> |     Reviewed by:    andrew, mmel, manu, jhb, cognet, Robert Clausecker
> |     Tested by:      brd, andrew, Robert Clausecker
> |     MFC after:      3 weeks
> |     Relnotes:       yes
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |     Differential Revision:  https://reviews.freebsd.org/D31175
> |
> .
> .
> .
> 
> I hate being one of those "works for me"(tm) jerks but perhaps there is
> a commit somewhere since yesterday that borked up your kernel? Hardly
> likely.
> 
> Going backwards to the 17th of July I see :
> 
> * commit 87c010e6e364e96e2c1546b3c2bbcbef1dcd422f
> | Author: Mateusz Guzik <mjg@FreeBSD.org>
> | Date:   Sat Jul 24 09:47:40 2021 +0200
> |
> |     pf: batch critical section for several counters
> |
> |     Reviewed by:    kp
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |
> * commit 02cf67ccf6538b14677672640e405f7f94044dc3
> | Author: Mateusz Guzik <mjg@FreeBSD.org>
> | Date:   Thu Jul 22 16:45:14 2021 +0200
> |
> |     pf: switch rule counters to pf_counter_u64
> |
> |     Reviewed by:    kp
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |
> * commit d40d4b3ed788b05697541b9ae94b1960ff2cf6f6
> | Author: Mateusz Guzik <mjg@FreeBSD.org>
> | Date:   Fri Jul 23 12:29:46 2021 +0200
> |
> |     pf: switch kif counters to pf_counter_u64
> |
> |     Reviewed by:    kp
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |
> * commit fc4c42ce0b5ce87901b327e25f55b4e3ab4c6cf5
> | Author: Mateusz Guzik <mjg@FreeBSD.org>
> | Date:   Sat Jul 24 07:33:52 2021 +0200
> |
> |     pf: switch pf_status.fcounters to pf_counter_u64
> |
> |     Reviewed by:    kp
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |
> * commit defdcdd5648dc1ea789bc54bb45108fcab546a6b
> | Author: Mateusz Guzik <mjg@FreeBSD.org>
> | Date:   Thu Jul 22 22:47:24 2021 +0200
> |
> |     pf: add hybrid 32- an 64- bit counters
> |
> |     Numerous counters got migrated from straight uint64_t to the counter(9)
> |     API. Unfortunately the implementation comes with a significiant
> |     performance hit on some platforms and cannot be easily fixed.
> |
> |     Work around the problem by implementing a pf-specific variant.
> |
> |     Reviewed by:    kp
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |
> * commit 6f1fb6561236fa933835a9a67bd442053fb509e9
> | Author: Mateusz Guzik <mjg@FreeBSD.org>
> | Date:   Sat Jul 24 07:17:27 2021 +0200
> |
> |     pf: drop redundant 'else' in pf_normalize_*
> |
> |     Reviewed by:    kp
> |     Sponsored by:   Rubicon Communications, LLC ("Netgate")
> |
> * commit 0d60235ecd6c711b997345c28e15f0335811e19f
> | Author: Peter Holm <pho@FreeBSD.org>
> | Date:   Sun Jul 25 09:00:53 2021 +0200
> |
> |     stress2: Add another "mdconfig -d -o force" test scenario
> |
> * commit 0626b0a89c2de9c5bfa5b22ed6b021e735a46bbe
> | Author: Robert Wing <rew@FreeBSD.org>
> | Date:   Sat Jul 24 15:57:41 2021 -0800
> |
> |     Add myself to the calendar
> |
> * commit 40cb9b435782de2bc44ff23582d8660072510efc
> | Author: Emmanuel Vadot <manu@FreeBSD.org>
> | Date:   Sat Jul 24 22:05:55 2021 +0200
> |
> |     arm64: allwinner: dtbo: Add dtb overlays to disable mmc node
> |
> |     This is useful for development.
> |     Sponsored by:   Diablotin Systems
> |
> * commit c44685732899aa76e8c77107d711f98717ddc5c8
> | Author: Jason A. Harmening <jah@FreeBSD.org>
> | Date:   Mon Jul 19 08:33:02 2021 -0700
> |
> |     Add stress2 test to exercise FFS forcible unmount with stacked nullfs
> |
> |     Reviewed by:    kib, mckusick
> |     Tested by:      pho
> |     Differential Revision:  https://reviews.freebsd.org/D31016
> |
> * commit 211ec9b7d6ec2d52e2fec2ce10e82c12ec0e4ddd
> | Author: Jason A. Harmening <jah@FreeBSD.org>
> | Date:   Sat Jul 17 22:35:42 2021 -0700
> |
> |     FFS: remove ffs_fsfail_task
> |
> |     Now that dounmount() supports a dedicated taskqueue, we can simply call
> |     it with MNT_DEFERRED directly from the failing context.  This also
> |     avoids blocking taskqueue_thread with a potentially-expensive unmount
> |     operation.
> |
> |     Reviewed by:    kib, mckusick
> |     Tested by:      pho
> |     Differential Revision:  https://reviews.freebsd.org/D31016
> |
> 
> Are you on arm64 or ppc64 or some such tier-NOT-1 ? Even my RISC-V stuff
> seems to be working well.
> 

No, I'm using AMD64.

The /boot code which works is from July 9th.  The "bad" code is from
yesterday.  So any change to the loader between those two dates is
suspect.  But there were only a few of those.
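
A minimal sketch of how that narrowing could be done with git, assuming
the loader sources live under stand/ in the src tree and that the
checkout is at /usr/src (the path, the function name loader_commits and
the exact timestamps are illustrative assumptions, not something taken
from this thread):

```sh
#!/bin/sh
# Hypothetical helper: list commits that touched the loader sources
# (assumed to be under stand/) between two timestamps.
#   $1 = path to a src checkout (assumption: e.g. /usr/src)
#   $2 = start of the window, $3 = end of the window
loader_commits() {
    repo=$1
    since=$2
    until=$3
    # --since/--until filter on the commit date; full timestamps are
    # used because a bare date inherits the current time of day.
    git -C "$repo" --no-pager log --oneline \
        --since="$since" --until="$until" -- stand
}

# Example invocation for the window described above:
#   loader_commits /usr/src "2021-07-09 00:00:00" "2021-07-25 23:59:59"
```

Each commit listed this way is only a candidate; it would still have to
be confirmed by rebuilding the loader with and without it.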

I replied to a mail from Warner Losh (imp@) with a suspected commit
which I plan to look into tomorrow UTC time.

-- 
Gary Jennejohn