Re: Tracking down BTX behavior change.

From: Warner Losh <imp_at_bsdimp.com>
Date: Mon, 09 Aug 2021 23:41:56 UTC
On Mon, Aug 9, 2021 at 5:33 PM Ian Lepore <ian@freebsd.org> wrote:

> On Mon, 2021-08-09 at 15:19 -0700, John W wrote:
> > Ah, forgot the bug link:
> >
> > [1]  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257722
> >
> > On 8/9/21, John W <jwdevel@gmail.com> wrote:
> > > I have a problem on my system where 11.3 boots fine, but any
> > > supported
> > > release (11.4, 12.2, 13.0) fail during the BTX stage. There's a bug
> > > report if you're interested[1]
> > >
> > > But my question is: how can I track down differences in BTX code? I
> > > have the 11.3 and 11.4 SVN sources checked out, and so far I am not
> > > able to find any differences.
> > >
> > > So far, I am looking in: src/stand/i386/btx/ and src/usr.sbin/btxld
> > >
>
> Don't focus solely on the stand/i386/btx directory.  All the code under
> src/stand is part of the loader.
>

And BIOS boot on amd64 is a 32-bit i386 binary, so all its source lives
under stand/i386/btx...  But it's not BTX that's dying, per se; rather,
whatever BTX is loading and running is hitting a problem. The EIP is weird
as well: 0x1011f002 is 269,611,010 (about 257 MiB), which is well above
where the ~500k loader should be and well below where I'd expect
the BIOS routines to live...
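
For anyone who wants to check the address arithmetic, here is a minimal
sketch. The 500 KiB figure is only the rough loader size mentioned above,
not a measured value, and the helper name is made up for illustration.

```python
# EIP value quoted above.
eip = 0x1011f002

# Rough loader size from the text (~500k); an assumption, not a measurement.
LOADER_SIZE = 500 * 1024
ONE_MIB = 1024 * 1024


def describe(addr):
    """Return the address as a decimal byte offset and in MiB."""
    return addr, addr / ONE_MIB


decimal, mib = describe(eip)
print(f"{eip:#x} = {decimal:,} bytes = {mib:.1f} MiB")
print("above the ~500k loader region:", eip > LOADER_SIZE)
```

So the faulting address sits around the 257 MiB mark, far beyond anything
a ~500k loader image would occupy.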

Not much was merged between 11.3 and 11.4. Can you describe your system
in terms of CPU, RAM, etc?
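
If you want to see for yourself what differs under stand/ between the two
checkouts, something like the following sketch would list the files that
changed or exist on only one side. The script name and the two directory
arguments are placeholders; point them at wherever your 11.3 and 11.4
sources actually live.

```python
import filecmp
import os
import sys


def changed_files(old_root, new_root):
    """Yield relative paths that differ or exist in only one of the trees."""
    def walk(cmp, rel):
        # diff_files: present in both trees but different.
        # left_only / right_only: present in just one tree.
        for name in cmp.diff_files + cmp.left_only + cmp.right_only:
            yield os.path.join(rel, name)
        # Recurse into directories that exist in both trees.
        for name, sub in cmp.subdirs.items():
            yield from walk(sub, os.path.join(rel, name))

    yield from walk(filecmp.dircmp(old_root, new_root), "")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 standdiff.py <11.3>/stand <11.4>/stand
    for path in sorted(changed_files(sys.argv[1], sys.argv[2])):
        print(path)
```

Run it on the stand directories rather than just stand/i386/btx, since
the loader pulls in code from the whole tree.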

But the interface between the different parts of the boot system didn't
change, so it's safe to use the 11.3 gptboot and pmbr with an 11.4
/boot/loader and kernel.

The bug posted shows what I think is gptboot dying.  At least I think so;
the bug didn't say if it was UFS or ZFS...

My guess is that things are a bit bigger, and an allocation is failing and
the boot code isn't coping as well as it should with a nice error, but it
could always be some other rare edge-case bug that you're running into.

It may also make sense to check out the stable/13 branch to see exactly
where on that branch it fails; that would give some clue about what the
root cause might be.

Warner