Re: Mountroot problems on RPi3/aarch64

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sun, 26 Jun 2022 02:22:26 UTC
On 2022-Jun-25, at 14:51, bob prohaska <fbsd@www.zefox.net> wrote:

> On Thu, Jun 23, 2022 at 06:43:24PM -0700, Mark Millard wrote:
>> There is another checkin to main for superblock handling:
>> 
>> QUOTE
>> The branch main has been updated by mckusick:
>> 
>> URL: https://cgit.FreeBSD.org/src/commit/?id=50dc4c7df4156863148e6a9609c03e852e2aeb35
>> 
> 
> Here's the tail of the boot transcript:
> 
> Root mount waiting for: CAM
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <ASMT ASM105x 0> Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number 12345678D558
> da0: 40.000MB/s transfers
> da0: 953869MB (1953525168 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)

I'm not UFS/FFS implementation literate but I can show what I
see when I look up some of the related source code. Looking up
cgdmin leads to sys/ufs/ffs/fs.h material:

/*
 * Cylinder group macros to locate things in cylinder groups.
 * They calc filesystem addresses of cylinder group data structures.
 */
#define cgbase(fs, c)   (((ufs2_daddr_t)(fs)->fs_fpg) * (c))
. . .
#define cgdmin(fs, c)   (cgstart(fs, c) + (fs)->fs_dblkno)      /* 1st data */
. . .
#define cgstart(fs, c)                                                  \
       ((fs)->fs_magic == FS_UFS2_MAGIC ? cgbase(fs, c) :               \
       (cgbase(fs, c) + (fs)->fs_old_cgoffset * ((c) & ~((fs)->fs_old_cgmask))))

From the cgdmin(fs, 0) I learn that the cgbase(fs, 0)
involved is 0 (i.e., zero). From that, looking at
what cgstart(fs, 0) would be leads to concluding that:

(fs)->fs_old_cgoffset * ((c) & ~((fs)->fs_old_cgmask))

is in use and ends up being 5056. From that I see that:

(fs)->fs_magic == FS_UFS2_MAGIC

is false.

But the messages produced via:

	CHK(fs->fs_csaddr, !=, cgdmin(fs, 0), %jd);

implies that the code did validate the (fs)->fs_magic
value and it passed. For reference:

# grep MAGIC /usr/main-src/sys/ufs/ffs/fs.h
#define	FS_UFS1_MAGIC	0x011954	/* UFS1 fast filesystem magic number */
#define	FS_UFS2_MAGIC	0x19540119	/* UFS2 fast filesystem magic number */
#define	FS_BAD_MAGIC	0x19960408	/* UFS incomplete newfs magic number */
#define	CG_MAGIC	0x090255
#define	cg_chkmagic(cgp) ((cgp)->cg_magic == CG_MAGIC)
       ((fs)->fs_magic == FS_UFS2_MAGIC ? cgbase(fs, c) :		\

From the code structure and messaging I infer that:

(fs)->fs_magic == FS_UFS1_MAGIC

or the code would have done:

	} else {
		/* Bad magic number, so assume not a superblock */
		return (ENOENT);
	}

without producing the messaging.

Why it would be a UFS1 file system that was created originally,
I do not know.

> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> Mounting from ufs:/dev/da0s2a failed with error 22; retrying for 3 more seconds
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> Mounting from ufs:/dev/da0s2a failed with error 22: Invalid fstype.
> 
> Loader variables:
>  vfs.root.mountfrom=ufs:/dev/da0s2a
>  vfs.root.mountfrom.options=rw
> 
> Manual root filesystem specification:
>  <fstype>:<device> [options]
>      Mount <device> using filesystem <fstype>
>      and with the specified (optional) option list.
> 
>    eg. ufs:/dev/da0s1a
>        zfs:zroot/ROOT/default
>        cd9660:/dev/cd0 ro
>          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)
> 
>  ?               List valid disk boot devices
>  .               Yield 1 second (for background tasks)
>  <empty line>    Abort manual input
> 
> mountroot> 
> 
> Rebooting using a kernel of:
> 
> FreeBSD 14.0-CURRENT #74 main-n255816-e26ef41f799: Wed May 25 15:05:14 PDT 2022
>    bob@www.zefox.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
> 
> stops in single user with:
> Root mount waiting for: CAM
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <ASMT ASM105x 0> Fixed Direct Access SPC-4 SCSI device
> da0: Serial Number 12345678D558
> da0: 40.000MB/s transfers
> da0: 953869MB (1953525168 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
> Warning: no time-of-day clock registered, system time will not be set accurately
> Dual Console: Serial Primary, Video Secondary
> Setting hostuuid: 30303030-3030-3030-3064-626136386435.
> Setting hostid: 0x5cd40a6a.
> Starting file system checks:
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> Cannot find file system superblock
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> Cannot find file system superblock
> Warning! Some of the devices might not be available; retrying
> Restarting file system checks:
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> Cannot find file system superblock
> UFS2 superblock failed: fs->fs_csaddr (806456) != cgdmin(fs, 0) (5056)
> Cannot find file system superblock
> Unknown error 3; help!
> ERROR: ABORTING BOOT (sending SIGTERM to parent)!
> 2022-06-25T14:23:46.792050-07:00 - init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
> Enter full pathname of shell or RETURN for /bin/sh: 
> root@:/ # 
> 
> However, simply exiting the single-user shell seems to bring up
> normal multi-user operation. 
> 
> Network connectivity remains sporadic, but is much helped by an outgoing
> ping process. 
> 
> Could it be significant that this filesystem was created on June 4, 2020?

June 4 is after:

2022-05-27: Do comprehensive UFS/FFS superblock integrity checks when reading a superblock.
2022-06-01: Two bug fixes to UFS/FFS superblock integrity checks when reading a superblock.

but before:

2022-06-11: Bug fix to UFS/FFS superblock integrity checks when reading a superblock.

(and before the later additions of messages about superblock failure details).

But I can not tell what the status of the system was that created
the apparently problematical file system. It could be much older
for all I know.

None of these messages suggest code changes to creating UFS file
systems, just to validation.

I will note that the 2022-05-27 check-in does report:

QUOTE
. . . Although
it appears in only one place, the new code will apply to the kernel
modules and (through libufs) user applications that read in superblocks.
END QUOTE

This gets into why the older kernel behaves differently when used
with the newer world and why there are messages from the newer
world code even with the older kernel.


Of course, none of these notes provide any solutions, just
background information.


===
Mark Millard
marklmi at yahoo.com