ZFS/bectl use appears to have an example of not waiting for "Root mount waiting for: CAM" (aarch64 example)

From: Mark Millard <marklmi_at_yahoo.com>
Date: Fri, 13 Jan 2023 03:17:54 UTC

The failure:

The failure occurs after activating BE 13S-CA72 (in zopt0 on
nda1p3), whether temporarily, permanently, or by selecting it
via "8" in the boot loader, and then attempting to boot. The
boot finds and uses the kernel okay but the "mount root" stage
gets:

CPU  7: ARM Cortex-A72 r0p3 affinity:  3  1
Trying to mount root from zfs:zopt0/ROOT/13S-CA72 []...
Mounting from zfs:zopt0/ROOT/13S-CA72 failed with error 2: unknown file system.
CPU  8: ARM Cortex-A72 r0p3 affinity:  4  0

right after the "Trying" message. This is long before
the boot sequence later gets to:

Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
nda0 at nvme0 bus 0 scbus4 target 0 lun 1
nda0: <INTEL SSDPE21D960GA E2010480 PHM2911200Z0960CGN>
nda0: Serial Number PHM2911200Z0960CGN
nda0: nvme version 1.0 x4 (max x4) lanes PCIe Gen3 (max Gen3) link
nda0: 915715MB (1875385008 512 byte sectors)
nda1 at nvme1 bus 0 scbus5 target 0 lun 1
nda1: <INTEL SSDPED1D960GAY E2010480 PHMB829600B4960EGN>
nda1: Serial Number PHMB829600B4960EGN
nda1: nvme version 1.0 x4 (max x4) lanes PCIe Gen3 (max Gen3) link
nda1: 915715MB (1875385008 512 byte sectors)

so that nda1p3 is finally available to provide pool
zopt0 . I use:


So it appears that ZFS gives up on the partition way too
early under at least some condition(s).
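
The ordering described above can be checked mechanically against a
saved console log. The following is only a minimal sketch: the
function names and the SAMPLE text are made up for illustration (the
sample lines are copied from the output above, with the intervening
boot messages omitted), and it does nothing more than compare where
the first "Trying to mount root" message and the first
"Root mount waiting for: CAM" message appear.

```python
def first_line_index(lines, needle):
    """Return the index of the first line containing needle, or None."""
    for i, line in enumerate(lines):
        if needle in line:
            return i
    return None


def mount_tried_before_cam_wait(console_text):
    """True if the root mount attempt is logged before the first CAM wait."""
    lines = console_text.splitlines()
    tried = first_line_index(lines, "Trying to mount root from")
    waited = first_line_index(lines, "Root mount waiting for: CAM")
    if tried is None or waited is None:
        return False  # one of the messages is missing: cannot tell
    return tried < waited


# Abridged sample: lines copied from the console output quoted above,
# with the messages between the two excerpts left out.
SAMPLE = """\
CPU  7: ARM Cortex-A72 r0p3 affinity:  3  1
Trying to mount root from zfs:zopt0/ROOT/13S-CA72 []...
Mounting from zfs:zopt0/ROOT/13S-CA72 failed with error 2: unknown file system.
CPU  8: ARM Cortex-A72 r0p3 affinity:  4  0
Root mount waiting for: CAM
nda1 at nvme1 bus 0 scbus5 target 0 lun 1
"""

print(mount_tried_before_cam_wait(SAMPLE))
```

A True result only says that the mount attempt was logged before the
CAM wait messages; it does not by itself explain why the mount failed.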

At the mountroot> prompt all the BE alternatives in zopt0
also fail once the above has happened. An example is:

mountroot> zfs:zopt0/ROOT/main-CA72
Trying to mount root from zfs:zopt0/ROOT/main-CA72 []...
Mounting from zfs:zopt0/ROOT/main-CA72 failed with error 2: unknown file system.

This is despite being able to boot the BE main-CA72 directly.

It looks like, having given up early, ZFS does not get
out of that state for the drive/partition while at the
mountroot> prompt.

For reference, after the failure:

mountroot> ?

List of GEOM managed disk devices:
  gpt/CA72opt0ZFS gpt/CA72opt0SWP gpt/CA72opt0EFI nda1p3 nda1p2 nda1p1 nda1 gpt/RPi3swp3p5 gpt/CA72optM2swp174 gpt/CA72optM2swp32 gpt/CA72optM2swp16 gpt/CA72optM2swp14 ufsid/619582a0ef9c00b3 gpt/CA72optM2ufs gpt/CA72optM2swp10 gpt/CA72optM2efi nda0p8 nda0p7 nda0p6 nda0p5 nda0p4 nda0p3 nda0p2 nda0p1 nda0

Scrubbing the pool zopt0 does not find anything to fix.
However, the pool is not set up for redundancy; it exists
just to allow bectl use.

I'd also used zfs sends to update a (nearly) duplicate
that I keep on a USB3 NVMe drive, well before
discovering the issue existed. That duplicate has no
problems booting its updated 13S-CA72 BE on an RPi4B.

Creating BE 13S-CA72-copy from an older 13S-CA72 snapshot
produced a BE that boots on the example system:

  zopt0/ROOT/13S-CA72-copy                           -      -          736K  2023-01-12 16:20
    zopt0/ROOT/13S-CA72@to-zprpi-2022-11-16-21-05-58 -      -          1.73G 2022-11-16 21:05

13S-CA72-copy's normal kernel (1301509):
stable/13-n252944-e52aaa644ce1-dirty: Mon Nov  7 09:55:56 PST 2022

13S-CA72's normal kernel (1301510):
stable/13-n253355-d30b57252df8-dirty: Sat Jan  7 01:07:12 PST 2023

Copying 13S-CA72-copy's kernel (1301509-based) into
13S-CA72 and attempting to boot with it still gets
the failure in 13S-CA72 .

Copying 13S-CA72's kernel (1301510-based) to
13S-CA72-copy and attempting to boot 13S-CA72-copy
works just fine.

I've no clue why BE 13S-CA72 is "lucky" enough to show
the problem.

General context information:

The bectl context in question is on a HoneyComb (EDK2
UEFI/ACPI style booting).

# bectl list -s
BE/Dataset/Snapshot                                  Active Mountpoint Space Created

  zopt0/ROOT/13S-CA72                                -      -          5.38G 2021-09-29 00:57
    zopt0/ROOT/main-CA72@2021-04-28-01:40:48-0       -      -          3.92G 2021-04-28 01:40
  13S-CA72@to-zprpi-2022-11-16-21-05-58              -      -          1.73G 2022-11-16 21:05
  13S-CA72@to-zprpi-2023-01-08-13-18-20              -      -          0     2023-01-08 13:18
  13S-CA72@to-zprpi-2023-01-10-19-14-05              -      -          0     2023-01-10 19:14

  zopt0/ROOT/13S-CA72-copy                           -      -          736K  2023-01-12 16:20
    zopt0/ROOT/13S-CA72@to-zprpi-2022-11-16-21-05-58 -      -          1.73G 2022-11-16 21:05

  zopt0/ROOT/13_0R-CA72                              -      -          1.80G 2021-09-29 00:45
    zopt0/ROOT/main-CA72@2021-04-28-01:40:48-0       -      -          3.92G 2021-04-28 01:40
  13_0R-CA72@to-zprpi-2022-11-16-21-05-58            -      -          0     2022-11-16 21:05
  13_0R-CA72@to-zprpi-2023-01-08-13-18-20            -      -          0     2023-01-08 13:18
  13_0R-CA72@to-zprpi-2023-01-10-19-14-05            -      -          0     2023-01-10 19:14

  zopt0/ROOT/13_1R-CA72                              -      -          3.52G 2022-03-10 14:24
    zopt0/ROOT/main-CA72@2021-04-28-01:40:48-0       -      -          3.92G 2021-04-28 01:40
  13_1R-CA72@to-zprpi-2022-11-16-21-05-58            -      -          1.61G 2022-11-16 21:05
  13_1R-CA72@to-zprpi-2023-01-08-13-18-20            -      -          0     2023-01-08 13:18
  13_1R-CA72@to-zprpi-2023-01-10-19-14-05            -      -          0     2023-01-10 19:14

  zopt0/ROOT/main-CA72                               NR     /          10.4G 2023-01-06 17:43
  main-CA72@2021-04-28-01:40:48-0                    -      -          3.92G 2021-04-28 01:40
  main-CA72@to-zprpi-2022-11-16-21-05-58             -      -          451M  2022-11-16 21:05
  main-CA72@2023-01-06-17:43:54-0                    -      -          227M  2023-01-06 17:43
  main-CA72@to-zprpi-2023-01-08-13-18-20             -      -          2.68M 2023-01-08 13:18
  main-CA72@to-zprpi-2023-01-10-19-14-05             -      -          696K  2023-01-10 19:14

  zopt0/ROOT/old-main-CA72                           -      -          404K  2022-11-06 20:28
    zopt0/ROOT/main-CA72@2023-01-06-17:43:54-0       -      -          227M  2023-01-06 17:43
  old-main-CA72@to-zprpi-2023-01-08-13-18-20         -      -          0     2023-01-08 13:18
  old-main-CA72@to-zprpi-2023-01-10-19-14-05         -      -          0     2023-01-10 19:14

The boot media here looks like the following, as
seen via "gpart show -pl" :

=>        40  1875384928    nda1  GPT  (894G)
          40      532480  nda1p1  CA72opt0EFI  (260M)
      532520        2008          - free -  (1.0M)
      534528   515899392  nda1p2  CA72opt0SWP  (246G)
   516433920    20971520          - free -  (10G)
   537405440  1337979528  nda1p3  CA72opt0ZFS  (638G)

( nda1 is an Optane 960GB in the PCIe slot in the
HoneyComb. nda1p3 is the partition holding pool
zopt0 .)
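
As a sanity check on the layout shown above, the start and size
columns can be verified to tile the disk with no gaps or overlaps.
This is just an illustrative sketch: the numbers are copied from the
gpart output, the 512-byte sector size is taken from the nda1 probe
messages, and the helper names (is_contiguous, gib) are invented for
the example.

```python
SECTOR_BYTES = 512  # from the "512 byte sectors" probe line for nda1

# (start, size) in sectors, copied from the "gpart show -pl" output
DISK = (40, 1875384928)
ENTRIES = [
    (40, 532480),             # nda1p1  CA72opt0EFI
    (532520, 2008),           # free
    (534528, 515899392),      # nda1p2  CA72opt0SWP
    (516433920, 20971520),    # free
    (537405440, 1337979528),  # nda1p3  CA72opt0ZFS
]


def is_contiguous(disk, entries):
    """True if the entries cover the disk's usable range back to back."""
    position = disk[0]
    for start, size in entries:
        if start != position:
            return False  # gap or overlap before this entry
        position = start + size
    return position == disk[0] + disk[1]


def gib(sectors):
    """Convert a sector count to GiB (2**30 bytes)."""
    return sectors * SECTOR_BYTES / 2**30


print(is_contiguous(DISK, ENTRIES))
print(round(gib(ENTRIES[-1][1])))  # size of nda1p3, rounded to whole GiB
```

The rounded size of nda1p3 should agree with the 638G that gpart
reports for the partition holding zopt0.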

(Note: nda0 is UFS-based boot media that I do not
normally use.)

Mark Millard
marklmi at yahoo.com