How to get past "internal error: cannot import 'zroot': Integrity check failed" (no ability to import the pool)?
Date: Thu, 25 Aug 2022 03:57:00 UTC
I seem to have gotten into a state where no zpool command that requires identifying a pool (such as by name) can work, because import cannot make the pool available. (I give more context later.) How do I get the freebsd-zfs partition back into a form that I can repopulate when its failed pool cannot be imported? I'm apparently limited to zpool commands that reference the device instead of the pool (name), because the likes of "zpool import -f -FX . . ." lead to a panic.

Note that this media used zfs just to be able to use bectl, not for other typical zfs reasons. For example, redundancy was not, and is not, a goal.

For reference:

=>        40  3907029088  da0  GPT  (1.8T)
          40       32728       - free -  (16M)
       32768      524288    1  efi  (256M)
      557056     7340032    2  freebsd-swap  (3.5G)
     7897088    26214400       - free -  (13G)
    34111488    20971520    3  freebsd-swap  (10G)
    55083008    12582912       - free -  (6.0G)
    67665920    29360128    4  freebsd-swap  (14G)
    97026048     4194304       - free -  (2.0G)
   101220352    33554432    5  freebsd-swap  (16G)
   134774784    67108864    6  freebsd-swap  (32G)
   201883648   364904448    7  freebsd-swap  (174G)
   566788096  2795503616    8  freebsd-zfs  (1.3T)
  3362291712   544737416       - free -  (260G)

At this point, attempting to preserve the content of the freebsd-zfs partition does not seem like a viable way to go. But I'm unclear on how to even start over, given no ability to make the pool accessible by name.

The sequence that led to the current state went like . . .

# git -C /usr/ports fetch
error: error reading from .git/objects/pack/pack-8e819c78469vm_fault: pager read error, pid 1370 (git)
c212148fe5d3922cc807e6858768e.pack: Input/output error
vm_fault: pager read error, pid 1370 (git)
. . .
# bectl activate main-CA72
panic: VERIFY3(0 == bpobj_open(&bpo, dl->dl_os, dlce->dlce_bpobj)) failed (0 == 97)

cpuid = 0
time = 1661389515
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x30
         pc = 0xffff0000007fcfd0  lr = 0xffff000000101b80
         sp = 0xffff0000b49f4ee0  fp = 0xffff0000b49f50e0

db_trace_self_wrapper() at vpanic+0x178
         pc = 0xffff000000101b80  lr = 0xffff0000004cef08
         sp = 0xffff0000b49f50f0  fp = 0xffff0000b49f5150

vpanic() at spl_panic+0x40
         pc = 0xffff0000004cef08  lr = 0xffff00000129f360
         sp = 0xffff0000b49f5160  fp = 0xffff0000b49f51f0

spl_panic() at dsl_deadlist_space_range+0x264
         pc = 0xffff00000129f360  lr = 0xffff00000133d4f4
         sp = 0xffff0000b49f5200  fp = 0xffff0000b49f53c0

dsl_deadlist_space_range() at snaplist_space+0x4c
         pc = 0xffff00000133d4f4  lr = 0xffff00000133899c
         sp = 0xffff0000b49f53d0  fp = 0xffff0000b49f5440

snaplist_space() at dsl_dataset_promote_check+0x648
         pc = 0xffff00000133899c  lr = 0xffff0000013385e8
         sp = 0xffff0000b49f5450  fp = 0xffff0000b49f5530

dsl_dataset_promote_check() at dsl_sync_task_sync+0xcc
         pc = 0xffff0000013385e8  lr = 0xffff00000135fb3c
         sp = 0xffff0000b49f5540  fp = 0xffff0000b49f5590

dsl_sync_task_sync() at dsl_pool_sync+0x3cc
         pc = 0xffff00000135fb3c  lr = 0xffff00000135251c
         sp = 0xffff0000b49f55a0  fp = 0xffff0000b49f55e0

dsl_pool_sync() at spa_sync+0x8f8
         pc = 0xffff00000135251c  lr = 0xffff00000138be28
         sp = 0xffff0000b49f55f0  fp = 0xffff0000b49f57f0

spa_sync() at txg_sync_thread+0x1d8
         pc = 0xffff00000138be28  lr = 0xffff0000013a2bf8
         sp = 0xffff0000b49f5800  fp = 0xffff0000b49f58f0

txg_sync_thread() at fork_exit+0x88
         pc = 0xffff0000013a2bf8  lr = 0xffff00000047c568
         sp = 0xffff0000b49f5900  fp = 0xffff0000b49f5950

fork_exit() at fork_trampoline+0x14
         pc = 0xffff00000047c568  lr = 0xffff00000081edd4
         sp = 0xffff0000b49f5960  fp = 0x0000000000000000

KDB: enter: panic
[ thread pid 4 tid 100198 ]
Stopped at      kdb_enter+0x48: undefined       f907011f

I was unable to boot from the media after this. Plugged the media into another machine . . .
# zpool import -F -n zroot
cannot import 'zroot': pool was previously in use from another system.
Last accessed by <unknown> (hostid=0) at Wed Aug 24 18:05:15 2022
The pool can be imported, use 'zpool import -f' to import the pool.

# zpool import -f zroot
Aug 24 18:13:47 CA72_16Gp_ZFS ZFS[1588]: failed to load zpool zroot
Aug 24 18:13:47 CA72_16Gp_ZFS ZFS[1612]: failed to load zpool zroot
Aug 24 18:13:47 CA72_16Gp_ZFS ZFS[1616]: failed to load zpool zroot
internal error: cannot import 'zroot': Integrity check failed
Abort trap (core dumped)

# gdb zpool zpool.core
. . .
Core was generated by `zpool import -f zroot'.
Program terminated with signal SIGABRT, Aborted.
Sent by thr_kill() from pid 1716 and user 0.
#0  thr_kill () at thr_kill.S:4
4       RSYSCALL(thr_kill)
(gdb) bt
#0  thr_kill () at thr_kill.S:4
#1  0x00002b0c6f3794f0 in __raise (s=s@entry=6) at /usr/main-src/lib/libc/gen/raise.c:52
#2  0x00002b0c6f420494 in abort () at /usr/main-src/lib/libc/stdlib/abort.c:67
#3  0x00002b0c69415744 in zfs_verror (hdl=0x2b0c76263000, error=2092, fmt=fmt@entry=0x2b0c693d3135 "%s", ap=...)
    at /usr/main-src/sys/contrib/openzfs/lib/libzfs/libzfs_util.c:344
#4  0x00002b0c69416324 in zpool_standard_error_fmt (hdl=hdl@entry=0x2b0c76263000, error=error@entry=97, fmt=0x2b0c693d3135 "%s")
    at /usr/main-src/sys/contrib/openzfs/lib/libzfs/libzfs_util.c:729
#5  0x00002b0c69415ec8 in zpool_standard_error (hdl=0x0, hdl@entry=0x2b0c76263000, error=0, error@entry=97, msg=0x2b0c6ea23350 <__thr_sigprocmask> "\377\203", msg@entry=0x2b0c665668e8 "cannot import 'zroot'")
    at /usr/main-src/sys/contrib/openzfs/lib/libzfs/libzfs_util.c:619
#6  0x00002b0c6940687c in zpool_import_props (hdl=0x2b0c76263000, config=config@entry=0x2b0c95939080, newname=newname@entry=0x0, props=props@entry=0x0, flags=flags@entry=2)
    at /usr/main-src/sys/contrib/openzfs/lib/libzfs/libzfs_pool.c:2193
#7  0x00002b0be60f3344 in do_import (config=0x2b0c95939080, newname=0x0, mntopts=0x0, props=props@entry=0x0, flags=flags@entry=2)
    at /usr/main-src/sys/contrib/openzfs/cmd/zpool/zpool_main.c:3190
#8  0x00002b0be60f3108 in import_pools (pools=pools@entry=0x2b0c762780e0, props=<optimized out>, mntopts=mntopts@entry=0x0, flags=flags@entry=2, orig_name=0x2b0c7622d028 "zroot", new_name=0x0, do_destroyed=do_destroyed@entry=B_FALSE, pool_specified=pool_specified@entry=B_TRUE, do_all=B_FALSE, import=0x2b0c665684a0)
    at /usr/main-src/sys/contrib/openzfs/cmd/zpool/zpool_main.c:3318
#9  0x00002b0be60e9074 in zpool_do_import (argc=1, argv=<optimized out>)
    at /usr/main-src/sys/contrib/openzfs/cmd/zpool/zpool_main.c:3804
#10 0x00002b0be60e3ce8 in main (argc=4, argv=<optimized out>)
    at /usr/main-src/sys/contrib/openzfs/cmd/zpool/zpool_main.c:10918
(gdb) quit

# zpool import -f -FX -N -R /zroot-mnt -t zroot zprpi
. . . eventually . . .
panic: Solaris(panic): zfs: adding existent segment to range tree (offset=7a001ba000 size=40000)
cpuid = 8
time = 1661395806
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
vcmn_err() at vcmn_err+0x10c
zfs_panic_recover() at zfs_panic_recover+0x64
range_tree_add_impl() at range_tree_add_impl+0x184
range_tree_walk() at range_tree_walk+0xa4
metaslab_load() at metaslab_load+0x6a4
metaslab_preload() at metaslab_preload+0x8c
taskq_run() at taskq_run+0x1c
taskqueue_run_locked() at taskqueue_run_locked+0x190
taskqueue_thread_loop() at taskqueue_thread_loop+0x130
fork_exit() at fork_exit+0x88
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 6 tid 108968 ]
Stopped at      kdb_enter+0x44: undefined       f907c27f
db>

I had to unplug the disk to avoid reboots simply retrying the import and crashing the same way again.

SIDE NOTE: Then, on reboot, I saw the following:

. . .
Setting hostid: 0x6522bfc4.
cannot import 'zroot': no such pool or dataset
        Destroy and re-create the pool from
        a backup pid 49 (zpool) is attempting to use unsafe AIO requests - not logging anymore
pid 49 (zpool), jid 0, uid 0: exited on signal 6
source.
cachefile import failed, retrying
nvpair_value_nvlist(nvp, &rv) == 0 (0x16 == 0)
ASSERT at /usr/main-src/sys/contrib/openzfs/module/nvpair/fnvpair.c:592:fnvpair_value_nvlist()Abort trap
Import of zpool cache /etc/zfs/zpool.cache failed, will retry after root mount hold release
cannot import 'zroot': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.
cachefile imporpid 55 (zpool), jid 0, uid 0: exited on signal 6
t failed, retrying
nvpair_value_nvlist(nvp, &rv) == 0 (0x16 == 0)
ASSERT at /usr/main-src/sys/contrib/openzfs/module/nvpair/fnvpair.c:592:fnvpair_value_nvlist()Abort trap
Starting file system checks:
/dev/gpt/CA72opt0EFI: 281 files, 231 MiB free (14770 clusters)
FIXED
. . .
Removing /etc/zfs/zpool.cache allowed reboots to avoid that failed import retry. END SIDE NOTE.

For reference, for the machine where I can plug in the media:

# uname -apKU # line split for better readability
FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #59 main-n256584-5bc926af9fd1-dirty: Wed Jul  6 18:10:52 PDT 2022
root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72
arm64 aarch64 1400063 1400063

===
Mark Millard
marklmi at yahoo.com