Re: Open ZFS vs FreeBSD ZFS boot issues
Date: Thu, 16 May 2024 14:38:08 UTC
On Thu, May 16, 2024 at 8:14 AM mike tancsa <mike@sentex.net> wrote:

> I have a strange edge case I am trying to work around. I have a
> customer's legacy VM which is RELENG_11 on ZFS. There is some
> corruption that won't clear on a bunch of directories, so I want to
> re-create it from backups. I have done this many times in the past but
> this one is giving me grief. Normally I do something like this on my
> backup server (RELENG_13)
>
> truncate -s 100G file.raw
> mdconfig -f file.raw
> gpart create -s gpt md0
> gpart add -t freebsd-boot -s 512k md0
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 md0
> gpart add -t freebsd-swap -s 2G md0
> gpart add -t freebsd-zfs md0
> zpool create -d -f -o altroot=/mnt2 -o feature@lz4_compress=enabled -o
> cachefile=/var/tmp/zpool.cache myZFSPool /dev/md0p3

I'm surprised you don't specifically create compatibility with some older
standard and then maybe add compression. But I'd start there: create one
that doesn't use lz4_compress (it's not read-only compatible, meaning the
old boot loader has to implement it 100% faithfully). You might also look
at disabling one or both of hole_birth and embedded_data as well. Those
aren't 'read-only' compatible either. But all of that is kinda
speculative: I'm not aware of any bugs in this area that were fixed, and
all these options are in the features_for_read list that the boot loader
knows about.

> Then zfs send -r backuppool | zfs recv myZFSPool
>
> I can then export / import the myZFSPool without issue. I can even
> import and examine myZFSPool on the original RELENG_11 VM that is
> currently running. Checksums of all the files under /boot are
> identical. But every time I try to boot it (KVM), it panics early
>
> FreeBSD/x86 ZFS enabled bootstrap loader, Revision 1.1
> (Tues Oct 10:24:17 EDT 2018 user@hostname)
> panic: free: guard2 fail @ 0xbf153040 + 2061 from unknown:0
> --> Press a key on the console to reboot <--

This is a memory corruption bug. You'll need to find what's corrupting
memory and make it stop. I imagine this might be a small incompatibility
with OpenZFS, or just a bug in what OpenZFS is generating on the
RELENG_13 server.

> Through a bunch of pf rdrs and nfs mounts, I was able to do the same
> above steps on the live RELENG_11 image and do the zfs send/recv, and
> the image boots up no problem. Any ideas on how to work around this, or
> what the problem I am running into might be? The issue seems to be that
> I do the zfs recv on a RELENG_13 box. If I do the zfs recv on RELENG_11
> instead, it works, though it takes a LOT longer. zdb differences [1]
> below.
>
> The kernel is r339251 11.2-STABLE. I know this is a crazy old issue,
> but hoping to at least learn something about ZFS as a result of going
> down this rabbit hole. I will, I think, just do the send|recv via
> RELENG_11 to get them up and running. They don't have the $ to get me
> to upgrade it all for them, and this is partly a favor to help them
> limp along a bit more...

What version is the boot loader? There's been like six years of fixes and
churn since the date above. At a minimum, use the latest RELENG_11 loader
if you are still running 11.2-STABLE. Any chance you can use the
stable/13 or stable/14 loaders? 11 is really not supported anymore and
hasn't been for quite some time, so I have no time for it beyond this
quick peek.

Warner
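A minimal sketch of the feature-restricted pool creation suggested above,
assuming the same md0 layout as the quoted commands; the pool name
"testpool" and the verification step are illustrative additions, not from
the thread:

    # -d starts with every feature flag disabled; omitting the
    # feature@lz4_compress=enabled override keeps the three features
    # named above (lz4_compress, hole_birth, embedded_data) from ever
    # becoming active on the new pool
    zpool create -d -f -o altroot=/mnt2 -o cachefile=/var/tmp/zpool.cache \
        testpool /dev/md0p3

    # every feature@ property should report "disabled", so the 2018-era
    # gptzfsboot has nothing extra to implement at read time
    zpool get all testpool | grep feature@

If the subsequent zfs recv then refuses the stream because it requires
one of those features, that by itself narrows down which feature the old
loader is tripping over.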
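Similarly, a hedged sketch of the newer-loader idea: the RELENG_13 backup
server's own /boot already holds stable/13 boot code, so re-stamping the
image from there is cheap to try. The /mnt2 path assumes the pool is
still imported with altroot=/mnt2 and its root dataset mounted:

    # stamp the stable/13 pmbr + gptzfsboot into the freebsd-boot slice
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 md0

    # optional: carry the matching stable/13 second-stage loader into
    # the received /boot, which otherwise still holds the 2018-era 11.2
    # files; newer loaders can generally still boot an older world
    cp /boot/loader /mnt2/boot/loader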
> ---Mike
>
> [1] zdb live pool
>
> ns9zroot:
>     version: 5000
>     name: 'livezroot'
>     state: 0
>     txg: 26872926
>     pool_guid: 15183996218106005646
>     hostid: 2054190969
>     hostname: 'customer-hostname'
>     com.delphix:has_per_vdev_zaps
>     vdev_children: 1
>     vdev_tree:
>         type: 'root'
>         id: 0
>         guid: 15183996218106005646
>         create_txg: 4
>         children[0]:
>             type: 'disk'
>             id: 0
>             guid: 15258031439924457243
>             path: '/dev/vtbd0p3'
>             whole_disk: 1
>             metaslab_array: 256
>             metaslab_shift: 32
>             ashift: 12
>             asize: 580889083904
>             is_log: 0
>             DTL: 865260
>             create_txg: 4
>             com.delphix:vdev_zap_leaf: 129
>             com.delphix:vdev_zap_top: 130
>     features_for_read:
>         com.delphix:hole_birth
>         com.delphix:embedded_data
>
> MOS Configuration:
>     version: 5000
>     name: 'fromBackupPool'
>     state: 0
>     txg: 2838
>     pool_guid: 1150606583960632990
>     hostid: 2054190969
>     hostname: 'customer-hostname'
>     com.delphix:has_per_vdev_zaps
>     vdev_children: 1
>     vdev_tree:
>         type: 'root'
>         id: 0
>         guid: 1150606583960632990
>         create_txg: 4
>         children[0]:
>             type: 'disk'
>             id: 0
>             guid: 4164348845485675975
>             path: '/dev/md0p3'
>             whole_disk: 1
>             metaslab_array: 256
>             metaslab_shift: 29
>             ashift: 12
>             asize: 105221193728
>             is_log: 0
>             create_txg: 4
>             com.delphix:vdev_zap_leaf: 129
>             com.delphix:vdev_zap_top: 130
>     features_for_read:
>         com.delphix:hole_birth
>         com.delphix:embedded_data

Neither of these
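For anyone reproducing the comparison above, a hedged sketch of capturing
the two configs for a diff, using the pool names as they appear in the
(partly anonymized) output; zdb needs -U to locate the pool that was
created with the non-default cachefile:

    zdb -C livezroot > /tmp/live.cfg
    zdb -C -U /var/tmp/zpool.cache fromBackupPool > /tmp/rebuilt.cfg
    diff -u /tmp/live.cfg /tmp/rebuilt.cfg

Of the differences that do show up (metaslab_shift, asize, the DTL entry,
txg and GUIDs), none touch features_for_read, which is identical on both
pools.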