Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Date: Mon, 17 Apr 2023 13:45:01 UTC
José_Pérez <fbl_at_aoek.com> wrote on
Date: Mon, 17 Apr 2023 12:28:40 UTC :

> On 2023-04-17 12:43, Pawel Jakub Dawidek wrote:
> > On 4/17/23 18:15, Pawel Jakub Dawidek wrote:
> >> There were three issues that I know of after the recent OpenZFS merge:
> >>
> >> 1. Data corruption unrelated to block cloning, so it can happen even
> >> with block cloning disabled or not in use. This was the problematic
> >> commit:
> >>
> >> https://github.com/openzfs/zfs/commit/519851122b1703b8445ec17bc89b347cea965bb9
> >>
> >> It was reverted in 63ee747febbf024be0aace61161241b53245449e.
> >>
> >> 2. Data corruption with embedded blocks when block cloning is enabled.
> >> It can happen when compression is enabled and the block contains
> >> between 60 and 112 bytes (this might be hard to determine). Fix exists,
> >> it is merged to OpenZFS already, but isn't in FreeBSD yet.
> >> OpenZFS pull request: https://github.com/openzfs/zfs/pull/14739
> >>
> >> 3. Panic on VERIFY(zil_replaying(zfsvfs->z_log, tx)). This is
> >> triggered when block cloning is enabled, the sync property is set to
> >> disabled and copy_file_range(2) is used. Easy fix exists, it is not
> >> yet merged to OpenZFS and not yet in FreeBSD HEAD.
> >> OpenZFS pull request: https://github.com/openzfs/zfs/pull/14758
> >>
> >> Block cloning was disabled in
> >> 46ac8f2e7d9601311eb9b3cd2fed138ff4a11a66, so 2 and 3 should not occur.
> >
> > As of 068913e4ba3dd9b3067056e832cefc5ed264b5cc all known issues are
> > fixed, as far as I can tell.
> >
> > Block cloning remains disabled for now just to be on the safe side,
> > but can be enabled by setting sysctl vfs.zfs.bclone_enabled to 1.
> >
> > Don't rely on this sysctl as it will be removed in 2-3 weeks.
>
> Hi Pawel,
> thank you for your reply and for the fixes.
>
> I think there is a 4th issue that needs to be addressed: how do we
> recover from the worst-case scenario, which is a machine with a
> kernel > 2a58b312b62f and ZFS root upgraded with block cloning enabled.
>
> In particular, is it safe to turn such a machine on in the first place,
> and what are the risks involved in doing so? Any potential data loss?
>
> Would such a machine be able to fix itself by compiling a kernel, or
> would compilation fail and might data be corrupted in the process?
>
> I have two poudriere builders powered off (I am not alone in this
> situation) and I need to recover them, ideally minimizing data loss. The
> builders are also hosting current and used to build kernels and worlds
> for 13 and current: as of now all my production machines are stuck on
> the 13 they run, I cannot update binaries nor packages and I would like
> to be back online.
>
> Whatever the fixing procedure, it shall be outlined in the UPDATING
> document.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270811 is an example
issue where a FreeBSD powerpc package building server cannot boot, even
after patching it so that it no longer gets a boot-time "panic:
floating-point unavailable trap" (that jhibbits patch is still not
committed):

QUOTE from the description:
. . .
nda1: 953869MB (1953525168 512 byte sectors)
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Mounting from zfs:zroot failed with error 6; retrying for 3 more seconds
Mounting from zfs:zroot failed with error 6.

Loader variables:
  vfs.root.mountfrom=zfs:zroot

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:zroot/ROOT/default
        cd9660:/dev/cd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input

mountroot>

This machine is part of the FreeBSD cluster for building PowerPC
packages, so we can build kernels to test anytime necessary.
END QUOTE

===
Mark Millard
marklmi at yahoo.com