Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics
- Reply: Glen Barber : "Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics"
- In reply to: Mark Millard : "main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics"
Date: Thu, 07 Sep 2023 18:17:22 UTC
[Drat, the request to rerun my tests did not mention the more recent
change "vfs: copy_file_range() between multiple mountpoints of the
same fs type", I had not noticed it on my own, and I ran the test
without updating.]

On Sep 7, 2023, at 11:02, Mark Millard <marklmi@yahoo.com> wrote:

> I was requested to do a test with vfs.zfs.bclone_enabled=1 and
> the bulk -a build panicked (having stored 128 *.pkg files in
> .building/ first):

Unfortunately, rerunning my tests with that setting was testing a
context predating:

Wed, 06 Sep 2023
. . .
git: 969071be938c - main - vfs: copy_file_range() between multiple mountpoints of the same fs type
Martin Matuska

So the information might be out of date for main and for stable/14:
I have no clue how good of a test it was. Maybe some of those I've
cc'd would know.

When I next have time, should I retry based on a more recent vintage
of main that includes 969071be938c?

> # more /var/crash/core.txt.3
> . . .
> Unread portion of the kernel message buffer:
> panic: Solaris(panic): zfs: accessing past end of object 422/1108c16 (size=2560 access=2560+2560)
> cpuid = 15
> time = 1694103674
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0352758590
> vpanic() at vpanic+0x132/frame 0xfffffe03527586c0
> panic() at panic+0x43/frame 0xfffffe0352758720
> vcmn_err() at vcmn_err+0xeb/frame 0xfffffe0352758850
> zfs_panic_recover() at zfs_panic_recover+0x59/frame 0xfffffe03527588b0
> dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x97/frame 0xfffffe0352758960
> dmu_brt_clone() at dmu_brt_clone+0x61/frame 0xfffffe03527589f0
> zfs_clone_range() at zfs_clone_range+0xa6a/frame 0xfffffe0352758bc0
> zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x1ae/frame 0xfffffe0352758c40
> vn_copy_file_range() at vn_copy_file_range+0x11e/frame 0xfffffe0352758ce0
> kern_copy_file_range() at kern_copy_file_range+0x338/frame 0xfffffe0352758db0
> sys_copy_file_range() at sys_copy_file_range+0x78/frame 0xfffffe0352758e00
> amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe0352758f30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0352758f30
> --- syscall (569, FreeBSD ELF64, copy_file_range), rip = 0x1ce4506d155a, rsp = 0x1ce44ec71e88, rbp = 0x1ce44ec72320 ---
> KDB: enter: panic
>
> __curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
> 57      __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> (kgdb) #0  __curthread () at /usr/main-src/sys/amd64/include/pcpu_aux.h:57
> #1  doadump (textdump=textdump@entry=0) at /usr/main-src/sys/kern/kern_shutdown.c:405
> #2  0xffffffff804a442a in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /usr/main-src/sys/ddb/db_command.c:591
> #3  0xffffffff804a422d in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=true) at /usr/main-src/sys/ddb/db_command.c:504
> #4  0xffffffff804a3eed in db_command_loop () at /usr/main-src/sys/ddb/db_command.c:551
> #5  0xffffffff804a7876 in db_trap (type=<optimized out>, code=<optimized out>) at /usr/main-src/sys/ddb/db_main.c:268
> #6  0xffffffff80bb9e57 in kdb_trap (type=type@entry=3, code=code@entry=0, tf=tf@entry=0xfffffe03527584d0) at /usr/main-src/sys/kern/subr_kdb.c:790
> #7  0xffffffff8104ad3d in trap (frame=0xfffffe03527584d0) at /usr/main-src/sys/amd64/amd64/trap.c:608
> #8  <signal handler called>
> #9  kdb_enter (why=<optimized out>, msg=<optimized out>) at /usr/main-src/sys/kern/subr_kdb.c:556
> #10 0xffffffff80b6aab3 in vpanic (fmt=0xffffffff82be52d6 "%s%s", ap=ap@entry=0xfffffe0352758700) at /usr/main-src/sys/kern/kern_shutdown.c:958
> #11 0xffffffff80b6a943 in panic (fmt=0xffffffff820aa2e8 <vt_conswindow+16> "\312C$\201\377\377\377\377") at /usr/main-src/sys/kern/kern_shutdown.c:894
> #12 0xffffffff82993c5b in vcmn_err (ce=<optimized out>, fmt=0xffffffff82bfdd1f "zfs: accessing past end of object %llx/%llx (size=%u access=%llu+%llu)", adx=0xfffffe0352758890) at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/spl/spl_cmn_err.c:60
> #13 0xffffffff82a84d69 in zfs_panic_recover (fmt=0x12 <error: Cannot access memory at address 0x12>) at /usr/main-src/sys/contrib/openzfs/module/zfs/spa_misc.c:1594
> #14 0xffffffff829f8e27 in dmu_buf_hold_array_by_dnode (dn=0xfffff813dfc48978, offset=offset@entry=2560, length=length@entry=2560, read=read@entry=0, tag=0xffffffff82bd8175, numbufsp=numbufsp@entry=0xfffffe03527589bc, dbpp=0xfffffe03527589c0, flags=0) at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:543
> #15 0xffffffff829fc6a1 in dmu_buf_hold_array (os=<optimized out>, object=<optimized out>, read=0, numbufsp=0xfffffe03527589bc, dbpp=0xfffffe03527589c0, offset=<optimized out>, length=<optimized out>, tag=<optimized out>) at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:654
> #16 dmu_brt_clone (os=os@entry=0xfffff8010ae0e000, object=<optimized out>, offset=offset@entry=2560, length=length@entry=2560, tx=tx@entry=0xfffff81aaeb6e100, bps=bps@entry=0xfffffe0595931000, nbps=1, replay=0) at /usr/main-src/sys/contrib/openzfs/module/zfs/dmu.c:2301
> #17 0xffffffff82b4440a in zfs_clone_range (inzp=0xfffff8100054c910, inoffp=0xfffff81910c3c7c8, outzp=0xfffff80fb3233000, outoffp=0xfffff819860a2c78, lenp=lenp@entry=0xfffffe0352758c00, cr=0xfffff80e32335200) at /usr/main-src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1302
> #18 0xffffffff829b4ece in zfs_freebsd_copy_file_range (ap=0xfffffe0352758c58) at /usr/main-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6294
> #19 0xffffffff80c7160e in VOP_COPY_FILE_RANGE (invp=<optimized out>, inoffp=0x40, outvp=0xfffffe03527581d0, outoffp=0xffffffff811e6eb7, lenp=0x0, flags=0, incred=0xfffff80e32335200, outcred=0x0, fsizetd=0xfffffe03586c0720) at ./vnode_if.h:2381
> #20 vn_copy_file_range (invp=invp@entry=0xfffff8095e1e8000, inoffp=0x40, inoffp@entry=0xfffff81910c3c7c8, outvp=0xfffffe03527581d0, outvp@entry=0xfffff805d6107380, outoffp=0xffffffff811e6eb7, outoffp@entry=0xfffff819860a2c78, lenp=0x0, lenp@entry=0xfffffe0352758d50, flags=flags@entry=0, incred=0xfffff80e32335200, outcred=0xfffff80e32335200, fsize_td=0xfffffe03586c0720) at /usr/main-src/sys/kern/vfs_vnops.c:3085
> #21 0xffffffff80c6b998 in kern_copy_file_range (td=td@entry=0xfffffe03586c0720, infd=<optimized out>, inoffp=0xfffff81910c3c7c8, inoffp@entry=0x0, outfd=<optimized out>, outoffp=0xfffff819860a2c78, outoffp@entry=0x0, len=9223372036854775807, flags=0) at /usr/main-src/sys/kern/vfs_syscalls.c:4971
> #22 0xffffffff80c6bab8 in sys_copy_file_range (td=0xfffffe03586c0720, uap=0xfffffe03586c0b20) at /usr/main-src/sys/kern/vfs_syscalls.c:5009
> #23 0xffffffff8104bab9 in syscallenter (td=0xfffffe03586c0720) at /usr/main-src/sys/amd64/amd64/../../kern/subr_syscall.c:187
> #24 amd64_syscall (td=0xfffffe03586c0720, traced=0) at /usr/main-src/sys/amd64/amd64/trap.c:1197
> #25 <signal handler called>
> #26 0x00001ce4506d155a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x1ce44ec71e88
> (kgdb)
>
>
> Context details follow.
>
> Absent an openzfs-2.2 in:
>
> ls -C1 /usr/share/zfs/compatibility.d/openzfs-2.*
> /usr/share/zfs/compatibility.d/openzfs-2.0-freebsd
> /usr/share/zfs/compatibility.d/openzfs-2.0-linux
> /usr/share/zfs/compatibility.d/openzfs-2.1-freebsd
> /usr/share/zfs/compatibility.d/openzfs-2.1-linux
>
> I have copied:
>
> /usr/main-src/sys/contrib/openzfs/cmd/zpool/compatibility.d/openzfs-2.2
>
> over to:
>
> # ls -C1 /etc/zfs/compatibility.d/*
> /etc/zfs/compatibility.d/openzfs-2.2
>
> and used it:
>
> # zpool get compatibility zamd64
> NAME    PROPERTY       VALUE        SOURCE
> zamd64  compatibility  openzfs-2.2  local
>
> For reference:
>
> # zpool upgrade
> This system supports ZFS pool feature flags.
>
> All pools are formatted using feature flags.
>
> Some supported features are not enabled on the following pools. Once a
> feature is enabled the pool may become incompatible with software
> that does not support the feature. See zpool-features(7) for details.
>
> Note that the pool 'compatibility' feature can be used to inhibit
> feature upgrades.
>
> POOL  FEATURE
> ---------------
> zamd64
>       redaction_list_spill
>
> which agrees with openzfs-2.2.
>
> I did:
>
> # sysctl vfs.zfs.bclone_enabled=1
> vfs.zfs.bclone_enabled: 0 -> 1
>
> I also made a snapshot: zamd64@before-bclone-test and
> I then made a checkpoint. These were established just
> after the above enable.
>
> I then did a: zpool trim -w zamd64
>
> The poudriere bulk command was: poudriere bulk -jmain-amd64-bulk_a -a
> where main-amd64-bulk_a has nothing prebuilt. USE_TMPFS=no
> is in use. No form of ALLOW_MAKE_JOBS is in use. It is a
> 32 builder context (32 hardware threads).
>
> For reference:
>
> # uname -apKU
> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #118 main-n265152-f49d6f583e9d-dirty: Mon Sep 4 14:26:56 PDT 2023 root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500000 1500000
>
> I'll note that with openzfs-2.1-freebsd compatibility I'd
> previously let such a bulk -a run for about 10 hr and it
> had reached 6366 port->package builds.
>
> Prior to that I'd done shorter experiments with default
> zpool features (no explicit compatibility constraint)
> but vfs.zfs.bclone_enabled=0 and I'd had no problems.
>
> (I have a separate M.2 boot media just for such experiments
> and can reconstruct its content at will.)
>
> All these have been based on the same personal
> main-n265152-f49d6f583e9d-dirty system build. Unfortunately,
> no appropriate snapshot of main was available to avoid my
> personal context being involved for the system build used.
> Similarly, the snapshot(s) of stable/14 predate:
>
> Sun, 03 Sep 2023
> . . .
> git: f789381671a3 - stable/14 - zfs: merge openzfs/zfs@32949f256 (zfs-2.2-release) into stable/14
>
> that has required fixes for other issues.

===
Mark Millard
marklmi at yahoo.com
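For anyone unfamiliar with the syscall shown in the backtrace above: the
builders' file copies (e.g., via cp(1)) end up in sys_copy_file_range(),
and with vfs.zfs.bclone_enabled=1 on a pool whose block_cloning feature
is active, ZFS services that request through zfs_clone_range() and
dmu_brt_clone(), as frames #16 through #22 show. What follows is only a
minimal userland sketch of such a caller, not code from the report; the
file names are hypothetical. With vfs.zfs.bclone_enabled=0 the same call
is expected to fall back to the ordinary copy path, which matches the
earlier runs that showed no problems.

/*
 * Minimal sketch of a copy_file_range(2) caller that reaches the
 * ZFS block-clone path when vfs.zfs.bclone_enabled=1 and the pool's
 * block_cloning feature is active. "in.pkg" and "out.pkg" are
 * hypothetical file names.
 */
#include <sys/types.h>

#include <err.h>
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>

int
main(void)
{
	int infd, outfd;
	ssize_t copied;

	infd = open("in.pkg", O_RDONLY);
	if (infd == -1)
		err(1, "open in.pkg");
	outfd = open("out.pkg", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (outfd == -1)
		err(1, "open out.pkg");

	/*
	 * NULL offset pointers mean the descriptors' own offsets are
	 * used and advanced.  Passing SSIZE_MAX asks for "the rest of
	 * the file", matching len=9223372036854775807 seen in the
	 * kern_copy_file_range() frame of the backtrace.
	 */
	do {
		copied = copy_file_range(infd, NULL, outfd, NULL,
		    SSIZE_MAX, 0);
	} while (copied > 0);
	if (copied == -1)
		err(1, "copy_file_range");

	close(infd);
	close(outfd);
	return (0);
}

Built with just "cc -o cfr cfr.c", repeatedly copying *.pkg files this
way on an affected pool would, as far as I can tell, exercise the same
code path the bulk build hit, though I have not tried to reduce the
panic to such a standalone reproducer.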