Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Mon, 03 Apr 2023 23:54:33 UTC
On Mon, Apr 3, 2023 at 4:38 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
>
> In message <20230403231444.CF48911F@slippy.cwsent.com>, Cy Schubert writes:
> > In message <202304031513.333FD6qw014903@gitrepo.freebsd.org>, Martin
> > Matuska writes:
> > > The branch main has been updated by mm:
> > >
> > > URL: https://cgit.FreeBSD.org/src/commit/?id=2a58b312b62f908ec92311d1bd8536dbaeb8e55b
> > >
> > > commit 2a58b312b62f908ec92311d1bd8536dbaeb8e55b
> > > Merge: b98fbf3781df 431083f75bdd
> > > Author:     Martin Matuska <mm@FreeBSD.org>
> > > AuthorDate: 2023-04-03 14:49:30 +0000
> > > Commit:     Martin Matuska <mm@FreeBSD.org>
> > > CommitDate: 2023-04-03 14:49:30 +0000
> > >
> > >     zfs: merge openzfs/zfs@431083f75
> > >
> > >     Notable upstream pull request merges:
> > >       #12194 Fix short-lived txg caused by autotrim
> > >       #13368 ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()
> > >       #13392 Implementation of block cloning for ZFS
> > >       #13741 SHA2 reworking and API for iterating over multiple implementations
> > >       #14282 Sync thread should avoid holding the spa config write lock
> > >              when possible
> > >       #14283 txg_sync should handle write errors in ZIL
> > >       #14359 More adaptive ARC eviction
> > >       #14469 Fix NULL pointer dereference in zio_ready()
> > >       #14479 zfs redact fails when dnodesize=auto
> > >       #14496 improve error message of zfs redact
> > >       #14500 Skip memory allocation when compressing holes
> > >       #14501 FreeBSD: don't verify recycled vnode for zfs control directory
> > >       #14502 partially revert PR 14304 (eee9362a7)
> > >       #14509 Fix per-jail zfs.mount_snapshot setting
> > >       #14514 Fix data race between zil_commit() and zil_suspend()
> > >       #14516 System-wide speculative prefetch limit
> > >       #14517 Use rw_tryupgrade() in dmu_bonus_hold_by_dnode()
> > >       #14519 Do not hold spa_config in ZIL while blocked on IO
> > >       #14523 Move dmu_buf_rele() after dsl_dataset_sync_done()
> > >       #14524 Ignore too large stack in case of dsl_deadlist_merge
> > >       #14526 Use .section .rodata instead of .rodata on FreeBSD
> > >       #14528 ICP: AES-GCM: Refactor gcm_clear_ctx()
> > >       #14529 ICP: AES-GCM: Unify gcm_init_ctx() and gmac_init_ctx()
> > >       #14532 Handle unexpected errors in zil_lwb_commit() without ASSERT()
> > >       #14544 icp: Prevent compilers from optimizing away memset()
> > >              in gcm_clear_ctx()
> > >       #14546 Revert zfeature_active() to static
> > >       #14556 Remove bad kmem_free() oversight from previous zfsdev_state_list
> > >              patch
> > >       #14563 Optimize the is_l2cacheable functions
> > >       #14565 FreeBSD: zfs_znode_alloc: lock the vnode earlier
> > >       #14566 FreeBSD: fix false assert in cache_vop_rmdir when replaying ZIL
> > >       #14567 spl: Add cmn_err_once() to log a message only on the first call
> > >       #14568 Fix incremental receive silently failing for recursive sends
> > >       #14569 Restore ASMABI and other Unify work
> > >       #14576 Fix detection of IBM Power8 machines (ISA 2.07)
> > >       #14577 Better handling for future crypto parameters
> > >       #14600 zcommon: Refactor FPU state handling in fletcher4
> > >       #14603 Fix prefetching of indirect blocks while destroying
> > >       #14633 Fixes in persistent error log
> > >       #14639 FreeBSD: Remove extra arc_reduce_target_size() call
> > >       #14641 Additional limits on hole reporting
> > >       #14649 Drop lying to the compiler in the fletcher4 code
> > >       #14652 panic loop when removing slog device
> > >       #14653 Update vdev state for spare vdev
> > >       #14655 Fix cloning into already dirty dbufs
> > >       #14678 Revert "Do not hold spa_config in ZIL while blocked on IO"
> > >
> > >     Obtained from:  OpenZFS
> > >     OpenZFS commit: 431083f75bdd3efaee992bdd672625ec7240d252
> >
> > Just a heads up, I'm encountering the following error with an NFS share of
> > a ZFS dataset.
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > fault virtual address = 0x178
> > fault code            = supervisor read data, page not present
> > instruction pointer   = 0x20:0xffffffff814eebcd
> > stack pointer         = 0x28:0xfffffe00ec6c7cd0
> > frame pointer         = 0x28:0xfffffe00ec6c7d50
> > code segment          = base 0x0, limit 0xfffff, type 0x1b
> >                       = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags      = interrupt enabled, resume, IOPL = 0
> > current process               = 3735 (nfsd: master)
> > rdi: fffff8020a6f8570 rsi: fffffe00ec6c80d8 rdx: fffff8020f56f3a0
> > rcx: fffffe00ec6c80e0  r8:                0  r9:          1000000
> > rax:                0 rbx: fffff80210123540 rbp: fffffe00ec6c7d50
> > r10:             1876 r11: ffffffff81714596 r12: fffffe00ec6c7d20
> > r13:                0 r14: fffff8020b2f3e00 r15: fffffe00ec6c7d68
> > trap number           = 12
> > panic: page fault
> > cpuid = 1
> > time = 1680563351
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> > 0xfffffe00ec6c7a90
> > vpanic() at vpanic+0x152/frame 0xfffffe00ec6c7ae0
> > panic() at panic+0x43/frame 0xfffffe00ec6c7b40
> > trap_fatal() at trap_fatal+0x409/frame 0xfffffe00ec6c7ba0
> > trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00ec6c7c00
> > calltrap() at calltrap+0x8/frame 0xfffffe00ec6c7c00
> > --- trap 0xc, rip = 0xffffffff814eebcd, rsp = 0xfffffe00ec6c7cd0, rbp =
> > 0xfffffe00ec6c7d50 ---
> > zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x1bd/frame
> > 0xfffffe00ec6c7d50
Can you find out the source line number for the above?
(I'll admit I don't know how to do that for loadable modules.)

I know nothing about ZFS, but I might be able to spot why
VOP_COPY_FILE_RANGE() is crashing.
It looks like a NULL pointer dereference into a fairly large structure
(the fault address is 0x178).
(I don't think zfs_freebsd_copy_file_range() exists in older ZFS versions?)
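
To illustrate the NULL pointer guess: a fault address of 0x178 is what you
would expect if a structure pointer were NULL and the code read a member
0x178 bytes into it. A minimal sketch in C, where struct big_example and
field_address() are made-up names for illustration only and not anything
taken from the ZFS sources:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical structure, NOT the real ZFS one: padded so that "field"
 * sits at offset 0x178, matching the fault virtual address in the trap.
 */
struct big_example {
	char	pad[0x178];
	void	*field;
};

/* Compile-time check that the member really lands at offset 0x178. */
static_assert(offsetof(struct big_example, field) == 0x178,
    "field is expected at offset 0x178");

/*
 * Address that a load of p->field would touch for a structure located
 * at "base", computed arithmetically so nothing is dereferenced.
 */
static unsigned long
field_address(unsigned long base)
{
	return (base + offsetof(struct big_example, field));
}
```

With a NULL base, field_address(0) evaluates to 0x178, so the real
question is which pointer in the copy path was NULL and which member
sits at that offset.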

rick

> > vn_copy_file_range() at vn_copy_file_range+0x11f/frame 0xfffffe00ec6c7df0
> > nfsrvd_copy_file_range() at nfsrvd_copy_file_range+0x7d1/frame
> > 0xfffffe00ec6c81c0
> > nfsrvd_dorpc() at nfsrvd_dorpc+0x17b5/frame 0xfffffe00ec6c83f0
> > nfssvc_program() at nfssvc_program+0x6dd/frame 0xfffffe00ec6c85e0
> > svc_run_internal() at svc_run_internal+0xb0f/frame 0xfffffe00ec6c8720
> > svc_run() at svc_run+0x1b7/frame 0xfffffe00ec6c8770
> > nfsrvd_nfsd() at nfsrvd_nfsd+0x364/frame 0xfffffe00ec6c88d0
> > nfssvc_nfsd() at nfssvc_nfsd+0x58b/frame 0xfffffe00ec6c8de0
> > sys_nfssvc() at sys_nfssvc+0x9c/frame 0xfffffe00ec6c8e00
> > amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe00ec6c8f30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00ec6c8f30
> > --- syscall (155, FreeBSD ELF64, nfssvc), rip = 0x2c12ca606bea, rsp =
> > 0x2c12c7614098, rbp = 0x2c12c7614330 ---
> > Uptime: 39m11s
> > Dumping 1426 out of 8159 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> > Dump complete
> > acpi0: reset failed - timeout
> > Automatic reboot in 15 seconds - press a key on the console to abort
> > Rebooting...
> > cpu_reset: Restarting BSP
> > cpu_reset_proxy: Stopped CPU 1
> >
> >
> > I haven't had a chance to look at the dump yet. The block_cloning feature
> > has not been enabled yet.
>
> Enabling block_cloning makes no difference. Reverted back to without
> block_cloning using a zpool checkpoint.
>
> I'll also fix what is causing it not to capture a dump and try to capture
> one. Probably because swap is on a gmirrored device.
>
>
> --
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com>
> FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
>
>                         e^(i*pi)+1=0
>