Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

From: Cy Schubert <Cy.Schubert_at_cschubert.com>
Date: Tue, 04 Apr 2023 05:28:11 UTC
In message <CAM5tNy64HTeC8+OT_SHg1osnKKAH3_qQJkyWFuOy-LDAFVzu+A@mail.gmail.com>, Rick Macklem writes:
> On Mon, Apr 3, 2023 at 6:55 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Mon, Apr 3, 2023 at 4:58 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > >
> > > In message <CAM5tNy45XwDNGK27i_Z_96H-sLDXXHuaZbSQ=E7507eCiCvgJw@mail.gmail.com>, Rick Macklem writes:
> > > > On Mon, Apr 3, 2023 at 4:38 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > > > >
> > > > > In message <20230403231444.CF48911F@slippy.cwsent.com>, Cy Schubert writes:
> > > > > > In message <202304031513.333FD6qw014903@gitrepo.freebsd.org>, Martin Matuska writes:
> > > > > > > The branch main has been updated by mm:
> > > > > > >
> > > > > > > URL: https://cgit.FreeBSD.org/src/commit/?id=2a58b312b62f908ec92311d1bd8536dbaeb8e55b
> > > > > > >
> > > > > > > commit 2a58b312b62f908ec92311d1bd8536dbaeb8e55b
> > > > > > > Merge: b98fbf3781df 431083f75bdd
> > > > > > > Author:     Martin Matuska <mm@FreeBSD.org>
> > > > > > > AuthorDate: 2023-04-03 14:49:30 +0000
> > > > > > > Commit:     Martin Matuska <mm@FreeBSD.org>
> > > > > > > CommitDate: 2023-04-03 14:49:30 +0000
> > > > > > >
> > > > > > >     zfs: merge openzfs/zfs@431083f75
> > > > > > >
> > > > > > >     Notable upstream pull request merges:
> > > > > > >       #12194 Fix short-lived txg caused by autotrim
> > > > > > >       #13368 ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_syn=
> ced()
> > > > > > >       #13392 Implementation of block cloning for ZFS
> > > > > > >       #13741 SHA2 reworking and API for iterating over multiple=
>  imple=3D
> > > > mentat
> > > > > > io
> > > > > > > ns
> > > > > > >       #14282 Sync thread should avoid holding the spa config wr=
> ite lo=3D
> > > > ck
> > > > > > >              when possible
> > > > > > >       #14283 txg_sync should handle write errors in ZIL
> > > > > > >       #14359 More adaptive ARC eviction
> > > > > > >       #14469 Fix NULL pointer dereference in zio_ready()
> > > > > > >       #14479 zfs redact fails when dnodesize=3D3Dauto
> > > > > > >       #14496 improve error message of zfs redact
> > > > > > >       #14500 Skip memory allocation when compressing holes
> > > > > > >       #14501 FreeBSD: don't verify recycled vnode for zfs contr=
> ol dir=3D
> > > > ectory
> > > > > > >       #14502 partially revert PR 14304 (eee9362a7)
> > > > > > >       #14509 Fix per-jail zfs.mount_snapshot setting
> > > > > > >       #14514 Fix data race between zil_commit() and zil_suspend=
> ()
> > > > > > >       #14516 System-wide speculative prefetch limit
> > > > > > >       #14517 Use rw_tryupgrade() in dmu_bonus_hold_by_dnode()
> > > > > > >       #14519 Do not hold spa_config in ZIL while blocked on IO
> > > > > > >       #14523 Move dmu_buf_rele() after dsl_dataset_sync_done()
> > > > > > >       #14524 Ignore too large stack in case of dsl_deadlist_mer=
> ge
> > > > > > >       #14526 Use .section .rodata instead of .rodata on FreeBSD
> > > > > > >       #14528 ICP: AES-GCM: Refactor gcm_clear_ctx()
> > > > > > >       #14529 ICP: AES-GCM: Unify gcm_init_ctx() and gmac_init_c=
> tx()
> > > > > > >       #14532 Handle unexpected errors in zil_lwb_commit() witho=
> ut ASS=3D
> > > > ERT()
> > > > > > >       #14544 icp: Prevent compilers from optimizing away memset=
> ()
> > > > > > >              in gcm_clear_ctx()
> > > > > > >       #14546 Revert zfeature_active() to static
> > > > > > >       #14556 Remove bad kmem_free() oversight from previous zfs=
> dev_st=3D
> > > > ate_li
> > > > > > st
> > > > > > >              patch
> > > > > > >       #14563 Optimize the is_l2cacheable functions
> > > > > > >       #14565 FreeBSD: zfs_znode_alloc: lock the vnode earlier
> > > > > > >       #14566 FreeBSD: fix false assert in cache_vop_rmdir when =
> replay=3D
> > > > ing ZI
> > > > > > L
> > > > > > >       #14567 spl: Add cmn_err_once() to log a message only on t=
> he fir=3D
> > > > st cal
> > > > > > l
> > > > > > >       #14568 Fix incremental receive silently failing for recur=
> sive s=3D
> > > > ends
> > > > > > >       #14569 Restore ASMABI and other Unify work
> > > > > > >       #14576 Fix detection of IBM Power8 machines (ISA 2.07)
> > > > > > >       #14577 Better handling for future crypto parameters
> > > > > > >       #14600 zcommon: Refactor FPU state handling in fletcher4
> > > > > > >       #14603 Fix prefetching of indirect blocks while destroyin=
> g
> > > > > > >       #14633 Fixes in persistent error log
> > > > > > >       #14639 FreeBSD: Remove extra arc_reduce_target_size() cal=
> l
> > > > > > >       #14641 Additional limits on hole reporting
> > > > > > >       #14649 Drop lying to the compiler in the fletcher4 code
> > > > > > >       #14652 panic loop when removing slog device
> > > > > > >       #14653 Update vdev state for spare vdev
> > > > > > >       #14655 Fix cloning into already dirty dbufs
> > > > > > >       #14678 Revert "Do not hold spa_config in ZIL while blocke=
> d on I=3D
> > > > O"
> > > > > > >
> > > > > > >     Obtained from:  OpenZFS
> > > > > > >     OpenZFS commit: 431083f75bdd3efaee992bdd672625ec7240d252
> > > > > >
> > > > > > > Just a heads up, I'm encountering the following error with an NFS share of
> > > > > > > a ZFS dataset.
> > > > > > >
> > > > > > > Fatal trap 12: page fault while in kernel mode
> > > > > > > cpuid = 1; apic id = 01
> > > > > > > fault virtual address = 0x178
> > > > > > > fault code            = supervisor read data, page not present
> > > > > > > instruction pointer   = 0x20:0xffffffff814eebcd
> > > > > > > stack pointer         = 0x28:0xfffffe00ec6c7cd0
> > > > > > > frame pointer         = 0x28:0xfffffe00ec6c7d50
> > > > > > > code segment          = base 0x0, limit 0xfffff, type 0x1b
> > > > > > >                       = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > > > > processor eflags      = interrupt enabled, resume, IOPL = 0
> > > > > > > current process               = 3735 (nfsd: master)
> > > > > > > rdi: fffff8020a6f8570 rsi: fffffe00ec6c80d8 rdx: fffff8020f56f3a0
> > > > > > > rcx: fffffe00ec6c80e0  r8:                0  r9:          1000000
> > > > > > > rax:                0 rbx: fffff80210123540 rbp: fffffe00ec6c7d50
> > > > > > > r10:             1876 r11: ffffffff81714596 r12: fffffe00ec6c7d20
> > > > > > > r13:                0 r14: fffff8020b2f3e00 r15: fffffe00ec6c7d68
> > > > > > > trap number           = 12
> > > > > > > panic: page fault
> > > > > > > cpuid = 1
> > > > > > > time = 1680563351
> > > > > > > KDB: stack backtrace:
> > > > > > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00ec6c7a90
> > > > > > > vpanic() at vpanic+0x152/frame 0xfffffe00ec6c7ae0
> > > > > > > panic() at panic+0x43/frame 0xfffffe00ec6c7b40
> > > > > > > trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00ec6c7c00
> > > > > > > calltrap() at calltrap+0x8/frame 0xfffffe00ec6c7c00
> > > > > > > --- trap 0xc, rip = 0xffffffff814eebcd, rsp = 0xfffffe00ec6c7cd0, rbp = 0xfffffe00ec6c7d50 ---
> > > > > > > zfs_freebsd_copy_file_range() at zfs_freebsd_copy_file_range+0x1bd/frame 0xfffffe00ec6c7d50
> > > > Can you find out the line# for the above?
> > > > (I'll admit I don't know how to do that for loadable modules.)
> > > >
> > > > I know nothing about ZFS, but I might be able to spot why the
> > > > VOP_COPY_FILE_RANGE() might be crashing.
> > > > Looks like a NULL pointer reference to a fairly large structure.
> > > > (I don't think a zfs_copy_file_range() exists in older ZFS versions?)
> > >
> > > I don't think it does so it calls vn_generic_copy_file_range() instead.
> > >
> > > > A workaround at vfs_vnops.c:3077 should fix it but I'm not sure if this is
> > > > the correct fix.
> > > Well, I looked and the crash is obvious. For the NFS server, the last
> > > argument (fsize_td) is NULL (as noted by "man VOP_COPY_FILE_RANGE").
> > >
> > > zfs_copy_file_range() cannot use ap->a_fsizetd->td_ucred for the NFS server
> > > case. It should probably use ap->a_incred or ap->a_outcred when
> > > ap->a_fsizetd == NULL.
> > > (Both ap->a_incred and ap->a_outcred are the same for the NFS server,
> > > so either one should be ok.)
> >
> Looking at zfs_clone_range(), the only use of the cred argument is a call
> to zfs_clear_setid_bits_if_necessary().  Since UFS decides whether or not
> to clear SUID when writing to a file based on the credentials used when the
> file is open()'d for writing, I think
>    ap->a_outcred is the correct credential to pass into
>    zfs_clone_range() whenever it is called. The thread credential
>    might have changed, due to something like a seteuid() after opening
>    the file.

Will update your patch to use outcred instead.
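
For illustration only, here is a minimal userland sketch of the credential
selection discussed above. It is not the patch itself. The struct definitions
are cut-down stand-ins for the kernel types, the helper names
cred_from_thread() and cred_for_clone() are hypothetical, and the sketch
assumes a_outcred is always set by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption: only the
 * fields needed for this illustration are modelled). */
struct ucred { unsigned cr_uid; };
struct thread { struct ucred *td_ucred; };

/* Subset of the VOP_COPY_FILE_RANGE argument block mentioned above. */
struct copy_file_range_args {
	struct ucred *a_incred;
	struct ucred *a_outcred;
	struct thread *a_fsizetd;	/* NULL when called by the NFS server */
};

/*
 * Thread-based lookup. The faulting code dereferenced a_fsizetd
 * unconditionally; this version is guarded so the sketch can show the
 * NFS server case without crashing.
 */
static struct ucred *
cred_from_thread(const struct copy_file_range_args *ap)
{
	return (ap->a_fsizetd != NULL ? ap->a_fsizetd->td_ucred : NULL);
}

/*
 * Selection suggested in this thread: use the credential the output
 * file was opened with, whether or not a_fsizetd is NULL.
 */
static struct ucred *
cred_for_clone(const struct copy_file_range_args *ap)
{
	assert(ap->a_outcred != NULL);	/* assumption of this sketch */
	return (ap->a_outcred);
}
```

With a_fsizetd == NULL (the NFS server case) cred_for_clone() still returns a
usable credential. For a local caller it returns the open-time credential
even if the thread credential has since changed, for example after a
seteuid().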

>
> rick
>
> > You could try the attached patch, rick
> >
> > >
> > >
> > > --
> > > Cheers,
> > > Cy Schubert <Cy.Schubert@cschubert.com>
> > > FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> > > NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
> > >
> > > >                         e^(i*pi)+1=0
> > >
> > >
> > >
> > > >       háL

I've noticed random artifacts in my emails since moving to the new ZFS. I'm not
sure whether they're related to block_cloning or something else. This could
point to deeper trouble.


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
NTP:           <cy@nwtime.org>    Web:  https://nwtime.org

			e^(i*pi)+1=0