From nobody Sat Aug 12 14:11:10 2023 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4RNMyn2KHXz4q5V3 for ; Sat, 12 Aug 2023 14:11:13 +0000 (UTC) (envelope-from des@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4RNMyn1WBlz4qGB for ; Sat, 12 Aug 2023 14:11:13 +0000 (UTC) (envelope-from des@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1691849473; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xG3BPEfyNnhwESgDQSm2/nn18H2zGVki3BIcpbUS/xw=; b=ucvEzo0i4cMJU3z6kE40He1YWp3cIuCb0+dATFZg4dTQjDn/GSTx32zvNfU+O7Ma4qmfow 1SmaTn8xYD9nvl/dwSGDr8k624wb1mFrFAbujqdfK07Mr2CM1yOSh67uDWoeZnjsEZRxPg H/ClEM+/6+dUkk0tOBqc3KrXS63j/Sz3EPQN97B1IzmMaveQ2J9yIU5jVHrRHfcFbwkI7z ozk+Z4xscrxEqC6mtr1XggVa8Ay+GX8AJjL/7OGJQoHA6ruSHhfzXYDo2ya8MB2wIze8CA lMJjtsX8SQDki9R8JQQgxVIQkzkFHRI+4+VisPzO0Kc7z9JZyPRelQJzF9X1DQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1691849473; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xG3BPEfyNnhwESgDQSm2/nn18H2zGVki3BIcpbUS/xw=; b=Isp/lzv81fv/GtyCIl/HAgxvSsGG5r2qRNSVYshKE63orRjjLvqPENAcb2t3ak2JjHUQba FUXaKccgraq4NPwoXMTqgSW6T+CzB6RVPFvpUPvfhMPUFjxNxu7r91DuNHqYCjYhpBuacy tREBiCQvM8lSNFNOim5Y4+rteahR+F6/Rp5V2oYffDUL2vtmGsxU+sjC6mzL5SAHkw+Wdq feYzy+G+OkYOUiPrE9PGLL2xsQ7brd9Z3HdHOCqrWXJLpUhxcGaUXlWC2r6qBXDwncDsfs pCIFj5/0DWpvBh8N8Z32iIV6pC3sLEfPUytiJvJxrrG0xdNcINEwVgZVvkWW/w== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1691849473; a=rsa-sha256; cv=none; b=T8Al2fQBuNnLcrDSNldkTTNQQhgz7hz09LLR2NppDc7frQaEZHsrsXuHYPJyntjWvh7beU DWX+cVvf6afdAulmcGngCDCFNkWJrp88ugfAmhjHQYjW1DnR51413+CjPVmz6drSXAAZT+ yr8l4L+XZ2G11UiIi4l9vGGT7SUd8A3ODb4beCRHub03uyTVTrRCb6o7KrShkY5T/NbQ/0 6iSQfT1RIltB/THz1LaFa7kIuqr3ojJgeGhQEZfLxpDm6fW603XNQjwFjAnaZCD1SYJXvX B6rDSGFc5pkQKEEMZHthnWV1vh18KWq5fdJDA9+nRWgih8a3poX9qUrAoEpurA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none Received: from ltc.des.no (unknown [IPv6:2001:4647:d671:0:36e8:94ff:feca:9834]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: des) by smtp.freebsd.org (Postfix) with ESMTPSA id 4RNMym68zHzKB for ; Sat, 12 Aug 2023 14:11:12 +0000 (UTC) (envelope-from des@freebsd.org) Received: by ltc.des.no (Postfix, from userid 1001) id 5144C183CD; Sat, 12 Aug 2023 16:11:10 +0200 (CEST) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: current@freebsd.org Subject: Re: ZFS deadlock in 14 In-Reply-To: <86leeltqcb.fsf@ltc.des.no> ("Dag-Erling =?utf-8?Q?Sm=C3=B8rg?= =?utf-8?Q?rav=22's?= message of "Tue, 08 Aug 2023 19:07:48 +0200") References: <86leeltqcb.fsf@ltc.des.no> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (berkeley-unix) Date: Sat, 12 Aug 2023 16:11:10 +0200 Message-ID: <86h6p4s64h.fsf@ltc.des.no> List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Dag-Erling Sm=C3=B8rgrav writes: > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > deadlock was introduced in ZFS. Trying to narrow this range down, I did not get a deadlock with 4e8d558c9d1c (10 June) but I did with b7198dcfc039 (16 June), albeit after building ~1800 packages. This is surprising since we have a report of this or a very similar deadlock occurring with a kernel from 8 June (https://bugs.freebsd.org/271945). Perhaps I should try 4e8d558c9d1c again. Here's the complete kgdb session showing, once again, a zfs rollback stuck waiting for the txg to sync: Reading symbols from /boot/GENERIC/kernel... Reading symbols from /usr/lib/debug//boot/GENERIC/kernel.debug... =20=20=20=20 Unread portion of the kernel message buffer: panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe0356= 7a01e0 (sh), blocked for 180242 ticks =20=20=20=20 cpuid =3D 17 time =3D 1691802362 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe021= 507ce00 vpanic() at vpanic+0x150/frame 0xfffffe021507ce50 panic() at panic+0x43/frame 0xfffffe021507ceb0 deadlkres() at deadlkres+0x350/frame 0xfffffe021507cef0 fork_exit() at fork_exit+0x80/frame 0xfffffe021507cf30 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe021507cf30 --- trap 0xdeadc0de, rip =3D 0xdeadc0dedeadc0de, rsp =3D 0xdeadc0dedead= c0de, rbp =3D 0xdeadc0dedeadc0de --- KDB: enter: panic =20=20=20=20 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:59 59 __asm("movq %%gs:%P1,%0" : "=3Dr" (td) : "n" (offsetof(struct pcpu, (kgdb) tid 0xfffffe03567a01e0 (kgdb) bt #0 sched_switch (td=3Dtd@entry=3D0xfffffe03567a01e0, flags=3Dflags@ent= ry=3D259) at /usr/src/sys/kern/sched_ule.c:2299 #1 0xffffffff80b5fbd4 in mi_switch (flags=3Dflags@entry=3D259) at /usr= /src/sys/kern/kern_synch.c:550 #2 0xffffffff80bb1257 in sleepq_switch (wchan=3D0xfffff80b139e4770, wc= han@entry=3D0xffffffff8113878f, pri=3Dpri@entry=3D64) at /usr/src/sys/kern/= subr_sleepqueue.c:609 #3 0xffffffff80bb112e in sleepq_wait (wchan=3D, pri=3D) at /usr/src/sys/kern/subr_sleepqueue.c:660 #4 0xffffffff80b21d6f in sleeplk (lk=3Dlk@entry=3D0xfffff80b139e4770, = flags=3Dflags@entry=3D2122752, ilk=3Dilk@entry=3D0x0, wmesg=3Dwmesg@entry= =3D0xffffffff8113878f "tmpfs", pri=3D, pri@entry=3D64, timo= =3Dtimo@entry=3D6, queue=3D1) at /usr/src/sys/kern/kern_lock.c:310 #5 0xffffffff80b1fd9f in lockmgr_slock_hard (lk=3D0xfffff80b139e4770, = flags=3D2122752, ilk=3D, file=3D0xffffffff81296919 "/usr/src= /sys/kern/vfs_lookup.c", line=3D1012, lwa=3D0x0) at /usr/src/sys/kern/kern_= lock.c:705 #6 0xffffffff80c5e444 in VOP_LOCK1 (vp=3D0xfffff80b139e4700, flags=3D2= 106368, file=3D0xffffffff81296919 "/usr/src/sys/kern/vfs_lookup.c", line=3D= 1012) at ./vnode_if.h:1120 #7 _vn_lock (vp=3D0xfffff80b139e4700, flags=3D2106368, file=3D, line=3D) at /usr/src/sys/kern/vfs_vnops.c:1808 #8 0xffffffff80c36eae in vfs_lookup (ndp=3Dndp@entry=3D0xfffffe03c63a6= bd8) at /usr/src/sys/kern/vfs_lookup.c:1010 #9 0xffffffff80c36291 in namei (ndp=3Dndp@entry=3D0xfffffe03c63a6bd8) = at /usr/src/sys/kern/vfs_lookup.c:689 #10 0xffffffff80c5681f in kern_statat (td=3D0xfffffe03567a01e0, td@entr= y=3D, flag=3D, fd=3D-100, path=3D0x1032a8685a15= , pathseg=3Dpathseg= @entry=3DUIO_USERSPACE, sbp=3Dsbp@entry=3D0xfffffe03c63a6d18) at /usr/src/sys/kern/vfs_syscalls.c:2441 #11 0xffffffff80c56f17 in sys_fstatat (td=3D, td@entry=3D<= error reading variable: value is not available>, uap=3D0xfffffe03567a05e0, = uap@entry=3D) at /usr/src/s= ys/kern/vfs_syscalls.c:2419 #12 0xffffffff8104d8e0 in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:190 #13 amd64_syscall (td=3D0xfffffe03567a01e0, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1199 #14 #15 0x0000103acaf3b03a in ?? () Backtrace stopped: Cannot access memory at address 0x103ac929af28 (kgdb) f 5 #5 0xffffffff80b1fd9f in lockmgr_slock_hard (lk=3D0xfffff80b139e4770, = flags=3D2122752, ilk=3D, file=3D0xffffffff81296919 "/usr/src= /sys/kern/vfs_lookup.c", line=3D1012, lwa=3D0x0) at /usr/src/sys/kern/kern_= lock.c:705 705 error =3D sleeplk(lk, flags, ilk, iwmesg, ipri, itimo, (kgdb) p (struct thread *)(lk->lk_lock & ~0x1f) $1 =3D (struct thread *) 0xfffffe02ddae0e40 (kgdb) tid 0xfffffe02ddae0e40 (kgdb) bt #0 sched_switch (td=3Dtd@entry=3D0xfffffe02ddae0e40, flags=3Dflags@ent= ry=3D259) at /usr/src/sys/kern/sched_ule.c:2299 #1 0xffffffff80b5fbd4 in mi_switch (flags=3Dflags@entry=3D259) at /usr= /src/sys/kern/kern_synch.c:550 #2 0xffffffff80bb1257 in sleepq_switch (wchan=3Dwchan@entry=3D0xfffff8= 1ab3c81154, pri=3D87, pri@entry=3D-1278734048) at /usr/src/sys/kern/subr_sl= eepqueue.c:609 #3 0xffffffff80bb112e in sleepq_wait (wchan=3D, pri=3D) at /usr/src/sys/kern/subr_sleepqueue.c:660 #4 0xffffffff80b5f11d in _sleep (ident=3D0xfffff81ab3c81154, lock=3D0x= fffff81ab3c81120, priority=3D87, wmesg=3D0xffffffff82239fba "zfs teardown i= nactive", sbt=3D0, pr=3D0, flags=3D256) at /usr/src/sys/kern/kern_synch.c:2= 25 #5 0xffffffff80b4b640 in rms_rlock_fallback (rms=3Drms@entry=3D0xfffff= 81ab3c81120) at /usr/src/sys/kern/kern_rmlock.c:1015 #6 0xffffffff80b4b51c in rms_rlock (rms=3D, rms@entry=3D0= xfffff81ab3c81120) at /usr/src/sys/kern/kern_rmlock.c:1036 #7 0xffffffff81faff5c in zfs_freebsd_reclaim (ap=3D) at= /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:5164 #8 0xffffffff811215e4 in VOP_RECLAIM_APV (vop=3D0xffffffff822e61a0 , a=3Da@entry=3D0xfffffe02fb2118a0) at vnode_if.c:2180 #9 0xffffffff80c47d54 in VOP_RECLAIM (vp=3D0xfffff80912340000) at ./vn= ode_if.h:1084 #10 vgonel (vp=3Dvp@entry=3D0xfffff80912340000) at /usr/src/sys/kern/vf= s_subr.c:4143 #11 0xffffffff80c436f2 in vtryrecycle (vp=3D0xfffff80912340000) at /usr= /src/sys/kern/vfs_subr.c:1693 #12 vnlru_free_impl (count=3Dcount@entry=3D1, mnt_op=3Dmnt_op@entry=3D0= x0, mvp=3D0xfffff8010945da00) at /usr/src/sys/kern/vfs_subr.c:1344 #13 0xffffffff80c4dd83 in vnlru_free_locked (count=3D1) at /usr/src/sys= /kern/vfs_subr.c:1357 #14 vn_alloc_hard (mp=3Dmp@entry=3D0xfffffe0314140000) at /usr/src/sys/= kern/vfs_subr.c:1744 #15 0xffffffff80c43db1 in vn_alloc (mp=3D0xfffffe0314140000) at /usr/sr= c/sys/amd64/include/atomic.h:375 #16 getnewvnode (tag=3D0xffffffff8113878f "tmpfs", mp=3D0xfffffe0314140= 000, vops=3D0xffffffff816b7a70 , vpp=3Dvpp@entry=3D0= xfffffe02fb211a30) at /usr/src/sys/kern/vfs_subr.c:1810 #17 0xffffffff80a7b27c in tmpfs_alloc_vp (mp=3D0xfffffe0314140000, node= =3Dnode@entry=3D0xfffff81924deabc8, lkflag=3Dlkflag@entry=3D524288, vpp=3D0= xfffffe02fb211cf0) at /usr/src/sys/fs/tmpfs/tmpfs_subr.c:1027 #18 0xffffffff80a7b985 in tmpfs_alloc_file (dvp=3Ddvp@entry=3D0xfffff80= b139e4700, vpp=3D, vpp@entry=3D0xfffffe02fb211cf0, vap=3D, cnp=3Dcnp@entry=3D0xfffffe02fb211d18, target=3Dtarget@entry=3D0= x0) at /usr/src/sys/fs/tmpfs/tmpfs_subr.c:1203 #19 0xffffffff80a74d28 in tmpfs_create (v=3D) at /usr/sr= c/sys/fs/tmpfs/tmpfs_vnops.c:271 #20 0xffffffff8111eb99 in VOP_CREATE_APV (vop=3D0xffffffff816b7a70 , a=3Da@entry=3D0xfffffe02fb211be0) at vnode_if.c:244 #21 0xffffffff80c5d94c in VOP_CREATE (dvp=3D, vpp=3D0xffff= fe02fb211cf0, cnp=3D0xfffffe02fb211d18, vap=3D0xfffffe02fb211b20) at ./vnod= e_if.h:133 #22 vn_open_cred (ndp=3Dndp@entry=3D0xfffffe02fb211c98, flagp=3Dflagp@e= ntry=3D0xfffffe02fb211da4, cmode=3Dcmode@entry=3D420, vn_open_flags=3Dvn_op= en_flags@entry=3D16, cred=3D0xfffff8010978bc00, fp=3D0xfffff804f42cff00) at= /usr/src/sys/kern/vfs_vnops.c:287 #23 0xffffffff80c53f43 in kern_openat (td=3D0xfffffe02ddae0e40, td@entr= y=3D, fd=3Dfd@entry=3D-100, path=3D0x8222f799b , pathseg=3Dpathseg@entry=3DUIO_USERSP= ACE, flags=3D1538, mode=3D) at /usr/src/sys/kern/vfs_syscalls.c:1167 #24 0xffffffff80c53cad in sys_open (td=3Dtd@entry=3D, uap= =3Duap@entry=3D) at /usr/src/sys/kern/vfs_syscalls.c:1095 #25 0xffffffff82b18365 in filemon_wrapper_open (td=3D, td@= entry=3D, uap=3D, uap@entry=3D) at /usr/s= rc/sys/dev/filemon/filemon_wrapper.c:220 #26 0xffffffff8104d8e0 in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:190 #27 amd64_syscall (td=3D0xfffffe02ddae0e40, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1199 #28 #29 0x0000000829c8227a in ?? () Backtrace stopped: Cannot access memory at address 0x8222f6868 (kgdb) f 7 #7 0xffffffff81faff5c in zfs_freebsd_reclaim (ap=3D) at= /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:5164 5164 ZFS_TEARDOWN_INACTIVE_ENTER_READ(zfsvfs); (kgdb) p zp->z_zfsvfs->z_teardown_inactive_lock->owner $2 =3D (struct thread *) 0xfffffe0314249020 (kgdb) tid 0xfffffe0314249020 (kgdb) bt #0 sched_switch (td=3Dtd@entry=3D0xfffffe0314249020, flags=3Dflags@ent= ry=3D259) at /usr/src/sys/kern/sched_ule.c:2299 #1 0xffffffff80b5fbd4 in mi_switch (flags=3Dflags@entry=3D259) at /usr= /src/sys/kern/kern_synch.c:550 #2 0xffffffff80bb1257 in sleepq_switch (wchan=3Dwchan@entry=3D0xfffff8= 0108fe1540, pri=3D0, pri@entry=3D150869200) at /usr/src/sys/kern/subr_sleep= queue.c:609 #3 0xffffffff80bb112e in sleepq_wait (wchan=3D, wchan@ent= ry=3D0xfffff80108fe1540, pri=3D, pri@entry=3D0) at /usr/src/sy= s/kern/subr_sleepqueue.c:660 #4 0xffffffff80ade224 in _cv_wait (cvp=3D0xfffff80108fe1540, lock=3D0x= fffff80108fe14d0) at /usr/src/sys/kern/kern_condvar.c:146 #5 0xffffffff820b383b in txg_wait_synced_impl (dp=3D0xfffff80108fe1000= , txg=3D8751529, txg@entry=3D0, wait_sig=3Dwait_sig@entry=3D0) at /usr/src/= sys/contrib/openzfs/module/zfs/txg.c:726 #6 0xffffffff820b31eb in txg_wait_synced (dp=3D, txg=3D, txg@entry=3D0) at /usr/src/sys/contrib/openzfs/module/zfs/txg.= c:736 #7 0xffffffff81fa5fc5 in zfsvfs_teardown (zfsvfs=3D0xfffff81ab3c81000,= unmounting=3Dunmounting@entry=3D0) at /usr/src/sys/contrib/openzfs/module/= os/freebsd/zfs/zfs_vfsops.c:1661 #8 0xffffffff81fa5db9 in zfs_suspend_fs (zfsvfs=3D) at /u= sr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c:1954 #9 0xffffffff821680ff in zfs_ioc_rollback (fsname=3D0xfffffe0301913000= "zroot-default-ref/03", fsname@entry=3D, innvl=3D, innvl@entry=3D,=20 outnvl=3D0xfffff81601748640, outnvl@entry=3D) at /usr/src/sys/contrib/openzfs/module/zfs/zfs_i= octl.c:4401 #10 0xffffffff82163836 in zfsdev_ioctl_common (vecnum=3Dvecnum@entry=3D= 25, zc=3Dzc@entry=3D0xfffffe0301913000, flag=3Dflag@entry=3D0) at /usr/src/= sys/contrib/openzfs/module/zfs/zfs_ioctl.c:7798 #11 0xffffffff81f969aa in zfsdev_ioctl (dev=3D, zcmd=3D<= unavailable>, zcmd@entry=3D= , arg=3D0xfffffe02fd546d50 "\017", arg@entry=3D, flag=3D, td=3D) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/kmod_core.c:1= 68 #12 0xffffffff809dc9cc in devfs_ioctl (ap=3D0xfffffe02fd546c40) at /usr= /src/sys/fs/devfs/devfs_vnops.c:935 #13 0xffffffff80c5cac0 in vn_ioctl (fp=3D0xfffff81e9207f0a0, com=3D, data=3D0xfffffe02fd546d50, active_cred=3D0xfffff8026a65a900, t= d=3D) at /usr/src/sys/kern/vfs_vnops.c:1697 #14 0xffffffff809dd07e in devfs_ioctl_f (fp=3D, fp@entry= =3D, com=3D, c= om@entry=3D, data=3D, data@entry=3D,=20 cred=3D, cred@entry=3D, td=3D, td@entry=3D) at /usr/src/sys/fs/devfs/devfs_vnops.c:866 #15 0xffffffff80bca1ce in fo_ioctl (fp=3D0xfffff81e9207f0a0, com=3D3222= 821401, data=3D, active_cred=3D, td=3D) at /usr/src/sys/sys/file.h:367 #16 kern_ioctl (td=3Dtd@entry=3D0xfffffe0314249020, fd=3D, com=3Dcom@entry=3D3222821401, data=3D, data@entry=3D0xfffff= e02fd546d50 "\017") at /usr/src/sys/kern/sys_generic.c:807 #17 0xffffffff80bc9f64 in sys_ioctl (td=3D0xfffffe0314249020, td@entry= =3D, uap=3D0xfffffe03142494= 20, uap@entry=3D) at /usr/s= rc/sys/kern/sys_generic.c:715 #18 0xffffffff8104d8e0 in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:190 #19 amd64_syscall (td=3D0xfffffe0314249020, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1199 #20 #21 0x000005c8e125953a in ?? () Backtrace stopped: Cannot access memory at address 0x5c8d89c8018 DES --=20 Dag-Erling Sm=C3=B8rgrav - des@FreeBSD.org