From: Cy Schubert
Date: Sat, 12 Aug 2023 10:08:53 -0700
To: freebsd-current@freebsd.org, Dag-Erling Smørgrav, current@freebsd.org
Subject: Re: ZFS deadlock in 14
In-Reply-To: <86h6p4s64h.fsf@ltc.des.no>
References: <86leeltqcb.fsf@ltc.des.no> <86h6p4s64h.fsf@ltc.des.no>
Message-ID: <6CB02436-1FD6-43C9-BB77-925A7443D8B8@cschubert.com>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On August 12, 2023 7:11:10 AM PDT, "Dag-Erling Smørgrav" wrote:
>Dag-Erling Smørgrav writes:
>> At some point between 42d088299c (4 May) and f0c9703301 (26 June), a
>> deadlock was introduced in ZFS.
>
>Trying to narrow this range down, I did not get a deadlock with
>4e8d558c9d1c (10 June) but I did with b7198dcfc039 (16 June), albeit
>after building ~1800 packages.  This is surprising since we have a
>report of this or a very similar deadlock occurring with a kernel from 8
>June (https://bugs.freebsd.org/271945).  Perhaps I should try
>4e8d558c9d1c again.
>
>Here's the complete kgdb session showing, once again, a zfs rollback
>stuck waiting for the txg to sync:
>
> Reading symbols from /boot/GENERIC/kernel...
> Reading symbols from /usr/lib/debug//boot/GENERIC/kernel.debug...
>
> Unread portion of the kernel message buffer:
> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe03567a01e0 (sh), blocked for 180242 ticks
>
> cpuid = 17
> time = 1691802362
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe021507ce00
> vpanic() at vpanic+0x150/frame 0xfffffe021507ce50
> panic() at panic+0x43/frame 0xfffffe021507ceb0
> deadlkres() at deadlkres+0x350/frame 0xfffffe021507cef0
> fork_exit() at fork_exit+0x80/frame 0xfffffe021507cf30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe021507cf30
> --- trap 0xdeadc0de, rip = 0xdeadc0dedeadc0de, rsp = 0xdeadc0dedeadc0de, rbp = 0xdeadc0dedeadc0de ---
> KDB: enter: panic
>
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:59
> 59 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> (kgdb) tid 0xfffffe03567a01e0
> (kgdb) bt
> #0 sched_switch (td=td@entry=0xfffffe03567a01e0, flags=flags@entry=259) at /usr/src/sys/kern/sched_ule.c:2299
> #1 0xffffffff80b5fbd4 in mi_switch (flags=flags@entry=259) at /usr/src/sys/kern/kern_synch.c:550
> #2 0xffffffff80bb1257 in sleepq_switch (wchan=0xfffff80b139e4770, wchan@entry=0xffffffff8113878f, pri=pri@entry=64) at /usr/src/sys/kern/subr_sleepqueue.c:609
> #3 0xffffffff80bb112e in sleepq_wait (wchan=, pri=<unavailable>) at /usr/src/sys/kern/subr_sleepqueue.c:660
> #4 0xffffffff80b21d6f in sleeplk (lk=lk@entry=0xfffff80b139e4770, flags=flags@entry=2122752, ilk=ilk@entry=0x0, wmesg=wmesg@entry=0xffffffff8113878f "tmpfs", pri=, pri@entry=64, timo=timo@entry=6, queue=1) at /usr/src/sys/kern/kern_lock.c:310
> #5 0xffffffff80b1fd9f in lockmgr_slock_hard (lk=0xfffff80b139e4770, flags=2122752, ilk=, file=0xffffffff81296919 "/usr/src/sys/kern/vfs_lookup.c", line=1012, lwa=0x0) at /usr/src/sys/kern/kern_lock.c:705
> #6 0xffffffff80c5e444 in VOP_LOCK1 (vp=0xfffff80b139e4700, flags=2106368, file=0xffffffff81296919 "/usr/src/sys/kern/vfs_lookup.c", line=1012) at ./vnode_if.h:1120
> #7 _vn_lock (vp=0xfffff80b139e4700, flags=2106368, file=, line=) at /usr/src/sys/kern/vfs_vnops.c:1808
> #8 0xffffffff80c36eae in vfs_lookup (ndp=ndp@entry=0xfffffe03c63a6bd8) at /usr/src/sys/kern/vfs_lookup.c:1010
> #9 0xffffffff80c36291 in namei (ndp=ndp@entry=0xfffffe03c63a6bd8) at /usr/src/sys/kern/vfs_lookup.c:689
> #10 0xffffffff80c5681f in kern_statat (td=0xfffffe03567a01e0, td@entry=, flag=, fd=-100, path=0x1032a8685a15 , pathseg=pathseg@entry=UIO_USERSPACE, sbp=sbp@entry=0xfffffe03c63a6d18)
> at /usr/src/sys/kern/vfs_syscalls.c:2441
> #11 0xffffffff80c56f17 in sys_fstatat (td=, td@entry=, uap=0xfffffe03567a05e0, uap@entry=) at /usr/src/sys/kern/vfs_syscalls.c:2419
> #12 0xffffffff8104d8e0 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
> #13 amd64_syscall (td=0xfffffe03567a01e0, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1199
> #14
> #15 0x0000103acaf3b03a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x103ac929af28
> (kgdb) f 5
> #5 0xffffffff80b1fd9f in lockmgr_slock_hard (lk=0xfffff80b139e4770, flags=2122752, ilk=, file=0xffffffff81296919 "/usr/src/sys/kern/vfs_lookup.c", line=1012, lwa=0x0) at /usr/src/sys/kern/kern_lock.c:705
> 705 error = sleeplk(lk, flags, ilk, iwmesg, ipri, itimo,
> (kgdb) p (struct thread *)(lk->lk_lock & ~0x1f)
> $1 = (struct thread *) 0xfffffe02ddae0e40
> (kgdb) tid 0xfffffe02ddae0e40
> (kgdb) bt
> #0 sched_switch (td=td@entry=0xfffffe02ddae0e40, flags=flags@entry=259) at /usr/src/sys/kern/sched_ule.c:2299
> #1 0xffffffff80b5fbd4 in mi_switch (flags=flags@entry=259) at /usr/src/sys/kern/kern_synch.c:550
> #2 0xffffffff80bb1257 in sleepq_switch (wchan=wchan@entry=0xfffff81ab3c81154, pri=87, pri@entry=-1278734048) at /usr/src/sys/kern/subr_sleepqueue.c:609
> #3 0xffffffff80bb112e in sleepq_wait (wchan=, pri=<unavailable>) at /usr/src/sys/kern/subr_sleepqueue.c:660
> #4 0xffffffff80b5f11d in _sleep (ident=0xfffff81ab3c81154, lock=0xfffff81ab3c81120, priority=87, wmesg=0xffffffff82239fba "zfs teardown inactive", sbt=0, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:225
> #5 0xffffffff80b4b640 in rms_rlock_fallback (rms=rms@entry=0xfffff81ab3c81120) at /usr/src/sys/kern/kern_rmlock.c:1015
> #6 0xffffffff80b4b51c in rms_rlock (rms=, rms@entry=0xfffff81ab3c81120) at /usr/src/sys/kern/kern_rmlock.c:1036
> #7 0xffffffff81faff5c in zfs_freebsd_reclaim (ap=) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:5164
> #8 0xffffffff811215e4 in VOP_RECLAIM_APV (vop=0xffffffff822e61a0 <zfs_vnodeops>, a=a@entry=0xfffffe02fb2118a0) at vnode_if.c:2180
> #9 0xffffffff80c47d54 in VOP_RECLAIM (vp=0xfffff80912340000) at ./vnode_if.h:1084
> #10 vgonel (vp=vp@entry=0xfffff80912340000) at /usr/src/sys/kern/vfs_subr.c:4143
> #11 0xffffffff80c436f2 in vtryrecycle (vp=0xfffff80912340000) at /usr/src/sys/kern/vfs_subr.c:1693
> #12 vnlru_free_impl (count=count@entry=1, mnt_op=mnt_op@entry=0x0, mvp=0xfffff8010945da00) at /usr/src/sys/kern/vfs_subr.c:1344
> #13 0xffffffff80c4dd83 in vnlru_free_locked (count=1) at /usr/src/sys/kern/vfs_subr.c:1357
> #14 vn_alloc_hard (mp=mp@entry=0xfffffe0314140000) at /usr/src/sys/kern/vfs_subr.c:1744
> #15 0xffffffff80c43db1 in vn_alloc (mp=0xfffffe0314140000) at /usr/src/sys/amd64/include/atomic.h:375
> #16 getnewvnode (tag=0xffffffff8113878f "tmpfs", mp=0xfffffe0314140000, vops=0xffffffff816b7a70 , vpp=vpp@entry=0xfffffe02fb211a30) at /usr/src/sys/kern/vfs_subr.c:1810
> #17 0xffffffff80a7b27c in tmpfs_alloc_vp (mp=0xfffffe0314140000, node=node@entry=0xfffff81924deabc8, lkflag=lkflag@entry=524288, vpp=0xfffffe02fb211cf0) at /usr/src/sys/fs/tmpfs/tmpfs_subr.c:1027
> #18 0xffffffff80a7b985 in tmpfs_alloc_file (dvp=dvp@entry=0xfffff80b139e4700, vpp=, vpp@entry=0xfffffe02fb211cf0, vap=, cnp=cnp@entry=0xfffffe02fb211d18, target=target@entry=0x0) at /usr/src/sys/fs/tmpfs/tmpfs_subr.c:1203
> #19 0xffffffff80a74d28 in tmpfs_create (v=) at /usr/src/sys/fs/tmpfs/tmpfs_vnops.c:271
> #20 0xffffffff8111eb99 in VOP_CREATE_APV (vop=0xffffffff816b7a70 , a=a@entry=0xfffffe02fb211be0) at vnode_if.c:244
> #21 0xffffffff80c5d94c in VOP_CREATE (dvp=, vpp=0xfffffe02fb211cf0, cnp=0xfffffe02fb211d18, vap=0xfffffe02fb211b20) at ./vnode_if.h:133
> #22 vn_open_cred (ndp=ndp@entry=0xfffffe02fb211c98, flagp=flagp@entry=0xfffffe02fb211da4, cmode=cmode@entry=420, vn_open_flags=vn_open_flags@entry=16, cred=0xfffff8010978bc00, fp=0xfffff804f42cff00) at /usr/src/sys/kern/vfs_vnops.c:287
> #23 0xffffffff80c53f43 in kern_openat (td=0xfffffe02ddae0e40, td@entry=, fd=fd@entry=-100, path=0x8222f799b , pathseg=pathseg@entry=UIO_USERSPACE, flags=1538, mode=)
> at /usr/src/sys/kern/vfs_syscalls.c:1167
> #24 0xffffffff80c53cad in sys_open (td=td@entry=, uap=uap@entry=) at /usr/src/sys/kern/vfs_syscalls.c:1095
> #25 0xffffffff82b18365 in filemon_wrapper_open (td=, td@entry=, uap=, uap@entry=) at /usr/src/sys/dev/filemon/filemon_wrapper.c:220
> #26 0xffffffff8104d8e0 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
> #27 amd64_syscall (td=0xfffffe02ddae0e40, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1199
> #28
> #29 0x0000000829c8227a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x8222f6868
> (kgdb) f 7
> #7 0xffffffff81faff5c in zfs_freebsd_reclaim (ap=) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:5164
> 5164 ZFS_TEARDOWN_INACTIVE_ENTER_READ(zfsvfs);
> (kgdb) p zp->z_zfsvfs->z_teardown_inactive_lock->owner
> $2 = (struct thread *) 0xfffffe0314249020
> (kgdb) tid 0xfffffe0314249020
> (kgdb) bt
> #0 sched_switch (td=td@entry=0xfffffe0314249020, flags=flags@entry=259) at /usr/src/sys/kern/sched_ule.c:2299
> #1 0xffffffff80b5fbd4 in mi_switch (flags=flags@entry=259) at /usr/src/sys/kern/kern_synch.c:550
> #2 0xffffffff80bb1257 in sleepq_switch (wchan=wchan@entry=0xfffff80108fe1540, pri=0, pri@entry=150869200) at /usr/src/sys/kern/subr_sleepqueue.c:609
> #3 0xffffffff80bb112e in sleepq_wait (wchan=, wchan@entry=0xfffff80108fe1540, pri=, pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:660
> #4 0xffffffff80ade224 in _cv_wait (cvp=0xfffff80108fe1540, lock=0xfffff80108fe14d0) at /usr/src/sys/kern/kern_condvar.c:146
> #5 0xffffffff820b383b in txg_wait_synced_impl (dp=0xfffff80108fe1000, txg=8751529, txg@entry=0, wait_sig=wait_sig@entry=0) at /usr/src/sys/contrib/openzfs/module/zfs/txg.c:726
> #6 0xffffffff820b31eb in txg_wait_synced (dp=, txg=, txg@entry=0) at /usr/src/sys/contrib/openzfs/module/zfs/txg.c:736
> #7 0xffffffff81fa5fc5 in zfsvfs_teardown (zfsvfs=0xfffff81ab3c81000, unmounting=unmounting@entry=0) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c:1661
> #8 0xffffffff81fa5db9 in zfs_suspend_fs (zfsvfs=) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c:1954
> #9 0xffffffff821680ff in zfs_ioc_rollback (fsname=0xfffffe0301913000 "zroot-default-ref/03", fsname@entry=, innvl=, innvl@entry=,
> outnvl=0xfffff81601748640, outnvl@entry=) at /usr/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:4401
> #10 0xffffffff82163836 in zfsdev_ioctl_common (vecnum=vecnum@entry=25, zc=zc@entry=0xfffffe0301913000, flag=flag@entry=0) at /usr/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:7798
> #11 0xffffffff81f969aa in zfsdev_ioctl (dev=, zcmd=, zcmd@entry=, arg=0xfffffe02fd546d50 "\017", arg@entry=, flag=, td=)
> at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/kmod_core.c:168
> #12 0xffffffff809dc9cc in devfs_ioctl (ap=0xfffffe02fd546c40) at /usr/src/sys/fs/devfs/devfs_vnops.c:935
> #13 0xffffffff80c5cac0 in vn_ioctl (fp=0xfffff81e9207f0a0, com=, data=0xfffffe02fd546d50, active_cred=0xfffff8026a65a900, td=) at /usr/src/sys/kern/vfs_vnops.c:1697
> #14 0xffffffff809dd07e in devfs_ioctl_f (fp=, fp@entry=, com=, com@entry=, data=, data@entry=,
> cred=, cred@entry=, td=, td@entry=) at /usr/src/sys/fs/devfs/devfs_vnops.c:866
> #15 0xffffffff80bca1ce in fo_ioctl (fp=0xfffff81e9207f0a0, com=3222821401, data=, active_cred=, td=) at /usr/src/sys/sys/file.h:367
> #16 kern_ioctl (td=td@entry=0xfffffe0314249020, fd=, com=com@entry=3222821401, data=, data@entry=0xfffffe02fd546d50 "\017") at /usr/src/sys/kern/sys_generic.c:807
> #17 0xffffffff80bc9f64 in sys_ioctl (td=0xfffffe0314249020, td@entry=, uap=0xfffffe0314249420, uap@entry=) at /usr/src/sys/kern/sys_generic.c:715
> #18 0xffffffff8104d8e0 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:190
> #19 amd64_syscall (td=0xfffffe0314249020, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1199
> #20
> #21 0x000005c8e125953a in ?? ()
> Backtrace stopped: Cannot access memory at address 0x5c8d89c8018
>
>DES

Yes, this is the same panic my poudriere builder building amd64 packages gets. The poudriere builder, also running on amd64, building i386 packages gets a different panic. I'm on my phone and don't have a keyboard to look up the PR number.

-- 
Cheers,
Cy Schubert
FreeBSD UNIX: Web: https://FreeBSD.org
NTP: Web: https://nwtime.org

e^(i*pi)+1=0

Pardon the typos. Small keyboard in use.