Re: git: 315ee00fa961 - main - zfs: merge openzfs/zfs@804414aad

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Fri, 01 Sep 2023 13:05:28 UTC
On 31.08.2023 20:19, Cy Schubert wrote:
> Cy Schubert writes:
>> In message <20230831234437.0EB38133@slippy.cwsent.com>, Cy Schubert writes:
>>> In message <20230831234023.917DFAB@slippy.cwsent.com>, Cy Schubert writes:
>>>>>> One of the two machines is hung.
>>>>>>
>>>>>> cwfw# ping bob
>>>>>> PING bob (10.1.1.7): 56 data bytes
>>>>>> ^C
>>>>>> --- bob ping statistics ---
>>>>>> 2 packets transmitted, 0 packets received, 100.0% packet loss
>>>>>> cwfw# console bob
>>>>>> [Enter `^Ec?' for help]
>>>>>> [halt sent]
>>>>>> KDB: enter: Break to debugger
>>>>>> [ thread pid 31259 tid 100913 ]
>>>>>> Stopped at      kdb_break+0x48: movq    $0,0xa1069d(%rip)
>>>>>> db> bt
>>>>>> Tracing pid 31259 tid 100913 td 0xfffffe00c4eca000
>>>>>> kdb_break() at kdb_break+0x48/frame 0xfffffe00c53ef2d0
>>>>>> uart_intr() at uart_intr+0xf7/frame 0xfffffe00c53ef310
>>>>>> intr_event_handle() at intr_event_handle+0x12b/frame 0xfffffe00c53ef380
>>>>>> intr_execute_handlers() at intr_execute_handlers+0x63/frame
>>>>>> 0xfffffe00c53ef3b0
>>>>>> Xapic_isr1() at Xapic_isr1+0xdc/frame 0xfffffe00c53ef3b0
>>>>>> --- interrupt, rip = 0xffffffff806d5c70, rsp = 0xfffffe00c53ef480, rbp =
>>>>>> 0xfffffe00c53ef480 ---
>>>>>> getbinuptime() at getbinuptime+0x30/frame 0xfffffe00c53ef480
>>>>>> arc_access() at arc_access+0x250/frame 0xfffffe00c53ef4d0
>>>>>> arc_buf_access() at arc_buf_access+0xd0/frame 0xfffffe00c53ef4f0
>>>>>> dbuf_hold_impl() at dbuf_hold_impl+0xf3/frame 0xfffffe00c53ef580
>>>>>> dbuf_hold() at dbuf_hold+0x25/frame 0xfffffe00c53ef5b0
>>>>>> dnode_hold_impl() at dnode_hold_impl+0x194/frame 0xfffffe00c53ef670
>>>>>> dmu_bonus_hold() at dmu_bonus_hold+0x20/frame 0xfffffe00c53ef6a0
>>>>>> zfs_zget() at zfs_zget+0x20d/frame 0xfffffe00c53ef750
>>>>>> zfs_dirent_lookup() at zfs_dirent_lookup+0x16d/frame 0xfffffe00c53ef7a0
>>>>>> zfs_dirlook() at zfs_dirlook+0x7f/frame 0xfffffe00c53ef7d0
>>>>>> zfs_lookup() at zfs_lookup+0x3c0/frame 0xfffffe00c53ef8a0
>>>>>> zfs_freebsd_cachedlookup() at zfs_freebsd_cachedlookup+0x67/frame
>>>>>> 0xfffffe00c53ef9e0
>>>>>> vfs_cache_lookup() at vfs_cache_lookup+0xa6/frame 0xfffffe00c53efa30
>>>>>> vfs_lookup() at vfs_lookup+0x457/frame 0xfffffe00c53efac0
>>>>>> namei() at namei+0x2e1/frame 0xfffffe00c53efb20
>>>>>> vn_open_cred() at vn_open_cred+0x505/frame 0xfffffe00c53efca0
>>>>>> kern_openat() at kern_openat+0x287/frame 0xfffffe00c53efdf0
>>>>>> ia32_syscall() at ia32_syscall+0x156/frame 0xfffffe00c53eff30
>>>>>> int0x80_syscall_common() at int0x80_syscall_common+0x9c/frame 0xffff89dc
>>>>>> db>
>>>>>>
>>>>>> I'll let it continue. Hopefully the watchdog timer will pop and we get a
>>>>>> dump.
>>>>>>
>>>>>
>>>>>
>>>>> Might also be interesting to see if this moves around or is really hung
>>>>> getting the time. I suspect it's a livelock given this traceback.
>>>>
>>>> It's moving around.
>>>>
>>>>
>>>> db> c
>>>> [halt sent]
>>>> KDB: enter: Break to debugger
>>>> [ thread pid 31259 tid 100913 ]
>>>> Stopped at      kdb_break+0x48: movq    $0,0xa1069d(%rip)
>>>> db> bt
>>>> Tracing pid 31259 tid 100913 td 0xfffffe00c4eca000
>>>> kdb_break() at kdb_break+0x48/frame 0xfffffe00c53ef410
>>>> uart_intr() at uart_intr+0xf7/frame 0xfffffe00c53ef450
>>>> intr_event_handle() at intr_event_handle+0x12b/frame 0xfffffe00c53ef4c0
>>>> intr_execute_handlers() at intr_execute_handlers+0x63/frame
>>>> 0xfffffe00c53ef4f0
>>>> Xapic_isr1() at Xapic_isr1+0xdc/frame 0xfffffe00c53ef4f0
>>>> --- interrupt, rip = 0xffffffff807117c4, rsp = 0xfffffe00c53ef5c0, rbp =
>>>> 0xfffffe00c53ef5c0 ---
>>>> lock_delay() at lock_delay+0x14/frame 0xfffffe00c53ef5c0
>>>> _sx_xlock_hard() at _sx_xlock_hard+0x1a9/frame 0xfffffe00c53ef660
>>>> _sx_xlock() at _sx_xlock+0xb0/frame 0xfffffe00c53ef6a0
>>>> zfs_zget() at zfs_zget+0x1fa/frame 0xfffffe00c53ef750
>>>> zfs_dirent_lookup() at zfs_dirent_lookup+0x16d/frame 0xfffffe00c53ef7a0
>>>> zfs_dirlook() at zfs_dirlook+0x7f/frame 0xfffffe00c53ef7d0
>>>> zfs_lookup() at zfs_lookup+0x3c0/frame 0xfffffe00c53ef8a0
>>>> zfs_freebsd_cachedlookup() at zfs_freebsd_cachedlookup+0x67/frame
>>>> 0xfffffe00c53ef9e0
>>>> vfs_cache_lookup() at vfs_cache_lookup+0xa6/frame 0xfffffe00c53efa30
>>>> vfs_lookup() at vfs_lookup+0x457/frame 0xfffffe00c53efac0
>>>> namei() at namei+0x2e1/frame 0xfffffe00c53efb20
>>>> vn_open_cred() at vn_open_cred+0x505/frame 0xfffffe00c53efca0
>>>> kern_openat() at kern_openat+0x287/frame 0xfffffe00c53efdf0
>>>> ia32_syscall() at ia32_syscall+0x156/frame 0xfffffe00c53eff30
>>>> int0x80_syscall_common() at int0x80_syscall_common+0x9c/frame 0xffff89dc
>>>> db> c
>>>
>>> It seems to have settled there for now.
>>
>> I manually panicked the machine. It captured a good dump of the above
>> backtrace.
>>
>> This is the amd64 machine that was building i386 packages under poudriere.
>> I purposely set -J high to stress it enough to panic the machine.
>>
>> This kernel includes amotin's patch from
>> https://github.com/openzfs/zfs/pull/15228. This manually triggered panic is
>> different from the one it suffered two days ago.
>>
>> The machine building amd64 packages is still running nicely but it's
>> building some large packages. Because of this there's not as much stress on
>> zfs. The panicked machine was building a lot of small packages. As such it
>> was doing a lot of zfs clones, destroys, and rollbacks.
> 
> An initial look at the dump:
> 
> (kgdb) bt
> #0  __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:57
> #1  doadump (textdump=textdump@entry=1)
>      at /opt/src/git-src/sys/kern/kern_shutdown.c:405
> #2  0xffffffff806c1b30 in kern_reboot (howto=260)
>      at /opt/src/git-src/sys/kern/kern_shutdown.c:526
> #3  0xffffffff806c202f in vpanic (fmt=0xffffffff80b8653c "from debugger",
>      ap=ap@entry=0xfffffe00c53eef30)
>      at /opt/src/git-src/sys/kern/kern_shutdown.c:970
> #4  0xffffffff806c1dd3 in panic (fmt=<unavailable>)
>      at /opt/src/git-src/sys/kern/kern_shutdown.c:894
> #5  0xffffffff80413937 in db_panic (addr=<optimized out>,
>      have_addr=<optimized out>, count=<optimized out>, modif=<optimized out>)
>      at /opt/src/git-src/sys/ddb/db_command.c:531
> #6  0xffffffff80412ddd in db_command (last_cmdp=<optimized out>,
>      cmd_table=<optimized out>, dopager=true)
>      at /opt/src/git-src/sys/ddb/db_command.c:504
> #7  0xffffffff80412a9d in db_command_loop ()
>      at /opt/src/git-src/sys/ddb/db_command.c:551
> #8  0xffffffff80416176 in db_trap (type=<optimized out>, code=<optimized out>)
>      at /opt/src/git-src/sys/ddb/db_main.c:268
> #9  0xffffffff80710993 in kdb_trap (type=type@entry=3, code=code@entry=0,
>      tf=tf@entry=0xfffffe00c53ef340) at /opt/src/git-src/sys/kern/subr_kdb.c:790
> #10 0xffffffff80a72cb9 in trap (frame=0xfffffe00c53ef340)
>      at /opt/src/git-src/sys/amd64/amd64/trap.c:608
> #11 <signal handler called>
> #12 kdb_enter (why=<optimized out>, msg=<optimized out>)
>      at /opt/src/git-src/sys/kern/subr_kdb.c:556
> #13 kdb_break () at /opt/src/git-src/sys/kern/subr_kdb.c:346
> #14 0xffffffff804dc7f7 in uart_intr_break (arg=0xfffff80006b84400)
>      at /opt/src/git-src/sys/dev/uart/uart_core.c:298
> #15 uart_intr (arg=0xfffff80006b84400)
>      at /opt/src/git-src/sys/dev/uart/uart_core.c:439
> #16 0xffffffff8067ae8b in intr_event_handle (ie=ie@entry=0xfffff80005814c00,
>      frame=frame@entry=0xfffffe00c53ef500)
>      at /opt/src/git-src/sys/kern/kern_intr.c:1410
> #17 0xffffffff80a37653 in intr_execute_handlers (isrc=0xfffff8000580c9c8,
>      frame=0xfffffe00c53ef500)
>      at /opt/src/git-src/sys/x86/x86/intr_machdep.c:354
> #18 <signal handler called>
> #19 lock_delay (la=la@entry=0xfffffe00c53ef600)
>      at /opt/src/git-src/sys/kern/subr_lock.c:124
> #20 0xffffffff806ccfb9 in _sx_xlock_hard (sx=sx@entry=0xfffff80123712340,
>      x=<optimized out>, opts=opts@entry=0,
>      file=file@entry=0xffffffff81910597 "/opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_znode.c", line=line@entry=949)
>      at /opt/src/git-src/sys/kern/kern_sx.c:693
> #21 0xffffffff806ccdb0 in _sx_xlock (sx=sx@entry=0xfffff80123712340,
>      opts=opts@entry=0,
>      file=0xffffffff81910597 "/opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_znode.c", line=line@entry=949)
>      at /opt/src/git-src/sys/kern/kern_sx.c:330
> #22 0xffffffff816b4ffa in zfs_zget (zfsvfs=zfsvfs@entry=0xfffff80123712000,
>      obj_num=80130, zpp=zpp@entry=0xfffffe00c53ef760)
>      at /opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_znode.c:949
> #23 0xffffffff816a018d in zfs_dirent_lookup (dzp=dzp@entry=0xfffff80131e6dae0,
>      name=0xfffffe00c53ef8c0 "xlocale", zpp=zpp@entry=0xfffffe00c53ef7b0,
>      flag=flag@entry=2)
>      at /opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_dir.c:190
> #24 0xffffffff816a028f in zfs_dirlook (dzp=dzp@entry=0xfffff80131e6dae0,
>      name=0xfffffe00c4bbdc80 "", name@entry=0xfffffe00c53ef8c0 "xlocale",
>      zpp=zpp@entry=0xfffffe00c53ef860)
>      at /opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_dir.c:246
> #25 0xffffffff816b0d20 in zfs_lookup (dvp=0xfffff800c2da9380,
>      nm=nm@entry=0xfffffe00c53ef8c0 "xlocale", vpp=0xfffffe00c53efd10,
>      cnp=cnp@entry=0xfffffe00c53efd38, nameiop=0, cr=<optimized out>, flags=0,
>      cached=1)
>      at /opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:923
> #26 0xffffffff816ab907 in zfs_freebsd_lookup (ap=0xfffffe00c53ef9f0, cached=1)
>      at /opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:4558
> #27 zfs_freebsd_cachedlookup (ap=0xfffffe00c53ef9f0)
>      at /opt/src/git-src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:4566
> #28 0xffffffff80797d86 in VOP_CACHEDLOOKUP (dvp=0xfffff800c2da9380,
>      vpp=0xfffffe00c53efd10, cnp=0xfffffe00c53efd38) at ./vnode_if.h:101
> #29 vfs_cache_lookup (ap=<optimized out>)
>      at /opt/src/git-src/sys/kern/vfs_cache.c:3068
> #30 0xffffffff807a9487 in VOP_LOOKUP (dvp=0xfffff800c2da9380,
>      vpp=0xfffffe00c53efd10, cnp=0xfffffe00c53efd38) at ./vnode_if.h:67
> #31 vfs_lookup (ndp=ndp@entry=0xfffffe00c53efcb8)
>      at /opt/src/git-src/sys/kern/vfs_lookup.c:1252
> #32 0xffffffff807a8531 in namei (ndp=ndp@entry=0xfffffe00c53efcb8)
>      at /opt/src/git-src/sys/kern/vfs_lookup.c:684
> #33 0xffffffff807d0095 in vn_open_cred (ndp=ndp@entry=0xfffffe00c53efcb8,
>      flagp=flagp@entry=0xfffffe00c53efdc4, cmode=cmode@entry=0,
>      vn_open_flags=vn_open_flags@entry=16, cred=0xfffff801ea492800,
>      fp=0xfffff801b9a54690) at /opt/src/git-src/sys/kern/vfs_vnops.c:331
> #34 0xffffffff807c61f7 in kern_openat (td=0xfffffe00c4eca000, fd=-100,
>      path=0xffff8a24 <error: Cannot access memory at address 0xffff8a24>,
>      pathseg=UIO_USERSPACE, flags=1048577, mode=<optimized out>)
>      at /opt/src/git-src/sys/kern/vfs_syscalls.c:1165
> #35 0xffffffff80a993a6 in syscallenter (td=0xfffffe00c4eca000)
>      at /opt/src/git-src/sys/amd64/ia32/../../kern/subr_syscall.c:187
> #36 ia32_syscall (frame=0xfffffe00c53eff40)
>      at /opt/src/git-src/sys/amd64/ia32/ia32_syscall.c:222
> #37 0xffffffff80a47f8f in int0x80_syscall_common ()
>      at /opt/src/git-src/sys/amd64/ia32/ia32_exception.S:75
> #38 0x0000000026a12000 in ?? ()
> #39 0x0000000000000000 in ?? ()
> (kgdb)

Cy, considering that the machine didn't crash or deadlock immediately, and 
that the only trace you've got from it shows just a thread waiting on a 
lock in the VFS layer, it could be anything, including something in the 
vnode life cycle.  If you have more than one CPU, then a trace from the 
one arbitrary CPU where the serial interrupt is routed may show only some 
consequences, not the source of the problem.  To diagnose some deadlocks 
we would need something like `procstat -akk`.
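For reference, a sketch of how that capture might be done on the hung box (the output path and the quick frequency count at the end are just illustrations, not a prescribed procedure):

```shell
# Dump kernel stack traces for all processes: -a selects every process,
# and doubling -k makes procstat resolve the full in-kernel call chain
# for each thread rather than just the top frame.
procstat -akk > /var/tmp/procstat-akk.txt

# In a deadlock, many threads tend to pile up in the same lock primitive;
# a rough frequency count of stack lines can make the hot spot stand out.
grep -c 'sx_xlock' /var/tmp/procstat-akk.txt
```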

I asked you in a private email yesterday to try an experiment: set 
sync=disabled on your pools/datasets to bypass the ZIL code and see 
whether the errors persist.  It means you may lose a few seconds of 
updates in case of a crash, so I don't propose doing it normally, but if 
you still see the problems after that, it may mean there is more than one 
problem and they should be handled separately.
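Concretely, the experiment would look something like this (the dataset name "tank/poudriere" is hypothetical; substitute the pools/datasets the builds actually run on):

```shell
# Check the current setting first so it can be restored later.
zfs get sync tank/poudriere

# Bypass the ZIL for this dataset: data is still consistent on every
# transaction-group commit, but up to a few seconds of synchronous
# writes can be lost if the machine crashes or loses power.
zfs set sync=disabled tank/poudriere

# Once the experiment is over, revert to the inherited/default behavior.
zfs inherit sync tank/poudriere
```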

-- 
Alexander Motin