Re: git: 315ee00fa961 - main - zfs: merge openzfs/zfs@804414aad
Date: Thu, 31 Aug 2023 23:32:28 UTC
In message <20230831223526.DCB701A1@slippy.cwsent.com>, Cy Schubert writes: > In message <a5c51f3f-8c7f-8bd5-f718-72bc33fe22ed@FreeBSD.org>, Alexander > Motin > writes: > > On 31.08.2023 13:53, Cy Schubert wrote: > > > One thing that circumvents my two problems is reducing poudriere bulk job > s > > > from 8 to 5 on my 4 core machines. > > > > Cy, I have no real evidences to think it is related, other than your > > panics look like some memory corruptions, but could you try is patch: > > https://github.com/openzfs/zfs/pull/15228 . If it won't do the trick, > > then I am out of ideas without additional input. > > So far so good. Poudriere has been running with a decent -J jobs on both > machines for over an hour. I'll let you know if they survive the night. It > can take some time before the panics happen though. > > The problem is more likely to occur when there are a lot of small package > builds than large long running jobs, probably because of the parallel ZFS > dataset creations, deletions, and rollbacks. > > > > > Gleb, you may try to add this too, just as a choice between impossible > > and improbable. > > > > -- > > Alexander Motin > > > -- > Cheers, > Cy Schubert <Cy.Schubert@cschubert.com> > FreeBSD UNIX: <cy@FreeBSD.org> Web: https://FreeBSD.org > NTP: <cy@nwtime.org> Web: https://nwtime.org > > e^(i*pi)+1=0 > > One of the two machines is hung. cwfw# ping bob PING bob (10.1.1.7): 56 data bytes ^C --- bob ping statistics --- 2 packets transmitted, 0 packets received, 100.0% packet loss cwfw# console bob [Enter `^Ec?' for help] [halt sent] KDB: enter: Break to debugger [ thread pid 31259 tid 100913 ] Stopped at kdb_break+0x48: movq $0,0xa1069d(%rip) db> bt Tracing pid 31259 tid 100913 td 0xfffffe00c4eca000 kdb_break() at kdb_break+0x48/frame 0xfffffe00c53ef2d0 uart_intr() at uart_intr+0xf7/frame 0xfffffe00c53ef310 intr_event_handle() at intr_event_handle+0x12b/frame 0xfffffe00c53ef380 intr_execute_handlers() at intr_execute_handlers+0x63/frame 0xfffffe00c53ef3b0 Xapic_isr1() at Xapic_isr1+0xdc/frame 0xfffffe00c53ef3b0 --- interrupt, rip = 0xffffffff806d5c70, rsp = 0xfffffe00c53ef480, rbp = 0xfffffe00c53ef480 --- getbinuptime() at getbinuptime+0x30/frame 0xfffffe00c53ef480 arc_access() at arc_access+0x250/frame 0xfffffe00c53ef4d0 arc_buf_access() at arc_buf_access+0xd0/frame 0xfffffe00c53ef4f0 dbuf_hold_impl() at dbuf_hold_impl+0xf3/frame 0xfffffe00c53ef580 dbuf_hold() at dbuf_hold+0x25/frame 0xfffffe00c53ef5b0 dnode_hold_impl() at dnode_hold_impl+0x194/frame 0xfffffe00c53ef670 dmu_bonus_hold() at dmu_bonus_hold+0x20/frame 0xfffffe00c53ef6a0 zfs_zget() at zfs_zget+0x20d/frame 0xfffffe00c53ef750 zfs_dirent_lookup() at zfs_dirent_lookup+0x16d/frame 0xfffffe00c53ef7a0 zfs_dirlook() at zfs_dirlook+0x7f/frame 0xfffffe00c53ef7d0 zfs_lookup() at zfs_lookup+0x3c0/frame 0xfffffe00c53ef8a0 zfs_freebsd_cachedlookup() at zfs_freebsd_cachedlookup+0x67/frame 0xfffffe00c53ef9e0 vfs_cache_lookup() at vfs_cache_lookup+0xa6/frame 0xfffffe00c53efa30 vfs_lookup() at vfs_lookup+0x457/frame 0xfffffe00c53efac0 namei() at namei+0x2e1/frame 0xfffffe00c53efb20 vn_open_cred() at vn_open_cred+0x505/frame 0xfffffe00c53efca0 kern_openat() at kern_openat+0x287/frame 0xfffffe00c53efdf0 ia32_syscall() at ia32_syscall+0x156/frame 0xfffffe00c53eff30 int0x80_syscall_common() at int0x80_syscall_common+0x9c/frame 0xffff89dc db> I'll let it continue. Hopefully the watchdog timer will pop and we get a dump. -- Cheers, Cy Schubert <Cy.Schubert@cschubert.com> FreeBSD UNIX: <cy@FreeBSD.org> Web: https://FreeBSD.org NTP: <cy@nwtime.org> Web: https://nwtime.org e^(i*pi)+1=0