Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder
- Reply: Mark Millard : "Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder"
- In reply to: Mark Millard : "poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder"
Date: Wed, 23 Aug 2023 22:10:11 UTC
On 8/23/23, Mark Millard <marklmi@yahoo.com> wrote:
> [Forked off the ZFS deadlock 14 discussion, per feedback.]
>
> On Aug 23, 2023, at 11:40, Alexander Motin <mav@FreeBSD.org> wrote:
>
>> On 22.08.2023 14:24, Mark Millard wrote:
>>> Alexander Motin <mav_at_FreeBSD.org> wrote on
>>> Date: Tue, 22 Aug 2023 16:18:12 UTC :
>>>> I am waiting for final test results from George Wilson and then will
>>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>>> there are still not many reviewers for the PR, since the code is not
>>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>>> have tested and/or reviewed the PR, you may comment on it.
>>> I had written to the list that when I tried to test the system
>>> doing poudriere builds (initially with your patches) using
>>> USE_TMPFS=no so that zfs had to deal with all the file I/O, I
>>> instead got only one builder that ended up active, the others
>>> never reaching "Builder started":
>>
>>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>>> . . .
>>> 362  0 root  40  0  27076Ki  13776Ki CPU19   19  4:23  0.00% cpdup -i0 -o ref 32
>>> 349  0 root  53  0  27076Ki  13776Ki vlruwk  22  4:20  0.01% cpdup -i0 -o ref 31
>>> 328  0 root  68  0  27076Ki  13804Ki vlruwk   8  4:30  0.01% cpdup -i0 -o ref 30
>>> 304  0 root  37  0  27076Ki  13792Ki vlruwk   6  4:18  0.01% cpdup -i0 -o ref 29
>>> 282  0 root  42  0  33220Ki  13956Ki vlruwk   8  4:33  0.01% cpdup -i0 -o ref 28
>>> 242  0 root  56  0  27076Ki  13796Ki vlruwk   4  4:28  0.00% cpdup -i0 -o ref 27
>>> . . .
>>> But those processes did show CPU?? on occasion, as well as
>>> *vnode less often. None of the cpdup's was stuck in
>>> Removing your patches did not change the behavior.
>>
>> Mark, to me "vlruwk" looks like a limit on number of vnodes. I was not
>> deep in that area at least recently, so somebody with more experience
>> there could try to diagnose it. At very least it does not look related
>> to the ZIL issue discussed in this thread, at least with the
>> information provided, so I am not surprised that the mentioned patches
>> do not affect it.
>
> I did the above intending to test the deadlock in my context but
> ended up not getting that far when I tried to make zfs handle all
> the file I/O (USE_TMPFS=no and no other use of tmpfs or the like).
>
> The zfs context is a simple single partition on the boot media. I
> use ZFS for bectl BE use, not for other typical reasons. The media
> here is PCIe Optane 1.4T media. The machine is a ThreadRipper
> 1950X, so first generation. 128 GiBytes of RAM. 491520 MiBytes of
> swap, also on that Optane.
>
> # uname -apKU
> FreeBSD amd64-ZFS 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #112
> main-n264912-b1d3e2b77155-dirty: Sun Aug 20 10:01:48 PDT 2023
> root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
> amd64 amd64 1400096 1400096
>
> The GENERIC-DBG variant of the kernel did not report any issues in
> earlier testing.
>
> The later referenced /usr/obj/DESTDIRs/main-amd64-poud-bulk_a was
> installed from the same build.
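>
> (If a per-process kernel stack is wanted for one of the stalled cpdup
> processes, procstat from another session should be able to grab it; the
> pid here is just the example 349 from the top output above, nothing
> special about it:)
>
> # procstat -kk 349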
>
> # zfs list
> NAME                                          USED  AVAIL  REFER  MOUNTPOINT
> zoptb                                        79.9G   765G    96K  /zoptb
> zoptb/BUILDs                                 20.5G   765G  8.29M  /usr/obj/BUILDs
> zoptb/BUILDs/alt-main-amd64-dbg-clang-alt    1.86M   765G  1.86M  /usr/obj/BUILDs/alt-main-amd64-dbg-clang-alt
> zoptb/BUILDs/alt-main-amd64-nodbg-clang-alt  30.2M   765G  30.2M  /usr/obj/BUILDs/alt-main-amd64-nodbg-clang-alt
> zoptb/BUILDs/main-amd64-dbg-clang            9.96G   765G  9.96G  /usr/obj/BUILDs/main-amd64-dbg-clang
> zoptb/BUILDs/main-amd64-dbg-gccxtc           38.5M   765G  38.5M  /usr/obj/BUILDs/main-amd64-dbg-gccxtc
> zoptb/BUILDs/main-amd64-nodbg-clang          10.3G   765G  10.3G  /usr/obj/BUILDs/main-amd64-nodbg-clang
> zoptb/BUILDs/main-amd64-nodbg-clang-alt      37.2M   765G  37.2M  /usr/obj/BUILDs/main-amd64-nodbg-clang-alt
> zoptb/BUILDs/main-amd64-nodbg-gccxtc         94.6M   765G  94.6M  /usr/obj/BUILDs/main-amd64-nodbg-gccxtc
> zoptb/DESTDIRs                               4.33G   765G   104K  /usr/obj/DESTDIRs
> zoptb/DESTDIRs/main-amd64-poud               2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud
> zoptb/DESTDIRs/main-amd64-poud-bulk_a        2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
> zoptb/ROOT                                   13.1G   765G    96K  none
> zoptb/ROOT/build_area_for-main-amd64         5.03G   765G  3.24G  none
> zoptb/ROOT/main-amd64                        8.04G   765G  3.23G  none
> zoptb/poudriere                              6.58G   765G   112K  /usr/local/poudriere
> zoptb/poudriere/data                         6.58G   765G   128K  /usr/local/poudriere/data
> zoptb/poudriere/data/.m                       112K   765G   112K  /usr/local/poudriere/data/.m
> zoptb/poudriere/data/cache                   17.4M   765G  17.4M  /usr/local/poudriere/data/cache
> zoptb/poudriere/data/images                    96K   765G    96K  /usr/local/poudriere/data/images
> zoptb/poudriere/data/logs                    2.72G   765G  2.72G  /usr/local/poudriere/data/logs
> zoptb/poudriere/data/packages                3.84G   765G  3.84G  /usr/local/poudriere/data/packages
> zoptb/poudriere/data/wrkdirs                  112K   765G   112K  /usr/local/poudriere/data/wrkdirs
> zoptb/poudriere/jails                          96K   765G    96K  /usr/local/poudriere/jails
> zoptb/poudriere/ports                          96K   765G    96K  /usr/local/poudriere/ports
> zoptb/tmp                                    68.5M   765G  68.5M  /tmp
> zoptb/usr                                    35.1G   765G    96K  /usr
> zoptb/usr/13_0R-src                          2.64G   765G  2.64G  /usr/13_0R-src
> zoptb/usr/alt-main-src                         96K   765G    96K  /usr/alt-main-src
> zoptb/usr/home                                181M   765G   181M  /usr/home
> zoptb/usr/local                              5.08G   765G  5.08G  /usr/local
> zoptb/usr/main-src                            833M   765G   833M  /usr/main-src
> zoptb/usr/ports                              26.4G   765G  26.4G  /usr/ports
> zoptb/usr/src                                  96K   765G    96K  /usr/src
> zoptb/var                                    52.6M   765G    96K  /var
> zoptb/var/audit                               356K   765G   356K  /var/audit
> zoptb/var/crash                               128K   765G   128K  /var/crash
> zoptb/var/db                                 49.7M   765G    96K  /var/db
> zoptb/var/db/pkg                             49.4M   765G  49.4M  /var/db/pkg
> zoptb/var/db/ports                            164K   765G   164K  /var/db/ports
> zoptb/var/log                                1.61M   765G  1.61M  /var/log
> zoptb/var/mail                                632K   765G   632K  /var/mail
> zoptb/var/tmp                                 128K   765G   128K  /var/tmp
>
> # poudriere jail -jmain-amd64-bulk_a -i
> Jail name:         main-amd64-bulk_a
> Jail version:      14.0-ALPHA2
> Jail arch:         amd64
> Jail method:       null
> Jail mount:        /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
> Jail fs:
> Jail updated:      2021-12-04 14:55:22
> Jail pkgbase:      disabled
>
>
> So, setting up another test with some related information
> shown before, during, and after. sysctl output is from
> another ssh session than the bulk -a run.
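>
> (A small loop like the following could instead sample the most relevant
> counters from that other session at a fixed interval; the 30 second
> interval and the particular sysctls are only illustrative:)
>
> while true
> do
>     date
>     sysctl kern.maxvnodes vfs.numvnodes vfs.freevnodes \
>         vfs.vnode_alloc_sleeps vfs.vnodes_created
>     sleep 30
> done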
>
> # sysctl -a | grep vnode
> kern.maxvnodes: 2213808
> kern.ipc.umtx_vnode_persistent: 0
> kern.minvnodes: 553452
> vm.vnode_pbufs: 2048
> vm.stats.vm.v_vnodepgsout: 0
> vm.stats.vm.v_vnodepgsin: 272429
> vm.stats.vm.v_vnodeout: 0
> vm.stats.vm.v_vnodein: 12461
> vfs.vnode_alloc_sleeps: 0
> vfs.wantfreevnodes: 553452
> vfs.freevnodes: 962766
> vfs.vnodes_created: 2538980
> vfs.numvnodes: 1056233
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 91878
> debug.vnode_domainset: <NULL>
> debug.sizeof.vnode: 448
> debug.fail_point.status_fill_kinfo_vnode__random_path: off
> debug.fail_point.fill_kinfo_vnode__random_path: off
>
> # poudriere bulk -jmain-amd64-bulk_a -a
> . . .
> [00:01:34] Building 34042 packages using up to 32 builders
> [00:01:34] Hit CTRL+t at any time to see build progress and stats
> [00:01:34] [01] [00:00:00] Builder starting
> [00:01:57] [01] [00:00:23] Builder started
> [00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
> [00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
> [00:03:22] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
> [00:03:22] [02] [00:00:00] Builder starting
> [00:03:22] [03] [00:00:00] Builder starting
> . . .
> [00:03:22] [31] [00:00:00] Builder starting
> [00:03:22] [32] [00:00:00] Builder starting
> [00:03:31] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1: Success
> [00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
> . . .
>
> Note that only [01] makes progress: no new "Builder started"
> notices occur. top shows 31 of the pattern:
> cpdup -i0 -o ref ??
>
> Then during the 31 cpdup's showing vlruwk most of the time:
>
> # sysctl -a | grep vnode
> kern.maxvnodes: 2213808
> kern.ipc.umtx_vnode_persistent: 0
> kern.minvnodes: 553452
> vm.vnode_pbufs: 2048
> vm.stats.vm.v_vnodepgsout: 22844
> vm.stats.vm.v_vnodepgsin: 582398
> vm.stats.vm.v_vnodeout: 890
> vm.stats.vm.v_vnodein: 34296
> vfs.vnode_alloc_sleeps: 2994
> vfs.wantfreevnodes: 553452
> vfs.freevnodes: 2209662
> vfs.vnodes_created: 12206299
> vfs.numvnodes: 2214071
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 459
> debug.vnode_domainset: <NULL>
> debug.sizeof.vnode: 448
> debug.fail_point.status_fill_kinfo_vnode__random_path: off
> debug.fail_point.fill_kinfo_vnode__random_path: off
>
> Wait a while but still the mostly cpdup vlruwk status:
>
> # sysctl -a | grep vnode
> kern.maxvnodes: 2213808
> kern.ipc.umtx_vnode_persistent: 0
> kern.minvnodes: 553452
> vm.vnode_pbufs: 2048
> vm.stats.vm.v_vnodepgsout: 22844
> vm.stats.vm.v_vnodepgsin: 583527
> vm.stats.vm.v_vnodeout: 890
> vm.stats.vm.v_vnodein: 34396
> vfs.vnode_alloc_sleeps: 8053
> vfs.wantfreevnodes: 553452
> vfs.freevnodes: 2210166
> vfs.vnodes_created: 12212061
> vfs.numvnodes: 2215106
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 497
> debug.vnode_domainset: <NULL>
> debug.sizeof.vnode: 448
> debug.fail_point.status_fill_kinfo_vnode__random_path: off
> debug.fail_point.fill_kinfo_vnode__random_path: off
>
> ^C[00:14:55] Error: Signal SIGINT caught, cleaning up and exiting
>
> # sysctl -a | grep vnode
> kern.maxvnodes: 2213808
> kern.ipc.umtx_vnode_persistent: 0
> kern.minvnodes: 553452
> vm.vnode_pbufs: 2048
> vm.stats.vm.v_vnodepgsout: 22844
> vm.stats.vm.v_vnodepgsin: 584474
> vm.stats.vm.v_vnodeout: 890
> vm.stats.vm.v_vnodein: 34591
> vfs.vnode_alloc_sleeps: 17584
> vfs.wantfreevnodes: 553452
> vfs.freevnodes: 2210796
> vfs.vnodes_created: 12222343
> vfs.numvnodes: 2216564
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 539
> debug.vnode_domainset: <NULL>
> debug.sizeof.vnode: 448
> debug.fail_point.status_fill_kinfo_vnode__random_path: off
> debug.fail_point.fill_kinfo_vnode__random_path: off
>
> [main-amd64-bulk_a-default] [2023-08-23_13h58m08s] [sigint:] Queued: 34435 Built: 2 Failed: 0 Skipped: 35 Ignored: 358 Fetched: 0 Tobuild: 34040 Time: 00:14:36
> [00:16:13] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-08-23_13h58m08s
> [00:16:49] Cleaning up
> load: 5.28  cmd: sh 77057 [vlruwk] 141.63r 0.00u 30.98s 28% 6932k
> #0 0xffffffff80b76ebb at mi_switch+0xbb
> #1 0xffffffff80bc960f at sleepq_timedwait+0x2f
> #2 0xffffffff80b76610 at _sleep+0x1d0
> #3 0xffffffff80c5b2dc at vn_alloc_hard+0x2ac
> #4 0xffffffff80c50a12 at getnewvnode_reserve+0x92
> #5 0xffffffff829afb12 at zfs_zget+0x22
> #6 0xffffffff8299ca8d at zfs_dirent_lookup+0x16d
> #7 0xffffffff8299cb5f at zfs_dirlook+0x7f
> #8 0xffffffff829ac410 at zfs_lookup+0x350
> #9 0xffffffff829a782a at zfs_freebsd_cachedlookup+0x6a
> #10 0xffffffff80c368ad at vfs_cache_lookup+0xad
> #11 0xffffffff80c3b6d8 at cache_fplookup_final_modifying+0x188
> #12 0xffffffff80c38766 at cache_fplookup+0x356
> #13 0xffffffff80c43fb2 at namei+0x112
> #14 0xffffffff80c62e5b at kern_funlinkat+0x13b
> #15 0xffffffff80c62d18 at sys_unlink+0x28
> #16 0xffffffff83b8e583 at filemon_wrapper_unlink+0x13
> #17 0xffffffff81049a79 at amd64_syscall+0x109
>
> [00:26:28] Unmounting file systems
> Exiting with status 1
>
> # sysctl -a | grep vnode
> kern.maxvnodes: 2213808
> kern.ipc.umtx_vnode_persistent: 0
> kern.minvnodes: 553452
> vm.vnode_pbufs: 2048
> vm.stats.vm.v_vnodepgsout: 22844
> vm.stats.vm.v_vnodepgsin: 585384
> vm.stats.vm.v_vnodeout: 890
> vm.stats.vm.v_vnodein: 34798
> vfs.vnode_alloc_sleeps: 27578
> vfs.wantfreevnodes: 553452
> vfs.freevnodes: 61362
> vfs.vnodes_created: 20135479
> vfs.numvnodes: 59860
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 208
> debug.vnode_domainset: <NULL>
> debug.sizeof.vnode: 448
> debug.fail_point.status_fill_kinfo_vnode__random_path: off
> debug.fail_point.fill_kinfo_vnode__random_path: off
>
>
> For reference (from after):
>
> # kldstat
> Id Refs Address                Size Name
>  1   95 0xffffffff80200000  274b308 kernel
>  2    1 0xffffffff8294c000   5d5238 zfs.ko
>  3    1 0xffffffff82f22000     7718 cryptodev.ko
>  4    1 0xffffffff83b10000     3390 acpi_wmi.ko
>  5    1 0xffffffff83b14000     3220 intpm.ko
>  6    1 0xffffffff83b18000     2178 smbus.ko
>  7    1 0xffffffff83b1b000     2240 cpuctl.ko
>  8    1 0xffffffff83b1e000     3360 uhid.ko
>  9    1 0xffffffff83b22000     4364 ums.ko
> 10    1 0xffffffff83b27000     33c0 usbhid.ko
> 11    1 0xffffffff83b2b000     3380 hidbus.ko
> 12    1 0xffffffff83b2f000     4d20 ng_ubt.ko
> 13    6 0xffffffff83b34000     abb8 netgraph.ko
> 14    2 0xffffffff83b3f000     a250 ng_hci.ko
> 15    4 0xffffffff83b4a000     2670 ng_bluetooth.ko
> 16    1 0xffffffff83b4d000     83a0 uftdi.ko
> 17    1 0xffffffff83b56000     4e58 ucom.ko
> 18    1 0xffffffff83b5b000     3360 wmt.ko
> 19    1 0xffffffff83b5f000     e268 ng_l2cap.ko
> 20    1 0xffffffff83b6e000    1bf68 ng_btsocket.ko
> 21    1 0xffffffff83b8a000     38f8 ng_socket.ko
> 22    1 0xffffffff83b8e000     3250 filemon.ko
> 23    1 0xffffffff83b92000     4758 nullfs.ko
> 24    1 0xffffffff83b97000     73c0 linprocfs.ko
> 25    3 0xffffffff83b9f000     be70 linux_common.ko
> 26    1 0xffffffff83bab000     3558 fdescfs.ko
> 27    1 0xffffffff83baf000    31b20 linux.ko
> 28    1 0xffffffff83be1000    2ed40 linux64.ko
>
>
> Note that before the "Cleaning up" notice, the vfs.freevnodes
> shows as being
> around (for example) 2210796. But after
> "Exiting with status": 61362. vfs.vnodes_created shows a
> similar staging: in the ball park of 12222343 before, but
> then a change to 20135479. Similarly, vfs.numvnodes:
> 2216564 -> 59860.
>
>
> Anything else I should gather and report as basic information?
>

This is a known problem, but it is unclear if you should be running
into it in this setup.

Can you try again, but this time *revert*
138a5dafba312ff39ce0eefdbe34de95519e600d, like so:

git revert 138a5dafba312ff39ce0eefdbe34de95519e600d

You may want to switch to a different branch first, for example:

git checkout -b vfstesting

-- 
Mateusz Guzik <mjguzik gmail.com>
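PS: roughly, and taking the /usr/main-src tree and GENERIC-NODBG kernel
config from the uname output above purely as examples (substitute your
own tree and usual build procedure), the full sequence would look
something like:

cd /usr/main-src
git checkout -b vfstesting
git revert 138a5dafba312ff39ce0eefdbe34de95519e600d
make -j 32 buildkernel KERNCONF=GENERIC-NODBG
make installkernel KERNCONF=GENERIC-NODBG
shutdown -r now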