poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder
Date: Wed, 23 Aug 2023 21:50:38 UTC
[Forked off the ZFS deadlock 14 discussion, per feedback.]

On Aug 23, 2023, at 11:40, Alexander Motin <mav@FreeBSD.org> wrote:

> On 22.08.2023 14:24, Mark Millard wrote:
>> Alexander Motin <mav_at_FreeBSD.org> wrote on
>> Date: Tue, 22 Aug 2023 16:18:12 UTC :
>>> I am waiting for final test results from George Wilson and then will
>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>> there are still not many reviewers for the PR, since the code is not
>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>> have tested and/or reviewed the PR, you may comment on it.
>>
>> I had written to the list that when I tried to test the system
>> doing poudriere builds (initially with your patches) using
>> USE_TMPFS=no so that zfs had to deal with all the file I/O, I
>> instead got only one builder that ended up active, the others
>> never reaching "Builder started":
>>
>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>> . . .
>>   362  0 root  40  0  27076Ki  13776Ki  CPU19   19  4:23  0.00% cpdup -i0 -o ref 32
>>   349  0 root  53  0  27076Ki  13776Ki  vlruwk  22  4:20  0.01% cpdup -i0 -o ref 31
>>   328  0 root  68  0  27076Ki  13804Ki  vlruwk   8  4:30  0.01% cpdup -i0 -o ref 30
>>   304  0 root  37  0  27076Ki  13792Ki  vlruwk   6  4:18  0.01% cpdup -i0 -o ref 29
>>   282  0 root  42  0  33220Ki  13956Ki  vlruwk   8  4:33  0.01% cpdup -i0 -o ref 28
>>   242  0 root  56  0  27076Ki  13796Ki  vlruwk   4  4:28  0.00% cpdup -i0 -o ref 27
>> . . .
>> But those processes did show CPU?? on occasion, as well as
>> *vnode less often. None of the cpdup's was stuck in
>> Removing your patches did not change the behavior.
>
> Mark, to me "vlruwk" looks like a limit on number of vnodes. I was not
> deep in that area at least recently, so somebody with more experience
> there could try to diagnose it. At very least it does not look related
> to the ZIL issue discussed in this thread, at least with the
> information provided, so I am not surprised that the mentioned patches
> do not affect it.

I did the above intending to test the deadlock in my context, but I ended up
not getting that far when I tried to make zfs handle all the file I/O
(USE_TMPFS=no and no other use of tmpfs or the like).

The zfs context is a simple single partition on the boot media. I use ZFS
for bectl BE use, not for the other typical reasons. The media here is PCIe
Optane 1.4T media. The machine is a ThreadRipper 1950X, so first generation.
128 GiBytes of RAM. 491520 MiBytes of swap, also on that Optane.

# uname -apKU
FreeBSD amd64-ZFS 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #112 main-n264912-b1d3e2b77155-dirty: Sun Aug 20 10:01:48 PDT 2023     root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1400096 1400096

The GENERIC-DBG variant of the kernel did not report any issues in earlier
testing. The later-referenced /usr/obj/DESTDIRs/main-amd64-poud-bulk_a was
installed from the same build.
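For reference, the only poudriere.conf setting specific to this experiment is
USE_TMPFS=no. The sketch below just shows where that setting lives; the
ZPOOL/ZROOTFS lines are assumptions for illustration (inferred from the zfs
list output that follows), not a verbatim copy of the configuration used:

# /usr/local/etc/poudriere.conf (excerpt, partly assumed)
ZPOOL=zoptb          # assumed: pool holding the poudriere datasets listed below
ZROOTFS=/poudriere   # assumed: corresponds to zoptb/poudriere
USE_TMPFS=no         # the setting under test: builders do all file I/O on ZFS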
# zfs list
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
zoptb                                        79.9G   765G    96K  /zoptb
zoptb/BUILDs                                 20.5G   765G  8.29M  /usr/obj/BUILDs
zoptb/BUILDs/alt-main-amd64-dbg-clang-alt    1.86M   765G  1.86M  /usr/obj/BUILDs/alt-main-amd64-dbg-clang-alt
zoptb/BUILDs/alt-main-amd64-nodbg-clang-alt  30.2M   765G  30.2M  /usr/obj/BUILDs/alt-main-amd64-nodbg-clang-alt
zoptb/BUILDs/main-amd64-dbg-clang            9.96G   765G  9.96G  /usr/obj/BUILDs/main-amd64-dbg-clang
zoptb/BUILDs/main-amd64-dbg-gccxtc           38.5M   765G  38.5M  /usr/obj/BUILDs/main-amd64-dbg-gccxtc
zoptb/BUILDs/main-amd64-nodbg-clang          10.3G   765G  10.3G  /usr/obj/BUILDs/main-amd64-nodbg-clang
zoptb/BUILDs/main-amd64-nodbg-clang-alt      37.2M   765G  37.2M  /usr/obj/BUILDs/main-amd64-nodbg-clang-alt
zoptb/BUILDs/main-amd64-nodbg-gccxtc         94.6M   765G  94.6M  /usr/obj/BUILDs/main-amd64-nodbg-gccxtc
zoptb/DESTDIRs                               4.33G   765G   104K  /usr/obj/DESTDIRs
zoptb/DESTDIRs/main-amd64-poud               2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud
zoptb/DESTDIRs/main-amd64-poud-bulk_a        2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
zoptb/ROOT                                   13.1G   765G    96K  none
zoptb/ROOT/build_area_for-main-amd64         5.03G   765G  3.24G  none
zoptb/ROOT/main-amd64                        8.04G   765G  3.23G  none
zoptb/poudriere                              6.58G   765G   112K  /usr/local/poudriere
zoptb/poudriere/data                         6.58G   765G   128K  /usr/local/poudriere/data
zoptb/poudriere/data/.m                       112K   765G   112K  /usr/local/poudriere/data/.m
zoptb/poudriere/data/cache                   17.4M   765G  17.4M  /usr/local/poudriere/data/cache
zoptb/poudriere/data/images                    96K   765G    96K  /usr/local/poudriere/data/images
zoptb/poudriere/data/logs                    2.72G   765G  2.72G  /usr/local/poudriere/data/logs
zoptb/poudriere/data/packages                3.84G   765G  3.84G  /usr/local/poudriere/data/packages
zoptb/poudriere/data/wrkdirs                  112K   765G   112K  /usr/local/poudriere/data/wrkdirs
zoptb/poudriere/jails                          96K   765G    96K  /usr/local/poudriere/jails
zoptb/poudriere/ports                          96K   765G    96K  /usr/local/poudriere/ports
zoptb/tmp                                    68.5M   765G  68.5M  /tmp
zoptb/usr                                    35.1G   765G    96K  /usr
zoptb/usr/13_0R-src                          2.64G   765G  2.64G  /usr/13_0R-src
zoptb/usr/alt-main-src                         96K   765G    96K  /usr/alt-main-src
zoptb/usr/home                                181M   765G   181M  /usr/home
zoptb/usr/local                              5.08G   765G  5.08G  /usr/local
zoptb/usr/main-src                            833M   765G   833M  /usr/main-src
zoptb/usr/ports                              26.4G   765G  26.4G  /usr/ports
zoptb/usr/src                                  96K   765G    96K  /usr/src
zoptb/var                                    52.6M   765G    96K  /var
zoptb/var/audit                               356K   765G   356K  /var/audit
zoptb/var/crash                               128K   765G   128K  /var/crash
zoptb/var/db                                 49.7M   765G    96K  /var/db
zoptb/var/db/pkg                             49.4M   765G  49.4M  /var/db/pkg
zoptb/var/db/ports                            164K   765G   164K  /var/db/ports
zoptb/var/log                                1.61M   765G  1.61M  /var/log
zoptb/var/mail                                632K   765G   632K  /var/mail
zoptb/var/tmp                                 128K   765G   128K  /var/tmp

# poudriere jail -jmain-amd64-bulk_a -i
Jail name:         main-amd64-bulk_a
Jail version:      14.0-ALPHA2
Jail arch:         amd64
Jail method:       null
Jail mount:        /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
Jail fs:
Jail updated:      2021-12-04 14:55:22
Jail pkgbase:      disabled

So, below is another test set up, with related information shown before,
during, and after. The sysctl output is from a different ssh session than
the one running the bulk -a.
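The vnode-related sysctl values below were captured by hand at a few points.
For continuous sampling instead, a small sh loop along these lines (a sketch,
not something used for this report) would record the same counters every 30
seconds:

# while true; do date; sysctl kern.maxvnodes vfs.numvnodes vfs.freevnodes vfs.vnode_alloc_sleeps vfs.cache.stats.heldvnodes; sleep 30; done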
# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 0
vm.stats.vm.v_vnodepgsin: 272429
vm.stats.vm.v_vnodeout: 0
vm.stats.vm.v_vnodein: 12461
vfs.vnode_alloc_sleeps: 0
vfs.wantfreevnodes: 553452
vfs.freevnodes: 962766
vfs.vnodes_created: 2538980
vfs.numvnodes: 1056233
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 91878
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

# poudriere bulk -jmain-amd64-bulk_a -a
. . .
[00:01:34] Building 34042 packages using up to 32 builders
[00:01:34] Hit CTRL+t at any time to see build progress and stats
[00:01:34] [01] [00:00:00] Builder starting
[00:01:57] [01] [00:00:23] Builder started
[00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
[00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
[00:03:22] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:03:22] [02] [00:00:00] Builder starting
[00:03:22] [03] [00:00:00] Builder starting
. . .
[00:03:22] [31] [00:00:00] Builder starting
[00:03:22] [32] [00:00:00] Builder starting
[00:03:31] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
. . .

Note that only [01] makes progress: no new "Builder started" notices occur.
top shows 31 processes of the pattern:

cpdup -i0 -o ref ??

Then, while the 31 cpdup's show vlruwk most of the time:

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 582398
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34296
vfs.vnode_alloc_sleeps: 2994
vfs.wantfreevnodes: 553452
vfs.freevnodes: 2209662
vfs.vnodes_created: 12206299
vfs.numvnodes: 2214071
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 459
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

After waiting a while, the cpdup's still mostly show vlruwk:

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 583527
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34396
vfs.vnode_alloc_sleeps: 8053
vfs.wantfreevnodes: 553452
vfs.freevnodes: 2210166
vfs.vnodes_created: 12212061
vfs.numvnodes: 2215106
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 497
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off
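At this point vfs.numvnodes (2215106) is sitting at (slightly over)
kern.maxvnodes (2213808) while vfs.vnode_alloc_sleeps keeps climbing
(0 -> 2994 -> 8053), which fits the suggestion that the cpdup's are blocked
waiting for vnode allocation against that limit. One obvious experiment, not
tried for this report, would be to raise the limit and rerun; the value below
is just an illustrative doubling of the autotuned figure, not a recommendation:

# sysctl kern.maxvnodes
kern.maxvnodes: 2213808
# sysctl kern.maxvnodes=4427616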
Eventually I interrupted the bulk run:

^C[00:14:55] Error: Signal SIGINT caught, cleaning up and exiting

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 584474
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34591
vfs.vnode_alloc_sleeps: 17584
vfs.wantfreevnodes: 553452
vfs.freevnodes: 2210796
vfs.vnodes_created: 12222343
vfs.numvnodes: 2216564
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 539
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

[main-amd64-bulk_a-default] [2023-08-23_13h58m08s] [sigint:]
Queued: 34435  Built: 2  Failed: 0  Skipped: 35  Ignored: 358  Fetched: 0  Tobuild: 34040   Time: 00:14:36
[00:16:13] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-08-23_13h58m08s
[00:16:49] Cleaning up

load: 5.28  cmd: sh 77057 [vlruwk] 141.63r 0.00u 30.98s 28% 6932k
#0 0xffffffff80b76ebb at mi_switch+0xbb
#1 0xffffffff80bc960f at sleepq_timedwait+0x2f
#2 0xffffffff80b76610 at _sleep+0x1d0
#3 0xffffffff80c5b2dc at vn_alloc_hard+0x2ac
#4 0xffffffff80c50a12 at getnewvnode_reserve+0x92
#5 0xffffffff829afb12 at zfs_zget+0x22
#6 0xffffffff8299ca8d at zfs_dirent_lookup+0x16d
#7 0xffffffff8299cb5f at zfs_dirlook+0x7f
#8 0xffffffff829ac410 at zfs_lookup+0x350
#9 0xffffffff829a782a at zfs_freebsd_cachedlookup+0x6a
#10 0xffffffff80c368ad at vfs_cache_lookup+0xad
#11 0xffffffff80c3b6d8 at cache_fplookup_final_modifying+0x188
#12 0xffffffff80c38766 at cache_fplookup+0x356
#13 0xffffffff80c43fb2 at namei+0x112
#14 0xffffffff80c62e5b at kern_funlinkat+0x13b
#15 0xffffffff80c62d18 at sys_unlink+0x28
#16 0xffffffff83b8e583 at filemon_wrapper_unlink+0x13
#17 0xffffffff81049a79 at amd64_syscall+0x109

[00:26:28] Unmounting file systems
Exiting with status 1

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 585384
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34798
vfs.vnode_alloc_sleeps: 27578
vfs.wantfreevnodes: 553452
vfs.freevnodes: 61362
vfs.vnodes_created: 20135479
vfs.numvnodes: 59860
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 208
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

For reference (from after):

# kldstat
Id Refs Address                Size Name
 1   95 0xffffffff80200000  274b308 kernel
 2    1 0xffffffff8294c000   5d5238 zfs.ko
 3    1 0xffffffff82f22000     7718 cryptodev.ko
 4    1 0xffffffff83b10000     3390 acpi_wmi.ko
 5    1 0xffffffff83b14000     3220 intpm.ko
 6    1 0xffffffff83b18000     2178 smbus.ko
 7    1 0xffffffff83b1b000     2240 cpuctl.ko
 8    1 0xffffffff83b1e000     3360 uhid.ko
 9    1 0xffffffff83b22000     4364 ums.ko
10    1 0xffffffff83b27000     33c0 usbhid.ko
11    1 0xffffffff83b2b000     3380 hidbus.ko
12    1 0xffffffff83b2f000     4d20 ng_ubt.ko
13    6 0xffffffff83b34000     abb8 netgraph.ko
14    2 0xffffffff83b3f000     a250 ng_hci.ko
15    4 0xffffffff83b4a000     2670 ng_bluetooth.ko
16    1 0xffffffff83b4d000     83a0 uftdi.ko
17    1 0xffffffff83b56000     4e58 ucom.ko
18    1 0xffffffff83b5b000     3360 wmt.ko
19    1 0xffffffff83b5f000     e268 ng_l2cap.ko
20    1 0xffffffff83b6e000    1bf68 ng_btsocket.ko
21    1 0xffffffff83b8a000     38f8 ng_socket.ko
22    1 0xffffffff83b8e000     3250 filemon.ko
23    1 0xffffffff83b92000     4758 nullfs.ko
24    1 0xffffffff83b97000     73c0 linprocfs.ko
25    3 0xffffffff83b9f000     be70 linux_common.ko
26    1 0xffffffff83bab000     3558 fdescfs.ko
27    1 0xffffffff83baf000    31b20 linux.ko
28    1 0xffffffff83be1000    2ed40 linux64.ko

Note that before the "Cleaning up" notice, vfs.freevnodes was around 2210796
(for example), but after "Exiting with status 1" it was 61362.
vfs.vnodes_created shows a similar staged change: in the ballpark of 12222343
before, then 20135479 after. Similarly, vfs.numvnodes went 2216564 -> 59860.

Anything else I should gather and report as basic information?

===
Mark Millard
marklmi at yahoo.com