[Bug 261448] ZFS VOP_RECLAIM causes long stalls during package builds
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Mon, 24 Jan 2022 22:13:05 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261448 Bug ID: 261448 Summary: ZFS VOP_RECLAIM causes long stalls during package builds Product: Base System Version: CURRENT Hardware: Any OS: Any Status: New Severity: Affects Only Me Priority: --- Component: kern Assignee: bugs@FreeBSD.org Reporter: markj@FreeBSD.org I noticed that a system running "poudriere bulk -a" often appears to stall for several seconds. During a stall, all CPUs are nearly idle even though the system is ostensibly building NCPU packages in parallel. Digging a bit, the problem happens when a package build finishes and poudriere does a "zfs rollback" of some dataset. The kernel implementation of zfs rollback calls zfsvfs_teardown(), which uses the per-dataset "teardown lock" to suspend all VOPs on the dataset while it walks the list of vnodes associated with the mount point (multiple times) to invalidate the name cache and cached data. Any user threads performing operations on files in the dataset are blocked for this period. The second part of the problem is getnewvnode(), which may call vn_alloc_hard() to reclaim a used vnode if the total number of vnodes in the system is above the "desiredvnodes" threshold. vn_alloc_hard() performs direct reclamation from the vnode free list, and as a part of this recycling it must call VOP_RECLAIM. ZFS' VOP_RECLAIM implementation, like all VOPs, acquires the teardown lock, so it can block for a long time. If we get unlucky and a batch of vnodes belonging to a suspended dataset are at the head of the free list, then the system very quickly gets stuck until the rollback completes. Logging CPU usage during poudriere runs shows lots of idle CPU caused by this problem. The amount varies depending on how quickly package builds are finishing. Logging output from "vmstat 1" overnight gives: $ cat vmstat-log | awk '/^[0-9]/{user += $(NF - 2); sys += $(NF - 1); idle += $(NF)} END{printf "user %d sys %d idle %d total %d\n", user, sys, idle, user + sys + idle}' user 2123240 sys 259657 idle 126830 total 2509727 so something like 5% of CPU is lost. Data collected over a shorter time period during which there were lots of small package builds shows it can be much worse: user 112162 sys 25396 idle 47896 total 185454 -- You are receiving this mail because: You are the assignee for the bug.