[Bug 275594] High CPU usage by arc_prune; analysis and fix
Date: Thu, 14 Dec 2023 06:58:31 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275594

--- Comment #12 from Seigo Tanimura <seigo.tanimura@gmail.com> ---
(In reply to Seigo Tanimura from comment #10)

I have added the fix to enable the extra vnode recycling and tested with the
same setup.

Source on GitHub:
- Repo: https://github.com/altimeter-130ft/freebsd-freebsd-src
- Branches:
  - Fix: release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-interval-fix
  - Counters atop Fix:
    release/14.0.0/release-14_0_0-p2-topic-openzfs-arc_prune-interval-counters

Test setup:
The same as "Ongoing test" in bug #275594, comment #6.
- vfs.vnode.vnlru.max_free_per_call: 4000000 (== vfs.vnode.vnlru.max_free_per_call)
- vfs.zfs.arc.prune_interval: 1000 (my fix for the arc_prune interval enabled)
- vfs.vnode.vnlru.extra_recycle: 1 (extra vnode recycle fix enabled)

Build time: 06:50:05 (312 pkgs / hr)

Counters after completing the build, with some remarks:

# The iteration attempts in vnlru_free_impl().
# This includes the retries from the head of vnode_list.
vfs.vnode.free.free_attempt: 33934506866

# The number of the vnodes recycled successfully, including vtryrecycle().
vfs.vnode.free.free_success: 42945537

# The number of the successful recycles in phase 2 upon the VREG (regular file)
# vnodes.
# - cleanbuf_vmpage_only: the vnodes held by the clean bufs and resident VM
#   pages only.
# - cleanbuf_only: the vnodes held by the clean bufs only.
vfs.vnode.free.free_phase2_retry_reg_cleanbuf_vmpage_only: 845659
vfs.vnode.free.free_phase2_retry_reg_cleanbuf_only: 3

# The number of the iteration skips due to a held vnode ("phase 2" hereafter).
# NB the successful recycles in phase 2 are not included.
vfs.vnode.free.free_phase2_retry: 8923850577

# The number of the phase 2 skips upon the VREG vnodes.
vfs.vnode.free.free_phase2_retry_reg: 8085735334

# The number of the phase 2 skips upon the VREG vnodes in use.
# Almost all phase 2 skips upon VREG fell into this.
vfs.vnode.free.free_phase2_retry_reg_inuse: 8085733060

# The number of the successful recycles in phase 2 upon the VDIR (directory)
# vnodes.
# - free_phase2_retry_dir_nc_src_only: the vnodes held by the namecache
#   entries only.
vfs.vnode.free.free_phase2_retry_dir_nc_src_only: 2234194

# The number of the phase 2 skips upon the VDIR vnodes.
vfs.vnode.free.free_phase2_retry_dir: 834902819

# The number of the phase 2 skips upon the VDIR vnodes in use.
# Almost all phase 2 skips upon VDIR fell into this.
vfs.vnode.free.free_phase2_retry_dir_inuse: 834902780

Other findings:
- The behaviour of the arc_prune thread CPU usage was mostly the same.
  - The peak dropped by only a few percent, so this is not likely to be the
    essential fix.
- The namecache hit ratio degraded by about 10 - 20%.
  - Maybe the recycled vnodes are looked up again, especially the directories.

-----

The issue still essentially exists even with the extra vnode recycling.
Maybe the root cause is in ZFS rather than the OS.

There are some suspicious findings on the in-memory dnode behaviour during the
tests so far:
- vfs.zfs.arc_max does not enforce the max size of
  kstat.zfs.misc.arcstats.dnode_size.
  - vfs.zfs.arc_max: 4GB
  - vfs.zfs.arc.dnode_limit_percent: 10 (default)
  - sizeof(dnode_t): 808 bytes
    - Found by "vmstat -z | grep dnode_t".
  - kstat.zfs.misc.arcstats.arc_dnode_limit: 400MB (default:
    vfs.zfs.arc.dnode_limit_percent percent of vfs.zfs.arc_max)
    - ~495K dnodes.
  - kstat.zfs.misc.arcstats.dnode_size, max: ~1.8GB
    - ~2.2M dnodes.
    - Almost equal to the max observed number of the vnodes.
- The dnode_t zone of uma(9) does not have a limit.
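As a quick cross-check of the dnode figures above, the same numbers can be
recomputed from the live kstats.  A minimal sh sketch, assuming the 808-byte
dnode_t size reported by "vmstat -z | grep dnode_t" on this host (the size may
differ elsewhere):

# Recompute the dnode head-room from the arcstats kstats quoted above.
dnode_sz=808   # from "vmstat -z | grep dnode_t" on this host (assumption)
limit=$(sysctl -n kstat.zfs.misc.arcstats.arc_dnode_limit)
size=$(sysctl -n kstat.zfs.misc.arcstats.dnode_size)
echo "dnode limit:   $((limit / dnode_sz)) dnodes"   # ~495K with the 400MB default
echo "dnodes in ARC: $((size / dnode_sz)) dnodes"    # ~2.2M at the observed peak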
From the above, the number of the in-memory dnodes looks like the bottleneck.
Maybe the essential solution is to configure vfs.zfs.arc.dnode_limit explicitly
so that ZFS can hold all dnodes required by the application in memory.
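For example, a minimal sketch of such a setting, sized from the ~2.2M dnodes
observed at the peak above (~2.2M * 808 bytes, roughly the 1.8GB reported by
dnode_size).  The exact value is an assumption rather than a tested
recommendation, and it presumes vfs.zfs.arc.dnode_limit is writable at runtime
on this system:

# Raise the dnode limit above the observed peak; a non-zero
# vfs.zfs.arc.dnode_limit takes precedence over the dnode_limit_percent default.
sysctl vfs.zfs.arc.dnode_limit=$((2200000 * 808))

The equivalent line (with the computed value) can be put into /etc/sysctl.conf
to persist across reboots.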