Re: vm.oom_pf_secs (vm_oom_pf_secs) vs. vm.pfault_oom_wait (vm_pfault_oom_wait): appropriate relationship for the settings?
Date: Sun, 13 Apr 2025 01:53:50 UTC
On Apr 12, 2025, at 18:18, Konstantin Belousov <kostikbel@gmail.com> wrote:

> On Sat, Apr 12, 2025 at 09:12:39AM -0700, Mark Millard wrote:
>> For reference:
>>
>> # sysctl -d vm | grep oom
>> vm.oom_pf_secs:
>> vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM
>> vm.panic_on_oom: Panic on the given number of out-of-memory errors instead of killing the largest process
>> vm.pfault_oom_wait: Number of seconds to wait for free pages before retrying the page fault handler
>> vm.pfault_oom_attempts: Number of page allocation attempts in page fault handler before it triggers OOM handling
>>
>> vm.oom_pf_secs looks to be for rate limiting OOMs.
>>
>> The following are defaults, other than the
>> vm.pageout_oom_seq that I explicitly set:
>>
>> # sysctl vm | grep oom
>> vm.oom_pf_secs: 10
>> vm.pageout_oom_seq: 120
>> vm.panic_on_oom: 0
>> vm.pfault_oom_wait: 10
>> vm.pfault_oom_attempts: 3
>>
>> Note the: vm.oom_pf_secs == vm.pfault_oom_wait and
>> vm.oom_pf_secs happens to be an exact factor of:
>> vm_pfault_oom_attempts * vm_pfault_oom_wait
>>
>> Is that appropriate? Is there a better relationship
>> that should/could be used? Would such involve
>> vm.pfault_oom_attempts as well? Ignoring special
>> handling of -1, vm.pfault_oom_attempts seems to only
>> be used for a
>> vm_pfault_oom_attempts * vm_pfault_oom_wait
>> calculation in one place:
>>
>> # grep -r "pfault_oom_attempts" /usr/src/sys/ | more
>> /usr/src/sys/vm/vm_fault.c:static int vm_pfault_oom_attempts = 3;
>> /usr/src/sys/vm/vm_fault.c:SYSCTL_INT(_vm, OID_AUTO, pfault_oom_attempts, CTLFLAG_RWTUN,
>> /usr/src/sys/vm/vm_fault.c:    &vm_pfault_oom_attempts, 0,
>> /usr/src/sys/vm/vm_fault.c:    if (vm_pfault_oom_attempts < 0)
>> /usr/src/sys/vm/vm_fault.c:    if (now.tv_sec < vm_pfault_oom_attempts * vm_pfault_oom_wait)
>> /usr/src/sys/vm/vm_pagequeue.h: * vm_pfault_oom_attempts page allocation failures with intervening
>>
>> So maybe vm.oom_pf_secs == vm.pfault_oom_wait is required
>> for the "Number of page allocation attempts in page fault
>> handler before it triggers OOM handling" description of
>> vm.pfault_oom_attempts to be reasonably accurate? But,
>> then, why the separate vm.oom_pf_secs ?
>>
>> I noticed vm.oom_pf_secs because aarch64 FreeBSD under
>> Parallels on macOS seems to on occasion end up with OOMs
>> from reaching vm.pfault_oom_attempts*vm.pfault_oom_wait
>> and I was looking around seeing what I might do. (It can
>> also do notable paging to/from the FreeBSD swap
>> partition(s) without this occurring.)
>>
>> So far, it has seemed best to poweroff the VM session and
>> start it back up again once an OOM based on
>> vm.pfault_oom_attempts*vm.pfault_oom_wait happens.
>> The condition tends to return.
>>
>> (Note: I've been doing poudriere-devel based
>> "bulk -a" based experiments/information-gathering
>> related to pkg 2.1.0 activity during such builds.
>> My normal activities do not seem likely to end up
>> with any OOMs. It is the same media that I also
>> native-boot actual aarch64 for other systems
>> and I've never had such problems with the native
>> boot contexts, including past bulk -a activity
>> in the slower-system contexts.)
>
> There is no meaningful connection between parameters for the
> page-fault time triggered OOM and OOM occurring when the pagedaemon
> cannot produce enough free pages.

Okay, thanks for the information.
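For my own notes, here is a minimal userland sketch of how I read the
vm_fault.c lines grep'd above: the elapsed time since the first failed
allocation for a fault is compared against the product
vm_pfault_oom_attempts * vm_pfault_oom_wait, with -1 meaning never
trigger OOM from this path. The helper name pfault_should_trigger_oom
and the elapsed_secs parameter are mine for illustration; only the two
tunables and their default values come from the sysctl output quoted
above, so treat this as a sketch rather than the actual kernel code.

    /* Userland sketch of the page-fault OOM trigger condition. */
    #include <stdbool.h>
    #include <stdio.h>

    static int vm_pfault_oom_attempts = 3;   /* vm.pfault_oom_attempts */
    static int vm_pfault_oom_wait = 10;      /* vm.pfault_oom_wait */

    /* elapsed_secs: seconds since the first failed allocation for this fault */
    static bool
    pfault_should_trigger_oom(long elapsed_secs)
    {
            if (vm_pfault_oom_attempts < 0)  /* -1: never OOM from here */
                    return (false);
            return (elapsed_secs >=
                (long)vm_pfault_oom_attempts * vm_pfault_oom_wait);
    }

    int
    main(void)
    {
            for (long t = 0; t <= 40; t += 10)
                    printf("elapsed %2lds -> OOM? %s\n", t,
                        pfault_should_trigger_oom(t) ? "yes" : "no");
            return (0);
    }

With the defaults above that works out to a 3 * 10 == 30 second limit
before the "a thread waited too long to allocate a page" kill; per
Konstantin's reply, vm.oom_pf_secs is a separate knob with no meaningful
connection to that product.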
I've since discovered that a change in my Parallels configuration for the
FreeBSD VM seems to avoid the "was killed: a thread waited too long to
allocate a page" kills: limit the number of virtual CPUs to the number of
Performance cores (or fewer?). I now have 12 instead of 14; the M4 MAX
involved has 12 Performance and 4 Efficiency cores. I did not vary the RAM
allocation. I've pushed the paging harder with this change and have not run
into the issue so far.

===
Mark Millard
marklmi at yahoo.com