Re: Chasing OOM Issues - good sysctl metrics to use?

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 10 May 2022 08:01:52 UTC
On 2022-Apr-29, at 13:57, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-Apr-29, at 13:41, Pete Wright <pete@nomadlogic.org> wrote:
>> 
>>> . . .
>> 
>> d'oh - went out for lunch and workstation locked up.  i *knew* i shouldn't have said anything lol.
> 
> Any interesting console messages ( or dmesg -a or /var/log/messages )?
> 

I've been doing some testing of a patch by tijl at FreeBSD.org
and have reproduced both hang-ups (ZFS/ARC context) and kills
(UFS/noARC and ZFS/ARC) for "was killed: failed to reclaim
memory", both with and without the patch. This is with only a
tiny fraction of the enabled swap partition(s) being put to
use. So far, the testing was deliberately with
vm.pageout_oom_seq=12 (the default value). My testing has been
with main [so: 14].
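
(For reference: to see how much of the enabled swap is
actually in use during such a test, I have in mind the
likes of:

# swapinfo -h

or watching top's Swap line. The figures involved stayed
small either way.)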

But I also learned how to avoid the hang-ups that I got,
though it comes at the cost of making kills more
likely/quicker, other things being equal.

I discovered that the hang-ups that I got were from all the
processes that I use to interact with the system ending up
with their kernel threads swapped out and not being swapped
back in (including sshd, so no new ssh connections). In some
contexts the only thing available was escaping into the
kernel debugger; not even ^T would work. Other times ^T did
work.

So, when I'm willing to risk kills in order to maintain
the ability to interact normally, I now use in
/etc/sysctl.conf :

vm.swap_enabled=0

This disables swapping out of process kernel stacks. But
with that option removed as a way of gaining free RAM, there
are fewer options tried before a kill is initiated. It is not
a loader-time tunable but it is writable at runtime, thus the
/etc/sysctl.conf placement.
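
For experimenting without a reboot, it can also be set and
checked on a live system via the usual sysctl usage:

# sysctl vm.swap_enabled=0
vm.swap_enabled: 1 -> 0
# sysctl vm.swap_enabled
vm.swap_enabled: 0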

Note that I get kills both for vm.swap_enabled=0 and for
vm.swap_enabled=1 . It is just the apparent hang-ups that
I'm trying to avoid via using =0 .

For now, I view my use as experimental. It might require
adjusting my vm.pageout_oom_seq=120 usage.
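
(vm.pageout_oom_seq is also a writable sysctl, so adjusting
it would just be another /etc/sysctl.conf line, such as my:

vm.pageout_oom_seq=120

Larger figures make the system try longer to reclaim memory
before the kills start being initiated.)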

I've yet to use protect to also prevent kills of processes
needed for the interactions ( see: man 1 protect ). Most
likely I'd protect enough processes to allow the console
interactions to avoid being killed.
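
(If I do try it, something like the following is the sort of
thing I have in mind, using pgrep -o to pick the oldest sshd,
presumably the listener:

# protect -i -p $(pgrep -o -x sshd)

with -i also covering future children of the marked process,
if I'm reading protect(1) correctly. I've not settled on the
details yet.)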


For reference . . .

The type of testing is to use the likes of:

# stress -m 2 --vm-bytes ????M --vm-keep

and part of the time with grep activity also
running, such as:

# grep -r nfreed /usr/*-src/sys/ | more

with a specific value substituted where the * is.
(I have 13_0R , 13_1R , 13S , and main .) Varying
which tree is used leads to reading new material
instead of referencing buffered/cached
material from the prior grep(s).

The ???? is roughly set up so that the system
ends up about where its initial Free RAM is
used up, so near (above or below) where some
sustained paging starts. I explore figures
that make the system land in this state.
I do not have a use-exactly-this computed
figure technique. But I run into the problems
fairly easily/quickly so far. As stress itself
uses some memory, the ???? need not be strictly
based on exactly 1/2 of the initial Free RAM
value, but that figure suggests where I explore
around.
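
(A rough way to get a starting figure, rather than
anything exact, is half the current free RAM:

# freep=$(sysctl -n vm.stats.vm.v_free_count)
# pgsz=$(sysctl -n hw.pagesize)
# echo $(( freep * pgsz / 2 / 1048576 ))

giving a MiByte count to try for --vm-bytes and then
adjusting from there.)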

The kills sometimes are not during the grep
but somewhat after. Sometimes, after grep is
done, stopping stress and starting it again
leads to a fairly quick kill.

The system used for the testing is an aarch64
MACCHIATObin Double Shot (4 Cortex-A72s) with
16 GiBytes of RAM. I can boot either its ZFS
media or its UFS media. (The other OS media is
normally ignored by the system configuration.)

===
Mark Millard
marklmi at yahoo.com