Re: Armv7 (rpi2) getting stuck in buildworld for -current

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sun, 14 May 2023 19:31:29 UTC
On May 14, 2023, at 08:36, bob prohaska <fbsd@www.zefox.net> wrote:

> Lately a Pi2 running -current has gotten stuck while buildworld is running.
> There's no escape to debugger, no obvious errors on the console and only
> modest swap use (tens of MB). So far the stoppages have been when building
> clang.
> 
> One possible culprit is /boot/loader.conf, which has accumulated some baggage
> over the years:
> 
> bob@www:/usr/src % more /boot/loader.conf
> # Configure USB OTG; see usb_template(4).
> hw.usb.template=3
> umodem_load="YES"
> # Disable the beastie menu and color
> beastie_disable="YES"
> loader_color="NO"
> vm.pageout_oom_seq="4096"
> #vm.pfault_oom_attempts="3"
> vm.pfault_oom_attempts="120"
> vm.pfault_oom_wait="20"

(I originally had a note here but I think it
would just confuse things and not be tied to
your problem.)  . . .

However, I expect that your user process had
its kernel stack swapped out. See below.

> kern.cam.boot_delay="20000"
> vfs.ffs.dotrimcons="1"
> vfs.root_mount_always_wait="1"
> filemon_load="YES"
> net.inet.tcp.tolerate_missing_ts="1"
> vm.swap_enabled=0
> vm.swap_idle_enabled=0

Those last two lines are meant to keep the
processes behind interactive sessions (sshd,
serial console) from ending up with their kernel
stacks swapped out. (But they do so by preventing
that for all kernel stacks, not just the ones of
interest.)

When a kernel stack is swapped out for a
process/thread, the process/thread can not run
at all until the kernel stack is read back
into the kernel.

You have those last two lines in a file meant
for tunables, but they are not tunables:

# sysctl -T vm.swap_enabled
# sysctl -T vm.swap_idle_enabled
# 

Compare that to the check for the writable
category:

# sysctl -W vm.swap_enabled
vm.swap_enabled: 0
# sysctl -W vm.swap_idle_enabled
vm.swap_idle_enabled: 0
# 
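
The two checks above can be combined into a small helper. This is
only a sketch: classify_sysctl and the SYSCTL override variable are
hypothetical names of mine, and it assumes (as in the output above)
that sysctl -T and sysctl -W print nothing for a name that is not in
the category being checked.

```shell
#!/bin/sh
# Sketch: report whether a sysctl name is a loader tunable, a run-time
# writable value, or neither, using the same -T and -W checks as above.
# SYSCTL is a hypothetical override for trying the function against a
# stub; by default it is the real sysctl(8).
SYSCTL="${SYSCTL:-sysctl}"

classify_sysctl() {
    name="$1"
    if [ -n "$("$SYSCTL" -T "$name" 2>/dev/null)" ]; then
        # Non-empty -T output: the name is settable as a loader tunable.
        echo "$name: tunable (can be set in /boot/loader.conf)"
    elif [ -n "$("$SYSCTL" -W "$name" 2>/dev/null)" ]; then
        # Empty -T output but non-empty -W output: writable at run time only.
        echo "$name: writable, not a tunable (set in /etc/sysctl.conf)"
    else
        echo "$name: neither tunable nor writable"
    fi
}

# Classify every name given on the command line, for example:
#   sh classify.sh vm.swap_enabled vm.swap_idle_enabled
for n in "$@"; do
    classify_sysctl "$n"
done
```

A name that is both a tunable and writable is reported as a tunable,
because the -T check is made first.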

In my environment, I use /etc/sysctl.conf , which
is a place appropriate for non-tunable but writable
sysctl values:

# grep vm.swap_ /etc/sysctl.conf 
vm.swap_enabled=0
vm.swap_idle_enabled=0

I suggest moving the assignments to /etc/sysctl.conf .
I expect that this will get rid of your problem once
you reboot with them in the right place. (You can also
set them interactively with sysctl.)

I suggest avoiding confusion by not having copies of
those 2 lines in /boot/loader.conf (where they will
not work).
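
The move suggested above could be scripted roughly as follows. This
is a sketch, not something I have run on your system:
move_swap_settings is a hypothetical helper name, it assumes plain
name=value lines like the ones in your loader.conf, and it is best
tried on copies of the two files first.

```shell
#!/bin/sh
# Sketch: move the vm.swap_enabled / vm.swap_idle_enabled assignments
# out of a loader.conf-style file and into a sysctl.conf-style file.
move_swap_settings() {
    loader_conf="$1"    # e.g. /boot/loader.conf
    sysctl_conf="$2"    # e.g. /etc/sysctl.conf
    pattern='^vm\.swap_(idle_)?enabled='

    # Append each matching assignment (with any double quotes stripped)
    # unless that name is already assigned in the sysctl.conf file.
    grep -E "$pattern" "$loader_conf" | tr -d '"' |
    while IFS= read -r line; do
        name="${line%%=*}"
        grep -q "^${name}=" "$sysctl_conf" 2>/dev/null ||
            echo "$line" >> "$sysctl_conf"
    done

    # Keep a backup, then rewrite the loader.conf file without the
    # matching lines. grep -v exits non-zero when nothing is left to
    # print, which is not an error here.
    cp "$loader_conf" "$loader_conf.bak"
    grep -Ev "$pattern" "$loader_conf.bak" > "$loader_conf" || true
}

# Possible use, as root, after testing on copies:
#   move_swap_settings /boot/loader.conf /etc/sysctl.conf
# To apply the values without waiting for a reboot:
#   sysctl vm.swap_enabled=0 vm.swap_idle_enabled=0
```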

> --More--(END)
> 
> However, the problem emerged well after the changes above were made.
> 
> 
> A running diary of experiments is at
> http://www.zefox.net/~fbsd/rpi2/crashes/20230514/armv7hang

There you report reducing the swap space partition size.
Were you getting the message about the swap possibly being
mistuned prior to that?

For 1 GiByte of RAM, 3647M looks to me to likely be a
little below the size at which that message about
mistuning shows up. If you were not getting the message,
the size should have been fine.

In other words, I expect it is appropriate to put back
the original size (or some approximation of it that
avoids the message about possibly being mistuned).


Everything that you reported looks to me to be consistent
with some kernel stacks having been swapped out for some
processes/threads that would otherwise be involved in
interactive I/O activity.

===
Mark Millard
marklmi at yahoo.com