Re: Armv7 (rpi2) getting stuck in buildworld for -current

From: bob prohaska <fbsd_at_www.zefox.net>
Date: Sun, 14 May 2023 23:58:16 UTC
On Sun, May 14, 2023 at 12:31:29PM -0700, Mark Millard wrote:
> 
> 
> In my environment, I use /etc/sysctl.conf , which
> is a place appropriate for non-tunable but writable
> sysctl values:
> 
> # grep vm.swap_ /etc/sysctl.conf 
> vm.swap_enabled=0
> vm.swap_idle_enabled=0
> 
> I suggest moving the assignments to /etc/sysctl.conf .
> I expect that this will get rid of your problem once
> you reboot with them in the right place. (You can also
> set them interactively with sysctl.)
> 

At some point in the past I did that and failed to clean
up /boot/loader.conf .

> I suggest avoiding confusions by not having copies of
> those 2 lines in /boot/loader.conf (where they will
> not work).
>
I elected to comment the incorrect lines out with a note
indicating why. If I got confused once it may happen again.
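
To catch the same mix-up later, here is a minimal Python sketch that flags uncommented
assignments to these two sysctls in loader.conf-style text. The function name
find_misplaced and the sample text are hypothetical, made up for illustration;
only the two sysctl names come from this thread.

```python
import re

# The two runtime sysctls discussed in this thread; extend as needed.
RUNTIME_SYSCTLS = ("vm.swap_enabled", "vm.swap_idle_enabled")


def find_misplaced(conf_text, names=RUNTIME_SYSCTLS):
    """Return (line_number, line) pairs for uncommented assignments to names."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and lines already commented out
        match = re.match(r"([A-Za-z0-9_.]+)\s*=", stripped)
        if match and match.group(1) in names:
            hits.append((number, stripped))
    return hits


if __name__ == "__main__":
    # Hypothetical sample contents, not a copy of any real file.
    sample = (
        "# vm.swap_enabled=0  (moved to /etc/sysctl.conf)\n"
        'vm.swap_idle_enabled="0"\n'
        'kern.cam.boot_delay="10000"\n'
    )
    for number, line in find_misplaced(sample):
        print(f"line {number}: {line} belongs in /etc/sysctl.conf")
```

On a real system the text would come from reading /boot/loader.conf; the sample
string just keeps the sketch runnable anywhere.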

IIRC the lines were added because ssh connections tend to
drop when the system gets busy. That's still happening, so
they're not the cure, or at least not the whole cure.

> > A running diary of experiments is at
> > http://www.zefox.net/~fbsd/rpi2/crashes/20230514/armv7hang
> 
> There you report reducing the swap space partition size.
> Were you getting the message about the swap possibly being
> mistuned prior to that?
> 
> For 1 GiByte of RAM 3647M looks to me to likely be a little
> below where that message about mistuning shows up. If you
> were not getting the message, the size should have been
> fine.
> 

The last "too much swap" message I can find was:
warning: total configured swap (1048576 pages) exceeds maximum recommended amount (922200 pages).
Space was reserved for 4GB of swap; assuming 4 KiB pages, the recommended maximum works out to
about 3.5 GB, if I did the arithmetic right. Resizing the swap partition is easy and 1 GB should
have been more than enough, but the machine stalled again with 30-odd MB in use.
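
As a cross-check of the page arithmetic, a minimal Python sketch. It assumes
4 KiB pages; the warning itself does not state the page size, so treat that
value as an assumption.

```python
# Assumption: 4 KiB pages; the warning does not state the page size.
PAGE_SIZE_BYTES = 4096


def pages_to_mib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    configured = pages_to_mib(1048576)   # "total configured swap" in the warning
    recommended = pages_to_mib(922200)   # "maximum recommended amount" in the warning
    print(f"configured:  {configured:.0f} MiB")
    print(f"recommended: {recommended:.0f} MiB")
```

Under that assumption the configured swap is 4096 MiB and the recommended
maximum comes out a little over 3600 MiB.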
 


In the distant past armv7 seemed to use little or no swap with a 
-j4 buildworld, now it seems to require at least some when building 
llvm. So far having too much swap hasn't caused visible problems, 
but that may have been an artifact of it not being used. 

> In other words, I expect it is appropriate to put back
> the original size (or some approximation of it that
> avoids the message about possibly being mistuned).
> 
> 
> Everything that you reported looks to me to be consistent
> with some kernel stacks having been swapped out for some
> processes/threads that would otherwise be involved in
> interactive I/O activity.
>

For the moment I've updated /usr/src, set buildworld to -j4 and
am expecting it to hang sometime overnight if the problem is 
repeatable. As I write this swap use is pushing 600MB with ~60%
idle time, which is far more than I recall seeing for armv7 in
the past. It's still running, and the scheduler does seem to
find threads to favor.

The behavior starts to resemble aarch64 on a Pi3 but less extreme.

For some reason the ssh session controlling buildworld
tends to live longer than an ssh session running a tip connection
to an adjacent Pi's serial console. Since the problem of dropped
ssh connections hasn't been cured by setting
vm.swap_enabled=0
vm.swap_idle_enabled=0
perhaps it's best to remove them, for the sake of simplicity.

Thanks for reading and all your help!

bob prohaska