Re: Armv7 (rpi2) getting stuck in buildworld for -current

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 15 May 2023 03:12:23 UTC
On May 14, 2023, at 16:58, bob prohaska <fbsd@www.zefox.net> wrote:

> On Sun, May 14, 2023 at 12:31:29PM -0700, Mark Millard wrote:
>> 
>> 
>> In my environment, I use /etc/sysctl.conf , which
>> is a place appropriate for non-tunable but writable
>> sysctl values:
>> 
>> # grep vm.swap_ /etc/sysctl.conf 
>> vm.swap_enabled=0
>> vm.swap_idle_enabled=0
>> 
>> I suggest moving the assignments to /etc/sysctl.conf .
>> I expect that this will get rid of your problem once
>> you reboot with them in a right place. (You can also
>> interactively set them via sysctl use.)
>> 
> 
> At some point in the past I did that and failed to clean
> up /boot/loader.conf .
> 
>> I suggest avoiding confusions by not having copies of
>> those 2 lines in /boot/loader.conf (where they will
>> not work).
>> 
> I elected to comment the incorrect lines out with a note
> indicating why. If I got confused once it may happen again.
> 
> IIRC the lines were added because ssh connections tend to
> drop when the system gets busy. That's still happening, so
> they're not the cure, or at least not the whole cure.
> 
>>> A running diary of experiments is at
>>> http://www.zefox.net/~fbsd/rpi2/crashes/20230514/armv7hang
>> 
>> There you report reducing the swap space partition size.
>> Were you getting the message about the swap possibly being
>> mistuned prior to that?
>> 
>> For 1 GiByte of RAM 3647M looks to me to likely be a little
>> below where that message about mistuning shows up. If you
>> were not getting the message, the size should have been
>> fine.
>> 
> 
> The last "too much swap" message I can find was:
> warning: total configured swap (1048576 pages) exceeds maximum recommended amount (922200 pages).
> Space was reserved for 4GB of swap, suggesting that only about 1.6 GB is recommended
> if I did the arithmetic right. Resizing the swap partition is easy and 1 GB should
> have been more than enough, but the machine stalled again with 30-odd MB in use.

My screwup: about 3.6*RAM_SIZE is for aarch64, not armv7.

armv7 is more like 1.7*RAM_SIZE. For armv7 I've used:

# gpart show -pl
. . .
      534528     3563520  da0p2  BPIM3swap  (1.7G) # For 1 GiByte of RAM
. . .
     4311040     6291456  da0p3  BPIM3swp2  (3.0G) # For 2 GiByte of RAM
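
As a rough cross-check of the sizes in that listing, here is a
minimal sketch that converts a gpart sector count to GiBytes. It
assumes 512-byte sectors, which is an assumption about the media
and not something the listing states. The function name
sectors_to_gib is made up for illustration.

```shell
# Convert a sector count (as shown in the gpart size column) to GiBytes.
# The second argument is the sector size in bytes; 512 is only an
# assumed default, so check the actual media before relying on it.
sectors_to_gib() {
    awk -v s="$1" -v b="${2:-512}" \
        'BEGIN { printf "%.2f\n", s * b / (1024 * 1024 * 1024) }'
}

sectors_to_gib 3563520   # size of da0p2 in the listing
sectors_to_gib 6291456   # size of da0p3 in the listing
```

Under that sector-size assumption the results should line up with
the (1.7G) and (3.0G) figures that gpart reports.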

Going in another direction: Note that when top displays
something it is showing a point in the past by the time
you get to see it. "32M Used" need not be even
approximately true at the point of failure. And your
first top output shows "358M Used", which suggests that
swap use is unlikely to stay as small as 32M over the
whole build.

> In the distant past armv7 seemed to use little or no swap with a 
> -j4 buildworld,

Not just armv7.

> now it seems to require at least some when building 
> llvm. So far having too much swap hasn't caused visible problems, 
> but that may have been an artifact of it not being used. 
> 
>> In other words, I expect it is appropriate to put back
>> the original size (or some approximation of it that
>> avoids the message about possibly being mistuned).

So much for that claim. Sorry.

>> Everything that you reported looks to me to be consistent
>> with some kernel stacks having been swapped out for some
>> processes/threads that would otherwise be involved in
>> interactive I/O activity.
>> 
> 
> For the moment I've updated /usr/src, set buildworld to -j4 and
> am expecting it to hang sometime overnight if the problem is 
> repeatable. As I write this swap use is pushing 600MB

Like the "358M Used", there is plenty of evidence around that
expecting a -j4 build to use little swap space for 1 GiByte
of RAM is not reasonable for FreeBSD and its use of LLVM
(even on/for armv7, as well as the other architectures),
going back a fair ways: the status is not a recent change.

I'm unsure whether you have fully avoided having any tmpfs
based space or the like that would compete for RAM and use
some of the RAM+SWAP. In the low RAM environments, I avoid
such competition and use UFS exclusively.

I'll note that causing swap space thrashing can make builds
take longer. "Thrashing" is not directly the space used but
the frequency/backlog of swap space I/O. I always avoided
configurations that thrashed for notable periods of time,
via the -j value used, given that I'd already avoided
RAM+SWAP competition. But thrashing is also tied to the
type of media: spinning rust vs., for example, various
NVMe USB media. It
is probably generally easier to make spinning rust thrash
for notable periods. I'd also avoided spinning rust.

> with ~60%
> idle time, which is far more than I recall seeing for armv7 in
> the past. It's still running, and the scheduler does seem to
> find threads to favor.
> 
> The behavior starts to resemble aarch64 on a Pi3 but less extreme.
> 
> For some reason the ssh session controlling buildworld
> tends to live longer than an ssh session running a tip connection
> to an adjacent Pi's serial console. Since the problem of dropped
> ssh connections hasn't been cured by use of
> vm.swap_enabled=0
> vm.swap_idle_enabled=0 
> perhaps it's best to remove them, for sake of simplicity. 
> 

No. Removing them would just mean there would be more
ways for you to lose interactive control, including
over a serial console without ssh involved if you
had such at the time, not just over ssh sessions.

I never claimed there was only one cause of control
loss. I have claimed that these lines have been used
by various folks to avoid one mode of failure. (Sometimes
one is lucky enough to have one access path fail but
another still working, such that one can inspect to find
out what the cause of the failure was. Such inspection
has shown examples of kernel stacks swapped out. The
folks that added the lines cut down the frequency of,
and the conditions that would lead to, lack of
access/control.)
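
To make it easier to confirm that the two lines are in
/etc/sysctl.conf and that no active copies remain in
/boot/loader.conf, here is a minimal sketch of a check. The
function name check_swap_sysctls is hypothetical, and the two
paths are taken as arguments so that it can be tried on copies
of the files first. It only reads the files; it does not change
any settings.

```shell
# Report where the vm.swap_ settings appear.
# $1: sysctl.conf path (default /etc/sysctl.conf)
# $2: loader.conf path (default /boot/loader.conf)
check_swap_sysctls() {
    sysctl_conf="${1:-/etc/sysctl.conf}"
    loader_conf="${2:-/boot/loader.conf}"

    # Uncommented copies in loader.conf are the ones to clean up;
    # lines commented out with '#' are ignored by this pattern.
    if grep -Eq '^[[:space:]]*vm\.swap_(idle_)?enabled=' "$loader_conf" 2>/dev/null; then
        echo "loader.conf: active vm.swap_ lines found"
    fi

    for name in vm.swap_enabled vm.swap_idle_enabled; do
        # Exact match on "<name>=0" after trimming surrounding blanks.
        if awk -v want="${name}=0" '
            { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }
            $0 == want { found = 1 }
            END { exit !found }' "$sysctl_conf" 2>/dev/null; then
            echo "sysctl.conf: ${name}=0 present"
        else
            echo "sysctl.conf: ${name}=0 missing"
        fi
    done
}
```

Running check_swap_sysctls with no arguments looks at the default
paths. The values actually in effect can still be inspected or set
interactively via sysctl use, as noted earlier in the thread.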

Separately . . .

Your online file report says:

QUOTE
The disk activity light
pulsed steadily, the the time display in top stopped updating and the system
was unresponsive to the enter-tilda-control-B debugger escape. 
END QUOTE

The disk activity light suggests that the system was still
doing the build and what you lost was just interactive
control and interactive monitoring. If you could tolerate
waiting for it without access beyond the activity light,
you might have ended up with a completed build.


I'll also remind you that having one or more logs with
an overall high frequency of updates being written to
the media adds to the I/O issues.

===
Mark Millard
marklmi at yahoo.com