Warning: FreeBSD using its (in partition) SWAP space when running under Parallels on aarch64 macOS (M4 MAX)

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 19 Feb 2025 02:13:10 UTC
I did some experiments with doing poudriere(-devel) "bulk -ca" activity
in FreeBSD running under Parallels on macOS (M4 MAX). I'll note that
the media is the same device I use on the Windows Dev Kit 2023 (and
sometimes on the RPi5). I've never observed the behavior below in any
prior context.

It looks like significant use of paging to SWAP space needs to be
avoided under Parallels.

Something I possibly should have done, but did not do, was to set:

kern.hz=100

in /boot/loader.conf (or anywhere). I also did not assign:

debug.acpi.disabled="ged"

Other than eliminating virtio_gpu from the kernel to allow efifb to
be in use under Parallels, I did not reconfigure the kernel. The
system was running my personal kernel.NODBG and the world in the
poudriere jail was also my personal build. My builds target using
-mcpu=cortex-a76 . (I also have official PkgBase kernels and the
booted world is from that same official PkgBase vintage as well.)


Bulk build activity that fit in the RAM that I'd assigned (context:
14 VM cores of 16 physical) went fine. (I'll ignore a periodic
networking issue here.)

But assigning TMPFS_BLACKLIST proved insufficient to prevent about
37 GiBytes of tmpfs use for 7 builders that overlapped in time (with
fewer than 2700 packages to go): 3 electron* builds, 2 *chromium
builds, and 1 iridium-browser build. Each of these also used
significant space under the portdistfiles/ area. My context
deliberately experiments with allowing a high-load-average build to
occur.
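For reference, TMPFS_BLACKLIST is assigned in poudriere.conf and names
ports whose work directories should be kept off tmpfs. A sketch of the
sort of assignment involved (the patterns below are illustrative, not
necessarily my exact configuration):

```
# /usr/local/etc/poudriere.conf (illustrative patterns):
# Build these ports' work directories on disk instead of tmpfs:
TMPFS_BLACKLIST="electron* *chromium iridium-browser"
# Optionally still give the blacklisted ports a tmpfs-backed TMPDIR:
TMPFS_BLACKLIST_TMPDIR="electron* *chromium iridium-browser"
```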

This led to the later part of my 2nd round of testing being a
"paging to the partition SWAP space" test, which I had wanted to
include in the testing anyway. It did not go well.

I ended up with around 160 messages that were like:

vm_fault_allocate_oom: proc 92871 (clang-19) failed to alloc page on fault, starting OOM
vm_fault_allocate_oom: proc 93318 (pkg-static) failed to alloc page on fault, starting OOM
vm_fault_allocate_oom: proc 89123 (c++) failed to alloc page on fault, starting OOM
vm_fault_allocate_oom: proc 92695 (c++) failed to alloc page on fault, starting OOM
vm_fault_allocate_oom: proc 93783 (sh) failed to alloc page on fault, starting OOM
vm_fault_allocate_oom: proc 92949 (clang-19) failed to alloc page on fault, starting OOM
. . .

Using ". . ." to elide blocks of those, the rest of the messages
looked like (all are shown):

swap_pager: out of swap space
swp_pager_getswapspace(30): failed
vm_pageout_mightbe_oom: kill context: v_free_count: 101089, v_inactive_count: 1, v_laundry_count: 4252752, v_active_count: 20058293
pid 21636 (dot), jid 57, uid 0, was killed: failed to reclaim memory
. . .
swap_pager: out of swap space
swp_pager_getswapspace(13): failed
. . .
pid 21676 (dot), jid 57, uid 0, was killed: a thread waited too long to allocate a page
. . .
pid 21752 (dot), jid 57, uid 0, was killed: a thread waited too long to allocate a page
. . .
swp_pager_getswapspace(18): failed
. . .
pid 21715 (dot), jid 57, uid 0, was killed: a thread waited too long to allocate a page
swap_pager: out of swap space
swp_pager_getswapspace(20): failed
. . .
pid 21728 (dot), jid 57, uid 0, was killed: a thread waited too long to allocate a page
vm_pageout_mightbe_oom: kill context: v_free_count: 101114, v_inactive_count: 1, v_laundry_count: 5574020, v_active_count: 18724321
pid 21601 (dot), jid 57, uid 0, was killed: failed to reclaim memory
. . .
pid 21738 (dot), jid 57, uid 0, was killed: a thread waited too long to allocate a page
swap_pager: out of swap space
swp_pager_getswapspace(6): failed
. . .
pid 21647 (dot), jid 57, uid 0, was killed: a thread waited too long to allocate a page


So, 8 killed builds.
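For a sense of scale, the v_*_count figures in the
vm_pageout_mightbe_oom lines are page counts. A small sketch
converting the first such line's counts to GiBytes, assuming the
default 4 KiByte base page size (an assumption about this kernel
configuration):

```python
# Convert the page counts from the first vm_pageout_mightbe_oom log
# line into GiBytes, assuming 4 KiByte base pages.
PAGE_SIZE = 4096  # bytes per page (assumed default)

def pages_to_gib(pages: int) -> float:
    """Page count -> GiBytes."""
    return pages * PAGE_SIZE / 2**30

counts = {
    "v_free_count": 101089,       # ~0.39 GiB free
    "v_inactive_count": 1,        # essentially nothing inactive
    "v_laundry_count": 4252752,   # ~16.2 GiB dirty, waiting to be laundered
    "v_active_count": 20058293,   # ~76.5 GiB active
}

for name, pages in counts.items():
    print(f"{name}: {pages_to_gib(pages):.2f} GiB")
```

So nearly all of the 104 GiBytes of assigned RAM was active or dirty
at kill time, with almost nothing free or inactive.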

I did (as I normally do) have in /boot/loader.conf:

vm.pageout_oom_seq=120

#vm.pfault_oom_attempts=-1
#vm.pfault_oom_attempts= 3
#vm.pfault_oom_wait= 10

(So: no assignments to those last 3 were involved.)

I had RAM+SWAP totaling a little over 351 GiBytes, 104 GiBytes of
which was the RAM assigned to the VM. (24 GiBytes were left
unassigned, so available to macOS.)

It looks to me like swap space was being allocated faster than the
macOS-side output was making it to the media, so the space assigned
(but not yet written) just grew too big (the
"swp_pager_getswapspace(??): failed" examples). Similar timing/rate
issues seem to apply to the "waited too long to allocate a page" and
"failed to reclaim memory" kills (the latter despite the scale of 120
for vm.pageout_oom_seq).
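As a rough feasibility check on that rate hypothesis: a sketch of how
long it would take to push the implied swap partition through the USB
link once, assuming roughly 1 GByte/s of effective throughput for the
USB 3.2 Gen 2 path (that throughput figure is an assumption, not a
measurement):

```python
# Back-of-envelope: time to write the full swap partition once over
# the USB link, given figures from the text plus an assumed link rate.
ram_gib = 104.0           # RAM assigned to the VM (from the text)
total_gib = 351.0         # RAM + SWAP, "a little over" (from the text)
swap_gib = total_gib - ram_gib        # ~247 GiBytes of swap

link_bytes_per_s = 1.0e9  # ASSUMPTION: ~1 GByte/s effective throughput
seconds = swap_gib * 2**30 / link_bytes_per_s

print(f"swap ~ {swap_gib:.0f} GiB; ~ {seconds/60:.1f} minutes to write once")
```

Minutes of sustained writing to fill the swap partition even at full
link speed, while builders keep dirtying pages, seems consistent with
allocation outpacing write-back.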

As for the media in use by FreeBSD: a USB 3.2 Gen 2 adapter to U.2,
with the U.2 device being an Optane. Parallels was not using the
macOS file system to hold the content of the VM's partitions.

===
Mark Millard
marklmi at yahoo.com