Belated out of swap kill on rpi3 at r359216
Mark Millard
marklmi at yahoo.com
Sat Mar 28 18:38:57 UTC 2020
On 2020-Mar-28, at 09:17, bob prohaska <fbsd at www.zefox.net> wrote:
> On Fri, Mar 27, 2020 at 07:25:45PM -0700, Mark Millard wrote:
>>
>>
>> On 2020-Mar-26, at 16:24, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>>
>>> Anyway, I may, for a time, have one context that is
>>> more like yours than is normal for me. As stands, the
>>> RPi3 is doing a from-scratch buildworld buildkernel .
>>> (Reconstructing the head -r358966 that it is already
>>> running.) It is not splitting the I/O load but is
>>> using a USB SSD (via a powered hub), not the microsd
>>> card. No extra logging. vm.pfault_oom_attempts=-1
>>> and vm.pageout_oom_seq=120 for this attempt. 3072
>>> MiBytes of page/swap space. It is a -j4 build attempt.
>>>
>>
>> ("No extra logging" meant: beyond my normal typescript
>> recording of the build output. That file ended up at
>> 7741518 Bytes for size.)
>
> Does the process capture all the output from make buildworld?
> On my machines (pi2 and pi3) that's usually ~30 MB.
A likely explanation is that I use WITH_META_MODE
and you might not:
WITH_META_MODE
. . .
The build hides commands that are executed unless NO_SILENT is
defined. Errors cause make(1) to show some of its environment
for further debugging.
. . .
(I do not use NO_SILENT, so I get the hiding.)
Over 1/2 of the lines recorded looked like
sequences similar to:
. . .
Building /usr/obj/cortexA53_clang/arm64.aarch64/usr/src/arm64.aarch64/tmp/obj-tools/tools/build/dummy.o
Building /usr/obj/cortexA53_clang/arm64.aarch64/usr/src/arm64.aarch64/tmp/obj-tools/tools/build/libegacy.a
. . .
In this case:
# grep "^Building " /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47 | wc
52487 104974 5767152
vs. the file overall:
# wc /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47
94908 256377 7741518 /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47
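As a rough check of those proportions, here is a minimal sketch that turns the two wc results into percentages. The `share` helper name is made up for illustration; the figures are the ones copied from the output above.

```sh
#!/bin/sh
# share PART TOTAL: print PART as a percentage of TOTAL, one decimal place.
# (Hypothetical helper, not part of any FreeBSD tool.)
share() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

# Figures copied from the grep | wc and wc output above.
echo "Building lines: $(share 52487 94908)% of all lines"
echo "Building bytes: $(share 5767152 7741518)% of all bytes"
```

By that arithmetic the "Building" lines are about 55% of the lines and about 74% of the bytes in the typescript.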
WITH_META_MODE does record details for each "Building"
line in a .meta file specific to that line. A .meta
file even includes a list of what files were involved
(opened) for that step.
So there is still file I/O for such logging, likely
more in total than when not using WITH_META_MODE.
(Not that I'd thought about that before.)
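One way to get a feel for how much of that .meta logging there is would be to count the files and total their sizes after a build. The following is only a sketch: the `meta_usage` name is hypothetical, and /usr/obj is assumed as the object tree because that is where the "Building" paths above point.

```sh
#!/bin/sh
# meta_usage DIR: report how many *.meta files exist under DIR and
# how many bytes they hold in total. (Hypothetical helper.)
meta_usage() {
    count=$(find "$1" -type f -name '*.meta' | awk 'END { print NR }')
    bytes=$(find "$1" -type f -name '*.meta' -exec cat {} + | wc -c | awk '{ print $1 }')
    echo "$count files, $bytes bytes"
}

# Assumption: the build wrote its objects under /usr/obj.
if [ -d /usr/obj ]; then
    meta_usage /usr/obj
fi
```

This only measures what is left on disk, not how often each .meta file was rewritten during the build, so it is a lower bound on the write traffic involved.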
>>
>> The build completed without any /var/log/messages or
>> console output during the build. My modified version
>> of top reported (details copied from a ssh window) . . .
>>
>
> That seems to settle matters. My problems are with the old
> microSD card. New, it was marginally ok. Old, it's not. That
> crudely quantifies lifespan at around a year of active use,
> with trouble appearing roughly when the card was 75% full,
> at least a hint of required overprovisioning.
As far as I know, FreeBSD provides no means of
having the SATA drive in the USB enclosure
trimmed, so I do not know how long it will be
before it has issues from that. It is a small
form factor 240 GByte SSD [user space in GByte,
not GiByte, likely from internal over-provisioning
of 240 GiByte media]. I left a 21 GiByte area at
the end free as well. The 197 GiByte ufs file
system is only about 19% used.
smartctl reports for the USB SSD internals:
ATA Version is: ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
The Firmware version 609ABBF0 listed suggests a
Seagate SATA controller is involved, if I
understand right.
The USB SSD drive is far from new. It now gets the
report:
Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
. . .
0x01  0x018     6   5499176259  ---  Logical Sectors Written
0x01  0x028     6   2406890437  ---  Logical Sectors Read
. . .
where earlier smartctl reported:
Sector Size: 512 bytes logical/physical
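To put the sector counts in more familiar units, a minimal sketch that multiplies them by the 512-byte logical sector size reported above. The `to_tib` helper name is made up for illustration.

```sh
#!/bin/sh
# to_tib SECTORS SECTOR_BYTES: convert a sector count to TiB (2^40 bytes),
# printed with two decimal places. (Hypothetical helper.)
to_tib() {
    awk -v s="$1" -v b="$2" 'BEGIN { printf "%.2f\n", s * b / (1024 ^ 4) }'
}

# Counts and sector size copied from the smartctl output above.
echo "written: $(to_tib 5499176259 512) TiB"
echo "read:    $(to_tib 2406890437 512) TiB"
```

That works out to roughly 2.56 TiB written and 1.12 TiB read as seen at the logical-sector level. It says nothing about write amplification inside the drive.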
> Out of curiosity, have you tried leaving vm.pfault_oom_attempts at
> its default value? An OOM kill would be unexpected, but interesting
> if observed.
Nope. I've thought of locally updating gstat to do
something similar to what I did with top: record and
report the maximum observed figures for ms/r, ms/w,
ms/d, but for each line of data in this case.
I'd not be surprised if the heavier paging times had
some large figures compared to what I saw when watching
the display. (Rarely more than 20ms.) But my observations
are not much of a sample.
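Short of modifying gstat itself, the same idea can be approximated by saving gstat's batch-mode text output to a file during the build and post-processing it. The sketch below is not the gstat change described above. It assumes the saved output has the usual header line starting with "L(q)" and ending with "Name", and it finds the ms/r, ms/w and ms/d columns from that header (ms/d only appears if delete statistics were enabled). The `gstat_max` name is hypothetical; see gstat(8) for the batch flags.

```sh
#!/bin/sh
# gstat_max: read saved gstat batch output on stdin and print, per device,
# the largest ms/r, ms/w and (if present) ms/d values seen.
gstat_max() {
    awk '
        BEGIN { n = split("ms/r ms/w ms/d", metric, " ") }
        # Header line: remember which column holds each label.
        $1 == "L(q)" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Skip the "dT: ..." timing line and anything before the first header.
        $1 == "dT:" || !("Name" in col) { next }
        NF >= col["Name"] {
            name = $(col["Name"])
            seen[name] = 1
            for (k = 1; k <= n; k++) {
                m = metric[k]
                if (!(m in col)) continue
                v = $(col[m]) + 0
                if (v > max[name, m]) max[name, m] = v
            }
        }
        END {
            for (name in seen) {
                line = name
                for (k = 1; k <= n; k++) {
                    m = metric[k]
                    if (m in col) line = line sprintf(" max_%s=%.1f", m, max[name, m])
                }
                print line
            }
        }
    ' | sort
}

# Usage (the log file name is a placeholder): sh gstat_max.sh gstat.log
if [ -n "${1:-}" ]; then
    gstat_max < "$1"
fi
```

The maxima are only as good as the sampling interval: a short latency spike between samples is averaged into that interval's figure rather than seen on its own.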
I'd be more likely to try picking vm.pfault_oom_wait
after seeing what is reported, then picking a
positive vm.pfault_oom_attempts value to go with it.
I'm not sure if I'll ever do this sort of experiment.
The resulting figures used would be rather
context-specific as well.
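For reference, such an experiment would come down to two sysctl settings. The sketch below only prints the commands unless APPLY=1 is set, in which case it runs them (root is needed). The `oom_experiment` name and the values 3 and 10 are placeholders for illustration, not recommendations; real values would be picked from the observed latencies.

```sh
#!/bin/sh
# oom_experiment ATTEMPTS WAIT: print the sysctl commands for one
# page-fault OOM experiment, or run them when APPLY=1 is set.
# (Hypothetical helper; the values passed below are placeholders.)
oom_experiment() {
    for setting in "vm.pfault_oom_wait=$2" "vm.pfault_oom_attempts=$1"; do
        if [ "${APPLY:-0}" = 1 ]; then
            sysctl "$setting"
        else
            echo "sysctl $setting"
        fi
    done
}

# Dry run by default: shows what would be set.
oom_experiment 3 10
```

Setting vm.pfault_oom_attempts back to -1 afterwards returns to the configuration used for the build reported here.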
>> For Mem: 738512Ki MaxObsActive, 190608Ki MaxObsWired, 906372Ki MaxObs(Act+Wir)
>> For Swap: 1927Mi MaxObsUsed
>>
>
> . . .
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)