Re: 14.0-CURRENT failed to reclaim memory error in RPi 3B build

From: Archimedes Gaviola <archimedes.gaviola_at_gmail.com>
Date: Mon, 21 Nov 2022 09:29:25 UTC
On Mon, Nov 21, 2022 at 12:24 PM Mark Millard <marklmi@yahoo.com> wrote:

> On Nov 20, 2022, at 19:48, Archimedes Gaviola <
> archimedes.gaviola@gmail.com> wrote:
>
> > On Wed, Nov 9, 2022 at 10:15 AM Archimedes Gaviola <
> archimedes.gaviola@gmail.com> wrote:
> > On Wed, Nov 9, 2022 at 1:37 AM Mark Millard <marklmi@yahoo.com> wrote:
> > On Nov 8, 2022, at 04:15, Ronald Klop <ronald-lists@klop.ws> wrote:
> >
> > > From: Warner Losh <imp@bsdimp.com>
> > > Date: Tuesday, 8 November 2022 04:28
> > > To: Archimedes Gaviola <archimedes.gaviola@gmail.com>
> > > . . .
> > > ...
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256929, size: 4096
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3628, size: 4096
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255839, size: 40960
> > > pid 46153 (c++), jid 0, uid 0, was killed: a thread waited too long to allocate a page
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255857, size: 28672
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3634, size: 8192
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256037, size: 4096
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255320, size: 8192
> > >   This means that paging to the swap partition and/or swap file took
> > > too long (> 30 seconds... that's all that indefinite means). It also
> > > means that it can't write dirty pages to backing store to give them
> > > to another process...
> > >   Typical reason is that the disk / flash is not responsive to writes
> > > for some reason. You'll need to find why... I'd look at trims.
> > >   Or.... if you can't change the disk... you need to put less memory
> > > pressure on it..
> > >   Warner
> > >
> > . . .
> >
> > Hi Mark,
> >
> > To recap, the kernel tunable changes are as follows:
> >
> > root@generic:~ # sysctl -a | grep oom
> > vm.pageout_oom_seq: 120
> > vm.pfault_oom_wait: 10
>

Hi,


>
> FYI . . .
>
> As long as:
>
> vm.pfault_oom_attempts == -1
>
> vm.pfault_oom_wait is ignored. It also likely does
> nothing for:
>
> vm.pfault_oom_attempts == 0
>
> vm.pfault_oom_wait gets involved for:
>
> 0 < vm.pfault_oom_attempts .
>

Okay, noted on this.
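
To keep the three cases straight, here is a minimal sh sketch that only restates the rule quoted above. It does not inspect the kernel, and oom_wait_role is a hypothetical helper name, not a FreeBSD utility. Values below -1 are not discussed in this thread, so the sketch reports them as not covered.

```sh
#!/bin/sh
# Sketch only: encodes the rule as described in this thread.
# oom_wait_role is a hypothetical name chosen for illustration.
oom_wait_role() {
    attempts=$1
    if [ "$attempts" -eq -1 ]; then
        echo "ignored"          # -1: vm.pfault_oom_wait is ignored
    elif [ "$attempts" -eq 0 ]; then
        echo "likely unused"    # 0: it likely does nothing
    elif [ "$attempts" -gt 0 ]; then
        echo "used"             # > 0: vm.pfault_oom_wait gets involved
    else
        echo "not covered"      # other values are not discussed here
    fi
}

# With the setting from this thread (vm.pfault_oom_attempts: -1):
oom_wait_role -1
```

On a FreeBSD system the live value could be passed in with `oom_wait_role "$(sysctl -n vm.pfault_oom_attempts)"`.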


>
> > vm.pfault_oom_attempts: -1
> >
> > With the -j1 and -j2 options, the kernel and buildworld builds both
> > completed, in 103 and 84 hours respectively. I still saw "swap_pager:
> > indefinite wait buffer: bufobj" messages, but they can be ignored since
> > the builds survived them. With the -j3 option, the build failed partway
> > through with the earlier "failed to reclaim memory" error, but that
> > matters less now that -j1 and -j2 work. -j2 is the appropriate choice
> > for my RPi 3B build setup.
>
> Glad you got it working in your context.
>
> Thanks for the report.


I too am glad about the outcome of the testing. In the context of -j1 and
-j2 build options, the oom kernel tunable settings were very effective.
Thanks a lot for your help!


> My media does not produce these conditions, so I
> have not been able to learn the behavior when
> "swap_pager: indefinite wait buffer: bufobj" is
> significantly involved (for the time scale of
> waits that you got into).
>

Ah okay, since you are using more capable media, such messages do not show
up. Good to know.


>
> The implication of the result is that you would
> need a larger vm.pageout_oom_seq value in order
> for -j3 to finish normally.


Oh, I see. I will take note of this. Thanks for the further hint! I can
explore it soon.
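
A rough sketch of how a larger vm.pageout_oom_seq could be staged for a later -j3 attempt. The value 240 is purely an assumption (double the 120 used so far); no value known to work for -j3 is given in this thread. oom_seq_conf is a hypothetical helper that only prints lines in /etc/sysctl.conf format and changes nothing on the system.

```sh
#!/bin/sh
# Sketch only: prints sysctl.conf-style lines, does not apply them.
# oom_seq_conf is a hypothetical name; 240 is an assumed trial value.
oom_seq_conf() {
    value=$1
    printf 'vm.pageout_oom_seq=%s\n' "$value"
    # Kept at -1, as in the settings recapped earlier in this thread.
    printf 'vm.pfault_oom_attempts=-1\n'
}

oom_seq_conf 240
```

The printed lines could be appended to /etc/sysctl.conf to persist across reboots, or the first one tried at runtime as root with `sysctl vm.pageout_oom_seq=240`.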


> Based on my media,
> I've never had to use larger values, but, I knew
> it was a technical possibility to need such. I do
> not know how to pre-calculate what value would
> work.
>
> (I'm not suggesting any more -j3 experiments.)
>

That's alright. If I have spare time, I will explore the -j3 settings
further. For now, -j2 is sufficient.

To Warner and Ronald, thanks as well for your input!

Best regards,
Archimedes


>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>