Re: 14.0-CURRENT failed to reclaim memory error in RPi 3B build

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 08 Nov 2022 03:53:04 UTC

On Nov 7, 2022, at 19:28, Warner Losh <imp@bsdimp.com> wrote:
> 
> . . .
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256929, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3628, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255839, size: 40960
> pid 46153 (c++), jid 0, uid 0, was killed: a thread waited too long to allocate a page
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255857, size: 28672
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3634, size: 8192
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256037, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255320, size: 8192
> 
> This means that paging to the swap partition and/or swap file took too long (> 30 seconds... that's all that indefinite means).

FYI: I think the timeout behind the "indefinite wait buffer"
messages is 20 sec rather than 30 (the hz * 20 passed to
VM_OBJECT_SLEEP below):

        /*
         * Wait for the pages we want to complete.  VPO_SWAPINPROG is always
         * cleared on completion.  If an I/O error occurs, SWAPBLK_NONE
         * is set in the metadata for each page in the request.
         */
        VM_OBJECT_WLOCK(object);
        /* This could be implemented more efficiently with aflags */
        while ((ma[0]->oflags & VPO_SWAPINPROG) != 0) {
                ma[0]->oflags |= VPO_SWAPSLEEP;
                VM_CNT_INC(v_intrans);
                if (VM_OBJECT_SLEEP(object, &object->handle, PSWP,
                    "swread", hz * 20)) {
                        printf(
"swap_pager: indefinite wait buffer: bufobj: %p, blkno: %jd, size: %ld\n",
                            bp->b_bufobj, (intmax_t)bp->b_blkno, bp->b_bcount);
                }
        }
        VM_OBJECT_WUNLOCK(object);

But the "was killed: a thread waited too long to allocate a page"
message is tied to a total of 30 sec (3 attempts * 10 sec each) from:

vm.pfault_oom_attempts=3
vm.pfault_oom_wait=10

(Presuming that you had defaults at the time.)
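
For what it is worth, the arithmetic behind the two bounds can be
sketched in C as below. This is only an illustration: the helper names
(ticks_to_seconds, pfault_oom_total_seconds) are made up for this note
and are not kernel functions, and the hz value a caller passes in is
whatever tick rate the system happens to use.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): convert a sleep timeout
 * given in clock ticks to whole seconds, where hz is ticks per second.
 * For the swap_pager sleep above the timeout is hz * 20 ticks, so the
 * result is 20 seconds regardless of the hz in use.
 */
static int
ticks_to_seconds(int ticks, int hz)
{
        assert(hz > 0);
        return (ticks / hz);
}

/*
 * Hypothetical helper (not a kernel function): total time the messages
 * above imply before the "waited too long to allocate a page" kill,
 * taken as attempts * wait, with the arguments standing in for
 * vm.pfault_oom_attempts and vm.pfault_oom_wait (seconds).
 */
static int
pfault_oom_total_seconds(int attempts, int wait_seconds)
{
        return (attempts * wait_seconds);
}
```

With the values above, ticks_to_seconds(hz * 20, hz) gives 20 for any
positive hz, and pfault_oom_total_seconds(3, 10) gives 30, which is
where the 20 sec vs. 30 sec distinction comes from.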

> It also means that it can't write to backing store dirty pages to give to another process...
> 
> Typical reason is that the disk / flash is not responsive to writes for some reason. You'll need to find why... I'd look at trims.
> 
> Or.... if you can't change the disk... you need to put less memory pressure on it..


===
Mark Millard
marklmi at yahoo.com