Re: 14.0-CURRENT failed to reclaim memory error in RPi 3B build
Date: Mon, 21 Nov 2022 03:48:58 UTC
On Wed, Nov 9, 2022 at 10:15 AM Archimedes Gaviola <archimedes.gaviola@gmail.com> wrote:

> On Wed, Nov 9, 2022 at 1:37 AM Mark Millard <marklmi@yahoo.com> wrote:
>
>> On Nov 8, 2022, at 04:15, Ronald Klop <ronald-lists@klop.ws> wrote:
>>
>> > From: Warner Losh <imp@bsdimp.com>
>> > Date: Tuesday, 8 November 2022 04:28
>> > To: Archimedes Gaviola <archimedes.gaviola@gmail.com>
>> > . . .
>> > ...
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256929, size: 4096
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3628, size: 4096
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255839, size: 40960
>> > pid 46153 (c++), jid 0, uid 0, was killed: a thread waited too long to allocate a page
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255857, size: 28672
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3634, size: 8192
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 256037, size: 4096
>> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 255320, size: 8192
>> > This means that paging to the swap partition and/or swap file took
>> > too long (> 30 seconds... that's all that indefinite means). It also
>> > means that it can't write dirty pages to the backing store to give
>> > to another process...
>> > The typical reason is that the disk / flash is not responsive to
>> > writes for some reason. You'll need to find out why... I'd look at
>> > trims.
>> > Or.... if you can't change the disk... you need to put less memory
>> > pressure on it..
>> > Warner
>> >
>> > NB: a way to put less memory pressure on it is not using -j3, but
>> > -j2 or -j1 in your make command.
>> >
>
> Hi Mark,
>
>> Extending Ronald's comment: If things are really taking this
>> long for the paging I/O, you might actually find, say, -j2
>> takes less elapsed time than -j3 because of the latencies
>> involved in -j3 causing more overall delay.
>>
>
> Yes, I'll take lowering N in the -jN parameter as my next step. So far
> so good with -j3; the ongoing build has been running for 17 hours now.
>
>> vm.pfault_oom_attempts=-1 would still be appropriate for avoiding
>> I/O kills at any -jN: the smaller -jN just makes the issue less
>> likely, not impossible. (Again, presuming sufficient swap/paging
>> space if deadlock is to be well avoided.)
>>
>
> The ongoing build is currently in
> /usr/src/contrib/llvm-project/llvm/lib/*. I'm checking from time to
> time whether the error occurs again.
>
>> (I use NVMe or SSD USB media that do not get such long delays but
>> fit the power limitations of the context. I have about as little
>> on microsd card media as I can get away with in my context. I also
>> avoid spinning rust. Thus I've only gotten "indefinite wait buffer"
>> or the like back before such was true, long ago.)
>>
>
> Okay, thanks for sharing this. I'll keep it in mind in case I need
> these types of media soon.
>
> Thanks and best regards,
> Archimedes
>

Hi Mark,

As a recap on the kernel tunables, the changes are the following:

root@generic:~ # sysctl -a | grep oom
vm.pageout_oom_seq: 120
vm.pfault_oom_wait: 10
vm.pfault_oom_attempts: -1

With the -j1 and -j2 options, both builds completed the kernel and
buildworld compilation, in 103 and 84 hours respectively. I could still
see "swap_pager: indefinite wait buffer: bufobj" messages, but they can
be ignored since the builds survived them. With the -j3 option, the
build failed partway through with the earlier "failed to reclaim
memory" error, but that error matters less now that -j1 and -j2 both
work. -j2 is the preferred choice for my RPi 3B build setup.

Thanks and best regards,
Archimedes
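
For anyone reproducing this setup: the thread shows only the values reported by sysctl, not how they were set. One way to keep them across reboots, assuming that is wanted, is an /etc/sysctl.conf fragment along these lines. The values are copied from the recap above; the comments are my own reading of the thread, not authoritative documentation.

```
# /etc/sysctl.conf fragment (a sketch; values taken from the recap above)

# Value reported in the recap.
vm.pageout_oom_seq=120

# Value reported in the recap.
vm.pfault_oom_wait=10

# -1 was suggested earlier in the thread to avoid the I/O-related kills
# at any -jN (presuming sufficient swap/paging space).
vm.pfault_oom_attempts=-1
```

The same name=value pairs can be applied to the running system as root, for example `sysctl vm.pfault_oom_attempts=-1`, without waiting for a reboot.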