Re: swap_pager: cannot allocate bio
- In reply to: Chris Ross : "Re: swap_pager: cannot allocate bio"
Date: Tue, 18 Jan 2022 20:29:48 UTC
On Fri, Dec 31, 2021 at 09:08:48PM -0500, Chris Ross wrote:
>
>
> > On Nov 25, 2021, at 00:18, Chris Ross <cross+freebsd@distal.com> wrote:
> >>> On Nov 20, 2021, at 13:23, Mark Johnston <markj@freebsd.org> wrote:
> >>>
> >>> Here is a patch which tries to address the proximate cause of the
> >>> problem.
> >
> >
> > The system is still cooking along, running the job that previously was
> > causing it to get stuck after 48ish hours. It’s been running more than
> > 80 hours now, so a definite improvement.
> >
> > zfs-stats reports things as stable, 80% cache hit ratio, 64GB max arc,
> > ~85% of that currently as “adaptive target”. If there’s anything you
> > would like to get from the system, data-wise, let me know. Happy to
> > share, and help get this fix, or a different better fix if needed, into the
> > tree.
>
> Hello all. Just was curious if anyone had a different solution for the
> problem I was seeing, or if not, if the patch from Mark that I manually
> applied can be integrated to the tree for current, and MFR’d to 13.
>
> Thank you. Please update me as to the current status of this issue,
> so I don’t update and lose functionality at some later point. :-)

Sorry for the delay. I submitted a pull request to openzfs which fixes
the problem in a different way:
https://github.com/openzfs/zfs/pull/12985#pullrequestreview-855857099
I expect it will be merged in some form soon, and merged into FreeBSD
within the next several weeks.

I still see some problems around low memory handling on NUMA systems
when the ARC consumes most of RAM, but these aren't particularly related
to the deadlock. More specifically, under severe memory pressure the
page daemon will shrink UMA caches asynchronously, but the pages freed
this way are not counted as frees by the page daemon, which may thus
conclude that it's not making progress and trigger an OOM kill.
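The accounting problem described above can be illustrated with a toy model. This is only a sketch under invented assumptions, not FreeBSD code: the function name, the per-pass page counts, and the stall limit are all hypothetical. It shows how a reclaim loop that judges progress only by the frees it counts itself can give up even while memory is really being returned by another path.

```python
def run_reclaim(shortage, direct_frees, async_frees, stall_limit,
                count_async=False):
    """Toy reclaim loop (hypothetical, not the real page daemon).

    shortage     -- pages that must be freed to clear the shortage
    direct_frees -- pages freed per pass that the loop counts as its own
    async_frees  -- pages freed per pass by an asynchronous cache drain
    stall_limit  -- passes without observed progress before declaring OOM
    count_async  -- whether asynchronous frees count as progress
    """
    remaining = shortage
    stalled = 0
    passes = 0
    while remaining > 0:
        passes += 1
        # Memory really is being returned by both paths.
        remaining -= direct_frees + async_frees
        # But the loop only "sees" the frees it accounts for.
        observed = direct_frees + (async_frees if count_async else 0)
        stalled = 0 if observed > 0 else stalled + 1
        if remaining > 0 and stalled >= stall_limit:
            return ("oom", passes)
    return ("recovered", passes)


# Asynchronous frees ignored: the loop believes it is stuck.
print(run_reclaim(1000, 0, 100, 5))
# Asynchronous frees counted: the same workload recovers.
print(run_reclaim(1000, 0, 100, 5, count_async=True))
```

In this model the only difference between the two runs is the accounting, which is the point of the paragraph above: the real shortage shrinks at the same rate in both.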
Further exacerbating the problem is that the default value of the ARC's
grow_retry constant was changed from 60s to 5s in OpenZFS, and the new
value is smaller than the default lowmem period (10s). This makes it
easy for the page daemon to fall behind its target, causing the integral
term of its PID controller to grow quite large. The page daemon then
goes into overdrive even though the instantaneous magnitude of the
domain's page shortage is quite small.
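The effect of a large integral term can be seen in a minimal discrete PI controller. This is a sketch for illustration only: the gains, the shortage values, and the function name are made up, and it is not the kernel's controller (which may also have a derivative term and bounds on the integral).

```python
def reclaim_targets(kp, ki, shortages):
    """Return the controller output for each sampled page shortage.

    kp, ki    -- hypothetical proportional and integral gains
    shortages -- page shortage (the error) observed at each interval
    """
    integral = 0.0
    outputs = []
    for error in shortages:
        # The integral accumulates every interval the daemon is behind.
        integral += error
        outputs.append(kp * error + ki * integral)
    return outputs


# Twenty intervals of sustained shortage, then a nearly resolved one.
history = [100] * 20 + [5]
targets = reclaim_targets(kp=1.0, ki=0.5, shortages=history)

proportional_only = 1.0 * history[-1]
print("final shortage:", history[-1])
print("proportional part:", proportional_only)
print("controller output:", targets[-1])
```

In this run the final output is dominated by the accumulated integral rather than by the current shortage, which matches the "overdrive" behaviour described above: the controller keeps demanding aggressive reclamation long after the instantaneous shortage has become small.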