Re: swap_pager: cannot allocate bio

From: Warner Losh <imp_at_bsdimp.com>
Date: Fri, 12 Nov 2021 19:52:52 UTC
On Fri, Nov 12, 2021 at 12:50 PM Chris Ross <cross+freebsd@distal.com>
wrote:

>
>
> > On Nov 12, 2021, at 11:15, Warner Losh <imp@bsdimp.com> wrote:
> > So the root cause of this problem is well known. You have a memory
> > shortage, so you want to page out dirty pages to reclaim memory.
> > However, there's not enough memory to allocate the structures you need
> > to do I/O, and so the swapout I/O fails halfway down the stack, not
> > being able to allocate a bio. Some paths through the swapper cope with
> > this well; other parts that execute less often cope less well.
> >
> > There are some hacks in the tree today to help with the GELI case: we
> > prioritize swapping I/O. But there's no g_alloc_bio_swapping() interface
> > for swapping I/O to get priority on allocating a bio to start with.
> > Places that use g_clone_bio() could have the clone's copy allocated
> > from a special swap pool, but that starts to get messy and isn't done
> > today. And the upper layers like geom_cfs and ZFS are inconsistent in
> > allocations, so there's work needed to make it robust in ZFS, but I
> > have only a vague notion of what's needed. At the very least, the
> > swapping I/O that comes into the top of ZFS won't have swapping I/O
> > marked coming out the bottom because the BIO_SWAP flag is quite new.
> >
> > So until then, swapping on zvols is fraught with deadlocks like this,
> > and in the past there's been a strong admonishment against it.
>
> Apologies, Warner, but I’m not sure I’m understanding this last
> statement.  If you mean swapping _onto_ zvols, I’m not doing that.  If you
> mean swapping in any way, while having zvols, then yes I am doing that.
>
> My swap is on a partition on the non-ZFS disk.  A physical disk as far as
> the kernel knows, hardware RAID1.
>
> # pstat -s
> Device           1K-blocks   Used    Avail  Capacity
> /dev/da0p3      445682648 1018524 444664124 0%
>

OK. That's well supported and should work w/o some of the issues that I
raised. I'd misunderstood and thought you were swapping to zvols...


> Let me know if what you’re saying above is true to my case, and any advice
> as to how I can avoid it.  I had a “not enough swap space” a while back,
> and accordingly increased the size of my swap partition.  I have 128GB of
> memory, though between the ARC and the big process I was running, that
> fills it easily.
>

Yeah, this is a 'memory is exhausted' problem, and more swap won't help
that. It's unclear why we run out so fast, and why the separate zones for
the bio aren't providing a good level of insulation from out-of-memory
scenarios.
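
For anyone following along, here is a rough userland sketch of the
"reserve for swap I/O" idea quoted above. It is only a toy model: the
names (req_pool, pool_alloc, pool_free, POOL_SIZE, SWAP_RESERVE) are
made up for illustration and are not FreeBSD kernel interfaces, and it
says nothing about how bios are cloned further down a real GEOM stack.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are kernel APIs. */
#define POOL_SIZE    8   /* total request slots in the pool */
#define SWAP_RESERVE 2   /* slots that only swap I/O may consume */

struct req_pool {
	int in_use;      /* slots currently handed out */
};

/*
 * Take one slot from the pool.  Ordinary I/O is refused once only the
 * reserved slots remain; swap I/O may use every slot.
 * Returns true on success, false if the caller must wait or fail.
 */
bool
pool_alloc(struct req_pool *p, bool is_swap)
{
	int limit = is_swap ? POOL_SIZE : POOL_SIZE - SWAP_RESERVE;

	if (p->in_use >= limit)
		return (false);
	p->in_use++;
	return (true);
}

/* Return one slot to the pool. */
void
pool_free(struct req_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}
```

With these constants, ordinary callers can take at most 6 slots from an
empty pool, and a swap-out arriving afterwards can still take the
remaining 2, so the pageout path keeps making progress even when
everything else has been refused.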

Warner