FreeBSD 11 i386 disk deadlock (I think) (now with reproduction steps!)
David Cross
dcrosstech at gmail.com
Mon Nov 28 17:50:58 UTC 2016
I wouldn't call this a 'workaround'; it is the right answer. Nothing in
the disk I/O path should allocate memory from a pool that can trigger
paging, since any part of that path may itself be needed for paging. That
is what I assumed Fabian's proposed patch did.
From looking at the process list on my machine, it seems that geli
allocates a process per core per provider. Is there a reason not to have
each of these allocate a single sector-size buffer at startup and put all
operations through it? You're not (realistically) going to get more
concurrency than that. Another approach would be to pre-allocate a ring
buffer of the desired operational depth, but that seems like overkill.
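To make the pre-allocation idea concrete, here is a rough user-space
sketch in C. It is not geli or kernel code, and every name in it
(buf_pool, pool_init, pool_get, pool_put, pool_destroy) as well as the
sector size and depth are assumptions made up for illustration. The point
is only that all memory is obtained once at startup, so the I/O path
never has to allocate; with POOL_DEPTH set to 1 it is the single-buffer
case. It assumes one pool per worker, so there is no locking.

```c
/*
 * Illustrative sketch only: a fixed pool of sector-sized buffers that is
 * filled once at startup. All names and sizes here are hypothetical.
 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SECTOR_SIZE 4096   /* assumed sector size in bytes */
#define POOL_DEPTH  4      /* assumed number of in-flight operations */

struct buf_pool {
    unsigned char *slab;                   /* single allocation made at startup */
    unsigned char *free_list[POOL_DEPTH];  /* stack of currently unused buffers */
    size_t nfree;                          /* number of entries in free_list */
};

/* Allocate every buffer up front; this is the only call that touches malloc. */
static int pool_init(struct buf_pool *p)
{
    p->slab = malloc((size_t)SECTOR_SIZE * POOL_DEPTH);
    if (p->slab == NULL)
        return -1;
    for (size_t i = 0; i < POOL_DEPTH; i++)
        p->free_list[i] = p->slab + i * SECTOR_SIZE;
    p->nfree = POOL_DEPTH;
    return 0;
}

/*
 * Take a buffer for one operation. Returns NULL when all buffers are in
 * use; the caller is expected to wait or retry instead of allocating.
 */
static unsigned char *pool_get(struct buf_pool *p)
{
    if (p->nfree == 0)
        return NULL;
    return p->free_list[--p->nfree];
}

/* Return a buffer to the pool once the operation has completed. */
static void pool_put(struct buf_pool *p, unsigned char *buf)
{
    assert(p->nfree < POOL_DEPTH);
    assert(buf >= p->slab &&
           buf < p->slab + (size_t)SECTOR_SIZE * POOL_DEPTH);
    p->free_list[p->nfree++] = buf;
}

/* Release the slab at shutdown. */
static void pool_destroy(struct buf_pool *p)
{
    free(p->slab);
    p->slab = NULL;
    p->nfree = 0;
}
```

A worker would call pool_init() once when it starts, then pool_get() and
pool_put() around each request. When pool_get() returns NULL the request
has to wait for a buffer to come back, which is the trade-off: bounded
concurrency in exchange for never allocating while memory is short.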
On Mon, Nov 28, 2016 at 11:22 AM, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> On Mon, Nov 28, 2016 at 06:03:11PM +0200, Konstantin Belousov wrote:
>
> > On Mon, Nov 28, 2016 at 02:43:30PM +0100, Fabian Keil wrote:
> > > David Cross <dcrosstech at gmail.com> wrote:
> > >
> > > > This is certainly new behavior, or a new manifestation.
> > >
> > > Recently a couple of uma consumers were changed to share uma zones
> > > instead of using a dedicated zone. As a result geli competes with
> > > more uma consumers and is more likely to deadlock. The bug isn't
> > > new, it's just triggered more often now.
> > > The problem happens on a layer much lower than UMA: it is the whole
> > > reusable page pool that is depleted and cannot be refilled without
> > > allocating more memory. If you think about it, the deadlock is trivial:
> > > the pagedaemon is the main source of free pages, but if producing a free
> > > page requires allocating one, a low-memory condition amounts to deadlock.
> >
> > > It was always there, in the sense that in every version of FreeBSD, if
> > > the file/disk write path requires memory allocation, there is trouble.
> >
> > For geom, some special unique measures were taken so that bio allocations
> > do not cause the issue in typical situations.
>
> The typical workaround for this is to pre-allocate some memory for the
> operation.
>