Date: Mon, 15 Nov 2021 09:50:51 -0500
From: Mark Johnston
To: Andriy Gapon
Cc: Chris Ross, freebsd-fs
Subject: Re: swap_pager: cannot allocate bio
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

On Mon, Nov 15, 2021 at 04:20:26PM +0200, Andriy Gapon wrote:
> On 15/11/2021 05:26, Chris Ross wrote:
> > A procstat -kka output is available (208kb of text, 1441 lines) at
> > https://pastebin.com/SvDcvRvb
>
> 67 100542 pagedaemon dom0 mi_switch+0xc1
> _cv_wait+0xf2 arc_wait_for_eviction+0x1df arc_lowmem+0xca
> vm_pageout_worker+0x3c4 vm_pageout+0x1d7 fork_exit+0x8a fork_trampoline+0xe
>
> I was always of an opinion that waiting for the ARC reclaim in arc_lowmem was
> wrong. This shows an example of why it is so.
>
> > An ssh of a top command completed and shows:
> >
> > last pid: 91551; load averages: 0.00, 0.02, 0.30 up 2+00:19:33 22:23:15
> > 40 processes: 1 running, 38 sleeping, 1 zombie
> > CPU: 3.9% user, 0.0% nice, 0.9% system, 0.0% interrupt, 95.2% idle
> > Mem: 58G Active, 210M Inact, 1989M Laundry, 52G Wired, 1427M Buf, 12G Free
>
> To me it looks like there is still plenty of free memory.
>
> I am not sure why vm_wait_domain (called by vm_page_alloc_noobj_domain) is not
> waking up.

It's a deadlock: the page daemon is sleeping on the arc evict thread,
and the arc evict thread is waiting for memory:

 2561 100722 zfskern arc_evict mi_switch+0xc1 _sleep+0x1cb
 vm_wait_doms+0xe2 vm_wait_domain+0x51 vm_page_alloc_noobj_domain+0x184
 uma_small_alloc+0x62 keg_alloc_slab+0xb0 zone_import+0xee
 zone_alloc_item+0x6f arc_evict_state+0x81 arc_evict_cb+0x483
 zthr_procedure+0xba fork_exit+0x8a fork_trampoline+0xe

I presume this is from the marker allocations in arc_evict_state().

The second problem is that UMA is refusing to try to allocate from the
"wrong" NUMA domain, but that policy seems overly strict. Fixing that
alone would make the problem harder to hit, but I think it wouldn't
solve it completely.

> Perhaps this is some sort of a NUMA related issue where one memory domain is
> exhausted while other(s) still have a lot of memory.
> Or maybe it's something else but it must be some sort of a bug.
>
> > ARC: 48G Total, 10G MFU, 38G MRU, 128K Anon, 106M Header, 23M Other
> > 46G Compressed, 46G Uncompressed, 1.00:1 Ratio
> > Swap: 425G Total, 3487M Used, 422G Free
> >
>
> --
> Andriy Gapon
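
For readers following the thread, the circular wait described in the reply can be written down as a tiny wait-for graph. This is only a toy sketch in Python, not kernel code: the thread names come from the procstat output, while the "free pages in domain 0" node, the edge from it back to the page daemon (on the assumption that the page daemon is what would replenish that domain), and the find_cycle helper are all illustrative assumptions.

```python
# Toy model of the wait-for relationships described in the thread.
# Thread names are taken from the procstat output; the resource node and
# the edges are simplifying assumptions, not a description of kernel code.

def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    position = {}
    node = start
    while node is not None and node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ends at something that is not waiting
    return path[position[node]:]

waits_for = {
    # pagedaemon sleeps in arc_wait_for_eviction() until arc_evict makes progress
    "pagedaemon dom0": "arc_evict",
    # arc_evict sleeps in vm_wait_domain() until the domain has free pages
    "arc_evict": "free pages in domain 0",
    # assumption: only the page daemon would replenish that domain
    "free pages in domain 0": "pagedaemon dom0",
}

cycle = find_cycle(waits_for, "pagedaemon dom0")
print(" -> ".join(cycle + [cycle[0]]))

# Hypothetical variation: arc_evict can be satisfied from another domain,
# so in this model it no longer waits on anything.
relaxed = dict(waits_for)
relaxed["arc_evict"] = None
print(find_cycle(relaxed, "pagedaemon dom0"))
```

In this model the relaxed variant has no cycle, which matches the remark that loosening the NUMA policy would make the problem harder to hit. It says nothing about the case where every domain is short of memory, which is why that change alone would not be a complete fix.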