Date: Fri, 12 Nov 2021 15:10:50 -0500
From: Mark Johnston
To: Chris Ross
Cc: ronald-lists@klop.ws, freebsd-fs
Subject: Re: swap_pager: cannot allocate bio
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

On Thu, Nov 11, 2021 at 04:49:21PM -0500, Chris Ross wrote:
>
>
> > On Nov 11, 2021, at 13:50, Ronald Klop via freebsd-fs wrote:
> >
> > Can you press ctrl-t on the hanging process? That should print the stacktrace indicating where it is waiting on.
>
> So, I rebooted the machine this morning, but now have [tried to] log into it to check on it and find that an ssh connection doesn’t result in a shell. I logged into the console, tried to start a “screen” to get more prompts, and it hung. Ctrl-T on that shows (after running a console screen-capture through OCR, and hand correction, so may not be 100%):
>
> root@host:~ # screen
> load: 0.07 cmd: csh 56116 [vmwait] 35.00r 0.00u 0.01s 0% 3984k
> mi_switch+0xc1 _sleep+0x1cb vm_wait_doms+0xe2 vm_wait_domain+0x51 vm_domain_alloc_fail+0x86 vm_page_alloc_domain_after+0x7e uma_small_alloc+0x58 keg_alloc_slab+0xba zone_import+0xee zone_alloc_item+0x6f malloc+0x5d sigacts_alloc+0x1c fork1+0x9fb sys_fork+0x54 amd64_syscall+0x10c fast_syscall_common+0xf8
>
> As before, ps and even mount and df work here on console. But, a “zpool status tank” will hang as before. A Ctrl-T on it shows:
>
> root@host:~ # screen
> load: 0.00 cmd: zpool 62829 [aw.aew_cv] 37.89r 0.00u 0.00s 0% 6976k
> mi_switch+0xc1 _cv_wait+0xf2 arc_wait_for_eviction+0x14a arc_get_data_impl+0xdb arc_hdr_alloc_abd+0xa6 arc_hdr_alloc+0x11e arc_read+0x4f4 dbuf_read+0xc08 dmu_buf_hold+0x46 zap_lookup_norm+0x35 zap_contains+0x26 vdev_rebuild_get_stats+0xac vdev_config_generate+0x3e9 vdev_config_generate+0x74f spa_config_generate+0x2a2 spa_open_common+0x25c spa_get_stats+0x4e zfs_ioc_pool_stats+0x22
>
>
> > On Nov 11, 2021, at 14:10, Dave Cottlehuber wrote:
> >
> > Grab output of ‘procstat -kk’ and see if this is similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258208 a few more prods might get this one addressed!
>
> procstat -kk 62829 yields the same as above. Which I presume is expected, I’d just never used procstat -kk before.
>
> Unfortunately, I can’t tell if this is sufficiently similar to bug 258208. A different ZFS operation is happening here, so the calls behind my zpool status are different. The other non-zfs stat above (screen in my case) doesn’t seem to be hitting zfs at all, but I may be missing something. Andriy, Mark J, let me know if you think this is relevant, I can build a 13-STABLE with D32931 if you think it will be of use.

No, this looks like a different problem. If it's possible to reproduce this and procstat -kka is usable, it would be helpful to see the full output. In particular, I am wondering if the page daemon is getting blocked waiting for the arc evict handler to successfully allocate memory. https://cgit.freebsd.org/src/commit/?id=97ed4babb51636d8a4b11bc7b207c3219ffcd0e3 is an example of a fix for such a problem, and is not present in 13.0. I would also suggest trying to apply that patch, though I'm fairly sure there are other such problems still lurking.

> Thanks. Let me know any thoughts you have.
>
> - Chris
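
If the full procstat -kka output does get captured to a file, something like the following might help skim it for threads parked in the two places seen in the traces above. This is only a rough sketch: the helper name stuck_threads and the SUSPECT_FUNCS list are made up for illustration, and it assumes nothing about the column layout beyond each thread's stack being printed on one line as symbol+0xoffset frames, as in the Ctrl-T output quoted in this thread.

```python
import re

# Functions taken from the two stuck stacks quoted in this thread.
# This is a starting point for illustration, not a complete list.
SUSPECT_FUNCS = ("vm_wait_doms", "arc_wait_for_eviction")

# One kernel stack frame as printed by Ctrl-T or procstat -kk: symbol+0xoffset.
FRAME_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-fA-F]+\b")


def stuck_threads(lines, suspects=SUSPECT_FUNCS):
    """Map each suspect function to the output lines whose stack contains it.

    `lines` is any iterable of text lines, e.g. an open file holding saved
    procstat -kka output. Matching is on exact symbol names, so
    vm_wait_domain does not count as vm_wait_doms.
    """
    hits = {name: [] for name in suspects}
    for line in lines:
        frames = set(FRAME_RE.findall(line))
        for name in suspects:
            if name in frames:
                hits[name].append(line.strip())
    return hits
```

Usage would be along the lines of saving the output with "procstat -kka > procstat-kka.txt" and then calling stuck_threads(open("procstat-kka.txt")). The same call with suspects set to other function names could be used to look for the page daemon's stack, but which names to look for there is something to read off the real output rather than guess in advance.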