From nobody Sun Nov 28 02:00:56 2021 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 76FEF18AFAF0; Sun, 28 Nov 2021 02:01:05 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4J1sCP1Ptpz3F4g; Sun, 28 Nov 2021 02:01:05 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.16.1/8.16.1) with ESMTPS id 1AS20u8T070451 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sun, 28 Nov 2021 04:00:59 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 1AS20u8T070451 Received: (from kostik@localhost) by tom.home (8.16.1/8.16.1/Submit) id 1AS20ud9070450; Sun, 28 Nov 2021 04:00:56 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sun, 28 Nov 2021 04:00:56 +0200 From: Konstantin Belousov To: Peter Jeremy Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: b19740f4ce7a - main - swap_pager: lock vnode in swapdev_strategy() Message-ID: References: <202111251935.1APJZA1e094731@gitrepo.freebsd.org> List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.5 X-Spam-Checker-Version: SpamAssassin 3.4.5 (2021-03-20) on tom.home X-Rspamd-Queue-Id: 4J1sCP1Ptpz3F4g X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[] X-ThisMailContainsUnwantedMimeParts: N On Sun, Nov 28, 2021 at 12:22:46PM +1100, Peter Jeremy wrote: > On 2021-Nov-27 01:26:17 +0200, Konstantin Belousov wrote: > >commit 9c62295373f728459c19138f5aa03d9cb8422554 > >Author: Konstantin Belousov > >Date: Sat Nov 27 01:22:27 2021 +0200 > > > > swapoff_one(): only check free pages count manually turning swap off > > That didn't work but I don't think the underlying bug is related to > your recent work on swap_pager - digging back through my logs, I've > found another similar panic in August last year. It is definitely not caused by, it is just yet another issue. > > Nov 28 09:40:17 rock64 syslogd: exiting on signal 15 > Waiting (max 60 seconds) for system process `vnlru' to stop... done > Waiting (max 60 seconds) for system process `syncer' to stop... > Syncing disks, vnodes remaining... 0 0 done > Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done > Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done > All buffers synced. > No strategy for buffer at 0xffff0000bf8dc000 > vnode 0xffffa00009024a80: type VBAD > usecount 2, writecount 0, refcount 33263 seqc users 1 > hold count flags () > flags (VIRF_DOOMED|VV_VMSIZEVNLOCK) > lock type nfs: SHARED (count 1) > swap_pager: I/O error - pagein failed; blkno 241400,size 4096, error 45 > panic: VOP_STRATEGY failed bp=0xffff0000bf8dc000 vp=0 > cpuid = 0 > time = 1638052821 > KDB: stack backtrace: > db_trace_self() at db_trace_self > db_trace_self_wrapper() at db_trace_self_wrapper+0x30 > vpanic() at vpanic+0x178 > panic() at panic+0x44 > bufstrategy() at bufstrategy+0x80 > swapdev_strategy() at swapdev_strategy+0xcc > swap_pager_getpages_locked() at swap_pager_getpages_locked+0x460 > swapoff_one() at swapoff_one+0x3e4 > swapoff_all() at swapoff_all+0x9c > bufshutdown() at bufshutdown+0x2ac > kern_reboot() at kern_reboot+0x240 > sys_reboot() at sys_reboot+0x358 > do_el0_sync() at do_el0_sync+0x4a4 > handle_el0_sync() at handle_el0_sync+0x9c > --- exception, esr 0x56000000 > KDB: enter: panic > [ thread pid 1 tid 100002 ] > Stopped at kdb_enter+0x48: undefined f900c11f > db> > > This is the same traceback as my previous mail. Looking at the code > path, the test whether there's enough RAM to swap in all the data > passes in both cases: If swapoff_one() returned ENOMEM then > swapoff_all() would report a "Cannot remove swap device" error and > keep going (not bother to actually remove the swap device) - and > that's not happening. > > I think the important message is "No strategy for buffer at 0x..." > which comes from vop_nostrategy() and causes bufstrategy() to panic: > > swapdev_strategy() > => bstrategy() > => BO_STRATEGY() > => bufstrategy() > => VOP_STRATEGY() > => VOP_STRATEGY_APV() > => vop_nostrategy() > => bufdone() => swp_pager_async_iodone() > > Presumably, stopping the network means there's no longer any way for > swap operations to complete so the swap device has become associated > with default_vnodeops, (though I haven't dug into the actual code > path that does that). > > Moving up a level, does it really matter if swapoff_one() is skipped? > If it actually returned an error (eg if the free memory test failed), > then that's what would happen. By this point in the shutdown, there's > no userland left (which makes me wonder why there's anything left in > swap in any case) and only the final cleanups remain before the kernel > shuts down. What's really needed is a way to detect that the relevant > swap I/O provider has gone away and return to swapoff_all() without > panicing. I think the intent there is to ensure that the system is in as much steady state as possible. I can easily imagine some HBA doing weird things if some io is in flight when the reset comes in. So we want to ensure that no io occurs. The cause for your panic is not the network interface down state (in fact, I think that interface state up), but as you correctly analyzed, the call to vop_nostrategy(). It is there in stack because underlying filesystem was unmounted, and the swap vnode was reclaimed, replacing NFS vop vector with deadfs vop vector, which inherits from the default vop. The bit that I do not understand from your report, is why swapoff_all() did not occured earlier. Your earlier report, where spurious ENOMEM came out from swapoff(2) syscall, and which I fixed by the previous patch, means that there was an attempt to swapoff. Still, the solution is, IMO, to swapoff before unmount. We do not need swapin for flushing buffers. Please try this combined patch. diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c index 4b746a269171..201afeec9311 100644 --- a/sys/kern/vfs_bio.c +++ b/sys/kern/vfs_bio.c @@ -1452,16 +1452,21 @@ bufshutdown(int show_busybufs) */ printf("Giving up on %d buffers\n", nbusy); DELAY(5000000); /* 5 seconds */ + swapoff_all(); } else { if (!first_buf_printf) printf("Final sync complete\n"); + /* - * Unmount filesystems + * Unmount filesystems. Swapoff before unmount, + * because swap on file is unoperational after unmount + * of underlying filesystem. */ - if (!KERNEL_PANICKED()) + if (!KERNEL_PANICKED()) { + swapoff_all(); vfs_unmountall(); + } } - swapoff_all(); DELAY(100000); /* wait for console output to finish */ } diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c index 4cfdb3fd2cc8..981a71b2c4b1 100644 --- a/sys/vm/swap_pager.c +++ b/sys/vm/swap_pager.c @@ -469,7 +469,8 @@ static bool swp_pager_swblk_empty(struct swblk *sb, int start, int limit); static void swp_pager_free_empty_swblk(vm_object_t, struct swblk *sb); static int swapongeom(struct vnode *); static int swaponvp(struct thread *, struct vnode *, u_long); -static int swapoff_one(struct swdevt *sp, struct ucred *cred); +static int swapoff_one(struct swdevt *sp, struct ucred *cred, + bool swapoff_syscall); /* * Swap bitmap functions @@ -2523,14 +2524,14 @@ sys_swapoff(struct thread *td, struct swapoff_args *uap) error = EINVAL; goto done; } - error = swapoff_one(sp, td->td_ucred); + error = swapoff_one(sp, td->td_ucred, true); done: sx_xunlock(&swdev_syscall_lock); return (error); } static int -swapoff_one(struct swdevt *sp, struct ucred *cred) +swapoff_one(struct swdevt *sp, struct ucred *cred, bool swapoff_syscall) { u_long nblks; #ifdef MAC @@ -2552,8 +2553,16 @@ swapoff_one(struct swdevt *sp, struct ucred *cred) * available virtual memory in the system will fit the amount * of data we will have to page back in, plus an epsilon so * the system doesn't become critically low on swap space. + * The vm_free_count() part does not account e.g. for clean + * pages that can be immediately reclaimed without paging, so + * this is very rough estimation. + * + * On the other hand, not turning swap off on swapoff_all() + * means that we loose swap data when filesystems go away, + * which is arguably worse. */ - if (vm_free_count() + swap_pager_avail < nblks + nswap_lowat) + if (swapoff_syscall && + vm_free_count() + swap_pager_avail < nblks + nswap_lowat) return (ENOMEM); /* @@ -2603,7 +2612,7 @@ swapoff_all(void) devname = devtoname(sp->sw_vp->v_rdev); else devname = "[file]"; - error = swapoff_one(sp, thread0.td_ucred); + error = swapoff_one(sp, thread0.td_ucred, false); if (error != 0) { printf("Cannot remove swap device %s (error=%d), " "skipping.\n", devname, error);