Date: Sat, 27 Nov 2021 01:26:17 +0200
From: Konstantin Belousov
To: Peter Jeremy
Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject: Re: git: b19740f4ce7a - main - swap_pager: lock vnode in swapdev_strategy()
References: <202111251935.1APJZA1e094731@gitrepo.freebsd.org>
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all

On Fri, Nov 26, 2021 at 09:53:03PM +1100, Peter Jeremy wrote:
> On 2021-Nov-25 19:35:10 +0000, Konstantin Belousov wrote:
> > swap_pager: lock vnode in swapdev_strategy()
> >
> > VOP_STRATEGY() requires locked vnode. Note that we lock the swap vnode
> > while pages are busy, but this would only cause real LoR if pages belong
> > to the swap vnode, which must not be the case for correct use.
> >
> > Reported and tested by: peterj
>
> Thanks for those fixes. Unfortunately, I've bumped into another edge
> case: The system can panic during shutdown because it tries to swap
> in data after the network is shutdown. For reasons I haven't tracked
> down, a "swapoff" can fail even though there should be more than
> enough RAM. As an example:
>
> Stopping cron.
> Waiting for PIDS: 1024.
> swapoff: /usr/obj/swapfile: Cannot allocate memory
> Stopping ntpd.
> Waiting for PIDS: 1012.
> Stopping tincd for: vpn
> Waiting for PIDS: 758.
> Stopping rtsold.
> Waiting for PIDS: 351.
> Stopping devd.
> Waiting for PIDS: 754.
> Writing entropy file: .
> Writing early boot entropy file: .
> .
> Terminated
> Nov 26 03:18:44 rock64 syslogd: exiting on signal 15
> Waiting (max 60 seconds) for system process `vnlru' to stop... done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining... 0 0 0 done
> Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done
> Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done
> All buffers synced.
> No strategy for buffer at 0xffff0000c0cd3000
> vnode 0xffffa00006475e00: type VBAD
>     usecount 3, writecount 0, refcount 974016 seqc users 1
>     hold count flags ()
>     flags (VIRF_DOOMED|VV_VMSIZEVNLOCK)
>     lock type nfs: SHARED (count 1)
> swap_pager: I/O error - pagein failed; blkno 184,size 4096, error 45
> panic: VOP_STRATEGY failed bp=0xffff0000c0cd3000 vp=0
> cpuid = 0
> time = 1637857131
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x178
> panic() at panic+0x44
> bufstrategy() at bufstrategy+0x80
> swapdev_strategy() at swapdev_strategy+0xcc
> swap_pager_getpages_locked() at swap_pager_getpages_locked+0x460
> swapoff_one() at swapoff_one+0x3dc
> swapoff_all() at swapoff_all+0x98
> bufshutdown() at bufshutdown+0x2ac
> kern_reboot() at kern_reboot+0x240
> sys_reboot() at sys_reboot+0x358
> do_el0_sync() at do_el0_sync+0x4a4
> handle_el0_sync() at handle_el0_sync+0x90
> --- exception, esr 0x56000000
> KDB: enter: panic
> [ thread pid 1 tid 100002 ]
> Stopped at kdb_enter+0x48: undefined f900c11f
> db>

Try this.

commit 9c62295373f728459c19138f5aa03d9cb8422554
Author: Konstantin Belousov
Date:   Sat Nov 27 01:22:27 2021 +0200

    swapoff_one(): only check free pages count when manually turning swap off

    When swap is turned off due to system shutdown or reboot, ignore the
    check.  The problem is that the check is not accurate by any means: the
    free page count can legitimately be low while the system is still able
    to page in everything from swap.  Then, we turn swap off if swapping
    on a real file or some non-standard geom provider, and typically panic
    when the system actually needs an unavailable page.

    For the syscall, it is better to be safe than sorry.

    Reported by:	peterj
    Sponsored by:	The FreeBSD Foundation
    MFC after:	1 week

diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
index 4cfdb3fd2cc8..981a71b2c4b1 100644
--- a/sys/vm/swap_pager.c
+++ b/sys/vm/swap_pager.c
@@ -469,7 +469,8 @@ static bool swp_pager_swblk_empty(struct swblk *sb, int start, int limit);
 static void swp_pager_free_empty_swblk(vm_object_t, struct swblk *sb);
 static int swapongeom(struct vnode *);
 static int swaponvp(struct thread *, struct vnode *, u_long);
-static int swapoff_one(struct swdevt *sp, struct ucred *cred);
+static int swapoff_one(struct swdevt *sp, struct ucred *cred,
+    bool swapoff_syscall);
 
 /*
  * Swap bitmap functions
@@ -2523,14 +2524,14 @@ sys_swapoff(struct thread *td, struct swapoff_args *uap)
 		error = EINVAL;
 		goto done;
 	}
-	error = swapoff_one(sp, td->td_ucred);
+	error = swapoff_one(sp, td->td_ucred, true);
 done:
 	sx_xunlock(&swdev_syscall_lock);
 	return (error);
 }
 
 static int
-swapoff_one(struct swdevt *sp, struct ucred *cred)
+swapoff_one(struct swdevt *sp, struct ucred *cred, bool swapoff_syscall)
 {
 	u_long nblks;
 #ifdef MAC
@@ -2552,8 +2553,16 @@ swapoff_one(struct swdevt *sp, struct ucred *cred)
 	 * available virtual memory in the system will fit the amount
 	 * of data we will have to page back in, plus an epsilon so
 	 * the system doesn't become critically low on swap space.
+	 * The vm_free_count() part does not account e.g. for clean
+	 * pages that can be immediately reclaimed without paging, so
+	 * this is very rough estimation.
+	 *
+	 * On the other hand, not turning swap off on swapoff_all()
+	 * means that we loose swap data when filesystems go away,
+	 * which is arguably worse.
 	 */
-	if (vm_free_count() + swap_pager_avail < nblks + nswap_lowat)
+	if (swapoff_syscall &&
+	    vm_free_count() + swap_pager_avail < nblks + nswap_lowat)
 		return (ENOMEM);
 
 	/*
@@ -2603,7 +2612,7 @@ swapoff_all(void)
 			devname = devtoname(sp->sw_vp->v_rdev);
 		else
 			devname = "[file]";
-		error = swapoff_one(sp, thread0.td_ucred);
+		error = swapoff_one(sp, thread0.td_ucred, false);
 		if (error != 0) {
 			printf("Cannot remove swap device %s (error=%d), "
 			    "skipping.\n", devname, error);
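
For illustration only, the effect of the new swapoff_syscall argument on the
admission check can be modelled in userland.  This is a rough sketch, not
kernel code: struct vm_model, swapoff_check() and swapoff_check_demo() are
hypothetical names, and the counters are made-up numbers standing in for
vm_free_count(), swap_pager_avail and nswap_lowat.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel counters; not real kernel state. */
struct vm_model {
	unsigned long free_pages;	/* models vm_free_count() */
	unsigned long swap_avail;	/* models swap_pager_avail */
	unsigned long lowat;		/* models nswap_lowat */
};

/*
 * Model of the check in the patched swapoff_one(): the rough estimate is
 * consulted only for the syscall path.  The shutdown path (swapoff_all())
 * passes false and always proceeds.
 */
static int
swapoff_check(const struct vm_model *vm, unsigned long nblks,
    bool swapoff_syscall)
{
	if (swapoff_syscall &&
	    vm->free_pages + vm->swap_avail < nblks + vm->lowat)
		return (ENOMEM);
	return (0);
}

/* Exercises both callers with the same (assumed) memory state. */
void
swapoff_check_demo(void)
{
	struct vm_model vm = { .free_pages = 100, .swap_avail = 50,
	    .lowat = 128 };

	/* Syscall: 150 < 200 + 128, so the request is refused. */
	assert(swapoff_check(&vm, 200, true) == ENOMEM);
	/* Shutdown: same numbers, but the estimate is ignored. */
	assert(swapoff_check(&vm, 200, false) == 0);
	/* Syscall with a small device: 150 >= 10 + 128, allowed. */
	assert(swapoff_check(&vm, 10, true) == 0);
}
```

With the same counters, the syscall caller gets ENOMEM while the shutdown
caller goes ahead with the page-in, which is the behaviour change the patch
is after.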