From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 277389] Reproduceable low memory freeze on 14.0-RELEASE-p5
Date: Tue, 12 Mar 2024 20:48:38 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277389

--- Comment #10 from Mark Millard ---
What OOM console messages are being generated? The kernel has multiple,
distinct OOM messages. Which type(s) are you getting?

"failed to reclaim memory"
"a thread waited too long to allocate a page"
"out of swap space"
"unknown OOM reason %d"

Also, but only for a verbose boot:

"proc %d (%s) failed to alloc page on fault, starting OOM\n"

(Note: "out of swap space" would be better described as: swblk or
swpctrie zone exhausted. That can happen without the swap space showing
as fully used.)

For "failed to reclaim memory":

sysctl -TW vm.pageout_oom_seq=120

(or even larger) could be of use in delaying the OOM activity. The
default is 12. /boot/loader.conf would be a place for such a tunable.
For reference:

# sysctl -Td vm.pageout_oom_seq
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM

Another issue that can happen is processes doing user I/O ending up not
runnable because their kernel stacks have been swapped out, which blocks
those processes from running until the kernel stacks are read back in.
In /etc/sysctl.conf I have:

# Together this pair avoids swapping out the process kernel stacks.
# This also avoids processes for interacting with the system from
# being hung up by such.
vm.swap_enabled=0
vm.swap_idle_enabled=0

These are settable on a live system via:

sysctl -W vm.swap_enabled=0
sysctl vm.swap_idle_enabled=0

(They are not tunables, and so do not go in /boot/loader.conf.)

For "a thread waited too long to allocate a page" there is also control
over the criteria, but it is more complicated. In /boot/loader.conf
(I'm using the defaults):

#
# For plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
#vm.pfault_oom_attempts=-1
#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes (showing the defaults at the time):
#vm.pfault_oom_attempts=3
#vm.pfault_oom_wait=10
# (The product of the two is the total wait, but
# there are other potential tradeoffs in the
# factors multiplied, even for nearly the same total.)

If you can be sure of not running out of swap/paging space, you might
try vm.pfault_oom_attempts=-1 . If you do run out of swap/paging space,
it would deadlock, as I understand it. So, if you can tolerate that
risk, the -1 might be an option even if you do run out of swap/paging
space. I do not have specific suggestions for alternatives to 3 and 10;
it would be exploratory for me if I had to try such.

For reference:

# sysctl -Td vm.pfault_oom_attempts vm.pfault_oom_wait
vm.pfault_oom_attempts: Number of page allocation attempts in page
fault handler before it triggers OOM handling
vm.pfault_oom_wait: Number of seconds to wait for free pages before
retrying the page fault handler
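
As a starting point, a minimal sketch (not from the original report) of
checking where a system currently stands before experimenting with the
settings above; the OIDs are the ones referenced above, and swapinfo(8)
is the standard tool for swap usage:

# sysctl vm.pageout_oom_seq vm.pfault_oom_attempts vm.pfault_oom_wait
# sysctl vm.swap_enabled vm.swap_idle_enabled
# swapinfo -h

Comparing the swapinfo output against the OOM message type observed can
help distinguish genuine swap-space exhaustion from the swblk/swpctrie
zone exhaustion case noted above.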