From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 277389] Reproduceable low memory freeze on 14.0-RELEASE-p5
Date: Tue, 12 Mar 2024 20:48:38 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277389

--- Comment #10 from Mark Millard ---

What OOM console messages are being generated? The kernel has multiple,
distinct OOM messages. Which type(s) are you getting?

"failed to reclaim memory"
"a thread waited too long to allocate a page"
"out of swap space"
"unknown OOM reason %d"

Also, but only for a verbose boot:

"proc %d (%s) failed to alloc page on fault, starting OOM\n"

(Note: "out of swap space" would better be described as: swblk or swpctrie
zone exhausted. That can happen without the swap space showing as fully
used.)

For "failed to reclaim memory":

sysctl -TW vm.pageout_oom_seq=120

(or even larger) can be of use in delaying the OOM activity. The default
is 12. /boot/loader.conf would be a place for such a tunable.

For reference:

# sysctl -Td vm.pageout_oom_seq
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM

Another issue that can happen is that processes doing user I/O end up not
runnable because their kernel stacks have been swapped out, blocking those
processes from running until the kernel stacks are read back in. In
/etc/sysctl.conf I have:

# Together this pair avoids swapping out the process kernel stacks.
# This also avoids processes for interacting with the system from
# being hung up by such.
vm.swap_enabled=0
vm.swap_idle_enabled=0

These are live settable via:

sysctl -W vm.swap_enabled=0
sysctl vm.swap_idle_enabled=0

(They are not tunables, and so do not go in /boot/loader.conf.)

For "a thread waited too long to allocate a page": there is also control
over the criteria for this, but it is more complicated. In
/boot/loader.conf (I'm using the defaults):

#
# For plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
#vm.pfault_oom_attempts=-1
#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes (showing defaults at the time):
#vm.pfault_oom_attempts= 3
#vm.pfault_oom_wait= 10
# (The multiplication is the total but there
# are other potential tradeoffs in the factors
# multiplied, even for nearly the same total.)

If you can be sure of not running out of swap/paging space, you might try
vm.pfault_oom_attempts=-1 . If you do run out of swap/paging space, it
would deadlock, as I understand it. So, if you can tolerate that risk, the
-1 might be an option even if you do run out of swap/paging space. I do
not have specific suggestions for alternatives to 3 and 10; it would be
exploratory for me if I had to try such.

For reference:

# sysctl -Td vm.pfault_oom_attempts vm.pfault_oom_wait
vm.pfault_oom_attempts: Number of page allocation attempts in page fault
handler before it triggers OOM handling
vm.pfault_oom_wait: Number of seconds to wait for free pages before
retrying the page fault handler
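
Pulling those pieces together, one possible starting point (assuming the
messages you are seeing are "failed to reclaim memory" and that your
swap/paging space will not run out; the values are just the examples
discussed above, not something tuned for or tested against your workload)
could look like:

# /boot/loader.conf
vm.pageout_oom_seq=120        # default is 12; delays "failed to reclaim
                              # memory" OOM kills
#vm.pfault_oom_attempts=-1    # only if swap/paging space can never run out

# /etc/sysctl.conf
vm.swap_enabled=0             # keep process kernel stacks out of swap
vm.swap_idle_enabled=0

After a reboot (needed for the loader.conf tunable) you can check what
actually took effect with:

sysctl vm.pageout_oom_seq vm.swap_enabled vm.swap_idle_enabled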