From: Odhiambo Washington
Date: Fri, 5 Jul 2024 10:56:17 +0300
Subject: Server became inaccessible because it ran out of swap space
To: questions
List-Id: User questions
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

I have a server with 64 GB of RAM and 2 CPUs, each with 16 cores. I have also configured 13 GB of swap space.

```
root@gw:/usr/local/bhyve-vms/scripts # swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p3       3163136   703316  2459820    22%
/dev/md0.eli     10485760   709352  9776408     7%
Total            13648896  1412668 12236228    10%
root@gw:/usr/local/bhyve-vms/scripts #
```
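
For anyone who wants to track these figures over time, here is a minimal sketch that parses swapinfo-style text and recomputes the total and the percentage in use. The sample is copied from the output above; `parse_swapinfo` and `SAMPLE` are names made up for this illustration, not part of any FreeBSD tool.

```python
# Sample copied from the swapinfo output above.
SAMPLE = """\
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p3       3163136   703316  2459820    22%
/dev/md0.eli     10485760   709352  9776408     7%
Total            13648896  1412668 12236228    10%
"""


def parse_swapinfo(text):
    """Return {device: (blocks_1k, used_1k, avail_1k)} from swapinfo-style text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a 5-column data row.
        if len(fields) != 5 or not fields[1].isdigit():
            continue
        rows[fields[0]] = tuple(int(f) for f in fields[1:4])
    return rows


rows = parse_swapinfo(SAMPLE)
# Sum the per-device rows ourselves instead of trusting the "Total" line.
devices = {name: v for name, v in rows.items() if name != "Total"}
summed = tuple(sum(col) for col in zip(*devices.values()))
used_pct = 100 * summed[1] / summed[0]
print(f"swap: {summed[0] / 1024**2:.1f} GiB total, {used_pct:.1f}% used")
```

Run periodically against live `swapinfo` output, something like this could show how quickly swap fills up before the machine becomes unreachable.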

A number of times it has become inaccessible until I do a hard reboot, and I believe the cause is that it runs out of swap.

Below is what I obtained from /var/log/messages after I rebooted.

How do I identify the culprit and arrest the situation?


```
Jul  5 06:50:56 gw kernel: failed
Jul  5 06:52:11 gw kernel: failed
Jul  5 06:52:11 gw kernel: out of swap space
Jul  5 06:52:11 gw kernel: failed
Jul  5 06:52:11 gw kernel: failed
Jul  5 06:52:12 gw kernel: failed
Jul  5 06:52:12 gw kernel: failed
Jul  5 06:54:06 gw kernel: out of swap space
Jul  5 06:54:06 gw kernel: failed
Jul  5 07:16:30 gw kernel: pid 4076 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: pid 4076 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: tap4: link state changed to DOWN
Jul  5 07:16:30 gw kernel: out of swap space
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:30 gw kernel: pid 20849 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: pid 20849 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: tap5: link state changed to DOWN
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:30 gw kernel: sonewconn: pcb 0xfffff8002866d100 (local:/var/run/wsgi.38620.0.1.sock): Listen queue overflow: 151 already in queue awaiting acceptance (1 occurrences), euid 0, rgid 0, jail 0
Jul  5 07:16:30 gw kernel: pid 3591 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: pid 3591 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: tap3: link state changed to DOWN
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:30 gw kernel: out of swap space
Jul  5 07:16:30 gw kernel: failed
Jul  5 07:16:31 gw kernel: failed
Jul  5 07:16:31 gw kernel: failed
Jul  5 07:16:32 gw kernel: out of swap space
Jul  5 07:16:33 gw kernel: out of swap space
Jul  5 07:16:33 gw kernel: failed
Jul  5 07:16:33 gw kernel: failed
Jul  5 07:16:34 gw kernel: out of swap space
Jul  5 07:16:34 gw kernel: failed
Jul  5 07:16:36 gw kernel: failed
Jul  5 07:16:36 gw kernel: failed
Jul  5 07:16:36 gw kernel: failed
Jul  5 07:16:36 gw kernel: failed
Jul  5 07:16:36 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:37 gw kernel: failed
Jul  5 07:16:38 gw kernel: failed
```
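
To make the excerpt above easier to read, here is a rough sketch that counts the "out of swap space" messages and lists which processes the kernel reports killing. It only summarizes what is already in the log; `summarize`, `KILLED` and `SAMPLE` are hypothetical names chosen for this example, and the sample holds just a few of the lines shown above.

```python
import re
from collections import Counter

# A few lines copied from the /var/log/messages excerpt above.
SAMPLE = """\
Jul  5 06:52:11 gw kernel: out of swap space
Jul  5 06:52:11 gw kernel: failed
Jul  5 07:16:30 gw kernel: pid 4076 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: pid 4076 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: pid 20849 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
Jul  5 07:16:30 gw kernel: pid 3591 (bhyve), jid 0, uid 0, was killed: failed to reclaim memory
"""

# Matches the "pid N (command), ... was killed: reason" kernel lines.
KILLED = re.compile(r"kernel: pid (\d+) \((\S+)\), .* was killed: (.+)$")


def summarize(text):
    """Count swap-exhaustion messages and collect the processes that were killed."""
    swap_msgs = 0
    killed = {}  # pid -> (command, reason); keying by pid drops the repeated lines
    for line in text.splitlines():
        if line.endswith("kernel: out of swap space"):
            swap_msgs += 1
        match = KILLED.search(line)
        if match:
            killed[int(match.group(1))] = (match.group(2), match.group(3))
    return swap_msgs, killed


swap_msgs, killed = summarize(SAMPLE)
by_command = Counter(command for command, _ in killed.values())
print(f"out-of-swap messages: {swap_msgs}")
print(f"killed pids: {sorted(killed)}; by command: {dict(by_command)}")
```

Every process killed in this excerpt is a bhyve process (pids 4076, 20849 and 3591). That shows which processes the kernel chose to kill, not necessarily which one exhausted the memory.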

--
Best regards,
Odhiambo WASHINGTON,
Nairobi,KE
+254 7 3200 0004/+254 7 2274 3223
In an Internet failure case, the #1 suspect is a constant: DNS.
"Oh, the cruft.", egrep -v '^$|^.*#' ¯\_(ツ)_/¯ :-)
[How to ask smart questions: http://www.catb.org/~esr/faqs/smart-questions.html]