From: Scott Gasch
Date: Thu, 20 Jul 2023 21:37:29 -0700
Subject: Re: Swap filling up, usermode process swap usage doesn't explain
To: freebsd-questions, Pete Wright
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

Ok, I'm an idiot.  I'm writing to confess and to maybe save someone else
in the future.  The issue was I mounted a tmpfs on /tmp and didn't specify
an upper size limit.  Invariably over time, /tmp would begin to fill up and
my swap space would start to be used.  Of course, I couldn't find any
usermode process that was using the swap and I jumped to the conclusion
that this had something to do with kernel memory.  But really it was my
own stupidity.

Thank you to Pete and others who tried to help.

Scott

On Wed, Jul 19, 2023 at 4:15 PM Scott Gasch <scott.gasch@gmail.com> wrote:

> Replying to my own post with more info... I tried stopping my wireguard
> jail and unloading the if_wg kmod and it did not affect the swap memory
> usage.  Not sure if that lets wireguard off the hook or not though.
>
> If someone who understands kernel memory could chime in... it looks to me
> like the aggregate swap usage of usermode processes is nowhere near the
> total swap space used so I suspect something in kernel mode.  Does this
> make sense or is there another explanation?
>
> Thx,
> Scott
>
>
> On Wed, Jul 19, 2023 at 7:49 AM Scott Gasch <scott.gasch@gmail.com> wrote:
>
>> I am running a 13.2-RELEASE GENERIC kernel and seeing a pattern where,
>> after about 10 days of uptime, my swap begins to fill up.
>>
>> # swapinfo -h
>> Device              Size     Used    Avail Capacity
>> /dev/ada0p3          48G     3.6G      44G     7%
>> /dev/ada1p3          48G     3.6G      44G     7%
>> /dev/ada2p3          48G     3.6G      44G     7%
>> Total               144G      11G     133G     7%
>>
>> So, 11G of total swap space.  What's using it?
>>
>> # systat -swap
>>                     /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
>>      Load Average   ||||||
>>
>> Device/Path       Size  Used |0%  /10  /20  /30  /40  / 60\  70\  80\  90\ 100|
>> ada0p3             48G 3660M XXX
>> ada1p3             48G 3666M XXX
>> ada2p3             48G 3664M XXX
>> Total             144G   11G XXX
>>
>> Pid    Username   Command      Swap/Total Per-Process    Per-System
>>  14703 scott      python3.8     4M / 154M  2%             0%
>>   2451 scott      rclone        4M / 934M  0%             0%
>>   2452 scott      rclone        3M /   1G  0%             0%
>>  73827 scott      bash          1M /  17M  6%             0%
>>  39416 scott      tmux        968K /  54M  1%             0%
>>  41661 scott      bash        828K /  17M  4%             0%
>>  15727 scott      bash        808K /  17M  4%             0%
>>  39420 scott      bash        804K /  17M  4%             0%
>>   2455 scott      bash        544K /  15M  3%             0%
>>  39367 scott      tmux        512K /  15M  3%             0%
>>   2447 scott      bash        376K /  15M  2%             0%
>>   2450 scott      bash        364K /  15M  2%             0%
>>   2453 scott      bash        324K /  15M  2%             0%
>>   2454 scott      bash        316K /  15M  2%             0%
>>   2445 scott      bash        312K /  15M  2%             0%
>>  44937 scott      bash        304K /  17M  1%             0%
>>   2458 scott      bash         72K /  15M  0%             0%
>>
>> At least they agree about it being 11G.  Is this kernel memory being
>> paged out to swap?  The machine has 128G of physical memory and isn't
>> under very heavy load at the moment.
>>
>> I suspect this is a bug in some kernel module... possibly
>> wireguard because I run wireguard in a vnet jail and didn't observe this
>> problem until setting that up.  But I don't have any hard evidence.
>>
>> I've tried to mitigate this via swapoff -a.  This works once but the
>> next day swap will be back, even fuller.  I've been doing regular
>> reboots to fix this but would like to get to the bottom of it.  If left
>> alone, swap will fill up and the machine will get into a "not quite
>> hung" but unusable and useless state.
>>
>> Am I off-base with my suspicion that this is kernel mode memory?  Can
>> someone teach me how to diagnose the status of kernel mode memory heap?
>>
>> Thx,
>> Scott
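
The gap between per-process swap and total swap in the quoted systat
output can be made concrete with a quick sum.  What follows is a minimal
sketch, not something from the thread: the function name sum_proc_swap_kb
is made up for illustration, and it assumes the fourth field of each
process row is the swap figure with a K, M or G suffix (treated here as
powers of 1024).

```shell
#!/bin/sh
# Sum the per-process "Swap" column of a systat -swap style listing.
# Assumption: rows look like "14703 scott python3.8 4M / 154M 2% 0%",
# so field 4 is the swap figure with a K, M or G suffix.
# sum_proc_swap_kb is a hypothetical helper name, not a system tool.
sum_proc_swap_kb() {
    awk '
        $4 ~ /^[0-9.]+[KMG]$/ {
            n = $4 + 0                  # numeric prefix, e.g. 968 from "968K"
            u = substr($4, length($4))  # unit suffix: K, M or G
            if (u == "M") n *= 1024
            else if (u == "G") n *= 1024 * 1024
            total += n
        }
        END { printf "%d\n", total }    # total in units of 1024 bytes
    '
}

# Process rows copied from the systat output in the original post.
listing='Pid Username Command Swap/Total Per-Process Per-System
14703 scott python3.8 4M / 154M 2% 0%
2451 scott rclone 4M / 934M 0% 0%
2452 scott rclone 3M / 1G 0% 0%
73827 scott bash 1M / 17M 6% 0%
39416 scott tmux 968K / 54M 1% 0%
41661 scott bash 828K / 17M 4% 0%
15727 scott bash 808K / 17M 4% 0%
39420 scott bash 804K / 17M 4% 0%
2455 scott bash 544K / 15M 3% 0%
39367 scott tmux 512K / 15M 3% 0%
2447 scott bash 376K / 15M 2% 0%
2450 scott bash 364K / 15M 2% 0%
2453 scott bash 324K / 15M 2% 0%
2454 scott bash 316K / 15M 2% 0%
2445 scott bash 312K / 15M 2% 0%
44937 scott bash 304K / 17M 1% 0%
2458 scott bash 72K / 15M 0% 0%'

# The header row is skipped because its fourth field is not a size.
proc_kb=$(printf '%s\n' "$listing" | sum_proc_swap_kb)
echo "per-process swap: ${proc_kb}K"
```

On the rows from the post this comes to 18820K, roughly 18M, against 11G
of swap in use.  Data held in a tmpfs is backed by memory and swap but is
not charged to any process, which is why the listing could not account
for the rest.  Running df -h /tmp should show how much the tmpfs holds,
and the tmpfs size mount option (for example size=8g, a value picked here
purely for illustration) puts an upper limit on it.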