Re: Swap filling up, usermode process swap usage doesn't explain

From: Scott Gasch <scott.gasch_at_gmail.com>
Date: Fri, 21 Jul 2023 04:37:29 UTC

Ok, I'm an idiot.  I'm writing to confess and maybe to save someone else in
the future.  The issue was that I mounted a tmpfs on /tmp without specifying
an upper size limit.  Over time, /tmp would fill up, and because tmpfs data
is ordinary pageable memory, it got pushed out to swap.  Those pages don't
belong to any process, which is why I couldn't find a usermode process that
was using the swap, and I jumped to the conclusion that this had something
to do with kernel memory.  But really it was my own stupidity.
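
For anyone hitting the same symptom, a minimal sketch of the fix is to cap
the tmpfs in /etc/fstab.  The 8g figure below is an arbitrary placeholder
(an assumption, not the limit actually used on this machine); size and mode
are standard tmpfs mount options on FreeBSD.

```conf
# /etc/fstab (sketch; size=8g is a made-up example value, pick your own)
# Device  Mountpoint  FStype  Options               Dump  Pass#
tmpfs     /tmp        tmpfs   rw,mode=1777,size=8g  0     0
```

With a cap in place, df -h /tmp shows how close /tmp is to the limit, and
writes fail with "No space left on device" once it is reached instead of
/tmp growing without bound.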

Thank you to Pete and others who tried to help.

Scott
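
P.S. The mismatch is easy to put a number on.  The sketch below adds up the
per-process Swap column from the systat rows quoted further down.  The
function name sum_swap_kib is made up for illustration, and it assumes rows
of the form "PID USER COMMAND SWAP / TOTAL ..." with K, M or G suffixes and
no spaces in the command name.

```shell
#!/bin/sh
# Sum the per-process "Swap" column (field 4) of pasted `systat -swap` rows.
# Hypothetical helper: assumes K/M/G suffixes and a "/" in field 5.
sum_swap_kib() {
    awk '
        $1 ~ /^[0-9]+$/ && $5 == "/" {
            n = $4 + 0                     # numeric part, e.g. 968 from "968K"
            u = substr($4, length($4), 1)  # unit suffix: K, M or G
            if (u == "M") n *= 1024
            else if (u == "G") n *= 1024 * 1024
            total += n                     # running total in KiB
        }
        END { printf "%d\n", total }
    '
}

# The 17 process rows from the original post.
sum_swap_kib <<'EOF'
 14703 scott      python3.8    4M / 154M  2%              0%
  2451 scott      rclone       4M / 934M  0%              0%
  2452 scott      rclone       3M /   1G  0%              0%
 73827 scott      bash         1M /  17M  6%              0%
 39416 scott      tmux       968K /  54M  1%              0%
 41661 scott      bash       828K /  17M  4%              0%
 15727 scott      bash       808K /  17M  4%              0%
 39420 scott      bash       804K /  17M  4%              0%
  2455 scott      bash       544K /  15M  3%              0%
 39367 scott      tmux       512K /  15M  3%              0%
  2447 scott      bash       376K /  15M  2%              0%
  2450 scott      bash       364K /  15M  2%              0%
  2453 scott      bash       324K /  15M  2%              0%
  2454 scott      bash       316K /  15M  2%              0%
  2445 scott      bash       312K /  15M  2%              0%
 44937 scott      bash       304K /  17M  1%              0%
  2458 scott      bash        72K /  15M  0%              0%
EOF
# prints 18820 (KiB, roughly 18 MiB)
```

Roughly 18 MiB attributed to processes against 11G of swap in use means
nearly all of it belongs to something that is not a process, which in this
case was the tmpfs contents.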

On Wed, Jul 19, 2023 at 4:15 PM Scott Gasch <scott.gasch@gmail.com> wrote:

> Replying to my own post with more info... I tried stopping my wireguard
> jail and unloading the if_wg kmod and it did not affect the swap memory
> usage.  Not sure if that lets wireguard off the hook or not though.
>
> If someone who understands kernel memory could chime in... it looks to me
> like the aggregate swap usage of usermode processes is nowhere near the
> total swap space used so I suspect something in kernel mode.  Does this
> make sense or is there another explanation?
>
> Thx,
> Scott
>
>
> On Wed, Jul 19, 2023 at 7:49 AM Scott Gasch <scott.gasch@gmail.com> wrote:
>
>> I am running a 13.2-RELEASE GENERIC kernel and seeing a pattern where,
>> after about 10 days of uptime, my swap begins to fill up.
>>
>> # swapinfo -h
>> Device              Size     Used    Avail Capacity
>> /dev/ada0p3          48G     3.6G      44G     7%
>> /dev/ada1p3          48G     3.6G      44G     7%
>> /dev/ada2p3          48G     3.6G      44G     7%
>> Total               144G      11G     133G     7%
>>
>> So, 11G of swap is in use in total.  What's using it?
>>
>> # systat -swap
>>                     /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
>>      Load Average   ||||||
>>
>> Device/Path       Size  Used |0%  /10  /20  /30  /40  / 60\  70\  80\  90\ 100|
>> ada0p3             48G 3660M XXX
>> ada1p3             48G 3666M XXX
>> ada2p3             48G 3664M XXX
>> Total             144G   11G XXX
>>
>> Pid    Username   Command     Swap/Total Per-Process    Per-System
>>  14703 scott      python3.8    4M / 154M  2%              0%
>>   2451 scott      rclone       4M / 934M  0%              0%
>>   2452 scott      rclone       3M /   1G  0%              0%
>>  73827 scott      bash         1M /  17M  6%              0%
>>  39416 scott      tmux       968K /  54M  1%              0%
>>  41661 scott      bash       828K /  17M  4%              0%
>>  15727 scott      bash       808K /  17M  4%              0%
>>  39420 scott      bash       804K /  17M  4%              0%
>>   2455 scott      bash       544K /  15M  3%              0%
>>  39367 scott      tmux       512K /  15M  3%              0%
>>   2447 scott      bash       376K /  15M  2%              0%
>>   2450 scott      bash       364K /  15M  2%              0%
>>   2453 scott      bash       324K /  15M  2%              0%
>>   2454 scott      bash       316K /  15M  2%              0%
>>   2445 scott      bash       312K /  15M  2%              0%
>>  44937 scott      bash       304K /  17M  1%              0%
>>   2458 scott      bash        72K /  15M  0%              0%
>>
>> At least they agree about it being 11G.  Is this kernel memory being
>> paged out to swap?  The machine has 128G of physical memory and isn't under
>> very heavy load at the moment.
>>
>> I suspect this is a bug in some kernel module... possibly
>> wireguard because I run wireguard in a vnet jail and didn't observe this
>> problem until setting that up.  But I don't have any hard evidence.
>>
>> I've tried to mitigate this via swapoff -a.  This works once but the next
>> day swap will be back, even fuller.  I've been doing regular reboots to
>> fix this but would like to get to the bottom of it.  If left alone, swap
>> will fill up and the machine will get into a "not quite hung" but unusable
>> and useless state.
>>
>> Am I off-base with my suspicion that this is kernel-mode memory? Can
>> someone teach me how to diagnose the state of the kernel-mode memory heap?
>>
>> Thx,
>> Scott
>>
>>