Re: memory issue - bug?
- In reply to: infoomatic : "memory issue - bug?"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Thu, 26 Oct 2023 18:33:38 UTC
On 10/21/23 01:56, infoomatic wrote: > Hi, > > I have encountered a strange memory issue with my pcengines APU2 > (running 13.2-RELEASE-p4). I use it as a firewall with pf and it acts as > a backup server with an external 4xdisk USB enclosure for a jail running > borg backup. Sometimes I do cache some videos from youtube using yt-dl - > this is where it becomes interesting. The box has 4GB of ram (and in > loader.conf is vfs.zfs.arc_max=268435456 configured), and sometimes > yt-dl needs too much memory, the system runs out of memory. The > following lines pop up in syslog: Your issue sounds like what I had when trying to run without swap (main issue) and had done some other recommended tweaking to have more vnodes for filesystem performance (other crashes/resource exhaustion). Have you tried it without running sysctl customization for zfs or otherwise? any other tweaks being applined in loader.conf or elsewhere? > swap_pager: cannot allocate bio How much swap does the system have? If none, does it still occur with a swap partition (I'd try at least the size of RAM). If you did have swap, any particular strange aspects to it (multiple partitions on different disks or of different size parttitions?) I have my desktop PC with its power button giving an orderly shutdown command to FreeBSD. Pressing that gives a single flicker of hard drive light activity and would 'almost' always shut down the system when in such a state (caused by RAM being used up by poudriere jobs, firefox, or even writing something to allocate too much memory which I used to be able to use to 'flush' zfs cache from it). With 32GB RAM I gave it 32GB swap with better results than none (system being nearly unresponsive once enough RAM was being used up) or too little (4GB swap would normally stay responsive but lead to less sane choices of crashing programs like xorg or nvidia graphics module when other programs were using up too much RAM). > and many lines like > pid 23342 (getty), jid 0, uid 0, was killed: failed to reclaim memory > pid 5935 (tmux), jid 0, uid 1001, was killed: failed to reclaim memory > pid 14258 (top), jid 0, uid 0, was killed: failed to reclaim memory > > When I encounter the out of memory situation, I logged in and inspected > the system. The system was sluggish and it reported that 0-16 MB RAM was > used., I saw that only basic system services are running [1]. > Starting top tells me that "Free" memory is ~ 800MB, and constantly > declining: > > Mem: 8816K Active, 11M Inact, 3344M Wired, 548M Free > > top -o res shows me top the process (18 processes: 1 running, 17 > sleeping) occupying the most memory (~3MB): > > 78154 root 1 20 0 14M 3172K CPU3 3 0:00 0.17% top > > I can just watch the systems wired memory going up until everything is > dying. Disabling and re-enabling swap has no effect. Top is killed after > about 2 minutes, and I am kicked out. I can login again shortly after, > but the same situation happens again: Free memory going down and Wired > going up. PF continues to work but I cannot start a backup or any file > operation because they just die after a short while. > > Has someone an idea to debug this ... or a workaround? > > Thanks for sharing your ideas! > > Robert > > > > [1]ps auxwww: > USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND > root 11 198.5 0.0 0 64 - RNL Sun18 19955:10.62 [idle] > root 0 101.2 0.1 0 2832 - DLs Sun18 7354:27.71 [kernel] > root 5 100.0 0.1 0 2080 - RL Sun18 1307:50.45 [zfskern] > root 44547 2.1 0.1 14908 2800 0 S 10:41 0:00.11 bash > root 7 1.0 0.0 0 48 - DL Sun18 3:47.38 > [pagedaemon] > robert 42128 0.9 0.1 21508 2112 - S 10:41 0:00.06 sshd: > robert@pts/0 (sshd) > root 41079 0.7 0.0 21508 1584 - Ss 10:41 0:00.26 sshd: > robert [priv] (sshd) > root 1 0.0 0.0 11768 36 - SLs Sun18 0:02.07 /sbin/init > root 2 0.0 0.0 0 64 - DL Sun18 0:00.00 [KTLS] > root 3 0.0 0.0 0 80 - DL Sun18 0:00.00 [crypto] > root 4 0.0 0.0 0 32 - DL Sun18 86:34.52 [cam] > root 6 0.0 0.0 0 16 - DL Sun18 1:28.84 > [rand_harvestq] > root 8 0.0 0.0 0 16 - DL Sun18 1:47.46 [vmdaemon] > root 9 0.0 0.0 0 48 - DL Sun18 0:19.60 [bufdaemon] > root 10 0.0 0.0 0 16 - DL Sun18 0:00.00 [audit] > root 12 0.0 0.0 0 240 - WL Sun18 98:16.01 [intr] > root 13 0.0 0.0 0 48 - DL Sun18 0:00.92 [geom] > root 14 0.0 0.0 0 16 - DL Sun18 0:00.00 > [sequencer 00] > root 15 0.0 0.0 0 160 - DL Sun18 126:23.98 [usb] > root 16 0.0 0.0 0 16 - DL Sun18 0:12.04 > [acpi_thermal] > root 17 0.0 0.0 0 16 - DL Sun18 0:07.64 [vnlru] > root 18 0.0 0.0 0 16 - DL Sun18 0:21.84 [syncer] > root 19 0.0 0.0 0 16 - DL Sun18 7:37.35 [racctd] > root 1265 0.0 0.0 0 16 - DL Sun18 156:54.73 > [g_eli[0] gpt/hd001] > root 1537 0.0 0.0 0 16 - DL Sun18 158:36.95 > [g_eli[1] gpt/hd001] > root 1934 0.0 0.0 0 16 - DL Sun18 161:37.98 > [g_eli[2] gpt/hd001] > root 2196 0.0 0.0 0 16 - DL Sun18 159:34.59 > [g_eli[3] gpt/hd001] > root 3057 0.0 0.0 0 16 - DL Sun18 157:32.36 > [g_eli[0] gpt/hd002] > root 3334 0.0 0.0 0 16 - DL Sun18 159:15.82 > [g_eli[1] gpt/hd002] > root 3574 0.0 0.0 0 16 - DL Sun18 0:00.25 [aiod1] > root 3751 0.0 0.0 0 16 - DL Sun18 161:45.95 > [g_eli[2] gpt/hd002] > root 3889 0.0 0.0 0 16 - DL Sun18 0:00.24 [aiod2] > root 4138 0.0 0.0 0 16 - DL Sun18 0:00.25 [aiod3] > root 4252 0.0 0.0 0 16 - DL Sun18 160:35.16 > [g_eli[3] gpt/hd002] > root 4397 0.0 0.0 0 16 - DL Sun18 0:00.24 [aiod4] > root 5008 0.0 0.0 0 16 - DL Sun18 156:54.76 > [g_eli[0] gpt/hd011] > root 5009 0.0 0.0 0 16 - DL Sun18 158:16.99 > [g_eli[1] gpt/hd011] > root 5457 0.0 0.0 0 16 - DL Sun18 161:03.34 > [g_eli[2] gpt/hd011] > root 5929 0.0 0.0 0 16 - DL Sun18 159:37.17 > [g_eli[3] gpt/hd011] > root 9001 0.0 0.0 21068 328 - Ss Sun18 0:00.25 sshd: > /usr/sbin/sshd [listener] 0 of 10-100 startups (sshd) > root 10008 0.0 0.0 12908 4 - IWs - 0:00.00 > /usr/sbin/cron -s > root 43429 0.0 0.0 0 16 - DL Sun18 11:45.45 [pf purge] > root 78988 0.0 0.0 12868 300 - Ss Sun18 0:17.57 > /usr/sbin/syslogd -ss > root 84156 0.0 0.0 0 16 - DL Sun18 35:19.32 > [g_eli[0] gpt/hd037] > root 84470 0.0 0.0 0 16 - DL Sun18 35:53.31 > [g_eli[1] gpt/hd037] > root 85309 0.0 0.0 0 16 - DL Sun18 38:08.91 > [g_eli[2] gpt/hd037] > root 86045 0.0 0.0 0 16 - DL Sun18 37:08.59 > [g_eli[3] gpt/hd037] > root 86051 0.0 0.0 0 16 - DL Sun18 35:51.55 > [g_eli[0] gpt/hd038] > root 86644 0.0 0.0 0 16 - DL Sun18 36:03.03 > [g_eli[1] gpt/hd038] > root 87199 0.0 0.0 0 16 - DL Sun18 37:56.77 > [g_eli[2] gpt/hd038] > root 87235 0.0 0.0 0 16 - DL Sun18 37:21.79 > [g_eli[3] gpt/hd038] > root 87490 0.0 0.0 0 16 - DL Sun18 35:39.81 > [g_eli[0] gpt/hd039] > root 87546 0.0 0.0 0 16 - DL Sun18 35:43.55 > [g_eli[1] gpt/hd039] > root 87988 0.0 0.0 0 16 - DL Sun18 38:05.54 > [g_eli[2] gpt/hd039] > root 88507 0.0 0.0 0 16 - DL Sun18 37:06.09 > [g_eli[3] gpt/hd039] > root 88536 0.0 0.0 0 16 - DL Sun18 35:28.86 > [g_eli[0] gpt/hd040] > root 89143 0.0 0.0 0 16 - DL Sun18 37:09.63 > [g_eli[1] gpt/hd040] > root 89723 0.0 0.0 0 16 - DL Sun18 37:56.60 > [g_eli[2] gpt/hd040] > root 90441 0.0 0.0 0 16 - DL Sun18 37:08.54 > [g_eli[3] gpt/hd040] > root 97972 0.0 0.0 12836 36 u0 Is+ 10:22 0:00.01 > /usr/libexec/getty 3wire ttyu0 > root 34488 0.0 0.0 12836 36 v0 Is+ 10:39 0:00.01 > /usr/libexec/getty Pc ttyv0 > root 97710 0.0 0.0 12836 36 v1 Is+ 10:22 0:00.01 > /usr/libexec/getty Pc ttyv1 > root 33736 0.0 0.0 12836 36 v2 Is+ 10:38 0:00.02 > /usr/libexec/getty Pc ttyv2 > root 22 0.0 0.0 12836 36 v3 Is+ 10:25 0:00.02 > /usr/libexec/getty Pc ttyv3 > root 34562 0.0 0.0 12836 36 v4 Is+ 10:39 0:00.01 > /usr/libexec/getty Pc ttyv4 > root 33901 0.0 0.0 12836 36 v5 Is+ 10:39 0:00.01 > /usr/libexec/getty Pc ttyv5 > root 20 0.0 0.0 12836 36 v6 Is+ 06:06 0:00.01 > /usr/libexec/getty Pc ttyv6 > root 21 0.0 0.0 12836 36 v7 Is+ 07:12 0:00.01 > /usr/libexec/getty Pc ttyv7 > robert 42985 0.0 0.0 13580 4 0 SWs - 0:00.00 -sh (sh) > robert 43832 0.0 0.0 14908 4 0 SW - 0:00.00 > /usr/local/bin/bash > root 45136 0.0 0.1 13444 2160 0 R+ 10:41 0:00.02 ps auxwww >