Re: ZFS + FreeBSD XEN dom0 panic
- In reply to: Roger Pau Monné : "Re: ZFS + FreeBSD XEN dom0 panic"
Date: Fri, 15 Apr 2022 09:06:12 UTC
On 2022.04.14. 10:39, Roger Pau Monné wrote:
> ..
> Thanks. I will groom those patches in order to prepare them for
> commit. Regardless of whether there are other issues still lurking I
> think those changes are worth committing now.

Hi,

The tests are still running with the 3 patches, and things look a lot better than without them. I think I will stop the tests soon, since they have proved their point.

uptime:
11:22AM  up 1 day, 19:18, 8 users, load averages: 2.17, 2.04, 2.12

There has been one problem though: at this stage xl list shows a (null) VM.

xl list
Name                  ID   Mem VCPUs      State   Time(s)
Domain-0               0  1023     4     r-----  211752.0
(null)               346     0     1     --ps-d      61.7
xen-vm2nonic-zvol-5  557  1024     1     r-----      47.2
xen-vm1nonic-zvol    558  1024     1     -b----      42.0

I have no idea why it went into that state.

I have been collecting vmstat -m output since the start, and after filtering it I think I have some possibly useful hints. Pictures are at: https://file.fm/u/k67uhj436#/

In general we can see that xbbd with the 3 patches does not seem to leak memory; I did not see any component whose memory usage was continuously growing. The images show the InUse values from vmstat -m (y axis is bytes, x axis is the unix timestamp), though I do not know whether I should have looked at MemUse instead.

We can see that solaris takes quite a chunk, which I guess is expected for ZFS. The interesting thing is a spike in newblk: looking deeper, it happens at the same time as a spike in jsegdep, and there is a smaller spike in jseg at the same moment. Also, at the time of the newblk spike there is a slight dip in solaris. If there are specific types I should pay more attention to, I can look at them.

I would like to speculate that the "..pmap_growkernel.." panic happens at those moments when a spike is high enough, and that in the current case it was just a lucky coincidence that the system had more memory and did not panic. Or maybe this is also where the (null) VM appeared. Unfortunately I did not log the output of xl list with timestamps.
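Since I did not log xl list with timestamps, for the next run something like the following loop could correlate the vmstat -m spikes with domain state changes. This is only a sketch; the log directory, file names, and 10-second interval are arbitrary choices:

```shell
#!/bin/sh
# Sketch: periodically capture vmstat -m and xl list, prefixing every
# output line with a unix timestamp so later spikes can be correlated.
# LOG_DIR and the 10-second interval are arbitrary assumptions.
LOG_DIR=/var/log/xen-debug
mkdir -p "$LOG_DIR"
while :; do
    ts=$(date +%s)
    vmstat -m | awk -v t="$ts" '{print t, $0}' >> "$LOG_DIR/vmstat-m.log"
    xl list   | awk -v t="$ts" '{print t, $0}' >> "$LOG_DIR/xl-list.log"
    sleep 10
done
```

Grepping a single malloc type (e.g. newblk) out of vmstat-m.log then gives a ready-to-plot timestamp/value series.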
Currently xenstore-ls -fp does not contain any row with 346, so I suppose the disks have been freed. I do not see any suspicious sysctl variables either, so I do not know what state this (null) VM is in or why it is not cleaning up. Is there some useful command in this case to collect more info about the (null) VM? Thanks.
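For the record, these are the things I know of that might still yield information about a half-destroyed domain. This is a sketch only; whether they return anything useful for a domain in the --ps-d state is untested:

```shell
#!/bin/sh
# Sketch: places to look for state of the stuck (null) domain.
# Domain ID 346 is taken from the xl list output above.
DOMID=346

# libxl's full view of the domain (may fail for a partially
# destroyed domain).
xl list -l "$DOMID"

# Hypervisor-side view: debug key 'q' dumps domain/vcpu info to the
# Xen console ring, which xl dmesg then prints.
xl debug-keys q
xl dmesg | tail -n 100

# Any xenstore entries still mentioning the domain ID.
xenstore-ls -fp | grep "$DOMID"

# Last resort: ask libxl to tear it down again.
# xl destroy "$DOMID"
```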