Re: Recently moved to XS 8.2 on new hardware - seen a couple of FreeBSD DomU lock ups?
- Reply: Roger Pau Monné : "Re: Recently moved to XS 8.2 on new hardware - seen a couple of FreeBSD DomU lock ups?"
- Reply: Karl Pielorz : "Re: Recently moved to XS 8.2 on new hardware - seen a couple of FreeBSD DomU lock ups?"
- In reply to: Karl Pielorz : "Recently moved to XS 8.2 on new hardware - seen a couple of FreeBSD DomU lock ups?"
Date: Fri, 14 Oct 2022 14:40:17 UTC
On Thu, Oct 13, 2022 at 11:00:17AM +0100, Karl Pielorz wrote:
>
> Hi all,
>
> We've been running FreeBSD as a DomU under Xen Server for years now, and
> only really had a few now 'known' issues (e.g. with networking).
>
> We've recently set up XS 8.2 on some new Dell servers, and whilst everything
> appears fine - twice now in a couple of weeks we've had FreeBSD DomUs just
> "lock up".
>
> There are no errors logged, no kernel panic, nothing - they just "stop". Xen
> reckons the CPU is pegged at 100% (with brief periods of zero) - the console
> is still 'available' (but locked up) - and the kernel is dead (i.e. you
> cannot ping it).
>
> Aside from the "Has anyone else seen similar" - with nothing in the logs, no
> panic, nothing - I'm kind of at a loss as to how best to troubleshoot this
> further?
>
> The VMs are lightly loaded - haven't run out of resources (RAM/CPU etc.) -
> and it's only happened a couple of times now (but in a couple of weeks) -
> whereas our setup before never experienced this in its lifetime.
>
> One was a legacy 11.4 system (amd64) - the other was a 12.3 system (amd64).
>
> A forced reboot brings them back (with some file system damage - as you'd
> expect from a crash).
>
> Just at a loss as to where to look - given the lack of any panic/errors etc.
> Any suggestions?

Hello,

Sorry, been very busy this week and forgot to reply earlier.

Could you try to set up a watchdog in FreeBSD and see if that triggers, so
that we can get an idea of where the guest locks up?

Also, is the 100% load on all CPUs, or just one?

If the watchdog doesn't work we can try other methods.

Thanks, Roger.
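
For reference, a minimal sketch of what enabling the watchdog might look like
in the guest's /etc/rc.conf. This is not taken from the thread. It assumes the
guest kernel provides a watchdog for watchdogd(8) to arm, for example a custom
kernel built with "options SW_WATCHDOG" (plus KDB/DDB if a debugger prompt is
wanted instead of a panic). The 30-second timeout is an arbitrary illustrative
value.

```shell
# /etc/rc.conf fragment (sketch, values are assumptions, not from the thread)

# Start watchdogd(8) at boot so it periodically pats the kernel watchdog.
watchdogd_enable="YES"

# -t sets the watchdog timeout in seconds; 30 is only an example value.
# If the guest stops scheduling userland for longer than this, the
# watchdog should fire and leave a panic or debugger trace to inspect.
watchdogd_flags="-t 30"
```

After editing rc.conf, "service watchdogd start" should bring the daemon up
without a reboot. Whether a timeout actually produces a usable trace depends on
how the kernel was built, so it is worth checking on a test guest first.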