debugging OS hangs
- Reply: Edward Sanford Sutton, III: "Re: debugging OS hangs"
Date: Sat, 02 Nov 2024 18:02:58 UTC
I have a FreeBSD host running 14.1 which has an uptime of maybe a day or two between hangs. When it hangs there is no console output indicating a problem and it no longer responds to ping. Keyboard input is ignored. I have included the output of dmesg below.

I ran a memory test; the RAM chips seem to be fine. I’ve replaced the root disk too, and put in a new network card in the hope that the problem was a NIC driver. The hangs still happen.

Are there any switches I can set to get some clues as to what is happening?

---<<BOOT>>---
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-RELEASE-p5 GENERIC amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
VT(efifb): resolution 1024x768
CPU: AMD Ryzen 9 3900X 12-Core Processor (3793.05-MHz K8-class CPU)
  Origin="AuthenticAMD" Id=0x870f10 Family=0x17 Model=0x71 Stepping=0
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x7ef8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x75c237ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX,ADMSKX>
  Structured Extended Features=0x219c91a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,PQE,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA>
  Structured Extended Features2=0x400004<UMIP,RDPID>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  AMD Extended Feature Extensions ID EBX=0x108b657<CLZERO,IRPerf,XSaveErPtr,RDPRU,WBNOINVD,IBPB,STIBP,SSBD>
  SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
real memory = 68717379584 (65534 MB)
avail memory = 66761863168 (63669 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <ALASKA A M I >
FreeBSD/SMP: Multiprocessor System Detected: 24 CPUs
FreeBSD/SMP: 1 package(s) x 4 cache groups x 3 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0 <Version 2.1> irqs 0-23
ioapic1 <Version 2.1> irqs 24-55
Launching APs: 16 17 15 13 8 6 1 7 10 2 18 5 11 20 22 4 3 21 23 19 14 9 12
random: entropy device external interface
kbd1 at kbdmux0
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
smbios0: <System Management BIOS> at iomem 0xbd9f1000-0xbd9f101e
smbios0: Version: 3.3, BCD Revision: 3.3
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256>
acpi0: <ALASKA A M I >
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 350
Event timer "HPET2" frequency 14318180 Hz quality 350
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <base peripheral, IOMMU> at device 0.2 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> at device 1.1 on pci0
pci1: <ACPI PCI bus> on pcib1
nvme0: <Generic NVMe Device> mem 0xfcf00000-0xfcf03fff irq 24 at device 0.0 on pci1
pcib2: <ACPI PCI-PCI bridge> at device 1.2 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 28 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 3.0 on pci3
pci4: <ACPI PCI bus> on pcib4
pci4: <network> at device 0.0 (no driver attached)
pcib5: <ACPI PCI-PCI bridge> at device 4.0 on pci3
pci5: <ACPI PCI bus> on pcib5
igb0: <Intel(R) I211 (Copper)> port 0xe000-0xe01f mem 0xfc900000-0xfc91ffff,0xfc920000-0xfc923fff irq 28 at device 0.0 on pci5
igb0: NVM V0.6 imgtype1
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 2 RX queues 2 TX queues
igb0: Using MSI-X interrupts with 3 vectors
igb0: Ethernet address: 18:c0:4d:89:1c:0f
igb0: netmap queues/slots: TX 2/1024, RX 2/1024
pcib6: <ACPI PCI-PCI bridge> at device 5.0 on pci3
pci6: <ACPI PCI bus> on pcib6
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xd000-0xd0ff mem 0xfc804000-0xfc804fff,0xfc800000-0xfc803fff irq 29 at device 0.0 on pci6
re0: Using 1 MSI-X message
re0: Chip rev. 0x54000000
re0: MAC rev. 0x00100000
miibus0: <MII bus> on re0
rgephy0: <RTL8251/8153 1000BASE-T media interface> PHY 1 on miibus0
rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re0: Using defaults for TSO: 65518/35/2048
re0: Ethernet address: 00:e0:4c:69:2f:72
re0: netmap queues/slots: TX 1/256, RX 1/256
pcib7: <ACPI PCI-PCI bridge> at device 6.0 on pci3
pci7: <ACPI PCI bus> on pcib7
xhci0: <XHCI (generic) USB 3.0 controller> mem 0xfc700000-0xfc700fff irq 30 at device 0.0 on pci7
xhci0: 32 bytes context size, 64-bit DMA
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
pcib8: <ACPI PCI-PCI bridge> irq 28 at device 8.0 on pci3
pci8: <ACPI PCI bus> on pcib8
xhci1: <AMD Matisse USB 3.0 controller> mem 0xfc400000-0xfc4fffff irq 28 at device 0.1 on pci8
xhci1: 64 bytes context size, 64-bit DMA
usbus1 on xhci1
usbus1: 5.0Gbps Super Speed USB v3.0
xhci2: <AMD Matisse USB 3.0 controller> mem 0xfc300000-0xfc3fffff irq 30 at device 0.3 on pci8
xhci2: 64 bytes context size, 64-bit DMA
usbus2 on xhci2
usbus2: 5.0Gbps Super Speed USB v3.0
pcib9: <PCI-PCI bridge> irq 29 at device 9.0 on pci3
pci9: <PCI bus> on pcib9
ahci0: <AMD KERNCZ AHCI SATA controller> mem 0xfc600000-0xfc6007ff irq 29 at device 0.0 on pci9
ahci0: AHCI v1.31 with 2 6Gbps ports, Port Multiplier supported with FBS
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
pcib10: <PCI-PCI bridge> irq 30 at device 10.0 on pci3
pci10: <PCI bus> on pcib10
ahci1: <AMD KERNCZ AHCI SATA controller> mem 0xfc500000-0xfc5007ff irq 30 at device 0.0 on pci10
ahci1: AHCI v1.31 with 4 6Gbps ports, Port Multiplier supported with FBS
ahcich4: <AHCI channel> at channel 0 on ahci1
ahcich5: <AHCI channel> at channel 1 on ahci1
ahcich8: <AHCI channel> at channel 4 on ahci1
ahcich9: <AHCI channel> at channel 5 on ahci1
pcib11: <ACPI PCI-PCI bridge> at device 3.1 on pci0
pci11: <ACPI PCI bus> on pcib11
vgapci0: <VGA-compatible display> port 0xf000-0xf0ff mem 0xd0000000-0xdfffffff,0xe0000000-0xe01fffff,0xfce00000-0xfce3ffff irq 54 at device 0.0 on pci11
vgapci0: Boot video device
hdac0: <ATI (0xaaf0) HDA Controller> mem 0xfce60000-0xfce63fff irq 55 at device 0.1 on pci11
pcib12: <ACPI PCI-PCI bridge> at device 7.1 on pci0
pci12: <ACPI PCI bus> on pcib12
pcib13: <ACPI PCI-PCI bridge> at device 8.1 on pci0
pci13: <ACPI PCI bus> on pcib13
pci13: <encrypt/decrypt> at device 0.1 (no driver attached)
xhci3: <AMD Matisse USB 3.0 controller> mem 0xfcb00000-0xfcbfffff irq 39 at device 0.3 on pci13
xhci3: 64 bytes context size, 64-bit DMA
usbus3 on xhci3
usbus3: 5.0Gbps Super Speed USB v3.0
hdac1: <AMD X570 HDA Controller> mem 0xfcd00000-0xfcd07fff irq 36 at device 0.4 on pci13
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
acpi_button0: <Power Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
acpi_tz1: <Thermal Zone> on acpi0
acpi_tz2: <Thermal Zone> on acpi0
orm0: <ISA Option ROM> at iomem 0xc0000-0xce7ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbdc0: non-PNP ISA device will be removed from GENERIC in FreeBSD 15.
hwpstate0: <Cool`n'Quiet 2.0> on cpu0
Timecounter "TSC-low" frequency 1896437301 Hz quality 1000
Timecounters tick every 1.000 msec
hdacc0: <ATI R6xx HDA CODEC> at cad 0 on hdac0
hdaa0: <ATI R6xx Audio Function Group> at nid 1 on hdacc0
pcm0: <ATI R6xx (HDMI)> at nid 3 on hdaa0
pcm1: <ATI R6xx (HDMI)> at nid 5 on hdaa0
pcm2: <ATI R6xx (HDMI)> at nid 7 on hdaa0
pcm3: <ATI R6xx (HDMI)> at nid 9 on hdaa0
pcm4: <ATI R6xx (HDMI)> at nid 11 on hdaa0
pcm5: <ATI R6xx (HDMI)> at nid 13 on hdaa0
hdacc1: <Realtek ALCS1200A HDA CODEC> at cad 0 on hdac1
hdaa1: <Realtek ALCS1200A Audio Function Group> at nid 1 on hdacc1
pcm6: <Realtek ALCS1200A (Rear Analog 5.1/2.0)> at nid 20,22,21 and 24,26 on hdaa1
pcm7: <Realtek ALCS1200A (Front Analog)> at nid 27 and 25 on hdaa1
pcm8: <Realtek ALCS1200A (Rear Digital)> at nid 30 on hdaa1
ugen3.1: <AMD XHCI root HUB> at usbus3
Trying to mount root from ufs:/dev/nda0p2 [rw]...
ugen1.1: <AMD XHCI root HUB> at usbus1
uhub0 on usbus1
uhub0: <AMD XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus1
ugen2.1: <AMD XHCI root HUB> at usbus2
uhub1 on usbus3
uhub1: <AMD XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus3
uhub2 on usbus2
uhub2: <AMD XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus2
ugen0.1: <(0x1106) XHCI root HUB> at usbus0
uhub3 on usbus0
uhub3: <(0x1106) XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
uhub3: 5 ports with 4 removable, self powered
nda0 at nvme0 bus 0 scbus6 target 0 lun 1
nda0: <WD_BLACK SN850X 1000GB 620331WD 233204800587>
nda0: Serial Number 233204800587
nda0: nvme version 1.4
nda0: 953869MB (1953525168 512 byte sectors)
ada0 at ahcich4 bus 0 scbus2 target 0 lun 0
ada0: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
ada0: Serial Number WD-WX12D50LH5YY
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 5723166MB (11721045168 512 byte sectors)
ada0: quirks=0x1<4K>
ada1 at ahcich5 bus 0 scbus3 target 0 lun 0
ada1: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
ada1: Serial Number WD-WX32D50E53LJ
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 5723166MB (11721045168 512 byte sectors)
ada1: quirks=0x1<4K>
ada2 at ahcich8 bus 0 scbus4 target 0 lun 0
ada2: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
ada2: Serial Number WD-WX72D507J792
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 5723166MB (11721045168 512 byte sectors)
ada2: quirks=0x1<4K>
ada3 at ahcich9 bus 0 scbus5 target 0 lun 0
ada3: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
ada3: Serial Number WD-WX72D507J8YK
ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 5723166MB (11721045168 512 byte sectors)
ada3: quirks=0x1<4K>
uhub1: 8 ports with 8 removable, self powered
uhub0: 10 ports with 10 removable, self powered
uhub2: 10 ports with 10 removable, self powered
ugen0.2: <vendor 0x2109 USB2.0 Hub> at usbus0
uhub4 on uhub3
uhub4: <vendor 0x2109 USB2.0 Hub, class 9/0, rev 2.10/4.20, addr 1> on usbus0
ugen1.2: <vendor 0x05e3 USB2.0 Hub> at usbus1
uhub5 on uhub0
uhub5: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/85.36, addr 1> on usbus1
ugen2.2: <vendor 0x8087 product 0x0025> at usbus2
ugen2.3: <vendor 0x05e3 USB2.0 Hub> at usbus2
uhub6 on uhub2
uhub6: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/85.36, addr 2> on usbus2
uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1 usbus2
uhub5: 4 ports with 4 removable, self powered
uhub6: 4 ports with 4 removable, self powered
ugen1.3: <ITE Tech. Inc. ITE Device(8595)> at usbus1
ukbd0 on uhub0
ukbd0: <ITE Tech. Inc. ITE Device(8595), class 0/0, rev 2.00/0.03, addr 2> on usbus1
kbd2 at ukbd0
ugen2.4: <Cooler Master Technology Inc. AMD SR4 lamplight Control> at usbus2
ukbd1 on uhub6
ukbd1: <Cooler Master Technology Inc. AMD SR4 lamplight Control, class 0/0, rev 2.00/11.01, addr 3> on usbus2
kbd3 at ukbd1
WARNING: / was not properly dismounted
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Intel(R) Wireless WiFi based driver for FreeBSD
intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0
smbus0: <System Management Bus> on intsmb0
iwm0: <Intel(R) Dual Band Wireless AC 9260> mem 0xfca00000-0xfca03fff irq 31 at device 0.0 on pci4
iwm0: hw rev 0x320, fw ver 34.3125811985.0, address bc:17:b8:b7:8b:df
acpi_wmi0: <ACPI-WMI mapping> on acpi0
acpi_wmi0: cannot find EC device
acpi_wmi0: Embedded MOF found
ACPI: \134GSA1.WQCC: 1 arguments were passed to a non-method ACPI object (Buffer) (20221020/nsarguments-361)
acpi_wmi1: <ACPI-WMI mapping> on acpi0
acpi_wmi1: cannot find EC device
acpi_wmi1: Embedded MOF found
ACPI: \134AOD.WQBA: 1 arguments were passed to a non-method ACPI object (Buffer) (20221020/nsarguments-361)
driver bug: Unable to set devclass (class: ppc devname: (unknown))
re0: link state changed to UP
lo0: link state changed to UP
re0: link state changed to DOWN
pflog0: promiscuous mode enabled
uhid0 on uhub6
uhid0: <Cooler Master Technology Inc. AMD SR4 lamplight Control, class 0/0, rev 2.00/11.01, addr 3> on usbus2
uhid1 on uhub6
uhid1: <Cooler Master Technology Inc. AMD SR4 lamplight Control, class 0/0, rev 2.00/11.01, addr 3> on usbus2
re0: link state changed to UP