Re: After 13.1 install, "panic: AP #1 (PHY #1) failed!" with SuperMicro X10SRL-F motherboard
Date: Tue, 13 Dec 2022 20:35:24 UTC
(Please email me too when you reply.)

On Fri, Dec 9, 2022 at 9:35 AM Anubhav/FreeBSD wrote:
> The computer server with ...
>
> SuperMicro X10SRL-F motherboard (LGA 2011-V3, C612 chipset),
> Intel Xeon E5-1620 V3 CPU
>
> ... was working just fine with FreeBSD 12.x & 13.0. 13.0 was
> installed from scratch with ZFS on root.
>
> Two days ago I updated the OS to 13.1-p5 in a new boot environment
> ("freebsd-update -r 13.1-RELEASE upgrade"; "freebsd-update install";
> reboot; "freebsd-update install"). I did so over ssh.
>
> After a day, I could not connect to the computer via ssh. When I checked,
> lots of error messages from sshd were *flying* on the console (failed to
> take a photo). I could not do anything on the console. (The computer is
> connected to video & keyboard via software KVM; there is no physical serial
> connection.)
>
> After reboot of 13.1-p5, a "panic" happens all the 3-4 times I tried ...
>
> (transcribed from the photo of the screen after booting in verbose mode)
> SMP: Added CPU 1 (AP)
> MADT: Found CPU APIC ID 3 ACPI ID 3: enabled
> SMP: Added CPU 3 (AP)
> MADT: Found CPU APIC ID 5 ACPI ID 5: enabled
> SMP: Added CPU 5 (AP)
> MADT: Found CPU APIC ID 7 ACPI ID 7: enabled
> SMP: Added CPU 7 (AP)
> Event timer "LAPIC" quality 600
> LAPIC: ipi_wait() us multiplier 64 (r 5400080 tsc 3500095930)
> ACPI APIC Table: <SUPERM SMCI--MB>
> Package ID shift: 4
> L3 cache shift: 4
> L2 cache shift: 1
> L1 cache shift: 1
> Core ID shift: 1
> AP boot address: 0x98000
> panic: AP #1 (PHY #1) failed!
> cpuid = 0
> time = 1
> KDB: stack backtrace
> #0 0xffffffff80c694a5 at kdb_backtrace+0x65
> #1 0xffffffff80c1bb5f at vpanic+0x17f
> #2 0xffffffff80c1b983 at panic+0x43
> #3 0xffffffff81093633 at native_start_all_aps+0x633
> #4 0xffffffff81092ce1 at cpu_mp_start+0x1a1
> #5 0xffffffff80c7c32a at mp_start+0x9a
> #6 0xffffffff80ba970f at mi_startup+0xdf
> #7 0xffffffff80385022 at btext+0x22
> Uptime: 1s
>
> ...
> What is going on here, or what had happened with 13.1 install
> that the machine panics?
>
> Booting with any of 13.0-p1[13] boot environments makes
> no difference.
>
> ...

After removing the machine from the rack (which included disconnecting
the RaidMachine 24-bay disk enclosure from the LSI HBA card installed
in the machine), it booted right up (with the already installed FreeBSD
13.1-p5 on the internal disk) as if nothing had happened! There was no
panic or any "AP #1 (PHY #1) failed!"-like message.

How? Why?

If the machine had still panicked (after removal from the rack), then I
could have tried ...

- updating the BIOS;
- booting a 13.[01] image from a USB flash stick;
- installing 13.[01] from scratch.

Now, I do not know how much I can trust the machine not to fail (panic
again on a reboot).

- Anubhav