From nobody Tue Dec 13 20:35:24 2022
List-Id: User questions
List-Archive: https://lists.freebsd.org/archives/freebsd-questions
MIME-Version: 1.0
From: "Anubhav (Re: FreeBSD)"
Date: Tue, 13 Dec 2022 10:35:24 -1000
Subject: Re: After 13.1 install, "panic: AP #1 (PHY #1) failed!" with SuperMicro X10SRL-F motherboard
To: freebsd-questions@freebsd.org
Content-Type: multipart/alternative; boundary="000000000000efc1c705efbb98f1"

--000000000000efc1c705efbb98f1
Content-Type: text/plain; charset="UTF-8"

(Please email me too when you reply.)

On Fri, Dec 9, 2022 at 9:35 AM Anubhav/FreeBSD wrote:

> The computer server with ...
>
> SuperMicro X10SRL-F motherboard (LGA 2011-V3, C612 chipset),
> Intel Xeon E5-1620 V3 CPU
>
> ... was working just fine with FreeBSD 12.x & 13.0. 13.0 was
> installed from scratch with ZFS on root.
>
> Two days ago I updated the OS to 13.1-p5 in a new boot environment
> ("freebsd-update -r 13.1-RELEASE upgrade"; "freebsd-update install";
> reboot; "freebsd-update install"). I did so over ssh.
>
> After a day, I could not connect to the computer via ssh. When I checked,
> lots of error messages from sshd were *flying* on the console (failed to
> take a photo). I could not do anything on the console. (The computer is
> connected to video & keyboard via software KVM; there is no physical serial
> connection.)
>
> After reboot of 13.1-p5, a "panic" happens all the 3-4 times I tried ...
>
> (transcribed from the photo of the screen after booting in verbose mode)
> SMP: Added CPU 1 (AP)
> MADT: Found CPU APIC ID 3 ACPI ID 3: enabled
> SMP: Added CPU 3 (AP)
> MADT: Found CPU APIC ID 5 ACPI ID 5: enabled
> SMP: Added CPU 5 (AP)
> MADT: Found CPU APIC ID 7 ACPI ID 7: enabled
> SMP: Added CPU 7 (AP)
> Event timer "LAPIC" quality 600
> LAPIC: ipi_wait() us multiplier 64 (r 5400080 tsc 3500095930)
> ACPI APIC Table: <SUPERM SMCI--MB>
> Package ID shift: 4
> L3 cache shift: 4
> L2 cache shift: 1
> L1 cache shift: 1
> Core ID shift: 1
> AP boot address: 0x98000
> panic: AP #1 (PHY #1) failed!
> cpuid = 0
> time = 1
> KDB: stack backtrace
> #0 0xffffffff80c694a5 at kdb_backtrace+0x65
> #1 0xffffffff80c1bb5f at vpanic+0x17f
> #2 0xffffffff80c1b983 at panic+0x43
> #3 0xffffffff81093633 at native_start_all_aps+0x633
> #4 0xffffffff81092ce1 at cpu_mp_start+0x1a1
> #5 0xffffffff80c7c32a at mp_start+0x9a
> #6 0xffffffff80ba970f at mi_startup+0xdf
> #7 0xffffffff80385022 at btext+0x22
> Uptime: 1s
>
>
> ... What is going on here, or what had happened with 13.1 install
> that the machine panics?
>
> Booting with any of 13.0-p1[13] boot environments makes
> no difference.
>

...

After removing the machine from the rack (included disconnection
of RaidMachine 24-bay disk enclosure from the LSI HBA card
installed in the machine), it booted right up (with already
installed FreeBSD 13.1-p5 on the internal disk) as if nothing
had happened! There was no panic or any "AP #1 (PHY #1)
failed!"-like messages.

How? Why?

If the machine still had panicked (after removal from the rack),
then I could have tried ...
- updating the BIOS;
- booting from 13.[01] image from a USB flash stick;
- installing 13.[01] from scratch.

Now, I do not know how much I can trust the machine to
not fail (panic again on a reboot).

- Anubhav

--000000000000efc1c705efbb98f1
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--000000000000efc1c705efbb98f1--