Re: FreeBSD-15 kernel panic when the amdtemp device is in the kernel

From: Gary Jennejohn <garyj_at_gmx.de>
Date: Fri, 01 Sep 2023 16:21:34 UTC

On Fri, 01 Sep 2023 17:14:02 +0200
"Herbert J. Skuhra" <herbert@gojira.at> wrote:

> On Fri, 01 Sep 2023 16:04:41 +0200, Gary Jennejohn wrote:
> >
> > On Fri, 01 Sep 2023 14:15:20 +0200
> > "Herbert J. Skuhra" <herbert@gojira.at> wrote:
> >
> > > On Fri, 01 Sep 2023 13:03:14 +0200, Gary Jennejohn wrote:
> > > >
> > > > I have a laptop with an AMD Ryzen 5 and a tower with an AMD Ryzen 7 3700X.
> > > >
> > > > These are respectively Zen 1 and Zen 2 CPUs.
> > > >
> > > > I built a kernel on both computers using the FreeBSD-15 source tree.
> > > >
> > > > If I include the amdtemp device in my kernel file BOTH computers end up
> > > > with a kernel panic while trying to attach the amdtemp device.
> > > >
> > > > If I remove amdtemp both computers boot without any issues.
> > > >
> > > > I suspect that this commit is the cause:
> > > >
> > > > commit 323a94afb6236bcec3a07721566aec6f2ea2b209
> > > > Author: Akio Morita <akio.morita@kek.jp>
> > > > Date:   Tue Aug 1 22:32:12 2023 +0200
> > > >
> > > >     amdsmn(4), amdtemp(4): add support for Zen 4
> > > >
> > > >     Zen 4 support, tested on Ryzen 9 7900
> > > >
> > > >     Reviewed by:    imp (previous version), mhorne
> > > >     Approved by:    mhorne
> > > >     Obtained from:  http://jyurai.ddo.jp/~amorita/diary/?date=20221102#p01
> > > >     Differential Revision:  https://reviews.freebsd.org/D41049
> > >
> > > Thanks for sharing your findings.
> > >
> > > Now I probably know why my old kernel from stable/13 no longer booted
> > > after updating to stable/14. I've created a new kernel config and
> > > forgot to add "device amdtemp" & "device amdsmn" and forgot about the
> > > issue. After removing only "device amdtemp" from my old kernel config
> > > it boots again.
> > >
> > > Unfortunately reverting this commit (git revert -n 323a94afb623)
> > > doesn't resolve this issue. Old kernel does not boot if "device
> > > amdtemp" is enabled. Probably wrong commit or I am doing something
> > > wrong!?
> > >
> >
> > Strange.  My FreeBSD-14 kernel boots with device amdtemp (which automatically
> > results in amdsmn being included).  It's FreeBSD-15 which fails for me.
>
> 1. 'kldload amdtemp' works:
>    12    1 0xffffffff81e7c000     3160 amdtemp.ko
>    13    1 0xffffffff81e80000     2138 amdsmn.ko
>
>    amdsmn0: <AMD Family 17h System Management Network> on hostb0
>    amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb0
>
> 2. GENERIC boots fine. The following kernel does not:
>
>    include GENERIC
>
>    ident	TEST
>    device	amdtemp
>
> 3. Unfortunately this is a remote server without a serial console. I
> don't get a crashdump and I can't find anything in /var/log/messages.
>
> 4. I have no good revision for stable/14 and main. On main I always
> use GENERIC-NODEBUG. :-(
>

Thanks, Herbert!  kldload'ing amdsmn and amdtemp really does work!

Now I can run FBSD-15 :)
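
For anyone who wants the modules loaded on every boot without compiling
them into the kernel, here is a minimal sketch. It assumes that loading
through the rc.conf kld_list mechanism behaves the same as a manual
kldload after boot; that is an assumption and has not been tested on the
machines in this thread. The kernel config must not contain "device
amdtemp".

```
# /etc/rc.conf (sketch; assumes late module loading avoids the panic
# the same way a manual kldload does)
# amdsmn is listed explicitly, although loading amdtemp pulled it in
# automatically in the kldstat output quoted above.
kld_list="amdsmn amdtemp"
```

After a reboot, kldstat should list amdtemp.ko and amdsmn.ko as in
Herbert's output above.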

--
Gary Jennejohn