[Bug 268393] system always reboots once from a powered off state

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 02 Mar 2023 00:55:14 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268393

--- Comment #10 from Jonathan Vasquez <jon@xyinn.org> ---
Hey John,

Thanks for that. I got some interesting results!

It's been a few months since my last post, and since then I've reinstalled
FreeBSD. It's currently on 13.2-STABLE
(stable/13-n254729-3912f99ecae6/GENERIC). There is nothing in /etc/sysctl.conf
at the moment. So let's begin from that perspective.

I first compiled the /usr/src/sys/amd64/conf/MINIMAL kernel and rebooted. The
first time I did this, the system locked up since it couldn't find my root
filesystem, which is on ZFS on an NVMe drive. After some digging, I added a few
options (not the minimum needed, but I cast a wide enough net, within reason,
to allow the system to boot). After I got it booting successfully, I wasn't
able to type anything. Makes sense: MINIMAL has no USB support, lol. I added
those in as well, so I ended up with a MINIMAL config with the following extra
lines:

device crypto
device acpi
device nvme
device nvd

options ZSTDIO

device uhci
device ohci
device ehci
device xhci
device usb
device hid
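
For anyone who wants to reproduce this, here is a rough sketch of how those
additions could be kept in a separate config file instead of editing MINIMAL
directly. The file name MINIMAL-USB and the use of 'include' are assumptions
for illustration; only the device/options lines come from the list above.

```
# Hypothetical file: /usr/src/sys/amd64/conf/MINIMAL-USB
# (the name and the include-based layout are assumptions, not the exact
# file used for the tests in this comment)
include MINIMAL
ident   MINIMAL-USB

# Additions that got the NVMe/ZFS root booting (wider than strictly needed)
device  crypto
device  acpi
device  nvme
device  nvd

options ZSTDIO

# USB controllers, USB stack, and HID so the keyboard works
device  uhci
device  ohci
device  ehci
device  xhci
device  usb
device  hid
```

A config like this would typically be built and installed from /usr/src with
'make buildkernel KERNCONF=MINIMAL-USB' followed by 'make installkernel
KERNCONF=MINIMAL-USB'.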

---------

Now that I was in the system successfully, I could see that the system didn't
crash on its own. I did a 'poweroff' to get the system back to the cold state
that causes it to crash on boot (only the first time; once it's hot it won't
crash). I then did 'kldload snd_hda' and the system immediately crashed, and I
noticed some messages regarding 'drm-510-kmod'. I thought, ah! Yeah, I forgot
I needed to comment out the kld_list in my /etc/rc.conf, since I have 'amdgpu
vboxdrv' in there. So I was thinking the AMD Radeon RX 6900 XT (sienna_cichlid)
and snd_hda may be having a conflict.

I commented out the kld_list line and did a 'poweroff' again. I turned the
machine back on immediately and booted up. I did another 'kldload snd_hda' and
the system didn't crash! I was like, yeah, maybe there is a conflict between
those two drivers. But I was skeptical.

I decided to do another 'poweroff' and wait 5 seconds before continuing, to
give any internal system components time to properly reset themselves, just in
case. After the 5 seconds, I turned it back on and booted. I did another
'kldload snd_hda', and the system crashed again! This time with no
'drm-510-kmod' messages, just a clean dump. So that makes me think that there
could potentially be two issues here, or it could just be one underlying issue
(the page fault) that's showing up in two places.

I also did a final test with 'vboxdrv' loaded. That driver didn't conflict;
the system just crashed in the same scenario as just mentioned (without amdgpu
loaded, i.e. our clean dump).
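
For reference, the rc.conf state for the "clean" run can be sketched roughly
as below. This assumes the kld_list line held exactly the two modules named
earlier; any other settings in the real file are omitted.

```
# /etc/rc.conf (fragment; other settings omitted)
# Commented out for the clean run so amdgpu is not loaded at boot:
#kld_list="amdgpu vboxdrv"
```

With that in place, the sequence that reproduced the clean crash was:
'poweroff', wait about 5 seconds, power on, then 'kldload snd_hda' as root.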

I've attached the following crash dumps for inspection:

- 1_with_amdgpu_core.txt
- 2_clean_core.txt
- 3_clean_core.txt (this is the third run, which has vboxdrv loaded but shows
the same info as without vboxdrv, so vboxdrv doesn't seem to cause an issue).

Thank you!
