[Bug 268393] system always reboots once from a powered off state
- In reply to: bugzilla-noreply_a_freebsd.org: "[Bug 268393] system always reboots once from a powered off state"
Date: Fri, 07 Jul 2023 01:38:12 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268393

--- Comment #48 from Jonathan Vasquez <jon@xyinn.org> ---
Hey all,

So I spent a few hours today debugging this issue on 13.2-RELEASE and I have
interesting stuff to report.

TLDR:

1. There definitely seems to be a race condition somewhere in how the AMD
   Raven HDA Controller is either being enumerated or being accessed.

2. I was able to build on John's idea regarding the delays and come up with
   something that seems to no longer crash my system. I don't think it would
   be an acceptable solution, though, since it introduces a delay into every
   "hdac_intr_handler()" call for any device that uses that function. But I'll
   keep testing it locally to see if I notice any new types of weirdness
   (outside of the known ones that I experienced before this patch), and also
   because I don't want my system to keep crashing. As a side note, I ordered
   2 PCIe sound cards that I want to check for FreeBSD compatibility, which
   would help mitigate this issue if nothing else. Best case scenario, we fix
   this issue, and I also end up with a better sounding sound card that's not
   the on-board sound :).

3. We can see different levels of severity depending on the length of the
   delay.

-----

So this is what the patch looks like that allows my system to no longer crash
on first boot:

diff --git a/sys/dev/sound/pci/hda/hdac.c b/sys/dev/sound/pci/hda/hdac.c
index 9aa0e4bffdc8..e9d581a422cb 100644
--- a/sys/dev/sound/pci/hda/hdac.c
+++ b/sys/dev/sound/pci/hda/hdac.c
@@ -378,6 +378,11 @@ hdac_one_intr(struct hdac_softc *sc, uint32_t intsts)
 static void
 hdac_intr_handler(void *context)
 {
+	/*
+	 * Add slight delay to avoid crashes with AMD Raven HDA Controllers
+	 */
+	DELAY(5000);
+
 	struct hdac_softc *sc;
 	uint32_t intsts;

-----

- If there is no DELAY (the default), the system will crash.

- If there is a DELAY of 1000, the system won't crash, but we will see access
  errors! Which is revealing. Example:

  hdac2: <AMD Raven HDA Controller> mem 0xfc980000-0xfc987fff at device 0.6 on pci19
  hdac2: Unexpected unsolicited response from address 0: 00000000
  hdac2: Unexpected unsolicited response from address 0: 00000000
  hdac2: Unexpected unsolicited response from address 0: 00000000
  hdac2: Unexpected unsolicited response from address 0: 00000000

- If there is a DELAY of 5000, the system won't crash, and we no longer see
  any errors.

Before I reduced things down to this patch, and without using any delays, I
was able to get the machine to stop crashing by adding at least 4 printf
statements lol. With 3 printf statements, it would still crash. I suppose 4
printf calls are roughly equivalent to a DELAY of 5000 on my machine.

As stated before, with the above patch, the machine no longer crashes for me
on a cold boot. I was also able to access and use my pcm8 device immediately
and sound worked. This is progress.

I've attached the following files:

- bad.0.txt - Shows the access errors with a delay of 1000 with my previous
  expanded debug messages.
- good.0.txt - Shows a good cold boot with a delay of 5000 with my previous
  expanded debug messages.
- bad.1.txt - Shows the access errors with a delay of 1000 (minimal logging).

root@weshly:/usr/src # uname -a
FreeBSD weshly 13.2-RELEASE-p1 FreeBSD 13.2-RELEASE-p1 #23 releng/13.2-n254621-08b87f63a046-dirty: Thu Jul 6 21:22:10 EDT 2023 root@weshly:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

debugging on:

commit 08b87f63a046bd966bd0ed548211ae98ff50e638 (HEAD -> releng/13.2, origin/releng/13.2)
Author: Gordon Tetlow <gordon@FreeBSD.org>
Date:   Tue Jun 20 22:40:02 2023 -0700

    Add UPDATING entries and bump version.

    Approved by: so

-- 
You are receiving this mail because:
You are the assignee for the bug.
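Regarding the concern in point 2 that the delay hits every device using "hdac_intr_handler()": one possible direction would be to gate the delay on a per-controller quirk flag, so that only the affected controller pays for it. The following is only a userspace model of that idea, not a patch against hdac.c. Every name in it (model_softc, MODEL_QUIRK_INTR_DELAY, model_delay, model_intr_handler) is hypothetical, and model_delay merely records the requested microseconds instead of calling the kernel's DELAY().

```c
/*
 * Userspace model only. None of these names exist in hdac.c; they are
 * hypothetical and only illustrate gating a workaround delay on a
 * per-controller quirk flag.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_QUIRK_INTR_DELAY	0x1u	/* hypothetical quirk bit */

struct model_softc {
	uint32_t quirks;	/* quirk bits chosen at attach time (assumed) */
	uint32_t delay_us;	/* workaround delay in microseconds */
	uint64_t delayed_total;	/* microseconds requested so far */
	unsigned intr_count;	/* number of interrupts handled */
};

/* Stand-in for the kernel's DELAY(): it only records the requested time. */
static void
model_delay(struct model_softc *sc, uint32_t usec)
{
	sc->delayed_total += usec;
}

/* Model of an interrupt handler that delays only on flagged controllers. */
static void
model_intr_handler(struct model_softc *sc)
{
	assert(sc != NULL);

	if (sc->quirks & MODEL_QUIRK_INTR_DELAY)
		model_delay(sc, sc->delay_us);
	sc->intr_count++;
}
```

In this model, a controller with the quirk bit set and delay_us = 5000 accumulates 5000 microseconds per interrupt, while a controller without the bit accumulates none. Whether the real driver could set such a flag reliably for the Raven controller at attach time is an open question that this sketch does not answer.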