[Bug 267028] kernel panics when booting with both (zfs,ko or vboxnetflt,ko or acpi_wmi.ko) and amdgpu.ko

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 22 Dec 2024 16:40:12 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267028

--- Comment #270 from Mark Millard <marklmi26-fbsd@yahoo.com> ---
The position in the found_modules list at which the corruption
shows up varies across the examples. (So far, no examples involving
other lists have been captured in a context that allows kgdb use,
although backtraces for such cases were reported previously.)

Names of the entries for which tqe_next is corrupt:
"amdgpu_raven_mec2_bin_fw" (vmcore.8, but with an older gpu-firmware-amd-kmod-raven-*)
"amdgpu_raven_mec_bin_fw"  (vmcore.9)
"amdgpu_raven_me_bin_fw"   (vmcore.0)

Since some later modules are loaded before the corruption is
detected (by related activity), this suggests that the
corruption may happen later, to a mid-range list entry
that was previously okay.
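
The corrupt tqe_next values can also be compared mechanically. The
following is only a minimal sketch: the three values are copied from
the kgdb dumps in this comment, while the 8-byte alignment check is my
assumption about how valid list entries are allocated, not something
established in this report. The helper names are hypothetical.

```python
# tqe_next values copied from the three kgdb dumps in this comment.
CORRUPT_TQE_NEXT = {
    "vmcore.8": 0xfffff80000000007,
    "vmcore.9": 0xfffff80000000007,
    "vmcore.0": 0xfffff80000000007,
}

# Assumption (not from the report): valid list entries are at least
# 8-byte aligned, so a non-NULL pointer with low bits set is suspect.
ASSUMED_ALIGNMENT = 8


def is_plausible_pointer(value, alignment=ASSUMED_ALIGNMENT):
    """Return True if value is NULL or a multiple of the assumed alignment."""
    return value == 0 or value % alignment == 0


def summarize(values):
    """Report whether the values agree and which dumps hold misaligned ones."""
    distinct = sorted(set(values.values()))
    return {
        "distinct_values": [hex(v) for v in distinct],
        "all_identical": len(distinct) == 1,
        "misaligned": sorted(
            name for name, v in values.items() if not is_plausible_pointer(v)
        ),
    }


if __name__ == "__main__":
    print(summarize(CORRUPT_TQE_NEXT))
```

Under that assumption, all three dumps hold the same misaligned value,
whereas the tqe_prev values in the same entries pass the check.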

vmcore.8 had (note: older gpu-firmware-amd-kmod-raven-* is involved):

{link = {tqe_next = 0xfffff80000000007, tqe_prev = 0xfffff8000465bbc0},
container = 0xfffff80004b29600, name = 0xffffffff82e62026
"amdgpu_raven_mec2_bin_fw", version = 1}

(kgdb) info sharedlibrary 
From                To                  Syms Read   Shared Object Library
0xffffffff82545000  0xffffffff82552000  Yes         ./boot/kernel/fusefs.ko
0xffffffff8256d000  0xffffffff8256f000  Yes         ./boot/kernel/sem.ko
0xffffffff82575000  0xffffffff825fc000  Yes (*)     ./boot/modules/if_re.ko
0xffffffff82a00000  0xffffffff82cf5000  Yes (*)     ./boot/modules/amdgpu.ko
0xffffffff82918000  0xffffffff8296d000  Yes (*)     ./boot/modules/drm.ko
0xffffffff8298a000  0xffffffff8298b000  Yes         ./boot/kernel/iic.ko
0xffffffff8298d000  0xffffffff8298f000  Yes (*)     ./boot/modules/linuxkpi_gplv2.ko
0xffffffff82991000  0xffffffff82996000  Yes (*)     ./boot/modules/dmabuf.ko
0xffffffff82998000  0xffffffff829a2000  Yes (*)     ./boot/modules/ttm.ko
0xffffffff829a5000  0xffffffff829a6000  Yes (*)     ./boot/modules/amdgpu_raven_gpu_info_bin.ko
0xffffffff829a8000  0xffffffff829a9000  Yes (*)     ./boot/modules/amdgpu_raven_sdma_bin.ko
0xffffffff829af000  0xffffffff829b0000  Yes (*)     ./boot/modules/amdgpu_raven_asd_bin.ko
0xffffffff829de000  0xffffffff829df000  Yes (*)     ./boot/modules/amdgpu_raven_ta_bin.ko
0xffffffff829e8000  0xffffffff829e9000  Yes (*)     ./boot/modules/amdgpu_raven_pfp_bin.ko
0xffffffff829f0000  0xffffffff829f1000  Yes (*)     ./boot/modules/amdgpu_raven_me_bin.ko
0xffffffff829f7000  0xffffffff829f8000  Yes (*)     ./boot/modules/amdgpu_raven_ce_bin.ko
0xffffffff82e11000  0xffffffff82e12000  Yes (*)     ./boot/modules/amdgpu_raven_rlc_bin.ko
0xffffffff82e1d000  0xffffffff82e1e000  Yes (*)     ./boot/modules/amdgpu_raven_mec_bin.ko
0xffffffff82e61000  0xffffffff82e62000  Yes (*)     ./boot/modules/amdgpu_raven_mec2_bin.ko
0xffffffff82ea5000  0xffffffff82ea6000  Yes (*)     ./boot/modules/amdgpu_raven_vcn_bin.ko
0xffffffff83000000  0xffffffff8324c000  Yes         ./boot/kernel/zfs.ko

vmcore.9 has:

{link = {tqe_next = 0xfffff80000000007, tqe_prev = 0xfffff800035b6f00},
container = 0xfffff8000359f300, name = 0xffffffff82e1e010
"amdgpu_raven_mec_bin_fw", version = 1}

(kgdb) info sharedlibrary 
From                To                  Syms Read   Shared Object Library
0xffffffff82545000  0xffffffff82547000  Yes         ./boot/kernel/sem.ko
0xffffffff8254d000  0xffffffff8255a000  Yes         ./boot/kernel/fusefs.ko
0xffffffff82574000  0xffffffff825fb000  Yes (*)     ./boot/modules/if_re.ko
0xffffffff82a00000  0xffffffff82cf4000  Yes (*)     ./boot/modules/amdgpu.ko
0xffffffff82918000  0xffffffff8296c000  Yes (*)     ./boot/modules/drm.ko
0xffffffff8298a000  0xffffffff8298b000  Yes         ./boot/kernel/iic.ko
0xffffffff8298d000  0xffffffff8298f000  Yes (*)     ./boot/modules/linuxkpi_gplv2.ko
0xffffffff82991000  0xffffffff82996000  Yes (*)     ./boot/modules/dmabuf.ko
0xffffffff82998000  0xffffffff829a2000  Yes (*)     ./boot/modules/ttm.ko
0xffffffff829a5000  0xffffffff829a6000  Yes (*)     ./boot/modules/amdgpu_raven_gpu_info_bin.ko
0xffffffff829a8000  0xffffffff829a9000  Yes (*)     ./boot/modules/amdgpu_raven_sdma_bin.ko
0xffffffff829af000  0xffffffff829b0000  Yes (*)     ./boot/modules/amdgpu_raven_asd_bin.ko
0xffffffff829db000  0xffffffff829dc000  Yes (*)     ./boot/modules/amdgpu_raven_ta_bin.ko
0xffffffff829e6000  0xffffffff829e7000  Yes (*)     ./boot/modules/amdgpu_raven_pfp_bin.ko
0xffffffff829ee000  0xffffffff829ef000  Yes (*)     ./boot/modules/amdgpu_raven_me_bin.ko
0xffffffff829f5000  0xffffffff829f6000  Yes (*)     ./boot/modules/amdgpu_raven_ce_bin.ko
0xffffffff82e11000  0xffffffff82e12000  Yes (*)     ./boot/modules/amdgpu_raven_rlc_bin.ko
0xffffffff82e1d000  0xffffffff82e1e000  Yes (*)     ./boot/modules/amdgpu_raven_mec_bin.ko
0xffffffff82e61000  0xffffffff82e62000  Yes (*)     ./boot/modules/amdgpu_raven_mec2_bin.ko
0xffffffff82ea5000  0xffffffff82ea6000  Yes (*)     ./boot/modules/amdgpu_raven_vcn_bin.ko
0xffffffff83000000  0xffffffff8324c000  Yes         ./boot/kernel/zfs.ko


vmcore.0 has:

{link = {tqe_next = 0xfffff80000000007, tqe_prev = 0xfffff800047571c0},
container = 0xfffff80004bfad80, name = 0xffffffff829ef000
"amdgpu_raven_me_bin_fw", version = 1}

(kgdb) info sharedlibrary 
From                To                  Syms Read   Shared Object Library
0xffffffff82545000  0xffffffff82552000  Yes         ./boot/kernel/fusefs.ko
0xffffffff8256c000  0xffffffff8256e000  Yes         ./boot/kernel/sem.ko
0xffffffff82574000  0xffffffff825fb000  Yes (*)     ./boot/modules/if_re.ko
0xffffffff82a00000  0xffffffff82cf4000  Yes (*)     ./boot/modules/amdgpu.ko
0xffffffff82918000  0xffffffff8296c000  Yes (*)     ./boot/modules/drm.ko
0xffffffff8298a000  0xffffffff8298b000  Yes         ./boot/kernel/iic.ko
0xffffffff8298d000  0xffffffff8298f000  Yes (*)     ./boot/modules/linuxkpi_gplv2.ko
0xffffffff82991000  0xffffffff82996000  Yes (*)     ./boot/modules/dmabuf.ko
0xffffffff82998000  0xffffffff829a2000  Yes (*)     ./boot/modules/ttm.ko
0xffffffff829a5000  0xffffffff829a6000  Yes (*)     ./boot/modules/amdgpu_raven_gpu_info_bin.ko
0xffffffff829a8000  0xffffffff829a9000  Yes (*)     ./boot/modules/amdgpu_raven_sdma_bin.ko
0xffffffff829af000  0xffffffff829b0000  Yes (*)     ./boot/modules/amdgpu_raven_asd_bin.ko
0xffffffff829db000  0xffffffff829dc000  Yes (*)     ./boot/modules/amdgpu_raven_ta_bin.ko
0xffffffff829e6000  0xffffffff829e7000  Yes (*)     ./boot/modules/amdgpu_raven_pfp_bin.ko
0xffffffff829ee000  0xffffffff829ef000  Yes (*)     ./boot/modules/amdgpu_raven_me_bin.ko
0xffffffff829f5000  0xffffffff829f6000  Yes (*)     ./boot/modules/amdgpu_raven_ce_bin.ko
0xffffffff82e11000  0xffffffff82e12000  Yes (*)     ./boot/modules/amdgpu_raven_rlc_bin.ko
0xffffffff82e1d000  0xffffffff82e1e000  Yes (*)     ./boot/modules/amdgpu_raven_mec_bin.ko
0xffffffff82e61000  0xffffffff82e62000  Yes (*)     ./boot/modules/amdgpu_raven_mec2_bin.ko
0xffffffff82ea5000  0xffffffff82ea6000  Yes (*)     ./boot/modules/amdgpu_raven_vcn_bin.ko
                                        No          /boot/modules/vboxnetflt.ko

(I do not have vboxnetflt.ko.debug yet.)
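
For relating a name pointer from one of the dumps back to a loaded
module, something like the following could be used on saved
"info sharedlibrary" output. It is only a rough sketch: the function
names are hypothetical, the sample rows are an excerpt of the vmcore.8
listing, and picking the nearest module that starts at or below the
address is a heuristic I am assuming, since the name pointers shown
here fall at or just past the listed "To" values rather than inside
the From/To ranges.

```python
import re

# Matches a data row; the path may be absent if it sits on the next line.
ROW = re.compile(
    r"^(0x[0-9a-f]+)\s+(0x[0-9a-f]+)\s+(Yes|No)(?:\s+\(\*\))?\s*(\S*)$"
)

# Excerpt of the vmcore.8 listing quoted in this comment.
SAMPLE = """\
From                To                  Syms Read   Shared Object Library
0xffffffff82a00000  0xffffffff82cf5000  Yes (*)     ./boot/modules/amdgpu.ko
0xffffffff82e1d000  0xffffffff82e1e000  Yes (*)
./boot/modules/amdgpu_raven_mec_bin.ko
0xffffffff82e61000  0xffffffff82e62000  Yes (*)
./boot/modules/amdgpu_raven_mec2_bin.ko
0xffffffff82ea5000  0xffffffff82ea6000  Yes (*)
./boot/modules/amdgpu_raven_vcn_bin.ko
"""


def parse_sharedlibrary(text):
    """Return (start, end, path) tuples from `info sharedlibrary` text.

    Accepts rows with the path on the same line or on the following line.
    Rows without addresses (and the header) are skipped.
    """
    rows = []
    pending = None  # (start, end) of a row still waiting for its path
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = ROW.match(line)
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            if m.group(4):
                rows.append((start, end, m.group(4)))
                pending = None
            else:
                pending = (start, end)
        elif pending is not None:
            rows.append((pending[0], pending[1], line))
            pending = None
    return rows


def nearest_module(rows, addr):
    """Heuristic: path of the module with the highest start <= addr, else None."""
    candidates = [r for r in rows if r[0] <= addr]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[0])[2]


if __name__ == "__main__":
    rows = parse_sharedlibrary(SAMPLE)
    # name pointer from the vmcore.8 entry for "amdgpu_raven_mec2_bin_fw"
    print(nearest_module(rows, 0xffffffff82e62026))
```

With the excerpt above, the vmcore.8 name pointer 0xffffffff82e62026
maps to ./boot/modules/amdgpu_raven_mec2_bin.ko, which agrees with the
name string kgdb shows for that entry.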
